Quickest way to deindex a large number of pages
-
Our site was recently hacked by spammers who posted fake content and brought down our servers. After a few months, we finally figured out what was going on and fixed the issue. However, it turns out that Google has indexed 26K+ spammy pages, and we've lost PageRank and search engine rankings as a result.
What is the best and fastest way to get these pages out of Google's index?
-
Given that I'm sure you've already removed these pages from your site, there will no longer be any page on which to add a meta noindex tag.
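(For reference, on any hacked pages that did still resolve, the tag in question would be a one-line addition inside the page's `<head>`; a minimal sketch:)

```html
<!-- Tells compliant crawlers to drop this page from their index -->
<meta name="robots" content="noindex">
```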
Disallowing these pages in robots.txt in no way signals to the search engines that they should be removed from the index, just that they should no longer be crawled. Given that they're already indexed, blocking in robots.txt would potentially save some "crawl budget" but wouldn't do anything to remove them from the index.
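For completeness, the directive in question looks like this (assuming a hypothetical /spam-directory/ path — substitute whatever paths the spammers actually used); again, this only stops crawling and does nothing to remove URLs already in the index:

```
User-agent: *
Disallow: /spam-directory/
```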
So submitting them to the URL Removal Tool in Google Webmaster Tools, along with an explanation, would be by far the most effective approach.
You'll also want to keep a very close watch on your penalty warnings within Webmaster Tools. If you get flagged, you'll want a complete history of the issue and the steps you've taken to address it in order to prepare a reinclusion request.
Lastly, don't forget to submit these same URLs to the Bing Webmaster Tools Block URLs tool. You may not get a massive amount of traffic from Bing, but there's no sense throwing it away, since you've already prepared the URL removal list anyway.
Hope that helps!
Paul
-
Yup. Just wanted to add that if these pages all live in a particular directory, you can deindex the entire directory with a single request in the URL removal tool.
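In other words, rather than submitting 26K+ individual URLs, you'd submit the parent path once and choose the directory-removal option (hypothetical path for illustration):

```
http://www.example.com/spam-directory/
```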
-
1. Add a noindex meta tag to these pages
2. Request that Google remove the URLs from its index via the WMT URL removal tool
3. Once they've dropped out, Disallow them in robots.txt (disallowing first would stop Google from recrawling the pages and ever seeing the noindex tag)
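One further option worth mentioning, since the spam pages have already been deleted: if the hacked URLs share a recognizable pattern, you can return a 410 Gone status for them, which search engines generally treat as a stronger removal signal than a plain 404. A minimal .htaccess sketch, assuming Apache's mod_alias and a hypothetical /spam-directory/ path:

```
# Return 410 Gone for anything under the old spam path (hypothetical pattern)
RedirectMatch gone ^/spam-directory/
```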