Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Best practice for deindexing large quantities of pages
-
We are trying to deindex a large quantity of pages on our site and want to know what the best practice for doing that is. For reference, the reason we are looking for methods that could help us speed it up is we have about 500,000 URLs that we want deindexed because of mis-formatted HTML code and google indexed them much faster than it is taking to unindex them unfortunately.
We don't want to risk clogging up our limited crawl log/budget by submitting a sitemap of URLs that have "noindex" on them as a hack for deindexing. Although theoretically that should work, we are looking for white hat methods that are faster than "being patient and waiting it out", since that would likely take months if not years with Google's current crawl rate of our site.
-
Unfortunately, I don't think there's any easy/fast way to do this. I just ran a test to see how long it take Google to actually obey a noindex tag, and it's taken a little over 2 months for them all to be removed. I had 2 WP blogs that I added the noindex tag to all category, tag, and author pages and monitored the index count 4 or 5 times per week by running site:example.com inurl:/category/ queries. There was a lot of fluctuation at the beginnning, but eventually took hold after about 2 months. On one of the sites, I did add an XML sitemap with only the noindexed URLs on it, submitted it via Search Console, but that didn't seem to have an impact on how quickly they were dropped out.
See the screenshot below of my plotting of indexed pages per subfolder:
-
Hey,
you might be interested in this thread for getting your question answered.
https://moz.com/community/q/quickest-way-to-deindex-a-large-number-of-pages
Hope it helps. Cheers, Martin
-
Hi,
I have never tested the method that I'm sharing here. Please check once it might be helpful in your case.
https://www.searchcommander.com/how-to-bulk-remove-urls-google/
Thanks
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Multiple Markups on The Same Page - Best Solution?
Hi there! I have a website that is build in react javascript, and I'm trying to use markup on my pages. They are mostly articles about general topics with common questions (about the topic), and for most articles I would like to use two markups: article markup + FAQ Markup ( for the questions in the article) article markup + how-to markup Can I do this or will Google get confused? Since I have two @type at the same time, for example @type": "FAQPage" and "@type": "Article". How should I think? I'm using https://schema.dev/ right now. Thanks!
Intermediate & Advanced SEO | | Leowa0 -
Few pages without SSL
Hi, A website is not fully secured with a SSL certificate.
Intermediate & Advanced SEO | | AdenaSEO
Approx 97% of the pages on the website are secured. A few pages are unfortunately not secured with a SSL certificate, because otherwise some functions on those pages do not work. It's a website where you can play online games. These games do not work with an SSL connection. Is there anything we have to consider or optimize?
Because, for example when we click on the secure lock icon in the browser, the following notice.
Your connection to this site is not fully secured Can this harm the Google ranking? Regards,
Tom1 -
What is best practice for "Sorting" URLs to prevent indexing and for best link juice ?
We are now introducing 5 links in all our category pages for different sorting options of category listings.
Intermediate & Advanced SEO | | lcourse
The site has about 100.000 pages and with this change the number of URLs may go up to over 350.000 pages.
Until now google is indexing well our site but I would like to prevent the "sorting URLS" leading to less complete crawling of our core pages, especially since we are planning further huge expansion of pages soon. Apart from blocking the paramter in the search console (which did not really work well for me in the past to prevent indexing) what do you suggest to minimize indexing of these URLs also taking into consideration link juice optimization? On a technical level the sorting is implemented in a way that the whole page is reloaded, for which may be better options as well.0 -
On 1 of our sites we have our Company name in the H1 on our other site we have the page title in our H1 - does anyone have any advise about the best information to have in the H1, H2 and Page Tile
We have 2 sites that have been set up slightly differently. On 1 site we have the Company name in the H1 and the product name in the page title and H2. On the other site we have the Product name in the H1 and no H2. Does anyone have any advise about the best information to have in the H1 and H2
Intermediate & Advanced SEO | | CostumeD0 -
Best Practices for Moving a Sub-Domain to a Sub-Folder
One of my clients is moving their subdomain to a subfolder on their main domain. (ie. blog.example.com to example.com/blog) I just wanted to get everyone's thoughts on some best practices for things we should be doing/looking for when making this move.? ie WMT, .htaccess, 301s etc? Thanks.
Intermediate & Advanced SEO | | DarinPirkey0 -
Can too many "noindex" pages compared to "index" pages be a problem?
Hello, I have a question for you: our website virtualsheetmusic.com includes thousands of product pages, and due to Panda penalties in the past, we have no-indexed most of the product pages hoping in a sort of recovery (not yet seen though!). So, currently we have about 4,000 "index" page compared to about 80,000 "noindex" pages. Now, we plan to add additional 100,000 new product pages from a new publisher to offer our customers more music choice, and these new pages will still be marked as "noindex, follow". At the end of the integration process, we will end up having something like 180,000 "noindex, follow" pages compared to about 4,000 "index, follow" pages. Here is my question: can this huge discrepancy between 180,000 "noindex" pages and 4,000 "index" pages be a problem? Can this kind of scenario have or cause any negative effect on our current natural SEs profile? or is this something that doesn't actually matter? Any thoughts on this issue are very welcome. Thank you! Fabrizio
Intermediate & Advanced SEO | | fablau0 -
Do 404 pages pass link juice? And best practices...
Last year Google said bad links to 404 pages wouldn't hurt your site. Could that still be the case in light of recent Google updates to try and combat spammy links and negative SEO? Can links to 404 pages benefit a website and pass link juice? I'd assume at the very least that any link juice will pass through links FROM the 404 page? Many websites have great 404 pages that get linked to: http://www.opensiteexplorer.org/links?site=http%3A%2F%2Fretardzone.com%2F404 - that was the first of four I checked from the "60 Really Cool...404 Pages" that actually returned the 404 HTTP Status! So apologies if you find the word 'retard' offensive. According to Open Site Explorer it has a decent Page Authority and number of backlinks - but it doesn't show in Google's SERPs. I'd never do it, but if you have a particularly well-linked to 404 page, is there an argument for giving it 200 OK Status? Finally, what are the best practices regarding 404s and address bar links? For example, if
Intermediate & Advanced SEO | | Alex-Harford
www.examplesite.com/3rwdfs returns a 404 error, should I make that redirect to
www.examplesite.com/404 or leave it as is? Redirecting to www.examplesite.com/404 might not be user-friendly as people won't be able to correct the URL in the address bar. But if I have a great 404 page that people link to, I don't want links going to loads of random pages do I? Is either way considered best practice? If I did a 301 redirect I guess it would send the wrong signal to the crawlers? Should I use a 302 redirect, or even a 304 Not Modified redirect?1 -
Are there any negative effects to using a 301 redirect from a page to another internal page?
For example, from http://www.dog.com/toys to http://www.dog.com/chew-toys. In my situation, the main purpose of the 301 redirect is to replace the page with a new internal page that has a better optimized URL. This will be executed across multiple pages (about 20). None of these pages hold any search rankings but do carry a decent amount of page authority.
Intermediate & Advanced SEO | | Visually0