The Moz Q&A Forum

Moz Q&A is closed.

After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.

Mass Removal Request from Google Index

Intermediate & Advanced SEO
  • ioannisa
    ioannisa last edited by Apr 12, 2016, 7:00 PM

    Hi,

    I am trying to clean up a news website. When the site was first set up, the people who built it imported every kind of article they had as a newspaper, including tests, internal communications, and drafts. The site has a lot of junk, but all of it came from the initial backup, i.e. it is dated before June 1, 2012. So by removing all the mixed content prior to that date, we are left with pure articles starting June 1, 2012!

    Therefore

    1. My dynamic sitemap now contains only articles with a release date between June 1, 2012 and now.
    2. Any article with a release date prior to June 1, 2012 returns a custom 404 page with a "noindex" meta tag instead of the actual content of the article.

    The question is: how can I remove all this junk from the Google index as fast as possible? It is no longer on the site, but it still appears in Google results.

    I know that for individual URLs I need to request removal from this link
    https://www.google.com/webmasters/tools/removals

    The problem is doing this in bulk, as there are tens of thousands of URLs I want to remove. Should I put the articles back in the sitemap so the search engines crawl it and see all the 404s? I believe this is very wrong. As far as I know, it will cause problems because search engines will try to access non-existent content that the sitemap declares as existent, and report errors in Webmaster Tools.

    Should I submit a DELETED ITEMS SITEMAP using the <expires> tag? I think this is for custom search engines only, and not for the generic Google search engine.
    https://developers.google.com/custom-search/docs/indexing#on-demand-indexing

    Unfortunately, the site doesn't use any kind of "folder" hierarchy in its URLs, only ugly GET params, so a folder-based pattern is impossible: all articles (removed junk and actual articles) are of the form:
    http://www.example.com/docid=123456

    So, how can I bulk remove from the google index all the junk... relatively fast?

    • KristinaKledzik
      KristinaKledzik @ioannisa last edited by Apr 15, 2016, 5:45 PM Apr 15, 2016, 5:45 PM

      Hi Ioannis,

      What about the first suggestion? Can you create a page linking to all of the pages that you'd like to remove, then have Google crawl that page?

      Best,

      Kristina

      • ioannisa
        ioannisa last edited by Apr 15, 2016, 4:18 AM Apr 15, 2016, 4:05 AM

        Thank you Kristina,

        I know about the URL structure. For the past few months I have been trying to clean up this site, which I was not involved in creating. It has several more SEO problems, some fixed and some not yet; I have found more than 50 so far, most of them critical.

        The junk pages are not in the sitemap that I built, and because I wrote this sitemap myself, I can easily make another one containing the articles that I have removed (just reverse part of my select query for the sitemap to get the ones I have removed).
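As a rough illustration of "reversing part of the select query", here is a minimal sketch using an in-memory SQLite database. The table name `articles`, its columns, and the release dates in the sample rows are assumptions made up for this example; only the two DocIDs and the June 1, 2012 cutoff come from the thread. The site's real schema and database engine are not shown in the thread.

```python
import sqlite3

CUTOFF = "2012-06-01"  # cutoff date from the thread, as an ISO string


def split_doc_ids(conn, cutoff=CUTOFF):
    """Return (kept_ids, removed_ids) by flipping the date comparison."""
    kept = [row[0] for row in conn.execute(
        "SELECT doc_id FROM articles WHERE release_date >= ? ORDER BY doc_id",
        (cutoff,))]
    removed = [row[0] for row in conn.execute(
        "SELECT doc_id FROM articles WHERE release_date < ? ORDER BY doc_id",
        (cutoff,))]
    return kept, removed


def demo_connection():
    """Build a throwaway database; schema and dates are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles (doc_id INTEGER PRIMARY KEY, release_date TEXT)")
    conn.executemany("INSERT INTO articles VALUES (?, ?)", [
        (894052, "2011-03-15"),   # made-up date before the cutoff
        (1314221, "2015-11-02"),  # made-up date after the cutoff
    ])
    return conn
```

ISO-formatted date strings compare correctly as text, which is why the sketch can use plain `>=` and `<` on a TEXT column.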

        http://www.neakriti.gr/webservices/sitemap-index.aspx

        So far I implemented the last of your suggestions and here is an example:

        This is a valid article page
        http://www.neakriti.gr/?page=newsdetail&DocID=1314221 - (Status Code: 200)

        This is a non-existent article page (it never existed in the first place) - (Status Code: 404)
        http://www.neakriti.gr/?page=newsdetail&DocID=12345678

        This is one of the articles that I removed from sitemap and site - (Status Code: 410)
        http://www.neakriti.gr/?page=newsdetail&DocID=894052
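With tens of thousands of URLs, spot checks like the three above could be automated. The following is a minimal sketch, not a tested crawler: `fetch_status` and `audit` are hypothetical helper names, and it assumes the server answers HEAD requests with the same status code it would give for GET, which is worth confirming on the real site.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, including 4xx/5xx responses."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the code is still what we want to record.
        return err.code


def audit(expected, fetch=fetch_status):
    """Return {url: (expected, actual)} for every URL whose status differs."""
    mismatches = {}
    for url, want in expected.items():
        got = fetch(url)
        if got != want:
            mismatches[url] = (want, got)
    return mismatches
```

Calling `audit` with a dict that maps each URL to the status it should return (200, 404 or 410) gives back only the URLs that misbehave. The `fetch` parameter is injectable so the logic can be exercised without touching the network.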

        Also I would like you to take a look at another question about the same site and see that it can relate to this question with garbage articles too...
        https://moz.com/community/q/multiple-instances-of-the-same-article

        Thank you so much!

        • KristinaKledzik
          KristinaKledzik last edited by May 11, 2016, 7:55 AM Apr 14, 2016, 2:32 PM

          Hi Ioannis,

          You're in quite a bind here, without a good URL structure! I don't think there's any one perfect option, but I think all of these will work:

          • Create a page on your site that links to every article you would like to delete, keeping those articles 404/410ed. Then, use the Fetch as Googlebot tool, and ask Google to crawl the page plus all of its links. This will get Google to quickly crawl all of those pages, see that they're gone, and remove them from their index. Keep in mind that if you just use a 404, Google may keep the page around for a bit to make sure you didn't just mess up. As Eric said, a 410 is more of a sure thing.
          • Create an XML sitemap of those deleted articles, and have Google crawl it. Yes, this will create errors in GSC, but errors in GSC mean that they're concerned you've made a mistake, not that they're necessarily penalizing you. Just mark those guys as fixed and take the sitemap down once Google's crawled it.
          • 410 these pages, remove all internal links to them (use a tool like Screaming Frog to make sure you didn't miss any links!), and remove them from your sitemap. That'll distance you from that old, crappy content, and Google will slowly realize that it's been removed as it checks in on its old pages. This is probably the least satisfying option, but it's an option that'll get the job done eventually.
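The first two options both start from the list of removed DocIDs. Here is a minimal sketch of generating either a plain HTML page of links or an XML sitemap from that list. The function names are hypothetical, and the URL pattern is copied from the example URLs posted earlier in the thread; adjust both to the real site.

```python
import xml.etree.ElementTree as ET
from html import escape

# URL pattern taken from the example article URLs in this thread.
BASE = "http://www.neakriti.gr/?page=newsdetail&DocID={}"
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def removed_urls(doc_ids):
    """Expand removed DocIDs into full article URLs."""
    return [BASE.format(doc_id) for doc_id in doc_ids]


def build_sitemap(urls):
    """Return a sitemap XML string listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # ElementTree escapes '&' in the URL to '&amp;' when serializing.
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def build_link_page(urls):
    """Return a bare HTML page with one link per removed article."""
    items = "\n".join(
        f'<li><a href="{escape(u)}">{escape(u)}</a></li>' for u in urls)
    return f"<!DOCTYPE html>\n<html><body><ul>\n{items}\n</ul></body></html>"
```

The sitemap protocol caps a single file at 50,000 URLs, so a list in the tens of thousands may need to be split across several files.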

          Hope this helps! Let us know what you decide to do.

          Best,

          Kristina

          • ioannisa
            ioannisa last edited by Apr 13, 2016, 2:15 PM Apr 13, 2016, 2:13 PM

            Thank you,

            So you suggest that, based on my date-based query, instead of blindly returning 404 for everything before that date, I return 410 for it, while anything that never existed anyway returns 404.

            One more question: should I put the URLs of the blocked articles that return 410 back in the XML sitemap or not?

            • GlobeRunner
              GlobeRunner last edited by May 11, 2016, 7:55 AM Apr 13, 2016, 12:29 PM

              Any article that has release date prior to 1st-June-2012 should return a custom 410 page with "noindex" metatag, instead of the actual content of the article.

              The error returned should be a "410 Gone" and not just a 404. That way Google will treat it differently, and may remove it from the index faster than if you just return a 404. You can also use the Google removal tool. Don't forget the robots.txt file, either; there may be directories with the content that you need to disallow.

              But overall, using a 410 is going to be better and most likely faster.
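The rule described here (410 for articles deliberately removed by date, 404 for DocIDs that never existed, 200 otherwise) can be sketched as a small decision function. This is an illustration only: `status_for` is a hypothetical name, and it assumes the application can look up an article's release date and gets `None` when the DocID does not exist.

```python
from datetime import date

CUTOFF = date(2012, 6, 1)  # cutoff date from the thread


def status_for(release_date):
    """Pick the HTTP status for an article request.

    release_date is None when no article exists for the requested DocID.
    """
    if release_date is None:
        return 404  # never existed
    if release_date < CUTOFF:
        return 410  # existed, deliberately removed
    return 200      # live article
```

An article dated exactly June 1, 2012 counts as live here, matching the "starting June 1st, 2012" wording in the original question.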

              • ioannisa
                ioannisa last edited by Apr 13, 2016, 8:45 AM Apr 13, 2016, 5:44 AM

                Thank you for your response.

                I definitely cannot use noindex because, as I explained, I changed all articles prior to the cutoff date to return 404. So this content is no longer visible on the web and cannot carry a noindex directive. Unless you mean having it on my custom 404 page, where yes, it is there.

                Also, there is no folder to target in robots.txt, since the URLs use ugly GET params like DOCID=12345. There are thousands of DocIDs that are junk and removed, and thousands that are the actual articles.

                So I assumed that creating a "deleted articles" sitemap where each <url> contains an <expires>2016-06-01</expires> tag was the most logical thing, but I am afraid it is meant for "custom search engines" rather than for normal de-index requests, as described at the link below:

                https://developers.google.com/custom-search/docs/indexing#on-demand-indexing

                • Martijn_Scheijbeler
                  Martijn_Scheijbeler last edited by Apr 13, 2016, 5:02 AM Apr 13, 2016, 5:02 AM

                  Sitemaps are definitely not the way to go for this: you can't just put an expires tag in there and expect the pages to go away. The best option is the meta robots tag, set to either noindex, nofollow or noindex, follow. With this approach, and hopefully a relatively high crawl rate, the data from these pages should be removed from the Google index as soon as possible.

                  If you still want these pages to be indexed but just not crawled anymore, which I don't think you do based on your explanation, then go with robots.txt and exclude the pages you'd like to in there.
