    The Moz Q&A Forum

    Moz Q&A is closed.

    After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.

    Removing Dynamic "noindex" URL's from Index

    Intermediate & Advanced SEO
    • BeTheBoss

      Six months ago my client's site was overhauled, and the user-generated search pages were given an index tag. I switched that to noindex, but not quickly enough to keep hundreds of pages from being indexed in Google.

      It's been months since switching to the noindex tag and the pages are still indexed. What would you recommend? Google crawls my site daily - but never the pages that I want removed from the index.

      I am trying to avoid submitting hundreds of these dynamic URLs to the removal tool in Webmaster Tools. Suggestions?

      • Dr-Pete (Staff), replying to @BeTheBoss

        Hooray! Usually, I just give my advice and then run away, so it's always nice to hear I was actually right about something 😉 Seriously, glad you got it sorted out.

        • BeTheBoss, replying to @Dr-Pete

          Just a follow up to your suggestion.

          I created sitemaps for the pages I want removed using the importXML function in Google Spreadsheets, which saved a lot of time.

          It took a couple of weeks, but all of the pages, and similar pages, have been removed from the index. That includes the similar pages I hadn't had a chance to put in the sitemap yet (importXML limits the results to 100).

          Your suggestion worked!

          • BeTheBoss, replying to @benjaminspak

            I can't 404 dynamic search pages.

            • BeTheBoss, replying to @Dr-Pete

              There is a mix of search pages and old mobile pages.

              For the search pages, I've been testing having the canonical point to the default search page. I've seen a slight drop in these pages, but I guess I just have to be more patient.

              For the other pages, the path is no longer there, as you mentioned. I like the idea of setting up the XML sitemap; I never even thought of making a sitemap of bad/indexed pages. I will give that a shot! Thankfully this will be a quick job with the importXML function in Google Spreadsheets. Great tip, hopefully it'll work.

              • Dr-Pete (Staff)

                Is there a crawl path to them currently? One issue I see a lot is that a bunch of pages get indexed, the crawl path is found and cut off, and NOINDEX (or a canonical, 301, etc.) is added, but then the pages never get re-crawled. Since they don't get re-crawled, the page-level directive never gets honored.

                If there's a URL parameter involved, you could use parameter handling in Google Webmaster Tools (GWT). It's not a perfect solution, but it sometimes seems to work without a re-crawl.

                The other option would be to create a new XML sitemap with all of the bad/indexed URLs. This may push Google to re-crawl them and then see the tags to deindex. It's a bit safer than re-opening the crawl paths.
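
                As a rough illustration of the sitemap idea, here is a minimal Python sketch that turns a list of stale URLs into a sitemap file body. The function name `build_sitemap` and the example.com URLs are made up for illustration; only the `urlset`/`url`/`loc` structure comes from the sitemaps.org protocol.

```python
from xml.sax.saxutils import escape


def build_sitemap(urls):
    """Return a minimal sitemap XML string listing the given URLs."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        # Escape &, < and > so URLs with query strings stay valid XML.
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines)


# Hypothetical URLs standing in for the bad/indexed pages.
stale_urls = [
    "http://www.example.com/search?q=red+widgets&page=2",
    "http://www.example.com/m/old-page.html",
]
sitemap_xml = build_sitemap(stale_urls)
print(sitemap_xml)
```

                The output would be saved as its own file and submitted separately from the main sitemap, so it can be dropped once the pages are out of the index.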

                If they are being crawled and Google is just ignoring the NOINDEX for some reason, I'd try to 301 or canonical those pages to a primary search page, if that's feasible (probably canonical, since you don't want to redirect users). Sometimes, if a signal hasn't worked for that long, you just have to shake Google and try a different signal. Even when you follow their exact recommendations, things rarely work as planned at large scale.
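
                For the canonical route, a small sketch of how the tag could be generated for each dynamic search URL. It assumes the primary search page is simply the same path with the query string removed, which may not match the real site; `canonical_tag` and the example.com URL are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_tag(url):
    """Build a canonical link tag pointing at the URL without query or fragment."""
    parts = urlsplit(url)
    # Assumption: the primary search page lives at the bare path.
    target = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(target, quote=True)}">'


tag = canonical_tag("http://www.example.com/search?q=red+widgets&page=2")
print(tag)
```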

                • MagicDude4Eva

                  Don't use GWMT's removal tool to remove URLs that should not be in the index (unless they expose sensitive information). Best practice is to exclude them in robots.txt and also to ensure that the pages either return a 404 or have a noindex,noarchive tag.
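
                  To check which of those two signals a page is actually sending, something like the following Python sketch could be run over a fetched response. It works on a status code and an HTML string you supply, so how the page gets fetched is left out; `RobotsMetaParser` and `removal_signal` are illustrative names, not part of any tool mentioned here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() != "robots":
            return
        content = attr_map.get("content") or ""
        # Directives are comma separated, e.g. "noindex,noarchive".
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def removal_signal(status_code, html):
    """Return "404", "noindex", or "none" for a fetched page."""
    if status_code == 404:
        return "404"
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" if "noindex" in parser.directives else "none"


page = '<html><head><meta name="robots" content="noindex,noarchive"></head></html>'
print(removal_signal(200, page))
```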

                  • benjaminspak

                    Change the site structure and let the pages 404; Google will deindex them if they are not being linked to.

                    • AgentsofValue

                      You could try adding the pages you want to remove to your robots.txt file.  Since you're not linking to them, and it's very unlikely that Googlebot will index those pages naturally now, this might be a better way of telling it which pages to explicitly not index.

                      I'm not really sure how quickly this will trigger Google to remove those pages from the index, but they do reference robots.txt on the actual "Remove URLs" page of WMT: "Use robots.txt to specify how search engines should crawl your site, or request removal of URLs from Google's search results ..."

                      For that technique, you'd want to add something like this for all of the pages you want to remove:

                      Disallow: /oldpage1toremove.php

                      That should work.  If it doesn't, then I would probably just submit the requests through the "Remove URLs" tool.
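
                      Before deploying a rule like the one above, it can be sanity-checked with Python's standard `urllib.robotparser`. This is only a sketch: the `User-agent: *` line and the example.com URLs are assumptions added to make a complete robots.txt. Note that a Disallow rule stops crawling, so Googlebot would not see a noindex tag on a blocked page.

```python
from urllib.robotparser import RobotFileParser

# The Disallow line is from the post; the User-agent line is assumed.
rules = """\
User-agent: *
Disallow: /oldpage1toremove.php
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# can_fetch() returns True when the given user agent may crawl the URL.
old_page_crawlable = parser.can_fetch(
    "Googlebot", "http://www.example.com/oldpage1toremove.php"
)
other_page_crawlable = parser.can_fetch(
    "Googlebot", "http://www.example.com/keep.php"
)

print(old_page_crawlable, other_page_crawlable)
```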
