The Moz Q&A Forum

Moz Q&A is closed.

After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we're not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.

Allow or Disallow First in Robots.txt

Technical SEO
  • irvingw
    irvingw last edited by May 4, 2012, 12:35 PM

    If I want to override a Disallow directive in robots.txt with an Allow directive, should the Allow directive come before or after the Disallow directive?

    example:

    Allow: /models/ford/*/*/page*

    Disallow: /models/*/*/*/page

    • Net66SEO
      Net66SEO last edited by Feb 10, 2015, 7:53 AM Feb 10, 2015, 7:53 AM

      I caught this a bit late, and it is probably too late to add anything, but my two pence is to test it in Webmaster Tools via Crawl -> robots.txt Tester. If you've not used this before, simply add the URL you want to test and Google highlights the directive that allows or disallows it.

      • fablau
        fablau @Cyrus-Shepard last edited by Dec 16, 2013, 6:00 PM Dec 16, 2013, 6:00 PM

        Thank you Cyrus. Yes, I have tried your suggested robots.txt checker, and although it validates the file, it shows me a couple of warnings about the "unusual" use of wildcards. It is my understanding that I would probably need to discuss all this with the Google folks directly.

        Thank you for your answer... and, yes Keri, I know this is an old thread, but it is still useful today!

        Thanks 🙂

        • Cyrus-Shepard
          Cyrus-Shepard @fablau last edited by Dec 16, 2013, 4:17 PM Dec 16, 2013, 4:17 PM

          Can't say with 100% confidence, but sounds like it might work. You could always upload it to a server and use a robots.txt checker to validate, although sometimes the validator tools may incorporate slight differences in edge cases like this that make them moot.

          • KeriMorgret
            KeriMorgret @fablau last edited by Dec 16, 2013, 4:02 PM Dec 16, 2013, 4:02 PM

            Just a quick note, this question is actually from spring of 2012.

            • fablau
              fablau last edited by Dec 16, 2013, 3:53 PM Dec 16, 2013, 3:53 PM

              What about something like:

              allow: /directory/$

              disallow: /directory/*

              Where I want this to be indexed:

              http://www.mysite.com/directory/

              But not this:

              http://www.mysite.com/directory/sub-directory/

              Ideas?
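
One way to reason about these two rules is to check which of them match each URL and then apply a longest-pattern-wins policy, which is the length-based rule Google describes. The Python sketch below does that. The regexes are hand translations of the two patterns, and the tie-break (allow wins when both matching patterns have the same length, as they do here at 12 characters each) is an assumption rather than something confirmed in this thread. Other crawlers may behave differently.

```python
import re

# Hand-translated regexes for the two patterns in the question (an assumption
# about how '*' and '$' are interpreted):
#   /directory/$  -> exactly "/directory/"
#   /directory/*  -> anything starting with "/directory/"
RULES = [
    ("allow", "/directory/$", re.compile(r"/directory/$")),
    ("disallow", "/directory/*", re.compile(r"/directory/.*")),
]


def matching_rules(url_path):
    """Return (directive, pattern, length) for every rule matching url_path."""
    return [(d, p, len(p)) for d, p, rx in RULES if rx.match(url_path)]


def verdict(url_path):
    """Longest pattern wins; on equal length prefer allow (assumed tie-break)."""
    matches = matching_rules(url_path)
    if not matches:
        return "allow"  # nothing matched, so crawling is allowed by default
    return max(matches, key=lambda m: (m[2], m[0] == "allow"))[0]


for path in ("/directory/", "/directory/sub-directory/"):
    print(path, matching_rules(path), verdict(path))
```

Under those assumptions, `/directory/` matches both rules and comes out as allow, while `/directory/sub-directory/` matches only the disallow rule.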

              • irvingw
                irvingw @Cyrus-Shepard last edited by May 6, 2012, 9:18 AM May 6, 2012, 9:18 AM

                I really appreciate all the effort you put in to ensure your method was correct. Many thanks.

                • Cyrus-Shepard
                  Cyrus-Shepard last edited by May 6, 2012, 9:18 AM May 6, 2012, 3:05 AM

                  Interesting question - I've had this discussion a couple of times with different SEOs. Here's my best understanding: There are actually 2 different answers - one if you are talking about Google, and one for every other search engine.

                  For most search engines, the "Allow" should come first. This is because the first matching pattern always wins, for the reasons Geoff stated.

                  But Google is different. They state:

                  "At a group-member level, in particular for allow and disallow directives, the most specific rule based on the length of the [path] entry will trump the less specific (shorter) rule. The order of precedence for rules with wildcards is undefined."

                  Robots.txt Specifications - Webmasters - Google Developers

                  So for Google, order is not important, only the specificity of the rule based on the length of the entry. But the order of precedence for rules with wildcards is undefined.

                  This last part is important, because your directives contain wildcards. If I'm reading this right, your particular directives are:

                  Allow: /models/ford/*/*/page*

                  Disallow: /models/*/*/*/page

                  So if it's "undefined", which directive will Google follow if order isn't important? Fortunately, there's a simple way to find out. Google Webmaster Tools allows you to test any robots.txt file. I created a dummy file based on your rules. In this case, your directives worked perfectly no matter what order I put them in.

                  | http://cyrusshepard.com/models/ford/test/test/pages | Allowed by line 2: Allow: /models/ford/*/*/page* | Allowed by line 2: Allow: /models/ford/*/*/page* |
                  | http://cyrusshepard.com/models/chevy/test/test/pages | Blocked by line 3: Disallow: /models/*/*/*/page | Blocked by line 3: Disallow: /models/*/*/*/page |

                  So, to summarize:

                  1. Always put Allow directives first, as most search engines follow the "first rule counts" rule.
                  2. Google doesn't care about order, but rather the specificity based on the length of the entry.
                  3. The order of precedence for rules with wildcards is undefined.
                  4. When in doubt, check your robots.txt file in Google Webmaster Tools.

                  Hope this helps.

                  (Sorry for the very long answer, which basically says you were right all along 🙂)
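
The length-based rule can be sketched in a few lines of Python. This is an illustration, not Google's implementation: `pattern_to_regex` and `longest_match_allowed` are hypothetical helpers, the patterns assume `*` wildcards between the path segments, and applying the length comparison to wildcard patterns is itself an assumption, since the quoted specification calls that case undefined.

```python
import re


def pattern_to_regex(path_pattern):
    """Translate a robots.txt path pattern into a regex (assumed semantics):
    '*' matches any run of characters, a trailing '$' anchors the end,
    and everything else is literal. Matching starts at the path's beginning."""
    anchored = path_pattern.endswith("$")
    body = path_pattern[:-1] if anchored else path_pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def longest_match_allowed(rules, url_path):
    """rules: (directive, pattern) pairs in file order.
    The longest matching pattern decides; file order is never consulted."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(url_path):
            candidate = (len(pattern), directive.lower() == "allow")
            # Tuple comparison: longer pattern first, then allow over disallow.
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


RULES = [
    ("Allow", "/models/ford/*/*/page*"),
    ("Disallow", "/models/*/*/*/page"),
]

for ordering in (RULES, list(reversed(RULES))):
    for path in ("/models/ford/test/test/pages", "/models/chevy/test/test/pages"):
        print([d for d, _ in ordering], path, longest_match_allowed(ordering, path))
```

Because only pattern length is compared, both orderings give the same answer for each path: the ford URL is allowed and the chevy URL is blocked.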

                  • NakulGoyal
                    NakulGoyal @irvingw last edited by May 4, 2012, 1:40 PM May 4, 2012, 1:38 PM

                    I understand your concern. I am basing my answer on the fact that if you don't have a robots.txt at all, Google will still crawl you, which means it's an allow by default. So all that matters, in my opinion, is the disallow. But because you need an exception to the wildcard disallow, you could put that allow first and the disallow next.

                    Honestly, I don't think it matters. If you think about the way a bot works, it's not as if it reads one line of robots.txt, goes crawling, then comes back and reads the next line, and so on. Does that make sense? It reads all the lines in robots.txt and then follows the directives. But to be sure, you can try either scenario and see for yourself. I am sure the results would be the same either way.

                    • zigojacko
                      zigojacko last edited by May 4, 2012, 1:40 PM May 4, 2012, 1:34 PM

                      The allow directives need to come before the disallow directives for the same directory/file paths. (I have never personally tested this although it makes logical sense to instruct a robot to access one particular path within a directory structure before it sees that it is blocked from crawling that directory).

                      For example:-

                      Allow: /profiles

                      Disallow: /s2/profiles/me

                      Allow: /s2/profiles

                      Allow: /s2/photos

                      Allow: /s2/static

                      Disallow: /s2

                      As per how Google have formatted their robots.txt.
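
The effect of ordering on a parser that stops at the first matching rule can be seen with Python's standard-library `urllib.robotparser`, which is understood to apply rules in file order using plain prefix matching (no wildcard support). The sketch below uses two of the rules from the example above. The host `example.com` and the helper `allowed` are made up for illustration, and other crawlers may resolve conflicts differently.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, url, agent="*"):
    """Parse robots.txt lines and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


allow_first = [
    "User-agent: *",
    "Allow: /s2/profiles",
    "Disallow: /s2",
]
disallow_first = [
    "User-agent: *",
    "Disallow: /s2",
    "Allow: /s2/profiles",
]

url = "http://example.com/s2/profiles/me"  # hypothetical host
print("Allow first:   ", allowed(allow_first, url))
print("Disallow first:", allowed(disallow_first, url))
```

With Allow listed first the profile URL is reported as fetchable; with Disallow listed first the same URL is reported as blocked, because the broader `/s2` rule is reached before the exception.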

                      • irvingw
                        irvingw @NakulGoyal last edited by May 4, 2012, 1:31 PM May 4, 2012, 1:31 PM

                        Thanks. I want to make sure I get this right in a syntax universally understood by all engines. I have seen webmasters all over the place on this one, with some saying that crawlers use a first-matching rule and others saying that crawlers use a last-matching rule. I am almost thinking of including the Allow directive twice, before and after the Disallow, to cover all bases.
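
The "Allow twice" idea can be checked against both policies with a small Python sketch. The `evaluate` helper and the simplified prefix paths (no wildcards) are hypothetical and only model a crawler that picks either the first or the last matching rule. They say nothing about how any particular search engine actually behaves.

```python
def evaluate(rules, url_path, policy):
    """rules: (directive, path_prefix) pairs in file order.
    policy: 'first' or 'last' matching rule wins.
    A URL that matches no rule is allowed by default."""
    matches = [directive for directive, prefix in rules if url_path.startswith(prefix)]
    if not matches:
        return True
    chosen = matches[0] if policy == "first" else matches[-1]
    return chosen == "Allow"


# The Allow exception appears both before and after the broader Disallow.
sandwiched = [
    ("Allow", "/models/ford/"),
    ("Disallow", "/models/"),
    ("Allow", "/models/ford/"),
]

for policy in ("first", "last"):
    print(
        policy,
        evaluate(sandwiched, "/models/ford/page1", policy),
        evaluate(sandwiched, "/models/chevy/page1", policy),
    )
```

In this model the duplicated Allow gives the same outcome under either policy: the ford path is allowed and the chevy path is blocked.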

                        • NakulGoyal
                          NakulGoyal last edited by May 4, 2012, 1:21 PM May 4, 2012, 1:21 PM

                          I don't think it matters, but I think I would disallow first, because by default everything is an Allow.
