The Moz Q&A Forum

Moz Q&A is closed.

After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.

What does Disallow: /french-wines/?* actually do - robots.txt

Intermediate & Advanced SEO
  • McTaggart
    McTaggart last edited by Mar 16, 2017, 10:01 AM

    Hello Mozzers - Just wondering what this robots.txt instruction means: Disallow: /french-wines/?*

    Does it stop Googlebot crawling and indexing URLs in that "French Wines" folder - specifically the URLs that include a question mark?

    Would it stop the crawling of deeper folders - e.g. /french-wines/rhone-region/ that include a question mark in their URL?

    I think this has been done to block URLs containing query strings.

    Thanks, Luke

    • LoganRay
      LoganRay @McTaggart last edited by Mar 21, 2017, 10:39 AM Mar 21, 2017, 10:39 AM

      Glad to help, Luke!

      • McTaggart
        McTaggart @LoganRay last edited by Mar 21, 2017, 10:25 AM Mar 21, 2017, 10:25 AM

        Thanks Logan for your help with this - much appreciated. Really helpful!

        • LoganRay
          LoganRay @McTaggart last edited by Mar 21, 2017, 10:23 AM Mar 16, 2017, 3:25 PM

          Disallow: /?* is the same thing as Disallow: /?. Because the asterisk is a wildcard and robots.txt rules already match as prefixes, both of those disallows prevent any URL that begins with /? from being crawled.
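The equivalence described here can be checked with a very small sketch. It assumes Google-style matching, where a rule is compared against the path plus query string from the first character, and where a trailing * adds nothing because the rule is already a prefix match. The function name `blocked_by_slash_question` is made up for illustration.

```python
def blocked_by_slash_question(path_and_query: str) -> bool:
    """Approximate both Disallow: /? and Disallow: /?*.

    Assumption: a trailing * matches any remainder and rules already
    match as prefixes, so both rules reduce to a plain prefix test.
    """
    return path_and_query.startswith("/?")


# Only URLs whose path is exactly "/" followed by a query string are caught.
for url in ["/?page=2", "/?", "/french-wines/?page=2", "/french-wines/"]:
    print(url, blocked_by_slash_question(url))
```

Note that a parameterised URL inside a folder, such as /french-wines/?page=2, is not caught by either rule, which is one way a disallow can miss what it was meant to block.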

          And yes, it is incredibly easy to disallow the wrong thing! The robots.txt tester in Search Console (under the Crawl menu) is very helpful for figuring out what a disallow will catch and what it will let by. I highly recommend testing any new disallows there before releasing them into the wild.

          • McTaggart
            McTaggart @LoganRay last edited by Mar 16, 2017, 2:02 PM Mar 16, 2017, 2:02 PM

            Thanks again Logan.

            What would Disallow: /?* do? That is what the site I am looking at has implemented. Perhaps it works both ways around?

            I imagine it's easy to disallow the wrong thing or possibly not disallow the right thing. Ugh.

            • LoganRay
              LoganRay @McTaggart last edited by Mar 21, 2017, 10:24 AM Mar 16, 2017, 1:03 PM

              Disallow: /*?

              This disallow literally says to crawlers 'if a URL starts with a slash (all URLs) and has a parameter, don't crawl it'. The * is a wildcard that says anything between / and ? is applicable to the disallow.
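As a rough illustration of that reading, Disallow: /*? can be written as a regular expression: a slash, then anything, then a literal question mark. This is a sketch that assumes Google-style wildcard handling; `RULE` and `is_blocked` are illustrative names, not part of any robots.txt library.

```python
import re

# Disallow: /*?  ->  "/" at the start, any characters, then a literal "?"
RULE = re.compile(r"/.*\?")


def is_blocked(path_and_query: str) -> bool:
    # re.match anchors at the start of the string, like a robots.txt rule.
    return RULE.match(path_and_query) is not None


for url in [
    "/french-wines/?page=2",
    "/french-wines/rhone-region/?sort=price",
    "/?utm_source=newsletter",
    "/french-wines/",
]:
    print(url, is_blocked(url))
```

Under these assumptions every URL with a query string is caught, whatever folder it sits in, while the plain /french-wines/ URL is left crawlable.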

              It's very easy to disallow the wrong thing, especially with regard to parameters. For this reason I always do these 2 things rather than using robots.txt:

              1. Set the purpose of each parameter in Search Console - go to Crawl > URL Parameters to configure this for your site
              2. Self-referring canonicals - most people disallow URLs with parameters in robots.txt to prevent indexing, but this only prevents crawling. A self-referring canonical pointing to the root level of that URL (the version without parameters) will prevent indexing of URLs with parameters.
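The canonical approach in point 2 could be sketched as follows. This is only an illustration of deriving the parameter-free URL and the matching link tag; the helper names `canonical_url` and `canonical_tag` and the example.com address are assumptions, and a real site would normally emit this tag from its CMS or templates.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_tag(url: str) -> str:
    """Build the <link rel="canonical"> tag for the page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


print(canonical_tag("https://www.example.com/french-wines/?sort=price&page=2"))
# prints: <link rel="canonical" href="https://www.example.com/french-wines/">
```

For the tag to be seen at all, the parameterised URL must stay crawlable, so this approach is used instead of a robots.txt disallow rather than alongside it.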

              Hope that's helpful!

              • McTaggart
                McTaggart @LoganRay last edited by Mar 16, 2017, 10:16 AM Mar 16, 2017, 10:16 AM

                Thanks Logan - I was just reading: Disallow: /*? # block any URL that includes a ? (and thus a query string) - do you know why the ? comes before the * in this case?

                • LoganRay
                  LoganRay last edited by Mar 16, 2017, 10:05 AM Mar 16, 2017, 10:05 AM

                  Hi Luke,

                  You are correct that this was done to block URLs with parameters. However, since there's no wildcard (the asterisk) before the folder name, the URL would have to start with /french-wines/. This disallow is really only preventing crawling on the single URL www.yoursite.com/french-wines/ with any parameters appended.
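To see why the rule only touches that one folder URL, here is a minimal sketch of a rule matcher. It assumes Google-style semantics (a * matches any run of characters, a trailing $ anchors the end, everything else is a literal prefix match against the path plus query string). `rule_to_regex` and `is_disallowed` are hypothetical helpers written for this example, and the sketch ignores Allow rules and percent-encoding.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule into a compiled regex.

    Assumption: '*' is a wildcard, a trailing '$' anchors the end of
    the URL, and all other characters are matched literally.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape the literal pieces and join them with ".*" for each wildcard.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_disallowed(rule: str, path_and_query: str) -> bool:
    # re.match anchors at the start, mirroring prefix matching.
    return rule_to_regex(rule).match(path_and_query) is not None


RULE = "/french-wines/?*"
for url in [
    "/french-wines/?page=2",
    "/french-wines/",
    "/french-wines/rhone-region/?page=2",
    "/french-wines/rhone-region/",
]:
    print(url, is_disallowed(RULE, url))
```

Under these assumptions only /french-wines/?page=2 is disallowed. The deeper /french-wines/rhone-region/?page=2 URL is not, because the rule requires the question mark to come immediately after /french-wines/.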

