Google robots.txt test - not picking up syntax errors?
-
I just ran a robots.txt file through "Google robots.txt Tester" as there was some unusual syntax in the file that didn't make any sense to me...
e.g. /url/?*
/url/?
/url/*
and so on. I would use ? and not ? for example and what is ? for! - etc.
Yet "Google robots.txt Tester" did not highlight the issues...
I then fed the sitemap through http://www.searchenginepromotionhelp.com/m/robots-text-tester/robots-checker.php and that tool actually picked up my concerns.
Can anybody explain why Google didn't - or perhaps it isn't supposed to pick up such errors?
Thanks, Luke
-
Many thanks Beau - much appreciated.
-
Hey Luke,
Each of the three examples has a plausible use case. Let's cover each:
- /url/?* matches a URL that has a trailing slash followed by a query string; see examples here.
- /url/? covers the cases above and, in addition, would plausibly block product pages that generate query strings, similar to this example from H&M. In essence, it allows only the product page itself to be seen.
- /url/* matches anything and everything after the trailing slash.
I guess the question you should ask yourself is "Is this the best approach for the issue?"
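To make the three patterns concrete, here is a minimal Python sketch of how a Google-style matcher could treat them. It assumes the wildcard rules Google documents for robots.txt: * matches any run of characters, a trailing $ anchors the end of the URL, ? is just a literal character, and every rule matches as a prefix. The function names robots_pattern_to_regex and rule_matches are made up for illustration and are not part of any Google tool.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: Google-style matching, where '*' matches any run of
    characters, a trailing '$' anchors the end of the URL, and every
    other character (including '?') is treated literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if the rule pattern applies to the given URL path."""
    return robots_pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    for rule in ("/url/?*", "/url/?", "/url/*"):
        for path in ("/url/", "/url/page", "/url/?colour=red"):
            print(f"{rule!r:12} vs {path!r:20} -> {rule_matches(rule, path)}")
```

Under these assumptions /url/?* and /url/? match exactly the same paths, because a rule is already a prefix match and the trailing * adds nothing. Likewise /url/* behaves the same as /url/. That would make the patterns redundant rather than invalid, which may be why the Google tester did not flag them.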