Google robots.txt test - not picking up syntax errors?
-
I just ran a robots.txt file through "Google robots.txt Tester" as there was some unusual syntax in the file that didn't make any sense to me...
e.g.
/url/?*
/url/?
/url/*
and so on. I would use ? and not ? for example and what is ? for! - etc.
Yet "Google robots.txt Tester" did not highlight the issues...
I then fed the robots.txt file through http://www.searchenginepromotionhelp.com/m/robots-text-tester/robots-checker.php and that tool actually picked up my concerns.
Can anybody explain why Google didn't - or perhaps it isn't supposed to pick up such errors?
Thanks, Luke
-
Many thanks Beau - much appreciated.
-
Hey Luke,
It appears there is a plausible case for each of the three examples. Let's cover each:
- For /url/?* , a URL can have a trailing slash followed by a query string, see examples here.
- With /url/? , this covers the cases above and, in addition, would plausibly block product pages that generate query strings, similar to this example from H&M. In essence, it allows only the product page itself to be seen.
- /url/* , well, that's just anything and everything after the trailing slash.
I guess the question you should ask yourself is "Is this the best approach for the issue?"
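To see why a tester might not flag these rules, here is a rough Python sketch of how the three patterns behave. It assumes Google-style matching, where * matches any run of characters, a trailing $ anchors the end of the URL, and every other character (including ?) is a literal matched as a prefix. The function names and sample paths are made up for illustration; this is not Google's actual implementation.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Convert a robots.txt path pattern to a regex (hypothetical helper).

    Assumed semantics: '*' is a wildcard, a trailing '$' anchors the end,
    and everything else, including '?', is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if the rule applies to the path (prefix match)."""
    return rule_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    rules = ["/url/?*", "/url/?", "/url/*"]
    paths = ["/url/", "/url/?id=1", "/url/page"]  # illustrative paths only
    for rule in rules:
        for path in paths:
            print(f"{rule!r} vs {path!r}: {rule_matches(rule, path)}")
```

Under these assumptions /url/?* and /url/? match exactly the same URLs, and /url/* matches the same URLs as a plain /url/ would, so the rules are redundant rather than invalid.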