Robots.txt - "File does not appear to be valid"
-
Good afternoon Mozzers!
I've got a weird problem with one of the sites I'm dealing with. For some reason, one of the developers changed the robots.txt file to disallow every page on the site - not a wise move!
To rectify this, we uploaded the new robots.txt file to the domain's root as per Webmaster Tools' instructions. The live file is: User-agent: * (http://www.savistobathrooms.co.uk/robots.txt)
I've submitted the new file in Webmaster Tools and it's pulling it through correctly in the editor. However, Webmaster Tools is not happy with it, for some reason. I've attached an image of the error.
Does anyone have any ideas? I'm managing another site with the exact same robots.txt file and there are no issues.
Cheers,
Lewis
-
Thanks for the quick response, Patrick. Why, if this robots.txt file is incorrect, does it yield no errors on other sites we use this on?
Cheers,
Lewis
-
Hi there
I want to say that needs an...
Allow: /
...or a "Group 2" specification.
I would take a look at Google Developers' Robots.txt Specifications and see where you have opportunities to remedy this issue.
Hope this helps! Good luck!
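If it helps, here is a rough way to sanity-check a candidate file locally before uploading it. This is only a sketch using Python's standard-library urllib.robotparser, which is not the validator Webmaster Tools uses, so a pass here does not guarantee Webmaster Tools will accept the file. The helper name is_allowed and the two sample files are made up for illustration; only the domain comes from the question above.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper for illustration: it parses the text with Python's
    built-in parser, not with Google's own robots.txt implementation.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Domain taken from the question; the file contents below are assumptions.
SITE = "http://www.savistobathrooms.co.uk/"

# The suggested fix: a user-agent group with an explicit Allow rule.
ALLOW_ALL = "User-agent: *\nAllow: /"

# What the developer's change amounted to: blocking the whole site.
BLOCK_ALL = "User-agent: *\nDisallow: /"

print(is_allowed(ALLOW_ALL, SITE))  # True
print(is_allowed(BLOCK_ALL, SITE))  # False
```

Swapping in the exact text of your live file for ALLOW_ALL should show whether a standard parser reads it the way you expect.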