Robots.txt - "File does not appear to be valid"
-
Good afternoon Mozzers!
I've got a weird problem with one of the sites I'm dealing with. For some reason, one of the developers changed the robots.txt file to disavow every site on the page - not a wise move!
To rectify this, we uploaded the new robots.txt file to the domain's root as per Webmaster Tool's instructions. The live file is: User-agent: * (http://www.savistobathrooms.co.uk/robots.txt)
I've submitted the new file in Webmaster Tools and it's pulling it through correctly in the editor. However, Webmaster Tools is not happy with it, for some reason. I've attached an image of the error.
Does anyone have any ideas? I'm managing another site with the exact same robots.txt file and there are no issues.
Cheers,
Lewis
-
Thanks for the quick response, Patrick. Why, if this robots.txt file is incorrect, does it yield no errors on other sites we use this on?
Cheers,
Lewis
-
Hi there
I want to say that needs an...
Allow: /
...or a "Group 2" specification.
I would take a look at Google Developer's Robots.txt Specifications and see where you have opportunities to remedy this issue.
Hope this helps! Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Utilizing one robots.txt for two sites
I have two sites that are facilitated hosting in similar CMS. Maybe than having two separate robots.txt records (one for every space), my web office has made one which records the sitemaps for the two sites, similar to this:
Technical SEO | | eulabrant0 -
No index tag robots.txt
Hi Mozzers, A client's website has a lot of internal directories defined as /node/*. I already added the rule 'Disallow: /node/*' to the robots.txt file to prevents bots from crawling these pages. However, the pages are already indexed and appear in the search results. In an article of Deepcrawl, they say you can simply add the rule 'Noindex: /node/*' to the robots.txt file, but other sources claim the only way is to add a noindex directive in the meta robots tag of every page. Can someone tell me which is the best way to prevent these pages from getting indexed? Small note: there are more than 100 pages. Thanks!
Technical SEO | | WeAreDigital_BE
Jens0 -
Blocking subdomains with Robots.txt file
We noticed that Google is indexing our pre-production site ibweb.prod.interstatebatteries.com in addition to indexing our main site interstatebatteries.com. Can you all help shed some light on the proper way to no-index our pre-prod site without impacting our live site?
Technical SEO | | paulwatley0 -
"5XX (Server Error)" - How can I fix this?
Hey Mozers! Moz Crawl tells me I am having an issue with my Wordpress category - it is returning a 5XX error and i'm not sure why? Can anyone help me determine the issue? Crawl Issues and Notices for: http://www.refusedcarfinance.com/news/category/news We found 1 crawler issue(s) for this page. High Priority Issues 1 5XX (Server Error) 5XX errors (e.g., a 503 Service Unavailable error) are shown when a valid request was made by the client, but the server failed to complete the request. This can indicate a problem with the server, and should be investigated and fixed.
Technical SEO | | RocketStats0 -
Google displaying "Items 1-9" before the description in the Search Results
We see our pages coming up in Google with the category page/product numbers in front of our descriptions. For example: Items 1 - 24 of 86 (and than the descriptions follows). Our website is magento based. Is there a fix for this that anyone knows of? Is there method of stopping Google from adding this on to the front of our Meta Description?
Technical SEO | | DutchG0 -
Does "?" in my URL have a negative effect?
I am having a difficult time finding specific information about the effect, if any, having a ? within the URL structure. We have the descriptive keyword phrase followed by the ? location id as in this example: http://www.adventuresonly.com/adventure-locations/things-to-do-in-arizona?stateid=124 Any feedback on effect and a corrective process to improve if necessary would be appreciated!
Technical SEO | | RBBonds0 -
"Extremely high number of URLs" warning for robots.txt blocked pages
I have a section of my site that is exclusively for tracking redirects for paid ads. All URLs under this path do a 302 redirect through our ad tracking system: http://www.mysite.com/trackingredirect/blue-widgets?ad_id=1234567 --302--> http://www.mysite.com/blue-widgets This path of the site is blocked by our robots.txt, and none of the pages show up for a site: search. User-agent: * Disallow: /trackingredirect However, I keep receiving messages in Google Webmaster Tools about an "extremely high number of URLs", and the URLs listed are in my redirect directory, which is ostensibly not indexed. If not by robots.txt, how can I keep Googlebot from wasting crawl time on these millions of /trackingredirect/ links?
Technical SEO | | EhrenReilly0 -
My site has a "Reported Web Forgery!" warning
When I check my bing cached page it comes up with a "Reported Web Forgery!" warning. I've looked at google web tools and no malware has been detected. I do have another site that has a very similar web address jaaronwoodcountertops.com and jaaron-wood-countertops.com. Could that be why? How do I go about letting bing and or firefox know this is not a forgery site?
Technical SEO | | JAARON0