Do I have a robots.txt problem?
-
I have the little yellow exclamation point under my robots.txt fetch as you can see here- http://imgur.com/wuWdtvO
This version shows no errors or warnings- http://imgur.com/uqbmbug
Under the tester I can currently see the latest version. This site hasn't changed URLs recently, and we haven't made any changes to the robots.txt file for two years. This problem just started in the last month. Should I worry?
-
Today it has a green check mark, and absolutely no changes were made to the website since I asked this question.
-
It could be that your server had a hard time when Google tried to view your robots.txt file that's why it wouldn't be able to fetch it. As long as this issue doesn't prevent Google anymore in the future it's not much to worry about.
-
That would make me feel more confident of a false error being reported. Time to closely monitor the crawl logs, look at server stats, and keep an eye on GWT for a change in the reporting/indexing. I would also go into the GWT forums and post, see if anyone is reporting a similar error these past couple days.
-
I can't post the domain but I know it is accessible.
When I go to the tester it shows the live robots.txt with no problems. I also can look at the server logs and see that it is being crawled, but being crawled less then Bing Crawls. Also the Bing Webmaster Tools is showing no problems.
-
Can you post your domain? Manually checking the robots.txt file would help.
I've checked many of my GWT accounts and I am not showing a sudden robots.txt error. It could be a false error, but I would take anything with the robots.txt file seriously. You'll want to make sure that it is in fact accessible to all the crawlers desired.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Little confused regarding robots.txt
Hi there Mozzers! As a newbie, I have a question that what could happen if I write my robots.txt file like this... User-agent: * Allow: / Disallow: /abc-1/ Disallow: /bcd/ Disallow: /agd1/ User-agent: * Disallow: / Hope to hear from you...
Technical SEO | | DenorL0 -
I hope someone can help me with page indexing problem
I have a problem with all video pages on www.tadibrothers.com.
Technical SEO | | TadiBrothers
I can not understand why google do not index all the video pages?
I never blocked them with the robots.txt file, there are no noindex/nofollow tags on the pages. The only video page that I found in search results is the main video category page: https://www.tadibrothers.com/videos and 1 video page out of 150 videos: https://www.tadibrothers.com/video/front-side-rear-view-cameras-for-backup-camera-systems I hope someone can point me to the right way0 -
Sub Domains and Robot.txt files...
This is going to seem like a stupid question, and perhaps it is but I am pulling out what little hair I have left. I have a sub level domain on which a website sits. The Main domain has a robots.txt file that disallows all robots. It has been two weeks, I submitted the sitemap through webmaster tools and still, Google has not indexed the sub domain website. My question is, could the robots.txt file on the main domain be affecting the crawlability of the website on the sub domain? I wouldn't have thought so but I can find nothing else. Thanks in advance.
Technical SEO | | Vizergy0 -
Robots.txt to disallow /index.php/ path
Hi SEOmoz, I have a problem with my Joomla site (yeah - me too!). I get a large amount of /index.php/ urls despite using a program to handle these issues. The URLs cause indexation errors with google (404). Now, I fixed this issue once before, but the problem persist. So I thought, instead of wasting more time, couldnt I just disallow all paths containing /index.php/ ?. I don't use that extension, but would it cause me any problems from an SEO perspective? How do I disallow all index.php's? Is it a simple: Disallow: /index.php/
Technical SEO | | Mikkehl0 -
I am getting an error message from Google Webmaster Tools and I don't know what to do to correct the problem
The message is:
Technical SEO | | whitegyr
"Dear site owner or webmaster of http://www.whitegyr.com/, We've detected that some of your site's pages may be using techniques that are outside Google's Webmaster Guidelines. If you have any questions about how to resolve this issue, please see our Webmaster Help Forum for support. Sincerely, Google Search Quality Team" I have always tried to follow Google's guidelines and don't know what I am doing wrong, I have eight different websites all getting this warning and I don't know what is wrong, is there anyone you know that will look at my sites and advise me what I need to do to correct the problem? Website with this warning:
artistalaska.com
cosmeticshandbook.com
homewindpower.ws
montanalandsale.com
outdoorpizzaoven.net
shoes-place.com
silverstatepost.com
www.whitegyr.com0 -
Getting home page content at top of what robots see
When I click on the text-only cache of nlpca(dot)com on the home page http://webcache.googleusercontent.com/search?q=cache:UIJER7OJFzYJ:www.nlpca.com/&hl=en&gl=us&strip=1 our H1 and body content are at the very bottom. How do we get the h1 and content at the top of what the robots see? Thanks!
Technical SEO | | BobGW0 -
Subdomain Removal in Robots.txt with Conditional Logic??
I would like to see if there is a way to add conditional logic to the robots.txt file so that when we push from DEV to PRODUCTION and the robots.txt file is pushed, we don't have to remember to NOT push the robots.txt file OR edit it when it goes live. My specific situation is this: I have www.website.com, dev.website.com and new.website.com and somehow google has indexed the DEV.website.com and NEW.website.com and I'd like these to be removed from google's index as they are causing duplicate content. Should I: a) add 2 new GWT entries for DEV.website.com and NEW.website.com and VERIFY ownership - if I do this, then when the files are pushed to LIVE won't the files contain the VERIFY META CODE for the DEV version even though it's now LIVE? (hope that makes sense) b) write a robots.txt file that specifies "DISALLOW: DEV.website.com/" is that possible? I have only seen examples of DISALLOW with a "/" in the beginning... Hope this makes sense, can really use the help! I'm on a Windows Server 2008 box running ColdFusion websites.
Technical SEO | | ErnieB0 -
Robots.txt
Hi everyone, I just want to check something. If you have this entered into your robots.txt file: User-agent: *
Technical SEO | | PeterM22
Disallow: /fred/ This wouldn't block /fred-review/ from being crawled would it? Thanks0