Do I have a robots.txt problem?
-
I have the little yellow exclamation point under my robots.txt fetch, as you can see here: http://imgur.com/wuWdtvO
This version shows no errors or warnings: http://imgur.com/uqbmbug
In the tester I can currently see the latest version. This site hasn't changed URLs recently, and we haven't made any changes to the robots.txt file in two years. This problem only started in the last month. Should I worry?
-
Today it has a green check mark, and absolutely no changes were made to the website since I asked this question.
-
It could be that your server was struggling when Google tried to fetch your robots.txt file, which is why the fetch failed. As long as the issue doesn't keep Google from fetching the file in the future, it's not much to worry about.
-
That would make me more confident that a false error was being reported. Time to closely monitor the crawl logs, look at server stats, and keep an eye on GWT for a change in the reporting/indexing. I would also post in the GWT forums to see if anyone has reported a similar error in the past couple of days.
-
I can't post the domain but I know it is accessible.
When I go to the tester, it shows the live robots.txt with no problems. I can also look at the server logs and see that the site is being crawled, but less often than Bing crawls it. Bing Webmaster Tools is also showing no problems.
-
Can you post your domain? Manually checking the robots.txt file would help.
I've checked many of my GWT accounts and none of them is showing a sudden robots.txt error. It could be a false error, but I would take anything involving the robots.txt file seriously. You'll want to make sure that it is in fact accessible to all the crawlers you want to reach the site.
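Since the domain can't be posted, here is a minimal sketch of how the "is it accessible to the crawlers you want" check could be done locally with Python's standard urllib.robotparser. The robots.txt text, the example.com URLs, and the helper name crawler_access are all assumptions for illustration, not the poster's real file. The parser's matching rules may differ from Googlebot's in edge cases, so treat the result as a sanity check only.

```python
from urllib.robotparser import RobotFileParser


def crawler_access(robots_text, url, agents):
    """Return {agent: allowed?} for one URL, given robots.txt content as text."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_text.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt standing in for the real (unposted) file.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /checkout/
"""

AGENTS = ["Googlebot", "bingbot"]

print(crawler_access(SAMPLE_ROBOTS, "https://www.example.com/", AGENTS))
print(crawler_access(SAMPLE_ROBOTS, "https://www.example.com/checkout/cart", AGENTS))
```

To test the live file instead of pasted text, RobotFileParser also offers set_url() and read(), which download the robots.txt before the can_fetch() calls.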
Related Questions
-
Robots User-agent Query
Am I correct in saying that the allow/disallow is only applied to msnbot_mobile?
Mobile robots file:
User-agent: Googlebot-Mobile
User-agent: YahooSeeker/M1A1-R2D2
User-agent: MSNBOT_Mobile
Allow: /
Disallow: /1
Disallow: /2/
Disallow: /3
Disallow: /4/
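Under the Robots Exclusion Protocol (RFC 9309), consecutive User-agent lines that come before a set of rules form one group, so the rules apply to every agent named in that group, not only the last one. The sketch below is a deliberately small, hypothetical parser (parse_groups is not a library function) that only illustrates the grouping; it does not evaluate which URLs are allowed.

```python
def parse_groups(robots_text):
    """Split robots.txt text into groups of {"agents": [...], "rules": [...]}."""
    groups = []
    current = None
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            # A User-agent line after rules starts a new group;
            # otherwise it joins the group currently being built.
            if current is None or current["rules"]:
                current = {"agents": [], "rules": []}
                groups.append(current)
            current["agents"].append(value)
        elif field in ("allow", "disallow") and current is not None:
            current["rules"].append((field, value))
    return groups


MOBILE_ROBOTS = """\
User-agent: Googlebot-Mobile
User-agent: YahooSeeker/M1A1-R2D2
User-agent: MSNBOT_Mobile
Allow: /
Disallow: /1
Disallow: /2/
Disallow: /3
Disallow: /4/
"""

for group in parse_groups(MOBILE_ROBOTS):
    print(group["agents"], "->", len(group["rules"]), "rules")
```

All three agents end up in a single group that shares the five rules.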
Technical SEO | | ThomasHarvey1 -
Robots file set up
The robots file looks like it has been set up in a very messy way.
Technical SEO | | mcwork
I understand that # comments out a line. Does this mean the sitemap would not be picked up?
Disallow: /js/ - should this be allowed, like /*.js$?
Disallow: /media/wysiwyg/ - this seems to be causing alerts in Webmaster Tools, as it cannot access the images within.
Can anyone help me clean this up, please?
#Sitemap: https://examplesite.com/sitemap.xml
# Crawlers Setup
User-agent: *
Crawl-delay: 10
# Allowable Index
# Mind that Allow is not an official standard
Allow: /index.php/blog/
Allow: /catalog/seo_sitemap/category/
Allow: /catalogsearch/result/
Allow: /media/catalog/
# Directories
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /js/
Disallow: /lib/
Disallow: /magento/
Disallow: /media/
Disallow: /media/captcha/
Disallow: /media/catalog/
#Disallow: /media/css/
#Disallow: /media/css_secure/
Disallow: /media/customer/
Disallow: /media/dhl/
Disallow: /media/downloadable/
Disallow: /media/import/
#Disallow: /media/js/
Disallow: /media/pdf/
Disallow: /media/sales/
Disallow: /media/tmp/
Disallow: /media/wysiwyg/
Disallow: /media/xmlconnect/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /scripts/
Disallow: /shell/
#Disallow: /skin/
Disallow: /stats/
Disallow: /var/
# Paths (clean URLs)
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalog/product/gallery/
Disallow: */catalog/product/upload/
Disallow: /catalogsearch/
Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
# Files
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt
Disallow: /get.php # Magento 1.5+
# Paths (no clean URLs)
#Disallow: /.js$
#Disallow: /.css$
Disallow: /.php$
Disallow: /?SID=
Disallow: /rss*
Disallow: /*PHPSESSID
Disallow: /:
Disallow: /:*
User-agent: Fatbot
Disallow: /
User-agent: TwengaBot-2.0
Disallow: /
-
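On the # question in the robots file post above: a line that begins with # is a comment, so a commented-out Sitemap line is not read by a parser. Here is a minimal sketch using Python's standard urllib.robotparser (site_maps() needs Python 3.8 or later). The two short robots.txt strings are trimmed-down, hypothetical stand-ins for the posted file, and declared_sitemaps is an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser


def declared_sitemaps(robots_text):
    """Return the Sitemap URLs a parser actually sees in robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no Sitemap line was found.
    return parser.site_maps() or []


# Trimmed-down stand-in for the posted file, with the Sitemap line commented out.
COMMENTED = """\
#Sitemap: https://examplesite.com/sitemap.xml
User-agent: *
Disallow: /js/
"""

# Same content with the leading # removed from the Sitemap line.
ACTIVE = COMMENTED.replace("#Sitemap", "Sitemap", 1)

print(declared_sitemaps(COMMENTED))
print(declared_sitemaps(ACTIVE))
```

With the # in place the parser reports no sitemap; removing it makes the URL visible again.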
Auto-generated content problem?
Hi all, I operate a Dutch website (sneeuwsporter.nl). The website is a database of European ski resorts and accommodations (hotels, chalets, etc.). We launched about a month ago with a database of 1700+ accommodations. For every accommodation we collected general information, such as what village it is in, how far it is from the city centre, and how many stars it has. This information is shown in a list on the right of each page (e.g. http://www.sneeuwsporter.nl/oostenrijk/zillertal-3000/mayrhofen/appartementen-meckyheim/). In addition, a text about the accommodation is auto-generated based on some of the properties that are also in the list (distance, stars, etc.).
Below the paragraph about the accommodation is a paragraph about the village the accommodation is located in. This is a general text that is the same for all the accommodations in that village. Below that is a general text about the resort area, which is also identical on all the accommodation pages in the area. So a lot of these texts about the village and area are used many times on different pages.
Things went well at first: every day we got more Google traffic, and more and more pages. But a few days ago our organic traffic took a near-100% dive. We are hardly listed anymore, and when we are, it is at very low positions. We suspect that Google gave us a penalty, for two reasons:
1. We have auto-generated text that only varies slightly per page.
2. We re-use the content about villages and areas on many pages.
We quickly removed the content about the villages and resort areas because we are pretty sure that this is something Google does not want. We are less sure about the auto-generated content. Is this something we should remove as well? These are normal, readable texts; they just happen to be structured more or less the same way on every page.
Finally, once we have made these and maybe some other fixes, what is the best and quickest way to let Google see us again and show them we have improved? Thanks in advance!
Technical SEO | | sneeuwsporter0 -
How often is robots.txt fetched by crawlers, especially Google?
Hi, do you know if there's any way to track how often the robots.txt file has been crawled? I know we can check when it was last downloaded in Webmaster Tools, but I actually want to know whether crawlers download it every time they visit any page on the site (e.g. hundreds of thousands of times every day), or less often. Thanks...
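One way to measure this yourself is to count robots.txt requests per crawler in the server access log. The sketch below assumes the common "combined" log format and uses made-up sample lines; the helper name count_robots_fetches, the IP addresses, and the simple substring checks on the user agent are all illustrative assumptions, and user-agent strings can be spoofed.

```python
import re
from collections import Counter

# Request line, status, size, referrer, user agent (combined log format).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_robots_fetches(lines):
    """Count requests for /robots.txt, bucketed by a rough crawler name."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        # Ignore any query string when comparing the path.
        if match.group("path").split("?", 1)[0] != "/robots.txt":
            continue
        agent = match.group("agent")
        if "Googlebot" in agent:
            counts["Googlebot"] += 1
        elif "bingbot" in agent:
            counts["bingbot"] += 1
        else:
            counts["other"] += 1
    return counts


# Made-up sample lines; replace with lines read from the real access log.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Oct/2015:13:55:36 +0000] "GET /robots.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Oct/2015:13:55:40 +0000] "GET /products/ HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '157.55.39.1 - - [10/Oct/2015:14:01:02 +0000] "GET /robots.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
]

print(count_robots_fetches(SAMPLE_LOG))
```

Comparing these counts with the total number of crawler page requests over the same period shows whether robots.txt is fetched on every visit or only occasionally.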
Technical SEO | | linklater0 -
Should I add my blog posts to my sitemap.txt file?
This seems like it should be an obvious no, just because of the amount of work that would entail, and then remembering to do it every time I make a post, but since I couldn't find anything on Google about it and have never heard anyone mention it, I figured I'd ask.
Technical SEO | | UnderRugSwept0 -
Robots.txt
Hello everyone, the problem I'm having is not knowing where to put the robots.txt file on our server. We have our main domain (company.com) with a robots.txt file in the root of the site, but we also have our blog (company.com/blog), where we're trying to disallow certain directories from being crawled for SEO purposes. Would the blog, being in a sub-directory, still need its own robots.txt? Or can I reference the directories I don't want crawled within the blog using the root robots.txt file? Thanks for your insight on this matter.
Technical SEO | | BailHotline0 -
Problem of printer friendly version.
For one of our client's sites, most of the backlinks point to the printer-friendly version of a page. I recommended that he use the canonical tag on the printer-friendly version, pointing to the other page. Luckily, while searching I came across this post: http://www.seomoz.org/q/solving-printer-friendly-version
The solution recommended was this: <link type="text/css" rel="stylesheet" media="print" href="our-print-version.css">
My questions are:
1. What should I write in place of our-print-version.css? Should it be print.css?
2. Where do I place this code? In which file?
Technical SEO | | seoug_20050 -
301 redirect .htaccess problem
Can anyone explain to me why this doesn't work?
Redirect 301 /category/diamond-pendants/nstart/1/start/(.*) http://www.povada.com/category/pendants/nstart/1/start/$1
I'm trying to capture everything after /start/ and insert it into the new URL. Thanks in advance.
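A likely cause: Apache's Redirect directive matches a literal URL-path prefix and does not interpret (.*) or $1 as a regular expression, whereas RedirectMatch does take a regex with capture groups. The Python sketch below only mirrors the intended capture-and-substitute logic so it can be checked outside the server; the rewrite helper is hypothetical and this is not Apache configuration.

```python
import re

# Same pattern the post intends: capture everything after /start/.
PATTERN = re.compile(r"^/category/diamond-pendants/nstart/1/start/(.*)$")
# \1 plays the role of $1 in the Apache target.
TARGET = r"http://www.povada.com/category/pendants/nstart/1/start/\1"


def rewrite(path):
    """Return the redirect target for a matching path, or None if no match."""
    match = PATTERN.match(path)
    if match is None:
        return None
    return match.expand(TARGET)


print(rewrite("/category/diamond-pendants/nstart/1/start/24"))
print(rewrite("/category/rings/"))
```

Checking a few real paths this way before editing .htaccess helps confirm the pattern captures what is expected.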
Technical SEO | | 13375auc30