Googlebot Can't Access My Sites After I Repair My Robots File

NiallSmith

Hello Mozzers,

A colleague and I have been jointly managing about 12 brands for the past several months, and we have recently received a number of messages in the sites' Webmaster Tools accounts telling us that 'Googlebot was not able to access our site due to some errors with our robots.txt file'.

My colleague and I, in turn, created new robots.txt files with the intention of preventing the spider from crawling our 'cgi-bin' directory as follows:

User-agent: *

Disallow: /cgi-bin/

After creating the robots.txt file and manually re-submitting it in Webmaster Tools (and receiving the green checkbox), I received the same message about Googlebot not being able to access the site; the only difference was that this time it was for a different site that I manage.

I repeated the process and everything looked correct on the surface. However, I continued receiving these messages for each of the other sites I manage, on a daily basis, for roughly 10 days.

Do any of you know why I may be receiving this error? Is it not possible for me to block Googlebot from crawling the 'cgi-bin' directory?

Any and all advice/insight is very much welcome. I hope I'm being descriptive enough!

Igal_Zeifman

Oleg gave a great answer.

Still, I would add two things here:

1. Go to GWMT and under "Health" do a "Fetch as Googlebot" test.
This will tell you what pages are reachable.

2. I've seen some cases of server-level Googlebot blocking.
If your robots.txt is fine and your page contains no "noindex" tags, and yet you are still getting an error message while fetching, you should get hold of your access logs and check them for Googlebot user agents to see if (and when) you were last visited.

This will help you pinpoint the issue when talking to your hosting provider (or third-party security vendor).

If unsure, you can find Googlebot information (user agent and IPs) at Botopedia.org.
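
For the access-log check in point 2, here is a rough Python sketch of how it could be done. It assumes your server writes the common Apache/Nginx "combined" log format; the `googlebot_hits` function and the two sample log lines (using documentation-range IP addresses) are made up for illustration, so adjust the pattern to match your own logs. Keep in mind that a user-agent string can be spoofed, so this is only a first pass before comparing IPs.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: logs are in the "combined" format used by default on many
# Apache and Nginx setups. Change this pattern if your format differs.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines: Iterable[str]) -> List[Tuple[str, str, int]]:
    """Return (timestamp, request line, HTTP status) for each log line
    whose user agent mentions Googlebot."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and "googlebot" in match.group("agent").lower():
            hits.append(
                (match.group("time"), match.group("request"), int(match.group("status")))
            )
    return hits


# Hypothetical sample lines, not real traffic.
sample = [
    '192.0.2.10 - - [10/Oct/2012:13:55:36 +0000] "GET /robots.txt HTTP/1.1" 503 0 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.20 - - [10/Oct/2012:13:56:01 +0000] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (Windows NT 6.1)"',
]

for timestamp, request, status in googlebot_hits(sample):
    print(timestamp, request, status)
```

On a real server you would pass an open log file instead of `sample`. Requests for `/robots.txt` that come back with an error status are the ones worth raising with your host.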

Webrevolve

A great answer

OlegKorneitchouk

Maybe the spacing got thrown off when you posted it here, but blank lines can affect robots.txt files. Try this:

User-agent: *
Disallow: /cgi-bin/
#End Robots#

Also, check for robot blocking meta tags on the individual pages.

You can test to see if Google can access specific pages through GWT > Health > Blocked URLs (you should see your robots.txt file contents in the top text area; enter the URLs to test in the second text area, then press "Test" at the bottom. The test results will appear at the bottom of the page).
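
If you want a quick offline sanity check as well, a minimal sketch using Python's standard `urllib.robotparser` is below. The `is_allowed` helper and the `www.example.com` URLs are placeholders for illustration. This is Python's parser, not Google's, so treat it as a rough check of the rules rather than a prediction of what Googlebot will do. Note that Python's parser treats a blank line as the end of a record, so it may read a spaced-out version of the file differently from the compact one.

```python
from urllib.robotparser import RobotFileParser

# The compact robots.txt suggested above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
#End Robots#
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt content from a string and report whether
    user_agent may fetch url according to Python's stdlib parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URLs; swap in pages from your own site.
for url in (
    "http://www.example.com/cgi-bin/script.pl",
    "http://www.example.com/products/",
):
    print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

With the rules above, the `/cgi-bin/` URL should come back as blocked and the other as allowed.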
