Googlebot Can't Access My Sites After I Repair My Robots File

NiallSmith

Hello Mozzers,

A colleague and I have been jointly managing about 12 brands for the past several months, and we have recently received a number of messages in the sites' Webmaster Tools accounts telling us that 'Googlebot was not able to access our site due to some errors with our robots.txt file'.

My colleague and I, in turn, created new robots.txt files with the intention of preventing the spider from crawling our 'cgi-bin' directory as follows:

User-agent: *

Disallow: /cgi-bin/

After creating the robots.txt file and manually re-submitting it in Webmaster Tools (and receiving the green checkbox), I received the same message about Googlebot not being able to access the site; the only difference was that this time it was for a different site that I manage.

I repeated the process and everything looked correct on the surface; however, I continued receiving these messages for each of the other sites I manage, on a daily basis, for roughly a 10-day period.

Do any of you know why I may be receiving this error? Is it not possible for me to block Googlebot from crawling the 'cgi-bin' directory?

Any and all advice/insight is very much welcome, I hope I'm being descriptive enough!

Igal_Zeifman

Oleg gave a great answer.

Still I would add 2 things here:

1. Go to GWMT and under "Health" do a "Fetch as Googlebot" test.
This will tell you what pages are reachable.

2. I've seen some cases of server-level Googlebot blocking.
If your robots.txt is fine and your page contains no "noindex" tags, and yet you are still getting an error message while fetching, you should get hold of your access logs and check them for Googlebot user-agents to see if (and when) you were last visited.

This will help you pinpoint the issue when talking to your hosting provider (or third-party security vendor).

If unsure, you can find Googlebot information (user agents and IPs) at Botopedia.org.
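For anyone who wants to try the access-log check, here is a minimal Python sketch. It assumes the server writes the common "combined" log format; the `googlebot_hits` helper, the regular expression, and the sample log lines (documentation IP addresses, made-up timestamps) are all hypothetical and only for illustration. Your own log format may differ, so treat the pattern as a starting point.

```python
import re

# Assumed layout (combined log format):
# ip - user [time] "METHOD path protocol" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Return (time, path, status) for each request whose user-agent mentions Googlebot."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "googlebot" in match.group("agent").lower():
            hits.append(
                (match.group("time"), match.group("path"), int(match.group("status")))
            )
    return hits


# Hypothetical sample lines, not taken from any real server.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2012:13:55:36 +0000] "GET /robots.txt HTTP/1.1" 403 199 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2012:13:56:01 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (Windows NT 6.1)"',
    '192.0.2.10 - - [10/Oct/2012:14:02:11 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

if __name__ == "__main__":
    for time, path, status in googlebot_hits(SAMPLE_LOG):
        print(time, path, status)
```

A 403 or 5xx status on /robots.txt for Googlebot requests would point toward a server-level block rather than a problem with the file contents. Keep in mind that the user-agent string can be spoofed, so this only narrows things down; checking the IPs as well is still worthwhile.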

Webrevolve

A great answer

OlegKorneitchouk

Maybe the spacing got changed when you posted it here, but blank lines can affect how robots.txt files are parsed. Try this:

User-agent: *
Disallow: /cgi-bin/
#End Robots#

Also, check for robot blocking meta tags on the individual pages.

You can test whether Google can access specific pages through GWT > Health > Blocked URLs. You should see your robots.txt file contents in the top text area; enter the URLs to test in the second text area, then press "Test" at the bottom. The test results will appear at the bottom of the page.
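The blank-line point can also be checked locally. The sketch below uses Python's standard-library `urllib.robotparser` to compare the file as it was posted in the question (with a blank line between the two directives) against the compact version. The `is_allowed` helper and the example.com URL are hypothetical, and Python's parser is not Googlebot's parser, so this only shows how one strict reader treats the two files.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, agent="Googlebot"):
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The file as it appeared in the question: a blank line splits the record.
AS_POSTED = "User-agent: *\n\nDisallow: /cgi-bin/\n"

# The compact version suggested in the answer.
COMPACT = "User-agent: *\nDisallow: /cgi-bin/\n#End Robots#\n"

# Hypothetical URL used only for the check.
URL = "http://www.example.com/cgi-bin/script.pl"

if __name__ == "__main__":
    print("as posted:", is_allowed(AS_POSTED, URL))
    print("compact:  ", is_allowed(COMPACT, URL))
```

With this parser, the blank line ends the record before the Disallow line is read, so the rule in the first file is ignored and the cgi-bin URL is reported as fetchable, while the compact file blocks it. Google's own parser may be more forgiving, which is why the Blocked URLs test in Webmaster Tools remains the authoritative check.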
