Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Oh no, Googlebot cannot access my robots.txt file
-
I just received an error message from Google Webmaster Tools.
I wonder if it has something to do with the Yoast plugin.
Could somebody help me troubleshoot this?
Here's the original message:
Over the last 24 hours, Googlebot encountered 189 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%.
Recommended action
If the site error rate is 100%:
- Using a web browser, attempt to access http://www.soobumimphotography.com//robots.txt. If you are able to access it from your browser, then your site may be configured to deny access to googlebot. Check the configuration of your firewall and site to ensure that you are not denying access to googlebot.
- If your robots.txt is a static page, verify that your web service has proper permissions to access the file.
- If your robots.txt is dynamically generated, verify that the scripts that generate the robots.txt are properly configured and have permission to run. Check the logs for your website to see if your scripts are failing, and if so attempt to diagnose the cause of the failure.
If the site error rate is less than 100%:
- Using Webmaster Tools, find a day with a high error rate and examine the logs for your web server for that day. Look for errors accessing robots.txt in the logs for that day and fix the causes of those errors.
- The most likely explanation is that your site is overloaded. Contact your hosting provider and discuss reconfiguring your web server or adding more resources to your website.
After you think you've fixed the problem, use Fetch as Google to fetch http://www.soobumimphotography.com//robots.txt to verify that Googlebot can properly access your site.
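The first check in the message above (does robots.txt load for a browser but not for Googlebot?) can be roughly approximated with a small Python sketch. It requests the same URL with two different User-Agent headers and compares the status codes. The helper names and the browser User-Agent string are assumptions made for illustration. Sending Googlebot's User-Agent string from your own machine only reveals blocking based on the User-Agent; it cannot reveal a firewall rule that blocks Google's IP addresses, so Fetch as Google remains the real test.

```python
import urllib.error
import urllib.request

# Illustrative User-Agent strings; the browser one is an arbitrary example.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code for url, or None if the server cannot be reached."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 403 or 500.
        return error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar.
        return None


def diagnose(browser_status, googlebot_status):
    """Classify the two results (hypothetical labels, not Google terminology)."""
    if browser_status is None and googlebot_status is None:
        return "unreachable"            # likely downtime or a hosting problem
    if browser_status == 200 and googlebot_status == 200:
        return "ok"                     # no User-Agent based blocking seen
    if browser_status == 200:
        return "blocked_by_user_agent"  # works for browsers, fails for the bot UA
    return "error_for_browser_too"      # not specific to Googlebot


if __name__ == "__main__":
    url = "http://www.soobumimphotography.com/robots.txt"
    print(diagnose(fetch_status(url, BROWSER_UA), fetch_status(url, GOOGLEBOT_UA)))
```

A result of "blocked_by_user_agent" would point at a plugin, .htaccess rule, or firewall setting that filters on the User-Agent header; "unreachable" would point at downtime or the host.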
-
I can open the text file, but GoDaddy told me the robots.txt file is not on my server (root level).
They also told me that my site is not being crawled because the robots.txt file is not there.
Basically, all of this might have resulted from a plugin I was using (Term Optimizer).
Based on what GoDaddy told me, my .htaccess file was broken because of that and had to be recreated. So now the .htaccess file is good.
Now I have to figure out why my site is not accessible to Googlebot.
Let me know, Keith, if this is a quick fix or needs some time to troubleshoot. You can send me a message to discuss fees if necessary.
Thanks again
-
Hi,
You have a robots.txt file here: http://www.soobumimphotography.com/robots.txt
Can you write this again in English so it makes sense?
"I called Godaddy and told me if I used any plug ins etc. Godaddy fixed .htaccss file and my site was up and runningjust fine."
Yes, Google XML Sitemaps will add the location of your sitemap to the robots.txt file, but there is nothing wrong with your robots.txt file.
-
I just called GoDaddy and they told me that I don't have a robots.txt file. Can anyone help with this issue?
So here's what happened:
I purchased Joost de Valk's Term Optimizer to consolidate tags etc.
As soon as I installed and opened it, my site crashed.
I called GoDaddy and they asked me if I had used any plugins etc. GoDaddy fixed the .htaccess file and my site was up and running just fine.
Doesn't a plugin like Google XML Sitemaps automatically generate a robots.txt file?
-
Yes, my site was down.
-
I had a .htaccess issue with a plugin in the past 24 hours, and GoDaddy fixed it for me.
I think this caused the problem.
I just fetched again and I am still getting an unreachable page. I wonder if I have a bad .htaccess file.
-
Was your site down during this period?
I would recommend setting up pingdom.com (free site monitoring); it will email you if your site goes down. I suspect this is a hosting-related issue.
FYI, I can access your robots.txt fine from here.
-
Hi Bistoss, you should log into Google Webmaster Tools to check the day the problem occurred. It is not uncommon for hosts to have problems that temporarily block access. In some rare cases Google itself could be having problems. For example, in July we had one day with an 11% failure rate, and it was the host; there have been no problems since. If your problems are persistent, then you may have an issue like this: http://blog.jitbit.com/2012/08/fixing-googlebot-cant-access-your-site.html (old Analytics code). Also look at any recent changes, specifically anything to do with .htaccess. Be sure to use Fetch as Google after any changes to verify that Google can now crawl your site. Hope this helps
-
I also use the Robots Meta Configuration plugin.
Related Questions
-
Robots.txt: How to block a specific file type in several subdirectories?
Hello everyone! I need help setting up a robots.txt file. I'm trying to block all PDF files in particular directories. The example rule I started from blocks all .gif files across the entire site:
Disallow: /*.gif$
Two questions:
Can I use this rule to target one particular directory in which I want to block PDF files?
Will this line be recognized by Googlebot?
Disallow: /fileadmin/xxxxxxx/xxx/xxxxxxx/*.pdf$
Then I realized that I would have to write as many lines as there are directories in which I want to block PDF files. Let's say I want to block PDF files in all three of these directories:
/fileadmin/directory1
/fileadmin/directory1/sub1
/fileadmin/directory1/sub1/pdf
Is there a pattern-matching rule I could use to block access to PDF files in all subdirectories, instead of writing the above line once for each subdirectory? For example:
Disallow: /fileadmin/directory1*/
Many thanks in advance for any insight you may have.
Technical SEO | | LabeliumUSA0 -
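On the pattern question above: Google documents that robots.txt rules match from the start of the path, that `*` matches any run of characters (including `/`), and that a trailing `$` anchors the end of the URL. Under that assumption, a single rule such as `Disallow: /fileadmin/directory1/*.pdf$` should cover all three directories. The Python sketch below approximates that matching so a rule can be checked offline. It is not Google's actual parser, other crawlers may treat wildcards differently, and the function names are hypothetical.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression.

    Assumes Google-style wildcards: '*' matches any characters, and a
    trailing '$' means the URL must end there.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path, disallow_pattern):
    """Return True if path would be matched by the given Disallow pattern."""
    return robots_pattern_to_regex(disallow_pattern).match(path) is not None


if __name__ == "__main__":
    rule = "/fileadmin/directory1/*.pdf$"
    for path in (
        "/fileadmin/directory1/a.pdf",
        "/fileadmin/directory1/sub1/b.pdf",
        "/fileadmin/directory1/sub1/pdf/c.pdf",
        "/fileadmin/other/d.pdf",
    ):
        print(path, is_blocked(path, rule))
```

Because of the `$` anchor, a URL such as `/fileadmin/directory1/a.pdf?x=1` would not match this rule; dropping the `$` would make the rule match it as well.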
Is a sitemap required in my robots.txt?
Hi, I know that linking your sitemap from your robots.txt file is good practice. OK, but may I just submit my sitemap to Search Console and forget about adding it to my robots.txt? Here's my situation: one multi-language platform, which means two sets of pages, one for each language. But my CMS (Magento) only allows me to have one robots.txt file. So, again: may I have a robots.txt file with no sitemap and not suffer any potential SEO loss? Thanks in advance, Juan Vicente Mañanas Abad
Technical SEO | | Webicultors0 -
Converting files from .html to .php or editing .htaccess file
Good day all, I have a bunch of files that are .html and I want to add some PHP to them. It seems my two options are:
Convert .html to .php and 301 redirect, or
add this line to my .htaccess file and keep all the .html files as .html:
AddType application/x-httpd-php .html
My gut feeling is that the second way is better, so as not to alter any SEO rankings, but I wanted to see if anybody has experience with this line in their .htaccess file, as I definitely don't want to mess up my entire site 🙂 Thanks for any help! John
Technical SEO | | JohnHerrigel0 -
Invisible robots.txt?
So here's a weird one... Client comes to me for some simple changes, turns out there are some major issues with the site, one of which is that none of the correct content pages are showing up in Google, just ancillary (outdated) ones. Looks like an issue because even the main homepage isn't showing up with a "site:domain.com" So, I add to Webmaster Tools and, after an hour or so, I get the red bar of doom, "robots.txt is blocking important pages." I check it out in Webmasters and, sure enough, it's a "User agent: * Disallow /" ACK! But wait... there's no robots.txt to be found on the server. I can go to domain.com/robots.txt and see it but nothing via FTP. I upload a new one and, thankfully, that is now showing but I've never seen that before. Question is: can a robots.txt file be stored in a way that can't be seen? Thanks!
Technical SEO | | joshcanhelp0 -
Subdomain Removal in Robots.txt with Conditional Logic??
I would like to see if there is a way to add conditional logic to the robots.txt file so that when we push from DEV to PRODUCTION and the robots.txt file is pushed, we don't have to remember to NOT push the robots.txt file OR edit it when it goes live. My specific situation is this: I have www.website.com, dev.website.com and new.website.com and somehow google has indexed the DEV.website.com and NEW.website.com and I'd like these to be removed from google's index as they are causing duplicate content. Should I: a) add 2 new GWT entries for DEV.website.com and NEW.website.com and VERIFY ownership - if I do this, then when the files are pushed to LIVE won't the files contain the VERIFY META CODE for the DEV version even though it's now LIVE? (hope that makes sense) b) write a robots.txt file that specifies "DISALLOW: DEV.website.com/" is that possible? I have only seen examples of DISALLOW with a "/" in the beginning... Hope this makes sense, can really use the help! I'm on a Windows Server 2008 box running ColdFusion websites.
Technical SEO | | ErnieB0 -
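A note on option (b) in the question above: Disallow rules take URL paths, not hostnames, and each hostname is governed by the robots.txt served from that host, so a line like "DISALLOW: DEV.website.com/" would not do what is intended. One way to get the "conditional logic" is to generate the response from the requested hostname, so the same code can be pushed to every environment. The sketch below shows the idea in Python; the function name and constants are hypothetical, and on a ColdFusion site the same logic would have to be written in that platform's own language.

```python
# Hostname taken from the question; everything else is illustrative.
PRODUCTION_HOST = "www.website.com"

ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def robots_txt_for(host):
    """Return the robots.txt body to serve for the given Host header value."""
    # Drop an optional port and normalise case before comparing.
    hostname = host.split(":")[0].strip().lower()
    return ALLOW_ALL if hostname == PRODUCTION_HOST else BLOCK_ALL


if __name__ == "__main__":
    for host in ("www.website.com", "dev.website.com", "new.website.com"):
        print(host)
        print(robots_txt_for(host))
```

Defaulting to BLOCK_ALL for any host that is not the production one means a newly added staging subdomain is blocked without further changes.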
Does Google index XML files?
Does Google or other search engines include XML files in their index? More specifically, I am wondering how Google knows the difference between an xml filetype and an RSS feed.
Technical SEO | | nicole.healthline0 -
Can local SEO harm national rankings?
Today I met with a firm called Localeze that provides local directory submissions. I understand the importance of this service if your site is competing locally, however I'm not sure the effects of local SEO for a national brand. Our firm gets most of our traffic from across the country, not just one location, and our business is scattered (which is a good thing). We rank for service related keywords that are not tied to a location. We do not show up for local results so our business in our immediate location is weak. We would like to increase our local presence in search engines but I want to make sure that this will not take away from our national presence. Will optimizing a site for local search negatively affect general rankings? Thanks
Technical SEO | | KevinBloom1 -
.htaccess file format for Apache Server
Hi, my website has a canonical issue on the home page. I have written an .htaccess file and uploaded it to the root directory, but I still don't see any change on the home page. Below is the syntax I have written in the .htaccess file. Please review it and let me know what to change.
Options +FollowSymlinks
RewriteEngine on
#RewriteBase /
### re-direct index.htm to root / ###
RewriteCond %{THE_REQUEST} ^.*/index.htm\ HTTP/
RewriteRule ^(.*)index.htm$ /$1 [R=301,L]
### re-direct IP address to www
### re-direct non-www to www
### re-direct any parked domain to www of main domain
RewriteCond %{http_host} !^www.metricstream.com$ [nc]
RewriteRule ^(.*)$ http://www.metricstream.com/$1 [r=301,nc,L]
Is there any specific .htaccess file format for Apache Server? Thanks, Karthik
Technical SEO | | karthik-1755440