What can I do if Google Webmaster Tools doesn't recognize the robots.txt file?
-
I'm working on a recently hacked site for a client, and in trying to identify exactly how the hack is running I need to use the Fetch as Googlebot feature in GWT.
I'd love to use this, but it thinks the robots.txt is blocking its access, even though the only thing in the robots.txt file is a link to the sitemap.
Under the Blocked URLs section of GWT it shows that the robots.txt was last downloaded yesterday, but the information it shows is incorrect. Is there a way to force Google to look again?
-
No, but they might write to it, modify it, or do all sorts of other nasty stuff I've seen hackers do when they get a hold of any writeable file on a system.
-
lol, it's a robots.txt file. What are they going to do, steal it?
I should have clarified: set it to 777 to rule out permissions as your problem, then yes, change the permissions back to something tighter.
-
Eesh, I don't recommend 777. Use 644 or, if you're going to change it right back, 755 at most.
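If you'd rather check the current permissions than guess, here is a minimal Python sketch, assuming a Unix-like server where you can run scripts. The helper names `is_world_writable` and `mode_string` are made up for illustration; only the standard `os` and `stat` modules are used.

```python
import os
import stat


def is_world_writable(path):
    """Return True if the 'other' write bit is set, i.e. any user may write to path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & stat.S_IWOTH)


def mode_string(path):
    """Return the permission bits as an octal string such as '644'."""
    return format(stat.S_IMODE(os.stat(path).st_mode), "03o")
```

Pointing these at your robots.txt (the path depends on your web root) shows whether the file is still at 777. A mode of 644 keeps it readable by the web server without letting every account on the box write to it.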
-
File permissions, maybe? Change it to 777 and try again.
-
If you have shell access on Linux you can use wget or GET or run lynx.
If Google is getting the wrong robots file, then your web server must be sending out something other than what you think is the robots file.
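Once you have the text the server really sends, a rough way to see whether it blocks Googlebot is to run it through Python's standard `urllib.robotparser`. This is only a sketch: `is_allowed` is a hypothetical helper, the example.com URLs are placeholders, and Python's parser is not Google's, so treat the result as a hint rather than proof.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_text, url, agent="Googlebot"):
    """Parse robots.txt content given as a string and report whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, url)


# A file that only links to the sitemap should not block anything.
sitemap_only = "Sitemap: http://www.example.com/sitemap.xml\n"
print(is_allowed(sitemap_only, "http://www.example.com/page.html"))

# A file like this is what blocking would look like.
blocking = "User-agent: *\nDisallow: /\n"
print(is_allowed(blocking, "http://www.example.com/page.html"))
```

If the text you get back from the server contains Disallow rules you never wrote, the hack (or a rewrite rule) is probably serving a different robots.txt from the one on disk.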
What happens if you do this in your browser:
-
Looking in my log files, Google hits robots.txt just about every time it crawls our site.
What are you trying to accomplish using Fetch as Googlebot? Any chance cURL could do the job for you, or another tool that ignores robots.txt?
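As a sketch of the "another tool" idea, the Python below requests a page while sending a Googlebot-style User-Agent header and never consults robots.txt. The helper names `build_request` and `fetch` are invented for this example, and the user-agent string is the commonly published Googlebot one, included here as an assumption. A hack that verifies Googlebot by IP address rather than by user agent will not be fooled by this.

```python
import urllib.request

# Commonly published Googlebot user-agent string (an assumption; check Google's docs).
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_request(url, user_agent=GOOGLEBOT_UA):
    """Build a request that sends the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=GOOGLEBOT_UA, timeout=10):
    """Fetch url and return (status, body text). robots.txt is never checked."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

Calling `fetch` once with the default user agent and once with an ordinary browser string, then comparing the two bodies, is one way to spot content that is only served to crawlers. Note that `urlopen` raises `urllib.error.HTTPError` on 4xx and 5xx responses.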