What can I do if Google Webmaster Tools doesn't recognize the robots.txt file?
-
I'm working on a recently hacked site for a client and and in trying to identify how exactly the hack is running I need to use the fetch as Google bot feature in GWT.
I'd love to use this but it thinks the robots.txt is blocking it's acces but the only thing in the robots.txt file is a link to the sitemap.
Unde the Blocked URLs section of the GWT it shows that the robots.txt was last downloaded yesterday but it's incorrect information. Is there a way to force Google to look again?
-
No, but they might write to it, modify it, or do all sorts of other nasty stuff I've seen hackers do when they get a hold of any writeable file on a system.
-
lol it's a robots text file. what are they going to do. Steal it?
I should have clarified do a 777 to make sure that is not your problem, then yes change the permission to be tighter
-
Eesh I don't recommend 777. 644 or, if you're going to change it right back, 755 at most.
-
File permission maybe? Change it to 777 and try it again
-
If you have shell access on Linux you can use wget or GET or run lynx.
If google is getting the wrong robots file then your web server must be sending out something other than what you think is the robots file.
What happens if you do this in your browser:
-
Looking in my log files, Google hits robots.txt just about every time it crawls our site.
What are you trying to accomplish using fetch as Googlebot? Any chance CURL could do the job for you, or another tool that ignores robots.txt?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt on subdomains
Hi guys! I keep reading conflicting information on this and it's left me a little unsure. Am I right in thinking that a website with a subdomain of shop.sitetitle.com will share the same robots.txt file as the root domain?
Technical SEO | | Whittie0 -
Google Webmaster Tools: MESSAGE
Dear site owner or webmaster of http://www.enakliyat.com.tr/,
Technical SEO | | iskq
Some of your site's pages may be using techniques that do not comply with Google's Webmaster Guidelines.
On your site, in particular, does not provide an adequate level of innovation in low-quality unique content or set of pages. Examples of this type of thin affiliate pages, pages, bridge pages, it will automatically be created or copied content. For more information about the unique and interesting content, visit http://www.google.com/support/webmasters/bin/answer.py?answer=66361.
We recommend you to make the necessary changes to your site to fit your site's quality guidelines. After making these changes, please submit your site for reconsideration in Google's search results.
If you have questions about how to resolve this problem, please see our Webmaster Help Forum for support.
Sincerely,
Google Search Quality Team **After this massege ve find our low quality pages and we added this urls on Robots.txt. Other than that, what can we do? ** **Our site is a home to home moving listing portal. Consumers who wants to move his home fills a form so that moving companies can cote prices. We were generating listing page URL’s by using the title submitted by customer. **0 -
Data Highlighter doesn't show page
We have an event related website http://www.sbo.nl so i wanted to use data highlighter because most of our event pages are the same. But data highlighter doesn't show those pages, I will see only an empty page. For example http://www.sbo.nl/veiligheid/brandveiligheid-gebouwen/ Does someone of you understand what is going on. Data highlighter does show the homepage. I am thinking it is maybe because of tabbed browsing, or the chat function in that page. Hope someone can help. You can see a screenshot of Data Highlighter http://www.clipular.com/c?6659030=2X2gQv4O8_9RzcZ1Hk_7xGtCPYo&f=d40975c80bdd11dc357f050cafa73a80 I hope someone can help because i am lost 🙂 Cheers Ruud
Technical SEO | | RuudHeijnen0 -
'External nofollow' in a robots meta tag? (advertorial links)
I believe this has never worked? It'd be an easy way of preventing any penalties from Google's recent crackdown on paid links via advertorials. When it's not possible to nofollow each external link individually, what are people doing? Nofollowing and/or noindexing the whole page?
Technical SEO | | Alex-Harford0 -
How can I optimise for Google Products?
Has anyone got experience of optimising Google Products (Google Base) feeds? I've noticed that, although my site doesn't often appear on page one in the standard results, we occasionally appear right at the top because of the "universal" shopping results. My question is: how can we make this happen more often? There seems to be a lot less competition (presumably because our competitors haven't worked out how to provide the feed to Google yet!), so I imagine it should be easier and quicker to reach the top this way than any other way. Thanks! Alex
Technical SEO | | reddogmusic0 -
Should I add my blog posts to my sitemap.txt file?
This seems like it should be an obvious no, just because of the amount of work that would entail, and then remembering to do it every time I make a post, but since I couldn't find anything on Google about it and have never heard anyone mention it, I figured I'd ask.
Technical SEO | | UnderRugSwept0 -
Blocking other engines in robots.txt
If your primary target of business is not in China is their any benefit to blocking Chinese search robots in robots.txt?
Technical SEO | | Romancing0 -
Access To Client's Google Webmaster Tools
Hi, What's the best/easiest way for a client to grant access to his Google Webmaster Tools to me? Thanks! Best...Michael
Technical SEO | | 945010