Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Google insists robots.txt is blocking... but it isn't.
-
I recently launched a new website. During development, I'd enabled the option in WordPress to prevent search engines from indexing the site.
When the site went public (over 24 hours ago), I cleared that option. At that point, I added a specific robots.txt file that only disallowed a couple directories of files. You can view the robots.txt at http://photogeardeals.com/robots.txt
Google (via Webmaster tools) is insisting that my robots.txt file contains a "Disallow: /" on line 2 and that it's preventing Google from indexing the site and preventing me from submitting a sitemap. These errors are showing both in the sitemap section of Webmaster tools as well as the Blocked URLs section.
Bing's webmaster tools are able to read the site and sitemap just fine.
Any idea why Google insists I'm disallowing everything even after telling it to re-fetch?
-
Hi Aaron - You have a couple of solid answers here. Has your issue been resolved in GWT?
-
24 hours is a short time and probably google did not reindex or even looked at your new robot.txt
Webmaster tools is way slower than bing tools, so be patient.
As a rule of thumb, I wait at least a week with google before worrying (my 2 cents)
-
Hi Aaron,
I identify with your frustration, but want to lead my response with the caveat that I am not a developer so there may be people here with much more technical SEO expertise than me who might have a better answer.
What I do know id that Google Webmaster Tools data is not real time and can often take days to weeks to update. It could be that the reason GWT is showing something different about your robots.txt file is because it's old information that hasn't updated yet.
When I looked at your robots.txt file, I found two sitemaps, one with 2 URLs and one with 8 URLs. This is pretty tiny. Even in the old days, conventional wisdom was that it took at least 20 content pages in order for Google to take note and index the site.
Have you tried posting the URLs of your new site on Google+? I have heard that this is a great indexing tool in addition to the Fetch as Googlebot in GWT. Just a thought!
You know, there was a time when it took 6-8 weeks for a new site to get indexed. Google has definitely sped up to the point where I think we are all expecting instant results and sometimes that just doesn't happen.
I think this just might be a matter of patience. However, I am always willing to admit that I could be wrong and am interested to know what others think!
Dana
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why can't google mobile friendly test access my website?
getting the following error when trying to use google mobile friendly tool: "page cannot be reached. This could be because the page is unavailable or blocked by robots.txt" I don't have anything blocked by robots.txt or robots tag. i also manage to render my pages on google search console's fetch and render....so what can be the reason that the tool can't access my website? Also...the mobile usability report on the search console works but reports very little, and the google speed test also doesnt work... Any ideas to what is the reason and how to fix this? LEARN MOREDetailsUser agentGooglebot smartphone
Technical SEO | | Nadav_W0 -
Multiple robots.txt files on server
Hi! I have previously hired a developer to put up my site and noticed afterwards that he did not know much about SEO. This lead me to starting to learn myself and applying some changes step by step. One of the things I am currently doing is inserting sitemap reference in robots.txt file (which was not there before). But just now when I wanted to upload the file via FTP to my server I found multiple ones - in different sizes - and I dont know what to do with them? Can I remove them? I have downloaded and opened them and they seem to be 2 textfiles and 2 dupplicates. Names: robots.txt (original dupplicate)
Technical SEO | | mjukhud
robots.txt-Original (original)
robots.txt-NEW (other content)
robots.txt-Working (other content dupplicate) Would really appreciate help and expertise suggestions. Thanks!0 -
Why is Google Webmaster Tools showing 404 Page Not Found Errors for web pages that don't have anything to do with my site?
I am currently working on a small site with approx 50 web pages. In the crawl error section in WMT Google has highlighted over 10,000 page not found errors for pages that have nothing to do with my site. Anyone come across this before?
Technical SEO | | Pete40 -
How to find temporary redirects of existing site you don't control?
I am getting ready to move a clients site from another company. They have like 35 tempory redirects according to MOZ. Question is, how can I find out then current redirects so I can update everything for the new site? Do I need access to the current htaccess file to do this?
Technical SEO | | scott3150 -
Blocked jquery in Robots.txt, Any SEO impact?
I've heard that Google is now indexing links and stuff available in javascript and jquery. My webmastertools is showing that some links are blocked in robots.txt of jquery. Sorry I'm not a developer or designer. I want to know is there any impact of this on my SEO? and also how can I unblock it for the robots? Check this screenshot: http://i.imgur.com/3VDWikC.png
Technical SEO | | hammadrafique0 -
Block Domain in robots.txt
Hi. We had some URLs that were indexed in Google from a www1-subdomain. We have now disabled the URLs (returning a 404 - for other reasons we cannot do a redirect from www1 to www) and blocked via robots.txt. But the amount of indexed pages keeps increasing (for 2 weeks now). Unfortunately, I cannot install Webmaster Tools for this subdomain to tell Google to back off... Any ideas why this could be and whether it's normal? I can send you more domain infos by personal message if you want to have a look at it.
Technical SEO | | zeepartner0 -
Will an XML sitemap override a robots.txt
I have a client that has a robots.txt file that is blocking an entire subdomain, entirely by accident. Their original solution, not realizing the robots.txt error, was to submit an xml sitemap to get their pages indexed. I did not think this tactic would work, as the robots.txt would take precedent over the xmls sitemap. But it worked... I have no explanation as to how or why. Does anyone have an answer to this? or any experience with a website that has had a clear Disallow: / for months , that somehow has pages in the index?
Technical SEO | | KCBackofen0 -
Blocking URL's with specific parameters from Googlebot
Hi, I've discovered that Googlebot's are voting on products listed on our website and as a result are creating negative ratings by placing votes from 1 to 5 for every product. The voting function is handled using Javascript, as shown below, and the script prevents multiple votes so most products end up with a vote of 1, which translates to "poor". How do I go about using robots.txt to block a URL with specific parameters only? I'm worried that I might end up blocking the whole product listing, which would result in de-listing from Google and the loss of many highly ranked pages. DON'T want to block: http://www.mysite.com/product.php?productid=1234 WANT to block: http://www.mysite.com/product.php?mode=vote&productid=1234&vote=2 Javacript button code: onclick="javascript: document.voteform.submit();" Thanks in advance for any advice given. Regards,
Technical SEO | | aethereal
Asim0