Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Google insists robots.txt is blocking... but it isn't.
-
I recently launched a new website. During development, I'd enabled the option in WordPress to prevent search engines from indexing the site.
When the site went public (over 24 hours ago), I cleared that option. At that point, I added a specific robots.txt file that only disallows a couple of directories. You can view the robots.txt at http://photogeardeals.com/robots.txt
Google (via Webmaster Tools) is insisting that my robots.txt file contains a "Disallow: /" on line 2, and that this is preventing Google from indexing the site and preventing me from submitting a sitemap. These errors show up in both the sitemap section of Webmaster Tools and the Blocked URLs section.
Bing's webmaster tools are able to read the site and sitemap just fine.
Any idea why Google insists I'm disallowing everything even after telling it to re-fetch?
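One way to rule out the file itself is to run its rules through a parser and see what a crawler would conclude. Below is a minimal sketch using Python's standard `urllib.robotparser`. The two robots.txt bodies are hypothetical stand-ins (the `/wp-admin/` and `/wp-includes/` paths are assumptions, not the actual contents of the live file), and the helper name `is_allowed` is made up for illustration. Python's parser is also not Google's own, so treat the result as a sanity check rather than proof of how Googlebot reads the file.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical contents: what Webmaster Tools claims to see (block everything).
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Hypothetical contents: a file that only blocks a couple of directories.
SELECTIVE = "User-agent: *\nDisallow: /wp-admin/\nDisallow: /wp-includes/\n"

if __name__ == "__main__":
    home = "http://photogeardeals.com/"
    print("block-all file allows home page:", is_allowed(BLOCK_ALL, home))
    print("selective file allows home page:", is_allowed(SELECTIVE, home))
```

To test the live file instead of a pasted copy, `RobotFileParser` can also fetch it: call `set_url()` with the robots.txt address, then `read()`, then `can_fetch()`. If the live file passes this check but Webmaster Tools still reports "Disallow: /", the report is most likely based on an older copy of the file.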
-
Hi Aaron - You have a couple of solid answers here. Has your issue been resolved in GWT?
-
24 hours is a short time, and Google has probably not re-crawled or even looked at your new robots.txt yet.
Google Webmaster Tools is much slower to update than Bing's tools, so be patient.
As a rule of thumb, I wait at least a week with Google before worrying (my 2 cents).
-
Hi Aaron,
I identify with your frustration, but want to lead my response with the caveat that I am not a developer so there may be people here with much more technical SEO expertise than me who might have a better answer.
What I do know is that Google Webmaster Tools data is not real time and can often take days to weeks to update. It could be that GWT is showing something different about your robots.txt file because it's working from old information that hasn't been updated yet.
When I looked at your robots.txt file, I found two sitemaps, one with 2 URLs and one with 8 URLs. This is pretty tiny. Even in the old days, conventional wisdom was that it took at least 20 content pages in order for Google to take note and index the site.
Have you tried posting the URLs of your new site on Google+? I have heard that this is a great indexing tool in addition to the Fetch as Googlebot in GWT. Just a thought!
You know, there was a time when it took 6-8 weeks for a new site to get indexed. Google has definitely sped up to the point where I think we are all expecting instant results and sometimes that just doesn't happen.
I think this just might be a matter of patience. However, I am always willing to admit that I could be wrong and am interested to know what others think!
Dana