Why is robots.txt blocking URL's in sitemap?
-
Hi Folks,
Any ideas why Google Webmaster Tools is indicating that my robots.txt is blocking URL's linked in my sitemap.xml, when in fact it isn't?
I have checked the current robots.txt declarations and they are fine and I've also tested it in the 'robots.txt Tester' tool, which indicates for the URL's it's suggesting are blocked in the sitemap, in fact work fine.
Is this a temporary issue that will be resolved over a few days or should I be concerned.
I have recently removed the declaration from the robots.txt that would have been blocking them and then uploaded a new updated sitemap.xml. I'm assuming this issue is due to some sort of crossover.
Thanks
Gaz
-
Brilliant, thanks for the clarification Matt. Much appreciated.
-
Yes, it should clear up by itself after a day or two. Google will need to recrawl the robots.txt file next time they visit your site. It should clear up then. If you make immediate changes and try to resubmit a sitemap, you almost always get this issue.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Sitemap url's not being indexed
There is an issue on one of our sites regarding many of the sitemap url's not being indexed. (at least 70% is not being indexed) The url's in the sitemap are normal url's without any strange characters attached to them, but after looking into it, it seems a lot of the url's get a #. + a number sequence attached to them once you actually go to that url. We are not sure if the "addthis" bookmark could cause this, or if it's another script doing it. For example Url in the sitemap: http://example.com/example-category/0246 Url once you actually go to that link: http://example.com/example-category/0246#.VR5a Just for further information, the XML file does not have any style information associated with it and is in it's most basic form. Has anyone had similar issues with their sitemap not being indexed properly ?...Could this be the cause of many of these url's not being indexed ? Thanks all for your help.
Technical SEO | | GreenStone0 -
When do you use 'Fetch as a Google'' on Google Webmaster?
Hi, I was wondering when and how often do you use 'Fetch as a Google'' on Google Webmaster and do you submit individual pages or main URL only? I've googled it but i got confused more. I appreciate if you could help. Thanks
Technical SEO | | Rubix1 -
What's the correct SEO for a Gallery?
Hi there, I was wondering if anyone was an expert on galleries and using canonical URL's? URL: http://www.tecsew.com/gallery In short I'm doing SEO for a site and it has a large gallery (3000+ images) where each specific image has it's own page and each category (there's 200+) also has its own page. Now, what I'm thinking is that this should be reduced and asking Google to index/rank each page is wrong (I also think this because the quality of the pages are relatively low i.e little text & content etc) Therefore, what should be suggested/done to the gallery? Should just the main gallery categories get indexed (i.e http://www.tecsew.com/3d-cad-showcase)? Or should I continue to allow Google to trawl through all of it? Or should canonical URL's be used? Any help would be greatly appreciated. Best Wishes, Charlie S
Technical SEO | | media.street0 -
What's the latest on Title Tags?
What is the latest on what Google is looking for? Keyword one, Keyword two? Sentences with the Keyword in them?
Technical SEO | | netviper0 -
Using Robots.txt
I want to Block or prevent pages being accessed or indexed by googlebot. Please tell me if googlebot will NOT Access any URL that begins with my domain name, followed by a question mark,followed by any string by using Robots.txt below. Sample URL http://mydomain.com/?example User-agent: Googlebot Disallow: /?
Technical SEO | | semer0 -
301 an old URL with a ? in the URL?
I am redoing a site and the URL's are changing structure. The client's site was in magento and in the store they would get two URLs, for example: /store/categoryname/productname and /store/categoryname/productname?SID=dslkajsfdoiu947598whouieht983hg98 Do I have to 301 redirect both of these URL's to their new counterpart? Both go to the same content but magento seemed to add these SIDs into the navigation and Google has both versions in the index.
Technical SEO | | DanDeceuster0 -
What's the difference between a category page and a content page
Hello, Little confused on this matter. From a website architectural and content stand point, what is the difference between a category page and a content page? So lets say I was going to build a website around tea. My home page would be about tea. My category pages would be: White Tea, Black Tea, Oolong Team and British Tea correct? ( I Would write content for each of these topics on their respective category pages correct?) Then suppose I wrote articles on organic white tea, white tea recipes, how to brew white team etc...( Are these content pages?) Do I think link FROM my category page ( White Tea) to my ( Content pages ie; Organic White Tea, white tea receipes etc) or do I link from my content page to my category page? I hope this makes sense. Thanks, Bill
Technical SEO | | wparlaman0 -
Handling '?' in URLs.
Adios! (or something), I've noticed in my SEOMoz campaign that I am getting duplicate content warnings for URLs with extensions. For example: /login.php?action=lostpassword /login.php?action=register etc. What is the best way to deal with these type of URLs to avoid duplicate content penelties in search engines? Thanks 🙂
Technical SEO | | craigycraig0