Sitemap Warnings
-
Due to an issue with our CMS, I had a bunch of URL aliases that were being indexed and causing duplicate content issues.
I disallowed indexing of the bad URLs (they all had a similar URL structure, so that was easy) until I could clean them up.
I then received a bunch of sitemap warnings saying that URLs in the sitemap were blocked by robots.txt.
Isn't this the point of robots.txt? Why am I getting warnings and how can I get rid of them?
-
Irving -
OK, so we took the restriction out of robots.txt while IT tries to fix the issue of URLs showing up in the sitemap that shouldn't be there.
The warnings haven't fallen off, and our sitemap is now a day behind because it has been stuck in pending for almost a full day.
Any thoughts on what might be causing this? I'm assuming this is impacting what's indexed and hurting our site.
-
Irving,
Totally get that and we're working to ensure they are no longer included in the sitemap.
Thanks,
Lisa
-
The purpose of your sitemap is to tell Google which pages you want crawled and indexed. The purpose of robots.txt is to tell Google not to crawl a page. The warning is likely just a precaution to let you know that you may have accidentally blocked something in robots.txt that you also asked Google to index. If you remove the URLs from your submitted sitemap, the warnings should disappear. If you leave them, you will keep getting warnings, but Google should not crawl the content since you blocked it in robots.txt.
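To see which sitemap entries conflict with robots.txt before resubmitting, something like the following Python sketch might help. It uses only the standard library. The `/alias/` rule, the example.com URLs, and the function name are hypothetical stand-ins, since the thread doesn't show the real URL structure, and Python's parser may not interpret every rule exactly the way Google does.

```python
from typing import List
from urllib.robotparser import RobotFileParser
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def blocked_sitemap_urls(robots_txt: str, sitemap_xml: str,
                         user_agent: str = "Googlebot") -> List[str]:
    """Return the sitemap URLs that robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    root = ET.fromstring(sitemap_xml)
    urls = [loc.text.strip()
            for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
            if loc.text]
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical example data, not taken from the original site.
ROBOTS_TXT = """User-agent: *
Disallow: /alias/
"""

SITEMAP_XML = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/products/widget</loc></url>
  <url><loc>https://www.example.com/alias/widget</loc></url>
</urlset>
"""

if __name__ == "__main__":
    for url in blocked_sitemap_urls(ROBOTS_TXT, SITEMAP_XML):
        print("In sitemap but blocked by robots.txt:", url)
```

Every URL it reports is one that would trigger the warning described above.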
-
You are not supposed to include blocked URLs in sitemap.xml files; Google considers it a waste of its crawl time. Are these automated sitemap.xml files?
You're basically saying, "Come index these pages I've listed, but don't index them!"
Remove the blocked URLs from the sitemaps (or regenerate the sitemaps without them), resubmit, and the warnings will go away.
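If the sitemap can't be regenerated by the CMS right away, stripping the unwanted entries from an existing file is one option. This is a rough sketch using Python's standard library; the function name, the URLs, and the sample sitemap are made up for illustration, and a real sitemap may carry extra tags or be split across several files.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def remove_urls(sitemap_xml: str, unwanted: set) -> str:
    """Return the sitemap XML with every <url> whose <loc> is in unwanted removed."""
    # Keep the default namespace on output instead of an "ns0:" prefix.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)

    # Copy the list first so removing elements doesn't disturb iteration.
    for url in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        if loc is not None and loc.text and loc.text.strip() in unwanted:
            root.remove(url)

    return ET.tostring(root, encoding="unicode")


# Hypothetical example data, not taken from the original site.
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/products/widget</loc></url>
  <url><loc>https://www.example.com/alias/widget</loc></url>
</urlset>
"""

if __name__ == "__main__":
    print(remove_urls(SITEMAP, {"https://www.example.com/alias/widget"}))
```

The cleaned output can then be saved and resubmitted in place of the old sitemap.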
Related Questions
-
Meta no index crawler warnings
I've decided that fixing the duplicate content issues on my site wasn't worth the effort given the amount of traffic the archive pages on my WordPress site received, so I decided to no-index them using Yoast. Now I have 60 meta no-index crawler warnings. Should I just ignore these? It seems I get warnings either way. Does anyone have advice on how to move on with this?
Moz Pro | | Libra_Photographic0 -
Inbound Links Warning
I got the following error about our domain name in Link Explorer: "You entered the URL freexy.net which redirects to youlovelife.com/?domain=freexy.net. Click here to analyze youlovelife.com/?domain=freexy.net instead." Can you give me some advice about this problem?
Moz Pro | | ligia.tatucu0 -
Missing Title for Sitemap
Our site is built on Wordpress and we use a very popular SEO plugin called Yoast to generate our sitemap (as well as handle multiple other SEO functions). When MOZ's spider crawls our site, this sitemap triggers an error saying "Missing Title or Empty." My question is how can I avoid having this error hurt me in terms of my rankings. It seems strange to me that such a ubiquitous plugin would be generating something as important as a sitemap in an incorrect format.
Moz Pro | | ShatterBuggy0 -
Issues with Moz producing 404 Errors from sitemap.xml files recently.
My last campaign crawl produced over 4k 404 errors resulting from Moz not being able to read some of the URLs in our sitemap.xml file. This is the first time we've seen this error and we've been running campaigns for almost 2 months now -- no changes were made to the sitemap.xml file. The file isn't UTF-8 encoded, but rather Content-Type:text/xml; charset=iso-8859-1 (which is what Moveable Type uses). Just wondering if anyone has had a similar issue?
Moz Pro | | BriceSMG0 -
How to fix the Crawl Diagnostics error and warnings
Hi, I'm new to the SEO world and don't know a lot about it. After my site was crawled, I found 1 error, 151 warnings, and 96 notices. Is that bad? Could someone please explain how to fix these problems? I would be very thankful.
Moz Pro | | medlife0 -
Warnings, Notices, and Errors- don't know how to correct these
I have been watching my Notices, Warnings and Errors increase since I added a blog to our WordPress site. Is this affecting our SEO? We now have the following: 2 4XX errors. One is for a page that we changed the title and nav for in mid March, and one is for a page we removed. The nav on the site is working as far as I can see. This seems like a cache issue, but who knows? 20 warnings for “missing meta description tag”. These are all blog archive and author pages. Some have resulted from pagination and are “Part 2, Part 3, Part 4” etc. Others are the first page for authors. And there is one called “new page” that I can’t locate in our Pages admin and have no idea what it is. 5 warnings for “title element too long”. These are also archive pages that have the blog name, so they are pages I can’t access through the admin to control the page title, plus the “part 2’s” and so on. 71 notices for “Rel Canonical”. The rel canonicals are all being generated automatically and are for pages of all sorts. Some are for content pages within the site, a bunch are blog posts, and the rest are archive pages for date, blog category and pagination. 6 are 301s. These are split between blog pagination, author and a couple of site content pages (contact and portfolio). Can’t imagine why these are here. 8 meta-robots nofollow. These are blog articles, but only some of the posts. Don’t know why we are generating this for some and not all. And half of them are for the exact same page, so there are really only 4 originals on this list. The others are dupes. 8 blocked by meta-robots. These are also for the same 4 blog posts, each duplicated. We use All in One SEO. There is an option to use noindex for archives and categories that I do not have enabled, and also one to autogenerate descriptions, which I do not have enabled. I wasn’t concerned about these at first, but I read these (below) questions yesterday, and think I'd better do something as these are mounting up.
I’m wondering if I should be asking our team for some code changes but not sure what exactly would be best. http://www.seomoz.org/q/pages-i-dont-want-customers-to-see http://www.robotstxt.org/meta.html Our site is http://www.fateyes.com Thanks so much for any assistance on this!
Moz Pro | | gfiedel0 -
Does anyone have suggestions for a good XML Sitemap Generator?
Does anyone have any suggestions on a good XML Sitemap Generator? Also interested in best practices and tips for updating the XML Sitemap. I typically have relied on my web developers to do this however it seems that they have not been setting this up with SEO in mind.
Moz Pro | | webestate0 -
SEOMoz's Crawl Diagnostics showing an error where the Title is missing on our Sitemap.xml file?
Hi Everyone, I'm working on our website Sky Candle and I've been running it as a campaign in SEOmoz. I've corrected a few errors we had with the site previously, but today it's recrawled and found a new error which is a missing Title tag on the sitemap.xml file. Is this a little glitch in the SEOmoz system? Or do I need to add a page title and meta description to my XML file. http://www.skycandle.co.uk/sitemap.xml Any help would be greatly appreciated. I didn't think I'd need to add this. Kind Regards Lewis
Moz Pro | | LewisSellers0