Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How important are sitemap errors?
-
If there aren't any crawling / indexing issues with your site, how important do thing sitemap errors are? Do you work to always fix all errors?
I know here: http://www.seomoz.org/blog/bings-duane-forrester-on-webmaster-tools-metrics-and-sitemap-quality-thresholds
Duane Forrester mentions that sites with many 302's 301's will be punished--does any one know Googe's take on this?
-
Very important. Particularly if you have a large site. We operate a large site with 100,000's of pages and as Dan said it can be difficult to maintain. We use something called Unlimited XML Sitemap Generator which builds XML sitemaps for us automatically. I'd highly recommend it although it takes a bit of fiddling with to get it up and running as it's software which sits on site. We couldn't manage without it as we'd be forever on sitemaps.
We found that getting sitemaps right on a large site made a huge difference to the crawl rate that we encountered in GWT and a huge indexation to follow.
In particular check for 302's. I made the mistake of leaving those for a while and am sure that we suffered from some loss of link equity along the way.
Hope it helps
Dawn
-
Your sitemap should only list pages that actually exist.
If you delete some pages, then you need to rebuild the sitemap.
Ditto if you delete them and redirect.
Google is always lagging, so if you delete 10 pages and then update the sitemap, even if google downloads the sitemap immediately, they will still be running crawls on the old map, and they may be crawling the now-missing pages, but haven't shown the failures in your WMT yet.
If you update your sitemap quickly, it is possible they will never crawl the missing pages and get a 404 or 301.
(but of course, there could be other sites pointing to the now-missing pages, and the 404s will show up elsewhere as missing)
I am always checking, adding, deleting and redirecting pages, and I update the current sitemap every hour and all the others are rebuilt at midnight every night. I usually do deletions just before midnight if I can, to minimize the time the sitemap is out of sync.
-
As far as I know Google is more lenient with sitemap errors, but I would still recommend looking into it. The first step would be to be sure your sitemap is up to date to begin with - and has all the URLs you want (and not any you don't want). The main thing is none of them should 404 and then beyond that, yes, they should return 200's.
Unless you're dealing with a gigantic site which might be hard to maintain, in theory there shouldn't be errors in sitemaps if you have the correct URLs in there.
Even better, if you're running WordPress the Yoast SEO plugin will generate an XML sitemap for you and it update automatically.
Hope that helps!
-Dan
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Unsolved How much time does it take for Google to read the Sitemap?
Hi there, I could use your help with something. Last week, I submitted my sitemap in the search console to improve my website's visibility on Google. Unfortunately, I got an error message saying that Google is not reading my sitemap. I'm not sure what went wrong. Could you take a look at my site (OceanXD.org) and let me know if there's anything I can do to fix the issue? I would appreciate your help. Thank you so much!
Intermediate & Advanced SEO | | OceanXD1 -
404 Errors flaring on nonexistent or unpublished pages – should we be concerned for SEO?
Hello! We keep getting "critical crawler" notifications on Moz because of firing 404 codes. We've checked each page and know that we are not linking to them anywhere on our site, they are not published and they are not indexed on Google. It's only happened since we migrated our blog to Hubspot so we think it has something to do with the test pages their developers had set up and that they are just lingering in our code somewhere. However, we are still concerned having these codes fire implies negative consequences for our SEO. Is this the case? Should we be concerned about these 404 codes despite the pages from those URLs not actually existing? Thank you!
Intermediate & Advanced SEO | | DebFF
Chloe0 -
The Bad effect of Submitting Sitemap frequently?
Hi Mozzer... so, i keep thinking of this... what is the bad effect of submitting the sitemap frequently? is it something like google would smell something suspicious and begin to decrease my website's authority? and is there any supporting articles for it? my website is an e-commerce website by the way... so please, help me with this.. Thank you 🙂
Intermediate & Advanced SEO | | ricoplaza0 -
Sitemap with homepage URL repeated several times - it is a problem?
Hello Mozzers, I am looking at a website with the homepage repeated several times (4 times) on the sitemap (sitemap is autogenerated via a plugin) - is this an SEO problem do you think - might it damage SEO performance, or can I ignore this issue? I am thinking I can ignore, yet it's an odd "issue" so your advice would be welcome! Thanks, Luke
Intermediate & Advanced SEO | | McTaggart0 -
Do I have to many internal links which is diluting link juice to less important pages
Hello Mozzers, I was looking at my homepage and subsequent category landing pages on my on my eCommerce site and wondered whether I have to many internal links which could in effect be diluting link juice to much of the pages I need it to flow. My homepage has 266 links of which 114 (43%) are duplicate links which seems a bit to much to me. One of my major competitors who is a national company has just launched a new site design and they are only showing popular categories on their home page although all categories are accessible from the menu navigation. They only have 123 links on their home page. I am wondering whether If I was to not show every category on my homepage as some of them we don't really have any sales from and only concerntrate on popular ones there like my competitors , then the link juice flowing downwards in the site would be concerntated as I would have less links for them to flow ?... Is that basically how it works ? Is there any negatives with regards to duplicate links on either home or category landing page. We are showing both the categories as visual boxes to select and they are also as selectable links on the left of a page ? Just wondered how duplicate links would be treated? Any thoughts greatly appreciated thanks Pete
Intermediate & Advanced SEO | | PeteC120 -
Recovering from robots.txt error
Hello, A client of mine is going through a bit of a crisis. A developer (at their end) added Disallow: / to the robots.txt file. Luckily the SEOMoz crawl ran a couple of days after this happened and alerted me to the error. The robots.txt file was quickly updated but the client has found the vast majority of their rankings have gone. It took a further 5 days for GWMT to file that the robots.txt file had been updated and since then we have "Fetched as Google" and "Submitted URL and linked pages" in GWMT. In GWMT it is still showing that that vast majority of pages are blocked in the "Blocked URLs" section, although the robots.txt file below it is now ok. I guess what I want to ask is: What else is there that we can do to recover these rankings quickly? What time scales can we expect for recovery? More importantly has anyone had any experience with this sort of situation and is full recovery normal? Thanks in advance!
Intermediate & Advanced SEO | | RikkiD220 -
Generating 404 Errors but the Pages Exist
Hey I have recently come across an issue with several of a sites urls being seen as a 404 by bots such as Xenu, SEOMoz, Google Web Tools etc. The funny thing is, the pages exist and display fine. This happens on many of the pages which use the Modx CMS, but the index is fine. The wordpress blog in /blog/ all works fine. The only thing I can think of is that I have a conflict in the htaccess, but troubleshooting this is difficult, any tool I have found online seem useless. Have tried to rollback to previous versions but still does not work. Anyone had any experience of similar issues? Many thanks K.
Intermediate & Advanced SEO | | Found0