Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How important are sitemap errors?
-
If there aren't any crawling / indexing issues with your site, how important do thing sitemap errors are? Do you work to always fix all errors?
I know here: http://www.seomoz.org/blog/bings-duane-forrester-on-webmaster-tools-metrics-and-sitemap-quality-thresholds
Duane Forrester mentions that sites with many 302's 301's will be punished--does any one know Googe's take on this?
-
Very important. Particularly if you have a large site. We operate a large site with 100,000's of pages and as Dan said it can be difficult to maintain. We use something called Unlimited XML Sitemap Generator which builds XML sitemaps for us automatically. I'd highly recommend it although it takes a bit of fiddling with to get it up and running as it's software which sits on site. We couldn't manage without it as we'd be forever on sitemaps.
We found that getting sitemaps right on a large site made a huge difference to the crawl rate that we encountered in GWT and a huge indexation to follow.
In particular check for 302's. I made the mistake of leaving those for a while and am sure that we suffered from some loss of link equity along the way.
Hope it helps
Dawn
-
Your sitemap should only list pages that actually exist.
If you delete some pages, then you need to rebuild the sitemap.
Ditto if you delete them and redirect.
Google is always lagging, so if you delete 10 pages and then update the sitemap, even if google downloads the sitemap immediately, they will still be running crawls on the old map, and they may be crawling the now-missing pages, but haven't shown the failures in your WMT yet.
If you update your sitemap quickly, it is possible they will never crawl the missing pages and get a 404 or 301.
(but of course, there could be other sites pointing to the now-missing pages, and the 404s will show up elsewhere as missing)
I am always checking, adding, deleting and redirecting pages, and I update the current sitemap every hour and all the others are rebuilt at midnight every night. I usually do deletions just before midnight if I can, to minimize the time the sitemap is out of sync.
-
As far as I know Google is more lenient with sitemap errors, but I would still recommend looking into it. The first step would be to be sure your sitemap is up to date to begin with - and has all the URLs you want (and not any you don't want). The main thing is none of them should 404 and then beyond that, yes, they should return 200's.
Unless you're dealing with a gigantic site which might be hard to maintain, in theory there shouldn't be errors in sitemaps if you have the correct URLs in there.
Even better, if you're running WordPress the Yoast SEO plugin will generate an XML sitemap for you and it update automatically.
Hope that helps!
-Dan
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Shopify: AggregateRating Schema Error
Hi lovely community, I know google made some schema changes in Sept 2019. I got an AggregateRating Error:
Intermediate & Advanced SEO | | Insightful_Media
One of offers or review or aggregateRating should be provided. I am using a third-party app 'Shopify Product Review' to implement the rating. What I should do to solve this error. Thanks very much for the help! I found many people have this issue too in the community! Many thanks Pui0 -
URL structure change and xml sitemap
At the end of April we changed the url structure of most of our pages and 301 redirected the old pages to the new ones. The xml sitemaps were also updated at that point to reflect the new url structure. Since then Google has not indexed the new urls from our xml sitemaps and I am unsure of why. We are at 4 weeks since the change, so I would have thought they would have indexed the pages by now. Any ideas on what I should check to make sure pages are indexed?
Intermediate & Advanced SEO | | ang0 -
Substantial difference between Number of Indexed Pages and Sitemap Pages
Hey there, I am doing a website audit at the moment. I've notices substantial differences in the number of pages indexed (search console), the number of pages in the sitemap and the number I am getting when I crawl the page with screamingfrog (see below). Would those discrepancies concern you? The website and its rankings seems fine otherwise. Total indexed: 2,360 (Search Consule)
Intermediate & Advanced SEO | | Online-Marketing-Guy
About 2,920 results (Google search "site:example.com")
Sitemap: 1,229 URLs
Screemingfrog Spider: 1,352 URLs Cheers,
Jochen0 -
How important is admin-ajax.php?
Hi there! It's been a long time since I last did a technical audit of a site. I've currently playing with the 'fetch as google' tool to find out if we're blocking anything vital. The site is based on Wordpress, and after a recent hacking incident, a previous SEO moved the login portal from domain.com/wp-admin/ to domain.com/pr3ss/wp-admin/ - to stop people finding it. Fair enough. But they then updated the robots.txt file to look like this: User-agent: *
Intermediate & Advanced SEO | | Muhammad-Isap
Disallow: /pr3ss/wp-admin/ Now, some pages are trying to draw on theme elements like: http://www.domain.com/pr3ss/wp-admin/admin-ajax.php
http://www.domain.com/pr3ss/wp-content/themes/bestpracticegroup/images/column_wrapper_bg.png And are naturally being blocked (not that this seems to affect the way pages are rendering in Google's eyes) A good SEO friend of mine has suggested allowing the theme folder, and any sub folders where this becomes an issue. What are your thoughts? Is it even worth disallowing the /pr3ss/wp-admin/ path? Cheers guys and gals! All the best, John. I've found a couple of the theme's0 -
What Happens If a Hreflang Sitemap Doesn't Include Every Language for Missing Translated Pages?
As we are building a hreflang sitemap for a client, we are correctly implementing the tag across 5 different languages including English. However, the News and Events section was never translated into any of the other four languages. There are also a few pages that were translated into some but not all of the 4 languages. Is it good practice to still list out the individual non-translated pages like on a regular sitemap without a hreflang tag? Should the hreflang sitemap include the hreflang tag with pages that are missing a few language translations (when one or two language translations may be missing)? We are uncertain if this inconsistency would create a problem and we would like some feedback before pushing the hreflang sitemap live.
Intermediate & Advanced SEO | | kchandler0 -
URL Error or Penguin Penalty?
I am currently having a major panic as our website www.uksoccershop.com has been largely dropped from Google. We have not made any changes recently and I am not sure why this is happening, but having heard all sorts of horror stories of penguin update, I am fearing the worst. If you google "uksoccershop" you will see that the homepage does not rank. We previously ranked in the top 3 for "football shirts" but now we don't, although on page 2, 3 and 4 you will see one of our category pages ranking (this didn't used to happen). Some rankings are intact, but many have disappeared completely and in some cases been replaced by other pages on our site. I should point out our existing rankings have been consistently there for 5-6 years until today. I logged into webmaster tools and thankfully there is no warning message from Google about spam, etc, but what we do have is 35,000 URL errors for pages which are accessible. An example of this is: | URL: | http://www.uksoccershop.com/categories/5_295_327.html | | Error details In Sitemaps Linked from Last crawled: 6/20/12First detected: 6/15/12Googlebot couldn't access the contents of this URL because the server had an internal error when trying to process the request. These errors tend to be with the server itself, not with the request. Is it possible this is the cause of the issue (we are not currently sure why the URL's are being blocked) and if so, how severe is it and how recoverable?If that is unlikely to cause the issue, what would you recommend our next move is?All help is REALLY REALLY appreciated 🙂
Intermediate & Advanced SEO | | ukss19840 -
Sitemaps. When compressed do you use the .gz file format or the (untidy looking, IMHO) .xml.gz format?
When submitting compressed sitemaps to Google I normally use the a file named sitemap.gz A customer is banging on that his web guy says that sitemap.xml.gz is a better format. Google spiders sitemap.gz just fine and in Webmaster Tools everything looks OK... Interested to know other SEOmoz Pro's preferences here and also to check I haven't made an error that is going to bite me in the ass soon! Over to you.
Intermediate & Advanced SEO | | NoisyLittleMonkey0 -
Can a XML sitemap index point to other sitemaps indexes?
We have a massive site that is having some issue being fully crawled due to some of our site architecture and linking. Is it possible to have a XML sitemap index point to other sitemap indexes rather than standalone XML sitemaps? Has anyone done this successfully? Based upon the description here: http://sitemaps.org/protocol.php#index it seems like it should be possible. Thanks in advance for your help!
Intermediate & Advanced SEO | | CareerBliss0