How important are sitemap errors?
-
If there aren't any crawling / indexing issues with your site, how important do thing sitemap errors are? Do you work to always fix all errors?
I know here: http://www.seomoz.org/blog/bings-duane-forrester-on-webmaster-tools-metrics-and-sitemap-quality-thresholds
Duane Forrester mentions that sites with many 302's 301's will be punished--does any one know Googe's take on this?
-
Very important. Particularly if you have a large site. We operate a large site with 100,000's of pages and as Dan said it can be difficult to maintain. We use something called Unlimited XML Sitemap Generator which builds XML sitemaps for us automatically. I'd highly recommend it although it takes a bit of fiddling with to get it up and running as it's software which sits on site. We couldn't manage without it as we'd be forever on sitemaps.
We found that getting sitemaps right on a large site made a huge difference to the crawl rate that we encountered in GWT and a huge indexation to follow.
In particular check for 302's. I made the mistake of leaving those for a while and am sure that we suffered from some loss of link equity along the way.
Hope it helps
Dawn
-
Your sitemap should only list pages that actually exist.
If you delete some pages, then you need to rebuild the sitemap.
Ditto if you delete them and redirect.
Google is always lagging, so if you delete 10 pages and then update the sitemap, even if google downloads the sitemap immediately, they will still be running crawls on the old map, and they may be crawling the now-missing pages, but haven't shown the failures in your WMT yet.
If you update your sitemap quickly, it is possible they will never crawl the missing pages and get a 404 or 301.
(but of course, there could be other sites pointing to the now-missing pages, and the 404s will show up elsewhere as missing)
I am always checking, adding, deleting and redirecting pages, and I update the current sitemap every hour and all the others are rebuilt at midnight every night. I usually do deletions just before midnight if I can, to minimize the time the sitemap is out of sync.
-
As far as I know Google is more lenient with sitemap errors, but I would still recommend looking into it. The first step would be to be sure your sitemap is up to date to begin with - and has all the URLs you want (and not any you don't want). The main thing is none of them should 404 and then beyond that, yes, they should return 200's.
Unless you're dealing with a gigantic site which might be hard to maintain, in theory there shouldn't be errors in sitemaps if you have the correct URLs in there.
Even better, if you're running WordPress the Yoast SEO plugin will generate an XML sitemap for you and it update automatically.
Hope that helps!
-Dan
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Importance (or lack of) Meta keywords tags and Tags in Drupal
I'm wondering should I put any effort in making Meta Keywords tags for my pages or normal Tags (they're separate in Drupal), since apparently first are not considered by most of search engines, while not sure about normal tags. Obviously SERPS has to determine partial valu of the page by content, thus consider keywords / tags to some extend. What's your opinion on that. Thank you.
Intermediate & Advanced SEO | | Optimal_Strategies1 -
URL indexed but not submitted in sitemap, however the URL is in the sitemap
Dear Community, I have the following problem and would be super helpful if you guys would be able to help. Cheers Symptoms : On the search console, Google says that some of our old URLs are indexed but not submitted in sitemap However, those URLs are in the sitemap Also the sitemap as been successfully submitted. No error message Potential explanation : We have an automatic cache clearing process within the company once a day. In the sitemap, we use this as last modification date. Let's imagine url www.example.com/hello was modified last time in 2017. But because the cache is cleared daily, in the sitemap we will have last modified : yesterday, even if the content of the page did not changed since 2017. We have a Z after sitemap time, can it be that the bot does not understands the time format ? We have in the sitemap only http URL. And our HTTPS URLs are not in the sitemap What do you think?
Intermediate & Advanced SEO | | ZozoMe0 -
How important is it to rank for a product category?
We make a product in a category of products -- let's say "donuts". There are really only 4 major donut companies (lots of artisanal donuts out there, but they're not really competitive yet). One of our competitors has systematically achieved top rank for "donut" and lots of adjacent keywords like "donuts" and "buy donuts". My question is, does their success ranking for the product category keyword "donut" influence their success ranking for long-tail keywords like "powdered donuts" and "tastiest donuts"? Or, to flip that question, should we try to compete for "donut" before worrying about "decadent delicious donuts"? Other factors: In terms of search volume, as you would expect, "donut" sees 10 to 1000 times as many searches as most of the other keywords adjacent to it. We can definitely compete for "donut" -- just trying to figure out if doing so should be our top priority.
Intermediate & Advanced SEO | | hoosteeno0 -
Does a sitemap override Google parameter handling?
This question might seem silly, but I'll ask anyway. We have an eCommerce site with a ton of duplicate content, mostly caused by faceted navigation. In researching ways to reduce the clutter, I've decided to use Google parameter handling to stop Googlebot from crawling pages with certain parameters, like: sort order, page #, etc... Now my question: If I set all of these parameters so that Googlebot doesn't crawl the grids, how will they ever find the individual product pages? We do upload a sitemap with all of the product pages. Does this solve my issue? Or, should I handle the duplicate content with noindex, follow tag? Or, is there an even better way? Thanks
Intermediate & Advanced SEO | | rhoadesjohn0 -
Is there a maximum amount of pages that should be added on a sitemap daily?
I started a new music site that has a database of 8,000,000 songs and 500,000+ artists that we are cross referencing with free & legal content sources. Each song essentially has its own page. We are about to start adding links to a sitemap and wanted to find the best practices. Should we add all 8,000,000+ links at once? Should we add a maximum amount a day? Maybe max 5,000? What are the pros and cons of slowly adding the pages or adding them all at once. Any risks? At the rate google is crawling our page it will take 8 years to have all of our songs indexed (It would be very hard to crawl all of our songs as our system is more of an app). I wan't to play it safe and not do anything that will come off as spammy. I have been trying to find some actual evidence on what the best course of action is. Thanks in Advance!
Intermediate & Advanced SEO | | mikecrib10 -
Does Google index more than three levels down if the XML sitemap is submitted via Google webmaster Tools?
We are building a very big ecommerce site. The site has 1000 products and has many categories/levels. The site is still in construccion so you cannot see it online. My objective is to get Google to rank the products (level 5) Here is an example level 1 - Homepage - http://vulcano.moldear.com.ar/ Level 2 - http://vulcano.moldear.com.ar/piscinas/ Level 3 - http://vulcano.moldear.com.ar/piscinas/electrobombas-para-piscinas/ Level 4 - http://vulcano.moldear.com.ar/piscinas/electrobombas-para-piscinas/autocebantes.html/ Level 5 - Product is on this level - http://vulcano.moldear.com.ar/piscinas/electrobombas-para-piscinas/autocebantes/autocebante-recomendada-para-filtros-vc-10.html Thanks
Intermediate & Advanced SEO | | Carla_Dawson0 -
Domain and Sitemap Question
Hi - I am hoping you can help me with this issue we are currently trying to solve. We are hosting our mobile site's content on a different domain than what the URL of the site is, though owned by same company. In Google Webmasters tool we have the mobile sitemap under "sitemaps.xyz.com", however the URL of the site is "m.xyz.com". We have submitted 60MM pages in the mobile sitemap, but only 1MM pages have been indexed. Do you think this set up causes confusion with the bots? Does this affect the crawlability of the site? Any thoughts would be greatly appreciated. Thank you!
Intermediate & Advanced SEO | | ladylana
Eva0 -
Magento Hidden Products & Google Not Found Errors
We recently moved our website over to the Magento eCommerce platform. Magento has functionality to make certain items not visible individually so you can, for example, take 6 products and turn it into 1 product where a customer can choose their options. You then hide all the individual products, leaving only that one product visible on the site and reducing duplicate content issues. We did this. It works great and the individual products don't show up in our site map, which is what we'd like. However, Google Webmaster Tools has all of these individual product URLs in its Not Found Crawl Errors. ! For example: White t-shirt URL: /white-t-shirt Red t-shirt URL: /red-t-shirt Blue t-shirt URL: /blue-t-shirt All of those are not visible on the site and the URLs do not appear in our site map. But they are all showing up in Google Webmaster Tools. Configurable t-shirt URL: /t-shirt This product is the only one visible on the site, does appear on the site map, and shows up in Google Webmaster Tools as a valid URL. ! Do you know how it found the individual products if it isn't in the site map and they aren't visible on the website? And how important do you think it is that we fix all of these hundreds of Not Found errors to point to the single visible product on the site? I would think it is fairly important, but don't want to spend a week of man power on it if the returns would be minimal. Thanks so much for any input!
Intermediate & Advanced SEO | | Marketing.SCG0