Omitting URLs from XML Sitemap - Bad??
-
Hi all,
We are working on an extremely large retail site with some major duplicate content issues that we are in the process of remedying. The site also does not currently have an XML sitemap.
Would it be advisable to create a small XML sitemap with only the main category pages for the time being, and then after our duplicate content issues are resolved, uploading the complete sitemap? Or should we wait to upload anything until all work is complete down to the product page level and canonicals are in place? Will uploading a incomplete sitemap be fraudulent or misleading in the eyes of the search engines and prompt a penalty, or would having at least the main pages mapped while we continue work be okay?
Please let me know if more info is needed to answer! Thanks in advance!
-
Some good answers here, so I'll just throw in my own 2 cents.
The purpose of a sitemap is to help search engines find pages they might not otherwise find during a regular crawl. Sometimes sitemaps can help pages get indexed faster. Other sitemaps serve special purposes, such as News or Video sitemaps, which can add extra information and help ranking particular types of content.
In reality, many, many sitemaps are incomplete, missing, or flat out wrong. To my knowledge, no search engine will penalize you for this, as they would be penalizing half the web.
The danger of an inaccurate sitemap is that the search engines may chose to ignore it completely. Daune Forrester of Bing has stated that if they find a 1% error rate in your sitemap file, then they will disregard the file. However, no such action is known to exist for incomplete sitemaps.
So I'd say there is little in submitting a sitemap of your truly important page. Unfortunately, this won't stop Google from discovering or crawling your duplicate content issues.
The faster you get these fixed, the better.
-
Hi Thomas,
Definitely comforting to hear that you ran your site with an incomplete sitemap without seeing any negative results. Like I said in my response above, I think we will proceed with the partial sitemap, just to have one on there, and then upload a complete one once we can clean up the site a little more. Thanks for your insights - they were very helpful!
-
Hi Saijo,
Thanks for your recommendations! We do want to place a little more emphasis on our top level and main navigation pages, so I think we will probably proceed with a preliminary sitemap with just those pages for now. Once we get to that point, we'll definitely be needing to use multiple sitemaps and an index - thanks for pointing this out!
-
I ran my site with an incomplete site map for years and didn't seem to have a negative effect. I feel that any site map is better than no sitemap. Sitemaps are such a small part of the SEO equation. What they are most useful for is telling Google what to crawl. Beyond that, I don't believe they have much relevance in passing authority.
-
My Theory On this ( I have no tests to prove this )
If you upload a verified site map thats is essentially telling Google these are the important pages on my site. would you want to risk the importance of the other pages by telling Google you consider only the few important categories as the important ones . They wont drop the other pages completely but MIGHT see them as less important .
I really don't see Google penalizing a site for incomplete sitemaps.
If you have a really large site you might also want to look in to multiple sitemaps : http://googlewebmastercentral.blogspot.com.au/2006/10/multiple-sitemaps-in-same-directory.html
You might also want to look in to the best situations to use rel canonical vs a 301 redirect .
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
XML sitemap and rel alternate hreflang requirements for Google Shopping
Our company implemented Google Shopping for our site for multiple countries, currencies and languages. Every combination of language and country is accessible via a url path and for all site pages, not just the pages with products for sale. I was not part of the project. We support 18 languages and 14 shop countries. When the project was finished we had a total of 240 language/country combinations listed in our rel alternate hreflang tags for every page and 240 language/country combinations in our XML sitemap for each page and canonicals are unique for every one of these page. My concern is with duplicate content. Also I can see odd language/country url combinations (like a country with a language spoken by a very low percentage of people in that country) are being crawled, indexed, and appearing in serps. This uses up my crawl budget for pages I don't care about. I don't this it is wise to disallow urls in robots.txt for that we are simultaneously listing in the XML sitemap. Is it true that these are requirements for Google Shopping to have XML sitemap and rel alternate hreflang for every language/country combination?
Technical SEO | | awilliams_kingston0 -
Xml sitemaps giving 404 errors
We have recently made updates to our xml sitemap and have split them into child sitemaps. Once these were submitted to search console, we received notification that the all of the child sitemaps except 1 produced 404 errors. However, when we view the xml sitemaps in a browser, there are no errors. I have also attempted crawling the child sitemaps with Screaming Frog and received 404 responses there as well. My developer cannot figure out what is causing the errors and I'm hoping someone here can assist. Here is one of the child sitemaps: http://www.sermonspice.com/sitemap-countdowns_paged_1.xml
Technical SEO | | ang0 -
Sitemap links
Hi, I´m running a sitemap using pro-sitemaps and I find several pages that shouldn´t be listed. How do I find how are these pages being generated? Can´t find the links the robot is following to get to those pages..
Technical SEO | | ceci27100 -
Affiliate urls and duplicate content
Hi, What is the best way to get around having an affiliate program, and the affiliate links on your site showing as duplicate content?
Technical SEO | | Memoz0 -
What are the SEO implications of URLs that use a # in them?
I have several clients who have begun to ask questions about sites that are designed to look like a single page. When you click on a link, the URL changes but it uses a # before (i.e. http://www.kelloggs.com/teamusa**/#**/teamusa/athletes/kerri-walsh.html. What are the SEO implications of having a page set up this way? I noticed that Google has indexed this page but the indexed URL does not include a #. Is Google indexing a separate version of this page? Any insights would be really helpful! Thanks
Technical SEO | | VMLYRDiscoverability0 -
Ignore Urls with pattern.
I have 7000 warnings of urls because of a 302 redirect. http://imageshack.us/photo/my-images/215/44060409.png/ I want to get rid of those, is it possible to get rid of the Urls with robots.txt. For example that it does not crawl anything that has /product_compare/ in its url? Thank you
Technical SEO | | levalencia10 -
Duplicate XML sitemaps - 404 or leave alone?
We switched over from our standard XML sitemap to a sitemap index. Our old sitemap was called sitemap.xml and the new one is sitemapindex.xml. In Webmaster Tools it still shows the old sitemap.xml as valid. Also when you land on our sitemap.xml it will display the sitemap index, when really the index lives on sitemapindex.xml. The reason you can see the sitemap on both URLs is because this is set from the sitemap plugin. So the question is, should we change the plugin setting to let the old sitemap.xml 404, or should we allow the new sitemap index to be accessed on both URLs?
Technical SEO | | Hakkasan0 -
URL Structure
Hi Guys, I'm in the process of creating a very exciting startup aimed at the baby industry. It's essentially a social commerce question where parents can shop for products, create lists of products and ask questions. The challenge I'm facing is how best to structure my URLs from an SEO standpoint. For example a common baby topic such as "feeding", can sit in all three categories: Shopping category aggregates all products related to feeding List category aggregates all lists related to feeding Question category aggregates all question and answers on feeding So for that keyword "feeding" you have 3 potential landing pages. What I was wondering is what is the most effective way of doing it? I was thinking of something along these lines: /shopping/feeding /baby_list/feeding /ask/feeding Would love to hear your points of view on this. Thanks! Walid
Technical SEO | | walidalsaqqaf0