Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Can a XML sitemap index point to other sitemaps indexes?
-
We have a massive site that is having some issue being fully crawled due to some of our site architecture and linking.
Is it possible to have a XML sitemap index point to other sitemap indexes rather than standalone XML sitemaps?
Has anyone done this successfully? Based upon the description here:
http://sitemaps.org/protocol.php#index it seems like it should be possible.
Thanks in advance for your help!
-
It's easy to implement. I've broken the content up into multiple sitemaps before as a diagnostic tool as well, to see which sections of the site are getting most of their pages indexed and which sections of the site have few pages indexed.
-
You only need to use a sitemap index file if you have over 50,000 URL's. So if your site is that massive, then yes you'll want to do this. Basically all you do is create a bunch of sitemaps. You can then create a sitemap index file that is just a list of links to those sitemaps. You then submit your sitemap index file to google or each sitemap individually. Just glancing at your link it is pretty darn clear, there's not much to it.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Pending Sitemaps
Hi, all Wondering if someone could give me a pointer or two, please. I cannot seem to get Google or Bing to crawl my sitemap. If I submit the sitemap in WMT and test it I get a report saying 44,322urls found. However, if I then submit that same sitemap it either says Pending (in old WMT) or Couldn't fetch in the new version. This couldn't fetch is very puzzling as it had no issue fetching the map to test it. My other domains on the same server are fine, the problem is limited to this one site. I have tried several pages on the site using the Fetch as Google tool and they load without issue, however, try as I may, it will not fetch my sitemap. The sitemapindex.xml file won't even submit. I can confirm my sitemaps, although large, work fine, please see the following as an example (minus the spaces, of course, didn't want to submit and make it look like I was just trying to get a link) https:// digitalcatwalk .co.uk/sitemap.xml https:// digitalcatwalk .co.uk/sitemapindex.xml I would welcome any feedback anyone could offer on this, please. It's driving me mad trying to work out what is up. Many thanks, Jeff
Intermediate & Advanced SEO | | wonkydogadmin0 -
Spotify XML Sitemap
All, Working on an SEO work up for a Spotify site. Looks like they are using a sitemap that links to additional pages. A problem, none of the links are actually linked within the sitemap. This feels like a strong error. https://lubricitylabs.com/sitemap.xml Thoughts?
Intermediate & Advanced SEO | | dmaher0 -
:Pointing hreflang to a different domain
Hi all, Let's say I have two websites: www.mywebsite.com and www.mywebsite.de - they share a lot of content but the main categories and URLs are almost always different. Am I right in saying I can't just set the hreflang tag on every page of www.mywebsite.com to read: rel='alternate' hreflang='de' href='http://mywebsite.de' /> That just won't do anything, right? Am I also right in saying that the only way to use hreflang properly across two domains is to have a customer hreflang tag on every page that has identical content translated into German? So for this page: www.mywebsite.com/page.html my hreflang tag for the german users would be: <link < span="">rel='alternate' hreflang='de' href='http://mywebsite.de/page.html' /></link <> Thanks for your time.
Intermediate & Advanced SEO | | Bee1590 -
Priority Attribute in XML Sitemaps - Still Valid?
Is the priority value (scale of 0-1) used for each URL in an XML sitemap still a valid way of communicating to search engines which content you (the webmaster) believe is more important relative to other content on your site? I recall hearing that this was no longer used, but can't find a source. If it is no longer used, what are the easiest ways to communicate our preferences to search engines? Specifically, I'm looking to preference the most version version of a product's documentation (version 9) over the previous version (version 8). Thanks!
Intermediate & Advanced SEO | | Allie_Williams0 -
Hreflang in vs. sitemap?
Hi all, I decided to identify alternate language pages of my site via sitemap to save our development team some time. I also like the idea of having leaner markup. However, my site has many alternate language and country page variations, so after creating a sitemap that includes mostly tier 1 and tier 2 level URLs, i now have a sitemap file that's 17mb. I did a couple google searches to see is sitemap file size can ever be an issue and found a discussion or two that suggested keeping the size small and a really old article that recommended keeping it < 10mb. Does the sitemap file size matter? GWT has verified the sitemap and appears to be indexing the URLs fine. Are there any particular benefits to specifying alternate versions of a URL in vs. sitemap? Thanks, -Eugene
Intermediate & Advanced SEO | | eugene_bgb0 -
Noindex xml RSS feed
Hey, How can I tell search engines not to index my xml RSS feed? The RSS feed is created by Yoast on WordPress. Thanks, Luke.
Intermediate & Advanced SEO | | NoisyLittleMonkey0 -
How is Google crawling and indexing this directory listing?
We have three Directory Listing pages that are being indexed by Google: http://www.ccisolutions.com/StoreFront/jsp/ http://www.ccisolutions.com/StoreFront/jsp/html/ http://www.ccisolutions.com/StoreFront/jsp/pdf/ How and why is Googlebot crawling and indexing these pages? Nothing else links to them (although the /jsp.html/ and /jsp/pdf/ both link back to /jsp/). They aren't disallowed in our robots.txt file and I understand that this could be why. If we add them to our robots.txt file and disallow, will this prevent Googlebot from crawling and indexing those Directory Listing pages without prohibiting them from crawling and indexing the content that resides there which is used to populate pages on our site? Having these pages indexed in Google is causing a myriad of issues, not the least of which is duplicate content. For example, this file <tt>CCI-SALES-STAFF.HTML</tt> (which appears on this Directory Listing referenced above - http://www.ccisolutions.com/StoreFront/jsp/html/) clicks through to this Web page: http://www.ccisolutions.com/StoreFront/jsp/html/CCI-SALES-STAFF.HTML This page is indexed in Google and we don't want it to be. But so is the actual page where we intended the content contained in that file to display: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff As you can see, this results in duplicate content problems. Is there a way to disallow Googlebot from crawling that Directory Listing page, and, provided that we have this URL in our sitemap: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff, solve the duplicate content issue as a result? For example: Disallow: /StoreFront/jsp/ Disallow: /StoreFront/jsp/html/ Disallow: /StoreFront/jsp/pdf/ Can we do this without risking blocking Googlebot from content we do want crawled and indexed? Many thanks in advance for any and all help on this one!
Intermediate & Advanced SEO | | danatanseo0 -
Reducing Booking Engine Indexation
Hi Mozzers, I am working on a site with a very useful room booking engine. Helpful as it may be, all the variations (2 bedrooms, 3 bedrooms, room with a view, etc, etc,) are indexed by Google. Section 13 on Search Pagination in Dr. Pete's great post on Panda http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world speaks to our issue, but I was wondering since 2 (!) years have gone by, if there are any additional solutions y'all might recommend. We want to cut down on the duplicate titles and content and get the useful but not useful for SERPs online booking pages out of the index. Any thoughts? Thanks for your help.
Intermediate & Advanced SEO | | Leverage_Marketing0