Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Should XML sitemaps include *all* pages or just the deeper ones?
-
Hi guys,
Ok this is a bit of a sitemap 101 question but I cant find a definitive answer:
When we're running out XML sitemaps for google to chew on (we're talking ecommerce and directory sites with many pages inside sub-categories here) is there any point in mentioning the homepage or even the second level pages? We know google is crawling and indexing those and we're thinking we should trim the fat and just send a map of the bottom level pages.
What do you think?
-
It is correct that DA, PA, depth of pages, etc. are all factors in determining which pages get indexed. If your site offers good navigation, reasonable backlinks, anchor text, etc then you can get close to all pages indexed even on a very large site.
Your site map should naturally include a date on every link which indicates when content was added or changed. Even if you submit a 10k list of links, Google can evaluate the dates on each link and determine which content has been added or modified since your site was last crawled.
-
Well yes, that's kinda my point. We do have a sensible, crawlable navigation so there will be no problems there, so then the sitemap really becomes an indicator of what needs to be crawled (new and updated pages), but then the same question stands...
With other sites we've managed with thousands of pages we've found it detrimental to give Google hundreds of pages to crawl on a sitemap that we don't feel are important. We're pretty sure (and SEOmoz staff have supported this) that domain authority and the number of pages you can get into the index are closely related.
-
Tim,
We always index ALL pages...the help tip on Google XML also suggests including all pages of your site in the XML sitemap.
-
Your sitemap should include every page of your site that you wish to be indexed.
The idea is that if your site does not provide crawlable navigation, Google can use your sitemap to crawl your site. There are some sites that use flash and when a crawler lands on a page there is absolutely no where for the crawler to go.
If your site navigation is solid then a sitemap doesn't offer any value to Google other then an indicator of when content is updated or added.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Desktop & Mobile XML Sitemap Submitted But Only Desktop Sitemap Indexed On Google Search Console
Hi! The Problem We have submitted to GSC a sitemap index. Within that index there are 4 XML Sitemaps. Including one for the desktop site and one for the mobile site. The desktop sitemap has 3300 URLs, of which Google has indexed (according to GSC) 3,000 (approx). The mobile sitemap has 1,000 URLs of which Google has indexed 74 of them. The pages are crawlable, the site structure is logical. And performing a Landing Page URL search (showing only Google/Organic source/medium) on Google Analytics I can see that hundreds of those mobile URLs are being landed on. A search on mobile for a longtail keyword from a (randomly selected) page shows a result in the SERPs for the mobile page that judging by GSC has not been indexed. Could this be because we have recently added rel=alternate tags on our desktop pages (and of course corresponding canonical ones on mobile). Would Google then 'not index' rel=alternate page versions? Thanks for any input on this one. PmHmG
Technical SEO | | AlisonMills0 -
Blog Page Titles - Page 1, Page 2 etc.
Hi All, I have a couple of crawl errors coming up in MOZ that I am trying to fix. They are duplicate page title issues with my blog area. For example we have a URL of www.ourwebsite.com/blog/page/1 and as we have quite a few blog posts they get put onto another page, example www.ourwebsite.com/blog/page/2 both of these urls have the same heading, title, meta description etc. I was just wondering if this was an actual SEO problem or not and if there is a way to fix it. I am using Wordpress for reference but I can't see anywhere to access the settings of these pages. Thanks
Technical SEO | | O2C0 -
Sitemap indexed pages dropping
About a month ago I noticed my pages indexed from my sitemap are dropping.There are 134 pages in my sitemap and only 11 are indexed. It used to be 117 pages and just died off quickly. I still seem to be getting consistant search traffic but I'm just not sure whats causing this. There are no warnings or manual actions required in GWT that I can find.
Technical SEO | | zenstorageunits0 -
Will an XML sitemap override a robots.txt
I have a client that has a robots.txt file that is blocking an entire subdomain, entirely by accident. Their original solution, not realizing the robots.txt error, was to submit an xml sitemap to get their pages indexed. I did not think this tactic would work, as the robots.txt would take precedent over the xmls sitemap. But it worked... I have no explanation as to how or why. Does anyone have an answer to this? or any experience with a website that has had a clear Disallow: / for months , that somehow has pages in the index?
Technical SEO | | KCBackofen0 -
Should I include tags in sitemap?
Hello All, I was wondering if you should include tags and categories in your sitemap. In the past on previous blogs I have always left tags and categories out. The reason for this is a good friend of mine who has been doing SEO for a long time and inhouse always told me that this would result in duplicate content. I thought that it would be a great idea to get some input from the SEOmoz community as this obviously has a big affect on your blog and the number of pages indexed. Any help would be great. Thanks, Luke Hutchinson.
Technical SEO | | LukeHutchinson1 -
Do we need to manually submit a sitemap every time, or can we host it on our site as /sitemap and Google will see & crawl it?
I realized we don't have a sitemap in place, so we're going to get one built. Once we do, I'll submit it manually to Google via Webmaster tools. However, we have a very dynamic site with content constantly being added. Will I need to keep manually re-submitting the sitemap to Google? Or could we have the continually updating sitemap live on our site at /sitemap and the crawlers will just pick it up from there? I noticed this is what SEOmoz does at http://www.seomoz.org/sitemap.
Technical SEO | | askotzko0 -
Sitemap for dynamic website with over 10,000 pages
If I have a website with thousands of products, is it a good idea to create a sitemap for this website for the search engines where you show maybe 250 products on a page so it makes it easy for the search engine to find the part and also puts that part closer to the home page? Seems like google likes pages that are the closest to the home page (less clicks the better)
Technical SEO | | roundbrix0