Moz Q&A is closed.
After more than 13 years and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content (many posts will still be viewable), we have locked both new posts and new replies.
Should XML sitemaps include *all* pages or just the deeper ones?
-
Hi guys,
OK, this is a bit of a sitemap 101 question, but I can't find a definitive answer:
When we're rolling out XML sitemaps for Google to chew on (we're talking ecommerce and directory sites with many pages inside sub-categories here), is there any point in listing the homepage or even the second-level pages? We know Google is already crawling and indexing those, so we're thinking we should trim the fat and just send a map of the bottom-level pages.
What do you think?
-
It is correct that domain authority (DA), page authority (PA), depth of pages, etc. are all factors in determining which pages get indexed. If your site offers good navigation, reasonable backlinks, anchor text, etc., then you can get close to all pages indexed even on a very large site.
Your sitemap should include a date on every link (the lastmod field) indicating when the content was added or last changed. Even if you submit a list of 10k links, Google can evaluate the date on each link and determine which content has been added or modified since your site was last crawled.
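To make the "date on every link" point concrete, here is a minimal sketch of generating a sitemap in which every URL carries a lastmod date. The example.com URLs, the dates, and the build_sitemap helper are made up for illustration; only the urlset/url/loc/lastmod element names and the namespace come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, last_modified) pairs.

    build_sitemap is a hypothetical helper, not part of any library.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod uses an ISO 8601 date, e.g. 2024-01-15
        ET.SubElement(entry, "lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: homepage, a category page, and a deep product page.
pages = [
    ("https://www.example.com/", date(2024, 1, 15)),
    ("https://www.example.com/widgets/", date(2024, 2, 20)),
    ("https://www.example.com/widgets/blue-widget", date(2024, 3, 2)),
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Listing the homepage and category pages alongside the deep pages costs only a few extra entries, and the lastmod values are what tell a crawler which of them actually changed.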
-
Well yes, that's kinda my point. We do have a sensible, crawlable navigation so there will be no problems there, so then the sitemap really becomes an indicator of what needs to be crawled (new and updated pages), but then the same question stands...
On other sites we've managed with thousands of pages, we've found it detrimental to give Google a sitemap containing hundreds of pages that we don't feel are important. We're pretty sure (and SEOmoz staff have supported this) that domain authority and the number of pages you can get into the index are closely related.
-
Tim,
We always index ALL pages... The help tip on Google XML also suggests including all pages of your site in the XML sitemap.
-
Your sitemap should include every page of your site that you wish to be indexed.
The idea is that if your site does not provide crawlable navigation, Google can use your sitemap to crawl your site. Some sites use Flash, and when a crawler lands on a page there is absolutely nowhere for the crawler to go.
If your site navigation is solid, then a sitemap doesn't offer any value to Google other than as an indicator of when content is updated or added.
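As a rough illustration of the sitemap acting as an "updated or added" indicator, the sketch below reads a sitemap and picks out the URLs whose lastmod is newer than a given crawl date. This is not how Google actually processes sitemaps (that isn't public); the changed_since helper, the sample XML, and the dates are all assumptions for the example.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Prefix mapping for the sitemaps.org namespace.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def changed_since(sitemap_xml, last_crawl):
    """Return URLs whose lastmod date is later than last_crawl.

    changed_since is a hypothetical helper, not a real crawler API.
    """
    root = ET.fromstring(sitemap_xml)
    changed = []
    for entry in root.findall("sm:url", NS):
        loc = entry.findtext("sm:loc", namespaces=NS)
        lastmod = entry.findtext("sm:lastmod", namespaces=NS)
        if not loc or not lastmod:
            continue  # entries without a date give no freshness signal
        # Keep only the YYYY-MM-DD part so full timestamps also parse.
        if date.fromisoformat(lastmod.strip()[:10]) > last_crawl:
            changed.append(loc.strip())
    return changed


# Made-up sitemap: one old page and one recently updated page.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc><lastmod>2024-01-15</lastmod></url>
  <url><loc>https://www.example.com/widgets/blue-widget</loc><lastmod>2024-03-02T10:00:00+00:00</lastmod></url>
</urlset>"""

recent = changed_since(SAMPLE, date(2024, 2, 1))
print(recent)
```

With accurate dates, a full sitemap of every page you want indexed still lets a consumer skip straight to the handful of entries that changed.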