Sitemaps: HTML and/or XML?
-
Can someone explain sitemaps, and whether you need HTML, XML, or both?
I have one site with a few HTML sitemaps: one for products, one for categories. I have another site with just one XML sitemap for the entire site, which has a massive number of pages (600k+).
Should I be dividing the site with the massive number of pages into HTML sitemaps, like my other site?
-
If you have a large website with hundreds or thousands of pages, you can use your XML sitemap to prioritise which pages Google should see first. Your HTML sitemap should sit in the footer of your website; it is important to have because it should increase the speed at which Google sees all the pages on the website. I always recommend having both XML and HTML.
-
You mention XML sitemaps. Each one can contain no more than 50,000 URLs and must be no larger than 50MB uncompressed.
What you do is set up a main XML sitemap (a sitemap index) that contains the URLs of your individual sitemaps, each holding up to 50K URLs. BFYO (Blind Five Year Old) has a great article on this: http://www.blindfiveyearold.com/optimize-your-sitemap-index
Main support doc on sitemaps
https://support.google.com/webmasters/answer/183668?hl=en&ref_topic=8476
Reference for Index sitemap
https://support.google.com/webmasters/answer/71453
As Moosa mentioned, the XML sitemap really helps Google find all your important links and crawl the site. You need to have one set up and submitted to Google Webmaster Tools. Note that if you have an index sitemap pointing to others, you can just submit the index and Google can find the rest.
As for an HTML sitemap, that is an HTML page that users can browse to find your pages. It also helps the bots. You can have an HTML sitemap, but I would limit it to your main pages and category pages, which can then lead to all of your product pages and so on. I would not bother with an extensive HTML sitemap listing every product on your website when your paginated category pages already do this and act as an extension of your main HTML sitemap.
-
An XML sitemap helps Google crawl the site, whereas HTML sitemaps are usually there to give visitors a better and easier site experience.
In my opinion, having an XML sitemap is great because it helps Google crawl the site and index it in the search engine, but an HTML sitemap has no technical use. If you think your visitors need one, then go for it, but otherwise an XML sitemap is enough for a website!
Hope this helps!