Sitemaps: HTML and/or XML?
-
Can someone explain sitemaps, and whether you need HTML, XML, or both?
I have a site with a few HTML sitemaps: one for products, one for categories. I have another site with just one XML sitemap for the entire site, which has a massive number of pages (600k+).
Should I be dividing the large site into HTML sitemaps like my other site?
-
If you have a large website with hundreds or thousands of pages, you can use your XML sitemap to indicate which pages Google should see first. Your HTML sitemap should be linked from the footer of your website; it is important to have because it should increase the speed at which Google discovers all the pages on the site. I always recommend having both XML and HTML.
-
You mention XML sitemaps. Each sitemap can contain no more than 50,000 URLs and can be no larger than 50MB uncompressed.
What you do is set up a main XML sitemap (a sitemap index) that contains the URLs of your individual sitemaps, each holding up to 50K URLs. BFYO has a great article on this: http://www.blindfiveyearold.com/optimize-your-sitemap-index
Main support doc on sitemaps
https://support.google.com/webmasters/answer/183668?hl=en&ref_topic=8476
Reference for Index sitemap
https://support.google.com/webmasters/answer/71453
As Moosa mentioned, the XML sitemap really helps Google find all your important links and crawl the site. You need to have one set up and submitted to Google Webmaster Tools. Note that if you have an index sitemap pointing to others, you only need to submit the index, and Google can find the rest.
As for an HTML sitemap, that is an HTML page that users can browse to find your pages. It also helps the bots. You can have an HTML sitemap, but I would limit it to your main pages and category pages, which can then lead to all of your product pages etc. I would not bother with an extensive HTML sitemap listing every product on your website when your paginated category pages already do this and act as an extension of your main HTML sitemap.
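A small HTML sitemap of that kind, limited to main pages and categories, could be generated along these lines. The `build_html_sitemap` function, the section headings, and the URLs are invented for illustration; a real site would pull them from its CMS or category table.

```python
from html import escape


def build_html_sitemap(sections):
    """Turn {heading: [(title, url), ...]} into an HTML fragment of link lists."""
    parts = ["<h1>Sitemap</h1>"]
    for heading, links in sections.items():
        parts.append(f"<h2>{escape(heading)}</h2>")
        parts.append("<ul>")
        for title, url in links:
            # escape() also escapes quotes, so the URL is safe inside href="..."
            parts.append(f'  <li><a href="{escape(url)}">{escape(title)}</a></li>')
        parts.append("</ul>")
    return "\n".join(parts)


if __name__ == "__main__":
    # Hypothetical structure: main pages plus top-level categories only.
    sections = {
        "Main pages": [("Home", "/"), ("About", "/about")],
        "Categories": [("Widgets", "/widgets"), ("Gadgets", "/gadgets")],
    }
    print(build_html_sitemap(sections))
```

The individual product pages are deliberately left out; they stay reachable through the paginated category pages.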
-
An XML sitemap helps Google crawl the site, whereas HTML sitemaps are usually there to give visitors a better and easier experience of the site.
In my opinion, having an XML sitemap is great because it helps Google crawl and index the site, but an HTML sitemap serves no technical purpose. If you think your visitors need one, then go for it; otherwise, an XML sitemap is enough for a website!
Hope this helps!