Submitting XML Sitemap for large website: how big?
-
Hi there,
I’m currently researching how I can generate an XML sitemap for a large website we run. Based on some of the messages we have been receiving in Webmaster Tools, we think that Google is having problems indexing the URLs. Webmaster Tools also shows a large drop in the total number of indexed pages.
Content on this site can be accessed in two ways. On the home page, the content appears as a list of posts. Users can search for previous posts and can search all the way back to the first posts that were submitted.
Posts are also categorised using tags, and these tags can also currently be crawled by search engines. Users can then click on tags to see articles covering similar subjects. A post could have multiple tags (e.g. SEO, inbound marketing, Technical SEO) and so can be reached in multiple ways by users, creating a large number of URLs to index.
Finally, my questions are:
- How big should a sitemap be? What proportion of the URLs of a website should it cover?
- What are the best tools for creating the sitemaps of large websites?
- How often should a sitemap be updated?
Thanks
-
Thanks Matt, that's really useful
-
Yeah, it's better to have one than not - but I have always aimed to make it as complete as I can. Why? I'm not sure - mostly because I figure Google is GREAT at crawling my main structure - it's those far-reaching pages that I'm hoping they find in the sitemap.
-
Thanks for both your replies - I will check out the tools and recommendations you suggested.
I'm sure I remember reading a recommendation somewhere that it was only necessary to submit the basic site structure in a sitemap. It sounds like this is not the case and that a sitemap should, if possible, be comprehensive.
Would it be better to have a basic sitemap giving the main navigational URLs than having nothing at all?
-
I've created sitemaps of almost 80,000 pages with the paid version of Screaming Frog. That's what I'd use. No point asking what % unless you can't get it all. If you're crawling Microsoft, break it up. Otherwise, organize it if you can (category sitemaps, month by month, something) or just make one big finger to Google type sitemap. lol
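For anyone who wants to see what "break it up" could look like, here is a rough sketch in Python. It assumes you already have a flat list of URLs, and it relies on the sitemaps.org protocol limit of 50,000 URLs per sitemap file. The function names, the file names (`sitemap-1.xml`, `sitemap-index.xml`) and the example.com base URL are all made up for illustration, and this is not how Screaming Frog does it internally.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_urlset(urls):
    """Return the body of one sitemap file listing the given URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n'
        f"{entries}</urlset>\n"
    )


def build_index(sitemap_urls):
    """Return the body of a sitemap index pointing at the sitemap files."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n'
        f"{entries}</sitemapindex>\n"
    )


def build_sitemaps(urls, base_url, size=MAX_URLS_PER_FILE):
    """Return {filename: xml} for the chunked sitemaps plus one index file.

    The file names are hypothetical; use whatever naming fits your site.
    """
    files = {}
    for i, part in enumerate(chunk(urls, size), start=1):
        files[f"sitemap-{i}.xml"] = build_urlset(part)
    index_locs = [f"{base_url}/{name}" for name in files]
    files["sitemap-index.xml"] = build_index(index_locs)
    return files
```

You would then write each entry of the returned dict to disk and submit only the index file. Splitting by category or by month instead of by count would work the same way: build one URL list per group and call `build_urlset` on each.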
-
Hi!
First off, since your content can be accessed in multiple ways, I'd make sure that you're flagging duplicate pages as such for search engines. Easy access to great content is fantastic, but you can devalue your own pages a lot when you're not careful. If you're not using it yet, I recommend implementing the rel="canonical" tag on your website.
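As a small illustration of the canonical idea, here is a sketch in Python of a template helper that always emits the same canonical link for a post, no matter which tag or search URL the visitor came through. The `/posts/<slug>` URL scheme, the example.com domain and the function name are assumptions on my part, since I don't know how your site builds its URLs.

```python
from html import escape


def canonical_link(post_slug, base_url="https://www.example.com"):
    """Return the <link rel="canonical"> element for a post.

    Assumes (hypothetically) that every post has exactly one preferred
    URL of the form <base_url>/posts/<slug>.
    """
    href = f"{base_url}/posts/{post_slug}"
    return f'<link rel="canonical" href="{escape(href, quote=True)}">'
```

The returned element goes in the `<head>` of every URL variant that shows the same post, so all of them point search engines at the one preferred address.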
To answer your questions:
- It should cover all URLs that you want indexed. Ideally, that would be every URL.
- I'm not sure what 'the best' tools would be, but I used http://www.xml-sitemaps.com a lot a few years back. Their sitemaps are free up to 500 URLs. There are payment plans for bigger ones.
- If you only add a new page about once a month, I wouldn't update the XML sitemap for each one. Instead, let the search engines find their own way in that case. Should your entire site structure change, an XML sitemap can be a great way to help search engines understand your new site setup better.
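If you'd rather automate the "only update when the structure really changes" decision, one possible approach is to compare the URL list from the last sitemap with the current one and regenerate only when enough of it has changed. This Python sketch is just one way to do that. The function names and the 20% threshold are arbitrary choices of mine, not a recommendation from Google or any tool.

```python
def changed_fraction(old_urls, new_urls):
    """Return the share of URLs that were added or removed (0.0 to 1.0)."""
    old, new = set(old_urls), set(new_urls)
    if not old and not new:
        return 0.0
    # Symmetric difference = URLs present in only one of the two lists.
    return len(old ^ new) / len(old | new)


def should_regenerate(old_urls, new_urls, threshold=0.2):
    """Return True when the URL set changed by at least `threshold`.

    The 0.2 default is a hypothetical cut-off chosen for illustration.
    """
    return changed_fraction(old_urls, new_urls) >= threshold
```

With this, a single new post on a large site stays well under the threshold, while a restructure that moves most URLs would trigger a fresh sitemap.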
I hope this helps!