Is this a good sitemap hierarchy for a big eCommerce site (50k+ pages)?
-
Hi guys, hope you're all good.
I am currently in the process of designing a new sitemap hierarchy to ensure that every page on the site gets indexed and is accessible via Google. It's important that our sitemap file is well structured, divided and organised into relevant sub-categories to improve indexing.
I just wanted to make sure that it's all good before forwarding it on to the development team for them to consider. At the moment the site has everything thrown into /sitemap.xml/ and it exceeds the 50k URL limit. Here is what I have come up with:
A primary sitemap.xml that references other sitemap files. Each of the following areas will have its own sitemap, which is referenced by /sitemap.xml/. As an example, sitemap.xml will contain 6 links, all of which link to other sitemaps.
- Product pages;
- Blog posts;
- Categories and sub categories;
- Forum posts, pages etc;
- TV specific pages (we have a TV show);
- Other pages.
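For illustration, the index file described above could be generated roughly like this. This is only a minimal Python sketch: the example.com domain and the sitemap-*.xml file names are assumptions made up for the example, not the real site's URLs.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the sitemap protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document (as a string) listing child sitemaps."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="unicode")


# Hypothetical child sitemap names, one per area in the list above.
SECTIONS = ["products", "blog", "categories", "forum", "tv", "other"]
children = [f"https://www.example.com/sitemap-{name}.xml" for name in SECTIONS]

index_xml = build_sitemap_index(children)
print(index_xml)
```

If a single area (product pages, most likely) holds more than 50,000 URLs on its own, `chunk` could be used to split it across several files, each of which would then get its own entry in the index.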
Is this format correct? Once it has been implemented I can then go ahead and submit all 6 separate sitemaps to Webmaster Tools and add a sitemap link to the footer of the site.
All comments are greatly appreciated - if you know of a site which has a good sitemap architecture, please send the link my way!
Brett
-
Have a read of what Google say about them here.
And yes, image search is huge. As for the way it's used, I can't comment on what everyone else does.
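As a rough idea of what an image sitemap involves, here is a minimal Python sketch that lists image URLs under each page using Google's image sitemap extension namespace. The page and image URLs are invented for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Serialize the sitemap namespace as the default and the image one as "image:".
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """Build a <urlset> string from a {page_url: [image_url, ...]} mapping."""
    root = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        entry = ET.SubElement(root, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(entry, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(root, encoding="unicode")


# Hypothetical product page with two images.
pages = {
    "https://www.example.com/products/widget": [
        "https://www.example.com/images/widget-front.jpg",
        "https://www.example.com/images/widget-side.jpg",
    ],
}

image_xml = build_image_sitemap(pages)
print(image_xml)
```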
-Andy
-
Interesting, I haven't ever come across anyone who said that I should put image URLs in a sitemap. Do users really search via Google Images, though? If they do, aren't they just looking to copy an image and/or download it?
I can't see the site generating qualified leads through image based searches.
-
Duplicate content is when two or more URLs show the same content.
I was referring to the fact that sometimes categories, tags or subcategories show the same content. By that, I mean the same posts. Just to clarify, imagine that you have a category, Dogs, and a subcategory, Puppies, and the last 5 articles/posts are assigned to both the category and the subcategory.
The main pages of both (category and subcategory) will then show the same content: the same 5 posts/articles. Did I make myself clear?
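One way to spot this kind of overlap before building the sitemap is to group listing URLs by the posts they display. The following is a minimal Python sketch under the assumption that you can export, for each listing page, the IDs of the posts it shows; the URLs and IDs are hypothetical.

```python
from collections import defaultdict


def find_duplicate_listings(listings):
    """Group URLs whose listing shows exactly the same posts in the same order."""
    by_content = defaultdict(list)
    for url, post_ids in listings.items():
        by_content[tuple(post_ids)].append(url)
    # Only groups with more than one URL are duplicates of each other.
    return [urls for urls in by_content.values() if len(urls) > 1]


# Hypothetical data: the Dogs category and Puppies subcategory list the same 5 posts.
listings = {
    "/category/dogs/": [101, 102, 103, 104, 105],
    "/category/dogs/puppies/": [101, 102, 103, 104, 105],
    "/category/cats/": [201, 202],
}

duplicates = find_duplicate_listings(listings)
print(duplicates)
```

Each group returned is a set of URLs where probably only one should go into the sitemap.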
-
Thanks for getting back to me so quickly Gaston, I appreciate it.
You mentioned duplicate content - what do you mean? If the page has already been indexed, Google will skip/re-crawl the page. Not too sure what you mean by that?
Brett
-
Hi Brett,
Don't forget to add an images sitemap, as Google is pretty hot on those, and make sure you do some good image marketing as well.
But what you suggest is absolutely fine. From the main Sitemap, Google will find all of the others as well.
Just as a note, do make sure you indicate which pages need crawling more often by using the last modified date. This will help Google know which pages it should be recrawling more often.
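To show what the last modified date looks like in practice, here is a minimal Python sketch that writes a `<lastmod>` element for each URL in one of the child sitemaps. The URLs and dates are placeholders; on a real site they would presumably come from the CMS or product database.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(pages):
    """Build a <urlset> string from (url, last_modified_date) pairs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, modified in pages:
        entry = ET.SubElement(root, "url")
        ET.SubElement(entry, "loc").text = url
        # date.isoformat() gives YYYY-MM-DD, a format the sitemap protocol accepts.
        ET.SubElement(entry, "lastmod").text = modified.isoformat()
    return ET.tostring(root, encoding="unicode")


# Hypothetical pages with their last modification dates.
pages = [
    ("https://www.example.com/products/widget", date(2024, 1, 15)),
    ("https://www.example.com/products/gadget", date(2023, 11, 2)),
]

urlset_xml = build_urlset(pages)
print(urlset_xml)
```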
-Andy
-
Hi Brett,
Yep, the hierarchy is OK. Keep in mind that you should only submit for indexing the pages that are of interest to you and that don't generate duplicate content. Just a reminder.
Then, just submit every sitemap to Search Console.
Hope it helps.
GR.