Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How to handle sorting, filtering, and pagination in ecommerce? Canonical is enough?
-
Hello,
after reading various articles and watching several videos I'm still not sure how to handle faceted navigation (sorting/filtering) and pagination on my ecommerce site.
Current indexation status:
- The number of "real" pages (from my sitemap) - 2.000 pages
- Google Search Console (Valid) - 8.000 pages
- Google Search Console (Excluded) - 44.000 pages
Additional info:
- Vast majority of those 50k additional pages (44 + 8 - 2) are pages created by sorting, filtering and pagination.
- Example of how the URL changes while applying filters/sorting:
example.com/category --> example.com/category/1/default/1/pricefrom/100
- Every additional page is canonicalized properly, yet as you can see 6k is still indexed.
- When I enter site:example.com/category in Google it returns at least several results (in most of the cases the main page is on the 1st position).
- In Google Analytics I can see than ~1.5% of Google traffic comes to the sorted/filtered pages.
- The number of pages indexed daily (from GSC stats) - 3.000
And so I have a few questions:
- Is it ok to have those additional pages indexed or will the "real" pages rank higher if those additional would not be indexed?
- If it's better not to have them indexed should I add "noindex" to sorting/filtering links or add eg. Disallow: /default/ in robots.txt?
- Or perhaps add "noindex, nofollow" to the links? Google would have then 50k pages less to crawl but perhaps it'd somehow impact my rankings in a negative way?
- As sorting/filtering is not based on URL parameters I can't add it in GSC. Is there another way of doing that for this filtering/sorting url structure?
Thanks in advance,
Andrew
-
Canonical reference links are the preferred technique for this.
If you do nothing, very likely the search engines will decide for you which variations of your pages to index, and the selection may not be ideal. If an index page can be filtered many different ways, the unfiltered version should be referenced as the canonical on each, and a self-referencing canonical link should also be specified on the unfiltered version.
You don't really yet want to disallow the crawling of the refinement paths, because without canonicals implemented, you might very well do more harm than good, finding important pages getting de-indexed. If at some point in the future you find that all the URLs from the refinement paths have been disappeared from the index, and your desired pages are all indexed properly, then at that future date you might want to disallow crawling of the refinement paths (in your robots.txt file). But, not yet, IMO.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Canonical Chain
This is quite advanced so maybe Rand can give me an answer? I often have seen questions surrounding a 301 chain where only 85% of the link juice is passed on to the first target and 85% of that to the next one, up to three targets. But how about a canonical chain? What do I mean by this:? I have a client who sells lighting so I will use a real example (sans domain) I don't want 'new-product' pages appearing in SERPS. They dilute link equity for the categories they replicate and often contain identical products to the main categories and subcategories. I don't want to no index them all together I'd rather tell Google they are the same as the higher category/sub category. (discussion whether a noindex/follow tag would be better?) If I canonicalize new-products/ceiling-lights-c1/kitchen-lighting-c17/kitchen-ceiling-lights-c217 to /ceiling-lights-c1/kitchen-lighting-c17/kitchen-ceiling-lights-c217 I then subsequently discover that everything in kitchen-ceiling-lights-c217 is already in /kitchen-lighting-c17 and I decide to canonicalize those two - so I place a /kitchen-lighting-c17 canonical on /kitchen-ceiling-lights-c217. Then what happens to the new-products canonical? Is it the same rule - does it pass 85% of link equity back to the non new-product URL and 85% of that back to the category? does it just not work? or should I do noindexi/follow Now before you jump in: Let's assume these are done over a period of time because the obvious answer is: Canonicalize both back to /ceiling-lights-c1/kitchen-lighting-c17 I know that and that is not what I am asking. What if they are done in a sequence what is the real result? I don't want to patronise anyone but please read this carefully before giving an answer. Regards Nigel Carousel Projects.
Intermediate & Advanced SEO | Sep 27, 2017, 5:07 AM | Nigel_Carr0 -
Google Ignoring Canonical Tag for Hundreds of Sites
Bazaar Voice provides a pretty easy-to-use product review solution for websites (especially sites on Magento): https://www.magentocommerce.com/magento-connect/bazaarvoice-conversations-1.html If your product has over a certain number of reviews/questions, the plugin cuts off the number of reviews/questions that appear on the page. To see the reviews/questions that are cut off, you have to click the plugin's next or back function. The next/back buttons' URLs have a parameter of "bvstate....." I have noticed Google is indexing this "bvstate..." URL for hundreds of sites, even with the proper rel canonical tag in place. Here is an example with Microsoft: http://webcache.googleusercontent.com/search?q=cache:zcxT7MRHHREJ:www.microsoftstore.com/store/msusa/en_US/pdp/Surface-Book/productID.325716000%3Fbvstate%3Dpg:8/ct:r+&cd=2&hl=en&ct=clnk&gl=us My website is seeing hundreds of these "bvstate" urls being indexed even though we have a proper rel canonical tag in place. It seems that Google is ignoring the canonical tag. In Webmaster Console, the main source of my duplicate titles/metas in the HTML improvements section is the "bvstate" URLs. I don't necessarily want to block "bvstate" in the robots.txt as it will prohibit Google from seeing the reviews that were cutoff. Same response for prohibiting Google from crawling "bvstate" in Paramters section of Webmaster Console. Should I just keep my fingers crossed that Google honors the rel canonical tag? Home Depot is another site that has this same issue: http://webcache.googleusercontent.com/search?q=cache:k0MBLFcu2PoJ:www.homedepot.com/p/DUROCK-Next-Gen-1-2-in-x-3-ft-x-5-ft-Cement-Board-172965/202263276%23!bvstate%3Dct:r/pg:2/st:p/id:202263276+&cd=1&hl=en&ct=clnk&gl=us
Intermediate & Advanced SEO | Apr 12, 2016, 7:59 AM | redgatst1 -
Sitemap generator which only includes canonical urls
Does anyone know of a 3rd party sitemap generator that will only include the canonical url's? Creating a sitemap with geo and sorting based parameters isn't the most ideal way to generate sitemaps. Please let me know if anyone has any ideas. Mind you we have hundreds of thousands of indexed url's and this can't be done with a simple text editor.
Intermediate & Advanced SEO | Dec 2, 2015, 6:58 PM | recbrands0 -
Partial duplicate content and canonical tags
Hi - I am rebuilding a consumer website, and each product page will contain a unique product image, and a sentence or two about the product (and we tend to use a lot of the same words in different ways across products). I'd like to have a tabbed area below the product info that talks about the overall product line, and this content would be duplicate across all the product pages (a "Why use our products" type of thing). I'd have this duplicate content also living on its own URL's so they can be found alone in the SERP's. Question is, do I need to add the canonical tag to this page, since there's partial duplicate content on the product pages? And if I did that, would my product pages go un-indexed?? I understand how to handle completely duplicated content, it's the partial duplicate that I'm having difficulty figuring out.
Intermediate & Advanced SEO | Jan 7, 2014, 6:18 PM | Jenny10 -
Canonical tag - but Title and Description are slightly different
I am building a new SEO site with a "Silo" / Themed architecture. I have a travel website selling hotel reservations. I list a hotel page under a city page - example, www.abc.com/Dallas/Hilton.html Then I use that same property under a segment within the city - example www.abc.com/Dallas/Downtown/Hilton.html, so there are two URLs with the same content Both pages are identical, except I want to customize the Title and Description. I want to customize the title and description to build a consistent theme - for example the /Downtown/Hilton page will have the words "Near Downtown" in the Title and Description, while the primary city Hilton page will not. So I have two questions about this. First, is it okay to use a canonical tag if the Title and Description are slightly different? Everything else is identical. If so, will Google crawl and comprehend the unique Title and Description on the "Downtown" silo? I want Google to see that I have several "supporting" pages to my main landing page(s). I want to present to Google 5 supporting pages in each silo that each has a supporting keyword theme. But I'm not sure if Google will consider content of pages that point to a different page using the canonical tag. Please see this supporting example: http://d.pr/i/aQPv Thanks for your insights. Rob
Intermediate & Advanced SEO | Dec 19, 2013, 9:57 PM | partnerf0 -
Dynamic pages - ecommerce product pages
Hi guys, Before I dive into my question, let me give you some background.. I manage an ecommerce site and we're got thousands of product pages. The pages contain dynamic blocks and information in these blocks are fed by another system. So in a nutshell, our product team enters the data in a software and boom, the information is generated in these page blocks. But that's not all, these pages then redirect to a duplicate version with a custom URL. This is cached and this is what the end user sees. This was done to speed up load, rather than the system generate a dynamic page on the fly, the cache page is loaded and the user sees it super fast. Another benefit happened as well, after going live with the cached pages, they started getting indexed and ranking in Google. The problem is that, the redirect to the duplicate cached page isn't a permanent one, it's a meta refresh, a 302 that happens in a second. So yeah, I've got 302s kicking about. The development team can set up 301 but then there won't be any caching, pages will just load dynamically. Google records pages that are cached but does it cache a dynamic page though? Without a cached page, I'm wondering if I would drop in traffic. The view source might just show a list of dynamic blocks, no content! How would you tackle this? I've already setup canonical tags on the cached pages but removing cache.. Thanks
Intermediate & Advanced SEO | Nov 9, 2012, 5:30 PM | Bio-RadAbs0 -
Best way to merge 2 ecommerce sites
Our Client owns two ecommerce websites. Website A sells 20 related brands. Website has improving search rank, but not normally on the second to fourth page of google. Website B was purchased from a competitor. It has 1 brand (also sold on site A). Search results are normally high on the first page of google. Client wants to consider merging the two sites. We are looking at options. Option 1: Do nothing, site B dominates it’s brand, but this will not do anything to boost site A. Option 2: keep both sites running, but put lots of canonical tags on site B pointing to site A Option 3: close down site B and make a lot of 301 redirects to site A Option 4: ??? Any thoughts on this would be great. We want to do this in a way that boosts site A as much as possible without losing sales on the one brand that site B sells.
Intermediate & Advanced SEO | Nov 12, 2011, 5:32 AM | EugeneF0 -
Canonical & noindex? Use together
For duplicate pages created by the "print" function, seomoz says its better to use noindex (http://www.seomoz.org/blog/complete-guide-to-rel-canonical-how-to-and-why-not) and JohnMu says its better to use canonical http://www.google.com/support/forum/p/Webmasters/thread?tid=6c18b666a552585d&hl=en What do you think?
Intermediate & Advanced SEO | Apr 9, 2013, 3:55 PM | nicole.healthline1