How to handle sorting, filtering, and pagination in ecommerce? Is a canonical enough?
-
Hello,
after reading various articles and watching several videos I'm still not sure how to handle faceted navigation (sorting/filtering) and pagination on my ecommerce site.
Current indexation status:
- The number of "real" pages (from my sitemap) - 2.000 pages
- Google Search Console (Valid) - 8.000 pages
- Google Search Console (Excluded) - 44.000 pages
Additional info:
- The vast majority of those 50k additional pages (44 + 8 - 2) are pages created by sorting, filtering and pagination.
- Example of how the URL changes while applying filters/sorting:
example.com/category --> example.com/category/1/default/1/pricefrom/100
- Every additional page is canonicalized properly, yet as you can see 6k are still indexed.
- When I enter site:example.com/category in Google it returns at least several results (in most cases the main page is in the 1st position).
- In Google Analytics I can see that ~1.5% of Google traffic comes to the sorted/filtered pages.
- The number of pages indexed daily (from GSC stats) - 3.000
And so I have a few questions:
- Is it OK to have those additional pages indexed, or will the "real" pages rank higher if the additional ones are not indexed?
- If it's better not to have them indexed, should I add "noindex" to sorting/filtering links or add e.g. Disallow: /default/ in robots.txt?
- Or perhaps add "noindex, nofollow" to the links? Google would then have 50k fewer pages to crawl, but perhaps it would somehow impact my rankings in a negative way?
- As sorting/filtering is not based on URL parameters, I can't add it in GSC. Is there another way of doing that for this filtering/sorting URL structure?
Thanks in advance,
Andrew
-
Canonical reference links are the preferred technique for this.
If you do nothing, very likely the search engines will decide for you which variations of your pages to index, and the selection may not be ideal. If an index page can be filtered many different ways, the unfiltered version should be referenced as the canonical on each, and a self-referencing canonical link should also be specified on the unfiltered version.
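To make that concrete, here is a minimal Python sketch of how each refinement URL could be mapped to its canonical. It assumes (this is my guess from the example URL in the question, not something confirmed) that the first path segment is the category and every later segment is a sorting, filtering, or pagination refinement. The function names are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the unfiltered category URL for a refined URL.

    Assumption: the first path segment is the category; every later
    segment (e.g. /1/default/1/pricefrom/100) is a refinement.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    path = "/" + segments[0] if segments else "/"
    # Drop query string and fragment as well as the refinement segments.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for the page at `url`."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


filtered = "https://example.com/category/1/default/1/pricefrom/100"
unfiltered = "https://example.com/category"

print(canonical_link_tag(filtered))    # points at the unfiltered page
print(canonical_link_tag(unfiltered))  # self-referencing canonical
```

Both calls produce the same tag, which is the point: every filtered variation and the unfiltered page itself all name the unfiltered URL as canonical.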
You don't really want to disallow crawling of the refinement paths yet, because without canonicals implemented you might very well do more harm than good and find important pages getting de-indexed. If at some point in the future you find that all the URLs from the refinement paths have disappeared from the index, and your desired pages are all indexed properly, then you might want to disallow crawling of the refinement paths (in your robots.txt file). But not yet, IMO.
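If you do get to that point, it is worth testing a draft rule before deploying it. robots.txt rules match from the start of the URL path, so the Disallow: /default/ line from the question would not cover a URL where "default" sits in the middle of the path. A rough check with Python's standard urllib.robotparser is sketched below. Note that this parser does plain prefix matching and does not implement the wildcard extensions some crawlers support, so treat it as a sanity check only. The helper name and the draft rules are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Check a URL against robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical draft using the rule proposed in the question.
DRAFT = """\
User-agent: *
Disallow: /default/
"""

filtered = "https://example.com/category/1/default/1/pricefrom/100"

# The rule only matches paths that begin with /default/, so the
# filtered category URL is still reported as crawlable.
print(is_crawlable(DRAFT, filtered))
print(is_crawlable(DRAFT, "https://example.com/default/1"))
```

The first check comes back as crawlable and the second as blocked, so a rule like this would need to be rewritten around the real path prefix before it did anything useful for this URL structure.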