Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
What is best practice for "Sorting" URLs to prevent indexing and for best link juice?
-
We are now introducing 5 links on all our category pages for the different sorting options of the category listings.
The site has about 100,000 pages, and with this change the number of URLs may go up to over 350,000.
Until now Google has been indexing our site well, but I would like to prevent the sorting URLs from leading to less complete crawling of our core pages, especially since we are planning a further huge expansion of pages soon. Apart from blocking the parameter in Search Console (which did not really work well for me in the past to prevent indexing), what do you suggest to minimize indexing of these URLs, also taking link juice optimization into consideration?
On a technical level, the sorting is implemented so that the whole page is reloaded; there may be better options for that as well.
-
With canonicals, I would not worry about the incoming pages. If the new content is useful and relevant, plus linked to internally, they should do fine in terms of indexation. Use the canonical for now, and about a month after you launch the new pages, if there are key pages not getting indexed, you can reassess. The canonical is the right thing to do in this case.
As for link equity, you are right, that is a simplistic view of it. It is actually much more intricate than that, but it is a good basic understanding. However, the canonical is not going to hurt your internal link equity. The links to the different sorting options are navigational in nature, and the structure will be repeated throughout the site. Google's algorithm is good at telling editorial internal links apart from navigational ones, and navigational links don't affect a page's strength nearly as much as an editorial link does.
My personal belief is that you are worrying about something that isn't going to make an impact on your organic traffic. Ensure the correct canonicals are in place and launch the new content. If that new content has the same issue with sorting, use canonicals there as well and let Google figure it out. "They" have gotten pretty good at identifying what to keep and what not.
If you don't want the sorting pages in there at all, you'll need to do one of the following:
- Noindex, disallow in robots.txt - Rhea Drysdale showed me a few years back that you can do a disallow and a noindex in robots.txt. If you do both, Google is told to noindex the URLs and also cannot crawl the content. (Google stopped supporting noindex in robots.txt in September 2019, so this option no longer works as described.)
- Noindex, nofollow using meta robots - This would stop all link equity flow from these pages. If you want to attempt to stop flow to these pages, you'll need to nofollow any links to them. The pages can still be crawled however.
- Noindex, follow - Same as above but internal link equity would still flow. Again, if you want to attempt to cut off link equity to these sorting pages, any links to them would need to be nofollowed.
- Disallow in robots.txt - This would stop Google from crawling the content, but the URLs could technically still be indexed.
Personally, I believe trying to manage link equity using nofollow is a waste of time. You more than likely have other things that could be making larger impacts. The choice is yours, however, and I always recommend testing anything to see if it makes an impact.
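For anyone who does want to try the "noindex, follow" option from the list above, here is a minimal Python sketch of how a template could pick the meta robots value per URL. The parameter names in `SORT_PARAMS` and the helper names `robots_directive` and `robots_meta_tag` are assumptions made up for illustration; the thread does not say how the sorting parameter is actually named.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical names of the sorting parameters; replace with the real ones.
SORT_PARAMS = {"sort", "order"}


def robots_directive(url: str) -> str:
    """Return a meta robots value: sorted variants are noindex, all else is indexable."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if any(name.lower() in SORT_PARAMS for name in query):
        # Keep the page out of the index but still let links on it be followed.
        return "noindex, follow"
    return "index, follow"


def robots_meta_tag(url: str) -> str:
    """Build the tag a page template would place in <head>."""
    return f'<meta name="robots" content="{robots_directive(url)}" />'


print(robots_meta_tag("https://www.example.com/shoes?sort=price_asc"))
print(robots_meta_tag("https://www.example.com/shoes"))
```

Keep in mind that a crawler has to fetch the page to see this tag, so it only has an effect on URLs that are not blocked from crawling.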
-
Kate, the domain has 100,000 pages and will scale to over 1 million unique pages during the next couple of months. I do not want the sorting URLs to have any negative effect on the indexing of the 900,000 new unique pages in the coming months.
Regarding link equity: my simplified understanding is that if a page has 10 links, then each link carries 10% of the total link juice of the page. If 5 of those 10 links now point to canonicalized versions of the same page (the sorting URLs), I may be losing out on 50% of the potential link juice the page carries. This is my concern. Therefore I am wondering whether I should rather try to hide these sorting URLs from Google (as Rand also recommended for faceted navigation pages that one does not consider important to have indexed).
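The arithmetic behind that concern can be written out in a few lines of Python. This only illustrates the simplified "equal share per link" model described above, not how Google actually weights links, and the function name `share_to_sorting_links` is made up for the example.

```python
def share_to_sorting_links(total_links: int, sorting_links: int) -> float:
    """Fraction of a page's link equity that goes to sorting links,
    assuming every link on the page gets an equal share (simplified model)."""
    if total_links <= 0 or not 0 <= sorting_links <= total_links:
        raise ValueError("sorting_links must be between 0 and total_links")
    return sorting_links / total_links


# 10 links on the page, 5 of them sorting links -> half of the page's equity.
print(share_to_sorting_links(10, 5))  # 0.5
```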
-
Is your issue with crawling or indexing? Those are two separate issues. Why don't you want Google having the canonicals in the index? If you can give me some more insight, I can try to recommend the best option.
And I'm not following your last question. Can you try to ask it another way?
-
Hi Kate, thanks a lot. Yes, the canonical is something we should definitely do, and we have implemented it.
Still, in the past I had the experience that Google also indexed lots of canonicalized URLs with near-identical content. Is there any additional step I could take to further minimize indexing of these URLs?
Wouldn't the basically "self-referencing" sorting links (pointing to canonicalized versions of the same page) then be lost for link equity?
-
This one would need a canonical. For one category page with 5 different sort options, you'd need one canonical URL (one without any sorting or the default sorting) and point all others to that URL using a canonical tag.
https://support.google.com/webmasters/answer/139066?hl=en
Would that work for your setup? If I understand your situation correctly, this should work. It consolidates link equity and allows Google to choose what needs to be indexed and served.
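As a rough sketch of that setup in Python, the canonical URL can be derived by stripping the sorting parameter from the requested URL and emitting it in a canonical link tag. The parameter names in `SORT_PARAMS`, the helper names, and the example.com URLs are all assumptions for illustration, since the thread does not show the real URL structure.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical names of the sorting parameters; replace with the real ones.
SORT_PARAMS = {"sort", "order"}


def canonical_url(url: str) -> str:
    """Return the URL without sorting parameters (and without a fragment)."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SORT_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the tag every sorted variant would carry in its <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}" />'


print(canonical_link_tag("https://www.example.com/shoes?sort=price_asc"))
# <link rel="canonical" href="https://www.example.com/shoes" />
```

All five sorted variants of a category page would then output the same tag as the unsorted page, which is the consolidation described above. Other parameters, such as a page number, are left untouched in this sketch.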