What is best practice for "Sorting" URLs to prevent indexing and for best link juice ?
-
We are now introducing 5 links in all our category pages for different sorting options of category listings.
The site has about 100.000 pages and with this change the number of URLs may go up to over 350.000 pages.
Until now google is indexing well our site but I would like to prevent the "sorting URLS" leading to less complete crawling of our core pages, especially since we are planning further huge expansion of pages soon.Apart from blocking the paramter in the search console (which did not really work well for me in the past to prevent indexing) what do you suggest to minimize indexing of these URLs also taking into consideration link juice optimization?
On a technical level the sorting is implemented in a way that the whole page is reloaded, for which may be better options as well.
-
With canonicals, I would not worry about the incoming pages. If the new content is useful and relevant, plus linked to internally, they should do fine in terms of indexation. Use the canonical for now, and once you launch the new pages, well a month after launch, if there are key pages not getting indexed, then you can reassess. The canonical is the right thing to do in this case.
As for link equity, you are right, that is a simplistic view of it. It is actually much more intricate than that, but that's a good basic understanding. However, the canonical is not going to hurt your internal link equity. Those links to the different sorting are navigational in nature and the structure will be repeated throughout the site. Google's algo is good at determining internal, editorial links versus those that are navigational in nature. The navigational links don't impact the strength nearly as much as an editorial link.
My personal belief is that you are worrying about something that isn't going to make an impact on your organic traffic. Ensure the correct canonicals are in place and launch the new content. If that new content has the same issue with sorting, use canonicals there as well and let Google figure it out. "They" have gotten pretty good at identifying what to keep and what not.
If you don't want the sorting pages in there at all, you'll need to do one of the following:
- Noindex, disallow in robots.txt - Rhea Drysdale showed me a few years back that you can do a disallow and noindex in robots. If you do both, Google gets the command to not only noindex the URLs, but also cannot crawl the content.
- Noindex, nofollow using meta robots - This would stop all link equity flow from these pages. If you want to attempt to stop flow to these pages, you'll need to nofollow any links to them. The pages can still be crawled however.
- Noindex, follow - Same as above but internal link equity would still flow. Again, if you want to attempt to cut off link equity to these sorting pages, any links to them would need to be nofollowed.
- Disallow in robots - This would stop them from crawling the content, but the URLs could technically still be indexed.
Personally, I believe trying to manage link equity using nofollow is a waste of time. You more than likely have other things that could be making larger impacts. The choice is yours however and I always recommend testing anything to see if it makes an impact.
-
Kate. The domain has 100.000 pages and will scale to over 1 million unique pages during the next couple of months. I do not want the Sorting URLs have any negative effect on the new indexing of the new 900.000 unique pages in the next months.
Regarding link equity. My simplified understanding of link equity is that if a page has 10 links then each link carries 10% of the total link juice of the page. If now 5 of the 10 links do link to a canonical version of the same page (=sorting URLs), I may be losing out on 50% of the potential link juice the page carries. This is my concern. Therefore my doubt is if I should rather try to hide these sorting URLs from google (same as was also recommended by Rand for facetted navigation pages that one does not consider important for being indexed).
-
Is your issue with crawling or indexing? Those are two separate issues. Why don't you want Google having the canonicals in the index? If you can give me some more insight, I can try to recommend the best option.
And I'm not following your last question. Can you try to ask it another way?
-
Hi Kate, thanks lot. Yes canonical is something we should definetly do and we have implemented.
Still I had the experience in the past that google also indexed lots of canonicalized URLs with near identical content. Any additional step I could do to minimize indexing of these URLs further?
Wouldn't then the basically "self referencing" URLS of sorting links (going to canonicalized versions of the same page) be lost for link equity?
-
This one would need a canonical. For one category page with 5 different sort options, you'd need one canonical URL (one without any sorting or the default sorting) and point all others to that URL using a canonical tag.
https://support.google.com/webmasters/answer/139066?hl=en
Would that work for your setup? If I understand your situation correctly, this should work. It consolidates link equity and allows Google to choose what needs to be indexed and served.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Internal Links - Different URLs
Hey so, In my product page, I have recommended products at the bottom. The issue is that those recommended products have long parameters such as sitename.com/product-xy-z/https%3A%2F%2Fwww.google.co&srcType=dp_recs The reason why it has that long parameter is due to tracking purposes (internally with the dev and UX team). My question is, should I replace it with the clean URL or as long as it has the canonical tag, it should be okay to have such a long parameter? I would think clean URL would help with internal links and what not...but if it already has a canonical tag would it help? Another issue is that the URL is different and not just the parameter. For instance..the canonical URL is sitename.com/productname-xyz/ and so the internal link used on the product page (same exact page just different URL with parameter) sitename.com/xyz/https%3A%2F%2Fwww.google.co&srcType=dp_recs (missing product name), BUT still has the canonical tag!
Intermediate & Advanced SEO | | ggpaul5620 -
"No Index, No Follow" or No Index, Follow" for URLs with Thin Content?
Greetings MOZ community: If I have a site with about 200 thin content pages that I want Google to remove from their index, should I set them to "No Index, No Follow" or to "No Index, Follow"? My SEO firm has advised me to set them to "No Index, Follow" but on a recent MOZ help forum post someone suggested "No Index, No Follow". The MOZ poster said that telling Google the content was should not be indexed but the links should be followed was inconstant and could get me into trouble. This make a lot of sense. What is proper form? As background, I think I have recently been hit with a Panda 4.0 penalty for thin content. I have several hundred URLs with less than 50 words and want them de-indexed. My site is a commercial real estate site and the listings apparently have too little content. Thanks, Alan
Intermediate & Advanced SEO | | Kingalan10 -
How to structure links on a "Card" for maximum crawler-friendliness
My question is how to best structure the links on a "Card" while maintaining usability for touchscreens. I've attached a simple wireframe, but the "card" is a format you see a lot now on the web: it's about a "topic" and contains an image for the topic and some text. When you click the card it links to a page about the "topic". My question is how to best structure the card's html so google can most easily read it. I have two options: a) Make the elements of the card 2 separate links, one for the image and one for the text. Google would read this as follows. //image
Intermediate & Advanced SEO | | jcgoodrich
[](/target URL) //text
<a href=' target="" url'="">Topic</a href='> b) Make the entire "Card" a link which would cause Google to read it as follows: <a></a> <a>Bunch of div elements that includes anchor text and alt-image attributes above along with a fair amount of additional text.</a> <a></a> Holding UX aside, which of these options is better purely from a Google crawling perspective? Does doing (b) confuse the bot about what the target page is about? If one is clearly better, is it a dramatic difference? Thanks! PwcPRZK0 -
Will Canonical tag on parameter URLs remove those URL's from Index, and preserve link juice?
My website has 43,000 pages indexed by Google. Almost all of these pages are URLs that have parameters in them, creating duplicate content. I have external links pointing to those URLs that have parameters in them. If I add the canonical tag to these parameter URLs, will that remove those pages from the Google index, or do I need to do something more to remove those pages from the index? Ex: www.website.com/boats/show/tuna-fishing/?TID=shkfsvdi_dc%ficol (has link pointing here)
Intermediate & Advanced SEO | | partnerf
www.website.com/boats/show/tuna-fishing/ (canonical URL) Thanks for your help. Rob0 -
Could a HTML <select>with large numbers of <option value="<url>">'s affect my organic rankings</option></select>
Hi there, I'm currently redesigning my website, and one particular pages lists hotels in New York. Some functionality I'm thinking of adding in is to let the user find hotels close to specific concert venues in New York. My current thinking is to provide the following select element on the page - selecting any one of the options will automatically redirect to my page for that concert venue. The purpose of this isn't to affect the organic traffic - I'm simply introducing this as a tool to help customers find the right hotel, but I certainly don't want it to have an adverse effect on my organic traffic. I'd love to know your thoughts on this. I must add that in certain cities, such as New York, there could be up to 450 different options in this select element. | <select onchange="location=options[selectedIndex].value;"> <option value="">Show convenient hotels for:</option> <option value="http://url1..">1492 New York</option> <option value="http://url2..">Abrons Arts Center</option> <option value="http://url3..">Ace of Clubs New York</option> <option value="http://url4..">Affairs Afloat</option> <option value="http://url5..">Affirmation Arts New York</option> <option value="http://url6..">Al Hirschfeld Theatre</option> <option value="http://url7..">Alice Tully Hall</option> .. .. ..</select> Many thanks Mike |
Intermediate & Advanced SEO | | mjk260 -
Google+ Personal Page pass link juice?
I noticed recently that a clients google plus business page (Set up as a personal page) has a followed link pointing to their site. They have many links on the web pointing to the google+ page, however that page is an https page. So the question is, would a google+ page that is https still pass authority and link juice to the site linked in the about us tab?
Intermediate & Advanced SEO | | iAnalyst.com0 -
What's the best SEO practice for having dynamic content on the same URL?
Let's use this example... www.miniclip.com and there's a function to log in... If you're logged in and a cookie checks that you're logged in and you're on page, let's say, www.miniclip.com/racing-games however the banners being displayed would have more call to action and offers on the page when a user is not logged in to entice them to sign up but the URL would still be www.miniclip.com/racing-games if and if not logged in, what would be the best URL practice for this? just do it?
Intermediate & Advanced SEO | | AdiRste0 -
Domain Name Change - Best Practices?
Good day guys, We got a restaurant that is changing its name and domain. However they are keeping the same server location, same content and same pages (we are just changing the logo on the website). It just has to go a new domain. We don't want to lose the value of the current site, and we want to avoid any duplicate penalties. Could you please advise of the best practices of doing a domain name change? Thank you.
Intermediate & Advanced SEO | | Michael-Goode0