Any SEO-wizards out there who can tell me why Google isn't following the canonicals on some pages?
-
Hi,
I am banging my head against the wall regarding the website of a customer: under "duplicate title tags" in GSC I can see that Google is indexing a whole bunch of parameterized versions of many of the URLs on the site. When I check the rel=canonical tag, everything seems correct. My customer is the biggest sports retailer in Norway. Their webshop has approximately 20,000 products, yet they have more than 400,000 pages indexed by Google.
So why is Google indexing pages like this? What is missing in this canonical?
https://www.gsport.no/herre/klaer/bukse-shorts?type-bukser-334=regnbukser&order=price&dir=desc
Why isn't Google just cutting off the ?type-bukser-334=regnbukser&order=price&dir=desc part of the URL?
Can it be the canonical tag itself, or could the problem be somewhere in the CMS?
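For illustration, the URL the canonical is expected to point to is simply the filtered URL with its query string removed. A minimal Python sketch of that expectation follows; the helper name `strip_query` is hypothetical and only shows the intended mapping, not how Google or the CMS actually computes it:

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL without its query string or fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


filtered = (
    "https://www.gsport.no/herre/klaer/bukse-shorts"
    "?type-bukser-334=regnbukser&order=price&dir=desc"
)
print(strip_query(filtered))
# prints https://www.gsport.no/herre/klaer/bukse-shorts
```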
Looking forward to your answers
- Sigurd
-
Thank you all! I have forwarded this to the owner of the page, so now we'll just sit back and see the effects
-
Hi Inevo,
David's and Jake's comments and recommendations are spot on. You need to update your robots.txt file. Jake is correct when he said "just because a canonical tag is in place, that doesn't prevent Google from crawling and indexing the page."
Sincerely,
Dana
-
Hi Inevo,
Canonical tags are being used correctly and it doesn't actually look like any of the URLs with query strings are indexed in Google.
I'm going to go off the topic of canonicals now, but still related to the crawl and index of the site:
Has the site changed CMS in the last year or two? It's possible that some of the 400k URLs indexed are old or were not canonicalized properly at some point in time, so they were indexed.
The problem with how the site is currently set up is that it is basically impossible for search engines to crawl because of the product filter. I wrote an article about this a while ago (link), specifically to do with product filters in Magento. Product filters can turn your site into a 'black hole' for search engines, which is definitely happening in this case (try crawling it with Screaming Frog).
I'd recommend blocking product filter URLs from being crawled so that search engines are only crawling important pages on the site.
You should be able to fix this by adding these 3 lines to your robots.txt:
Disallow: *?
Disallow: *+
Allow: *?p=
(Note: please check that you don't need to add more parameters to Allow)
These changes will make crawling your site much more efficient - from millions of crawlable URLs, to probably 30-35k.
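Before deploying rules like these, it can help to check a few sample paths against them. The Python sketch below is a simplified model of wildcard matching as Google documents it (`*` matches any run of characters, the longest matching rule wins, and Allow wins a tie). It is an assumption-laden approximation, not Google's actual parser, and it assumes the three rules sit under a `User-agent` group that applies to the crawler. The function names are made up for illustration:

```python
import re

# The three rules from the post, in order.
RULES = [
    ("disallow", "*?"),
    ("disallow", "*+"),
    ("allow", "*?p="),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into an anchored regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece and join the pieces with ".*" for "*".
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path: str) -> bool:
    """Apply longest-match-wins; a path with no matching rule is allowed."""
    best_kind, best_pattern = "allow", ""
    for kind, pattern in RULES:
        if not pattern_to_regex(pattern).match(path):
            continue
        longer = len(pattern) > len(best_pattern)
        tie_goes_to_allow = len(pattern) == len(best_pattern) and kind == "allow"
        if longer or tie_goes_to_allow:
            best_kind, best_pattern = kind, pattern
    return best_kind == "allow"


for path in [
    "/herre/klaer/bukse-shorts",
    "/herre/klaer/bukse-shorts?p=2",
    "/herre/klaer/bukse-shorts?type-bukser-334=regnbukser&order=price&dir=desc",
    "/herre/klaer/bukse-shorts?order=price&p=2",
]:
    print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

Under this model the last sample path is blocked, because `*?p=` only matches when `p` is the first parameter after the `?`. That is one reason to double-check which parameters need an Allow rule.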
Let me know how this goes for you
Cheers,
David
-
I would definitely check to make sure the canonical tag is being used properly. Make sure it is an absolute URL rather than a relative URL.
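A quick way to run that check is to pull the canonical `href` out of the page source and test whether it has a scheme and host. The following is a minimal Python sketch using only the standard library; the sample HTML is made up for illustration and is not copied from the site:

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html: str) -> list:
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


def is_absolute(url: str) -> bool:
    """True when the URL carries both an http(s) scheme and a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical page source, for illustration only.
sample_html = """
<html><head>
<link rel="canonical" href="https://www.gsport.no/herre/klaer/bukse-shorts" />
</head><body></body></html>
"""

for href in find_canonicals(sample_html):
    print(href, "absolute" if is_absolute(href) else "relative")
```

A page that returns more than one canonical, or none at all, is also worth a closer look.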
That being said, please note that just because a canonical tag is in place, that doesn't prevent Google from crawling and indexing the page, and including the page in search results with the site:domain command. If you see the canonicalized URLs outranking their canonical, then you can start to question why Google isn't honoring the canonical.
Please note that canonical tags are a recommendation and not a directive, meaning Google doesn't have to honor them if it does not consider the page to be truly canonical.
-Jake