Pages excluded from Google's index due to "different canonicalization than user"
-
Hi MOZ community,
A few weeks ago we noticed a complete collapse in traffic on some of our pages (7 out of around 150 blog posts). We were able to confirm that those pages disappeared for good from Google's index at the end of January '18, although they could still be found via all other major search engines.
Using Google's Search Console (previously Webmaster Tools), we found the unindexed URLs in the list of pages excluded because "Google chose different canonical than user". Content-wise, the page that Google wrongly selects as canonical has little to no similarity to the pages it thereby excludes from the index.
About our setup:
We are a SPA, delivering our pages pre-rendered, each with an (empty) rel=canonical tag in the HTML head that's then dynamically filled with a self-referential link to the page's own URL via JavaScript. This seemed and seems to work fine for 99% of our pages but happens to fail for one of our top-performing ones (which is why the hassle).
What we tried so far:
- going through every step of this handy guide: https://moz.com/blog/panic-stations-how-to-handle-an-important-page-disappearing-from-google-case-study --> inconclusive (healthy pages, no penalties etc.)
- manually requesting re-indexation via Search Console --> immediately brought back some pages; others briefly reappeared in the index, then were dropped again for the aforementioned reason
- checking other search engines --> pages are only gone from Google, can still be found via Bing, DuckDuckGo and other search engines
Questions to you:
- How does Googlebot handle JavaScript, and does anybody know if its setup changed in that respect around the end of January?
- Can you think of any other reason for the behavior described above?
Eternally thankful for any help!
-
Hi SvenRi, that's an interesting one! The message you're getting from Google suggests that, rather than not finding the canonical tag, the system has reason to believe that the canonical is not representative of the best content.
One thing I'd bear in mind is that Google doesn't take canonical tags as gospel, but rather as guidance, so it can just ignore them without there necessarily being a problem in how you've implemented that tag. Another is that while Google says that their crawlers can parse JavaScript, there's evidence that they don't always parse the page content perfectly.
What happens when you fetch and render the pages in question using Search Console (both the page you want to rank and the page Google is selecting)? Can you see all of the content? Google uses the same JavaScript rendering as Chrome 41; have you tried accessing the pages with that? You could also try a tool like Screaming Frog with JavaScript rendering switched on to see what kind of page content comes back. It could be worth making sure the canonical is generated properly, but I'd also check that the page content is being rendered properly, to make sure Google is seeing the pages as being as different as you describe. I'd also check that there isn't a second, conflicting canonical tag on the page. I know some SPA frameworks can have issues with double-opening HTML tags when one page is accessed after another. That could be something that would confuse a crawler, so it's worth double-checking.
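As a rough illustration of the canonical checks above, here is a minimal Python sketch that scans an HTML string for rel=canonical link tags and reports empty, missing, conflicting or non-self-referential ones. The names CanonicalCollector and check_canonicals and the example.com URLs are made up for this example, and it says nothing about how Google itself evaluates canonicals.

```python
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values; attributes without a value are None
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonicals.append((attr_map.get("href") or "").strip())


def check_canonicals(html, page_url):
    """Return a list of problems found; an empty list means none were detected."""
    collector = CanonicalCollector()
    collector.feed(html)
    found = collector.canonicals
    problems = []
    if not found:
        problems.append("no canonical tag")
    if any(href == "" for href in found):
        problems.append("empty canonical href")
    distinct = {href for href in found if href}
    if len(distinct) > 1:
        problems.append("conflicting canonical tags")
    if any(href != page_url for href in distinct):
        problems.append("canonical does not point to the page itself")
    return problems


# Hypothetical pre-rendered HTML, before any JavaScript has filled in the href
pre_rendered = '<html><head><link rel="canonical" href=""></head><body></body></html>'
print(check_canonicals(pre_rendered, "https://example.com/a"))
```

Running it once on the raw pre-rendered HTML and once on a saved copy of the rendered page shows whether the two versions disagree about the canonical.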
As ever, there are rumours that Google will start giving much more weight to mobile in terms of indexing. Given your question about things changing recently: does your site have desktop and mobile parity?
If it looks as though everything is kosher, is it possible that the page Google is suggesting is much more heavily linked to, internally or externally? If internally, you could consider reviewing your internal linking (Will wrote a post about ways to think about internal linking). You could use a tool like Majestic to look at who is linking to these pages externally; it may be worth double-checking that all the links are genuine.
TL;DR I would start with the whole page content, not just the search directives, to make sure that's always being understood properly, then I would look into linking. These are mainly areas of investigation and next debug steps; hopefully they'll help narrow down the search for you!
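One rough way to compare internal link weight, assuming you have exported a crawl as (source URL, target URL) pairs: count how many distinct internal pages link to each URL. The function name, the edge-list format and the example.com URLs are hypothetical, and crawlers export this data in different shapes.

```python
from collections import Counter
from urllib.parse import urlsplit


def inbound_internal_links(edges, site_host):
    """Count, per target URL, how many distinct internal pages link to it."""
    seen = set()
    counts = Counter()
    for source, target in edges:
        # keep only links where both ends are on the site itself
        if urlsplit(source).netloc != site_host or urlsplit(target).netloc != site_host:
            continue
        # ignore self-links and repeated links from the same page
        if source == target or (source, target) in seen:
            continue
        seen.add((source, target))
        counts[target] += 1
    return counts


# Hypothetical crawl export
edges = [
    ("https://example.com/", "https://example.com/a"),
    ("https://example.com/b", "https://example.com/a"),
    ("https://example.com/b", "https://example.com/a"),
    ("https://other.org/x", "https://example.com/a"),
    ("https://example.com/", "https://example.com/b"),
]
for url, count in inbound_internal_links(edges, "example.com").most_common():
    print(count, url)
```

If the page Google picks as canonical sits far above the excluded pages in this list, internal linking is worth a closer look.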
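To put a number on the "whole page content" check, here is a minimal sketch that compares the visible text of two pages using word shingles and Jaccard similarity. This is only a quick sanity check under my own assumptions (three-word shingles, text already extracted from the rendered page); it is not how Google decides which pages are duplicates.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # very short texts become a single shingle (or nothing at all)
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Share of shingles the two texts have in common, between 0.0 and 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty pages are identical
    return len(a & b) / len(a | b)


# Hypothetical rendered text of the page you want indexed and the page Google chose
wanted = "How to plan a trip to the coast in winter"
chosen = "Our ten favourite recipes for a quick weekday dinner"
print(round(jaccard_similarity(wanted, chosen), 2))
```

A score near 1.0 for two pages that should be different would suggest that the crawler is receiving the same shell or fallback content for both, for example because the JavaScript did not render.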