Do URLs with canonical tags get indexed by Google?
-
Hi, we re-branded and launched a new website in February 2016. In June we saw a steep drop in the number of URLs indexed, and there have been smaller dips since. We started an account with Moz and found several thousand high-priority crawl errors for duplicate pages, which we have since fixed with canonical tags. However, we are still seeing the number of URLs indexed drop.
Do URLs with canonical tags get indexed by Google? I can't seem to find a definitive answer on this. A good portion of our URLs have canonical tags because they are just events with different dates, but otherwise the content of the page is the same.
-
Thanks so much! Really helpful!
-
Not exactly. It's not so much that the canonical "supersedes" an index, follow tag. A canonical tag establishes equivalency, while a NoIndex is more like a "does not equal." The Index, Follow is still there and is still seen by bots as they crawl. In fact, if you had NoIndex on a page with a canonical tag, the bots may not even see the canonical at all, since you told them to NoIndex the page. The Meta Robots Index tag comes first, allowing the bots to crawl and index the page, but then the canonical sets up equivalency to a separate page. So if your canonical tag is being respected, it doesn't wind up doing the same thing as a NoIndex (though it may seem that way), nor does it do the same thing as a 301 (though there are similarities in how equity is passed). Since a canonical establishes an equivalency, you'll find that the Canon Page will eventually take the place of the Canonicalized Page in search results, because you're telling the bots that the Canonicalized Page _is_ the Canon Page and that the Canon Page is the right version of both.
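To make the distinction above concrete, here is a minimal Python sketch that reads the two signals from a page's HTML and labels the combination. It is only an illustration of the reasoning in this reply, not a model of how Google actually processes pages. The function name `describe`, the class `HeadSignals`, and the example.com URLs are all hypothetical.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collects meta robots directives and the canonical URL from a page."""

    def __init__(self):
        super().__init__()
        self.robots = set()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. a bare attribute), so normalise them.
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            content = attrs.get("content", "")
            self.robots |= {d.strip().lower() for d in content.split(",") if d.strip()}
        elif tag == "link" and "canonical" in attrs.get("rel", "").lower().split():
            self.canonical = attrs.get("href") or None


def describe(page_url, html):
    """Label the robots/canonical combination found on a page (illustrative only)."""
    parser = HeadSignals()
    parser.feed(html)
    noindex = "noindex" in parser.robots
    points_elsewhere = parser.canonical is not None and parser.canonical != page_url
    if noindex and points_elsewhere:
        return "conflicting: noindex plus a canonical to another URL"
    if noindex:
        return "noindex: page asks to be kept out of the index"
    if points_elsewhere:
        return "canonicalized: page declares itself equivalent to " + parser.canonical
    return "indexable: no noindex, canonical is absent or self-referencing"


# Hypothetical event page: Index, Follow plus a canonical pointing at the main event URL.
html = (
    '<head><meta name="robots" content="index, follow">'
    '<link rel="canonical" href="https://example.com/events/workshop"></head>'
)
print(describe("https://example.com/events/workshop?date=2016-06-01", html))
```

The "conflicting" label corresponds to the NoIndex-plus-canonical case mentioned above, where the two tags send mixed signals.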
-
Thanks, Mike! So, just to clarify, for a particular URL, if we have Meta Robots set to "Index/Follow" and that same URL has a canonical tag, the canonical tag would supersede the robot command and the URL would not be indexed?
-
If a URL was indexed and has since had a canonical added to it pointing to another page, it will eventually disappear from results. Basically, the page gets consolidated with its canon page. If the bots choose to respect the canonical tag in that instance, all signals get passed to the canon page, while the page and its information stay accessible to human visitors. As such, there's no reason to keep the page in the index, because you're telling the bots that another page is the correct page instead. This is not the same as NoIndexing a page, but it will eventually remove the page from the index, much as a 301 passes equity along to another page while eventually removing the redirected page from the index in favor of the page being redirected to.
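The consolidation described above can be sketched as a simple grouping, assuming every canonical tag is respected (which, as noted, is up to the bots). The function name `consolidate` and the example.com URLs are hypothetical, and the result is a rough expectation rather than a prediction of what any search engine will report.

```python
from collections import defaultdict


def consolidate(canonicals):
    """Group URLs by the page they declare as canonical.

    `canonicals` maps each URL to its canonical target, or to None when the
    page has no canonical tag (it is then treated as its own canon page).
    """
    groups = defaultdict(list)
    for url, target in canonicals.items():
        groups[target or url].append(url)
    return dict(groups)


# Hypothetical site: one event listed under several dates, plus an unrelated page.
pages = {
    "https://example.com/events/workshop": "https://example.com/events/workshop",
    "https://example.com/events/workshop?date=2016-06-01": "https://example.com/events/workshop",
    "https://example.com/events/workshop?date=2016-07-01": "https://example.com/events/workshop",
    "https://example.com/about": None,
}
groups = consolidate(pages)
# Prints: 4 URLs crawled, 2 expected to stay indexed
print(len(pages), "URLs crawled,", len(groups), "expected to stay indexed")
```

Under that assumption, a falling indexed-URL count after adding canonical tags to date-variant event pages is what this grouping would lead you to expect.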