Rel=canonical + no index

Morris77

We have been running an A/B test of our homepage, and although we placed a rel=canonical tag on the test page, it is still being indexed. In fact, at one point Google even showed it as a sitelink. We have this problem throughout our website. My question is:

What is the best practice for duplicate pages?

1. Put only a rel=canonical pointing to the "wanted original page"

2. Put a rel=canonical (pointing to the wanted original page) and a noindex on the duplicate version
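
For readers comparing the two options, the only difference in the markup is whether a robots meta tag sits alongside the canonical link. The following is a minimal Python sketch, assuming the page's HTML is available as a string, that reports which of the two signals a page carries. The `HeadSignals` class, the `page_signals` helper, and the example.com URLs are hypothetical names used for illustration only; this checks the markup and says nothing about how a search engine will act on it.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect the rel=canonical target and robots meta directives (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.canonical = None   # href of <link rel="canonical">, if present
        self.robots = set()     # directives from <meta name="robots">

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values may be None.
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attrs.get("rel", "").lower().split():
            self.canonical = attrs.get("href")
        elif tag == "meta" and attrs.get("name", "").lower() == "robots":
            for directive in attrs.get("content", "").split(","):
                if directive.strip():
                    self.robots.add(directive.strip().lower())


def page_signals(html):
    """Return (canonical_url, has_noindex) for an HTML string."""
    parser = HeadSignals()
    parser.feed(html)
    return parser.canonical, "noindex" in parser.robots


# Option 1: canonical only. Option 2: canonical plus noindex.
option_1 = '<head><link rel="canonical" href="https://example.com/"></head>'
option_2 = ('<head><link rel="canonical" href="https://example.com/">'
            '<meta name="robots" content="noindex, follow"></head>')

print(page_signals(option_1))
print(page_signals(option_2))
```

Running a check like this against the live test page is a quick way to confirm the tag that was intended is actually being served.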

Has anyone seen any detrimental effect from doing #2?

Thanks

Dr-Pete

Interesting - I've very rarely had issues with GWO, but if a new URL was created and someone linked to it, I can see where you might have a problem.

(1) None of these things are absolute, I'm afraid, but typically, yes - a rel=canonical to a different page should keep the first page out of the index.

(2) Usually, but it depends. The problem here may be that Google just isn't crawling the test variant very often, so they may not be processing the rel=canonical yet.

If it's just a couple of pages, I'd give it time - it's probably not an emergency situation. Again, you could just tell Google to remove them in GWT. I think you're doing the right thing with the canonical tags, but in practice it can take Google time to process them the way you want.

Morris77

To answer the second question:

We actually use Google Website Optimizer to run our test. The problem started when someone linked to the test page.

I'm not sure whether these scenarios are different for Google, but I'm just trying to understand:

1. If a page was never indexed before and you put a rel=canonical on it (pointing to a different page), then will the rel=canonical keep it out of the index?

2. If a page was already in the index and you put a rel=canonical on it, is that a strong enough signal for Google to go and remove it from the index?

Obviously, both scenarios apply only once the pages have been crawled.

Dr-Pete

I wouldn't mix those signals - it's nearly impossible to tell what's working if you do. If the canonical on the test page isn't working, there may be a couple of issues:

(1) It could just be taking time. Honestly, it's never as fast as you want it to be.

(2) It may be that the test versions got crawled originally, but now aren't being crawled (so the canonical isn't being processed). Check the cache date on the test page.

The big question is how they got crawled in the first place. It's often better to use some sort of cookie-based implementation so that Google never even sees the B version. That's how most of the A/B test implementations work (specifically to avoid this problem).
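
As a rough illustration of the cookie-based idea, here is a minimal Python sketch. It assumes both variants are served from the same URL and that the B cookie is assigned somewhere else (for example, by a client-side script on an earlier visit). The cookie name `ab_variant` and the function `variant_for_request` are hypothetical; this is not how any particular testing tool implements it.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "ab_variant"  # hypothetical cookie name


def variant_for_request(cookie_header):
    """Return "B" only when the request already carries the B cookie.

    Any request without that cookie gets the default "A" version, so a
    client that sends no cookies never receives the B variant.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(COOKIE_NAME)
    if morsel is not None and morsel.value == "B":
        return "B"
    return "A"


print(variant_for_request(""))                         # no cookie sent
print(variant_for_request("session=xyz; ab_variant=B"))
```

Because the variant is chosen from the cookie rather than from a separate URL, there is no second URL for anyone to link to, which is the situation that started the indexing problem here.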

If it's just a couple of URLs and you can't shake them, you could request manual removal in GWT. That really depends on the scope and URL structure, though.

AlanMosley

Good point. I was thinking of robots.txt, where the page would not be read.

But I had not thought about that situation, and I am not sure what search engines would do.

Still, only the canonical is needed.

Morris77

A page that has a noindex on it still gets crawled, and therefore the rel=canonical directive is still "seen" by the bot. So why wouldn't the rel=canonical pass the credit over?

AlanMosley

Just the rel=canonical.

If you noindex the page, the rel=canonical cannot be indexed and cannot work.

Rel=canonical simply passes the credit for the content to the canonical page.

Noindex is like cutting off your hand because you have a splinter. Links pointing to a non-indexed page are pouring link juice into thin air.

You can use a meta noindex, follow so that some of the link juice is returned, but canonical is best for duplicate content.

Actually, getting rid of the duplicate content is best.
