Using robots.txt to deal with duplicate content
-
I have 2 sites with duplicate content issues.
One is a WordPress blog.
The other is a store (Pinnacle Cart).
I cannot edit the canonical tag on either site. In this case, should I use robots.txt to eliminate the duplicate content?
-
The duplicates will come from any part of the URL that doesn't control navigation, so look at what you can strip from the URL without breaking the link to the product page.
Take a look at this: http://googlewebmastercentral.blogspot.com/2009/10/new-parameter-handling-tool-helps-with.html
Remember, this will only work with Google!
This is another interesting video from Matt Cutts about removing content from Google: http://googlewebmastercentral.blogspot.com/2008/01/remove-your-content-from-google.html
-
If the URLs look like this...
Would I tell Google to ignore p, mode, parent, or CatalogSetSortBy? Just one of those or all of those?
Thanks!!!
-
For WordPress try: http://wordpress.org/extend/plugins/canonical/
Also look at Yoast's WordPress SEO plugin referenced on that page - I love it!
For the duplicate content caused by the dynamic content on the Pinnacle Cart site, you can use Google Webmaster Tools to tell Google to ignore certain parameters - go to Site configuration > Settings > Parameter handling and add the variables you wish to ignore to the list.
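For reference, the canonical tag those plugins emit looks like this in each page's head (the URL here is just an illustrative example - the plugin fills in the preferred URL for each page automatically):

```html
<!-- Placed inside <head>; tells search engines which URL is the preferred version of this page -->
<link rel="canonical" href="http://www.domain.com/accessories/" />
```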
-
Hi,
The two sites are unrelated to each other, so my concern is not duplicate content between the two - there is none.
However, on each of the sites I have duplicate content issues. I do have admin privileges to both sites.
If there is a WordPress plugin that would be great. Do you have one you would recommend?
For my ecommerce site using pinnacle cart, I have duplicates because of the way people can search on the site. For example:
http://www.domain.com/accessories/
http://www.domain.com/accessories/?p=catalog&mode=catalog&parent=17&pg=1&CatalogSetSortBy=date
http://www.domain.com/accessories/?p=catalog&mode=catalog&parent=17&pg=1&CatalogSetSortBy=name
http://www.domain.com/accessories/?p=catalog&mode=catalog&parent=17&pg=1&CatalogSetSortBy=price
These all show as duplicate content in my Webmaster Tools reports. I don't have the ability to edit the head tag of each page in order to add a canonical link on this site.
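If robots.txt does turn out to be the route, here is a minimal sketch. Google's crawler (though not every crawler) honors `*` wildcards in Disallow rules, so the sort-order variants above could be blocked while leaving the clean category URL crawlable. One caveat: robots.txt only stops crawling, so URLs that are already indexed can linger in results for a while.

```
# Googlebot supports * wildcards; not all crawlers do
User-agent: *
# Block any URL containing the sort parameter; /accessories/ itself stays crawlable
Disallow: /*CatalogSetSortBy=
```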
-
What are your intentions here? Do you intend to leave both sites running? Can you give us more information on the sites? Are they aged domains, is one/any/both of them currently attracting any inbound links, are they ranking? What is the purpose of the duplicate content?
Are you looking to redirect traffic from one of the sites to the other using 301 redirect?
Or do you want both sites visible - using the Canonical link tag?
(I am concerned that you say you 'cannot edit the tag' - do you not have full admin access to either site?)
There are dedicated canonical management plugins for WordPress (if you have access to the wp-admin area).
You are going to need some admin privileges to make any alterations to the site so that you can correct this.
Let us know a bit more please!
These articles may be useful as they provide detailed best practice info on redirects:
http://www.google.com/support/webmasters/bin/answer.py?answer=66359
http://www.seomoz.org/blog/duplicate-content-block-redirect-or-canonical
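If a 301 redirect does end up being the right choice for one of the sites, a minimal Apache .htaccess sketch looks like this (assuming mod_rewrite is available; the domain names are placeholders):

```apache
RewriteEngine On
# Permanently redirect every request for the old host to the same path on the new host
RewriteCond %{HTTP_HOST} ^(www\.)?old-domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.new-domain.com/$1 [R=301,L]
```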