How can I compete with duplicate content in a post-Panda world?
-
I want to fix duplicate content issues on my eCommerce website. I read a very valuable blog post on SEOmoz about duplicate content in a post-Panda world and applied every strategy to my website.
Here is one example to illustrate:
http://www.vistastores.com/outdoor-umbrellas
Non-WWW version:
http://vistastores.com/outdoor-umbrellas redirects to the home page.
For HTTPS pages:
https://www.vistastores.com/outdoor-umbrellas
I have created a robots.txt file that blocks all HTTPS pages, as follows:
https://www.vistastores.com/robots.txt
And I have set rel=canonical pointing to the HTTP page, as follows:
http://www.vistastores.com/outdoor-umbrellas
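Concretely, the canonical setup described above amounts to a link element like this in the head of the HTTPS rendering (a sketch of the implied markup; the live site's exact tags may differ):

```html
<!-- In the <head> of https://www.vistastores.com/outdoor-umbrellas -->
<link rel="canonical" href="http://www.vistastores.com/outdoor-umbrellas">
```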
Narrow by search:
My website has narrow-by-search (faceted) navigation, which generates pages with the same meta info, such as:
http://www.vistastores.com/outdoor-umbrellas?cat=7
http://www.vistastores.com/outdoor-umbrellas?manufacturer=Bond+MFG
http://www.vistastores.com/outdoor-umbrellas?finish_search=Aluminum
I have blocked all of the dynamic pages generated by narrow-by-search in robots.txt:
http://www.vistastores.com/robots.txt
And I have set rel=canonical pointing to the base URL on each dynamic page.
Order by pages:
http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name
I have blocked all of these pages with robots.txt and set rel=canonical to the base URL.
For pagination pages:
http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name&p=2
I have blocked all of these pages with robots.txt and set rel=next and rel=prev on all paginated pages.
I have also set rel=canonical to the base URL.
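The pagination setup described above corresponds roughly to markup like this in the head of page 2 (a sketch of what the description implies, not the site's actual tags):

```html
<!-- In the <head> of http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name&p=2 -->
<link rel="prev" href="http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name">
<link rel="next" href="http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name&p=3">
<link rel="canonical" href="http://www.vistastores.com/outdoor-umbrellas">
```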
I have applied all of these SEO suggestions to my website, but Google is still crawling and indexing 21K+ pages, while my website has only 9K product pages.
Google search result:
In the last 7 days, my website's impressions and CTR have dropped by 75%.
I want to recover and perform as well as before.
I have explained my question at length because I want to recover my traffic as soon as possible.
-
Not a complete answer, but instead of applying rel=canonical to your dynamic pages you may just want to block them in robots.txt with something like:
Disallow: /*?
This will prevent Google from crawling any version of the page that includes a ? in the URL. A canonical is a suggestion, whereas robots.txt is more of a command.
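A minimal robots.txt using that pattern (assuming every query-string URL on the site should be kept out of the crawl) would look like:

```text
User-agent: *
Disallow: /*?
```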
As you can see from this query, Google has indexed 132 versions of that single page rather than following your rel=canonical suggestion.
To further enforce this, you may be able to use a bit of PHP code to detect whether the URL is dynamic and apply a robots noindex, noarchive only to the dynamic renderings of the page.
This could be done like this:
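A hypothetical sketch of that idea (this is illustrative code, not from the site; the helper name and the "dynamic means it has a query string" check are assumptions):

```php
<?php
// Hypothetical helper: return a robots meta tag only for dynamic
// renderings of a page, i.e. request URLs that carry a query string.
function robots_meta_for(string $requestUri): string
{
    // parse_url() with PHP_URL_QUERY extracts the query component,
    // or returns null when the URL has none.
    $query = parse_url($requestUri, PHP_URL_QUERY);
    if ($query !== null && $query !== '') {
        // Dynamic rendering: tell crawlers not to index or archive it.
        return '<meta name="robots" content="noindex, noarchive">';
    }
    return ''; // Static rendering: no restriction needed.
}

// In a page template, echo the result inside <head>:
echo robots_meta_for('/outdoor-umbrellas?cat=7'), "\n";
echo robots_meta_for('/outdoor-umbrellas'), "\n";
```

The canonical, clean URL then stays indexable while every parameterized variant of the same page carries the noindex directive.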
I also believe there are some parameter-filtering tools for this right within Webmaster Tools; worth a peek if your site is registered.
Additionally, where you are currently redirecting non-www subpages to the home page, you may instead want to redirect them to their www equivalents. This can be done in .htaccess like this:
# Redirect non-www to www
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^yourdomain\.com [NC]
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [L,R=301]
This will likely provide both a better user experience and a better solution in Google's eyes.
I'm sure some other folks will come in with other great suggestions for you as well.