How to compete with duplicate content in a post-Panda world?
-
I want to fix duplicate content issues on my eCommerce website.
I read a very valuable blog post on SEOmoz about duplicate content in a post-Panda world and applied all of its strategies to my website.
Here is one example to show what I have done.
http://www.vistastores.com/outdoor-umbrellas
Non-WWW version:
http://vistastores.com/outdoor-umbrellas redirects to the home page.
For HTTPS pages:
https://www.vistastores.com/outdoor-umbrellas
I have created a robots.txt file for all HTTPS pages, as follows.
https://www.vistastores.com/robots.txt
And I have set rel=canonical to the HTTP page, as follows.
http://www.vistastores.com/outdoor-umbrellas
Narrow by search:
My website has a narrow-by search, which produces pages with the same meta info, as follows.
http://www.vistastores.com/outdoor-umbrellas?cat=7
http://www.vistastores.com/outdoor-umbrellas?manufacturer=Bond+MFG
http://www.vistastores.com/outdoor-umbrellas?finish_search=Aluminum
I have blocked all dynamic pages generated by the narrow-by search in robots.txt.
http://www.vistastores.com/robots.txt
I have also set rel=canonical to the base URL on each dynamic page.
Order by pages:
http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name
I have blocked all of these pages with robots.txt and set rel=canonical to the base URL.
For pagination pages:
http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name&p=2
I have blocked all of these pages with robots.txt and set rel=next and rel=prev on all paginated pages.
I have also set rel=canonical to the base URL.
I have applied all of these SEO suggestions to my website, but Google is still crawling and indexing 21K+ pages. My website has only 9K product pages.
Google search result:
Over the last 7 days, my website's impressions and CTR have dropped by 75%.
I want to recover and perform as well as before.
I have explained my question at length because I want to recover my traffic as soon as possible.
-
Not a complete answer, but instead of using rel=canonical on your dynamic pages you may just want to block them in robots.txt, something like:
Disallow: /*?
This will prevent Google from crawling any version of the page that includes a ? in the URL. Canonical is a suggestion, whereas robots.txt is more of a command.
As you can see from this query:
Google has indexed 132 versions of that single page rather than following your rel=canonical suggestion.
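Before deploying a wildcard rule like the `Disallow: /*?` above, it may help to check which URLs it would catch. Here is a minimal Python sketch, assuming Google-style pattern matching (`*` matches any run of characters, a trailing `$` anchors the end). The helper names `rule_to_regex` and `is_disallowed` are made up for illustration and are not part of any robots.txt library.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate one robots.txt path rule into a compiled regex.

    Assumes Google-style matching: '*' is a wildcard and a trailing '$'
    anchors the end of the URL. Everything else is matched literally.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(url: str, rule: str = "/*?") -> bool:
    """Return True if the path plus query string of `url` matches `rule`."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return rule_to_regex(rule).match(target) is not None


if __name__ == "__main__":
    for candidate in (
        "http://www.vistastores.com/outdoor-umbrellas",
        "http://www.vistastores.com/outdoor-umbrellas?cat=7",
        "http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name&p=2",
    ):
        print(candidate, "blocked" if is_disallowed(candidate) else "allowed")
```

With this rule the base category URL stays crawlable while the filtered, sorted, and paginated variants are matched, which is the behaviour described above.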
To further enforce this, you may be able to use a bit of PHP code to detect whether the URL is dynamic and output a robots noindex, noarchive meta tag on only the dynamic renderings of the page.
This could be done like this:
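The original snippet did not survive in this thread, so here is the same idea as a rough sketch in Python rather than PHP. It assumes "dynamic" simply means the URL carries a query string. The names `NOINDEX_TAG` and `robots_meta_for` are hypothetical, and a real Magento or PHP template would do the equivalent check in its own page head.

```python
from urllib.parse import urlsplit

# Tag to emit in <head> on dynamic renderings only (assumed wording).
NOINDEX_TAG = '<meta name="robots" content="noindex, noarchive">'


def robots_meta_for(url: str) -> str:
    """Return the noindex tag for URLs with a query string, else ''.

    Assumption: any non-empty query string marks the page as a dynamic
    rendering (filter, sort order, or pagination).
    """
    return NOINDEX_TAG if urlsplit(url).query else ""


if __name__ == "__main__":
    print(repr(robots_meta_for("http://www.vistastores.com/outdoor-umbrellas")))
    print(repr(robots_meta_for("http://www.vistastores.com/outdoor-umbrellas?cat=7")))
```

One caveat worth checking: a crawler only sees this tag on pages it is allowed to fetch, so it may have no effect on URLs that are also blocked in robots.txt.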
I also believe there are some filtering tools for this right within Webmaster Tools. Worth a peek if your site is registered.
Additionally, where you are redirecting non-www subpages to the home page, you may instead want to redirect them to their www versions.
This can be done in .htaccess like this:
# Redirect non-www to www
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^yourdomain\.com [NC]
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [L,R=301]
This will likely provide both a better user experience and a better solution in Google's eyes.
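To sanity-check what a non-www to www redirect should produce for a given subpage, a small Python sketch like the one below can be used to build the expected target. The function name `www_redirect_target` and the default host are assumptions for illustration. Unlike the rewrite rule, which always sends visitors to http://, this sketch keeps whatever scheme the request used.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def www_redirect_target(url: str, bare_host: str = "vistastores.com") -> Optional[str]:
    """Return the www URL a bare-host request should 301 to, or None.

    The path and query string are preserved so that a subpage redirects
    to its own www version instead of the home page.
    """
    parts = urlsplit(url)
    if parts.hostname != bare_host:
        return None  # already on www (or a different host): no redirect
    return urlunsplit(
        (parts.scheme, "www." + bare_host, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(www_redirect_target("http://vistastores.com/outdoor-umbrellas"))
    print(www_redirect_target("http://www.vistastores.com/outdoor-umbrellas"))
```

Comparing this expected target with the Location header the server actually returns is a quick way to confirm subpages are no longer being sent to the home page.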
I'm sure some other folks will come in with some other great suggestions for you as well.