How to publish duplicate content legitimately without Panda problems
-
Let's imagine that you own a successful website that publishes a lot of syndicated news articles and syndicated columns.
Your visitors love these articles and columns but the search engines see them as duplicate content.
You worry about being viewed as a "content farm" because of this duplicate content and getting the Panda penalty.
So, you decide to continue publishing the content and use...
<meta name="robots" content="noindex, follow">
This allows you to display the content for your visitors, but it should stop the search engines from indexing any pages with this code. It should also allow robots to spider the pages and pass link value through them.
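To confirm that a given page really carries this tag before relying on it, a small check along these lines might help. This is only a sketch using Python's standard `html.parser`; the function names `robots_directives` and `is_noindex_follow` are made up for illustration, and it only reads the robots meta tag in the HTML (it does not look at HTTP headers or robots.txt).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                directive = part.strip().lower()
                if directive:
                    self.directives.add(directive)


def robots_directives(html_text):
    """Return the set of robots meta directives found in html_text."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


def is_noindex_follow(html_text):
    """True if the page asks not to be indexed but does not block link following."""
    directives = robots_directives(html_text)
    return (
        "noindex" in directives
        and "nofollow" not in directives
        and "none" not in directives
    )
```

Running `is_noindex_follow` over the saved HTML of each syndicated page would flag any page where the tag was forgotten or mistyped.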
I have two questions:
-
If you use "noindex" will that be enough to prevent your site from being considered as a content farm?
-
Is there a better way to continue publication of syndicated content but protect the site from duplicate content problems?
-
-
Good idea about attributing with rel=canonical.
Thanks!
-
Noindexing the syndicated articles should, in theory, minimize the likelihood of having a Panda problem, but it seems like Panda is constantly evolving. You will probably see some kind of drop in rankings as the number of indexed pages for the site will decrease. If you have, say, 1,000 pages total on the site and suddenly 900 are taken out of the index, this might be a problem. If it is a much smaller percentage of the site, you might not have a problem at all. Other than the number of indexed pages, I don't think you will have a problem once the syndicated stuff is noindexed.
It will probably take Google a while to re-index/un-index the pages, so hopefully it won't be a fast drop if there is one. In the long run, it is probably better to at least have the appearance of trying to do the right thing. Linking to the source, and maybe using rel=canonical tags pointing to the original article, would also be good practice.
-
Thank you, Nick.
We will be using the "noindex" only on the pages with syndicated content. This is a Dreamweaver site, so it is easy to place the code on specific pages, and the site does not use excerpts.
Do you still see a potential problem?
The question really is... "Could a site that contains a lot of syndicated content have a Panda problem if the pages that contain that content are noindexed?"
-
I am assuming you intend to use noindex only on the duplicate content articles. Using noindex on everything would also prevent your own content from being indexed and found through Google.
If you are using WordPress or something else that allows showing excerpts, you could try making the article pages noindex and showing only excerpts on the main page and category pages, which would be indexed and followed. I think that would keep the articles out of the search results and avoid duplicate content penalties, while allowing the pages that show the excerpts to still be indexed and rank OK.
The idea here is that the pages showing the excerpts would have enough text to help the home and category pages to rank for the subject matter and hopefully not be seen as what it is - copied content.
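As a rough sketch of that rule, assuming the site's templates can tell page types apart: syndicated article pages get "noindex, follow", while the home page, category pages, and any original articles stay indexable. The function name `robots_meta_tag` and the labels `page_type` and `is_syndicated` are hypothetical, not part of WordPress or any other CMS.

```python
def robots_meta_tag(page_type, is_syndicated=False):
    """Return the robots meta tag for a page under the excerpt approach.

    page_type and is_syndicated are hypothetical labels for this sketch.
    """
    if page_type == "article" and is_syndicated:
        # Full syndicated articles stay out of the index but pass link value.
        content = "noindex, follow"
    else:
        # Home, category, and original article pages remain indexable.
        content = "index, follow"
    return '<meta name="robots" content="{}">'.format(content)
```

For example, `robots_meta_tag("article", is_syndicated=True)` would be used in the head of a syndicated article, and `robots_meta_tag("category")` on a category page that lists excerpts.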
You will probably eventually get caught by the Panda, but this may work as a temporary solution until you can get some original content mixed in.