Best way to deal with over 1000 pages of duplicate content?
-
Hi
Using the moz tools i have over a 1000 pages of duplicate content. Which is a bit of an issue!
95% of the issues arise from our news and news archive as its been going for sometime now.
We upload around 5 full articles a day. The articles have a standalone page but can only be reached by a master archive. The master archive sits in a top level section of the site and shows snippets of the articles, which if a user clicks on them takes them to the full page article. When a news article is added the snippets moves onto the next page, and move through the page as new articles are added.
The problem is that the stand alone articles can only be reached via the snippet on the master page and Google is stating this is duplicate content as the snippet is a duplicate of the article.
What is the best way to solve this issue?
From what i have read using a 'Meta NoIndex' seems to be the answer (not that i know what that is). from what i have read you can only use a canonical tag on a page by page basis so that going to take to long.
Thanks Ben
-
Hi Guys,
Thanks for your help.
I decided that updating the robot text would be the best option.
Ben
-
Technically, your URL:
http://www.capitalspreads.com/news
is really:
http://www.capitalspreads.com/news/index.php
So just add this line to robots.txt:
Disallow: /news/index.php
You won't be disallowing the pages underneath it but you will be blocking the page that contains all dupe content.
Also, if you prefer to do this with a meta tag on the news page, you could always do "noindex, follow" to make sure Google follows the links - they just don't index the page.
-
It may not be helpful to you in this situation. I was just saying that if your server creates multiple URLs containing the same content, as long as those URLs also contain the identical rel=canonical directive, a single canonical version of that content will be established.
-
Hi Chris,
I've read about the canonicalization but from what i could work I'd have to tag each of the 400 plus page individually to solve the issue and i don't think this is the best use of anyone's time.
I don't under how placing the tag and pointing back at itself will help? Can you explain a little more.
Ideally i want the full article page to be indexed as this will be more beneficial to the user. By placing the canonical tag on the snippets page and pointing it to itself would i not be telling the spider this is the page to index?
Here some examples
http://www.capitalspreads.com/news - Snippets page
http://www.capitalspreads.com/news/uk-economic-recovery-will-take-years - Full article, that would ideally be the page that wants to be indexed.
Regards
Ben
-
Ben, you use the rel=canonical directive in the header of the page with the original source of the content (pointing to itself), every reproduction of that page that also contains the rel=canonical directive pointing to the original source. So it's not necessarily a page by page solution. Have you read through this yet? Canonicalization and the Canonical Tag - Learn SEO - Moz
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
'duplicate content' on several different pages
Hi, I've a website with 6 pages identified as 'duplicate content' because they are very similar. This pages looks similar because are the same but it show some pictures, a few, about the product category that's why every page look alike each to each other but they are not 'exactly' the same. So, it's any way to indicate to Google that the content is not duplicated? I guess it's been marked as duplicate because the code is 90% or more the same on 6 pages. I've been reviewing the 'canonical' method but I think is not appropriated here as the content is not the same. Any advice (that is not add more content)?
Technical SEO | | jcobo0 -
Getting high priority issue for our xxx.com and xxx.com/home as duplicate pages and duplicate page titles can't seem to find anything that needs to be corrected, what might I be missing?
I am getting high priority issue for our xxx.com and xxx.com/home as reporting both duplicate pages and duplicate page titles on crawl results, I can't seem to find anything that needs to be corrected, what am I be missing? Has anyone else had a similar issue, how was it corrected?
Technical SEO | | tgwebmaster0 -
Duplicate Page Title
Hi I just got back from first crawl report and there were plenty of errors. I know this has been asked before but I am newbie here so bear with me. I captured the video. Any ideas on how to address the issue? ktXKDxRttK
Technical SEO | | mcardenal0 -
Can Page Content & Description Have Same Content?
I'm studying my crawl report and there are several warnings regarding missing meta descriptions. My website is built in WordPress and part of the site is a blog. Several of these missing description warnings are regarding blog posts and I was wondering if I am able to copy the first few lines of content of each of the posts to put in the meta description, or would that be considered duplicate content? Also, there are a few warnings that relate to blog index pages, e.g. http://www.iainmoran.com/2013/02/ - I don't know if I can even add a description of these as I think they are dynamically created? While on the subject of duplicate content, if I had a sidebar with information on several of the pages (same info) while the content would be coming from a WP Widget, would this still be considered duplicate content and would Google penalise me for it? Would really appreciate some thoughts on this,please. Thanks, Iain.
Technical SEO | | iainmoran0 -
Duplicate Page content / Rel=Cannonical
My SEO Moz crawl is showing duplicate content on my site. What is showing up are two articles I submitted to Submit your article (article submission service). I put their code in to my pages i.e. " <noscript><b>This article will only display in JavaScript enabled browsers.</b></noscript> " So do I need to delete these blog posts since they are showing up as dup content? I am having a difficult time understanding rel=cannonical. Isn't this for dup content on within one site? So I could not use rel="cannonical" in this instance? What is the best way to feature an article or press release written for another site, but that you want your clients to see? Rewritting seem ridiculous for a small business like ours. Can we just present the link? Thank you.
Technical SEO | | RoxBrock0 -
Duplicate Content Issue
Very strange issue I noticed today. In my SEOMoz Campaigns I noticed thousands of Warnings and Errors! I noticed that any page on my website ending in .php can be duplicated by adding anything you want to the end of the url, which seems to be causing these issues. Ex: Normal URL - www.example.com/testing.php Duplicate URL - www.example.com/testing.php/helloworld The duplicate URL displays the page without the images, but all the text and information is present, duplicating the Normal page. I Also found that many of my PDFs seemed to be getting duplicated burried in directories after directories, which I never ever put in place. Ex: www.example.com/catalog/pdfs/testing.pdf/pdfs/another.pdf/pdfs/more.pdfs/pdfs/ ... when the pdfs are only located in a pdfs directory! I am very confused on how to fix this problem. Maybe with some sort of redirect?
Technical SEO | | hfranz0 -
How to Solve Duplicate Page Content Issue?
I have created one campaign over SEOmoz tools for my website. I have found 89 duplicate content issue from report. Please, look in to Duplicate Page Content Issue. I am quite confuse to resolve this issue. Can any one suggest me best solution to resolve it?
Technical SEO | | CommercePundit0 -
Duplicate content
I have to sentences that I want to optimize to different pages for. sentence number one is travel to ibiza by boat sentence number to is travel to ibiza by ferry My question is, can I have the same content on both pages exept for the keywords or will Google treat that as duplicate content and punish me? And If yes, where goes the limit/border for duplicate content?
Technical SEO | | stlastla0