Is legacy duplicate content an issue?
-
I am looking for proof, or at least evidence, as to whether or not the sites are being hurt by duplicate content.
The situation is that there were 4 content-rich newspaper/magazine-style sites that were basically just reskins of each other (a tactic used under a previous regime). The least busy of the sites has since been discontinued and 301-redirected to one of the others, but traffic on the discontinued site was so low as to be lost in the noise, so it is unclear whether that brought any benefit.
For the last ~2 years all the sites have had unique content going up, but the archives of articles are still on all 3 remaining sites. I would like to know whether to redirect, remove or rewrite that content, but it is a big decision. The number of duplicate articles? 263,114!
Is there a chance this is hurting one or more of the sites? Is there any way to prove it, short of actually doing the work?
-
Hi Jen
We are in the fortunate/crazy situation of having a custom CMS, so the actual redirects are not really a problem from a technical standpoint. We are just wondering whether we should do it.
The main site, the biggest and busiest, has a discussion board, a shop and a blog, which the others don't, so the articles are about 10% of the indexed content, and about 11% are unique. Of the other 2 sites, one has 0.003% unique articles and the other 1.829%. Sounds pretty bad when I put it like that!
We haven't seen a noticeable dip, just generally disappointing performance. I think I will try to rope someone into doing a full CSI on the data.
Have you seen any site recover from a comparable situation? The thinking at this end was that the damage was already done, and that was that.
Thanks
-
Hi Fammy!
One thing you could do is compare the dates the Panda updates hit (http://moz.com/google-algorithm-change) against your website traffic around those dates. If you see a dip, you probably got hit.
If not, it's still possible that the duplicate content is holding back your visibility in the SERPs. You can sometimes guess this when you're adding new content and it doesn't really perform as you'd expect it to - but unfortunately, you won't know for sure until you take some action.
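If you can export daily visits from your analytics package, the Panda date check can be roughed out in a few lines. This Python sketch compares average daily visits in the two weeks before and after each update date and flags large drops. The function names, the 14-day window and the 15% threshold are my own assumptions for illustration, and the update dates you pass in would need to come from the algorithm change list.

```python
from datetime import date, timedelta
from statistics import mean


def traffic_change(daily_visits, update_day, window=14):
    """Relative change in mean daily visits after update_day.

    daily_visits maps date -> visit count. Days missing from the data
    are skipped. Returns None when either window has no usable data.
    """
    before = [daily_visits[d] for d in
              (update_day - timedelta(days=i) for i in range(1, window + 1))
              if d in daily_visits]
    after = [daily_visits[d] for d in
             (update_day + timedelta(days=i) for i in range(window))
             if d in daily_visits]
    if not before or not after:
        return None
    base = mean(before)
    if base == 0:
        return None
    return (mean(after) - base) / base


def flag_dips(daily_visits, update_days, threshold=-0.15, window=14):
    """List (update_day, change) pairs where traffic fell past the threshold."""
    flagged = []
    for day in update_days:
        change = traffic_change(daily_visits, day, window)
        if change is not None and change <= threshold:
            flagged.append((day, change))
    return flagged
```

A flagged date is only a hint, not proof: seasonality or a site change in the same fortnight would produce the same pattern, so it is worth eyeballing the graph as well.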
Another thing to keep in mind is that you risk getting hit in the future - for example, by a manual penalty - which could even result in the sites being removed.
263,114 is a huge number of duplicate articles, and I was wondering what proportion that is of your overall number of site pages. If it is quite a high percentage, the risk is obviously greater.
Personally, I'd recommend you take some action. Is there any pattern in the way the archive of articles is structured that would make it possible to write a catch-all 301 rule in your .htaccess file that redirects them all to one of the three sites?
For example, say your archived articles sit in a folder called archive - you'd put this in the .htaccess on sites 1 and 2:
RewriteEngine on
RewriteBase /
# Permanently redirect everything under archive/ to the same path on site 3
RewriteRule ^archive/(.*)$ http://www.yoursite3.com/archive/$1 [R=301,L]
... and this would redirect anything in the archive directory to the archive directory on site 3, assuming the file names are exactly the same.
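Before rolling a rule like that out across a couple of hundred thousand URLs, it may be worth checking a sample of paths against it. The Python sketch below mimics the pattern from the rule above so you can see where each path would end up. The function name is hypothetical, the host is the placeholder from the example, and this only approximates how mod_rewrite matches in an .htaccess file, so treat it as a sanity check rather than a substitute for testing on the server.

```python
import re

# Same pattern as the RewriteRule above. In .htaccess context the leading
# slash is stripped from the path before the pattern is applied.
ARCHIVE_RULE = re.compile(r"^archive/(.*)$")

# Placeholder host taken from the example rule, not a real target.
TARGET_PREFIX = "http://www.yoursite3.com/archive/"


def redirect_target(path):
    """Return the URL the rule would redirect to, or None if it does not match."""
    match = ARCHIVE_RULE.match(path.lstrip("/"))
    if match is None:
        return None
    return TARGET_PREFIX + match.group(1)


if __name__ == "__main__":
    for sample in ["/archive/2009/story.html", "/blog/post"]:
        print(sample, "->", redirect_target(sample))
```

Any path that comes back as None would be left alone by the rule, which is a quick way to spot archive URLs that do not follow the expected folder structure.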
Alternatively, if that's not an option, you could look at which of the articles on sites 1 and 2 have decent links pointing to them, redirect those to your chosen site 3 and remove the rest, cutting the workload down a little.
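That triage could be scripted once you have a link export for the archive. This is a minimal Python sketch, assuming you already have a mapping of article path to inbound link count from whatever link tool you use. The function name and the min_links cut-off are assumptions, and what counts as "decent" is a judgment call for your own data.

```python
def triage_articles(inbound_links, min_links=1):
    """Split article paths into those to redirect and those to remove.

    inbound_links maps an article path to its count of external links.
    Paths with at least min_links links go in the redirect list.
    """
    redirect = sorted(p for p, n in inbound_links.items() if n >= min_links)
    remove = sorted(p for p, n in inbound_links.items() if n < min_links)
    return redirect, remove


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = {"/archive/a.html": 12, "/archive/b.html": 0, "/archive/c.html": 3}
    to_redirect, to_remove = triage_articles(sample, min_links=3)
    print("redirect:", to_redirect)
    print("remove:", to_remove)
```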