Duplicate Content
-
Let's say a blog is publishing original content. Now let's say a second blog steals that original content via a bot and publishes it as its own. Now further assume the original blog doesn't notice this for several years.
How much damage could this do to blog A in Google's results? Any opinions?
-
Removing any duplicate text is absolutely essential, as it could negatively affect your business's organic SEO. Do you have any duplicated text?
-
Thanks for the response Peter re: the original post.
We are very convinced at this point that the issue isn't a technical one. We're not sure, however, whether there's an issue with that duplicate-content site we found stealing some of our articles or, as you mentioned, a quality issue. We're approaching it as a need to regroup around quality for now and monitor results over time. We've identified several areas for improvement in that regard.
This stuff is so frustrating to be honest. I get why Google can't show their cards, but the complete lack of transparency or ability to get some feedback from them makes this a difficult game.
Thanks again for the response, much appreciated.
-
CYNOT: I saw the original question via email (I'll avoid details in the public answer), and unfortunately I'm not seeing any clear signs of technical issues with the original content. This looks more like an aggressive filter than a penalty, but it's really hard to tell if the filter is a sign of quality issues or if Google is treating the wrong site as a duplicate.
-
Unfortunately, a lot of it does depend on the relative authority of the sites. People scrape Moz posts all the time (including some bots, which do it almost immediately), and those copies rank, but the scrapers don't have nearly our link profile or other ranking signals, so we don't worry about it. For a smaller site with a relatively new or weak link profile, though, it is possible for a stronger site to outrank you on your own content.
Google does try to look at cache dates and other signals, but a better-funded site can often get indexed more quickly as well. It's rare for this to do serious damage, but it can happen. As Balachandar said, at that point you may have to resort to DMCA take-down requests and other legal actions. Ultimately, that becomes a cost/benefit trade-off, as legal action is going to take time and money.
There are no technical tricks (markup, etc.) to tell Google that a page is the original source, although there are certainly tactics, like maintaining good XML sitemaps, that can help Google find your new content more quickly. Of course, you also want to be the site that has the stronger link profile, regardless of whether or not someone is copying you.
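For reference, the sitemap tactic mentioned above is just a standard XML sitemap kept current. A minimal sketch (the URLs and dates are hypothetical): an accurate `<lastmod>` gives Google a dated signal about when your copy of the content appeared.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per post; update <lastmod> whenever the post changes -->
  <url>
    <loc>https://example.com/blog/original-article/</loc>
    <lastmod>2015-06-01</lastmod>
  </url>
</urlset>
```

Resubmitting the sitemap in Search Console as soon as a post goes live helps your version get indexed before a scraper's copy.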
-
It can affect your ranking if the second blog steals your content. If the blog that stole your content has high DA, your content may be under-valued. Google updates its algorithms to analyze which website posted the content first (date analysis) to address this problem. A drop in traffic can be an indication that a page has been duplicated by another blog. If you have a big website with many blog posts, you can use a DMCA takedown service, which takes care of all of this. If you have any questions, feel free to ask.
Related Questions
-
Internal Duplicate Content Question...
We are looking for an internal duplicate content checker that is capable of crawling a site that has over 300,000 pages. We have looked over Moz's duplicate content tool and it seems like it is somewhat limited in how deep it crawls. Are there any suggestions on the best "internal" duplicate content checker that crawls deep in a site?
Intermediate & Advanced SEO | | tdawson09
Case Sensitive URLs, Duplicate Content & Link Rel Canonical
I have a site where URLs are case sensitive. In some cases the lowercase URL is being indexed and in others the mixed-case URL is being indexed. This is leading to duplicate content issues on the site. The site is using link rel canonical to specify a preferred URL in some cases; however, there is no consistency in whether those URLs are lowercase or mixed case. On some pages the link rel canonical tag points to the lowercase URL, on others it points to the mixed-case URL. Ideally I'd like to update all link rel canonical tags and internal links throughout the site to use the lowercase URL, but I'm apprehensive! My question is as follows: if I were to specify the lowercase URL across the site, in addition to updating internal links to use lowercase URLs, could this have a negative impact where the mixed-case URL is the one currently indexed? Hope this makes sense! Dave
Intermediate & Advanced SEO | | allianzireland
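A common pattern for the situation described above (a sketch, with a hypothetical URL) is to pick the lowercase form as canonical and emit the same tag from every case variant of the page, so the signal is consistent no matter which variant Google crawled:

```html
<!-- Served identically from /Products/Widget-One/ and /products/widget-one/:
     both variants declare the single lowercase URL as canonical -->
<link rel="canonical" href="https://example.com/products/widget-one/" />
```

Pairing this with a server-side 301 from mixed-case to lowercase keeps internal links, canonicals, and redirects all agreeing on one version.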
How do I use public content without being penalized for duplication?
The NHTSA produces a list of all recalls for automobiles. Their terms of use state that the information can be copied. I want to add it to our site so there is an up-to-date list for our audience to see. However, I'm just copying and pasting. I'm allowed to, according to NHTSA, but Google will probably flag it, right? Is there a way to do this without being penalized? Thanks, Ruben
Intermediate & Advanced SEO | | KempRugeLawGroup
Will using 301 redirects to reduce duplicate content on a massive scale within a domain hurt the site?
We have a site that is suffering a duplicate content problem. To help resolve this we intend to reduce the number of landing pages within the site. There is a HUGE number of pages. We have identified the potential to reduce the pages by half at first by combining the top-level directories, as we believe they are semantically similar enough that they no longer warrant being separated. For instance: Mobile Phones & Mobile Tablets (it's not "mobile devices"). We want to remove this directory path and 301 these pages to the others, then rewrite the content to include both phones and tablets on the same landing page. Question: would a massive number of 301s (over 100,000) cause any harm to the general health of the website? Would it affect the authority? We are also considering just severing them from the site, leaving them indexed but not crawlable from the site, to try to maintain a smooth transition. We don't want traffic to tank. Has anyone performed anything similar? I'd be interested to hear all opinions. Thanks!
Intermediate & Advanced SEO | | Silkstream
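Worth noting: a consolidation at this scale is usually implemented as a handful of pattern rules rather than 100,000 individual redirects. A sketch in Apache (mod_alias; the directory names are hypothetical):

```apache
# Collapse the entire /mobile-tablets/ tree into /mobile-phones/,
# preserving the rest of each path, with a single 301 pattern rule
RedirectMatch 301 ^/mobile-tablets/(.*)$ /mobile-phones/$1
```

One regex rule keeps the server config maintainable and avoids any per-URL overhead, however many pages the old directory contained.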
Product descriptions & Duplicate Content: between fears and reality
Hello everybody, I've been reading quite a lot recently about this topic and I would like your opinion on the following conclusion: ecommerce websites should have their own product descriptions if they can manage it (it will be beneficial for their SERP rankings), but the ones that cannot won't be penalized for having the same product descriptions (or part of the same descriptions) IF it is only a "small" part of their content (user reviews, similar products, etc.). What I mean is that among the signals Google uses to decide which sites should be penalized, there is the ratio of duplicate content to total content on the page: a page with 5-10% of its text duplicated might not be harmed, while a page with 50-75% of its content duplicated from another site might be. What do you think? Can "internal" duplicate content (for example, 3 pages about the same product in 3 different colors, one page per color) be considered as "bad" as "external" duplicate content (the same product description on different sites)? Thanks in advance for your opinions!
Intermediate & Advanced SEO | | Kuantokusta
Duplicate page content query
Hi forum, For some reason I have recently received a large increase in my Duplicate Page Content issues. Currently it says I have over 7,000 duplicate page content errors! For example, it lists these sample URLs with duplicate page content:
http://dikelli.com.au/accessories/gowns/news.html
http://dikelli.com.au/accessories/news.html
http://dikelli.com.au/gallery/dikelli/gowns/gowns/sale_gowns.html
However there are no physical links to any of these pages on my site, and even when I look at my FTP files (I am using Dreamweaver) these directories and files do not exist. Can anyone please tell me why the SEOmoz crawl is coming up with these errors and how to solve them?
Intermediate & Advanced SEO | | sterls
PDF for link building - avoiding duplicate content
Hello, We've got an article that we're turning into a PDF. Both the article and the PDF will be on our site. This PDF is a good, thorough piece of content on how to choose a product. We're going to strip out all of the links to our site in the article and create this PDF so that it will be good for people to reference and even print. Then we're going to do link building through outreach, since people will find the article and PDF useful. My question is: how do I use rel="canonical" to make sure that the article and PDF aren't duplicate content? Thanks.
Intermediate & Advanced SEO | | BobGW
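One wrinkle in the question above: a PDF can't carry an HTML link tag, but Google also accepts rel="canonical" as an HTTP header. A sketch in Apache config (assumes mod_headers is enabled; the filenames and URL are hypothetical):

```apache
# Send a canonical header with the PDF pointing back to the HTML article
<Files "choosing-a-product.pdf">
  Header set Link '<https://example.com/articles/choosing-a-product/>; rel="canonical"'
</Files>
```

With this in place, the HTML article consolidates the ranking signals while the PDF remains available for readers to download and print.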
PDF on financial site that duplicates ~50% of site content
I have a financial advisor client who has a downloadable PDF on his site that contains about 9 pages of good info. The problem is that much of the content can also be found on individual pages of his site. Is it best to noindex/follow the PDF? It would be great to let the few pages of original content be crawlable, but I'm concerned about the duplicate content aspect. Thanks --
Intermediate & Advanced SEO | | 540SEO
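If the noindex/follow route suggested above is taken, a PDF can't use a meta robots tag, but the same directive can be sent as an X-Robots-Tag HTTP header. A minimal Apache sketch (assumes mod_headers):

```apache
# Keep all PDFs out of the index while still allowing links to be followed
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, follow"
</FilesMatch>
```

This keeps the duplicated PDF out of the index while the original HTML pages stay crawlable and rankable.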