ROI on Policing Scraped Content
-
Over the years, tons of original content from my website (written by me) has been scraped by 200-300 external sites. I've been using Copyscape to identify the offenders. It is EXTREMELY time consuming to identify the site owners, prepare an email with supporting evidence (screen shots), and following up 2, 3, 15 times until they remove the scraped content. Filing DMCA takedowns are a final option for sites hosted in the US, but quite a few of the offenders are in China, India, Nigeria, and other places not subject to DMCA. Sometimes, when a site owner takes down scraped content, it reappears a few months or years later. It's exasperating.
My site already performs well in the SERPs - I'm not aware of a third party site's scraped content outperforming my site for any search phrase.
Given my circumstances, how much effort do you think I should continue to put into policing scraped content?
-
I watch my traffic increases and decreases. You can do that with google analytics. I do it with clicky. When I see an important page show traffic losses, I go looking.
One of my retail sites suddenly was not selling a certain product category very well. I looked into it and hundreds of "made in China" blogs had scraped my content.
Then, I have images that are often grabbed. I watch image search traffic and watch for them.
I have tens of thousands of pages on the web. Its hard to monitor all of them, but it is easy to monitor when you can download a traffic spreadsheet that has % up and % down, sort it and then investigate. So, I am being responsive instead of proactive. And, really, I don't look at it as ROI, it is loss prevention.
-
Thanks for the detailed suggestions!
As a follow up: what metric do you use to decide which offenders to go after, and which ones to ignore? I simply don't have time to go after everybody who has copied my content so I need a way to prioritize.
There are two obvious situations where action is warranted: first, when the infringement is committed by a competitor in my industry, and second, when the infringing content outperforms my own site in the SERPs. What else would you suggest?
Thanks again.
-
Over the years, tons of original content from my website (written by me) has been scraped by 200-300 external sites.
I have the same problem on multiple sites. Most of the time the scraping is not harmful. But, on several occasions it has cost me thousands of dollars and forced me to abandon product lines and donate thousands of dollars worth of inventory to Goodwill. Infringers have included websites of many law firms, a state supreme court. a presidential candidate, an Ivy League law school and many others. Infringers can be using images, video or text.
It is EXTREMELY time consuming to identify the site owners, prepare an email with supporting evidence (screen shots), and following up 2, 3, 15 times until they remove the scraped content. Filing DMCA takedowns are a final option for sites hosted in the US,....
I am not an expert in intellectual property law, so what I do or say is not advice. Filing a DMCA can get you sued even if you are in the right. If you file a DMCA all of the details including your name and why you filed will be easily available to the person or company that you complained about. They can retaliate against you, call begging you to retract the DMCA, they can do anything they want against you.
If I contact someone two or three times without results I go straight to DMCA. One thing that I can say about Google is that they generally respond promptly about removing infringing content from their web SERPs and image SERPs. They also generally respond promptly to infringing content on Blogspot and YouTube. Ebay will shut down auctions en masse in response to a DMCA if a seller or group of sellers are using your images or other property.
When infringing content is on a university, government agency, or prominent company's website they usually respond immediately to notification. I usually contact a provost, legal department, or internal manager instead of writing to "webmaster" - who probably was involved in the problem and simply does not understand intellectual property. I usually don't prepare a big document. An email pointing out the infringing work and offering a resolution of "take it down right away" will usually get fast results.
quite a few of the offenders are in China, India, Nigeria, and other places not subject to DMCA.
If you can't identify the owner of the website or if they are outside of the USA, you can still file a DMCA to have the content removed from search engines or websites like YouTube or Blogspot who have an international user community but are owned by a US company. Some of them will insist that you deal with their infringing member, having an attorney contact them might yield quick results.
A lot of the professional spam is done from outside of the USA but there are a few spammers and simply arrogant cowboys in the USA. DMCA is the route to take, but you do risk retaliation with some of them.
Sometimes, when a site owner takes down scraped content, it reappears a few months or years later. It's exasperating.
Yep.
I spend a good amount of time protecting my content. The problem is so big that I can usually only afford to do it in situations where the scraping, infringing or whatever is costing me or my content is appearing on the website of an established business or organization who should have people in leadership positions who would not want that happening.
I watch my analytics watching for traffic drops, etc. Occasionally I go out looking for infringement. The cost of policing can be astronomical. I could have a full time employee working on this if I was going after everyone - and its not cost effective. Most of the people who are grabbing your stuff are putting it on domains that can't damage your rankings.
A greater problem than verbatim theft, in my opinion, is the people who grab your articles and simply rewrite them. You spent tons of time doing the research and preparing the presentation. They simply do a paragraph-by-paragraph rewrite into something that is not detectable or recognizable beyond structure.
Good luck.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Teaser Content Help!!
I'm in the process of a redesign and upgrade to Drupal 8 and have used Drupal's taxonomy feature to add a fairly large database of Points of Interest, Services etc. initially this was just for a Map/Filter for site users. The developer also wants to use teasers from these content types (such as a scenic vista description) as a way to display the content on relevant pages (such as the scenic vistas page, as well as other relevant pages). Along with the content it shows GPS coordinates and icons related to the description. In short, it looks cool, can be used in multiple relevant locations and creates a great UX. However, many of these teasers would basically be pieces of content from pages with a lot of SEO value, like descriptive paragraphs about scenic viewpoints from the scenic viewpoints page. Below is an example of how the descriptions of the scenic viewpoints would be displayed on the scenic viewpoints pages, as well as other potential relevant pages. HOW WILL THIS AFFECT THE SEO VALUE OF THE CONTENT?? Thanks in advance for any help, I can't find an answer anywhere. About 250 words worth of content about a scenic vista. There’s about 8 scenic vista descriptions like this from the scenic vistas page, so a good chunk of valuable content. There are numerous long form content pages like this that have descriptions and information about sites and points of interest that don't warrant having their own page. For more specific content with a dedicated page, I can just the the intro paragraph as a teaser and link to that specific page of content. Not sure what to do here.
Intermediate & Advanced SEO | | talltrees0 -
Weight of content further down a page
Hi, A client is trying to justify a design decision by saying he needs all the links for all his sub pages on the top level category page as google won't index them; however the links are available on the sub category and the sub category is linked to from the top level page so I have argued as long as google can crawl the links through the pages they will be indexed and won't be penalised. Am I correct? Additionally the client has said those links need to be towards the top of the page as content further down the page carries less weight; I don't believe this is the case but can you confirm? Thanks again, Craig.
Intermediate & Advanced SEO | | CSIMedia1 -
Duplicate Page Content
We have different plans that you can signup for - how can we rectify the duplicate page content and title issue here? Thanks. | http://signup.directiq.com/?plan=100 | 0 | 1 | 32 | 1 | 200 |
Intermediate & Advanced SEO | | directiq
| http://signup.directiq.com/?plan=104 | 0 | 1 | 32 | 1 | 200 |
| http://signup.directiq.com/?plan=116 | 0 | 1 | 32 | 1 | 200 |
| http://signup.directiq.com/?plan=117 | 0 | 1 | 32 | 1 | 200 |
| http://signup.directiq.com/?plan=102 | 0 | 1 | 32 | 1 | 200 |
| http://signup.directiq.com/?plan=119 | 0 | 1 | 32 | 1 | 200 |
| http://signup.directiq.com/?plan=101 | 0 | 1 | 32 | 1 | 200 |
| http://signup.directiq.com/?plan=103 | 0 | 1 | 32 | 1 | 200 |
| http://signup.directiq.com/?plan=5 |0 -
Block lightbox content
I'm working on a new website with aggregator of content.
Intermediate & Advanced SEO | | JohnPalmer
i'll show to my users content from another website in my website in LIGHTBOX windows when they'll click on the title of the items. ** I don't have specific url for these items.
What is the best way to say for SE "Don't index these pages"?0 -
Best practice for expandable content
We are in the middle of having new pages added to our website. On our website we will have a information section containing various details about a product, this information will be several paragraphs long. we were wanting to show the first paragraph and have a read more button to show the rest of the content that is hidden. Whats googles view on this, is this bad for seo?
Intermediate & Advanced SEO | | Alexogilvie0 -
Can a website be punished by panda if content scrapers have duplicated content?
I've noticed recently that a number of content scrapers are linking to one of our websites and have the duplicate content on their web pages. Can content scrapers affect the original website's ranking? I'm concerned that having duplicated content, even if hosted by scrapers, could be a bad signal to Google. What are the best ways to prevent this happening? I'd really appreciate any help as I can't find the answer online!
Intermediate & Advanced SEO | | RG_SEO0 -
Do people associate ads with crap content?
One of our client gets links from his content but he has ads everywhere. We were thinking he would get more links if he took down the ads. I think when people see a bunch of ads on a site they associate it with crap spammy content . What do you guys think? More specific adsense link and banner units. one square one letter board one tiny square all above the fold.
Intermediate & Advanced SEO | | OxzenMedia0 -
How Many Words in Content for Good SEO?
I have heard it's best to have 400+ words of content for strong SEO per page. I believe this is true for the most. I have a project in mind, however, that I am considering doing 100-200 words of content per page. This is for a glossary of terms for my industry, where I have a unique page for each term that describes what that term means w/ 1 image and a few links to related products. Is having just 100-200 words going to be enough? Each page will still be unique, original content. Or is it best to really try for longer articles? In other words, is there a general rule for # of words per page for search engines to see the page as valuable and unique and to give it good ranking? Give me a BIG THUMBS UP if you found this question useful. It won't cost you anything! Thanks!
Intermediate & Advanced SEO | | applesofgold0