What tools do you use to find scraped content?
-
This hasn’t been an issue for our company so far, but I like to be proactive. What tools do you use to find sites that may have scraped your content?
Looking forward to your suggestions.
Vic
-
Oh, this belongs to a different thread: http://moz.com/community/q/chinese-site-ranking-for-our-brand-name-possible-hack
-
Is this part of the original conversation, or something else? Which sites are these?
-
I'm not sure we have been scraped as such though, because the site in question has different content.
It looks as though the offending site has hacked another site (which redirects to the offending site), and the hacked site is now ranking for our brand name. Our homepage has lost all the rankings it had (our category and product pages seem fine) and has essentially disappeared.
Can anyone else shed any light?
-
Siteliner (Copyscape's big brother) is really great and is what we use first (plus I have a bookmarklet for it to make it faster and easier to use).
We also use Linda's method of searching for a bit of content in quotes. It's the easiest way to show an ecommerce client how much work they're going to need: paste three product descriptions into Google, watch the magic, and explain that the same would happen across all 15,000 products.
-
I spot check on a regular basis by taking a unique chunk out of a post, putting it in quotes, and doing a Google search on it. It's not comprehensive, but it is free. [And the main problems we have had with scrapers have been with sites that have taken huge portions of our content, not just an article or two, and a spot check roots those out.]
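If you want to speed up the spot check, here is a rough Python sketch of the same idea: pull a reasonably distinctive chunk out of a post and build the exact-match (quoted) Google search URL for it. The function names and the word-count thresholds are my own assumptions for illustration, not part of any tool mentioned in this thread, and you still open the URL in a browser and eyeball the results yourself.

```python
import re
from urllib.parse import quote_plus


def pick_chunk(text, min_words=8, max_words=12):
    """Return the start of the first sentence with at least min_words words.

    The thresholds are arbitrary assumptions: long enough to be unique,
    short enough to paste into a search box.
    """
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = sentence.split()
        if len(words) >= min_words:
            return " ".join(words[:max_words]).rstrip(".!?")
    return None


def exact_match_url(chunk):
    """Wrap the chunk in double quotes and build a Google search URL."""
    return "https://www.google.com/search?q=" + quote_plus(f'"{chunk}"')


if __name__ == "__main__":
    post = (
        "Short one. Our hand-stitched leather wallets are cut from a single "
        "hide and finished in our Leeds workshop by two people."
    )
    chunk = pick_chunk(post)
    print(chunk)
    print(exact_match_url(chunk))
```

Run it on a handful of posts or product descriptions and open each URL; any result that isn't your own site is worth a closer look.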
-
Thanks, Chris & Jonathan. I will look into Copyscape. Good stuff!
-
Yep, Copyscape is what I use. I use a WordPress plugin that calls the Copyscape API, and I check my main content every month or so with a single click.
-
Copyscape works well for us. You can scan a couple of pages for free, and then it's $0.05/page after that.
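To get a feel for what that adds up to on a big site, here is a back-of-the-envelope sketch in Python. It assumes the $0.05/page price quoted above still applies; the function name and the `free_pages` parameter are made up for illustration, and the 15,000-page figure is just the example product count mentioned earlier in the thread.

```python
def scan_cost_usd(pages, free_pages=0, price_cents=5):
    """Estimate the cost of scanning `pages` pages at a flat per-page price.

    price_cents=5 mirrors the $0.05/page quoted in this thread (an assumption,
    check current pricing). Working in whole cents avoids float rounding.
    """
    billable = max(pages - free_pages, 0)
    return billable * price_cents / 100


if __name__ == "__main__":
    # 15,000 product pages at $0.05 each
    print(scan_cost_usd(15000))
```

For a full catalogue that is a noticeable bill per pass, which is one reason the free spot check is a sensible first step.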