What tools do you use to find scraped content?
-
This hasn’t been an issue for our company so far, but I like to be proactive. What tools do you use to find sites that may have scraped your content?
Looking forward to your suggestions.
Vic
-
Oh, this belongs to a different thread: http://moz.com/community/q/chinese-site-ranking-for-our-brand-name-possible-hack
-
Is this part of the original conversation, or something else? Which sites are these?
-
I'm not sure we have been scraped as such though, because the site in question has different content.
It looks as though the offending site has hacked another site (which redirects to the offending site) but the hacked site is ranking for our brand name. Our homepage has lost all rankings it had (our category and product pages seem fine) and has essentially disappeared.
Can anyone else shed any light?
-
Siteliner (Copyscape's big brother) is really great and what we use first (plus I have a bookmarklet for it to make it faster & easy to use.)
Also use Linda's method of taking a bit of content in quotes. Easiest way to show an ecommerce client how much work they're going to require - take three product descriptions into Google, watch the magic, and explain that would happen across all 15,000 products.
-
I spot check on a regular basis by taking a unique chunk out of a post, putting it in quotes, and doing a Google search on it. It's not comprehensive, but it is free. [And the main problems we have had with scrapers have been with sites that have taken huge portions of our content, not just an article or two, and a spot check roots those out.]
-
Thanks, Chris & Jonathan. I will look into Copyscape. Good stuff!
-
Yep, Copyscape is what I use. I use a wordpress plugin that uses the copyscape API and just check my main content every month or so with a simple click.
-
Copyscape works well for us. You can scan a couple of pages for free, and then it's $0.05/page after that.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Where can I find either web directories or decent sites that will link back to me...without paying thru the nose?
Hi Community! For starters, from my question, you can probably determine that I am either a novice, or very low intermediate SEO person. We all know, links are king with our friends at Google. I am a recently-retired IT person with a very small freelance IT company (just me), and I'd like to generate more leads/business. I'd be thrilled with one or two small jobs per week to supplement my pension. I've used the MOZ tools only to determine the majority of my competitors have like a thousand links, whereas Google reports me as having 97 links. My on-page grades are all upper 90s and a couple are at 100%. I am not targeting really competitive keywords, so that's not my problem. My problem is my DA, which is sitting at a mere 20...pause to laugh! I don't want to pay thru the nose for high DA links for the obvious reasons. I submitted my URLs to as many directories as I could find. Would anyone have a decent list of sites where I could submit my URL to get some more links? Thanks guys and Gals! Willy
White Hat / Black Hat SEO | | NewSEOguy0 -
Removing duplicated content using only the NOINDEX in large scale (80% of the website).
Hi everyone, I am taking care of the large "news" website (500k pages), which got massive hit from Panda because of the duplicated content (70% was syndicated content). I recommended that all syndicated content should be removed and the website should focus on original, high quallity content. However, this was implemented only partially. All syndicated content is set to NOINDEX (they thing that it is good for user to see standard news + original HQ content). Of course it didn't help at all. No change after months. If I would be Google, I would definitely penalize website that has 80% of the content set to NOINDEX a it is duplicated. I would consider this site "cheating" and not worthy for the user. What do you think about this "theory"? What would you do? Thank you for your help!
White Hat / Black Hat SEO | | Lukas_TheCurious0 -
20-30% of our ecommerce categories contain no extra content, could this be a problem
Hello, About 20-30% of our ecommerce categories have no content beyond the products that are in them. Could this be a problem with Panda? Thanks!
White Hat / Black Hat SEO | | BobGW0 -
Noindexing Thin Content Pages: Good or Bad?
If you have massive pages with super thin content (such as pagination pages) and you noindex them, once they are removed from googles index (and if these pages aren't viewable to the user and/or don't get any traffic) is it smart to completely remove them (404?) or is there any valid reason that they should be kept? If you noindex them, should you keep all URLs in the sitemap so that google will recrawl and notice the noindex tag? If you noindex them, and then remove the sitemap, can Google still recrawl and recognize the noindex tag on their own?
White Hat / Black Hat SEO | | WebServiceConsulting.com0 -
SEO best practice: Use tags for SEO purpose? To add or not to add to Sitemap?
Hi Moz community, New to the Moz community and hopefully first post/comment of many to come. I am somewhat new to the industry and have a question that I would like to ask and get your opinions on. It is most likely something that is a very simple answer, but here goes: I have a website that is for a local moving company (so small amounts of traffic and very few pages) that was built on Wordpress... I was told when I first started that I should create tags for some of the cities serviced in the area. I did so and tagged the first blog post to each tag. Turned out to be about 12-15 tags, which in turn created 12-15 additional pages. These tags are listed in the footer area of each page. There are less than 20 pages in the website excluding the tags. Now, I know that each of these pages are showing as duplicate content. To me, this just does not seem like best practices to me. For someone quite new to the industry, what would you suggest I do in order to best deal with this situation. Should I even keep the tags? Should I keep and not index? Should I add/remove from site map? Thanks in advance for any help and I look forward to being a long time member of SEOMoz.
White Hat / Black Hat SEO | | BWrightTLM0 -
Using Redirects To Avoid Penalties
A quick question, born out of frustration! If a webpage has been penalised for unnatural links, what would be the effects of moving that page to a new URL and setting up a 301 redirect from the old penalised page to the new page? Will Google treat the new page as ‘non-penalised’ and restore your rankings? It really shouldn’t work, but I’m convinced (although not certain) that our clients competitor has done this, with great effect! I suppose you could also achieve this using canonicalisation too! Many thanks in advance, Lee.
White Hat / Black Hat SEO | | Webpresence0 -
Shadow Pages for Flash Content
Hello. I am curious to better understand what I've been told are "shadow pages" for Flash experiences. So for example, go here:
White Hat / Black Hat SEO | | mozcrush
http://instoresnow.walmart.com/Kraft.aspx#/home View the page as Googlebot and you'll see an HTML page. It is completely different than the Flash page. 1. Is this ok?
2. If I make my shadow page mirror the Flash page, can I put links in it that lead the user to the same places that the Flash experience does?
3. Can I put "Pinterest" Pin-able images in my shadow page?
3. Can a create a shadow page for a video that has the transcript in it? Is this the same as closed captioning? Thanks so much in advance, -GoogleCrush0 -
What does Youtube Consider Duplicate content and will it effect my ranking/traffic?
What does youtube consider duplicated content? If I have a power point type video that I already have on youtube and I want to change the beginning and end call to action, would that be considered duplicate content? If yes then how would this effect my ranking/youtube page. Will it make a difference if I have it embedded on my blog?
White Hat / Black Hat SEO | | christinarule0