What tools do you use to find scraped content?
-
This hasn’t been an issue for our company so far, but I like to be proactive. What tools do you use to find sites that may have scraped your content?
Looking forward to your suggestions.
Vic
-
Oh, this belongs to a different thread: http://moz.com/community/q/chinese-site-ranking-for-our-brand-name-possible-hack
-
Is this part of the original conversation, or something else? Which sites are these?
-
I'm not sure we have been scraped as such, though, because the site in question has different content.
It looks as though the offending site has hacked another site (which redirects to the offending site), and the hacked site is now ranking for our brand name. Our homepage has lost all the rankings it had (our category and product pages seem fine) and has essentially disappeared from the results.
Can anyone else shed any light?
-
Siteliner (Copyscape's big brother) is really great and is what we use first (plus I have a bookmarklet for it that makes it faster and easier to use).
We also use Linda's method of searching for a bit of content in quotes. It's the easiest way to show an ecommerce client how much work they're going to require: take three product descriptions into Google, watch the magic, and explain that the same thing would happen across all 15,000 products.
-
I spot check on a regular basis by taking a unique chunk out of a post, putting it in quotes, and doing a Google search on it. It's not comprehensive, but it is free. [And the main problems we have had with scrapers have been with sites that have taken huge portions of our content, not just an article or two, and a spot check roots those out.]
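For anyone who wants to speed up the copy-and-paste step, here is a minimal sketch of the quoted-search spot check described above. It only builds the search URL for a chunk of a post; you still open it in a browser and look at the results yourself. The function name `exact_phrase_query_url`, its parameters, and the sample sentence are all made up for illustration, and the URL format is simply the standard `q=` query string for a Google search.

```python
from urllib.parse import urlencode


def exact_phrase_query_url(text, start_word=0, num_words=12):
    """Build a Google search URL for an exact-phrase match on a chunk of text.

    Hypothetical helper: takes `num_words` consecutive words from `text`,
    starting at `start_word`, and wraps them in double quotes.
    """
    words = text.split()
    chunk = " ".join(words[start_word:start_word + num_words])
    if not chunk:
        raise ValueError("no words in the selected range")
    # Double quotes around the chunk request an exact-phrase match.
    return "https://www.google.com/search?" + urlencode({"q": f'"{chunk}"'})


if __name__ == "__main__":
    # Made-up sample text standing in for a paragraph from your own post.
    post = "We hand stitch every leather wallet in our Portland workshop"
    print(exact_phrase_query_url(post, num_words=5))
```

Picking a chunk from the middle of a post (a non-zero `start_word`) tends to be more useful than the opening sentence, since intros are more likely to be quoted legitimately.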
-
Thanks, Chris & Jonathan. I will look into Copyscape. Good stuff!
-
Yep, Copyscape is what I use. I have a WordPress plugin that uses the Copyscape API, and I just check my main content every month or so with a single click.
-
Copyscape works well for us. You can scan a couple of pages for free, and then it's $0.05/page after that.