Help finding website content scraping
Hi,
I need a tool to help me review sites that are plagiarising or directly copying content from my site. But the tools I'm aware of, such as Copyscape, appear to work with individual URLs and not a root domain. That's great if you have a particular post or page you want to check. But in this case, some sites are scraping 1000s of product pages, so I need to submit the root domain rather than an individual URL.
In some cases, other sites are being listed in SERPs above or even instead of our site for product search terms. But so far I have only stumbled across these sites, rather than proactively researching offenders.
So I want to enter my root domain and have the tool review all my internal site pages, then report the other domains where an individual page contains a certain amount of duplicated copy. It would work in the same way as Moz crawls a site for internal duplicate pages, but externally: I need a list of duplicate content by domain & URL, so that I can contact the offending sites to request they remove the content and, if they don't, send the list to Google as evidence.
Any help would be greatly appreciated.
Terry
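For anyone wondering what the comparison step of such a tool might look like, here is a minimal Python sketch. It is not how Copyscape or Moz work internally. It assumes you have already extracted the text of your own pages and of some candidate pages on other domains (how those candidates are found is out of scope here), and it scores each pair by the share of your page's 5-word phrases that also appear on the other page. The function names, the 0.5 threshold, and the example URLs are all made up for illustration.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse


def shingles(text, k=5):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(own_text, other_text, k=5):
    """Fraction of our page's shingles that also appear on the other page."""
    own = shingles(own_text, k)
    if not own:
        return 0.0
    return len(own & shingles(other_text, k)) / len(own)


def duplicate_report(own_pages, candidate_pages, threshold=0.5, k=5):
    """Group suspected copies by external domain.

    own_pages and candidate_pages map URL -> extracted page text.
    Returns {domain: [(own_url, copy_url, score), ...]}.
    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    report = defaultdict(list)
    for own_url, own_text in own_pages.items():
        for other_url, other_text in candidate_pages.items():
            score = containment(own_text, other_text, k)
            if score >= threshold:
                domain = urlparse(other_url).netloc
                report[domain].append((own_url, other_url, round(score, 2)))
    return dict(report)


if __name__ == "__main__":
    # Hypothetical example data; real input would come from a crawl.
    own = {
        "https://example.com/p/1": "The blue widget is hand finished in our "
        "workshop and ships within two working days",
    }
    candidates = {
        "https://scraper.example/item/1": "Buy now! The blue widget is hand "
        "finished in our workshop and ships within two working days. Free returns.",
        "https://other.example/page": "A completely different description of "
        "some other product entirely",
    }
    for domain, hits in duplicate_report(own, candidates).items():
        print(domain, hits)
```

The output is grouped by domain and lists each matching pair of URLs, which is the shape of list described above for contacting site owners. Comparing every page against every candidate gets slow with thousands of product pages, so a real tool would need some indexing on top of this.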