Help finding website content scraping
-
Hi,
I need a tool to help me review sites that are plagiarising or directly copying content from my site. But the tools that I'm aware of, such as Copyscape, appear to work with individual URLs and not a root domain. That's great if you have a particular post or page you want to check. But in this case, some sites are scraping thousands of product pages, so I need to submit the root domain rather than an individual URL.
In some cases, other sites are being listed in SERPs above or even instead of our site for product search terms. But so far I have only stumbled across these cases, rather than proactively researching offending sites.
So I want to enter my root domain and have the tool review all my internal site pages, then report the other domains where an individual page has a certain amount of duplicated copy. It would work in the same way that Moz crawls a site for internal duplicate pages, but externally: I need a list of duplicate content by domain and URL, so that I can contact the offending sites to request that they remove the content, and send the list to Google as evidence if they don't.
Any help would be gratefully appreciated.
Terry
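A minimal sketch of the comparison step described above, not a finished tool. It assumes the text of your own pages and of the suspected copies has already been fetched by some other means (a crawl of your sitemap, search results for distinctive sentences, and so on). The function names, the 5-word shingle size and the 0.5 threshold are all illustrative assumptions. It scores each pair of pages by the share of your page's word sequences that reappear on the other page, then groups the matches by external domain.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set, Tuple
from urllib.parse import urlparse

Hit = Tuple[str, str, float]  # (my_url, other_url, score)


def shingles(text: str, k: int = 5) -> Set[Tuple[str, ...]]:
    """Return the set of overlapping k-word sequences in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(mine: Set[Tuple[str, ...]], theirs: Set[Tuple[str, ...]]) -> float:
    """Fraction of my page's shingles that also appear on the other page."""
    if not mine:
        return 0.0
    return len(mine & theirs) / len(mine)


def find_copies(my_pages: Dict[str, str],
                candidate_pages: Dict[str, str],
                threshold: float = 0.5,
                k: int = 5) -> List[Hit]:
    """Compare every candidate page against every page of mine.

    Both arguments map URL -> plain page text (already fetched).
    Returns the pairs scoring at or above the threshold, highest first.
    """
    my_sets = {url: shingles(text, k) for url, text in my_pages.items()}
    hits: List[Hit] = []
    for other_url, other_text in candidate_pages.items():
        other_set = shingles(other_text, k)
        for my_url, my_set in my_sets.items():
            score = containment(my_set, other_set)
            if score >= threshold:
                hits.append((my_url, other_url, score))
    return sorted(hits, key=lambda h: h[2], reverse=True)


def group_by_domain(hits: List[Hit]) -> Dict[str, List[Hit]]:
    """Group matches by the external domain hosting the copied text."""
    grouped: Dict[str, List[Hit]] = defaultdict(list)
    for hit in hits:
        grouped[urlparse(hit[1]).netloc].append(hit)
    return dict(grouped)
```

Containment is used rather than a symmetric similarity score because a scraper often wraps the copied copy in its own header, footer and ads. The question here is how much of your page shows up on theirs, not how alike the two pages are overall. Comparing every pair of pages is slow for thousands of product pages, so a real tool would need an index over the shingles.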
Related Questions
-
Content placement in HTML and display
Does Google penalize content that is placed at the top of the page in the HTML but displayed to users at the bottom of the page? This is done with CSS. Thank you in advance for your feedback!
White Hat / Black Hat SEO | | Aerocasillas0 -
Hreflang/Canonical Inquiry for Website with 29 different languages
Hello, I have a website (www.example.com) that has 29 subdomains (es.example.com, vi.example.com, it.example.com, etc.). Each subdomain has the exact same content for each page, completely translated into its respective language. I currently do not have any hreflang/canonical tags set up. I was recently told that this (below) is the correct way to set these tags up: for each subdomain page (es.example.com/blah-blah for this example), I need to place an hreflang tag pointing to the page itself (es.example.com/blah-blah), in addition to one for each of the other 28 subdomains that have that page (it.example.com/blah-blah, etc.). In addition, I need to place a canonical tag pointing to the main www. version of the website. So I would have 29 hreflang tags, plus a canonical tag. When I brought this to a friend's attention, he said that pointing the canonical tag to the main www. version would cause the subdomains to drop out of the SERPs in their respective country search engines, which I obviously wouldn't want. I've tried to read articles about this, but I always end up hitting a wall and confusing myself further. Can anyone help? Thanks!
White Hat / Black Hat SEO | | juicyresults0 -
How do you change the 6 links under your website in Google?
Hello everyone, I have no idea how to ask this question, so I'm going to give it a shot and hopefully someone can help me! My company is called Eteach, so when you type Eteach into Google, we come up in the top position (phew!), but there are 6 links that appear underneath it (I've added a picture to show what I mean). How do you change these links? I don't even know what to call them, so if there is a particular name for these then please let me know! They seem to be an organic ranking rather than PPC, but if I'm wrong then do correct me! Thanks! zorIsxH.jpg
White Hat / Black Hat SEO | | Eteach_Marketing0 -
NEED HELP, Figuring Out Ranking Drop!
Hello, I need help from somebody, anybody, in trying to figure out why my site dropped so much for the keywords “wildblue” and “wild blue”. In the week of Feb. 13, 2012, my website dropped from the middle of the first page to the fourth page, and then a week or two later dropped completely out of the index (or at least off the top 5 pages). We do not engage in any deceptive practices. Our entire website is centered around this keyword, and we are very relevant and have informative, continually updated content for visitors. I thought at first we had been hit by Panda, but our overall organic traffic has not decreased; it has actually been steadily increasing compared to the same time last year. I have tried over the past several months to get us back up, or at least figure out what happened, with no luck. If anyone could advise me on what might have happened or how to correct it, or even has any ideas about how I could figure out what happened, I would greatly appreciate it. Website is: http://www.mybluedish.com
White Hat / Black Hat SEO | | MyNet0 -
Competitors Developing Spammy Links For My Website
Well guys, there are a lot of discussions in almost all the communities, blogs and forums about the post-Penguin impact. Google says that if it finds you're involved in any link building activities, it may penalize you. People out there have already started their developed links. But what if our competitors had developed those links? Initially it was okay to develop one-way links, and I even developed a lot of quality links, albeit deliberately. Around 95% of the links were placed manually, in return for some favor or money, but all the links look natural. Most of the links I developed through content only, like articles, blog comments, PR submissions, etc. Now I'm really skeptical about the quality (after hearing a lot of talk and reading any number of posts). Now, could I also submit my competitor's websites to 1000 topic directories (obviously not to any spammy directory), and would it affect that website adversely? What if I spun an existing piece of content, submitted it to 500 article directories and gave a backlink to the competitor's site using only one anchor text (which is obviously the main keyword, the highest sales-generating keyword)? I look forward to some experts' comments.
White Hat / Black Hat SEO | | Khem_Raj70 -
How to Not Scrape Content, but Still Be a Hub
Hello Seomoz members. I'm relatively new to SEO, so please forgive me if my questions are a little basic. One of the sites I manage is GoldSilver.com. We sell gold and silver coins and bars, but we also have a very important news aspect to our site. For about 2-3 years now we have been a major hub as a gold and silver news aggregator. About 1.5 years ago (before we knew much about SEO), we switched from linking to the original news site to scraping their content and putting it on our site. The chief reason for this was that users would click outbound to read an article, see an ad for a competitor, then buy elsewhere. We were trying to avoid this (a relatively stupid decision in hindsight). We have realized that the search engines are penalizing us, which I don't blame them for, for having this scraped content on our site. So I'm trying to figure out how to move forward from here. We would like to remain a hub for news related to gold and silver and not be penalized by search engines, but we also need to sell bullion and would like to avoid losing clients to competitors through ads on the news articles. One of the solutions we are thinking about is perhaps using an iframe to display the original URL, but within our experience. An example is how trap.it does this (see attached picture). This way we can still control the experience somewhat, while still remaining a hub. Thoughts? Thank you, nick 3dLVv
White Hat / Black Hat SEO | | nwright0 -
Help for a complete SEO newbie!
Hi all, I've just joined seomoz today to try and further my very young education on SEO. My major problem is that I need my site to rank high in local search, but I feel that none of the customers read much of the content. I am a landscaper, and I feel they just search "landscaping in Newcastle" and are immediately looking for a contact number to arrange a free estimate. I don't do any online sales; the site is just there to generate leads. I've spent a lot of time building a better site than my local competitors, but they still outrank me on a lot of keywords, e.g. "Driveways in Gateshead". My question is: do I keep adding more and more content, hoping this will work long term, or do I build links with anchor text etc., or both? I cannot believe they still outrank me when I feel I have more links, more anchor text and a load more original content and images. I think it may be that my site is still under 1 year old. I feel I am bouncing from content to link building and then trying something else, without any real knowledge of what I should be doing or what should be the priority at this young stage for my site. I have managed to get on page 1 of Google for most of my keywords in local searches (obviously not national), but still feel it's been more down to luck and effort than actually knowing what I am doing when it comes to on-site and off-site optimization. Any help, tips etc. would be greatly appreciated. Many thanks, John
White Hat / Black Hat SEO | | totaldriveways0 -
Multiple domains with the same content?
I have multiple websites with the same content, such as http://www.example.com, http://www.example.org and so on. My primary URL is http://www.infoniagara.com and I have also placed a 301 on the .org. Is that enough to keep my example.org site from being indexed by Google and other search engines? The example.org site also has lots of links to my old HTML pages (now removed). Should I change those links too, or will the 301 redirection solve all such issues (page not found/crawl errors) with my old webpages? I would welcome good SEO practices for maintaining multiple domains. Thanks and regards
White Hat / Black Hat SEO | | VipinLouka780