Duplicate Content - Bulk analysis tool?
-
Hi
I wondered if there's a tool to analyse duplicate content, either within your own site or on external sites, where you can upload the URLs you want to check in bulk?
I used Copyscape a while ago, but I don't remember it having a bulk feature.
Thank you!
-
Great thank you!
I'll give both a go!
-
Great thanks
Yes, I use Screaming Frog for duplicate titles and descriptions, but my question was about the actual page content. So yes, it's to see whether other sites copy our content, but also to see whether we need to update our product content, as some products are very similar.
I'll check the batch process on Copyscape, thanks!
-
I have not used this tool in this way, but I have used it for other crawler projects related to content cleanup, and it is rock solid. They have been very responsive to my questions about using the software. http://urlprofiler.com/
Duplicate content search is the next project on my list; here is how they do it.
http://urlprofiler.com/blog/duplicate-content-checker/
You let URL Profiler crawl the section of your site that is most likely to be copied (say, your blog) and you tell URL Profiler which section of your HTML to compare against (i.e. the content section rather than the header or footer). URL Profiler then uses proxies (you have to buy the proxies) to perform Google searches on sentences from your content. It crawls those results to see whether any site in the Google SERPs has sentences from your content word for word (or pretty close).
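To make the last step concrete, here is a rough sketch of the "word for word" comparison only. It is not URL Profiler's actual code, and it does no crawling or Google searching; it assumes you already have the text of your page and the text of a suspect page as plain strings. The function names and the `min_words` cut-off are made up for illustration.

```python
import re


def normalize(text):
    """Lowercase and collapse anything that is not a letter or digit to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def split_sentences(text, min_words=6):
    """Naive sentence splitter; drops short sentences that could match by chance."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if len(p.split()) >= min_words]


def copied_fraction(original, candidate, min_words=6):
    """Fraction of the original's sentences found word for word in the candidate."""
    sentences = split_sentences(original, min_words)
    if not sentences:
        return 0.0
    # Pad with spaces so matches start and end on whole words.
    haystack = " " + normalize(candidate) + " "
    hits = sum(1 for s in sentences if " " + normalize(s) + " " in haystack)
    return hits / len(sentences)


# Hypothetical example texts, not taken from any real site.
ours = (
    "Our handmade leather diary refills fit every personal size organiser. "
    "They are printed on thick cream paper that resists ink bleed. Buy now."
)
theirs = (
    "Welcome! our handmade leather diary refills fit every personal-size organiser. "
    "Totally different text here about boats and barges for sale."
)
print(copied_fraction(ours, theirs))
```

A score near 1.0 would suggest most of your sentences appear on the other page; where to draw the line is a judgment call for your own content.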
I have played with Copyscape, but my markets are too niche for it to work for me. The logic behind URL Profiler's approach is that you are searching the database that matters most: Google.
Good luck!
-
I believe you might be able to use List Mode in Screaming Frog to accomplish this; however, it ultimately depends on what kind of duplicate content you want to check for. Do you simply want to find duplicate titles or duplicate descriptions? Or do you want to find pages with sufficiently similar text as to warrant concern?
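For the "sufficiently similar text" case across your own list of URLs, a minimal sketch of one common approach is below: compare pages by overlapping three-word runs (shingles) and flag pairs above a threshold. This is not how Screaming Frog or Copyscape work internally; the function names, the 0.5 threshold, and the example paths are assumptions, and it expects you to have already extracted each page's body text.

```python
import re
from itertools import combinations


def shingles(text, size=3):
    """Set of overlapping word runs of the given size, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Shared shingles divided by total distinct shingles (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_pairs(pages, threshold=0.5, size=3):
    """pages maps URL -> body text; returns (url_a, url_b, score), highest score first."""
    sets = {url: shingles(text, size) for url, text in pages.items()}
    results = []
    for a, b in combinations(sorted(sets), 2):
        score = jaccard(sets[a], sets[b])
        if score >= threshold:
            results.append((a, b, score))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Hypothetical pages, for illustration only.
pages = {
    "/a": "fortnight view diary refill for personal size organisers printed on cream paper",
    "/b": "week view diary refill for personal size organisers printed on cream paper",
    "/c": "second hand sailing boats and barges for sale worldwide",
}
result = similar_pairs(pages)
for url_a, url_b, score in result:
    print(url_a, url_b, round(score, 2))
```

Comparing every pair gets slow on very large URL lists, so for thousands of pages you would probably want to check one product category at a time.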
== Ooops! ==
It didn't occur to me that you were more interested in duplicate content caused by other sites copying your content rather than duplicate content among your list of URLs.
Copyscape does have a "Batch Process" tool but it is only available to paid subscribers. It does work quite nicely though.