Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Duplicate Content - Bulk analysis tool?
-
Hi
I wondered if there's a tool to analyse duplicate content, within your own site or on external sites, that lets you upload the URLs you want to check in bulk?
I used Copyscape a while ago, but I don't remember it having a bulk feature.
Thank you!
-
Great, thank you!
I'll give both a go!
-
Great, thanks.
Yes, I use Screaming Frog for this, but I wanted to look at the actual page content: both to see if other sites copy our content and to see whether we need to update our product content, as some products are very similar.
I'll check the batch process on Copyscape, thanks!
-
I have not used this tool in this way, but have used it for other crawler projects related to content clean up and it is rock solid. They have been very responsive to me on questions related to use of the software. http://urlprofiler.com/
Duplicate content search is the next project on my list. Here is how they do it:
http://urlprofiler.com/blog/duplicate-content-checker/
You let URL Profiler crawl the section of your site that is most likely to be copied (say, your blog) and you tell URL Profiler which section of your HTML to compare against (i.e. the content section rather than the header or footer). URL Profiler then uses proxies (you have to buy the proxies) to perform Google searches on sentences from your content. It crawls those results to see if there is a site in the Google SERPs that has sentences from your content word for word (or pretty close).
I have played with Copyscape, but my markets are too niche for it to work for me. URL Profiler's logic is that you are searching the database that matters most: Google.
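As a rough illustration of the comparison step only (this is not URL Profiler's actual implementation, and it does no Google searching or proxy handling), here is a minimal Python sketch that splits your content into sentences and reports which ones appear word for word in another page's text. The function names, the six-word minimum, and the sample text are all assumptions made up for the example.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and collapse anything that is not a letter or digit into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def split_sentences(text: str, min_words: int = 6) -> list[str]:
    """Split on sentence-ending punctuation; keep only sentences long enough to be distinctive."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if len(p.split()) >= min_words]


def copied_sentences(own_text: str, candidate_text: str) -> list[str]:
    """Return the sentences from own_text that appear word for word in candidate_text."""
    # Pad with spaces so matches start and end on whole-word boundaries.
    haystack = f" {normalize(candidate_text)} "
    return [s for s in split_sentences(own_text) if f" {normalize(s)} " in haystack]


if __name__ == "__main__":
    # Hypothetical sample text, not taken from any real site.
    own = (
        "Our handmade oak tables are built to last a lifetime. "
        "Short one. "
        "Each table is finished with three coats of natural oil."
    )
    other = "Welcome! Our handmade oak tables are built to last a lifetime. Buy now."
    print(copied_sentences(own, other))
```

In practice you would feed `own_text` from the content section of each of your pages and `candidate_text` from whatever pages your search step turns up; exact matching like this will miss the "pretty close" cases, which need a fuzzier comparison.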
Good luck!
-
I believe you might be able to use List Mode in Screaming Frog to accomplish this; however, it ultimately depends on what your goal is in checking for duplicate content. Do you simply want to find duplicate titles or duplicate descriptions? Or do you want to find pages with sufficiently similar text as to warrant concern?
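For the "sufficiently similar text" case, one common approach (offered here as a sketch, not as how Screaming Frog or any other tool works internally) is to break each page's text into overlapping word triples, or "shingles", and score every pair of pages by Jaccard similarity. The URLs, sample text, shingle size, and 0.5 threshold below are all assumptions chosen for illustration; it also assumes you have already extracted the body text for each URL.

```python
import re
from itertools import combinations


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_pairs(pages: dict[str, str], threshold: float = 0.5) -> list[tuple[str, str, float]]:
    """Compare every pair of pages and return those scoring at or above the threshold, highest first."""
    sets = {url: shingles(text) for url, text in pages.items()}
    pairs = []
    for u, v in combinations(sets, 2):
        score = jaccard(sets[u], sets[v])
        if score >= threshold:
            pairs.append((u, v, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical URLs and product copy, for illustration only.
    pages = {
        "/a": "red widget with steel frame and rubber grip",
        "/b": "blue widget with steel frame and rubber grip",
        "/c": "completely different text about garden furniture care",
    }
    for u, v, score in similar_pairs(pages):
        print(f"{u} vs {v}: {score:.2f}")
```

Comparing every pair gets slow once you have thousands of URLs, so for a large product catalogue you would probably compare within categories rather than across the whole site.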
== Ooops! ==
It didn't occur to me that you were more interested in duplicate content caused by other sites copying your content rather than duplicate content among your list of URLs.
Copyscape does have a "Batch Process" tool but it is only available to paid subscribers. It does work quite nicely though.