Identifying Duplicate Content
-
Hi looking for tools (beside Copyscape or Grammarly) which can scan a list of URLs (e.g. 100 pages) and find duplicate content quite quickly.
Specifically, small batches of duplicate content, see attached image as an example.
Does anyone have any suggestions?
Cheers.
-
I'm going to recommend Screaming Frog here. Run a scan of your site and then filter it by duplicate title tags, duplicate meta descriptions, and (my favorite) word count. Usually I don't need to go any further than duplicate title tags.
There's also www.siteliner.com. I've used that regularly and it has been tremendously helpful for pages that have duplicate content in the body but not in the META.
Finally, Google Search Console. Go to Search Appearance and click on HTML Improvements. You can also find all your duplicate title tags there, which should help you identify duplicate content easily.
-
Exactly what i was looking for!
Thankyou.
-
Hi Jay! Great question here.
First of all, kudos to you for looking to kill duplicate content with fire. As a marketer but foremost a writer, I am all about great writing and not doing this duplicated/spun stuff to try to rank. It won't convert anyways.
I put out a call to my followers on Twitter and one of them recommended https://www.killduplicate.com/en. I haven't personally used it, but give it a shot! It comes highly recommended.
Hope that's helpful!
John
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Wondering if creating 256 new pages would cause duplicate content issues
I just completed a long post that reviews 16 landing page tools. I want to add 256 new pages that compare each tool against each other. For example: Leadpages vs. Instapage Leadpages vs. Unbounce Instapage vs. Unbounce, etc Each page will have one product's information on the left and the other on the right. So each page will be a unique combination BUT the same product information will be found on several other pages (its other comparisons vs the other 15 tools). This is because the Leadpages comparison information (a table) will be the same no matter which tool it is being compared against. If my math is correct, this will create 256 new pages - one for each combination of the 16 tools against each other! My site now is new and only has 6 posts/pages if that matters. Want to make sure I don't create a problem early on...Any thoughts?
Intermediate & Advanced SEO | | martechwiz0 -
Duplicate page content errors for Web App Login
Hi There I have 6 duplicate content errors, but they are for the WebApp login from our website. I have put a Noindex on the Sitemap to stop google from indexing them to see if that would work. But it didn't. These links as far as I can see are not even on the website www.skemaz.net, but are links beyond the website and on the Web App itself eg : <colgroup><col width="529"></colgroup>
Intermediate & Advanced SEO | | Skemazer
| http://login.skemaz.net |
| http://login.skemaz.net/LogIn?ReturnUrl=%2Fchangepassword |
| http://login.skemaz.net/Login |
| http://login.skemaz.net/LogIn?ReturnUrl=%2FHome | Any suggestions would be greatly appreciated. Kind regards Sarah0 -
Duplicate content on recruitment website
Hi everyone, It seems that Panda 4.2 has hit some industries more than others. I just started working on a website, that has no manual action, but the organic traffic has dropped massively in the last few months. Their external linking profile seems to be fine, but I suspect usability issues, especially the duplication may be the reason. The website is a recruitment website in a specific industry only. However, they posts jobs for their clients, that can be very similar, and in the same time they can have 20 jobs with the same title and very similar job descriptions. The website currently have over 200 pages with potential duplicate content. Additionally, these jobs get posted on job portals, with the same content (Happens automatically through a feed). The questions here are: How bad would this be for the website usability, and would it be the reason the traffic went down? Is this the affect of Panda 4.2 that is still rolling What can be done to resolve these issues? Thank you in advance.
Intermediate & Advanced SEO | | iQi0 -
Semi-duplicate content yet authoritative site
So I have 5 real estate sites. One of those sites is of course the original, and it has more/better content on most of the pages than the other sites. I used to be top ranked for all of the subdivsion names in my town. Then when I did the next 2-4 sites, I had some sites doing better than others for certain keywords, and then I have 3 of those sites that are basically the same URL structures (besides the actual domain) and they aren't getting fed very many visits. I have a couple of agents that work with me that I loaned my sites to to see if that would help since it would be a different name. My same youtube video is on each of the respective subdivision pages of my site and theirs. Also, their content is just rewritten content from mine about the same length of content. I have looked over and seen a few of my competitors who only have one site and their URL structures arent good at all, and their content isn't good at all and a good bit of their pages rank higher than my main site which is very frustrating to say the least since they are actually copy cats to my site. I sort of started the precedent of content, mapping the neighborhood, how far that subdivision is from certain landmarks, and then shot a video of each. They have pretty much done the same thing and are now ahead of me. What sort of advice could you give me? Right now, I have two sites that are almost duplicate in terms of a template and same subdivsions although I did change the content the best I could, and that site is still getting pretty good visits. I originally did it to try and dominate the first page of the SERPS and then Penguin and Panda came out and seemed to figure that game out. So now, I would still like to keep all the sites, but I'm assuming that would entail making them all unique, which seems to be tough seeing as though my town has the same subdivisions. Curious as to what the suggestions would be, as I have put a lot of time into these sites. If I post my site will it show up in the SERPS? Thanks in advance
Intermediate & Advanced SEO | | Veebs0 -
Scraped content ranking above the original source content in Google.
I need insights on how “scraped” content (exact copy-pasted version) rank above the original content in Google. 4 original, in-depth articles published by my client (an online publisher) are republished by another company (which happens to be briefly mentioned in all four of those articles). We reckon the articles were re-published at least a day or two after the original articles were published (exact gap is not known). We find that all four of the “copied” articles rank at the top of Google search results whereas the original content i.e. my client website does not show up in the even in the top 50 or 60 results. We have looked at numerous factors such as Domain authority, Page authority, in-bound links to both the original source as well as the URLs of the copied pages, social metrics etc. All of the metrics, as shown by tools like Moz, are better for the source website than for the re-publisher. We have also compared results in different geographies to see if any geographical bias was affecting results, reason being our client’s website is hosted in the UK and the ‘re-publisher’ is from another country--- but we found the same results. We are also not aware of any manual actions taken against our client website (at least based on messages on Search Console). Any other factors that can explain this serious anomaly--- which seems to be a disincentive for somebody creating highly relevant original content. We recognize that our client has the option to submit a ‘Scraper Content’ form to Google--- but we are less keen to go down that route and more keen to understand why this problem could arise in the first place. Please suggest.
Intermediate & Advanced SEO | | ontarget-media0 -
Duplicate content, the distrubutors are copying the content of the manufacturer
Hi everybody! While I was checking all points of the Technical Site Audit Checklist 2015 (great checklist!), I found that the distrubutors of my client are copying part of the content to add it in their websites. When I take a content snippet, and put it in quotes and search for it I get four or five sites that have copied the content. They are distributors of my client. The first result is still my client (the manufacturer), but... should I recommend any action to this situation. We don't want to bother the distributors with obstacles. This situation could be a problem or is it a common situation and Google knows perfectly where the content is comming from? Any recommendation? Thank you!
Intermediate & Advanced SEO | | teconsite0 -
Duplicate Content for Deep Pages
Hey guys, For deep, deep pages on a website, does duplicate content matter? The pages I'm talk about are image pages associated with products and will never rank in Google which doesn't concern me. What I'm interested to know though is whether the duplicate content would have an overall effect on the site as a whole? Thanks in advance Paul
Intermediate & Advanced SEO | | kevinliao1 -
Joomla duplicate content
My website report says http://www.enigmacrea.com/diseno-grafico-portafolio-publicidad and http://www.enigmacrea.com/diseno-grafico-portafolio-publicidad?limitstart=0 Has the same content so I have duplicate pages the only problem is the ?limitstart=0 How can I fix this? Thanks in advance
Intermediate & Advanced SEO | | kuavicrea0