Identifying Duplicate Content
-
Hi, I'm looking for tools (besides Copyscape or Grammarly) that can scan a list of URLs (e.g. 100 pages) and find duplicate content quickly.
Specifically, small batches of duplicate content; see the attached image for an example.
Does anyone have any suggestions?
Cheers.
-
I'm going to recommend Screaming Frog here. Run a scan of your site and then filter it by duplicate title tags, duplicate meta descriptions, and (my favorite) word count. Usually I don't need to go any further than duplicate title tags.
There's also www.siteliner.com. I've used that regularly and it has been tremendously helpful for pages that have duplicate content in the body but not in the META.
Finally, Google Search Console. Go to Search Appearance and click on HTML Improvements. You can also find all your duplicate title tags there, which should help you identify duplicate content easily.
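If you would rather script the body-text check yourself, here is a minimal Python sketch. It assumes you have already fetched each page and stripped it down to its visible text. It splits every page into overlapping 8-word "shingles" and reports which pairs of URLs share any of them, which is one way to surface small repeated blocks. The function names and the shingle size of 8 are illustrative choices, not taken from any of the tools mentioned above.

```python
import re
from itertools import combinations


def shingles(text, size=8):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Pages shorter than `size` words produce an empty set.
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def shared_blocks(pages, size=8, min_shared=1):
    """Map each pair of URLs to the shingles they have in common.

    pages: dict of URL -> visible body text (HTML already removed).
    Only pairs sharing at least `min_shared` shingles are reported.
    """
    sets = {url: shingles(text, size) for url, text in pages.items()}
    report = {}
    for a, b in combinations(sorted(sets), 2):
        common = sets[a] & sets[b]
        if len(common) >= min_shared:
            report[(a, b)] = common
    return report
```

A smaller `size` catches shorter repeated passages but also flags more boilerplate such as navigation and footer text, so it is worth stripping those before comparing.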
-
Exactly what I was looking for!
Thank you.
-
Hi Jay! Great question here.
First of all, kudos to you for looking to kill duplicate content with fire. As a marketer, but first and foremost a writer, I am all about great writing and not using duplicated/spun content to try to rank. It won't convert anyway.
I put out a call to my followers on Twitter and one of them recommended https://www.killduplicate.com/en. I haven't personally used it, but give it a shot! It comes highly recommended.
Hope that's helpful!
John
Related Questions
-
International SEO and duplicate content: what should I do when hreflangs are not enough?
Hi, a follow-up question to one I asked a couple of months ago: my hreflangs have now been in place for almost 2 months. Google recognises them well and GSC is clean (no hreflang errors). Though I've seen some positive changes, I'm still far from sorting out the duplicate content issue completely, and some entire sub-folders remain hidden from the SERP.
I believe this happens for two reasons:
1. Fully mirrored content: as per the link to my previous question above, some parts of the site I'm working on are 100% similar. Quite a "gravity issue" here, as there is nothing I can do to fix the site architecture or to get bespoke content in place.
2. Sub-folder "authority": I'm guessing that Google prefers some sub-folders over others due to their legacy traffic/history. This means that even with hreflangs in place, the older sub-folder would rank over the right one because Google believes it provides better results to its users.
Two questions follow from these reasons:
1. Is the latter correct? Am I guessing correctly about sub-folder "authority" (if such a thing exists), or am I simply wrong?
2. Can I solve this using canonical tags?
Instead of trying to fix and "promote" hidden sub-folders, I'm thinking of reinforcing the results I'm getting from the stronger sub-folders.
For example, if a user based in Belgium is Googling something relating to my site, the site.com/fr/ sub-folder shows up instead of the site.com/be/fr/ sub-sub-folder.
Or if someone based in Belgium is searching in Dutch, they get site.com/nl/ results instead of the site.com/be/nl/ sub-sub-folder. Therefore, I could canonicalise /be/fr/ to /fr/ and do something similar for the second one.
I'd prefer traffic to arrive at the right part of the site for tracking and analytics reasons. However, instead of trying to move mountains by changing Google's behaviour (if I ever could), I'm thinking of encouraging the current flow, also because it's not completely wrong: it brings traffic to pages in the correct language no matter what.
That second question is the main reason I'm asking for the Moz community's advice: am I going to damage the site badly by using canonical tags that way? Thank you so much!
Intermediate & Advanced SEO | | GhillC0 -
SEO: How to change page content + shift its original content to another page at the same time?
Hello, I want to replace the content of one page of our website (already indexed) and move its original content to another page. How can I do this without problems such as penalties?
Current situation:
Page A
URL: example.com/formula-1
Content: ContentPageA
Desired situation:
Page A
URL: example.com/formula-1
Content: NEW CONTENT!
Page B
URL: example.com/formula-1-news
Content: ContentPageA (the content that was in Page A!)
The content of the two pages will be about the same topic (and the same keyword) but not duplicated. The new content in Page A is more optimized for search engines. How long will it take for the page to rank better?
Intermediate & Advanced SEO | | daimpa0 -
A/B Testing - Should I add product descriptions on my category landing pages as well as on product pages, and if so, how do I do this while avoiding duplicate content?
Hi All, I recently relaunched my tool hire eCommerce website with a new design, and I now display my products in a grid on my category landing pages, as opposed to the list view we had on the old design. My bounce rates are a lot higher than they used to be, and my gut instinct is telling me that the grid may be the wrong choice, so I want to do some A/B testing using a list view.
Previously, our list views just showed the images and pricing, with on-page content at the bottom of the page. The user would click on the product image and would then be taken to the product page, which has the product description, T&Cs, etc. If I did this in my A/B testing but changed it so that we also displayed the product descriptions on the category landing pages, is there a special way to do this? In effect, we would have duplicate content, as the product descriptions are also on the product pages. Does anyone have any thoughts on whether this is a no-no from an SEO point of view?
Here's a short URL link to one of my category pages: http://goo.gl/QJv5gw
Historically we used to rank well for the category landing pages and not for the product pages. Our rankings are down and bounce rates are higher, so I am trying to sort out both. We have good content on pages etc. Any advice greatly appreciated as always. Thanks, Pete
Intermediate & Advanced SEO | | PeteC120 -
Potential Pagination Issue/ Duplicate content issue
Hi All, We upgraded our framework, relaunched our site with new URL structures etc. and resubmitted our sitemap to Google last week. However, it has now come to light that the rel=next and rel=prev tags we had in place on many of our pages are missing. We are putting them back in now, but my worry is this: as they were missing when we submitted the sitemap, will I have duplicate content issues, or will it resolve itself as Google re-crawls the site over time? Any advice would be greatly appreciated. Thanks, Pete
Intermediate & Advanced SEO | | PeteC120 -
Google WMT Showing Duplicate Content, But There is None
In the HTML Improvements section of Google Webmaster Tools, it is showing duplicate content, and I have verified that the duplicate content they are listing does not exist. I actually have another duplicate content issue I am baffled by, but that is already being discussed on another thread. These are the pages they are saying have duplicate META descriptions:
http://www.hanneganremodeling.com/bathroom-remodeling.html (META from bathroom remodeling page)
<meta name="description" content="Bathroom Remodeling Washington DC, Bathroom Renovation Washington DC, Bath Remodel, Northern Virginia,DC, VA, Washington, Fairfax, Arlington, Virginia" />
http://www.hanneganremodeling.com/estimate-request.html (META from estimate page)
<meta name="description" content="Free estimates basement remodeling, bathroom remodeling, home additions, renovations estimates, Washington DC area" />
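One way to double-check a report like this is to pull the meta description out of each page and group the URLs whose descriptions match. Below is a minimal Python sketch using only the standard library. It assumes you already have each page's HTML as a string, and the class and function names are hypothetical, chosen for illustration only.

```python
from collections import defaultdict
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Capture the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags are routed here as well.
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "description":
            self.description = (attributes.get("content") or "").strip()


def meta_description(html):
    """Return the page's meta description, or None if it has none."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    return parser.description


def duplicate_descriptions(pages):
    """Group URLs by meta description and keep only the shared ones.

    pages: dict of URL -> raw HTML. Comparison ignores letter case.
    """
    groups = defaultdict(list)
    for url, html in pages.items():
        description = meta_description(html)
        if description:
            groups[description.lower()].append(url)
    return {text: urls for text, urls in groups.items() if len(urls) > 1}
```

If this returns an empty result for the live pages, the report may simply reflect an older crawl of the site rather than what is currently published.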
Intermediate & Advanced SEO | | WebbyNabler0 -
Can PDF be seen as duplicate content? If so, how to prevent it?
I see no reason why a PDF couldn't be considered duplicate content, but I haven't seen any threads about it. We publish loads of product documentation provided by manufacturers, as well as white papers and case studies. These give our customers and prospects a better idea of our solutions and help them along their buying process. However, I'm not sure whether it would be better to make them non-indexable to prevent duplicate content issues. Clearly we would prefer a solution where we benefit from the keywords in the documents. Does anyone have insight on how to deal with PDFs provided by third parties? Thanks in advance.
Intermediate & Advanced SEO | | Gestisoft-Qc1 -
Affiliate Site Duplicate Content Question
Hi Guys, I have been unable to find a definite answer to this on various forums, so your views will be very valuable. I am building a few Amazon affiliate sites and will be pulling in product data from Amazon via a WordPress plugin. The plugin pulls in titles, descriptions, images, prices etc. However, this presents a duplicate content issue, and hence I cannot publish the product pages with Amazon's descriptions. Due to the large number of products, it is not feasible to rewrite all the descriptions, but I plan to rewrite the descriptions and titles for 50% of the products and publish them with the "index, follow" attribute. For the other 50%, what would be the best way to handle them? Should I publish them as "noindex,follow"? Or is there another solution? Many thanks for your time.
Intermediate & Advanced SEO | | SamBuck0 -
Capitals in url creates duplicate content?
Hey Guys, I had a quick look around but couldn't find a specific answer to this. Currently, the SEOmoz tools come back and show a heap of duplicate content on my site, and there's a fair bit of it. However, a heap of those errors relate to random capitals in the URLs. For example, "www.website.com.au/Home/information/Stuff" is being treated as duplicate content of "www.website.com.au/home/information/stuff" (note the difference in capitals). Does anyone have any recommendations on how to fix this server side (keeping in mind it's not practical or possible to fix all of these links), or on how to tell Google to ignore the capitalisation? Any help is greatly appreciated. LM.
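One common server-side approach to this kind of problem is to 301-redirect any URL containing capitals to its lowercase form. The Python sketch below shows only the URL logic, under the assumption that the site's real pages all live at lowercase paths. The function name is hypothetical, and wiring it into a particular server or framework is left out.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_redirect_target(url):
    """Return the lowercase version of url, or None if no redirect is needed.

    Only the host and path are lowercased. The query string is left
    untouched because parameter values can be case-sensitive.
    """
    parts = urlsplit(url)
    lowered = parts._replace(netloc=parts.netloc.lower(), path=parts.path.lower())
    target = urlunsplit(lowered)
    return None if target == url else target
```

A request handler would call this on each incoming URL and answer with a permanent redirect to the returned value whenever it is not `None`, so that only one casing of each page stays reachable.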
Intermediate & Advanced SEO | | CarlS0