Checking for content duplication against content on your own site.
-
We are currently trying to rewrite our product descriptions and I'm afraid some of the salespeople that are writing the descriptions are plagiarizing one-another's writing. Is there a content duplication checker that will allow you to check a piece of writing against a specific site rather than all of the web?
-
I assume that you have an admin section in the CMS where you are editing and entering these articles before they go live.
You need to get a developer to simply write a search algo that when you create a new article and before it goes live, it takes sections of your content and looks for matches/duplicates. You can set a requirement that it has to match on a minimum of a 4 to 5 word string and other such limitations to make sure you are not matching too many items. It will take a few tests to find a sweet spot of too many matches vs not enough.
With 17K pages, this is the only way you can really do this in an efficient way, you need some IT support/development. They may have to create a reporting layer as well to help you sift through the results.
Good luck.
-
I have two dev servers, one of which it is possible to do what you're talking about but that is the absolute least efficient tool to use for this.
The crawl diagnostics are updated about once a week which means I would have to post the new content and hope I got it online in time for the crawl. If I didn't then I would have to wait an additional week to see results.
The crawl diagnostics also limits the amount of pages it will crawl on your site to 10,000. I stated before that I have over 17,000 pages. So even if I did use this method, the chances of that page being crawled is little better than 50/50.
Also, the crawl diagnostics only tell you what pages have duplicate content - not the exact content that was duplicated. That means I'd have to manually find the page I'm targeting, then follow the supposed duplicate content suggestions proposed by the crawler and find the similarities myself.
I think it's very safe to say that the crawl diagnostics, nor any product that SEOmoz provides, is an answer to my issue. If I thought it was, I would have already been using it and would not have posted this question.
-
Hi Michael,
Having a website that big means that you might have a test or dev environment.
If not create one.
if you have something like test.yourwebsite.com and submit it to the SEOmoz tools as a new project you can see a report before your website goes live.
Cornel
-
Those are good answers and would work on a smaller scale site. We currently have over 17,000 product pages so I can't really use either method. It's looking like a google custom search is the best bet even though I can't search an entire paragraph at a time.
-
Just off the top of my head, there are a few low tech ways to do it....
If you have Win 7 the searching has improved greatly - just move all files to a local machine - and search the directory you placed in for the content you are wanting to check - it will give all files that contain the words. (but can become overloading)
If you have dreamweaver or other enterprise level editor - almost all have a site search function to where you can search/profile code/text and have it find one by one which pages contain the searched terms - or globally list them.
Other than that, probably a custom script -or a google search for an HTML profiler might help?
Shane
-
That's for pages that are already published and crawled. I want to able to search my site for entire sentences and/or paragraphs of text that I have yet to publish so I can make sure it's not being used elsewhere on the site. The crawl diagnostics tell me I have duplicate content after the fact - I'm trying to take a proactive approach rather than reactive.
-
The duplicate content from you website is shown in the SEOmoz tools.
Check the Crawl Diagnostics Summary:
Cornel
-
That site searches the entire web for copies. I'm looking for something to crawl my own site for duplicate content.
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Solve duplicate content issues by using robots.txt
Hi, I have a primary website and beside that I also have some secondary websites with have same contents with primary website. This lead to duplicate content errors. Because of having many URL duplicate contents, so I want to use the robots.txt file to prevent google index the secondary websites to fix the duplicate content issue. Is it ok? Thank for any help!
On-Page Optimization | | JohnHuynh0 -
Duplicate content with mine other websites
Hi,
On-Page Optimization | | JohnHuynh
I have a primary website www.vietnamvisacorp.com and beside that I also have some secondary websites with have same contents with primary website. This lead to duplicate content errors. What the best solution for this issue? Please help! p/s: I am a webmaster of all websites duplicate content Thank you!0 -
Duplicate content
Are images considered duplicate content too? Example:
On-Page Optimization | | BridalHotspot
I've got a size chart on each my lingerie pages. All written content is unique but I'm using the same chart for all those pages.0 -
Is there a guide to best practices for site content and blogs?
We have been working hard producing good content for our sites and now we need to know what are the most current best practices regarding placing and organizing content. We do the usual social media blast with Twitter, FB, G+ with each blog post. But it seems there is more that can and should be done. What about authorship and schema tags?
On-Page Optimization | | devonkrusich0 -
Can you have more than 1 site on the first page if site look and content is completely different but keywords are the sam.
I have a client that wants to build another completely different site than his main site and optimize it to have 2 websites on the first page for his keywords. The content and look and feel of the website would be completely different. One of his competitors is doing it and getting away with it. What is your advice.
On-Page Optimization | | Roots70 -
Duplicate content on ecommerce
We have a website that we created a little over a year ago and have included our core products we have always focused on such as mobility scooters and power wheelchairs. We have been going through and updating product descriptions, adding product reviews that our customers have provided etc in order to improve on our SEO rankings and not be penalized by the Panda update. We were approached by a manufacturer last year about their products and they had close to 10k products that we were able to upload easily into our system. Obviously these all have standard manufacturers descriptions many sites are also using. It will take us forever to go through and change all of these and many products are similar to each other anyway they just vary in size, color etc. Will it help our rankings for our core products to simply go through and delete all of these additional products and categories and just add them one by one with unique descriptions and more detailed information when we have time? We aren't really selling many of them anyway so it won't hurt our sales. I'm clearly new to SEO and any help at all would be greatly appreciated. My main website is www.bestmedicalsuppliesonsale dot com A sample core category that we have changed descriptions for is http://www.bestmedicalsuppliesonsale.com/mobility-scooters-s/36.htm A sample of a category and products we simply uploaded would be at http://www.bestmedicalsuppliesonsale.com/Wound-Care-s/4837.htm I'm open to all suggestions I would just like to see my traffic and obviously sales increase. If there are any other glaring problems please let me know. I need help!
On-Page Optimization | | BestMedical0 -
Tags creating duplicated content issue?
Hello i believe a lot of us use tags in our blogs as a way to categorize content and make it easy searchable but this usually (at lease in my case) cause duplicate content creation. For example, if one article has 2 tags like "SEO" & "Marketing", then this article will be visible and listed in 2 urls inside the blog like this domain.com/blog/seo and domain.com/blog/marketing In case of a blog with 300+ posts and dozens of different tags this is creating a huge issue. My question is 1. Is this really bad? 2. If yes how to fix it without removing tags?
On-Page Optimization | | Lakiscy0 -
Site Structure
I'm confused about the best way for seo to set up the site structure . i understand the examples of the pyramid diagrams and how link juice flows, however does this mean that global navigation is not good? It appears the pyramid structure leads to the designated number of category pages (we'll use five) and they lead to the 5 content pages etc and some "superman pages" can be linked to from the home page but is this is global navigation or anchor text navigation and is gloval navigation acdeptable for content pages? Any input would be greatly appreciated. Thank you.
On-Page Optimization | | JulB0