Finding Duplicate Content Spanning more than one Site?
-
Hi forum, SEOMoz's crawler identifies duplicate content within your own site, which is great. How can I compare my site to another site to see if they share "duplicate content?" Thanks!
-
The Alert thing is great! I use it when we write new content (along with CopyScape after a week or so) just so I can make sure I'm outranking it. lol
-
Yes. I totally agree with Darin. There isn't a duplicate content penalty, per se, and the tools he listed are quite good suggestions as well.
-
IMHO, even if the HTML is different you could have duplicate content if the H1 or paragraph text is substantially similar. However, is this automatically penalized? No. Syndication of content can be quite prevalent on the Web. For example the AP breaks a news story and posts it online and it is subsequently picked up by the New York Times and Wall Street Journal. Wherever the content appeared first, particularly if it has a canonical tag in place, that source will be credited with having the original content. The other sites aren't going to be penalized, but they aren't going to benefit from it either.
Similar things happen on large e-commerce sites all the time. For example, 100's of e-commerce stores sell lightbulbs. Those descriptions are most certainly "substantially similar." It'd be kind of strange if they weren't. They aren't penalized for that.
I hope this is helpful! It is always good to set up a Google Alert for any great pieces of content you do write, just so you can be aware of who might be copying your stuff! (Tynt.com can also be very useful for this).
Good luck!
Dana
-
Just for the record there isn't any "Duplicate Content Penalty" so don't worry to much about this. Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results.
However, to answer your question I use copyscape to do this but you have to insert a URL and not just lines at a time.
Here are some other ones I've heard good things about:
I agree with Dana on the Google thing too. Like she said, "Just be sure to put quotes around your snippet."
-
This helps, thanks Dana. Is the actual paragraph content the main source of a duplicate content penalty? For example, what if the pages share different metadata and the HTML is entirely different except for the H1 text and paragraph content?
-
Hi Zora,
This best way to do this is to grab a random section of text from the page and go to Google, then paste that section of text in the search bar inside "quotes." For example, from your question above, I could search:
"SEOMoz's crawler identifies duplicate content within your own site, which is great. How can I compare my site"
you will see that the result in Google is a result to this page (once it's been indexed, which hasn't happened quite yet) - Just be sure to put quotes around your snippet.
Hope that helps!
Dana
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Will I be flagged for duplicate content by Google?
Hi Moz community, Had a question regarding duplicate content that I can't seem to find the answer to on Google. My agency is working on a large number of franchisee websites (over 40) for one client, a print franchise, that wants a refresh of new copy and SEO. Each print shop has their own 'microsite', though all services and products are the same, the only difference being the location. Each microsite has its own unique domain. To avoid writing the same content over and over in 40+ variations, would all the websites be flagged by Google for duplicate content if we were to use the same base copy, with the only changes being to the store locations (i.e. where we mention Toronto print shop on one site may change to Kelowna print shop on another)? Since the print franchise owns all the domains, I'm wondering if that would be a problem since the sites aren't really competing with one another. Any input would be greatly appreciated. Thanks again!
Intermediate & Advanced SEO | | EdenPrez0 -
SEO effect of content duplication across hub of sites
Hello, I have a question about a website I have been asked to work on. It is for a real estate company which is part of a larger company. Along with several other (rival) companies it has a website of property listings which receives a feed of properties from a central hub site - so lots of potential for page, title and meta content duplication (if if isn't already occuring) across the whole network of sites. In early investigation I don't see any of these sites ranking very well at all in Google for expected search phrases. Before I start working on things that might improve their rankings, I wanted to ask some questions from you guys: 1. How would such duplication (if it is occuring) effect the SEO rankings of such sites individually, or the whole network/hub collectively? 2. Is it possible to tell if such a site has been "burnt" for SEO purposes, especially if or from any duplication? 3. If such a site or the network has been totally burnt, are there any approaches or remedies that can be made to improve the site's SEO rankings significantly, or is the only/best option to start again from scratch with a brand new site, ensuring the use of new meta descriptions and unique content? Thanks in advance, Graham
Intermediate & Advanced SEO | | gmwhite9991 -
PDF for link building - avoiding duplicate content
Hello, We've got an article that we're turning into a PDF. Both the article and the PDF will be on our site. This PDF is a good, thorough piece of content on how to choose a product. We're going to strip out all of the links to our in the article and create this PDF so that it will be good for people to reference and even print. Then we're going to do link building through outreach since people will find the article and PDF useful. My question is, how do I use rel="canonical" to make sure that the article and PDF aren't duplicate content? Thanks.
Intermediate & Advanced SEO | | BobGW0 -
Two Sites Similar content?
I just started working at this company last month. We started to add new content to pages like http://www.rockymountainatvmc.com/t/49/-/181/1137/Bridgestone-Motorcycle-Tires. This is their main site. Then i realized it also put the new content on their sister site http://www.jakewilson.com/t/52/-/343/1137/Bridgestone-Motorcycle-Tires. the first site is the main site and I think will get credit for the unique new content. The second one I do not think will get credit and will more than likely be counted as duplicate content. We are changing this so it will no longer be the same. However, I am curious to see ways people think we could fix this issues? Also is it effecting both sits for just the second one?
Intermediate & Advanced SEO | | DoRM0 -
Our Site's Content on a Third Party Site--Best Practices?
One of our clients wants to use about 200 of our articles on their site, and they're hoping to get some SEO benefit from using this content. I know standard best practices is to canonicalize their pages to our pages, but then they wouldn't get any benefit--since a canonical tag will effectively de-index the content from their site. Our thoughts so far: add a paragraph of original content to our content link to our site as the original source (to help mitigate the risk of our site getting hit by any penalties) What are your thoughts on this? Do you think adding a paragraph of original content will matter much? Do you think our site will be free of penalty since we were the first place to publish the content and there will be a link back to our site? They are really pushing for not using a canonical--so this isn't an option. What would you do?
Intermediate & Advanced SEO | | nicole.healthline1 -
Virtual Domains and Duplicate Content
So I work for an organization that uses virtual domains. Basically, we have all our sites on one domain and then these sites can also be shown at a different URL. Example: sub.agencysite.com/store sub.brandsite.com/store Now the problem comes up often when we move the site to a brand's URL versus hosting the site on our URL, we end up with duplicate content. Now for god knows what damn reason, I currently cannot get my dev team to implement 301's but they will implement 302's. (Dont ask) I also am left with not being able to change the robots.txt file for our site. They say if we allowed people to go in a change this stuff it would be too messy and somebody would accidentally block a site that was not supposed to be blocked on our domain. (We are apparently incapable toddlers) Now I have an old site, sub.agencysite.com/store ranking for my terms while the new site is not showing up. So I am left with this question: If I want to get the new site ranking what is the best methodology? I am thinking of doing a 1:1 mapping of all pages and set up 302 redirects from the old to the new and then making the canonical tags on the old to reflect the new. My only thing here is how will Google actually view this setup? I mean on one hand I am saying
Intermediate & Advanced SEO | | DRSearchEngOpt
"Hey, Googs, this is just a temp thing." and on the other I am saying "Hey, Googs, give all the weight to this page, got it? Graci!" So with my limited abilities, can anybody provide me a best case scenario?0 -
Having a hard time with duplicate page content
I'm having a hard time redirecting website.com/ to website.com The crawl report shows both versions as duplicate content. Here is my htaccess: RewriteEngine On
Intermediate & Advanced SEO | | cgman
RewriteBase /
#Rewrite bare to www
RewriteCond %{HTTP_HOST} ^mywebsite.com
RewriteRule ^(([^/]+/)*)index.php$ http://www.mywebsite.com/$1 [R=301,L] RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.*)$ $1.php [NC,L]
RewriteCond %{HTTP_HOST} !^.localhost$ [NC]
RewriteRule ^(.+)/$ http://%{HTTP_HOST}$1 [R=301,L] I added the last 2 lines after seeing a Q&A here, but I don't think it has helped.0 -
One site for multiple regions with a twist?
Hi there. I'm hoping to tap into the collective wisdom of this great community. I've just become involved in a business that spans two countries Australia and New Zealand. Currently we have one site in Australia "AAAA.com.au" and a similar site in New Zealand which is a joint venture with another company so for some reason they chose to merge the names to get "BBAAAA.co.nz". The Australian site is hosted in Australia and ranks #1 for targeted keywords in a competitive industry. The New Zealand site is hosted in NZ and has been live for nearly 2 years but ranks very poorly with targeted keywords i.e. not in the top 50! The content on sites is similar but not the same and phone numbers and location details are different etc. The NZ site has not been link building which is likely the main issue. What I want to do is now change the BBAAAA.co.nz site to AAAA.co.nz (the other company has agreed the name change is warranted) and service New Zealand from Australia using our well performing site. Any thoughts on the best way to achieve this to maximise the good ranking of the Australian site? The Australian site has a lot of back links from a range of sites. I've taken into account the following info at Google but I'm still stuck for the best answer given our tricky situation. http://www.google.com/support/webmasters/bin/answer.py?answer=182192#2 Would love to hear your thoughts on how to approach this one. Cheers in advance.
Intermediate & Advanced SEO | | ICMI0