Duplicate content that looks unique

neooptic

OK, bit of an odd one. The SEOmoz crawler has flagged the following pages as duplicate content. Does anyone have any idea what's going on?

http://www.gear-zone.co.uk/blog/november-2011/gear$9zone-guide-to-winter-insulation

http://www.gear-zone.co.uk/blog/september-2011/win-a-the-north-face-nuptse-2-jacket-with-gear-zone

http://www.gear-zone.co.uk/blog/july-2011/telephone-issues-$9-2nd-july-2011

http://www.gear-zone.co.uk/blog/september-2011/gear$9zone-guide-to-nordic-walking-poles

http://www.gear-zone.co.uk/blog/september-2011/win-a-the-north-face-nuptse-2-jacket-with-gear-zone

https://www.google.com/webmasters/tools/googlebot-fetch?hl=en&siteUrl=http://www.gear-zone.co.uk/

Cyrus-Shepard

Good question, because those pages look different to a human. The SEOmoz web app flags pages as duplicates when their HTML code is at least 95% similar. This takes everything on the page into account, both hidden and visible.

In this case, it's counting all of the navigation and sidebar as well, which is a significant amount of code. What's left, the unique content that actually matters, makes up less than 5% of the code.

Here's a tool you can use to check the similarity: http://www.duplicatecontent.net/

I ran the pages through a couple of tools, which showed 96% HTML similarity (but only 92% text similarity, which is good, but not great).
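If you want to reproduce that kind of check yourself, here is a minimal sketch in Python. To be clear about the assumptions: I don't know the exact algorithm SEOmoz or the tool above uses, so this substitutes the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure. The helper names (`visible_text`, `similarity`, `compare_pages`) and the two sample pages are made up for illustration; only the 95% threshold comes from this thread.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Threshold mentioned in this thread for the SEOmoz web app.
DUPLICATE_THRESHOLD = 0.95


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the page text with markup removed (a rough approximation)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 (stand-in metric, not SEOmoz's)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def compare_pages(html_a, html_b):
    """Compare two pages on raw HTML and on extracted text."""
    return {
        "html": similarity(html_a, html_b),
        "text": similarity(visible_text(html_a), visible_text(html_b)),
    }


# Hypothetical pages: a large shared navigation template around a short post.
template = "<html><body><ul>" + "".join(
    f'<li><a href="/category-{i}">Category {i}</a></li>' for i in range(100)
) + "</ul><div id='post'>{post}</div></body></html>"

page_a = template.format(post="A guide to winter insulation.")
page_b = template.format(post="Win a jacket in our competition.")

scores = compare_pages(page_a, page_b)
print(f"HTML similarity: {scores['html']:.1%}")
print(f"Text similarity: {scores['text']:.1%}")
print("Flagged as duplicate:", scores["html"] >= DUPLICATE_THRESHOLD)
```

To try it on real pages, fetch the HTML of two URLs yourself and pass the strings to `compare_pages`. The point of the sample data is that a short unique post wrapped in a large shared template will clear a 95% HTML threshold even though the posts themselves are completely different.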

For perspective, take a look at Google's cached version of one of these pages. This is how Googlebot sees the page: http://webcache.googleusercontent.com/search?q=cache:4fKrbNTUnegJ:www.gear-zone.co.uk/blog/september-2011/win-a-the-north-face-nuptse-2-jacket-with-gear-zone+http://www.gear-zone.co.uk/blog/september-2011/win-a-the-north-face-nuptse-2-jacket-with-gear-zone&hl=en&gl=us&strip=1G

Since Panda, when I see a site with this many navigation links, I usually advise the owner to restructure the site architecture into more of a pyramid shape, which reduces the overall navigation on each page.

There are two ways to look at this. First of all, Google is much more sophisticated than SEOmoz at detecting duplicate content and better at contextual analysis, so it can probably tell these are not true duplicates.

Hope this helps! Best of luck with your SEO.

KeriMorgret

SEOmoz looks at the code on the page when it looks at duplicate content scores. My hunch is that there's a lot of identical code on those pages, which is causing the warning.
