Duplicate content check picking up weird urls

erinhealthchoices

Hi everyone,

I love the duplicate content feature; we have a lot of duplicate content issues due to the way our site is structured, so we're working on them. However, I don't fully understand the results. For example, say I have an article on breast cancer symptoms. It shows up as duplicate content because two URLs point to the exact same page: http://www.healthchoices.ca/articles/breast cancer symptoms and http://www.healthchoices.ca/somerandomstringofcode. I fully understand why that is duplicate content.

What I'm not sure about is this: it also picks up the same URL twice and calls it duplicate content. For example, it says that http://www.healthchoices.ca/dr.-so-and-so and http://www.healthchoices.ca/dr.-so-and-so are duplicates. Is this not the same page? Is there something I'm missing? Many of the URLs are identical.

Thanks,

Erin

Laura.Lippay

Hi Erin -

Is that a Google Webmaster file?

Looking at those URLs in the SERPs, it seems you have some content causing duplicates (although the file doesn't seem to represent it that way).

Here are the URLs in Google search results for Term-Life-Insurance:

Looking at the first two as an example, the pages themselves are currently not exact duplicates. The first one is a video of a guy talking about term life insurance with some other video links, and the second is a page that shows the error "Error: Video Category Page is currently unavailable." where the page content should be. But that second page was an exact duplicate of the first URL the last time Google visited it.

Here is the first page again:

http://www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance

Here is the cached version of the second (duplicate) page (as I'm currently seeing it; it was last cached on Apr 19, 2011):

http://webcache.googleusercontent.com/search?q=cache:lBiovAAyiF0J:www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance/montreal/quebec+site:www.healthchoices.ca+inurl:Term-Life-Insurance&cd=3&hl=en&ct=clnk&gl=us&client=firefox-a&source=www.google.com

To see these pages (or any potential duplicate URL issues), do this search in Google:

site:www.healthchoices.ca
To find pages with a specific URL pattern (like the term life insurance pages), try "site:www.healthchoices.ca inurl:Term-Life-Insurance" (without the quotation marks).
Then, at the end of the URL you see in the address bar, add "&filter=0" (without the quotes).

So what is in your browser address bar would look like this (although your URL may have some additional things in it, like your previous query, your browser and your language - that's ok):

http://www.google.com/search?q=site:www.healthchoices.ca+inurl:Term-Life-Insurance&filter=0
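If you want to build that search URL for several URL patterns, here is a minimal Python sketch. The function name and its parameters are made up for illustration; the only parts taken from the steps above are the "site:" and "inurl:" operators and the "filter=0" parameter.

```python
from urllib.parse import urlencode


def build_site_search_url(domain, url_pattern=None, show_filtered=True):
    """Build a Google search URL for a site: query (hypothetical helper)."""
    # Start with the site: operator, e.g. "site:www.healthchoices.ca".
    query = f"site:{domain}"
    # Optionally narrow the results to URLs containing a pattern.
    if url_pattern:
        query += f" inurl:{url_pattern}"
    params = {"q": query}
    # filter=0 is the parameter added by hand in the steps above.
    if show_filtered:
        params["filter"] = "0"
    # safe=":" keeps the colons readable; spaces become "+".
    return "http://www.google.com/search?" + urlencode(params, safe=":")


print(build_site_search_url("www.healthchoices.ca", "Term-Life-Insurance"))
```

For the domain and pattern used above, this prints the same URL shown in the line before this example.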

I'm not sure what the URL issue is that you're referring to exactly based on the info you pasted and where you may have gotten it from - but I hope this is helpful.

mark-johnstone

Hi Erin,

Can I enquire a little more about where you are lifting these URLs from? I'm assuming you are downloading them from a Campaign? Are the URLs in question lifted from the same row in the CSV? What is the header of the columns they are lifted from? I just need a little more specificity about what we're looking at here in order to respond fully.

erinhealthchoices

Thanks for your responses. Hmm...I'm not sure how to do a screenshot, as the only way I could see the errors was to download the file. I've pasted a few below, straight from the doc.

aarondicks

Erin, what tool are you using to find this? It might be something to do with the language that your CMS is written in. It might also be a matter of a trailing slash or a non-www version.

I'd be happy to help if you could provide a little more info, perhaps a screenshot?

Aaron

DanDeceuster

Duplicate content, by definition, is having the same content on different URLs. I've never had the tool tell me I have duplicate content on the same URL. You must be missing something. Is it www vs non-www perhaps? I don't know how you can get identical URLs showing up in there.
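One way to check whether two URLs from the downloaded file are really identical is to compare them piece by piece. Below is a rough Python sketch that looks for the differences suggested in this thread (www vs non-www, trailing slash) plus a few other common ones. The function name and the wording of the reasons are invented for illustration, and it only compares the URL strings; it does not fetch the pages or reproduce what the tool itself does.

```python
from urllib.parse import urlsplit


def strip_www(host):
    """Drop a leading 'www.' so www and non-www hosts compare equal."""
    return host[4:] if host.startswith("www.") else host


def explain_url_difference(url_a, url_b):
    """Return a list of reasons two similar-looking URLs differ (hypothetical helper)."""
    reasons = []
    if url_a == url_b:
        return reasons  # Truly identical strings.
    # Stray spaces are easy to miss in a spreadsheet cell.
    if url_a != url_a.strip() or url_b != url_b.strip():
        reasons.append("leading or trailing whitespace")
    a, b = urlsplit(url_a.strip()), urlsplit(url_b.strip())
    if a.scheme != b.scheme:
        reasons.append("scheme (http vs https)")
    host_a, host_b = a.netloc.lower(), b.netloc.lower()
    if host_a != host_b:
        if strip_www(host_a) == strip_www(host_b):
            reasons.append("www vs non-www host")
        else:
            reasons.append("different host")
    if a.path != b.path:
        if a.path.rstrip("/") == b.path.rstrip("/"):
            reasons.append("trailing slash")
        elif a.path.lower() == b.path.lower():
            reasons.append("letter case in path")
        else:
            reasons.append("different path")
    if a.query != b.query:
        reasons.append("query string")
    if not reasons:
        reasons.append("other difference (e.g. fragment or host letter case)")
    return reasons


print(explain_url_difference(
    "http://www.healthchoices.ca/dr.-so-and-so",
    "http://www.healthchoices.ca/dr.-so-and-so/",
))
```

An empty list means the two strings are exactly the same, in which case the explanation has to lie somewhere other than the URLs themselves.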
