Advice needed on how to handle alleged duplicate content and titles
-
Hi
I wonder if anyone can advise on something that's got me scratching my head.
The following are examples of URLs which are flagged as having duplicate content and duplicate title tags. This accounts for around 8,000 errors, most of which are in fact valid URLs because they provide different views of market data: e.g. #1 is the summary, while #2 is 'Holdings and sector weightings'.
#3 is odd because it's crawling the anchored link. I didn't think hashes were crawled?
I'd like some advice on how best to handle these because, really, they're just queries against a master URL, and I'd like to remove the noise around duplicate errors so that I can focus on some other genuine duplicate URL issues we have.
Here are some example URLs for the same page which are flagged as duplicates.
1) http://markets.ft.com/Research/Markets/Tearsheets/Summary?s=IVPM:LSE
2) http://markets.ft.com/Research/Markets/Tearsheets/Holdings-and-sectors-weighting?s=IVPM:LSE
3) http://markets.ft.com/Research/Markets/Tearsheets/Summary?s=IVPM:LSE&widgets=1
What's the best way to handle this?
-
-
I would definitely not tell Google to ignore parameters, since you have pages ranking high with URL parameters in them.
Be careful if you do implement a canonical, because you could end up removing a few good-ranking pages, since the URL-parameter pages are the ones currently ranking best.
Personally, I would just ignore these errors, since Google has done a pretty good job of choosing the best page already.
You could block Rogerbot from crawling parameter pages.
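A minimal robots.txt sketch of that approach — the paths are illustrative, and wildcard support in Rogerbot is an assumption worth verifying against Moz's crawler documentation before relying on it:

```text
# Hypothetical rules: keep rogerbot (Moz's crawler) away from the
# widget-parameter duplicates, while leaving all other crawlers unaffected.
User-agent: rogerbot
Disallow: /*widgets=

User-agent: *
Disallow:
```

This would silence the Moz duplicate warnings without changing anything Googlebot sees, so existing rankings for the parameter URLs wouldn't be affected.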
-
Thanks. This is the only solution I can think of too, but the information on each of the tabs is actually different, so technically each one is a unique page.
That said, the likelihood of someone searching for such a specific subset of that data associated with one company or fund is arguably extremely low, which is why I wasn't sure whether to apply a canonical or not, just to reduce the noise.
I suppose another approach is to tell Google to ignore the parameter 's', which forms part of the query that loads one of the subsets of data?
I'm slightly wary of doing that, though.
-
Hi,
The best way to fix this would be to implement the canonical tag; this would stop Google/Rogerbot from treating those pages as duplicates and focus attention on the URL you specify.
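For example (a sketch, not the FT's actual markup), the widget variant of the Summary page could declare the clean Summary URL as canonical in its head:

```html
<head>
  <!-- On /Tearsheets/Summary?s=IVPM:LSE&widgets=1, point engines at the clean URL -->
  <link rel="canonical" href="http://markets.ft.com/Research/Markets/Tearsheets/Summary?s=IVPM:LSE" />
</head>
```

Note that since the Summary and Holdings tabs carry genuinely different content, each tab would keep its own self-referencing canonical; only the cosmetic parameter variants (like widgets=1) would point back to their parent URL.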
Check this post from Google explaining all about it:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394
Kyle