Duplicate Page Content / Titles Help

gaz3342

Hi guys,

My SEOmoz crawl diagnostics throw up thousands of Dup Page Content / Title errors, most of which come from the forum attached to my website.

In particular, it's the forum users' profiles that are causing the issue. Below is a sample of the URLs that are being penalised:

http://www.mywebsite.com/subfolder/myforum/pop_profile.asp?mode=display&id=1308

I thought that adding http://www.mywebsite.com/subfolder/myforum/pop_profile.asp to my robots.txt file under 'Ignore' would cause the bots to overlook the thousands of profile pages, but the latest SEOmoz crawl still picks them up.

My question is: how can I get the bots to ignore these profile pages (they don't contain any useful content), and how much will this be affecting my rankings (bearing in mind I have thousands of errors for dup content and dup page titles)?

Thanks guys

Gareth

gaz3342

Hi,

Just had my latest crawl test completed and unfortunately I am still getting thousands of Dup Page Title / Content errors.

It seems that adding /subfolder/myforum/pop_profile.asp?mode=display&id=* to my disallowed list hasn't worked.

Can you think of anything else I can try?

Thanks guys

Gareth

gaz3342

Fantastic!

Thanks for this, Agents of Value.

Gareth

AgentsofValue

You could use a wildcard operator in your robots.txt to deal with this.

Something like this should work:

Disallow: /subfolder/myforum/pop_profile.asp?mode=display&id=*

See here for more details:

http://sanzon.wordpress.com/2008/04/29/advanced-usage-of-robotstxt-w-querystrings/
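
If you want to sanity-check a rule like this before waiting for the next crawl, here is a minimal Python sketch. It is only an illustration under a few assumptions: the `User-agent: *` line and the helper names (`disallow_rules`, `pattern_to_regex`, `is_blocked`) are made up for the example, and the matching follows the common wildcard convention (`*` matches any run of characters, a trailing `$` anchors the end, and everything else is a prefix match). It is not how the SEOmoz crawler itself is implemented, and not every bot honours wildcards.

```python
import re

# Hypothetical robots.txt body. The User-agent line is an assumption;
# the Disallow line is the one suggested in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /subfolder/myforum/pop_profile.asp?mode=display&id=*
"""


def disallow_rules(robots_txt):
    """Collect every Disallow pattern (user-agent groups are ignored here)."""
    rules = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            rules.append(value.strip())
    return rules


def pattern_to_regex(pattern):
    """Turn a robots.txt path pattern into a regex matched from the path start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, robots_txt=ROBOTS_TXT):
    """Return True if any Disallow pattern matches the path (plus query string)."""
    return any(pattern_to_regex(p).match(path) for p in disallow_rules(robots_txt))


profile = "/subfolder/myforum/pop_profile.asp?mode=display&id=1308"
print(is_blocked(profile))                           # True
print(is_blocked("/subfolder/myforum/default.asp"))  # False
```

Under these matching rules the trailing `*` is optional, since a rule already matches any URL that starts with it. A plain `Disallow: /subfolder/myforum/pop_profile.asp` would cover the same profile pages. Note also that the path in the rule has to match the real URL path character for character, so a small difference in the folder name is enough to stop the rule from applying.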
