Duplicate Content

Force7

Hi,

It looks like Seomoz (and Screaming Frog) is showing my home page as duplicate content.

http://www.mydomain.com Page Authority 61 Linking root Domain 321

http://www.mydomain.com/ Page Authority 61 Linking root Domain 321

Screaming Frog shows the duplicates as:
www.mydomain.com/
www.mydomain.com/index.html

Years ago I hired someone to write a rewrite rule that 301 redirects the non-www version to the www version. I was surprised to find out that I still have a problem.

Here is the code in my .htaccess file.

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} !^www.mydomain.com [NC]
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [L,R=301]
</IfModule>

Was this code not written properly?

One more question: we were hit hard by Panda and Penguin. Would something like this be that much of a factor?

Thanks in advance,

Force7

Titan552

Thanks for the great advice. But once you've added the non-www to www redirect as you wrote above, why not just do this in .htaccess for the ".html to /" issue?

Redirect 301 /index.html http://www.mydomain.com/

Or, in this case, if you've added the rel canonical on the "/" home page, is that good enough, or do you still need to redirect /index.html to "/"?

Thanks!

Force7

So if I understand correctly, I should have..

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} !^www.mydomain.com [NC]
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [L,R=301]
</IfModule>

in the .htaccess file, and then also add

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/
RewriteRule ^(.*)index\.html$ /$1 [R=301,L]

AND

RewriteCond %{HTTP_HOST} !^.mydomain.com$ [NC]
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]

The internal linking structure of the site is as follows: main navigation links are usually absolute (http://www.domain.com/page.php), but throughout the site, when I link a keyword, I use the root-relative form ("/folder/page.php").

When I do a "site:" search on Google, the only version I see listed is www.TranslationSoftware4u.com/

Our hits are down 70% so I am paranoid about making a mistake during the process of trying to find out how to recover from the latest update.

Appreciate the time you are taking to help answer this Matthew!

Thanks,

Force7

Matthew_Edgar

Hey, you are solving multiple problems. The code looks properly written to solve one of those problems: the naked domain to www domain redirect. So long as going to http://mydomain.com 301 redirects to http://www.mydomain.com, you know that piece is working.

The second "problem" you have is that you can reach your home page with /index.html and without /index.html in the URL. So long as only one is indexed by Google, this isn't that big of a problem. You should however put in a canonical on your home page to make it clear which version you do want indexed. Then make sure all internal links go to that URL.

Alternatively, you can 301 redirect /index.html to the root via the .htaccess file. That code would go something like this:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/
RewriteRule ^(.*)index\.html$ /$1 [R=301,L]
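As a rough sketch, the www redirect and the index.html redirect could sit together in one .htaccess file along these lines. It assumes the file lives in the document root, that index.html is the directory index, and that mydomain.com stands in for the real domain. The escaped dots and the comments are additions for illustration, so treat this as a starting point to test rather than a drop-in fix.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

# 1. Send any host that does not start with www.mydomain.com to the www host.
#    (Dots are escaped here so they match a literal "." only.)
RewriteCond %{HTTP_HOST} !^www\.mydomain\.com [NC]
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [L,R=301]

# 2. Redirect direct requests for index.html to the containing directory.
#    THE_REQUEST holds the original request line sent by the browser
#    (for example "GET /index.html HTTP/1.1"), so it does not change when
#    Apache internally maps "/" to index.html. That is what keeps this
#    rule from redirecting in a loop.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/
RewriteRule ^(.*)index\.html$ /$1 [R=301,L]
</IfModule>
```

With the rules in this order, a request for http://mydomain.com/index.html would take two hops: first to the www host, then to the root URL.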

The third "problem" is that your home page can be accessed with or without a trailing slash. Again, I wouldn't say this is a problem unless both versions are indexed. And, as with /index.html, you can fix this by adding a canonical element to the home page and linking to that canonical URL consistently within your site.

However, this too can be fixed via .htaccess. Here is an example of the .htaccess code:

RewriteCond %{HTTP_HOST} !^.mydomain.com$ [NC]
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]
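One caveat with a trailing-slash rule like this: for real directories, Apache's mod_dir normally adds the slash back, which can end in a redirect loop. Here is a hedged sketch that skips directories. It assumes the .htaccess file sits in the document root and that DirectorySlash is left at its default. The directory check is an addition, not part of the code above.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On

# Leave real directories alone: mod_dir (DirectorySlash) redirects them
# back to the trailing-slash form, so stripping the slash would loop.
RewriteCond %{REQUEST_FILENAME} !-d
# (.+) needs at least one character before the slash, so the bare
# home page URL "/" is never matched by this rule.
RewriteRule ^(.+)/$ /$1 [R=301,L]
</IfModule>
```

It is worth testing on a copy of the site first, requesting a few URLs with and without the trailing slash and checking the response headers.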

Thanks,
Matthew
