Mod Rewrite / .htaccess avoid duplicate content
-
I have been searching and testing for hours but cannot find a solution. I am able to get a URL to display with out the file exntension.
i.e domain.com/file instead of domain.com/file.php
The problem is both versions of the URL above work, therefore a duplicate content issue. How can I force the URL with the file extension not to resolve and give a 404 error? Or just redirect to the non extension URL?
IF it helps here is my code.
Options +FollowSymLinks
RewriteEngine OnRewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.+)$ $1.php [L,QSA] -
Hi Erik,
No problem, glad I could help
To answer your question, No it doesn't matter which you use because the end result will be re-written to remove the file extension and add a forward slash at the end.
For consistency I would suggest having it without the .php inside your content though. If nothing else it would save you the pain of having to remove .php from your content if you moved to a content management system in the future.
If you've got any other questions let me know, and I'll be happy to help.
Ben
-
Didnt say thanks before, so thank you. One question I did not think of. Should the internal linking of the site be to the file name with extension or no extension?
I think it should be without extension but just want to double check.
-
Hi Ben. I tried this code on another hosting account and it did worked. The first account was a VPS account from Godaddy. The second was a shared account from the same hosting company. Im not sure why it works on one and not on the other. I did see the mod_rewrite option enabled.
-
Just tried this on my development server and it worked fine:
RewriteBase / RewriteEngine on RewriteCond %{HTTP_HOST} ^test.local RewriteCond %{THE_REQUEST} ^GET\ (.).php\ HTTP RewriteRule (.).php$ $1 [R=301]
remove index RewriteRule (.*)index$ $1 [R=301]
remove slash if not directory RewriteCond %{REQUEST_FILENAME} !-d RewriteCond %{REQUEST_URI} /$ RewriteRule (.)/ $1 [R=301] # add .php to access file, but don't redirect RewriteCond %{REQUEST_FILENAME}.php -f RewriteCond %{REQUEST_URI} !/$RewriteRule (.) $1.php [L]
The dev URL is test.local so you would want to change this to www.yourdomain.co.ukI had a page called about.php if I entered http://test.local/about.php or http://test.local/about it would show http://test.local/about in the address bar
-
Hi Ben. Thanks for your help but this does not work for some reason. Im testing it on an old site I have that is html and I just replaced php for html but both URL's still resolves.
-
Good answer Ben.
My main site is my own CMS, that I built 10 years ago, so after I added a lot of things to the .htaccess file and it became too large, I just moved the handling inside the control program, that only looks up filed URLs when they are broken. This processing is fast, but if there was any degradation, it only affects the broken URLs.
Speaking of broken URLs, I was getting a few 400 return codes and it seems the webserver handles those, so you have no chance to handle it in .htaccess. So the wat to handle that is with a 400 handler - that on cpanel sites just needs a 400.shtml file, that you can customize.
- you get a 400 response if you request a URL with a % symbol on the end, and some other site did that, thanks very much, and then google decided it would be a great thing to index.
-
Try using this instead:
<code>RewriteBase /</code>
<code># remove .php; use THE_REQUEST to prevent infinite loops
RewriteCond %{HTTP_HOST} ^www.domain.com
RewriteCond %{THE_REQUEST} ^GET\ (.).php\ HTTP
RewriteRule (.).php$ $1 [R=301]remove index
RewriteRule (.*)index$ $1 [R=301]
remove slash if not directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} /$
RewriteRule (.*)/ $1 [R=301]add .php to access file, but don't redirect
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteCond %{REQUEST_URI} !/$
RewriteRule (.*) $1.php [L]</code>
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Content Issues: Duplicate Content
Hi there
Technical SEO | | Kingagogomarketing
Moz flagged the following content issues, the page has duplicate content and missing canonical tags.
What is the best solution to do? Industrial Flooring » IRL Group Ltd
https://irlgroup.co.uk/industrial-flooring/ Industrial Flooring » IRL Group Ltd
https://irlgroup.co.uk/index.php/industrial-flooring Industrial Flooring » IRL Group Ltd
https://irlgroup.co.uk/index.php/industrial-flooring/0 -
Duplicate content when working with makes and models?
Okay, so I am running a store on Shopify at the address https://www.rhinox-group.com. This store is reasonably new, so being updated constantly! The thing that is really annoying me at the moment though, is I am getting errors in the form of duplicate content. This seems to be because we work using the machine make and model, which is obviously imperative, but then we have various products for each machine make and model. Have we got any suggestions on how I can cut down on these errors, as the last thing I want is being penalised by Google for this! Thanks in advance, Josh
Technical SEO | | josh.sprakes1 -
Is duplicate content ok if its on LinkedIn?
Hey everyone, I am doing a duplicate content check using copyscape, and realized we have used a ton of the same content on LinkedIn as our website. Should we change the LinkedIn company page to be original? Or does it matter? Thank you!
Technical SEO | | jhinchcliffe0 -
SEOMOZ and non-duplicate duplicate content
Hi all, Looking through the lovely SEOMOZ report, by far its biggest complaint is that of perceived duplicate content. Its hard to avoid given the nature of eCommerce sites that oestensibly list products in a consistent framework. Most advice about duplicate content is about canonicalisation, but thats not really relevant when you have two different products being perceived as the same. Thing is, I might have ignored it but google ignores about 40% of our site map for I suspect the same reason. Basically I dont want us to appear "Spammy". Actually we do go to a lot of time to photograph and put a little flavour text for each product (in progress). I guess my question is, that given over 700 products, why 300ish of them would be considered duplicates and the remaning not? Here is a URL and one of its "duplicates" according to the SEOMOZ report: http://www.1010direct.com/DGV-DD1165-970-53/details.aspx
Technical SEO | | fretts
http://www.1010direct.com/TDV-019-GOLD-50/details.aspx Thanks for any help people0 -
Over 700+ duplicate content pages -- help!
I just signed up for SEO Moz pro for my site. The initial report came back with over 700+ duplicate content pages. My problem is that while I can see why some of the content is duplicated on some of the pages I have no idea why it's coming back as duplicated. Is there a tutorial for a novie on how to read the duplicate content report and what steps to take? It's an e-commerce website and there is some repetitive content on all the product pages like our "satisfaction guaranteed" text and the fabric material... and not much other text. There's not a unique product description because an image speaks for itself. Could this be causing the problem? I have lots of URLs with over 50+ duplicates. Thx for any help.
Technical SEO | | Santaur0 -
Duplicate content - wordpress image attachement
I have run my seomoz campaign through my wordpress site and found duplicate content. However, all of this duplicate content was either my logo or images and no content with addresses like /?attachement_id=4 for example . How should I resolve this? thank you.
Technical SEO | | htmanage0 -
Duplicate content + wordpress tags
According to SEOMoz platform, one of my wordpress websites deals with duplicate content because of the tags I use. How should I fix it? Is it loyal to remove tag links from the post pages?
Technical SEO | | giankar0 -
Duplicate Content Home Page
Hello, I am getting Duplicate Content warning from SEOMoz for my home page: http://www.teacherprose.com http://www.teacherprose.com/index html I tried code below in .htaccess: redirect 301 /index.html http://www.teacherprose.com This caused error "too many re-directs" in browser Any thoughts? Thank You, Eric
Technical SEO | | monthelie10