Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Mod Rewrite / .htaccess avoid duplicate content
-
I have been searching and testing for hours but cannot find a solution. I am able to get a URL to display with out the file exntension.
i.e domain.com/file instead of domain.com/file.php
The problem is both versions of the URL above work, therefore a duplicate content issue. How can I force the URL with the file extension not to resolve and give a 404 error? Or just redirect to the non extension URL?
IF it helps here is my code.
Options +FollowSymLinks
RewriteEngine OnRewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.+)$ $1.php [L,QSA] -
Hi Erik,
No problem, glad I could help
To answer your question, No it doesn't matter which you use because the end result will be re-written to remove the file extension and add a forward slash at the end.
For consistency I would suggest having it without the .php inside your content though. If nothing else it would save you the pain of having to remove .php from your content if you moved to a content management system in the future.
If you've got any other questions let me know, and I'll be happy to help.
Ben
-
Didnt say thanks before, so thank you. One question I did not think of. Should the internal linking of the site be to the file name with extension or no extension?
I think it should be without extension but just want to double check.
-
Hi Ben. I tried this code on another hosting account and it did worked. The first account was a VPS account from Godaddy. The second was a shared account from the same hosting company. Im not sure why it works on one and not on the other. I did see the mod_rewrite option enabled.
-
Just tried this on my development server and it worked fine:
RewriteBase / RewriteEngine on RewriteCond %{HTTP_HOST} ^test.local RewriteCond %{THE_REQUEST} ^GET\ (.).php\ HTTP RewriteRule (.).php$ $1 [R=301]
remove index RewriteRule (.*)index$ $1 [R=301]
remove slash if not directory RewriteCond %{REQUEST_FILENAME} !-d RewriteCond %{REQUEST_URI} /$ RewriteRule (.)/ $1 [R=301] # add .php to access file, but don't redirect RewriteCond %{REQUEST_FILENAME}.php -f RewriteCond %{REQUEST_URI} !/$RewriteRule (.) $1.php [L]
The dev URL is test.local so you would want to change this to www.yourdomain.co.ukI had a page called about.php if I entered http://test.local/about.php or http://test.local/about it would show http://test.local/about in the address bar
-
Hi Ben. Thanks for your help but this does not work for some reason. Im testing it on an old site I have that is html and I just replaced php for html but both URL's still resolves.
-
Good answer Ben.
My main site is my own CMS, that I built 10 years ago, so after I added a lot of things to the .htaccess file and it became too large, I just moved the handling inside the control program, that only looks up filed URLs when they are broken. This processing is fast, but if there was any degradation, it only affects the broken URLs.
Speaking of broken URLs, I was getting a few 400 return codes and it seems the webserver handles those, so you have no chance to handle it in .htaccess. So the wat to handle that is with a 400 handler - that on cpanel sites just needs a 400.shtml file, that you can customize.
- you get a 400 response if you request a URL with a % symbol on the end, and some other site did that, thanks very much, and then google decided it would be a great thing to index.
-
Try using this instead:
<code>RewriteBase /</code>
<code># remove .php; use THE_REQUEST to prevent infinite loops
RewriteCond %{HTTP_HOST} ^www.domain.com
RewriteCond %{THE_REQUEST} ^GET\ (.).php\ HTTP
RewriteRule (.).php$ $1 [R=301]remove index
RewriteRule (.*)index$ $1 [R=301]
remove slash if not directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} /$
RewriteRule (.*)/ $1 [R=301]add .php to access file, but don't redirect
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteCond %{REQUEST_URI} !/$
RewriteRule (.*) $1.php [L]</code>
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Recurring events and duplicate content
Does anyone have tips on how to work in an event system to avoid duplicate content in regards to recurring events? How do I best utilize on-page optimization?
Technical SEO | | megan.helmer0 -
Duplicate Content on a Page Due to Responsive Version
What are the implications if a web designer codes the content of the site twice into the page in order to make the site responsive? I can't add the url I'm afraid but the H1 and the content appear twice in the code in order to produce both a responsive version and a desktop version. This is a Wordpress site. Is Google clever enough to distinguish between the 2 versions and treat them individually? Or will Google really think that the content has been repeated on the same page?
Technical SEO | | Wagada0 -
.com and .co.uk duplicate content
hi mozzers I have a client that has just released a .com version of their .co.uk website. They have basically re-skinned the .co.uk version with some US amends so all the content and title tags are the same. What you do recommend? Canonical tag to the .co.uk version? rewrite titles?
Technical SEO | | KarlBantleman0 -
Do I use /es/, /mx/ or /es-mx/ for my Spanish site for Mexico only
I currently have the Spanish version of my site under myurl.com/es/ When I was at Pubcon in Vegas last year a panel reviewed my site and said the Spanish version should be in /mx/ rather than /es/ since es is for Spain only and my site is for Mexico only. Today while trying to find information on the web I found /es-mx/ as a possibility. I am changing my site and was planning to change to /mx/ but want confirmation on the correct way to do this. Does anyone have a link to Google documentation that will tell me for sure what to use here? The documentation I read led me to the /es/ but I cannot find that now.
Technical SEO | | RoxBrock0 -
Duplicate Content and URL Capitalization
I have multiple URLs that SEOMoz is reporting as duplicate content. The reason is that there are characters in the URL that may, or may not, be capitalized depending on user input. A couple examples are: www.househitz.com/Pennsylvania/Houses-for-sale www.househitz.com/Pennsylvania/houses-for-sale www.househitz.com/Pennsylvania/Houses-for-rent www.househitz.com/Pennsylvania/houses-for-rent There are currently thousands of instances of this on the site. Is this something I should spend effort to try and resolve (may not be minor effort), or should I just ignore it and move on?
Technical SEO | | Jom0 -
My .htaccess has changed, what do i do to avoid it again...?
Hello Today i notice that our site did not auto changed from without www to with, when i checked the .htaccess file i notice # in-front of each line and i know we did not insert it in there, after i removed it it worked fine. The only changes that we did recently was to a mobile version to the site but the call to autodirect is in a JS and not in the .htaccess, could it be the server..? is there any way that anything else might cause this...? The site is HTML and WP could it be because of that...? Thank's Simo
Technical SEO | | Yonnir0 -
How to resolve this Duplicate content?
Hi , There is page i get when i do proper menu navigation Caratlane.com>jewellery>rings>casualsrings> http://www.caratlane.com/jewellery/rings/casual-rings/leaves-dew-diamond-0-03-ct-peridot-1-ct-ring-18k-yellow-gold.html When i do a site search in my search box by my product code number "JR00219" The same page is appears with different url http://www.caratlane.com/leaves-dew-diamond-0-03-ct-peridot-1-ct-ring-18k-yellow-gold.html So there is a duplicate content. How can we resolve it. Regards, kathir caratlane.com
Technical SEO | | kathiravan0 -
Duplicate content and http and https
Within my Moz crawl report, I have a ton of duplicate content caused by identical pages due to identical pages of http and https URL's. For example: http://www.bigcompany.com/accomodations https://www.bigcompany.com/accomodations The strange thing is that 99% of these URL's are not sensitive in nature and do not require any security features. No credit card information, booking, or carts. The web developer cannot explain where these extra URL's came from or provide any further information. Advice or suggestions are welcome! How do I solve this issue? THANKS MOZZERS
Technical SEO | | hawkvt10