Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
6 .htaccess Rewrites: Remove index.html, Remove .html, Force non-www, Force Trailing Slash
-
i've to give some information about my website Environment
1. i have static webpage in the root.
2. Wordpress installed in sub-dictionary www.domain.com/blog/
3. I have two .htaccess , one in the root and one in the wordpress
folder.i want to
- www to non on all URLs
- Remove index.html from url
- Remove all .html extension / Re-direct 301 to url
without .html extension - Add trailing slash to the static webpages / Re-direct 301 from non-trailing slash
- Force trailing slash to the Wordpress Webpages / Re-direct 301 from non-trailing slash
Some examples
domain.tld/index.html >> domain.tld/
domain.tld/file.html >> domain.tld/file/
domain.tld/file.html/ >> domain.tld/file/
domain.tld/wordpress/post-name >> domain.tld/wordpress/post-name/
My code in ROOT htaccess is
<ifmodule mod_rewrite.c="">Options +FollowSymLinks -MultiViews
RewriteEngine On
RewriteBase /#removing trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ $1 [R=301,L]#www to non
RewriteCond %{HTTP_HOST} ^www.(([a-z0-9_]+.)?domain.com)$ [NC]
RewriteRule .? http://%1%{REQUEST_URI} [R=301,L]#html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^.]+)$ $1.html [NC,L]#index redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index.html\ HTTP/
RewriteRule ^index.html$ http://domain.com/ [R=301,L]
RewriteCond %{THE_REQUEST} .html
RewriteRule ^(.*).html$ /$1 [R=301,L]</ifmodule>The above code do
1. redirect www to non-www
2. Remove trailing slash at the end (if exists)
3. Remove index.html
4. Remove all .html
5. Redirect 301 to filename but doesn't add trailing slash at the end -
#index redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index.html\ HTTP/
RewriteRule ^index.html$ http://domain.com/ [R=301,L]
RewriteCond %{THE_REQUEST} .html
RewriteRule ^(.*).html$ /$1 [R=301,L]hi anyone please help I use this code but now getting 404 error. please help.
i also remove this code again but still same issue.
-
Hi Tom,
thanks for your reply.
i have some problems
the above code doesn't
1 - Add trailing slash to the static webpages / Re-direct 301 from non-trailing slash
so it should be http://ghadaalsaman.com/articles/ instead of http://ghadaalsaman.com/articles
2 - Force trailing slash to the Wordpress Webpages / Re-direct 301 from non-trailing slash
-
Hey NeatIT!
I see you have a working solution there. Did you have a specific question about the setup?
I did notice that your setup cane sometimes result in chaining 301 redirects, which is one area for possible improvement.
Let me know how we can help!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
301 redirect hops from non-https and www
It's best practice to minimize the amount of 301 redirect hops. Ideally only one redirect hop. It's also best practice to 301 redirect (or at least canonical) your non-https and/or your non-www (or www) to the canonical protocol/subdomain. The simplest (and possibly the most common) way to implement canonical protocol/subdomain redirects is through a load balancer or before your app processes the request. Both of which will just blanket 301 to the canonical domain/protocol regardless if the path exists or not In which case, you could have: Two hops. i.e. hop #1 http://example.com/foo to https://example.com/foo, hop #2 https://example.com/foo to https://example.com/bar 301 to a 404. Let's say https://example.com/dog never existed, but somebody for whatever reason linked to it (maybe a typo). If I request https://www.example.com/dog, the load balancer would 301 to a 404 page. Either scenario above should be fairly rare. However, you can't control how people link to you. Should I care about either above scenario? I could have my app attempt to check if the page exists before forwarding, but that code could be complicated.
Intermediate & Advanced SEO | | dsbud0 -
Best way to remove full demo (staging server) website from Google index
I've recently taken over an in-house role at a property auction company, they have a main site on the top-level domain (TLD) and 400+ agency sub domains! company.com agency1.company.com agency2.company.com... I recently found that the web development team have a demo domain per site, which is found on a subdomain of the original domain - mirroring the site. The problem is that they have all been found and indexed by Google: demo.company.com demo.agency1.company.com demo.agency2.company.com... Obviously this is a problem as it is duplicate content and so on, so my question is... what is the best way to remove the demo domain / sub domains from Google's index? We are taking action to add a noindex tag into the header (of all pages) on the individual domains but this isn't going to get it removed any time soon! Or is it? I was also going to add a robots.txt file into the root of each domain, just as a precaution! Within this file I had intended to disallow all. The final course of action (which I'm holding off in the hope someone comes up with a better solution) is to add each demo domain / sub domain into Google Webmaster and remove the URLs individually. Or would it be better to go down the canonical route?
Intermediate & Advanced SEO | | iam-sold0 -
Removing index.php
I have question for the community and whether or not this is a good or bad idea. I currently have a Joomla site that displays www.domain.com/index.php in all the URLs with the exception of the home page. I have read that it's better to not have index.php showing in the URL at all. Does it really matter if I have index.php in my URL? I've read that it is a bad practice. I am thinking about installing the sh404SEF component on my site and removing the index.php. However, I rank pretty high for the keywords I want in Google, Bing and Yahoo. All of the URLs that show up in the searches have index.php as part of the URL. Has anyone ever used sh404SEF to remove the index.php and how did you overcome not loosing your search engine links? I don't want an existing search showing www.domain.com/index.php/sales and it not linking to the correct page which would now be www.domain.com/sales. I guess I could insert the proper redirects in the htaccess file. But I was hoping to avoid having every page of my site in the htaccess file for redirecting. Any help or advice appreciated.
Intermediate & Advanced SEO | | MedGroupMedia0 -
How to detect a bad link and remove ?
As per google penguin, all the low quality back links are going to affect the website SERPS hugely. So, we need to find all the bad back links and then remove them one by one. What I would like to know is, what tool do you use to find all the bad back links ? And how do we know which is a bad back link or bad website, where our link should not be there ? Then what service what do you suggest for back links removal. I contacted LinkDelete.com and they quoted me 97$ for a month to remove all links in less than 3 weeks.
Intermediate & Advanced SEO | | monali123
Let me know, what you suggest.0 -
Duplicate Content www vs. non-www and best practices
I have a customer who had prior help on his website and I noticed a 301 redirect in his .htaccess Rule for duplicate content removal : www.domain.com vs domain.com RewriteCond %{HTTP_HOST} ^MY-CUSTOMER-SITE.com [NC]
Intermediate & Advanced SEO | | EnvoyWeb
RewriteRule (.*) http://www.MY-CUSTOMER-SITE.com/$1 [R=301,L,NC] The result of this rule is that i type MY-CUSTOMER-SITE.com in the browser and it redirects to www.MY-CUSTOMER-SITE.com I wonder if this is causing issues in SERPS. If I have some inbound links pointing to www.MY-CUSTOMER-SITE.com and some pointing to MY-CUSTOMER-SITE.com, I would think that this rewrite isn't necessary as it would seem that Googlebot is smart enough to know that these aren't two sites. -----Can you comment on whether this is a best practice for all domains?
-----I've run a report for backlinks. If my thought is true that there are some pointing to www.www.MY-CUSTOMER-SITE.com and some to the www.MY-CUSTOMER-SITE.com, is there any value in addressing this?0 -
Our login pages are being indexed by Google - How do you remove them?
Each of our login pages show up under different subdomains of our website. Currently these are accessible by Google which is a huge competitive advantage for our competitors looking for our client list. We've done a few things to try to rectify the problem: - No index/archive to each login page Robot.txt to all subdomains to block search engines gone into webmaster tools and added the subdomain of one of our bigger clients then requested to remove it from Google (This would be great to do for every subdomain but we have a LOT of clients and it would require tons of backend work to make this happen.) Other than the last option, is there something we can do that will remove subdomains from being viewed from search engines? We know the robots.txt are working since the message on search results say: "A description for this result is not available because of this site's robots.txt – learn more." But we'd like the whole link to disappear.. Any suggestions?
Intermediate & Advanced SEO | | desmond.liang1 -
Reducing Booking Engine Indexation
Hi Mozzers, I am working on a site with a very useful room booking engine. Helpful as it may be, all the variations (2 bedrooms, 3 bedrooms, room with a view, etc, etc,) are indexed by Google. Section 13 on Search Pagination in Dr. Pete's great post on Panda http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world speaks to our issue, but I was wondering since 2 (!) years have gone by, if there are any additional solutions y'all might recommend. We want to cut down on the duplicate titles and content and get the useful but not useful for SERPs online booking pages out of the index. Any thoughts? Thanks for your help.
Intermediate & Advanced SEO | | Leverage_Marketing0 -
How do you de-index and prevent indexation of a whole domain?
I have parts of an online portal displaying in SERPs which it definitely shouldn't be. It's due to thoughtless developers but I need to have the whole portal's domain de-indexed and prevented from future indexing. I'm not too tech savvy but how is this achieved? No index? Robots? thanks
Intermediate & Advanced SEO | | Martin_S0