Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Trailing Slashes for Magento CMS pages - 2 URLS - Duplicate content
-
Hello,
Can anyone help me find a solution for fixing and creating Magento CMS pages so that each page uses only one URL instead of two?
I found a previous article that applies to my issue. It uses .htaccess to 301 redirect requests for Magento pages from the non-slash URL to the slash URL. I don't fully understand the .htaccess syntax, but I used the code below.
The code fixed the CMS page redirection but caused issues on other pages, such as all my categories and products, which now fail with this error:
"This webpage has a redirect loop
ERR_TOO_MANY_REDIRECTS"
# Assuming you're running at domain root. Change to working directory if needed.
RewriteBase /
# www check
# If you're running in a subdirectory, then you'll need to add that in
# to the redirected url (http://www.mydomain.com/subdirectory/$1)
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [R=301,L]
# Trailing slash check
# Don't fix direct file links
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ $1/ [L,R=301]
# Finally, forward everything to your front-controller (index.php)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.php [QSA,L]
-
301s are not difficult for me, but I don't know how to write the logic that re-routes requests for "URL" to "URL/". I can manually 301 or rel=canonical my Magento CMS pages every time, but that defeats the purpose of the automation I am trying to get working in .htaccess.
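For illustration, here is a minimal sketch of the "URL" to "URL/" logic described above. It is not a confirmed fix for this site. It assumes that category and product URLs end in a suffix such as .html (the Magento default), that the admin and API live under /admin and /api (hypothetical paths; adjust to your install), and that the loop comes from the application redirecting slash-suffixed .html URLs back to the non-slash form, which this thread does not confirm.

```apache
# Hypothetical sketch: add a trailing slash only to extension-less paths.
RewriteEngine On
RewriteBase /

# Only redirect GET/HEAD requests so form POSTs are not lost in the 301
RewriteCond %{REQUEST_METHOD} ^(GET|HEAD)$
# Leave real files and directories alone
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Skip URLs that already end in a slash
RewriteCond %{REQUEST_URI} !/$
# Skip URLs ending in a file extension (e.g. .html category/product pages)
RewriteCond %{REQUEST_URI} !\.[a-zA-Z0-9]+$
# Skip admin, API and direct index.php requests (assumed paths)
RewriteCond %{REQUEST_URI} !^/(admin|api|index\.php)
# Redirect /about-us to /about-us/ on the same host
RewriteRule ^(.*)$ /$1/ [R=301,L]
```

With these conditions a CMS page such as /about-us would be redirected to /about-us/, while a URL like /some-category.html is left untouched, so the rule cannot bounce back and forth with whatever handles the .html URLs.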
thanks
-
Thank you, Kevin.
This is almost the default Magento .htaccess file (out of the box). I think I added a couple of entries to fix a couple of other issues. The code I just added that isn't working is in the middle of the file; it starts with this comment: "## slash removal re-write done by ALEX MEADE for iamgreenminded.com"
## uncomment these lines for CGI mode
## make sure to specify the correct cgi php binary file name
## it might be /cgi-bin/php-cgi
#Action php5-cgi /cgi-bin/php5-cgi
#AddHandler php5-cgi .php
############################################
## GoDaddy specific options
Options -MultiViews
## you might also need to add this line to php.ini
## cgi.fix_pathinfo = 1
## if it still doesn't work, rename php.ini to php5.ini
############################################
## this line is specific for 1and1 hosting
#AddType x-mapp-php5 .php
#AddHandler x-mapp-php5 .php
############################################
## default index file
DirectoryIndex index.php
############################################
## adjust memory limit
php_value memory_limit 64M
php_value memory_limit 256M
php_value max_execution_time 18000
############################################
## disable magic quotes for php request vars
php_flag magic_quotes_gpc off
############################################
## disable automatic session start
## before autoload was initialized
php_flag session.auto_start off
############################################
## enable resulting html compression
#php_flag zlib.output_compression on
###########################################
## disable user agent verification to not break multiple image upload
php_flag suhosin.session.cryptua off
###########################################
## turn off compatibility with PHP4 when dealing with objects
php_flag zend.ze1_compatibility_mode Off
<IfModule mod_security.c>
###########################################
## disable POST processing to not break multiple image upload
SecFilterEngine Off
SecFilterScanPOST Off
</IfModule>
############################################
## enable apache served files compression
## http://developer.yahoo.com/performance/rules.html#gzip
## Insert filter on all content
###SetOutputFilter DEFLATE
## Insert filter on selected content types only
#AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css text/javascript
## Netscape 4.x has some problems...
#BrowserMatch ^Mozilla/4 gzip-only-text/html
## Netscape 4.06-4.08 have some more problems
#BrowserMatch ^Mozilla/4\.0[678] no-gzip
## MSIE masquerades as Netscape, but it is fine
#BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
## Don't compress images
#SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
## Make sure proxies don't deliver the wrong content
#Header append Vary User-Agent env=!dont-vary
############################################
## make HTTPS env vars available for CGI mode
SSLOptions StdEnvVars
############################################
## enable rewrites
Options +FollowSymLinks
RewriteEngine on
############################################
## slash removal re-write done by ALEX MEADE for iamgreenminded.com
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteCond %{REQUEST_FILENAME} !\.(gif|jpg|png|jpeg|css|js)$ [NC]
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1/ [L,R=301]
########################################################################################
## you can put here your magento root folder
## path relative to web root
#RewriteBase /magento/
############################################
## uncomment next line to enable light API calls processing
#RewriteRule ^api/([a-z][0-9a-z_]+)/?$ api.php?type=$1 [QSA,L]
############################################
## rewrite API2 calls to api.php (by now it is REST only)
RewriteRule ^api/rest api.php?type=rest [QSA,L]
############################################
## workaround for HTTP authorization
## in CGI environment
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
############################################
## TRACE and TRACK HTTP methods disabled to prevent XSS attacks
RewriteCond %{REQUEST_METHOD} ^TRAC[EK]
RewriteRule .* - [L,R=405]
############################################
## redirect for mobile user agents
#RewriteCond %{REQUEST_URI} !^/mobiledirectoryhere/.*$
#RewriteCond %{HTTP_USER_AGENT} "android|blackberry|ipad|iphone|ipod|iemobile|opera mobile|palmos|webos|googlebot-mobile" [NC]
#RewriteRule ^(.*)$ /mobiledirectoryhere/ [L,R=302]
############################################
## always send 404 on missing files in these folders
RewriteCond %{REQUEST_URI} !^/(media|skin|js)/
############################################
## never rewrite for existing files, directories and links
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l
############################################
## rewrite everything else to index.php
RewriteRule .* index.php [L]
############################################
## Prevent character encoding issues from server overrides
## If you still have problems, use the second line instead
AddDefaultCharset Off
#AddDefaultCharset UTF-8
############################################
## Add default Expires header
## http://developer.yahoo.com/performance/rules.html#expires
ExpiresDefault "access plus 1 year"
############################################
## By default allow all access
Order allow,deny
Allow from all
###########################################
## Deny access to release notes to prevent disclosure of the installed Magento version
<Files RELEASE_NOTES.txt>
order allow,deny
deny from all
</Files>
############################################
## If running in cluster environment, uncomment this
## http://developer.yahoo.com/performance/rules.html#etags
#FileETag none
## Permanent URL redirect - generated by www.rapidtables.com
Redirect 301 /thebirdword http://www.thebirdword.com
-
You probably have other redirects in your .htaccess and possibly in your website code. The order of your rewrites is also important. Publish your Apache config and I'll take a look.
FYI, there are better resources for technical issues than Moz. Most people here are not developers or IT specialists; we're more like SEO strategists and business managers.
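To illustrate the point about rewrite order, here is a rough skeleton, assuming a typical front-controller setup. The hostname www.example.com is a placeholder, and this is a sketch of the general ordering rather than a drop-in file. External redirects come first, and the internal rewrite to index.php comes last, because a rule flagged [L] ends the current pass and any redirect placed after it would not run for the original URL.

```apache
RewriteEngine On
RewriteBase /

# 1. External redirects first: canonical host (placeholder hostname)
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# 2. Any trailing-slash redirect would go here, before the front controller

# 3. Internal rewrite last: hand everything that is not a real
#    file, directory or symlink to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l
RewriteRule .* index.php [L]
```

If a redirect still loops with this ordering, the second redirect is probably being issued somewhere else, for example by the application itself or by another rule elsewhere in the file.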
-
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !example.php
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://domain.com/$1/ [L,R=301]
I have found both of the articles you linked here, but nothing is working. Any code I try gives me the same error on most of my pages:
"This webpage has a redirect loop
ERR_TOO_MANY_REDIRECTS"
I still need a fix for this.
thanks
-
Yes, server redirects are necessary. Try these solutions to see which one works for you:
http://ralphvanderpauw.com/seo/how-to-301-redirect-a-trailing-slash-in-htaccess/
http://enarion.net/web/htaccess/trailing-slash/
You might want to consider moving to Nginx. You'll notice amazing speed and stability improvements with Nginx, Redis Session Cache, Memcached, OpCache, Ngx_pagespeed, and Magento Cache Storage Management. I can help much more with Nginx redirects and conf files; I gave up Apache years ago. Sorry I couldn't be of more help.