Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Trailing Slashes for Magento CMS pages - 2 URLS - Duplicate content
-
Hello,
Can anyone help me find a solution to Fixing and Creating Magento CMS pages to only use one URL and not two URLS?
I found a previous article that applies to my issue, which is using htaccess to redirect request for pages in magento 301 redirect to slash URL from the non-slash URL. I dont understand the syntax fully in htaccess , but I used this code below.
This code below fixed the CMS page redirection but caused issues on other pages, like all my categories and products with this error:
"This webpage has a redirect loop
ERR_TOO_MANY_REDIRECTS"
Assuming you're running at domain root. Change to working directory if needed.
RewriteBase /
# www check
If you're running in a subdirectory, then you'll need to add that in
to the redirected url (http://www.mydomain.com/subdirectory/$1
RewriteCond %{HTTP_HOST} !^www. [NC]
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [R=301,L]Trailing slash check
Don't fix direct file links
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.)/$
RewriteRule ^(.)$ $1/ [L,R=301]Finally, forward everything to your front-controller (index.php)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.php [QSA,L] -
301's are not difficult for me, but handling the code for a logic to re-route requests for "URL" to "URL/" is something I dont know how to do. I can manually 301 or rel canonical my CMS pages on Magento everytime, but that defeats the purpose or the automation in htaccess I am trying to get working.
thanks
-
Thank You Kevin.
This is almost the default Magento htaccess file(out of the box), I think I had a couple entries to fix a couple other issues, the code I just added that isnt working is in the middle of the htaccess, its commented starting with this: ** "## slash removal re-write done by ALEX MEADE for iamgreenminded.com**
uncomment these lines for CGI mode
make sure to specify the correct cgi php binary file name
it might be /cgi-bin/php-cgi
Action php5-cgi /cgi-bin/php5-cgi
AddHandler php5-cgi .php
############################################
GoDaddy specific options
Options -MultiViews
you might also need to add this line to php.ini
cgi.fix_pathinfo = 1
if it still doesn't work, rename php.ini to php5.ini
############################################
this line is specific for 1and1 hosting
#AddType x-mapp-php5 .php
#AddHandler x-mapp-php5 .php############################################
default index file
DirectoryIndex index.php
############################################
adjust memory limit
php_value memory_limit 64M
php_value memory_limit 256M
php_value max_execution_time 18000############################################
disable magic quotes for php request vars
php_flag magic_quotes_gpc off
############################################
disable automatic session start
before autoload was initialized
php_flag session.auto_start off
############################################
enable resulting html compression
#php_flag zlib.output_compression on
###########################################
disable user agent verification to not break multiple image upload
php_flag suhosin.session.cryptua off
###########################################
turn off compatibility with PHP4 when dealing with objects
php_flag zend.ze1_compatibility_mode Off
<ifmodule mod_security.c="">###########################################
disable POST processing to not break multiple image upload</ifmodule>
SecFilterEngine Off
SecFilterScanPOST Off############################################
enable apache served files compression
http://developer.yahoo.com/performance/rules.html#gzip
Insert filter on all content
###SetOutputFilter DEFLATE
Insert filter on selected content types only
#AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css text/javascript
Netscape 4.x has some problems...
#BrowserMatch ^Mozilla/4 gzip-only-text/html
Netscape 4.06-4.08 have some more problems
#BrowserMatch ^Mozilla/4.0[678] no-gzip
MSIE masquerades as Netscape, but it is fine
#BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
Don't compress images
#SetEnvIfNoCase Request_URI .(?:gif|jpe?g|png)$ no-gzip dont-vary
Make sure proxies don't deliver the wrong content
#Header append Vary User-Agent env=!dont-vary
############################################
make HTTPS env vars available for CGI mode
SSLOptions StdEnvVars
############################################
enable rewrites
Options +FollowSymLinks
RewriteEngine on############################################
slash removal re-write done by ALEX MEADE for iamgreenminded.com
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !(.)/$
RewriteCond %{REQUEST_FILENAME} !.(gif|jpg|png|jpeg|css|js)$ [NC]
RewriteRule ^(.)$ http://%{HTTP_HOST}/$1/ [L,R=301]
########################################################################################
you can put here your magento root folder
path relative to web root
#RewriteBase /magento/
############################################
uncomment next line to enable light API calls processing
RewriteRule ^api/([a-z][0-9a-z_]+)/?$ api.php?type=$1 [QSA,L]
############################################
rewrite API2 calls to api.php (by now it is REST only)
RewriteRule ^api/rest api.php?type=rest [QSA,L]
############################################
workaround for HTTP authorization
in CGI environment
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
############################################
TRACE and TRACK HTTP methods disabled to prevent XSS attacks
RewriteCond %{REQUEST_METHOD} ^TRAC[EK]
RewriteRule .* - [L,R=405]############################################
redirect for mobile user agents
#RewriteCond %{REQUEST_URI} !^/mobiledirectoryhere/.$
#RewriteCond %{HTTP_USER_AGENT} "android|blackberry|ipad|iphone|ipod|iemobile|opera mobile|palmos|webos|googlebot-mobile" [NC]
#RewriteRule ^(.)$ /mobiledirectoryhere/ [L,R=302]############################################
always send 404 on missing files in these folders
RewriteCond %{REQUEST_URI} !^/(media|skin|js)/
############################################
never rewrite for existing files, directories and links
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l############################################
rewrite everything else to index.php
RewriteRule .* index.php [L]
############################################
Prevent character encoding issues from server overrides
If you still have problems, use the second line instead
AddDefaultCharset Off
#AddDefaultCharset UTF-8############################################
Add default Expires header
http://developer.yahoo.com/performance/rules.html#expires
ExpiresDefault "access plus 1 year"
############################################
By default allow all access
Order allow,deny
Allow from all###########################################
Deny access to release notes to prevent disclosure of the installed Magento version
<files release_notes.txt="">order allow,deny
deny from all</files>############################################
If running in cluster environment, uncomment this
http://developer.yahoo.com/performance/rules.html#etags
#FileETag none
Permanent URL redirect - generated by www.rapidtables.com
Redirect 301 /thebirdword http://www.thebirdword.com
-
You probably have other redirects in your .htaccess and possibly in your website code. The order of your rewrites is also important. Publish your Apache config and I'll take a look.
FYI, there are better resources for technical issue than MOZ. Most here are not developers/IT specialists; we're more like SEO strategists and business managers.
-
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !example.php
RewriteCond %{REQUEST_URI} !(.)/$
RewriteRule ^(.)$ http://domain.com/$1/ [L,R=301]I have found both of the articles you linked here, nothing is working - any code I try gives me the same error on most of my pages:
"This webpage has a redirect loop
ERR_TOO_MANY_REDIRECTS"
Still need a fix for this
thanks
-
Yes, server redirects are necessary. Try these solutions to see which one works for you:
http://ralphvanderpauw.com/seo/how-to-301-redirect-a-trailing-slash-in-htaccess/
http://enarion.net/web/htaccess/trailing-slash/
You might want to consider moving to Nginx. You'll notice amazing speed and stability improvement with Nginx, Redis Session Cache, Memcached, OpCache, Ngx_pagespeed, and Magento Cache Storage Management. I can help much more with Nginx redirects and conf files--I gave up Apache years ago. Sorry I couldn't be of more help.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Removing the Trailing Slash in Magento
Hi guys, We have noticed trailing slash vs non-trailing slash duplication on one of our sites. Example:
Intermediate & Advanced SEO | | brandonegroup
Duplicate: https://www.example.com.au/living/
Preferred: https://www.example.com.au/living So, SEO-wise, we suggested placing a canonical tag on all trailing slash pointing to non-trailing slash. However, devs have advised against removing the trailing slash from some URLs with a blanket rule, as this may break functionality in Magento that depends on the trailing slash. The full site would need to be tested after implementing a blanket rewrite rule. Is any other way to address this trailing slash duplication issue without breaking anything in Magento? Keen to hear from you guys. Cheers,0 -
Directory with Duplicate content? what to do?
Moz keeps finding loads of pages with duplicate content on my website. The problem is its a directory page to different locations. E.g if we were a clothes shop we would be listing our locations: www.sitename.com/locations/london www.sitename.com/locations/rome www.sitename.com/locations/germany The content on these pages is all the same, except for an embedded google map that shows the location of the place. The problem is that google thinks all these pages are duplicated content. Should i set a canonical link on every single page saying that www.sitename.com/locations/london is the main page? I don't know if i can use canonical links because the page content isn't identical because of the embedded map. Help would be appreciated. Thanks.
Intermediate & Advanced SEO | | nchlondon0 -
Help with facet URLs in Magento
Hi Guys, Wondering if I can get some technical help here... We have our site britishbraces.co.uk , built in Magento. As per eCommerce sites, we have paginated pages throughout. These have rel=next/prev implemented but not correctly ( as it is not in is it in ) - this fix is in process. Our canonicals are currently incorrect as far as I believe, as even when content is filtered, the canonical takes you back to the first page URL. For example, http://www.britishbraces.co.uk/braces/x-style.html?ajaxcatalog=true&brand=380&max=51.19&min=31.19 Canonical to... http://www.britishbraces.co.uk/braces/x-style.html Which I understand to be incorrect. As I want the coloured filtered pages to be indexed ( due to search volume for colour related queries ), but I don't want the price filtered pages to be indexed - I am unsure how to implement the solution? As I understand, because rel=next/prev implemented ( with no View All page ), the rel=canonical is not necessary as Google understands page 1 is the first page in the series. Therefore, once a user has filtered by colour, there should then be a canonical pointing to the coloured filter URL? ( e.g. /product/black ) But when a user filters by price, there should be noindex on those URLs ? Or can this be blocked in robots.txt prior? My head is a little confused here and I know we have an issue because our amount of indexed pages is increasing day by day but to no solution of the facet urls. Can anybody help - apologies in advance if I have confused the matter. Thanks
Intermediate & Advanced SEO | | HappyJackJr0 -
Duplicate content on recruitment website
Hi everyone, It seems that Panda 4.2 has hit some industries more than others. I just started working on a website, that has no manual action, but the organic traffic has dropped massively in the last few months. Their external linking profile seems to be fine, but I suspect usability issues, especially the duplication may be the reason. The website is a recruitment website in a specific industry only. However, they posts jobs for their clients, that can be very similar, and in the same time they can have 20 jobs with the same title and very similar job descriptions. The website currently have over 200 pages with potential duplicate content. Additionally, these jobs get posted on job portals, with the same content (Happens automatically through a feed). The questions here are: How bad would this be for the website usability, and would it be the reason the traffic went down? Is this the affect of Panda 4.2 that is still rolling What can be done to resolve these issues? Thank you in advance.
Intermediate & Advanced SEO | | iQi0 -
Case Sensitive URLs, Duplicate Content & Link Rel Canonical
I have a site where URLs are case sensitive. In some cases the lowercase URL is being indexed and in others the mixed case URL is being indexed. This is leading to duplicate content issues on the site. The site is using link rel canonical to specify a preferred URL in some cases however there is no consistency whether the URLs are lowercase or mixed case. On some pages the link rel canonical tag points to the lowercase URL, on others it points to the mixed case URL. Ideally I'd like to update all link rel canonical tags and internal links throughout the site to use the lowercase URL however I'm apprehensive! My question is as follows: If I where to specify the lowercase URL across the site in addition to updating internal links to use lowercase URLs, could this have a negative impact where the mixed case URL is the one currently indexed? Hope this makes sense! Dave
Intermediate & Advanced SEO | | allianzireland0 -
Avoiding Duplicate Content with Used Car Listings Database: Robots.txt vs Noindex vs Hash URLs (Help!)
Hi Guys, We have developed a plugin that allows us to display used vehicle listings from a centralized, third-party database. The functionality works similar to autotrader.com or cargurus.com, and there are two primary components: 1. Vehicle Listings Pages: this is the page where the user can use various filters to narrow the vehicle listings to find the vehicle they want.
Intermediate & Advanced SEO | | browndoginteractive
2. Vehicle Details Pages: this is the page where the user actually views the details about said vehicle. It is served up via Ajax, in a dialog box on the Vehicle Listings Pages. Example functionality: http://screencast.com/t/kArKm4tBo The Vehicle Listings pages (#1), we do want indexed and to rank. These pages have additional content besides the vehicle listings themselves, and those results are randomized or sliced/diced in different and unique ways. They're also updated twice per day. We do not want to index #2, the Vehicle Details pages, as these pages appear and disappear all of the time, based on dealer inventory, and don't have much value in the SERPs. Additionally, other sites such as autotrader.com, Yahoo Autos, and others draw from this same database, so we're worried about duplicate content. For instance, entering a snippet of dealer-provided content for one specific listing that Google indexed yielded 8,200+ results: Example Google query. We did not originally think that Google would even be able to index these pages, as they are served up via Ajax. However, it seems we were wrong, as Google has already begun indexing them. Not only is duplicate content an issue, but these pages are not meant for visitors to navigate to directly! If a user were to navigate to the url directly, from the SERPs, they would see a page that isn't styled right. Now we have to determine the right solution to keep these pages out of the index: robots.txt, noindex meta tags, or hash (#) internal links. Robots.txt Advantages: Super easy to implement Conserves crawl budget for large sites Ensures crawler doesn't get stuck. After all, if our website only has 500 pages that we really want indexed and ranked, and vehicle details pages constitute another 1,000,000,000 pages, it doesn't seem to make sense to make Googlebot crawl all of those pages. Robots.txt Disadvantages: Doesn't prevent pages from being indexed, as we've seen, probably because there are internal links to these pages. We could nofollow these internal links, thereby minimizing indexation, but this would lead to each 10-25 noindex internal links on each Vehicle Listings page (will Google think we're pagerank sculpting?) Noindex Advantages: Does prevent vehicle details pages from being indexed Allows ALL pages to be crawled (advantage?) Noindex Disadvantages: Difficult to implement (vehicle details pages are served using ajax, so they have no tag. Solution would have to involve X-Robots-Tag HTTP header and Apache, sending a noindex tag based on querystring variables, similar to this stackoverflow solution. This means the plugin functionality is no longer self-contained, and some hosts may not allow these types of Apache rewrites (as I understand it) Forces (or rather allows) Googlebot to crawl hundreds of thousands of noindex pages. I say "force" because of the crawl budget required. Crawler could get stuck/lost in so many pages, and my not like crawling a site with 1,000,000,000 pages, 99.9% of which are noindexed. Cannot be used in conjunction with robots.txt. After all, crawler never reads noindex meta tag if blocked by robots.txt Hash (#) URL Advantages: By using for links on Vehicle Listing pages to Vehicle Details pages (such as "Contact Seller" buttons), coupled with Javascript, crawler won't be able to follow/crawl these links. Best of both worlds: crawl budget isn't overtaxed by thousands of noindex pages, and internal links used to index robots.txt-disallowed pages are gone. Accomplishes same thing as "nofollowing" these links, but without looking like pagerank sculpting (?) Does not require complex Apache stuff Hash (#) URL Disdvantages: Is Google suspicious of sites with (some) internal links structured like this, since they can't crawl/follow them? Initially, we implemented robots.txt--the "sledgehammer solution." We figured that we'd have a happier crawler this way, as it wouldn't have to crawl zillions of partially duplicate vehicle details pages, and we wanted it to be like these pages didn't even exist. However, Google seems to be indexing many of these pages anyway, probably based on internal links pointing to them. We could nofollow the links pointing to these pages, but we don't want it to look like we're pagerank sculpting or something like that. If we implement noindex on these pages (and doing so is a difficult task itself), then we will be certain these pages aren't indexed. However, to do so we will have to remove the robots.txt disallowal, in order to let the crawler read the noindex tag on these pages. Intuitively, it doesn't make sense to me to make googlebot crawl zillions of vehicle details pages, all of which are noindexed, and it could easily get stuck/lost/etc. It seems like a waste of resources, and in some shadowy way bad for SEO. My developers are pushing for the third solution: using the hash URLs. This works on all hosts and keeps all functionality in the plugin self-contained (unlike noindex), and conserves crawl budget while keeping vehicle details page out of the index (unlike robots.txt). But I don't want Google to slap us 6-12 months from now because it doesn't like links like these (). Any thoughts or advice you guys have would be hugely appreciated, as I've been going in circles, circles, circles on this for a couple of days now. Also, I can provide a test site URL if you'd like to see the functionality in action.0 -
News sites & Duplicate content
Hi SEOMoz I would like to know, in your opinion and according to 'industry' best practice, how do you get around duplicate content on a news site if all news sites buy their "news" from a central place in the world? Let me give you some more insight to what I am talking about. My client has a website that is purely focuses on news. Local news in one of the African Countries to be specific. Now, what we noticed the past few months is that the site is not ranking to it's full potential. We investigated, checked our keyword research, our site structure, interlinking, site speed, code to html ratio you name it we checked it. What we did pic up when looking at duplicate content is that the site is flagged by Google as duplicated, BUT so is most of the news sites because they all get their content from the same place. News get sold by big companies in the US (no I'm not from the US so cant say specifically where it is from) and they usually have disclaimers with these content pieces that you can't change the headline and story significantly, so we do have quite a few journalists that rewrites the news stories, they try and keep it as close to the original as possible but they still change it to fit our targeted audience - where my second point comes in. Even though the content has been duplicated, our site is more relevant to what our users are searching for than the bigger news related websites in the world because we do hyper local everything. news, jobs, property etc. All we need to do is get off this duplicate content issue, in general we rewrite the content completely to be unique if a site has duplication problems, but on a media site, im a little bit lost. Because I haven't had something like this before. Would like to hear some thoughts on this. Thanks,
Intermediate & Advanced SEO | | 360eight-SEO
Chris Captivate0 -
Duplicate Content | eBay
My client is generating templates for his eBay template based on content he has on his eCommerce platform. I'm 100% sure this will cause duplicate content issues. My question is this.. and I'm not sure where eBay policy stands with this but adding the canonical tag to the template.. will this work if it's coming from a different page i.e. eBay? Update: I'm not finding any information regarding this on the eBay policy's: http://ocs.ebay.com/ws/eBayISAPI.dll?CustomerSupport&action=0&searchstring=canonical So it does look like I can have rel="canonical" tag in custom eBay templates but I'm concern this can be considered: "cheating" since rel="canonical is actually a 301 but as this says: http://googlewebmastercentral.blogspot.com/2009/12/handling-legitimate-cross-domain.html it's legitimately duplicate content. The question is now: should I add it or not? UPDATE seems eBay templates are embedded in a iframe but the snap shot on google actually shows the template. This makes me wonder how they are handling iframes now. looking at http://www.webmaster-toolkit.com/search-engine-simulator.shtml does shows the content inside the iframe. Interesting. Anyone else have feedback?
Intermediate & Advanced SEO | | joseph.chambers1