Index.php canonical/dup issues
-
Hello my fellow SEOs!
I would LOVE some additional insight/opinions on the following...
I have a client who is an industry leader, big site, ranks for many competitive phrases, blah blah..you get the picture.
However, they have a big dup content/canonical issue. Most pages resolve with and without the /index.php at the end of the URL. Obviously this is a dup content issue but more importantly they SEs sometimes serve an "index.php" version of the page, sometimes they don't, and it is constantly changing which version it serves and the rank goes up and down.
Now, I've instructed them that we are going to need to write a sitewide redirect to attempt a uniform structure. Most people would say, redirect to the non index.php version buttttt
1. The index.php pages consistently outperforms the non index.php versions, except the homepage.
2. The client really would prefer to have the "index.php" at the end of the URL
The homepage performs extremely well for a lot of competitive phrases. I'd like to redirect all pages to the "index.php" version except the homepage and I'm thinking that if I redirect all pages EXCEPT the homepage to the index.php version, it could cause some unforeseen issues.
I can not use rel=canonical because they have many different versions of the their pages with different country codes in the URL..example, if I make the US version canonical, it will hurt the pages trying to rank with a fr URL, de URL, (where fr/de are country codes in the URL depending where the user is, it serves the correct version).
Any advice would be GREATLY appreciated. Thanks in advance!
Mike
-
Have you checked the backlinks? The only logical reason I can think of for the index.php versions of the URL to outperform the friendly versions is more sites have linked to them.
I would make every effort to convince the client to use friendly URLs. Users clearly prefer them and technologies change. Even if they are using .php today, in a couple years it may be a dead technology and they will have to redirect their entire site. It's not a logical business move.
With the above noted, if you wish to perform the redirect of all pages except the home page to the index.php form of the URL, it is doable with the proper regex expression. The issues I foresee have already been shared:
-
URLs are harder to read by users and are therefore less friendly
-
URLs are longer so therefore more difficult to share naturally in tweets (for example) without a URL shortening service
-
URLs include "php" so when the site's technology changes the URLs will need to be redirected
-
Users may experience confusion related to the inconsistent URL formats of the home page and the rest of the site
-
Long URLs are cut off. You mentioned using other languages. If a page's title involves foreign characters, those characters are converted in the URL to ?unicode. It is where you see characters like "%20" replace a single character. With foreign URLs the length can often exceed maximums which is an issue. Keeping index.php is an extra 9 characters added to every URL.
This decision approaches the SEO equivalent of a patient going against their doctor's advice. If it was my client, I would want a very firm acknowledgment this decision was against my advice and industry best practices.
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Content Strategy/Duplicate Content Issue, rel=canonical question
Hi Mozzers: We have a client who regularly pays to have high-quality content produced for their company blog. When I say 'high quality' I mean 1000 - 2000 word posts written to a technical audience by a lawyer. We recently found out that, prior to the content going on their blog, they're shipping it off to two syndication sites, both of which slap rel=canonical on them. By the time the content makes it to the blog, it has probably appeared in two other places. What are some thoughts about how 'awful' a practice this is? Of course, I'm arguing to them that the ranking of the content on their blog is bound to be suffering and that, at least, they should post to their own site first and, if at all, only post to other sites several weeks out. Does anyone have deeper thinking about this?
Intermediate & Advanced SEO | | Daaveey0 -
Redirect Issue in .htaccess
Hi, I'm stumped on this, so I'm hoping someone can help. I have a Wordpress site that I migrated to https about a year ago. Shortly after I added some code to my .htaccess file. My intention was to force https and www to all pages. I did see a moderate decline in rankings around the same time, so I feel the code may be wrong. Also, when I run the domain through Open Site Explorer all of the internal links are showing 301 redirects. The code I'm using is below. Thank you in advance for your help! Redirect HTTP to HTTPS RewriteEngine On ensure www. RewriteCond %{HTTP_HOST} !^www. [NC]
Intermediate & Advanced SEO | | JohnWeb12
RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301] ensure https RewriteCond %{HTTP:X-Forwarded-Proto} !https
RewriteCond %{HTTPS} off
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301] BEGIN WordPress <ifmodule mod_rewrite.c="">RewriteEngine On
RewriteBase /
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]</ifmodule> END WordPress USER IP BANNING <limit get="" post="">order allow,deny
deny from 213.238.175.29
deny from 66.249.69.54
allow from all</limit> #Enable gzip compression
AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/x-javascript #Setting heading expires
<ifmodule mod_expires.c="">ExpiresActive on
ExpiresDefault "access plus 1 month"
ExpiresByType application/javascript "access plus 1 year"
ExpiresByType image/x-ico "access plus 1 year"
ExpiresByType image/jpg "access plus 14 days"
ExpiresByType image/jpeg "access plus 14 days"
ExpiresByType image/gif "access plus 14 days"
ExpiresByType image/png "access plus 14 days"
ExpiresByType text/css "access plus 14 days"</ifmodule>0 -
Visibility for https://goo.gl/gJH7eh
Hi Mozzers, I am wondering if anyone can help me with the following. At the start of May this year we really lost visibility for the homepage of this site https://goo.gl/gJH7eh. This was particularly noticeable by tracking rankings for the term 'oak furniture'. We previously ranked on page 1 for the term 'oak furniture', but since May the homepage has struggled to make the top 100 positions for this term. We're confident that we have done everything within Google's guidelines, but it seems something is really holding the homepage back. The site ranks on page 1 for 'oak furniture' on Bing. The site had previously had a manual penalty for unnatural links (warning received several years ago). These links had a particular emphasis on using the anchor text 'oak furniture'. When we took over the site we did an extensive link clean up and disavow and managed to get the penalty removed at the end of October 2013. Any help would be greatly appreciated. Karen
Intermediate & Advanced SEO | | OFS0 -
How to take out international URL from google US index/hreflang help
Hi Moz Community, Weird/confusing question so I'll try my best. The company I work for also has an Australian retail website. When you do a site:ourbrand.com search the second result that pops up is au.brand.com, which redirects to the actual brand.com.au website. The Australian site owner removed this redirect per my bosses request and now it leads to a an unavailable webpage. I'm confused as to best approach, is there a way to noindex the au.brand.com URL from US based searches? My only problem is that the au.brand.com URL is ranking higher than all of the actual US based sub-cat pages when using a site search. Is this an appropriate place for an hreflang tag? Let me know how I can help clarify the issue. Thanks,
Intermediate & Advanced SEO | | IceIcebaby
-Reed0 -
What may cause a page not to be indexed (be de-indexed)?
Hi All, I have a main category page, a landing page, that does not appear in the SERPS at all (even if I serach for a whole sentence from it). This page once ranked high. What may cause such a punishment for a specific page? Thanks
Intermediate & Advanced SEO | | BeytzNet0 -
Retailers Issue
Hi there, We have 20 retailers who are about to launch websites and are going to be selling our products on their websites, however with they have no content for these products they are wanting to take our content we have for our product pages on place the content on their websites, is this going to cause an issue for me? We are ranking well for competitive keywords in this niche and do not want to do anything to harm it. What I would say is the retailers in question of no intention short term anyway of doing anything with SEO. Thanks for any help
Intermediate & Advanced SEO | | Paul780 -
Bing/Yahoo! Updates
On March 27th I noticed a huge rankings drop across the board on a client site in Bing and Yahoo! After some research, I found this article on SEOroundtable (it also links back to a Webmaster World discussion). For this particular site we're talking a few dozen of keywords dropping off the first page, or even from the first page dropping out of the top 50. The only thing not affected were brand keywords. The site was recently relaunched, and has a fairly weak backlink profile right now. It doesn't use keywords in the domain (which was one of the things identified in the SEOroundtable article). Has anyone else noticed changes? If so, what do you attribute them to and how are you combating them?
Intermediate & Advanced SEO | | BedeFahey0 -
Rel=Canonical URLs?
If I had two pages: PageA about Cats PageB about Dogs If PageA had a link rel=canonical to PageB, but the content is different, how would Google resolve this and what would users see if they searched "Cats" or "Dogs?" If PageA 301 redirected to PageB, (no content in PageA since it's 301 redirected), how would Google resolve this and what would users see if they searched "Cats" or "Dogs?"
Intermediate & Advanced SEO | | visionnexus0