Internal file extension canonicalization
-
Ok no doubt this is straightforward, however seem to be finding to hard to find a simple answer; our websites' internal pages have the extension .html. Trying to the navigate to that internal url without the .html extension results in a 404.
The question is; should a 401 be used to direct to the extension-less url to future proof? and should internal links direct to the extension-less url for the same reason?
Hopefully that makes sense and apologies for what I believe is a straightforward answer;
-
As above
example/abc rewrites to example/abc.html
example/abc.html redirects to example/abc
and all internal links link to example/abc
-
Thankyou for the replies.
I will try and clarify what I am trying to get at; apologies in advance for any naivety.
I understand homepage canonicalization; the confusion revolves around how this applies to internal pages.
Logically; I am struggling to see how internal pages are any different to a homepage in terms of the need to avoid multiple urls....and thus an extension-less url seemed appropriate. Not too mention the benefit or cleaner urls, easier to link to, remember etc.
i.e.
example/abc
example/abc.html
example/abc.index.html
-
As nick said, you dont need to do this, but if you are.
1. REWRITE the new url to the old url, as your webserver needs to know the extention
2. REDIRECT the old url to the new one, incase you already have links to the old urls, you dont want5 duplicate content
3. you need to make surer that all internal links point to the new url, you dont want un-necessary redirects as they leak link juice.
-
I'm about to make a whole lot of assumptions about your website to give this answer, just be aware.
Your website is built static, using HTML. Hence the .html file extension. If you're seeing websites that don't have file extension, it's most likely they are using content management systems (or have some serious /folder/index.html stuff going on).
Having a file extension like .html or .aspx or .php is not a bad thing. On websites like yours, it is required (unless you do the above subfolder thing) because it's an actual file the browser is grabbing rather than something being dynamically generated by a CMS. It has nothing to do with future-proofing.
As for 301'ing non-extension URLs to extention'd ones...well I don't know why you'd need to do that for your type of site.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to no index / no follow CAD files .dxf .dwg
Hi, I have a new Wordpress site with a number of CAD files (.dxf& .dwg) downloadable straight from the site. These have been flagged in MOZ as warnings with everying from No Title/Description to duplicate content. Does anybody now how I would no index these type of files? Many thanks.
Technical SEO | | Jon_Pearce0 -
Is there a limit to how many URLs you can put in a robots.txt file?
We have a site that has way too many urls caused by our crawlable faceted navigation. We are trying to purge 90% of our urls from the indexes. We put no index tags on the url combinations that we do no want indexed anymore, but it is taking google way too long to find the no index tags. Meanwhile we are getting hit with excessive url warnings and have been it by Panda. Would it help speed the process of purging urls if we added the urls to the robots.txt file? Could this cause any issues for us? Could it have the opposite effect and block the crawler from finding the urls, but not purge them from the index? The list could be in excess of 100MM urls.
Technical SEO | | kcb81780 -
Sitemap international websites
Hey Mozzers,Here is the case that I would appreciate your reply for: I will build a sitemap for .com domain which has multiple domains for other countries (like Italy, Germany etc.). The question is can I put the hreflang annotations in sitemap1 only and have a sitemap 2 with all URLs for EN/default version of the website .COM. Then put 2 sitemaps in a sitemap index. The issue is that there are pages that go away quickly (like in 1-2 days), they are localised, but I prefer not to give annotations for them, I want to keep clear lang annotations in sitemap 1. In this way, I will replace only sitemap 2 and keep sitemap 1 intact. Would it work? Or I better put everything in one sitemap?The second question is whether you recommend to do the same exercise for all subdomains and other domains? I have read much on the topic, but not sure whether it worth the effort.The third question is if I have www.example.it and it.example.com, should I include both in my sitemap with hreflang annotations (the sitemap on www.example.com) and put there it for subdomain and it-it for the .it domain (to specify lang and lang + country).Thanks a lot for your time and have a great day,Ani
Technical SEO | | SBTech0 -
Can you be penalised in Google for excessive internal keyword linking?
I have an online shop and 3 blogs (with different topics) all set up on sub-domains (for security reasons, don't want Word Press installed in the same hosting space as my shop in case one gets hacked). I have been on the front page of Google for a keyword, lets say 'widgets' for months now. I have been writing blogs about 'widgets', probably about 1/4 of all my blog posts are linking to the 'widgets' page in my shop. I write maybe 1-2 blogs a week, so it's not excessive. This morning I have woken to fine that the widgets page in my shop has vanished from Google's index. So typing in 'widgets' brings up nothing. It hasn't dropped in the rankings, it's just vanished. A few weeks ago I ranked 3 or 4. Then I dropped to about 6. A couple of days ago, i jumped back up to 5 and now it's vanished. If you type in 'buy widgets', or 'widgets online' or 'widgets australia', I have the #1 spot for all those, but for 'widgets', I just don't exist anymore. Could I have been penalised for writing too many posts and keyword linking internally? They're not keyword stuffed and they're well written. I just don't understand what's happened. Right now I"m freaking out about blogging and putting internal links on my website.
Technical SEO | | sparrowdog0 -
Tool to search relative vs absolute internal links
I'm preparing for a site migration from a .co.uk to a .com and I want to ensure all internal links are updated to point to the new primary domain. What tool can I use to check internal links as some are relative and others are absolute so I need to update them all to relative.
Technical SEO | | Lindsay_D0 -
Canonicalization Issue?
Good day! I am not sure if my company has a Canonicalization issue? When typing in www.cushingco.com the site redirects to http://www.cushingco.com/index.shtml A visitor can also type in http://cushingco.com/index.shtml into a web browser and land on our homepage (and the url will be http://www.cushingco.com/index.shtml) A majority of websites that link to our company point to: http://www.cushingco.com/index.shtml We are in the process of cleaning up citations and pulling together a content marketing strategy/editorial calendar. I want to be sure folks interested in linking to us have the right url. Please ask me any questions to help narrow down what we might be doing incorrectly. Thanks in advance!! Jon
Technical SEO | | SEOSponge0 -
International Websites: rel="alternate" hreflang="x"
Hi people, I keep on reading and reading , but I won't get it... 😉 I mean this page: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=189077&topic=2370587&ctx=topic On the bottom of the page they say: Step 2: Use rel="alternate" hreflang="x" Update the HTML of each URL in the set by adding a set of rel="alternate" hreflang="x" link elements. Include a rel="alternate" hreflang="x" link for every URL in the set, like this: This markup tells Google's algorithm to consider all of these pages as alternate versions of each other. OK! Each URL needs this markup. BUT: Do i need it exactly as written above, or do I have to put in the complete URL of the site, like: The next question is, what happens exactly in the SERPS when I do it like this (an also with Step1 that I haven't copied here)? Google will display the "canonical"-version of the page, but wehen a user from US clicks he will get on http://en-us.example.com/**page.htm **??? I tried to find other sites which use this method, but I haven't found one. Can someone give me an example.website??? Thank you, thank you very much! André
Technical SEO | | waynestock0 -
Php to html - change in extension
Hello, What is the code to redirect all the pages of a site from .php to .html extension? Thanks
Technical SEO | | seoug_20051