Internal file extension canonicalization
-
Ok no doubt this is straightforward, however seem to be finding to hard to find a simple answer; our websites' internal pages have the extension .html. Trying to the navigate to that internal url without the .html extension results in a 404.
The question is; should a 401 be used to direct to the extension-less url to future proof? and should internal links direct to the extension-less url for the same reason?
Hopefully that makes sense and apologies for what I believe is a straightforward answer;
-
As above
example/abc rewrites to example/abc.html
example/abc.html redirects to example/abc
and all internal links link to example/abc
-
Thankyou for the replies.
I will try and clarify what I am trying to get at; apologies in advance for any naivety.
I understand homepage canonicalization; the confusion revolves around how this applies to internal pages.
Logically; I am struggling to see how internal pages are any different to a homepage in terms of the need to avoid multiple urls....and thus an extension-less url seemed appropriate. Not too mention the benefit or cleaner urls, easier to link to, remember etc.
i.e.
example/abc
example/abc.html
example/abc.index.html
-
As nick said, you dont need to do this, but if you are.
1. REWRITE the new url to the old url, as your webserver needs to know the extention
2. REDIRECT the old url to the new one, incase you already have links to the old urls, you dont want5 duplicate content
3. you need to make surer that all internal links point to the new url, you dont want un-necessary redirects as they leak link juice.
-
I'm about to make a whole lot of assumptions about your website to give this answer, just be aware.
Your website is built static, using HTML. Hence the .html file extension. If you're seeing websites that don't have file extension, it's most likely they are using content management systems (or have some serious /folder/index.html stuff going on).
Having a file extension like .html or .aspx or .php is not a bad thing. On websites like yours, it is required (unless you do the above subfolder thing) because it's an actual file the browser is grabbing rather than something being dynamically generated by a CMS. It has nothing to do with future-proofing.
As for 301'ing non-extension URLs to extention'd ones...well I don't know why you'd need to do that for your type of site.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Blocking subdomains with Robots.txt file
We noticed that Google is indexing our pre-production site ibweb.prod.interstatebatteries.com in addition to indexing our main site interstatebatteries.com. Can you all help shed some light on the proper way to no-index our pre-prod site without impacting our live site?
Technical SEO | | paulwatley0 -
301 redirect file question
Hi Everyone, I am creating a list of 301 redirects to give to a developer to put into Magento. I used Screaming Frog to crawl the site, but I have noticed that all of their urls 302 to another page. I am wondering if I should 301 the first URL to the url on the new site, or the second. I am thinking the first, but would love some confirmation. Thank you!
Technical SEO | | mrbobland0 -
Robots.txt - "File does not appear to be valid"
Good afternoon Mozzers! I've got a weird problem with one of the sites I'm dealing with. For some reason, one of the developers changed the robots.txt file to disavow every site on the page - not a wise move! To rectify this, we uploaded the new robots.txt file to the domain's root as per Webmaster Tool's instructions. The live file is: User-agent: * (http://www.savistobathrooms.co.uk/robots.txt) I've submitted the new file in Webmaster Tools and it's pulling it through correctly in the editor. However, Webmaster Tools is not happy with it, for some reason. I've attached an image of the error. Does anyone have any ideas? I'm managing another site with the exact same robots.txt file and there are no issues. Cheers, Lewis FNcK2YQ
Technical SEO | | PeaSoupDigital0 -
Domain Types/Extensions – Are .infos any good ?
Hi I know general concensus is to stay away from the non established domain suffix types and concentrate on .coms .co.uk’s etc etc. But i have an aged .info domain that has some content on it related to a online news paper i have on that subject (on the news paper providers domain sub-folder currently) which i want to focus more time on and put on its own dedicated domain. So i want to upload it to this aged .info domain. However waste of time if .info domains are bad for seo etc Does anyone have any experience of .info doing well in serps or should i totally scrap the idea and try find a new .com etc type domain ? My .info has been live with related content for 7 years so hoping that should count for something 🙂 All Best
Technical SEO | | Dan-Lawrence
Dan0 -
301 Redirect keep html files on server?
Hello just one quick question which came up in the discussion here: http://moz.com/community/q/take-a-good-amount-of-existing-landing-pages-offline-because-of-low-traffic-cannibalism-and-thin-content When I do 301 redirects where I put together content from 2 pages, should I keep the page/html which redirects on the server? Or should I delete? Or does it make no difference at all?
Technical SEO | | _Heiko_0 -
Robots.txt file getting a 500 error - is this a problem?
Hello all! While doing some routine health checks on a few of our client sites, I spotted that a new client of ours - who's website was not designed built by us - is returning a 500 internal server error when I try to look at the robots.txt file. As we don't host / maintain their site, I would have to go through their head office to get this changed, which isn't a problem but I just wanted to check whether this error will actually be having a negative effect on their site / whether there's a benefit to getting this changed? Thanks in advance!
Technical SEO | | themegroup0 -
Indexing of flash files
When Google indexes a flash file, do they use a library for such a purpose ? What set me thinking was this blog post ( although old ) which states - "we expanded our SWF indexing capabilities thanks to our continued collaboration with Adobe and a new library that is more robust and compatible with features supported by Flash Player 10.1."
Technical SEO | | seoug_20050 -
Will changing page extensions from .html to .php require a redirect?
Hi. We are launching a new website and our .html page extensions will be replaced with a .php page extension. Example: www.theideapeople.com/web_design.html (current url) www.theideapeople.com/web_design.php (new url) Will this require any special treatment to maintain the page SEO ranking? Does it make a difference if you use a .html or .php? Thank you for your help and insight! Jay
Technical SEO | | theideapeople0