Internal file extension canonicalization
-
OK, no doubt this is straightforward, but I am finding it hard to find a simple answer. Our website's internal pages have the extension .html. Trying to navigate to an internal URL without the .html extension results in a 404.
The question is: should a 301 be used to redirect to the extension-less URL to future-proof the site? And should internal links point to the extension-less URL for the same reason?
Hopefully that makes sense, and apologies for asking what I believe has a straightforward answer.
-
As above
example/abc rewrites to example/abc.html
example/abc.html redirects to example/abc
and all internal links link to example/abc
-
Thank you for the replies.
I will try and clarify what I am trying to get at; apologies in advance for any naivety.
I understand homepage canonicalization; the confusion revolves around how this applies to internal pages.
Logically, I am struggling to see how internal pages are any different from a homepage in terms of the need to avoid multiple URLs, and thus an extension-less URL seemed appropriate. Not to mention the benefits of cleaner URLs: easier to link to, easier to remember, etc.
i.e.
example/abc
example/abc.html
example/abc.index.html
-
As Nick said, you don't need to do this, but if you are going to:
1. REWRITE the new URL to the old URL, as your web server needs to know the extension.
2. REDIRECT the old URL to the new one, in case you already have links to the old URLs; you don't want duplicate content.
3. Make sure that all internal links point to the new URL; you don't want unnecessary redirects, as they leak link juice.
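To make the three steps concrete, here is a minimal sketch in Python of the decision logic only. On a real site this would normally live in the web server's own rewrite and redirect configuration rather than in application code. The names `Response`, `resolve`, and `stale_links` are hypothetical, the set of files stands in for what is on disk, and the homepage and /folder/index.html cases are deliberately ignored.

```python
from typing import List, NamedTuple, Optional, Set


class Response(NamedTuple):
    """Hypothetical stand-in for what the server sends back."""
    status: int                     # HTTP status code
    location: Optional[str] = None  # redirect target (301 only)
    file: Optional[str] = None      # file served from disk (200 only)


def resolve(path: str, files: Set[str]) -> Response:
    """Decide how to answer a request for `path`, given the files on disk."""
    # Step 2: the old .html URL permanently redirects to the clean URL,
    # so only one URL per page is ever visible to visitors and crawlers.
    if path.endswith(".html"):
        if path in files:
            return Response(301, location=path[: -len(".html")])
        return Response(404)

    # Step 1: the clean URL is rewritten internally to the .html file.
    # The visitor's address bar does not change.
    target = path + ".html"
    if target in files:
        return Response(200, file=target)
    return Response(404)


def stale_links(links: List[str]) -> List[str]:
    """Step 3: list internal links that still point at the old .html URLs."""
    return [link for link in links if link.endswith(".html")]
```

With `files = {"/abc.html"}`, a request for `/abc` is served from `/abc.html`, a request for `/abc.html` gets a 301 to `/abc`, and `stale_links` flags any internal link that would trigger that extra redirect.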
-
I'm about to make a whole lot of assumptions about your website to give this answer, just be aware.
Your website is built statically, using HTML, hence the .html file extension. If you're seeing websites that don't have file extensions, it's most likely they are using content management systems (or have some serious /folder/index.html stuff going on).
Having a file extension like .html or .aspx or .php is not a bad thing. On websites like yours, it is required (unless you do the above subfolder thing) because it's an actual file the browser is grabbing rather than something being dynamically generated by a CMS. It has nothing to do with future-proofing.
As for 301'ing non-extension URLs to extension'd ones... well, I don't know why you'd need to do that for your type of site.