How important is the file extension in the URL for images?
-
I know that descriptive image file names are important for SEO. But how important is it to include .png, .jpg, .gif (or whatever file extension) in the url path? i.e. https://example.com/images/golden-retriever vs. https://example.com/images/golden-retriever.jpg
Furthermore, since you can set the filename in the Content-Disposition response header, is there any need to include the descriptive filename in the URL path?
Since I'm pulling most of our images from a database, it'd be much simpler to not care about simulating a filename, and just reference an image id in my templates.
Example:
1. Browser requests GET /images/123456
2. Server responds with image setting both Content-Disposition, and Link (canonical) headersContent-Disposition: inline; filename="golden-retriever"
Link: <https: 123456="" example.com="" images="">; rel="canonical"</https:> -
In theory, there should be no difference - the canonical header should mean that Google treats the inclusion of /images/123456 as exactly the same as including /images/golden-retriever.
It is slightly messier so I think that if it was easy, I'd go down the route of only ever using the /golden-retriever version - but if that's difficult, this is theoretically the same so should be fine.
-
@Will Thank you so much for this response. Very helpful.
"If you can't always refer to the image by its keyword-rich filename"...
If I'm already including the canonical link header on the image, and am able to serve from both /images/123456 and /images/golden-retriever (canonical), is there any benefit to referencing the canonical over the other in my image tags?
-
Hi James. I've responded with what I believe is a correct answer to MarathonRunner's question. There are a few inaccuracies in your responses to this thread - as pointed out by others below - please can you target your future responses to areas where you are confident that you are correct and helpful? Many thanks.
-
@MarathonRunner - you are correct in your inline responses - it's totally valid to serve an image (or other filetype) without an extension, with its type identified by the Content-Type. Sorry that you've had a less-than-helpful experience here so far.
To answer your original questions:
- From an SEO perspective, there is no need that I know of for your images to have a file extension - the content type should be fine
- However - I have no reason to think that a filename in the Content-Disposition header will be recognised as a ranking signal - what you are describing is a rare use-case and I haven't seen any evidence that it would be recognised by the search engines as being the "real" filename
If you can't always refer to the image by its keyword-rich filename, then could you:
- Serve it as you propose (though without the Content-Disposition filename)
- Serve a rel="canonical" link to a keyword-rich filename (https://example.com/images/golden-retriever in your example)
- Also serve the image on that URL
This only helps if you are able to serve the image on the /images/golden-retriever path, but need to have it available at /images/123456 for inclusion in your own HTML templates.
I hope that helps.
-
If you really did your research you would have noticed the header image is not using an extension.
-
Again, you're mistaken. The Content-Type response header tells the browser what type of file the resource is (mime type). This is _completely different _from the file extension in URL paths.
In fact, on the web all the file extensions are faked through the URL path. For example, this page's URL path is:
https://moz.com/community/q/how-important-is-the-file-extension-in-the-url-for-images
It's not
https://moz.com/community/q/how-important-is-the-file-extension-in-the-url-for-images.html
How does the browser know the the page is an html doc? Because of the Content-Type response header. The faked "extension" in the URL path, is unnecessary.
You can view http response headers for any URL using this tool.
-
-
Do you need a new keyboard?
-
@James Wolff: I'm really hoping you're being sarcastic here. As it's totally fine to serve it without the extension. There are many more ways for a crawler to understand what type a file is. Including what @MarathonRunner is talking about here.
-
This isn't accurate. File extension (in the url path) is not the same as the **Content-Type **response header. Browsers respect the response header Content-Type over whatever extension I use in the path.
Example: try serving a file /golden-retriever.png with a content type of image/jpeg. Your browser will understand the file as a .jpg. If you attempt to save, your browser will correct to golden-retriever.jpg.
You can route URLs however you want.
Additionally, I'm not aware of any way browsers "leverage cache by content type". Browsers handle cache by the etag/expires header.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to manage images
We have been using Google+ to load our images straight on to our site, we did this to make sure our site loaded fast. google+ delivers them to website at the size we specify, so even if original is say 4000px x 3000px we can ask for them at 100x100 and they send as resized scale. we dont have to manage sizes just the original images and their tagging If we wanted to improve our SEO opportunities should we be doing this another way? Our images show if you look in the image serp but they dont appear on the main serp. How much of a difference would having the images on our own domain rather than having them on Google+ I am working through the recommended list below, would love to hear guys who are doing well with images and have to manage 1000's of them. There are a number of ways to optimise your images to increase your visibility within Google image search, and the chance of being featured within the main search results (as seen in the 'tablet PC' example): Use a short descriptive piece of text featuring desired keywords within the image alt text attribute.
Intermediate & Advanced SEO | | PottyScotty
Save the image using a descriptive file name
Create an Image XML sitemap
Ensure your images directory isn't blocked by robots.txt
Ideally host images on the same domain
And surround the image with related text content to build a stronger page context/association0 -
2 URLS pointing to the same content
Hi, We currently have 2 URL's pointing to the same website (long story why we have it) - A & B. A is our main website but we set up B as a rewrite URL to use for our Pay Per Click campaign. Now because its the same site, but B is just a URL rewrite, Google Webmaster Tools is seeing that we have thousands of links coming in from site B to site A. I want to tell Google to ignore site B url but worried it might affect site A. I can't add a no follow link on site B as its the same content so will also be applicable on Site A. I'm also worried about using Google Disavow as it might impact on site A! Can anyone make any suggestions on what to do, as I would like to hear from anyone with experience with this or can recommend a safe option. Thanks for your time!
Intermediate & Advanced SEO | | Party_Experts0 -
Robots.txt: Syntax URL to disallow
Did someone ever experience some "collateral damages" when it's about "disallowing" some URLs? Some old URLs are still present on our website and while we are "cleaning" them off the site (which takes time), I would like to to avoid their indexation through the robots.txt file. The old URLs syntax is "/brand//13" while the new ones are "/brand/samsung/13." (note that there is 2 slash on the URL after the word "brand") Do I risk to erase from the SERPs the new good URLs if I add to the robots.txt file the line "Disallow: /brand//" ? I don't think so, but thank you to everyone who will be able to help me to clear this out 🙂
Intermediate & Advanced SEO | | Kuantokusta0 -
Rewriting URL
I'm doing a major URL rewriting on our site to make the URL more SEO friendly as well as more comfortable and intuitive for our users. Our site has a lot of indexed pages, over 250k. So it will take Google a while to reindex everything. I was thinking that when Google Bot encounters the new URLs, it will probably figure out it's duplicate content with the old URL. At least until it recrawls the old URL and get a 301 directing them to the new URL. This will probably lower the ranking of every page being crawled. Am I right to assume this is what will happen? Or is it fine as long as the old URLs get 301 redirect? If it is indeed a problem, what's the best solution? rel="canonical" on every single page maybe? Another approach? Thank you.
Intermediate & Advanced SEO | | corwin0 -
How important is the number of indexed pages?
I'm considering making a change to using AJAX filtered navigation on my e-commerce site. If I do this, the user experience will be significantly improved but the number of pages that Google finds on my site will go down significantly (in the 10,000's). It feels to me like our filtered navigation has grown out of control and we spend too much time worrying about the url structure of it - in some ways it's paralyzing us. I'd like to be able to focus on pages that matter (explicit Category and Sub-Category) pages and then just let ajax take care of filtering products below these levels. For customer usability this is smart. From the perspective of manageable code and long term design this also seems very smart -we can't continue to worry so much about filtered navigation. My concern is that losing so many indexed pages will have a large negative effect (however, we will reduce duplicate content and be able provide much better category and sub-category pages). We probably should have thought about this a year ago before Google indexed everything :-). Does anybody have any experience with this or insight on what to do? Thanks, -Jason
Intermediate & Advanced SEO | | cre80 -
How canonical url harm our website???
Even though my website has no similar/copied content, i used rel=canonical for all my website pages. Is Google or yahoo make any harm to my SERP's?? EX: http://www.seomoz.org is my site, in that i used canonical as rel="<a class="attribute-value">canonical</a>" href="http://www.seomoz.org" to my home page like that similar to all pages, i created rel=canonical. Is search engine harm my website???
Intermediate & Advanced SEO | | MadhukarSV0 -
SEO Overly-Dynamic URL Website with thousands of URLs
Hello, I have a new client who has a Diablo 3 database. They have created a very interesting site in which every "build" is it's own URL. Every page is a list of weapons and gear for the gamer. The reader may love this but it's nightmare for SEO. I have pushed for a blog to help generate inbound links and traffic but overall I feel the main feature of their site is a headache to optimize. They have thousands of pages index in google but none are really their own page. There is no strong content, H-Tags, or any real substance at all. With a lack of definition for each page, Google see's this as a huge ball of mess, with duplicate Page Titles and too many onpage links. The first thing I did was tell them to add a canonical link which seemed to drop the errors down 12K leaving only 2400 left...which is a nice start, but the remaining errors is still a challenge. I'm thinking about seeing if I can either find a way to make each page it's own blurb, H Tag or simple have the Nav bar and all the links in the database Noindex. That way the site is left with only a handful of URLs + the Blog and Forum Thought?
Intermediate & Advanced SEO | | MikePatch0 -
SEO Strategy for URL Change
I'm working with a company who will likely have to change their URL because of a trademark dispute. They will be able to maintain the new URL for some period but will soon need to drop the existing URL all together. Aside from the usual keyword considerations when choosing a URL, are there any SEO strategies I should consider as we execute this change?
Intermediate & Advanced SEO | | Jon_KS0