How important is the file extension in the URL for images?
-
I know that descriptive image file names are important for SEO. But how important is it to include .png, .jpg, .gif (or whatever file extension) in the URL path? For example, https://example.com/images/golden-retriever vs. https://example.com/images/golden-retriever.jpg
Furthermore, since you can set the filename in the Content-Disposition response header, is there any need to include the descriptive filename in the URL path?
Since I'm pulling most of our images from a database, it'd be much simpler to not care about simulating a filename, and just reference an image id in my templates.
Example:
1. Browser requests GET /images/123456
2. Server responds with the image, setting both Content-Disposition and Link (canonical) headers:
   Content-Disposition: inline; filename="golden-retriever"
   Link: <https://example.com/images/123456>; rel="canonical"
-
In theory, there should be no difference - the canonical header should mean that Google treats the inclusion of /images/123456 as exactly the same as including /images/golden-retriever.
It is slightly messier so I think that if it was easy, I'd go down the route of only ever using the /golden-retriever version - but if that's difficult, this is theoretically the same so should be fine.
-
@Will Thank you so much for this response. Very helpful.
"If you can't always refer to the image by its keyword-rich filename"...
If I'm already including the canonical link header on the image, and am able to serve from both /images/123456 and /images/golden-retriever (canonical), is there any benefit to referencing the canonical over the other in my image tags?
-
Hi James. I've responded with what I believe is a correct answer to MarathonRunner's question. There are a few inaccuracies in your responses to this thread - as pointed out by others below - please can you target your future responses to areas where you are confident that you are correct and helpful? Many thanks.
-
@MarathonRunner - you are correct in your inline responses - it's totally valid to serve an image (or other filetype) without an extension, with its type identified by the Content-Type. Sorry that you've had a less-than-helpful experience here so far.
To answer your original questions:
- From an SEO perspective, there is no need that I know of for your images to have a file extension - the content type should be fine
- However - I have no reason to think that a filename in the Content-Disposition header will be recognised as a ranking signal - what you are describing is a rare use-case and I haven't seen any evidence that it would be recognised by the search engines as being the "real" filename
If you can't always refer to the image by its keyword-rich filename, then could you:
- Serve it as you propose (though without the Content-Disposition filename)
- Serve a rel="canonical" link to a keyword-rich filename (https://example.com/images/golden-retriever in your example)
- Also serve the image on that URL
This only helps if you are able to serve the image on the /images/golden-retriever path, but need to have it available at /images/123456 for inclusion in your own HTML templates.
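As a rough illustration of the setup described above, here is a minimal Python sketch using only the standard library. It assumes a hypothetical in-memory image store (`IMAGES`), a hypothetical site origin (`BASE_URL`), and made-up helper names (`build_response`, `ImageHandler`, `serve`); none of these come from the thread. The same image is served on both the id URL and the keyword-rich URL, with the type declared by Content-Type and a canonical Link header pointing at the keyword-rich URL.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory "database": image id -> (slug, MIME type, bytes).
IMAGES = {
    "123456": ("golden-retriever", "image/jpeg", b"\xff\xd8\xff\xe0 placeholder bytes"),
}
# Reverse lookup so the keyword-rich slug resolves to the same record.
SLUG_TO_ID = {slug: image_id for image_id, (slug, _, _) in IMAGES.items()}

BASE_URL = "https://example.com"  # assumed site origin
PREFIX = "/images/"


def build_response(path):
    """Return (status, headers, body) for a path such as /images/123456."""
    if not path.startswith(PREFIX):
        return 404, {"Content-Length": "0"}, b""
    key = path[len(PREFIX):]
    # Accept either the numeric id or the keyword-rich slug.
    image_id = key if key in IMAGES else SLUG_TO_ID.get(key)
    if image_id is None:
        return 404, {"Content-Length": "0"}, b""
    slug, mime_type, body = IMAGES[image_id]
    headers = {
        # The type is declared here, not by an extension in the URL.
        "Content-Type": mime_type,
        "Content-Length": str(len(body)),
        # Both URLs name the keyword-rich URL as canonical.
        "Link": f'<{BASE_URL}{PREFIX}{slug}>; rel="canonical"',
    }
    return 200, headers, body


class ImageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers, body = build_response(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the sketch locally; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), ImageHandler).serve_forever()
```

Calling `serve()` and requesting `/images/123456` or `/images/golden-retriever` should return the same bytes and the same headers. Whether search engines act on the canonical header for images is a separate question that this sketch does not settle.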
I hope that helps.
-
If you really did your research you would have noticed the header image is not using an extension.
-
Again, you're mistaken. The Content-Type response header tells the browser what type of file the resource is (its MIME type). This is _completely different_ from the file extension in URL paths.
In fact, on the web all the file extensions are faked through the URL path. For example, this page's URL path is:
https://moz.com/community/q/how-important-is-the-file-extension-in-the-url-for-images
It's not
https://moz.com/community/q/how-important-is-the-file-extension-in-the-url-for-images.html
How does the browser know the page is an HTML document? Because of the Content-Type response header. The faked "extension" in the URL path is unnecessary.
You can view http response headers for any URL using this tool.
-
Do you need a new keyboard?
-
@James Wolff: I'm really hoping you're being sarcastic here, as it's totally fine to serve it without the extension. There are many more ways for a crawler to understand what type a file is, including what @MarathonRunner is talking about here.
-
This isn't accurate. File extension (in the URL path) is not the same as the **Content-Type** response header. Browsers respect the response header Content-Type over whatever extension I use in the path.
Example: try serving a file /golden-retriever.png with a content type of image/jpeg. Your browser will treat the file as a JPEG. If you attempt to save it, your browser will correct the name to golden-retriever.jpg.
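To make the precedence concrete, here is a small Python sketch of the idea. It is a simplified model, not the actual algorithm any browser uses (real browsers also apply MIME sniffing rules), and the function names `type_from_url`, `type_from_headers`, and `resolve_type` are made up for illustration. The declared Content-Type wins, and the URL extension is only a fallback guess.

```python
import mimetypes
from urllib.parse import urlparse


def type_from_url(url):
    """Guess a MIME type from the extension in the URL path (None if absent)."""
    guessed, _encoding = mimetypes.guess_type(urlparse(url).path)
    return guessed


def type_from_headers(headers):
    """Read the MIME type from a Content-Type value, dropping parameters."""
    value = headers.get("Content-Type", "")
    mime_type = value.split(";")[0].strip().lower()
    return mime_type or None


def resolve_type(url, headers):
    """Prefer the declared Content-Type; fall back to the extension guess."""
    return type_from_headers(headers) or type_from_url(url)


# A .png path served as image/jpeg is treated as a JPEG in this model.
mismatched = resolve_type(
    "https://example.com/golden-retriever.png",
    {"Content-Type": "image/jpeg"},
)

# A path with no extension is still typed correctly by the header.
extensionless = resolve_type(
    "https://example.com/images/golden-retriever",
    {"Content-Type": "image/jpeg"},
)
```

In both cases the result is `image/jpeg`, because the extension is never consulted when the header is present.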
You can route URLs however you want.
Additionally, I'm not aware of any way browsers "leverage cache by content type". Browsers handle caching through response headers such as ETag, Expires, and Cache-Control.