Moz Q&A is closed.
After more than 13 years and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we're not completely removing the content - many posts will still be viewable - we have locked both new posts and new replies.
Not all images indexed in Google
-
Hi all,
Recently, we ran into an unusual issue with images in Google's index. We have more than 1,500 images in our sitemap, but according to Search Console only 273 of those are indexed. If I check Google image search directly, I find more images in the index, but still not all of them.
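In case it's relevant, the sitemap declares images in the standard Google image-sitemap format - roughly like this (a minimal sketch with placeholder URLs, not our real entries):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/some-post/</loc>
    <!-- one <image:image> block per image on the page -->
    <image:image>
      <image:loc>https://example.com/wp-content/uploads/photo-1.jpg</image:loc>
    </image:image>
    <image:image>
      <image:loc>https://example.com/wp-content/uploads/photo-2.jpg</image:loc>
    </image:image>
  </url>
</urlset>
```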
For example, this post has 28 images and only 17 are indexed in Google Images. This is happening to other posts as well.
We've checked all the likely causes (missing alt text, images loaded as CSS backgrounds, file size, Fetch and Render in Search Console), but none of them apply in our case. So everything looks fine, yet not all of the images are in the index.
Any ideas on this issue?
Your feedback is much appreciated, thanks
-
Fetching, rendering, caching and indexing are all different. Sometimes they're all part of the same process, sometimes not. When Google 'indexes' images, that's primarily for its image search engine (Google Images). 'Indexing' something means that Google is listing that resource within its own search results for one reason or another. For the same reasons that Google rarely indexes all of your web-pages, Google also rarely indexes all of your images.
That doesn't mean that Google 'can't see' your images or has an imperfect view of your web-page. It simply means that Google does not believe the images you have uploaded are 'worthy' enough to be served to an end-user who is performing a certain search on Google Images. If you think that gaining normal web rankings is tricky, remember that most users only utilise Google Images for certain (specific) reasons. Maybe they're trying to find a meme to add to their post on a forum thread or as a comment on a social network. Maybe they're looking for PNG icons to add into their PowerPoint presentations.
In general, images from the commercial web are... well, they're commercially driven (usually). When was the last time you expressly set out to search for ads to look at on Google Images? Never? Ok then.
First, Google will fetch a page or resource by visiting its URL. If the resource or web-page is of moderate to high value, Google may then render it (Google doesn't always do this; usually it's reserved for important pages which are heavily modified by something like JS or AJAX, where all the info isn't in the basic 'source code' / view-source).
Following this, Google may decide to cache the web-page or resource. Finally, if the page or resource is deemed worthy enough and Google's algorithm(s) decide that it could potentially satisfy a certain search query (or an array thereof), the resource or page may be indexed. All of this can occur in various patterns, e.g. indexing a resource without caching it, or caching a resource without indexing it (there are many reasons for this which I won't get into now).
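To make the fetch vs. render distinction concrete: a plain fetch only sees the `<img>` tags present in the raw HTML source, so an image injected by JavaScript is invisible until something actually renders the page. A toy sketch of that (hypothetical markup; a real crawler pipeline is far more involved):

```python
# A plain fetch parses only the static source. Python's HTMLParser treats
# <script> contents as raw data, so the JS-injected image below never shows
# up as a tag - just like a crawler that fetches without rendering.
from html.parser import HTMLParser

PAGE_SOURCE = """
<html><body>
  <img src="/uploads/in-source.jpg">
  <script>
    // Only a rendering engine executing this JS would ever see this image:
    document.body.innerHTML += '<img src="/uploads/js-injected.jpg">';
  </script>
</body></html>
"""

class ImgCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.srcs.extend(v for k, v in attrs if k == "src")

collector = ImgCollector()
collector.feed(PAGE_SOURCE)
print(collector.srcs)  # only the image present in the static source
```

The JS-injected image exists on the rendered page but not in view-source, which is exactly why Google bothers rendering important JS-heavy pages at all.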
On the commercial web, many images are stock or boiler-plate visuals from suppliers. If Google already has the image you are supplying indexed at a higher resolution or at superior quality (factoring compression), and if your site is not a 'main contender' in terms of popularity and trust metrics, Google probably won't index that image on your site. Why would Google do so? It would just mean that when users performed an image search, they would see large panes of results which were all the same image. Users only have so much screen real-estate (especially with the popularity of mobile browsing). Seeing loads of the same picture at slightly different resolutions would just be annoying. People want to see a variety, a spread of things! **That being said** - your images are lush and I don't think they're stock rips!
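If you want a quick first check on your own library for repeated stock assets, you can fingerprint each file's bytes. A crude sketch (Google's duplicate detection is far smarter - it compares visual content across resolutions and crops - this only catches byte-identical re-uploads; the file contents below are invented):

```python
# Hash the raw bytes of each "image file"; identical digests mean
# byte-identical files. Only an exact-copy check, not visual similarity.
import hashlib

def image_fingerprints(files):
    """Map each filename to a SHA-256 digest of its bytes."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}

# Pretend these are image files read from disk (invented bytes):
files = {
    "hero.jpg": b"\xff\xd8\xff\xe0...original...",
    "hero-copy.jpg": b"\xff\xd8\xff\xe0...original...",  # exact re-upload
    "unique.jpg": b"\xff\xd8\xff\xe0...different...",
}

digests = image_fingerprints(files)
print(digests["hero.jpg"] == digests["hero-copy.jpg"])  # True: exact duplicate
```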
If some images on your page, post or website are not indexed - it's not necessarily an 'issue' or 'error'.
Looking at the post you linked to: https://flothemes.com/best-lightroom-presets-photogs/
I can see that it sits on the "flothemes.com" domain. It has very strong link and trust metrics:
Ahrefs - Domain rating 83
Moz - Domain Authority - 62
As such, you'd think that most of these images would be unique (I don't have time to do a reverse image search on all of them) - also because the content seems really well done. I am pretty confident (though not certain) that quality and duplication are probably not to blame in this instance.
That makes me think: hmm, maybe some of the images don't meet Google's compression standards.
Check out these results (https://gtmetrix.com/reports/flothemes.com/xZARSfi5) for the page / post you referenced, on GTMetrix (I find it superior to Google's Page-Speed Insights) and click on the "Waterfall" tab.
You can see that some of the image files have pretty large 'bars' in terms of the total time it took to load those individual resources. The main offenders are this image: https://l5vd03xwb5125jimp1nwab7r-wpengine.netdna-ssl.com/wp-content/uploads/2016/01/PhilChester-Portfolio-40.jpg (over 2 seconds to pull in by itself) and this one: https://l5vd03xwb5125jimp1nwab7r-wpengine.netdna-ssl.com/wp-content/uploads/2017/04/Portra-1601-Digital-2.png (around 1.7 seconds to pull in)
Check out the resource URLs. They're being pulled into your page, but they're not hosted on your website. As such - how could Google index those images for your site when they're pulled in externally? Maybe there's some CDN stuff going on here. Maybe Google is indexing some images on the CDN, because it's faster, and not from your base domain. This really needs looking into in a lot more detail, but I smell the trail of something interesting there.
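You can audit this yourself: extract every image src from the page and compare its host against your base domain. A minimal stdlib-only sketch (the sample markup below is hypothetical - feed it your real page HTML to try it):

```python
# Flag images served from a host other than the page's own domain
# (e.g. a CDN) - the situation described above.
from html.parser import HTMLParser
from urllib.parse import urlparse

class ImgSrcParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.srcs.extend(v for k, v in attrs if k == "src" and v)

def offsite_images(html, base_host):
    """Return image URLs whose host differs from base_host.
    Relative URLs (empty host) are treated as on-site."""
    parser = ImgSrcParser()
    parser.feed(html)
    offsite = []
    for src in parser.srcs:
        host = urlparse(src).netloc
        if host and host != base_host:
            offsite.append(src)
    return offsite

sample = """
<img src="https://flothemes.com/uploads/local.jpg">
<img src="https://l5vd03xwb5125jimp1nwab7r-wpengine.netdna-ssl.com/uploads/cdn.jpg">
<img src="/uploads/relative.jpg">
"""
print(offsite_images(sample, "flothemes.com"))
# only the netdna-ssl.com URL is flagged as off-site
```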
If images are deemed to be uncompressed or if their resolution is just way OTT (such that most users would never need even half of the full deployment resolution) - Google won't index those images. Why? Well they don't want Google Images to become a lag-fest I guess!
**Your main issue is that you are not serving 'scaled' images** (or, apparently, optimising them). On that same GTMetrix report, check out the "PageSpeed" tab. Yeah, you scored an F by the way (that's a fail), and it's mainly down to your image deployment.
Google thinks one or more of the following:
- You haven't put enough effort into optimising some of your images
- Some of your images are not worth indexing or it can find them somewhere else
- Google is indexing some of the images from your CDN instead of your base domain
- Google is having trouble indexing images for your domain, which are permanently or temporarily stored off-site (and the interference is causing Google to just give up)
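On the 'scaled images' point above, the waste is easy to quantify: when a page displays an image well below its natural resolution, most of the downloaded pixels are immediately thrown away. A back-of-envelope sketch (the example dimensions are invented):

```python
# Rough estimate of wasted transfer when an image is served above its
# display size - the 'serve scaled images' warning GTMetrix raises.
def wasted_fraction(natural_w, natural_h, display_w, display_h):
    """Fraction of downloaded pixels the browser never shows."""
    used = display_w * display_h
    total = natural_w * natural_h
    return max(0.0, 1 - used / total)

# A 4000x3000 photo squeezed into an 800x600 slot:
waste = wasted_fraction(4000, 3000, 800, 600)
print(f"{waste:.0%} of the pixels are never shown")  # 96% of the pixels are never shown
```

In other words, serving that photo pre-scaled to its display size would cut the pixel payload by roughly 96%, which is exactly the kind of fix the "PageSpeed" tab is nudging you towards.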
I know there's a lot to think about here, but I hope I have at least put you on the 'trail' of a reasonable solution.
This was fun to examine, so thanks for the interesting question!