Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Google not Indexing images on CDN.
-
My URL is: http://bit.ly/1H2TArH
- We have set up a CDN on our own domain: http://bit.ly/292GkZC
- We have an image sitemap: http://bit.ly/29ca5s3
- The image sitemap uses the CDN URLs.
- We verified the CDN subdomain in GWT.
- The robots.txt does not restrict any of the photos: http://bit.ly/29eNSXv. We used to disallow /thumb/, which 301-redirected to our CDN, but we have removed both the disallow rule in robots.txt and the 301 redirect.
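For anyone wanting to double-check the robots.txt point above, here is a minimal sketch using Python's standard `urllib.robotparser`. The rule text and the `cdn.example.com` URL are placeholders I made up for illustration, not the poster's actual file or host, and Google's own robots.txt parser may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot-Image") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: an old file with the /thumb/ disallow, and a new one without it.
OLD_RULES = "User-agent: *\nDisallow: /thumb/\n"
NEW_RULES = "User-agent: *\nDisallow:\n"

# Hypothetical image URL on a placeholder CDN host.
IMAGE_URL = "https://cdn.example.com/thumb/photo-123.jpg"

if __name__ == "__main__":
    print("old rules allow image:", is_allowed(OLD_RULES, IMAGE_URL))
    print("new rules allow image:", is_allowed(NEW_RULES, IMAGE_URL))
```

Running the same check against the live robots.txt of both the main domain and the CDN subdomain would confirm that neither still blocks the image paths.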
Yet, GWT still reports none of our images on the CDN are indexed.
The above screenshot is from the GWT of our main domain. The GWT for the CDN subdomain just shows 0. We did not submit a sitemap to the verified subdomain property because we already have a sitemap submitted to the property on the main domain name. Searching for images indexed from our CDN returns nothing: http://bit.ly/293ZbC1
While checking the GWT of the CDN subdomain, I have been seeing crawl errors, mainly 500-level errors, though not many compared with the number of images and the traffic on our website. Google is crawling, but it seems it just doesn't index the pictures!?
Can anyone help? I have followed all the advice I was able to find on the web, yet our images on the CDN still don't seem to get indexed.
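To make the setup described above concrete (a sitemap submitted on the main domain whose image entries point at the CDN subdomain), here is a rough sketch that builds such a sitemap with Python's standard library. The page and image URLs are hypothetical placeholders, and the function name `build_image_sitemap` is mine; only the two namespace URIs come from the sitemap and Google image-sitemap schemas.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages: dict[str, list[str]]) -> str:
    """Build an image sitemap; pages maps each page URL to its image URLs."""
    # Use the sitemap namespace as the default and "image" as the image prefix.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # <loc> is the page on the main domain that embeds the images.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            # <image:loc> may live on another host, such as a CDN subdomain.
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical page and CDN image URLs.
    print(build_image_sitemap({
        "https://www.example.com/photo/123": [
            "https://cdn.example.com/img/123.jpg",
        ],
    }))
```

Comparing the output with the real sitemap is a quick way to spot a missing namespace declaration or an `<image:loc>` that still points at an old, redirected path.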
-
Hey Dan,
Thanks for your concerns. The reason for that is the image headers. We serve different images depending on the visitor's resolution, so the image is downloaded automatically rather than displayed. I researched this, and it shouldn't be a problem in terms of image SEO. As for the images and watermarking, that is a business decision.
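The reply does not say which header is involved. A common cause of a forced download is `Content-Disposition: attachment`, so the sketch below assumes that is the one in play. It only inspects a dictionary of response headers; the sample header values are made up for illustration.

```python
from email.message import Message


def forces_download(headers: dict[str, str]) -> bool:
    """Return True if the headers ask the browser to save the file."""
    msg = Message()
    for name, value in headers.items():
        msg[name] = value
    # get_content_disposition() returns the lowercased disposition type
    # ("inline", "attachment") without parameters, or None if absent.
    return msg.get_content_disposition() == "attachment"


if __name__ == "__main__":
    # Hypothetical headers for an image served from a CDN.
    sample = {
        "Content-Type": "image/jpeg",
        "Content-Disposition": 'attachment; filename="photo.jpg"',
    }
    print("forces download:", forces_download(sample))
```

If the real CDN responses turn out to carry such a header, serving the same bytes with `inline` (or no disposition at all) would be one thing to test, though nothing in this thread confirms that it affects indexing.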
-
Hey There
I just did a reverse image search on two of your images and they are present in Google Image Search:
- http://screencast.com/t/QWKhqQfIH0Z8 - this one is indexed.
- This one is indexed and both versions are from Eyeem
But one issue is that when I click 'view image' (which would normally open the image file in a new tab), it instead triggers a download box for me: http://screencast.com/t/7LyLRRJ4CTb6 - perhaps this is because you are trying to prevent people from doing so and just copying the images for free. But I was actually able to download the image for free straight from Google (the download worked).
Which leads me to another question... if the images are not free, maybe it makes sense to not index them? Or maybe index a watermarked version or small thumbnail?