Images on subdomain fed from CDN
-
I have a client that uses a CDN to serve images from a subdomain (images.domain.com). We've made sure that the subdomain itself is not blocked, we've added a robots.txt file, we're creating an image sitemap file, and we've verified ownership of the domain within GWT.
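The robots.txt on the subdomain is nothing unusual, roughly along these lines (hypothetical file, domain shown as a placeholder):

    # https://images.domain.com/robots.txt (hypothetical contents)
    User-agent: *
    Allow: /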
Yet any crawler that I use sees only the first page of the subdomain (which is .html) and none of the subsequent URLs, which are all .jpeg.
Is there something simple I'm missing here?
-
Alphonse, it sounded like they were just waiting for the sitemap to launch. Other than that, I couldn't think of anything else to add, because the sitemap should solve their issue. However, I have marked this as "Discussion" again.
-
I am a little confused. The question was marked answered, but which one is the answer?
-
We have the same issue; however, we have image XML sitemaps on each country subdomain's XML index, which point to the image files on images.domain.com.
Example:
https://uk.domain.com/image-sitemap1.xml
https://us.domain.com/image-sitemap1.xml
These 2 files are the same.
We also don't have a homepage on images.domain.com; it currently responds with a 404.
Do you think we need to create a landing page at the root of images.domain.com and host the image XML sitemap at https://images.domain.com/images-sitemap1.xml rather than on each country subdomain?
Thanks.
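As I understand it, the cross-domain image references themselves are fine as long as both hostnames are verified in GWT; each entry in our sitemaps looks roughly like this (page and image paths below are placeholders):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
      <url>
        <!-- The page that shows the image lives on the country subdomain... -->
        <loc>https://uk.domain.com/products/example-widget/</loc>
        <image:image>
          <!-- ...while the image file itself is served from the CDN subdomain. -->
          <image:loc>https://images.domain.com/products/example-widget-large.jpg</image:loc>
        </image:image>
      </url>
    </urlset>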
-
Yes, we are doing everything correctly, aside from waiting for the IT department to create a sitemap.
-
Are you using your own subdomain or one somewhere else (e.g. akamai.com)? You should use your own subdomain, if possible.
Was this a change from a previous version that didn't use a CDN? If those images were/are hosted on your primary domain, be sure to match the filenames and paths as closely as possible to what they were before.
If you're doing that you shouldn't have a problem once the sitemap is submitted.
For more information, please check out this post: http://www.goinflow.com/four-seo-best-practices-for-using-a-content-delivery-network-cdn/
How do you know that Google only attempts to crawl the primary domain URL (i.e., the .html page)? Are you checking log files?
Is the crawler you're using set to crawl external URLs? If not, that could be the issue. Technically, a subdomain is a totally separate website, so most tools don't crawl it by default.
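If it helps, here is a rough sketch of a log check that tallies what Googlebot is actually requesting (the log path and combined-log format are assumptions; adjust the patterns to the server's setup):

    import re

    # Rough sketch: tally Googlebot requests for image files vs. other URLs
    # in a combined-format access log. Path and patterns are assumptions.
    LOG_PATH = "access.log"
    IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")

    image_hits = 0
    other_hits = 0

    with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
        for line in log:
            if "Googlebot" not in line:  # matches Googlebot and Googlebot-Image
                continue
            match = re.search(r'"GET ([^ ]+) HTTP', line)
            if not match:
                continue
            if match.group(1).lower().endswith(IMAGE_EXTENSIONS):
                image_hits += 1
            else:
                other_hits += 1

    print(f"Googlebot image requests: {image_hits}")
    print(f"Googlebot other requests: {other_hits}")

If the image count is zero, Googlebot simply hasn't discovered the image URLs yet, which the sitemap should fix.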
-
We've correctly applied the CNAME record from the CDN to the subdomain. Yet, when Google or any other tool attempts to crawl it, it only shows ONE URL, not the images residing on their own independent URLs.
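In other words, the DNS for the subdomain looks something like this (the CDN edge hostname below is a placeholder):

    ; Hypothetical zone-file entry: images.domain.com is a CNAME to the CDN's edge hostname
    images.domain.com.   3600   IN   CNAME   d1234abcd.cdnprovider.net.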
-
For the crawler to be able to access those image URLs, you should either:
- Link to the URLs of the images (does that .html page on the subdomain contain these URLs?)
or
- Use the image URLs as resources in pages that are already being crawled. Unfortunately, this can be tricky when dealing with CDNs, since those resources are dynamic (see the sketch below).
In either case, the sitemap will solve your problem.
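To illustrate the second option, the image URLs only get discovered if a page Google already crawls references them directly, something like this (page path, file name, and alt text are placeholders):

    <!-- On an already-crawled page such as https://www.domain.com/products/example-widget/ -->
    <img src="https://images.domain.com/products/example-widget-large.jpg"
         alt="Example widget, large view">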
-
The sitemap is not completed yet. Server logs show Googlebot fetching only one page, the .html page, and not the other URLs.
-
Did you reference the sitemap in the robots.txt file, or did you set it up in GWT?
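If it's the former, it's just one extra line in the subdomain's robots.txt (placeholder URL):

    Sitemap: https://images.domain.com/image-sitemap1.xml

With GWT you would instead submit the same sitemap URL under the verified images.domain.com property.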