Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Stop Google indexing CDN pages
-
Just when I thought I'd seen it all, Google hits me with another nasty surprise!
I have a CDN to deliver images, JS and CSS to visitors around the world. As far as I can tell, I have no links to static HTML pages on the site, but someone else may have. Perhaps a scraper site?
Google has decided the static pages they were able to access through the CDN have more value than my real pages, and they seem to be slowly replacing my pages in the index with the static pages.
Anyone got an idea on how to stop that?
Obviously, I have no access to the static area, because it is in the CDN, so there is no way I know of that I can have a robots file there.
It could be that I have to trash the CDN and change it to only allow the image directory, and maybe set up a separate CDN subdomain for content that only contains the JS and CSS?
Have you seen this problem and beaten it?
(Of course the next thing is Roger might look at Google results and start crawling them too, LOL)
P.S. I am not asking this in the Google forums because others have asked it there many times over the past 5 months, nobody at Google has bothered to answer, and nobody who did try gave an answer that was remotely useful. So I'm not really hopeful of anyone here having a solution either, but I expect this is my best bet because you guys are always willing to try.
-
Thank you Edward.
I don't have quite that problem, but I think you are right too.
My CDN is set up to be Origin Pull.
That means there is no need to FTP; the system just fetches content as requested. You should check that out if you have to FTP everything.
But what you said that helped me is this: I should have had one CNAME for images and another CNAME for content, with the content limited to a folder called content. I can put the CSS and JS files in it, and that way the plain HTML pages at the root level will never be affected.
I also realized, while checking the system, that I wasn't using a canonical tag in the intermediate pages, as I was in the story pages. So I just added code to add canonical tags for all the intermediate pages and the front page.
I do have a few other types of pages, so I will handle the code for them next.
I think adding the canonical tag might fix the problem, but I will also work on reconfiguring the CDN and switch over when the site is not too busy, in case it takes a while to propagate.
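For anyone wanting to do the same, here is a minimal sketch of the canonical tag idea in Python. It assumes the same page can be reached through both the main hostname and the CDN hostname, and it always points the canonical URL at the main one. The hostnames `www.example.com` and `cdn.example.com` and the function name `canonical_tag` are made up for illustration; they are not the code used on my site.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Hypothetical primary hostname; replace with the real site's host.
PRIMARY_HOST = "www.example.com"


def canonical_tag(requested_url: str, primary_host: str = PRIMARY_HOST) -> str:
    """Build a <link rel="canonical"> tag that points at the primary host.

    The scheme, path and query string of the requested URL are kept,
    the hostname is swapped for the primary host, and any fragment is dropped.
    """
    parts = urlsplit(requested_url)
    canonical_url = urlunsplit(
        (parts.scheme, primary_host, parts.path or "/", parts.query, "")
    )
    # Escape the URL so characters such as & are safe inside the attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


if __name__ == "__main__":
    # A page fetched through the (hypothetical) CDN hostname.
    print(canonical_tag("http://cdn.example.com/stories/42.html"))
```

The tag goes in the `<head>` of every page type, so a copy served from the CDN hostname still names the real page as the preferred version.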
-
It sounds like you have set up your CDN slightly wrong.
After setting up a few like yours, I realised that I was actually making a complete duplicate of the site rather than just the images or assets.
I imagine you have your origin directory for the CDN in the public html folder.
Create a subdomain and set that as the origin.
E.g., I'm working on this site at the moment: http://looksfishy.co.uk/
I have a subdomain called assets: http://assets.looksfishy.co.uk/
The cdn content: http://cdn.looksfishy.co.uk/
Files uploaded here:
http://assets.looksfishy.co.uk/species/holder/pike.jpg
Displayed here:
http://cdn.looksfishy.co.uk/species/holder/pike.jpg
Check the IP addresses on them.
It does make uploading images by FTP a bit of a faff, but it does make your site better.
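As a rough sketch of that setup in Python: asset URLs that live on the assets subdomain get rewritten to the CDN hostname before they go into the page, and everything else is left alone, so the HTML pages never end up on the CDN. The two hostnames come from the example above; the function names `to_cdn_url` and `resolve_ip` are just illustrative, not part of any CDN's tooling.

```python
import socket
from urllib.parse import urlsplit, urlunsplit

# Hostnames taken from the example above.
ORIGIN_HOST = "assets.looksfishy.co.uk"
CDN_HOST = "cdn.looksfishy.co.uk"


def to_cdn_url(url: str, origin_host: str = ORIGIN_HOST, cdn_host: str = CDN_HOST) -> str:
    """Rewrite a URL on the assets subdomain to the CDN hostname.

    URLs on any other host (for example the main site's HTML pages)
    are returned unchanged, so only assets are served through the CDN.
    """
    parts = urlsplit(url)
    if parts.hostname != origin_host:
        return url
    return urlunsplit((parts.scheme, cdn_host, parts.path, parts.query, parts.fragment))


def resolve_ip(hostname: str) -> str:
    """Look up the IPv4 address of a hostname (needs network access)."""
    return socket.gethostbyname(hostname)


if __name__ == "__main__":
    print(to_cdn_url("http://assets.looksfishy.co.uk/species/holder/pike.jpg"))
    print(to_cdn_url("http://looksfishy.co.uk/"))
```

Calling `resolve_ip` on the assets and CDN hostnames is one way to do the IP check: if the CDN is really serving the file, the two names would normally resolve to different addresses.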