Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Stop google indexing CDN pages
-
Just when I thought I'd seen it all, google hits me with another nasty surprise!
I have a CDN to deliver images, js and css to visitors around the world. I have no links to static HTML pages on the site, as far as I can tell, but someone else may have - perhaps a scraper site?
Google has decided the static pages they were able to access through the CDN have more value than my real pages, and they seem to be slowly replacing my pages in the index with the static pages.
Anyone got an idea on how to stop that?
Obviously, I have no access to the static area, because it is in the CDN, so there is no way I know of that I can have a robots file there.
It could be that I have to trash the CDN and change it to only allow the image directory, and maybe set up a separate CDN subdomain for content that only contains the JS and CSS?
Have you seen this problem and beat it?
(Of course the next thing is Roger might look at google results and start crawling them too, LOL)
P.S. The reason I am not asking this question in the google forums is that others have asked this question many times and nobody at google has bothered to answer, over the past 5 months, and nobody who did try, gave an answer that was remotely useful. So I'm not really hopeful of anyone here having a solution either, but I expect this is my best bet because you guys are always willing to try.
-
Thank you Edward.
I don't have quite that problem, but I think you are right too.
My CDN is set up to be Origin Pull.
That means there is no need to FTP - the system just fetches content as requested.
- you should check that out if you have to ftp everything.
But what you said that helped me is this - that I should have had one CNAME for images and anotehr CNAME for content and the content should be limited to a folder called content, so I can put the CSS files and the JS files in it and that way, the plain HTML pages at teh root level will never be affected.
I also realized, while checking the system, that I wasn't using a canonical tag in the intermediate pages, as I was in the story pages. So I just added code to add canonical tags for all the intermediate pages and the front page.
I do have a few other types of pages, so I will handle the code for them next.
I think adding the canonical tag might fix the problem, but I will also work on reconfiguring the CDN and change over when the action is not too busy, in case it takes a while to propagate.
-
It sounds like you have set up your CDN slightly wrong.
After setting up a few like you have I realised that I was actually making a complete duplicate of the site rather than just the images or assets
I imagine you have your origin directory for the CDN in the public html folder.
Create a subdomain, set that as the origin.
Eg.. I'm working on this site at the moment: http://looksfishy.co.uk/
I have a subdomain called assets: http://assets.looksfishy.co.uk/
The cdn content: http://cdn.looksfishy.co.uk/
Files uploaded here:
http://assets.looksfishy.co.uk/species/holder/pike.jpg
Displayed here:
http://cdn.looksfishy.co.uk/species/holder/pike.jpg
Check the ip address on them.
It does make uploading images by ftp a bit of a faff, but does make your site better
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Log-in page ranking instead of homepage due to high traffic on login page! How to avoid?
Hi all, Our log-in page is ranking in SERP instead of homepage and some times both pages rank for the primary keyword we targeted. We have even dropped. I am looking for a solution for this. Three points here to consider is: Our log-in page is the most visited page and landing page on the website. Even there is the primary keyword in this page or not; same scenario continues Log-in page is the first link bots touch when they crawling any page of our website as log-in page is linked on top navigation menu If we move login page to sub-domain, will it works? I am worrying that we loose so much traffic to our website which will be taken away from log-in page sub domain Please guide with your valuable suggestions. Thanks
Algorithm Updates | | vtmoz0 -
Does Google ignores page title suffix?
Hi all, It's a common practice giving the "brand name" or "brand name & primary keyword" as suffix on EVERY page title. Well then it's just we are giving "primary keyword" across all pages and we expect "homepage" to rank better for that "primary keyword". Still Google ranks the pages accordingly? How Google handles it? The default suffix with primary keyword across all pages will be ignored or devalued by Google for ranking certain pages? Or by the ranking of website improves for "primary keyword" just because it has been added to all page titles?
Algorithm Updates | | vtmoz0 -
US domain pages showing up in Google UK SERP
Hi, Our website which was predominantly for UK market was setup with a .com extension and only two years ago other domains were added - US (.us) , IE (.ie), EU (.eu) & AU (.com.au) Last year in July, we noticed that few .us domain urls were showing up in UK SERPs and we realized the sitemap for .us site was incorrectly referring to UK (.com) so we corrected that and the .us domain urls stopped appearing in the SERP. Not sure if this actually fixed the issue or was such coincidental. However in last couple of weeks more than 3 .us domain urls are showing for each brand search made on Google UK and sometimes it replaces the .com results all together. I have double checked the PA for US pages, they are far below the UK ones. Has anyone noticed similar behaviour &/or could anyone please help me troubleshoot this issue? Thanks in advance, R
Algorithm Updates | | RaksG0 -
Does using parent pages in WordPress help with SEO and/or indexing for SERPs?
I have a law office and we handle four different practice areas. I used to have multiple websites (one for each practice area) with keywords in the actual domain name, but based on the recommendation of SEO "experts" a few years ago, I consolidated all the webpages into one single webpage (based on the rumors at the time that Google was going to be focusing on authorship and branding in the future, rather than keywords in URLs or titles). Needless to say, Google authorship was dropped a year or two later and "branding" never took off. Overall, having one webpage is convenient and generally makes SEO easier, but there's been a huge drawback: When my page comes up in SERPs after searching for "attorney" or "lawyer" combined with a specific practice area, the practice area landing pages don't typically come up in the SERPs, only the front page comes up. It's as if Google recognizes that I have some decent content, and Google knows that I specialize in multiple practice areas, but it directs everyone to the front page only. Prospective clients don't like this and it causes my bounce rate to be high. They like to land on a page focusing on the practice area they searched for. Two questions: (1) Would using parent pages (e.g. http://lawfirm.com/divorce/anytown-usa-attorney-lawyer/ vs. http://lawfirm.com/anytown-usa-divorce-attorney-lawyer/) be better for SEO? The research I've done up to this point appears to indicate "no." It doesn't make much difference as long as the keywords are in the domain name and/or URL. But I'd be interested to hear contrary opinions. (2) Would using parent pages (e.g. http://lawfirm.com/divorce/anytown-usa-attorney-lawyer/ vs. http://lawfirm.com/anytown-usa-divorce-attorney-lawyer/) be better for indexing in Google SERPs? For example, would it make it more likely that someone searching for "anytown usa divorce attorney" would actually end up in the divorce section of the website rather than the front page?
Algorithm Updates | | micromano0 -
Ahrefs - What Causes a Drastic Loss in Referring Pages?
While I was doing research on UK Flower companies I noticed that one particular domain had great rankings (top 3), but has slid quite a bit down to page two. After investigating further I noticed that they had a drastic loss of referring pages, but an increase in total referring domains. See this screenshot from ahrefs. I took a look at their historical rankings (got them from the original SEO provider's portfolio) and compared it to the Wayback Machine. There did not seem to be any drastic changes in the site structure. My question is what would cause such a dramatic loss in total referring pages while showing a dramatic increase in referring domains? It appears that the SEO company was trying rebound from the loss of links though. Any thoughts on why this might happen? 56VD5jD
Algorithm Updates | | AaronHenry0 -
Is it possible that Google may have erroneous indexing dates?
I am consulting someone for a problem related to copied content. Both sites in question are WordPress (self hosted) sites. The "good" site publishes a post. The "bad" site copies the post (without even removing all internal links to the "good" site) a few days after. On both websites it is obvious the publishing date of the posts, and it is clear that the "bad" site publishes the posts days later. The content thief doesn't even bother to fake the publishing date. The owner of the "good" site wants to have all the proofs needed before acting against the content thief. So I suggested him to also check in Google the dates the various pages were indexed using Search Tools -> Custom Range in order to have the indexing date displayed next to the search results. For all of the copied pages the indexing dates also prove the "bad" site published the content days after the "good" site, but there are 2 exceptions for the very 2 first posts copied. First post:
Algorithm Updates | | SorinaDascalu
On the "good" website it was published on 30 January 2013
On the "bad" website it was published on 26 February 2013
In Google search both show up indexed on 30 January 2013! Second post:
On the "good" website it was published on 20 March 2013
On the "bad" website it was published on 10 May 2013
In Google search both show up indexed on 20 March 2013! Is it possible to be an error in the date shown in Google search results? I also asked for help on Google Webmaster forums but there the discussion shifted to "who copied the content" and "file a DMCA complain". So I want to be sure my question is better understood here.
It is not about who published the content first or how to take down the copied content, I am just asking if anybody else noticed this strange thing with Google indexing dates. How is it possible for Google search results to display an indexing date previous to the date the article copy was published and exactly the same date that the original article was published and indexed?0 -
Why has my homepage been replaced in Google by my Facebook page?
Hi. I was wondering if others have had this happen to them. Lately, I've noticed that on a couple of my sites the homepage no longer appears in the Google SERP. Instead, a Facebook page I've created appears in the position the homepage used to get. My subpages still get listed in Google--just not the homepage. Obviously, I'd prefer that both the homepage and Facebook page appear. Any thoughts on what's going on? Thanks for your help!
Algorithm Updates | | TuxedoCat0 -
How do I get the expanded results in a Google search?
I notice for certain site (ex: mint.com) that when I search, the top result has a very detailed view with options to click to different subsections of the site. However for my site, even though we're consistently the top result for our branded terms, the result is still only a single line item. How do I adjust this?
Algorithm Updates | | syount1