How was cdn.seomoz.org configured?
-
The SEOmoz CDN appears to have a "pull zone" that is set to the root of the domain, such that any static file can be addressed from either subdomain:
http://www.seomoz.org/q/moz_nav_assets/images/logo.png
http://cdn.seomoz.org/q/moz_nav_assets/images/logo.png
The risk of this configuration is that web pages (not just images/CSS/JS) also get cached and served by the CDN. I won't put the URL here for fear of Google indexing it, but if you replace the 'www' in the URL below with 'cdn', you'll see a cached copy of the original:
http://www.seomoz.org/ugc/the-greatest-attribution-ever-graphed
The worst-case scenario is that the homepage gets indexed. But this doesn't happen here:
That URL issues a 301 redirect back to the canonical www subdomain. As it should.
Here's my question: how was that done?
Because maxcdn.com can't do it. If you set a "pull zone" to your entire domain, they'll cache your homepage and everything else. googlebot has a field day with that; it will reindex your entire site off the CDN.
Maybe the SEOmoz CDN provider (CloudFront) allows specific URLs to be blocked? Or do you detect the CloudFront IPs and serve them a 301 (which they'd proxy out to anyone requesting cdn.seomoz.org)?
One solution is to create a pull zone that points to a folder, like example.com/images... but this doesn't help a complex site that has cacheable content in multiple places (do you Wordpress users really store ALL your static content under /wp-content/ ?).
Or, as suggested above, dynamically detect requests from the CDN's proxy servers, and give them a 301 for any HTML-page request. This gets complex quickly, and is both prone to breakage and very difficult to regression-test.
Properly retrofitting a complex site to use a CDN, without creating a half-dozen new CDN subdomains, does not appear to be easy.
-
its a SEOmoz secret...
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Schema.org mark up to avoid duplicate issue?
Hey there, I was wondering, does product's mark-up help to avoid penalization due to duplicate content? Here is the example: one of my client doesn't supply unique content. Because the major part of the content is technical description of products made by a couple of manufactures, do you think it will help me to link the official manufacturer webpage in a schena.org product mark-up? I know this is the right procedure to add mark-ups, but as on the pages of my client an outbound-link will show up, so I want to tell him this will be the only way to have that duplicate content without incurring in penalisation. I'd like to give him more than one solution, as I'm pretty sure it will never supply us with unique content. Thanks Pierpaolo
Intermediate & Advanced SEO | | madcow780 -
Where does Schema.org Microdata go on a page?
Say you've got a Magento e-commerce site and you want to add Schema.org Microdata to it to take advantage of Google's Rich Snippets feature. Would the markup be part of the page's HTML TITLE . . . or somewhere in the bare-bones description (usually wrought by inputting data into separate fields in the CMS), e.g., Item: Something
Intermediate & Advanced SEO | | RScime25
Price: $00.00
Short Description: blah, blah, blah Or, hidden somewhere in the header? Or, can it be marked-up somewhere beneath my lengthy (and Panda-friendly) content and subsequently extracted by Google and highlighted in the SERPs? Admittedly, I'm more than a bit late to the Schema.org party and I'm a content-guy anyway; and not much good at under-the-hood stuff. I figure I'd better get my chops together, now. I've searched Moz.com's Q&A as well as Google's and Schema.org's and haven't come up with an answer yet that doesn't require that I learn a whole new vocabulary.0 -
Schema.org Organization Logo markup
Hi Guys, Does anyone have an example of a site using schema.org Organization logo markup (as per: http://googlewebmastercentral.blogspot.com.au/2013/05/using-schemaorg-markup-for-organization.html), with the logo appearing in Google SERPs? One of the designers is pressing me ofr an example. I've found plenty of brands getting their logo in the SERPs knoweldge base results, but they have all been using G+ verified company profiles, or other methods (Googs simply selecting a best fit?) to achieve it. Thx!
Intermediate & Advanced SEO | | David_ODonnell0 -
Do links from twitter count in SEOMoz's Toolbar link count?
I am using the Chrome extension and looking at a SERP, when a page is said to have 2000 incoming links, does that include tweets with a link back to this page? What about retweets. Are those counted separately or as one? And what about independent tweets that have exactly the same content (tweet text + link)
Intermediate & Advanced SEO | | davhad0 -
All In One SEO PACK Configuration - Index or Noindex?
I'm finding conflicting information about the right way to configure the All in One SEO Pack wordpress plugin. Do I index or noindex for the items below? Use noindex for Categories - yes or no? Use noindex for Archives - yes or no? Use noindex for Tag Archives - yes or no?
Intermediate & Advanced SEO | | webestate0 -
Are there discrepancies between GWT and SEOMoz?
In our keyword rank tracking report, we've dominated a keyword in Google and have secured the slot for years. All evidence points in this direction. In Google Webmaster Tools, however, this particular keyword averages a rank of 6.5. Is anyone else experience these kinds of discrepancies? What is your take on it?
Intermediate & Advanced SEO | | NaHoku0 -
SeoMoz Crawler Shuts Down The Website Completely
Recently I have switched servers and was very happy about the outcome. However, every friday my site shuts down (not very cool if you are getting 700 unique visitors per day). Naturally I was very worried and digged deep to see what is causing it. Unfortunately, the direct answer was that is was coming from "rogerbot". (see sample below) Today (aug 5) Same thing happened but this time it was off for about 7 hours which did a lot of damage in terms of seo. I am inclined to shut down the seomoz service if I can't resolve this immediately. I guess my question is would there be a possibility to make sure this doesn't happen or time out like that because of roger bot. Please let me know if anyone has answer for this. I use your service a lot and I really need it. Here is what caused it from these error lines: 216.244.72.12 - - [29/Jul/2011:09:10:39 -0700] "GET /pregnancy/14-weeks-pregnant/ HTTP/1.1" 200 354 "-" "Mozilla/5.0 (compatible; rogerBot/1.0; UrlCrawler; http://www.seomoz.org/dp/rogerbot)" 216.244.72.11 - - [29/Jul/2011:09:10:37 -0700] "GET /pregnancy/17-weeks-pregnant/ HTTP/1.1" 200 51582 "-" "Mozilla/5.0 (compatible; rogerBot/1.0; UrlCrawler; http://www.seomoz.org/dp/rogerbot)"
Intermediate & Advanced SEO | | Jury0