How was cdn.seomoz.org configured?
-
The SEOmoz CDN appears to have a "pull zone" that is set to the root of the domain, such that any static file can be addressed from either subdomain:
http://www.seomoz.org/q/moz_nav_assets/images/logo.png
http://cdn.seomoz.org/q/moz_nav_assets/images/logo.png
The risk of this configuration is that web pages (not just images/CSS/JS) also get cached and served by the CDN. I won't put the URL here for fear of Google indexing it, but if you replace the 'www' in the URL below with 'cdn', you'll see a cached copy of the original:
http://www.seomoz.org/ugc/the-greatest-attribution-ever-graphed
The worst-case scenario is that the homepage gets indexed. But this doesn't happen here:
That URL issues a 301 redirect back to the canonical www subdomain. As it should.
Here's my question: how was that done?
Because maxcdn.com can't do it. If you set a "pull zone" to your entire domain, they'll cache your homepage and everything else. Googlebot has a field day with that; it will reindex your entire site off the CDN.
Maybe the SEOmoz CDN provider (CloudFront) allows specific URLs to be blocked? Or do you detect the CloudFront IPs and serve them a 301 (which they'd proxy out to anyone requesting cdn.seomoz.org)?
One solution is to create a pull zone that points to a folder, like example.com/images... but this doesn't help a complex site that has cacheable content in multiple places (do you WordPress users really store ALL your static content under /wp-content/ ?).
Or, as suggested above, dynamically detect requests from the CDN's proxy servers, and give them a 301 for any HTML-page request. This gets complex quickly, and is both prone to breakage and very difficult to regression-test.
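To make that second approach concrete, here is a minimal sketch of the decision logic, not anything SEOmoz or CloudFront is known to use. Everything named in it is an assumption for illustration: the CDN_NETWORKS range (a documentation-only placeholder, not a real CDN range), the CANONICAL_HOST value, the list of static extensions, and the function names. It assumes the origin can see the proxy's IP address and that `path` excludes the query string.

```python
import ipaddress
from typing import Optional, Tuple

# Hypothetical placeholder range; substitute the CDN's published edge ranges.
CDN_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]

# Hypothetical canonical hostname that HTML pages should live on.
CANONICAL_HOST = "www.example.com"

# Assumed list of extensions the CDN is allowed to cache.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".gif", ".ico", ".svg")


def is_cdn_request(remote_addr: str) -> bool:
    """True when the requesting IP falls inside a known CDN range."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in net for net in CDN_NETWORKS)


def is_static_asset(path: str) -> bool:
    """True when the path (without query string) ends in a static extension."""
    return path.lower().endswith(STATIC_EXTENSIONS)


def cdn_redirect(remote_addr: str, path: str) -> Optional[Tuple[int, str]]:
    """Return (301, location) for a CDN fetch of a non-static path, else None."""
    if is_cdn_request(remote_addr) and not is_static_asset(path):
        return 301, f"http://{CANONICAL_HOST}{path}"
    return None
```

Called from whatever handles requests at the origin, `cdn_redirect("203.0.113.7", "/")` would hand the CDN a redirect to the www homepage, while a request for a .png from the same address returns None and is served normally. The fragility mentioned above is visible even here: the IP list and the extension list both have to be kept current by hand.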
Properly retrofitting a complex site to use a CDN, without creating a half-dozen new CDN subdomains, does not appear to be easy.
-
It's a SEOmoz secret...