CoralCDN Lesson: The great naming conflation of the Web

The last post argued that CoralCDN’s API, built on domain manipulation, provides a simple yet surprisingly powerful content delivery mechanism.  Unfortunately, this technique runs counter to how the web uses domain names.

Conflating naming, location, and authorization, browsers use domains for three purposes:

  1. Domains provide a human-readable name for what administrative entity a client is interacting with (e.g., the “common name” identified in SSL server certificates).
  2. Domains specify where to retrieve content after they are resolved to IP addresses (through DNS).
  3. Domains specify what security policies to enforce on web objects and their interactions, especially as it relates to browser Same Origin Policy (SOP).

CoralCDN’s domain manipulation clearly focuses on the location/addressing aspect of web objects (#2).  It has generated abuse complaints because of its naming (#1): from sites complaining about “illegal mirroring,” from third parties mistakenly issuing DMCA take-down notices, and from those fearing phishing attacks.  Its most serious implications, however, apply to browser security (#3).

The Same Origin Policy in browsers specifies how scripts and instructions from an origin domain can access and modify browser state.  This policy most significantly applies to manipulating cookies, browser windows, frames, and documents (through the DOM), as well as to requesting URLs via an XMLHttpRequest.  At its simplest level, all of these behaviors are only allowed between resources that belong to the identical origin domain.  This provides security against sites accessing each other’s private information kept in cookies, for example.  It also prevents websites that run advertisements (such as Google’s AdSense) from easily performing click fraud and paying themselves advertising dollars by programmatically “clicking” on the advertisements shown on their site.  (This is enforced because advertisements like AdSense are loaded in an iframe that the parent “document,” the third-party website that stands to gain revenue, cannot access, as the frame belongs to a different domain.)
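As a rough illustration of the "identical origin" check, here is a minimal Python sketch. It assumes the common browser convention of comparing scheme, host, and port; the helper names `origin` and `same_origin` are made up for this example and do not come from any browser or from CoralCDN.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Assumption: default ports for the two schemes discussed in this post.
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url: str) -> Tuple[str, str, int]:
    """Reduce a URL to the (scheme, host, port) triple used for comparison."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme, -1)
    return (scheme, host, port)

def same_origin(a: str, b: str) -> bool:
    """True only when both URLs reduce to the identical origin triple."""
    return origin(a) == origin(b)

# Two Coralized sites share the nyud.net suffix but are still distinct origins.
print(same_origin("http://example.com.nyud.net/", "http://evil.com.nyud.net/"))
print(same_origin("http://example.com/a", "http://example.com:80/b"))
```

Under this check, DOM and frame access between example.com.nyud.net and evil.com.nyud.net would be refused, since the hosts differ.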

One caveat to the strict definition of an identical origin (per RFC 2965) is an exception for domains that share the same domain.tld suffix: www.example.com can read and set cookies for example.com.  Consider, however, how CoralCDN’s domain manipulation affects this.  When example.com is accessed via CoralCDN, it can manipulate all nyud.net cookies, not just those restricted to example.com.nyud.net.  Because of the potential privacy violations, CoralCDN does not “support” cookies, in that its proxies delete any Cookie or Set-Cookie HTTP headers.
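The suffix exception can be sketched as a small domain-matching function. This is a simplified model of cookie domain matching, not a full implementation of the RFC: it ignores IP addresses and public-suffix checks, and the name `domain_matches` is chosen only for this example.

```python
def domain_matches(host: str, cookie_domain: str) -> bool:
    """Simplified cookie domain-match: exact host, or a dot-separated suffix."""
    host = host.lower().rstrip(".")
    cookie_domain = cookie_domain.lower().lstrip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)

# The intended use: www.example.com shares cookies scoped to example.com.
print(domain_matches("www.example.com", "example.com"))

# The side effect under CoralCDN: every Coralized site matches nyud.net.
print(domain_matches("example.com.nyud.net", "nyud.net"))
print(domain_matches("evil.com.nyud.net", "nyud.net"))

# A cookie scoped to the full Coralized name is not shared with other sites.
print(domain_matches("evil.com.nyud.net", "example.com.nyud.net"))
```

The first three checks succeed and the last one fails, which is why a cookie scoped to nyud.net is visible to every site viewed through CoralCDN.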

Many websites now manage cookies via JavaScript, however, so cookie information still “leaks” between Coralized domains in the browser.  This often happens without a site’s knowledge, as sites commonly use the URL’s domain suffix without verifying its name.  Thus, if the Coralized example.com writes nyud.net cookies, these will be sent to evil.com.nyud.net if the client visits that webpage.  Honest CoralCDN proxies will delete these cookies in transit, but attackers can still circumvent this protection.  For example, when a client visits evil.com.nyud.net, JavaScript from that page can access nyud.net cookies, then issue an XMLHttpRequest back to evil.com.nyud.net with the cookie information embedded in the URL.  These problems are mitigated by other security decisions: as CoralCDN does not support https or POST, it is unlikely that sites will establish authenticated sessions over it.  Given these attack vectors, however, simply opening up CoralCDN to a peer-to-peer deployment as is would introduce significant risk.  Similar attacks would be possible against other uses of the Same Origin Policy in the browser, especially as it relates to the ability to access and manipulate the DOM.
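To see why header stripping alone does not close this hole, consider the following toy model in Python. It is only a simulation with plain dictionaries: `strip_cookie_headers` stands in for an honest proxy, `script_request` stands in for a page script that copies readable cookies into the request URL, and the `/collect` path and the `session=abc123` cookie are invented for illustration.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qs, urlencode, urlsplit

STRIPPED_HEADERS = {"cookie", "set-cookie"}

def strip_cookie_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Model of an honest proxy: drop Cookie and Set-Cookie, keep the rest."""
    return {k: v for k, v in headers.items() if k.lower() not in STRIPPED_HEADERS}

def script_request(cookies: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Model of a page script that copies the cookies it can read into the URL."""
    url = "http://evil.com.nyud.net/collect?" + urlencode(cookies)
    headers = {
        "Host": "evil.com.nyud.net",
        "Cookie": "; ".join(f"{k}={v}" for k, v in cookies.items()),
    }
    return url, headers

# A nyud.net-scoped cookie written by the Coralized example.com (hypothetical value).
url, headers = script_request({"session": "abc123"})
forwarded = strip_cookie_headers(headers)   # what the proxy passes upstream
leaked = parse_qs(urlsplit(url).query)      # what the URL still carries

print("Cookie" in forwarded)  # the header is gone
print(leaked)                 # the value survives in the query string
```

The proxy removes the Cookie header as intended, but it has no way to know that the query string carries the same value.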

These issues demonstrate other challenges with deploying a secure, cooperative CDN, beyond the problem of finding the right “tradeoff” I talked about previously. It may be attractive to consider using end-hosts in a peer-to-peer fashion, perhaps even embedding proxy software in resource containers or VMs to satisfy those users’ concerns.  If clients and servers can be slightly modified, end-to-end signatures (as in RFC 2660 and Firecoral) can help ensure the integrity of content distributed through an untrusted proxy network.  Similar care would still need to be taken, however, to ensure the appropriate confidentiality of user-specific information.

In fact, these are some of the very challenges and approaches we are tackling with Firecoral, which seeks to build a P2P-CDN by running “cooperative proxies” as a browser extension of participating peers. We’re actively working towards a release; hopefully any day now!

  • oobx

    I was just turned on to Coral and was pretty jazzed about it until I read your post. Google app engine as a CDN is what led me to Coral. GAE should not be subject to such security issues. I've not read your other posts; so please excuse my ignorance of firecoral, etc.

    In trying to comprehend the scope of the security issues you raise, I conclude that only cookies set by nyud.net-cached content are vulnerable. So, I just use coral cache for images and truly static content.

    But, what's to stop evildoer from linking to my script that sets cookies? Nothing. But, how would he gain the trust of the user in order for the user to click on the nyud.net link? Then, how would evildoer track that click and convince the user to go to the malicious site to hijack data?

    Coral CDN sounds like a great asset for bandwidth-poor folks. I hope you can improve upon it. As is, it seems very workable so long as developers understand the caveats such as security and the potential to skew statistics.

    Thanks for raising the issue.

  • Hi oobx,

    Actually, the cookie issue is much less a security issue if you are a website that is trying to explicitly use CoralCDN for cached content. You should just specify that your code uses the full origin name when setting cookies: http://www.yoursite.com.nyud.net, instead of just setting a default of the domain.tld (i.e., nyud.net) for “ease of use”. This is good security practice anyway: the principle of least privilege and all. Then a user from evil.com.nyud.net can't read cookies set to http://www.yoursite.com.nyud.net, as it fails the same origin policy check.

    The problem I raise above is more when a website is being accessed by a Coralized URL and they are not similarly security conscious, so that they default to using the domain.tld, instead of the full origin name.

    Let me know if that assuages your concern.

  • Do you think it's true that the internet is slated to run out of domain names next year?

  • I'm not too worried about that. Much is just domain squatting anyway…

  • Mike, that's a good point. I was listening to the radio last evening, and there is a group trying to do a .GAY top level domain. I thought it was interesting, because the person who is promoting that idea owns a for-profit business that plans on purchasing most of the popular domain names if .GAY gets the go ahead.

  • If the ICANN refused to support .xxx, I highly doubt they will support .gay. Perhaps the more interesting development was the approval of non-latin domain names recently.