CoralCDN Lesson: The great naming conflation of the Web
The last post argued that CoralCDN’s API, built on domain manipulation, provides a simple yet surprisingly powerful content delivery mechanism. Unfortunately, the technique flies in the face of how the web uses domain names.
Conflating naming, location, and authorization, browsers use domains for three purposes:
1. Domains provide a human-readable name for what administrative entity a client is interacting with (e.g., the “common name” identified in SSL server certificates).
2. Domains specify where to retrieve content after they are resolved to IP addresses (through DNS).
3. Domains specify what security policies to enforce on web objects and their interactions, especially as it relates to browser Same Origin Policy (SOP).
CoralCDN’s domain manipulation clearly focuses on the location/addressing aspect of web objects (#2). Its naming (#1) has generated abuse complaints, whether from sites complaining about “illegal mirroring,” third parties mistakenly issuing DMCA take-down notices, or those fearing phishing attacks, but its most serious implications apply to browser security (#3).
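As a reminder of what the domain manipulation looks like, here is a minimal sketch in Python. The function name `coralize` and the constant `CORAL_SUFFIX` are made up for illustration, and the sketch assumes a plain hostname with no port or userinfo, as in the examples in this post. It is not CoralCDN's own code.

```python
from urllib.parse import urlsplit, urlunsplit

CORAL_SUFFIX = ".nyud.net"  # the suffix used in this post's examples


def coralize(url: str) -> str:
    """Append the CoralCDN suffix to a URL's hostname (illustrative only)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host or host.endswith(CORAL_SUFFIX):
        return url  # no hostname to rewrite, or already Coralized
    # Assumption: no port or userinfo in the URL; they would be dropped here.
    return urlunsplit(
        (parts.scheme, host + CORAL_SUFFIX, parts.path, parts.query, parts.fragment)
    )


print(coralize("http://example.com/index.html"))
```

Only the hostname changes. The path, the content, and the name a user sees in the address bar all ride along with it, which is where the conflation starts to matter.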
The Same Origin Policy in browsers specifies how scripts and instructions from an origin domain can access and modify browser state. This policy most significantly applies to manipulating cookies, browser windows, frames, and documents (through the DOM), as well as to requesting URLs via an XMLHttpRequest. At its simplest level, all of these behaviors are allowed only between resources that belong to the identical origin domain. This provides security against sites accessing each other’s private information kept in cookies, for example. It also prevents websites that run advertisements (such as Google’s AdSense) from easily performing click fraud and paying themselves advertising dollars by programmatically “clicking” on the advertisements shown on their site. (This is enforced because advertisements like AdSense are loaded in an iframe that the parent “document” (the third-party website that stands to gain revenue) cannot access, as the frame belongs to a different domain.)
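The basic same-origin comparison can be sketched as follows. Browsers compare the scheme, host, and port of two URLs. The helper names `origin` and `same_origin` are hypothetical, and the sketch ignores details such as `document.domain` relaxation, so read it as an approximation rather than any browser's actual logic.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str):
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when none is given explicitly.
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a: str, url_b: str) -> bool:
    """True only when scheme, host, and port all match."""
    return origin(url_a) == origin(url_b)


# A publisher's page and an ad frame served from another domain differ in host.
print(same_origin("http://publisher.example/page", "http://ads.example.net/frame"))
# Two Coralized sites are also distinct origins under this strict check.
print(same_origin("http://example.com.nyud.net/", "http://evil.com.nyud.net/"))
```

Under this strict check, two Coralized sites remain separate origins. The trouble described next comes from the looser domain-suffix rule that applies to cookies.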
One caveat to the strict definition of an identical origin (per RFC 2965) is an exception for domains that share the same domain.tld suffix: www.example.com can read and set cookies for example.com. Consider, however, how CoralCDN’s domain manipulation affects this. When example.com is accessed via CoralCDN, it can manipulate all nyud.net cookies, not just those restricted to example.com.nyud.net. Concerned about the potential privacy violations, CoralCDN does not “support” cookies, in that its proxies delete any Cookie or Set-Cookie HTTP headers.
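The proxy-side filtering mentioned above might look roughly like this. It is a sketch, not CoralCDN's proxy code: headers are modeled as a list of `(name, value)` pairs, and `strip_cookie_headers` is a hypothetical helper.

```python
BLOCKED_HEADERS = {"cookie", "set-cookie"}


def strip_cookie_headers(headers):
    """Drop Cookie and Set-Cookie headers; header names are case-insensitive."""
    return [
        (name, value)
        for name, value in headers
        if name.strip().lower() not in BLOCKED_HEADERS
    ]


request_headers = [
    ("Host", "example.com.nyud.net"),
    ("Cookie", "session=abc123"),
    ("Accept", "text/html"),
]
print(strip_cookie_headers(request_headers))
```

The same filter would apply in both directions: `Cookie` on requests from the client, `Set-Cookie` on responses from the origin.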
Many websites now manage cookies via JavaScript, however, so cookie information still “leaks” between Coralized domains on the browser. This often happens without a site’s knowledge, as sites commonly use the URL’s domain suffix without verifying its name. Thus, if the Coralized example.com writes nyud.net cookies, these will be sent to evil.com.nyud.net if the client visits that webpage. Honest CoralCDN proxies will delete these cookies in transit, but attackers can still circumvent this protection. For example, when a client visits evil.com.nyud.net, JavaScript from that page can access nyud.net cookies, then issue an XMLHttpRequest back to evil.com.nyud.net with cookie information embedded in the URL. These problems are mitigated by other security decisions: as CoralCDN does not support https or POST, it is unlikely that sites will establish authenticated sessions over it. Given these attack vectors, however, simply opening up CoralCDN to a peer-to-peer deployment as is would introduce significant risk. Similar attacks would be possible against other uses of the Same Origin Policy in the browser, especially as it relates to the ability to access and manipulate the DOM.
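The leak described above can be reproduced with a toy model of a browser cookie jar. Everything here is hypothetical and simplified: `naive_cookie_domain` imitates a script that takes the last two labels of the hostname without checking them, and `ToyCookieJar` applies only a suffix-style domain match (real browsers also handle host-only cookies, paths, and public suffixes).

```python
def naive_cookie_domain(hostname: str) -> str:
    """Take the last two labels of the hostname, as a careless script might."""
    return ".".join(hostname.lower().split(".")[-2:])


def domain_matches(hostname: str, cookie_domain: str) -> bool:
    """True if hostname equals cookie_domain or is a subdomain of it."""
    hostname = hostname.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return hostname == cookie_domain or hostname.endswith("." + cookie_domain)


class ToyCookieJar:
    """A simplified stand-in for the browser's cookie store."""

    def __init__(self):
        self._cookies = []  # (domain, name, value) triples

    def set_from_page(self, page_host, name, value, domain=None):
        domain = (domain or page_host).lower().lstrip(".")
        # A page may set cookies only for its own host or a parent domain.
        if not domain_matches(page_host, domain):
            raise ValueError(f"{page_host} may not set cookies for {domain}")
        self._cookies.append((domain, name, value))

    def visible_to(self, page_host):
        """Cookies that scripts on page_host could read."""
        return {
            name: value
            for domain, name, value in self._cookies
            if domain_matches(page_host, domain)
        }


jar = ToyCookieJar()
victim = "example.com.nyud.net"
# On the origin site this script would pick "example.com"; Coralized, it picks "nyud.net".
jar.set_from_page(victim, "session", "abc123", domain=naive_cookie_domain(victim))
print(jar.visible_to("evil.com.nyud.net"))
```

Because the read happens inside the browser, no proxy sees it. Stripping headers in transit does nothing to stop a script on evil.com.nyud.net from reading the value and sending it home in a URL.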
These issues demonstrate other challenges with deploying a secure, cooperative CDN, beyond the problem of finding the right “tradeoff” I talked about previously. It may be attractive to consider using end-hosts in a peer-to-peer fashion, perhaps even embedding proxy software in resource containers or VMs to satisfy those users’ concerns. If clients and servers can be slightly modified, end-to-end signatures (as in RFC 2660 and Firecoral) can help ensure the integrity of content distributed through an untrusted proxy network. Similar care would still need to be taken, however, to ensure the appropriate confidentiality of user-specific information.
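To make the integrity idea concrete, here is a deliberately simplified sketch. It is not the RFC 2660 or Firecoral mechanism. It swaps the signature for a plain SHA-256 digest and assumes the client has already obtained that digest from the origin over a channel it trusts. A real deployment would have the origin sign the digest with a private key so that clients can verify it without such a channel. The helper names are hypothetical.

```python
import hashlib
import hmac


def content_digest(body: bytes) -> str:
    """Hex SHA-256 digest of the response body."""
    return hashlib.sha256(body).hexdigest()


def verify_content(body: bytes, trusted_digest: str) -> bool:
    """Check a body fetched through an untrusted proxy against the origin's digest."""
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(content_digest(body), trusted_digest.lower())


original = b"<html>hello</html>"
digest_from_origin = content_digest(original)  # assumed to arrive via a trusted path

print(verify_content(original, digest_from_origin))
print(verify_content(b"<html>hello<script>evil()</script></html>", digest_from_origin))
```

Note that this addresses integrity only. A proxy that cannot alter content undetected can still read it, which is why confidentiality needs separate care.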
In fact, these are some of the very challenges and approaches we are tackling with Firecoral, which seeks to build a P2P-CDN by running “cooperative proxies” as a browser extension of participating peers. We’re actively working towards a release; hopefully any day now!