* You are viewing the archive for the ‘Content Distribution’ Category

CoralCDN Lesson: The great naming conflation of the Web

coralcdn-lesson-the-great-naming-conflation-of-the-web

The last post argued how CoralCDN’s API through domain manipulation provided a simple yet surprisingly powerful content delivery mechanism.  Unfortunately, its technique flies in the face of the web’s use of domain names.

Conflating naming, location, and authorization, browsers use domains for three purposes:

  1. Domains provide a human-readable name for what administrative entity a client is interacting with (e.g., the “common name” identified in SSL server certificates).
  2. Domains specify where to retrieve content after they are resolved to IP addresses (through DNS).
  3. Domains specify what security policies to enforce on web objects and their interactions, especially as it relates to browser Same Origin … Continue Reading

CoralCDN Lesson: The interface was right -or- Programming elastic CDN services

coralcdn-lesson-the-interface-was-right-or-programming-elastic-cdn-services

While my previous post argued that CoralCDN’s architecture design might not be ideal given its deployment, it has proven successful from the simple perspective of real-world use. Rather than any technical argument, we believe that the central reason for its adoption has been its simple user interface: Any URL can be requested through CoralCDN by appending nyud.net to its hostname.

Interface design

While superficially obvious, this interface design achieves several important deployment goals:

  • Transparency: Work with unmodified, unconfigured, and unaware web clients and web servers.
  • Deep caching: Support the automatic retrieval of embedded images or links also through CoralCDN when appropriate.
  • Server control: Not … Continue Reading

CoralCDN Lesson: The design was mostly wrong

coralcdn-lesson-the-design-was-mostly-wrong

Most of my posts about CoralCDN to date have discussed techniques to make the system more robust; now I discuss what it got wrong.  While nice, many of these optimizations were in fact moot: CoralCDN’s design is ill-suited for its current deployment and usage.

coral-uniq-reqsLet us frame this argument by first considering some usage statistics from CoralCDN’s deployment.  The available aggregate data from 167 of the ~250 operating CoralCDN nodes during one recent, randomly-chosen day (January 20, 2009) shows that these nodes received a total of 9.74M requests … Continue Reading

Bridging the Gap with HashCache

bridging-the-gap-with-hashcache

[Today we'll be having a guest post by Anirudh Badam, a PhD student in the larger Network Systems Group at Princeton, related to systems research for developing regions.  The work he'll be talking about was recently named one of Technology Review's Top 10 Emerging Technologies for 2009.  -- Mike]

To provide Internet connectivity in the developing world is a daunting task, with problems pertaining to a high cost of bandwidth, ill-provisioned equipment and power, scarcity of on-site expertise, and adverse environmental conditions. Most common way to offset bandwidth cost/consumption is to deploy high-performance web proxy … Continue Reading

CoralCDN Lesson: Fair-sharing bandwidth via admission control

coralcdn-lesson-fair-sharing-bandwidth-via-admission-control

For commercial CDNs and other computing services, the typical answer to resource limits is simply to acquire more capacity.  As CoralCDN’s deployment on PlanetLab does not have that luxury, we instead apply admission control to manage its bandwidth resources.  This post describes some of these mechanisms, while we’ll take a step back in the next post to describe some of the challenges in doing resource accounting and management on a virtualized and shared platform such as PlanetLab.

asiantsunamivideos

Following the Asian tsunami of December … Continue Reading

CoralCDN Lesson: Accepting conservatively and serving liberally

coralcdn-lesson-accepting-conservatively-and-serving-liberally

At its heart, CoralCDN provides a caching serving, not a persistent data store.  Thus, it ultimately requires that a URL’s origin server is initially available, so that it can pull in content to some CoralCDN proxy and make it available across the network.   While traditional web proxies normally interact with sufficiently-provisioned or otherwise well-behaved origin webservers, CoralCDN experiences a different norm.  Given its very design goals, its proxies typically interact with overloaded or poorly-behaving servers; it therefore needs to react to (non-crash) failures as the rule, not the exception.  Thus, one design philosophy that has come to govern CoralCDN … Continue Reading

Security mechanisms in CoralCDN (and some attacks)

security-mechanisms-in-coralcdn-and-some-attacks

Before finally getting to some experiences, I wanted to touch on some of the security mechanisms that CoralCDN proxies incorporate to curtail misuse, especially important given their deployment at PlanetLab-affiliated universities.

Limited functionality

CoralCDN proxies only support GET and HEAD requests.  Many of the attacks for which “open” proxies are infamous are simply not feasible.  For example, clients cannot use CoralCDN to POST passwords for brute-force cracking.  It does not support SSL and thus risk carry more confidential data.  CoralCDN proxies do not support CONNECT requests, and thus they cannot be used to send spam as SMTP relays or forge From: addresses … Continue Reading

The Design of CoralCDN

the-design-of-coralcdn

In this post, I describe the architecture and mechanisms of CoralCDN at a high-level. This is meant to provide some of the background necessary for some of our experiences and lessons with operating the system.

System Overview

CoralCDN is composed of three main parts: (1) a network of cooperative HTTP proxies that handle users’ requests, (2) a network of DNS nameservers for .nyud.net that map clients to nearby CoralCDN HTTP proxies, and (3) the underlying Coral indexing infrastructure and clustering machinery on which the first two applications are built.  You’ll find that I refer to the entire system as “CoralCDN”, but the … Continue Reading

Firecoral @ IPTPS

firecoral-iptps

We’ve recently been working hard on Firecoral – a browser-based, peer-to-peer content distribution network for web caching. I’ll be presenting a short talk on Firecoral at the 8th International Workshop on Peer-to-Peer Systems (IPTPS) on April 21st in Boston, MA.

Peer-to-peer content distribution has been inarguably successful for large file distribution (e.g. BitTorrent), but P2P services have been restricted to stand-alone applications, not transparently incorporated into Web browsing and seamlessly running over HTTP. CoralCDN has served as a web content distribution network for the past five years, but its deployment has been limited to PlanetLab and demand quickly … Continue Reading

“Systems Researcher Interview”

systems-researcher-interview

Emil Sit, a friend from MIT with whom I published one of my first papers, has been blogging a series of “interviews” from colleagues about their experience building distributed systems. He started this series after describing some implementation issues with building Chord, which was one of the first and remains the canonical distributed hash table (DHT).   CoralCDN uses a DHT for its indexing, and its first implementation actually used MIT Chord as a software layer.  (I later implemented my own DHT layer, although instead based on Kademlia — which was proposed by an officemate … Continue Reading