Post-doc opportunity

Jen Rexford and I are jointly seeking to hire a post-doc to join us at Princeton.   We are looking for somebody to start sometime between February and June 2010 (ideally, as soon as possible) and stay for a duration of 18-24 months.

This research opportunity is part of the SCAFFOLD project.  SCAFFOLD is a new network architecture that focuses on better supporting distributed and wide-area services.  These services, run over multiple hosts distributed across many locations, need to respond quickly to server and network churn: both unexpected changes (due to equipment failures and physical mobility) and intentional changes (during planned … Continue Reading

New datacenter network architectures

This year’s HotNets workshop was held over the past two days in the faculty club at NYU; it was nice being on old turf.   The HotNets workshop has authors write 6-page “position” or “work-in-progress” papers on current “hot topics in networking” (surprise!).  Tucked into a cosy downstairs room, the workshop was nicely intimate and it saw lots of interesting questions and discussion.

One topic of particular interest to me was datacenter networking; HotNets included two papers in each of two different research areas on the subject.

The first thematic area was addressing the problem of bisectional … Continue Reading

CoralCDN Lesson: The great naming conflation of the Web

The last post argued how CoralCDN’s API through domain manipulation provided a simple yet surprisingly powerful content delivery mechanism.  Unfortunately, its technique flies in the face of the web’s use of domain names.

Conflating naming, location, and authorization, browsers use domains for three purposes:

  1. Domains provide a human-readable name for what administrative entity a client is interacting with (e.g., the “common name” identified in SSL server certificates).
  2. Domains specify where to retrieve content after they are resolved to IP addresses (through DNS).
  3. Domains specify what security policies to enforce on web objects and their interactions, especially as it relates to browser Same Origin … Continue Reading

CoralCDN Lesson: The interface was right -or- Programming elastic CDN services

While my previous post argued that CoralCDN’s architecture design might not be ideal given its deployment, it has proven successful from the simple perspective of real-world use. Rather than any technical argument, we believe that the central reason for its adoption has been its simple user interface: Any URL can be requested through CoralCDN by appending nyud.net to its hostname.
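As a rough illustration of that interface, here is a minimal Python sketch of rewriting a URL so that it is requested through CoralCDN. The function name `coralize` and the constant `CORAL_SUFFIX` are made up for this example, and the choice to reject URLs with explicit ports is an assumption of the sketch, since the excerpt does not say how such URLs are handled.

```python
from urllib.parse import urlsplit, urlunsplit

# Suffix appended to the hostname, as described in the post.
CORAL_SUFFIX = ".nyud.net"


def coralize(url: str) -> str:
    """Return `url` with the CoralCDN suffix appended to its hostname.

    Hypothetical helper for illustration only. Userinfo in the URL is
    dropped, and explicit ports are rejected (an assumption of this sketch).
    """
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError("URL has no hostname: %r" % url)
    if parts.port is not None:
        raise ValueError("explicit ports are outside the scope of this sketch")
    host = parts.hostname
    if host.endswith(CORAL_SUFFIX):
        return url  # already points at CoralCDN; leave it alone
    return urlunsplit(parts._replace(netloc=host + CORAL_SUFFIX))


print(coralize("http://example.com/a/b?x=1"))
```

The point is that nothing besides the hostname changes: the path and query string pass through untouched, which is what lets unmodified clients and servers keep working.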

Interface design

While superficially obvious, this interface design achieves several important deployment goals:

  • Transparency: Work with unmodified, unconfigured, and unaware web clients and web servers.
  • Deep caching: Support the automatic retrieval of embedded images or links also through CoralCDN when appropriate.
  • Server control: Not … Continue Reading

CoralCDN Lesson: The design was mostly wrong

Most of my posts about CoralCDN to date have discussed techniques to make the system more robust; now I discuss what it got wrong.  While nice, many of these optimizations were in fact moot: CoralCDN’s design is ill-suited for its current deployment and usage.

Let us frame this argument by first considering some usage statistics from CoralCDN’s deployment.  The available aggregate data from 167 of the ~250 operating CoralCDN nodes during one recent, randomly-chosen day (January 20, 2009) shows that these nodes received a total of 9.74M requests … Continue Reading

Bridging the Gap with HashCache

[Today we'll be having a guest post by Anirudh Badam, a PhD student in the larger Network Systems Group at Princeton, related to systems research for developing regions.  The work he'll be talking about was recently named one of Technology Review's Top 10 Emerging Technologies for 2009.  -- Mike]

Providing Internet connectivity in the developing world is a daunting task, with problems pertaining to the high cost of bandwidth, ill-provisioned equipment and power, scarcity of on-site expertise, and adverse environmental conditions. The most common way to offset bandwidth cost/consumption is to deploy high-performance web proxy … Continue Reading

Blog Name -> Dirty Slate Design

So after a few months of (occasional) posts, we finally decided to give the blog a name.

Dirty Slate Design reflects a design philosophy for systems and networking research, where deployability is a central goal.  While it’s often tempting to try to solve problems by wiping the slate clean and starting afresh, expecting a complete redesign and redeployment of systems as important, complex, and far-flung as the Web or the Internet is rather optimistic at best.

So, rather than implying something that is just “quick and dirty,” this philosophy tries to push new functionality or designs in a way that can be … Continue Reading

CoralCDN Lesson: Interacting with virtualized and shared hosting services

In the previous post, I discussed how CoralCDN implemented bandwidth restrictions that were fair-shared between “customer” domains. There was another major twist to this problem, however, that I didn’t talk about: the challenge of performing such a technique on a virtualized and shared platform such as PlanetLab.  While my discussion is certainly PlanetLab-centric, its questions are also applicable to other P2P deployments where users run peers within resource containers, or to commercial hosting environments using billing models such as 95th percentile usage.

Interacting with hosting platforms

CoralCDN’s self-regulation works well in trusted environments, and this approach is used similarly in other peer-to-peer … Continue Reading

CoralCDN Lesson: Fair-sharing bandwidth via admission control

For commercial CDNs and other computing services, the typical answer to resource limits is simply to acquire more capacity.  As CoralCDN’s deployment on PlanetLab does not have that luxury, we instead apply admission control to manage its bandwidth resources.  This post describes some of these mechanisms, while we’ll take a step back in the next post to describe some of the challenges in doing resource accounting and management on a virtualized and shared platform such as PlanetLab.

Following the Asian tsunami of December … Continue Reading

CoralCDN Lesson: Accepting conservatively and serving liberally

At its heart, CoralCDN provides a caching service, not a persistent data store.  Thus, it ultimately requires that a URL’s origin server is initially available, so that it can pull in content to some CoralCDN proxy and make it available across the network.  While traditional web proxies normally interact with sufficiently-provisioned or otherwise well-behaved origin webservers, CoralCDN experiences a different norm.  Given its very design goals, its proxies typically interact with overloaded or poorly-behaving servers; it therefore needs to react to (non-crash) failures as the rule, not the exception.  Thus, one design philosophy that has come to govern CoralCDN … Continue Reading

Postdocs and the CIFellows program

Some of you might have heard about the Computing Innovation Fellows program, which is a new funding opportunity for recent PhDs interested in pursuing a 1-2 year postdoc.  Realistically, this program was a response to the terrible job market (both in academia and at research labs) that graduates are facing this year.  It’s pretty impressive that the CCC and CRA were able to put together the plan, funding, and organizing so quickly!  For those not aware of the program and interested in a postdoc next year, check it out.  Applications are due June 9, 2009.

I also just … Continue Reading

CoralCDN Lesson: Fixing overlooked assumptions in DHTs

So let’s start with the first of seven lessons from CoralCDN’s deployment:

  • How all published distributed hash table (DHT) algorithms are susceptible to race conditions and routing errors for non-transitive network connectivity, and what can be done to mitigate these problems.

Some challenges with deploying DHTs

CoralCDN’s primary goal was to enable websites to survive spikes in traffic.  We can see examples of such so-called flash crowds through CoralCDN: The figure on the left shows a spike to Coralized slashdot.org URLs that occurred in mid-2005.  Requests grew from nothing to … Continue Reading

Security mechanisms in CoralCDN (and some attacks)

Before finally getting to some experiences, I wanted to touch on some of the security mechanisms that CoralCDN proxies incorporate to curtail misuse, especially important given their deployment at PlanetLab-affiliated universities.

Limited functionality

CoralCDN proxies only support GET and HEAD requests, so many of the attacks for which “open” proxies are infamous are simply not feasible.  For example, clients cannot use CoralCDN to POST passwords for brute-force cracking.  It does not support SSL and thus does not risk carrying more confidential data.  CoralCDN proxies do not support CONNECT requests, and thus they cannot be used to send spam as SMTP relays or forge From: addresses … Continue Reading
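The method restriction described in this excerpt amounts to an allowlist check on each request. A minimal Python sketch follows; the names `ALLOWED_METHODS` and `is_supported` are hypothetical and not taken from CoralCDN’s code, and how a real proxy responds to a rejected request is left out because the excerpt does not say.

```python
# Hypothetical allowlist mirroring the excerpt: only GET and HEAD are served.
ALLOWED_METHODS = frozenset({"GET", "HEAD"})


def is_supported(method: str) -> bool:
    """Return True if a request with this HTTP method would be served.

    HTTP method names are case-sensitive, so no normalization is applied.
    """
    return method in ALLOWED_METHODS


for m in ("GET", "HEAD", "POST", "CONNECT"):
    print(m, is_supported(m))
```

An allowlist fails closed: any method not explicitly named, including ones the proxy author never considered, is refused by default.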

The Design of CoralCDN

In this post, I describe the architecture and mechanisms of CoralCDN at a high level. This is meant to provide the background needed to follow our experiences and lessons from operating the system.

System Overview

CoralCDN is composed of three main parts: (1) a network of cooperative HTTP proxies that handle users’ requests, (2) a network of DNS nameservers for .nyud.net that map clients to nearby CoralCDN HTTP proxies, and (3) the underlying Coral indexing infrastructure and clustering machinery on which the first two applications are built.  You’ll find that I refer to the entire system as “CoralCDN”, but the … Continue Reading

Firecoral @ IPTPS

We’ve recently been working hard on Firecoral – a browser-based, peer-to-peer content distribution network for web caching. I’ll be presenting a short talk on Firecoral at the 8th International Workshop on Peer-to-Peer Systems (IPTPS) on April 21st in Boston, MA.

Peer-to-peer content distribution has been inarguably successful for large file distribution (e.g. BitTorrent), but P2P services have been restricted to stand-alone applications, not transparently incorporated into Web browsing and seamlessly running over HTTP. CoralCDN has served as a web content distribution network for the past five years, but its deployment has been limited to PlanetLab and demand quickly … Continue Reading

“Systems Researcher Interview”

Emil Sit, a friend from MIT with whom I published one of my first papers, has been blogging a series of “interviews” from colleagues about their experience building distributed systems. He started this series after describing some implementation issues with building Chord, which was one of the first and remains the canonical distributed hash table (DHT).   CoralCDN uses a DHT for its indexing, and its first implementation actually used MIT Chord as a software layer.  (I later implemented my own DHT layer, although instead based on Kademlia — which was proposed by an officemate … Continue Reading

From Industry to Academia, or it’s been a long, windy road…

Having spent the last 9 years in industry, it’s been a refresher diving back into academic waters, not unlike taking a cold shower on a blistering hot day. It shocks you a bit at first, but once acclimated you feel so much better. One of the primary reasons for my return to the ivory internet tower is to expand the horizon of knowledge and understanding, both my own and of the systems/networking field, in large part motivated by my experience in industry. Innovation and expansive thinking are often scant luxuries, indulged in only when the rare respite between development cycles … Continue Reading

Series: Experiences with CoralCDN

Over the next few weeks, I’ll be posting a number of my “experiences” from the design and deployment of CoralCDN.  For those who aren’t familiar with CoralCDN, it’s a semi-open, self-organizing content distribution network (CDN) that I’ve been operating on PlanetLab for the past five years.

Our goal with CoralCDN was to democratize content distribution:  to make desired content available to everybody, regardless of the publisher’s own resources or dedicated hosting services.  It provides an open infrastructure that any publisher is free to use, without any prior registration. Publishing through CoralCDN is as simple as appending a suffix to a URL’s … Continue Reading

History of NSDR

The call for papers for the 3rd Workshop on Networked Systems for Developing Regions (NSDR) was announced today. NSDR 2009 will be held with ACM SOSP this year at Big Sky, Montana. Direct all whining about the location to the SOSP organizers please!

I thought I’d share a little history of NSDR on this blog. Research in technologies for developing regions has been going on for a while. For example, the TIER group at Berkeley started in 2003. However, this area (often dubbed ICTD) lacked a sense of community, with no specialized workshops or conferences.

In 2006, I was attending SenSys at … Continue Reading

Coordination in Distributed Systems (ZooKeeper)

Architecting distributed systems can be very difficult. Arguably the hardest part of programming a distributed application is getting node coordination correct. I’ll define a node in this context as a service running on a single server that communicates with other nodes; together, these nodes make up your distributed application.

What I mean by coordination here is some act that multiple nodes must perform together. Some examples of coordination:

  • Group membership
  • Locking
  • Publisher/Subscriber
  • Ownership
  • Synchronization

One or more of these primitives show up in all distributed systems, so implementing them correctly is extremely important. While developing CRAQ, I originally implemented a very simple group membership service, but … Continue Reading