He shuns the limelight, but Van Jacobson's contributions to the Internet's underlying protocols are so fundamental that a TCP header compression scheme bears his name. The Packet Design Inc. chief scientist is a fixture at the Internet Engineering Task Force and on related projects to prepare IP traffic for aggregated data, voice and video. He led Lawrence Berkeley National Lab's effort to improve TCP performance in the 1980s and later helped lay the groundwork for the Mbone, which won prominence as the multicast backbone for the Rolling Stones' Voodoo Lounge performances. Jacobson was chief scientist at Cisco Systems Inc. before joining Packet Design. He continues to work on "virtual wire" concepts and has proposed a transport scheme to improve Border Gateway Protocol scaling. Here, he recounts his history with the problematic middle layers of the Open Systems Interconnection (OSI) stack.
EE Times: Your interest in improving transport-layer performance predates TCP/IP's definition. How did your work at Lawrence Berkeley Lab steer you to addressing Internet performance?
Van Jacobson: In 1972, I was working on my thesis in high-energy physics, using the Bevatron particle accelerator at LBL [Lawrence Berkeley Lab]. Our goal was to control big magnets, about the size of a city bus, to steer particle beams. I was writing assembly-language code for a PDP-8, and the controls group at the lab became interested in my work and offered me a job.
I joined the real-time systems group in 1974 and ended up staying there for 25 years . . . working with such programs as the LBL Bevalac and the early design effort for the SSC [superconducting supercollider]. The key concept in developing control programs for accelerators is distributed computing, and the design task was to minimize the number of wires necessary to control such large, distributed systems, since the cost was based in part on the number of wires.
This got me interested in the mid-1970s in networked control. Our team at LBL was studying circuit-based networks and early virtual-circuit networks. In the late 1970s, packet-switched networks arrived. Studying the way you could use such networks for control of large accelerator projects was an object lesson on what worked and what didn't.
Keep in mind that all these experimental physics machines ran 24/7. They were all custom-built, and we didn't have a lot of experience from traditional mainframe computer worlds that would apply. Through trial and error we discovered that packet networks fit the bill.
I spent more than 11 years working on such control systems. Just as the SSC design work was winding down, the Arpanet [later the Internet] was making the transition from NCP [Network Control Protocol] to TCP/IP [Transmission Control Protocol/Internet Protocol]. UC Berkeley had done a lot of work on those initial protocols under a Darpa contract, and Berkeley and LBL had some of the initial interface message processors designed for that early Internet.
When the protocol transition from NCP to TCP/IP happened, network performance went from pretty good to abysmally bad. We at LBL were in the lead in looking at what had happened; I wrote some simple diagnostic tools, which later became tcpdump. We discovered that there was a protocol design aspect that interacted with a buffer limit. Once we understood what had happened, it took only two days to fix the problem.
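Jacobson's two-day fix isn't spelled out here, but the congestion work he published from this period ("Congestion Avoidance and Control," 1988) is usually summarized by the slow-start and additive-increase/multiplicative-decrease window rules. The following Python sketch is purely illustrative of that window logic; the class name and constants are assumptions, not details from the interview.

```python
# Illustrative sketch of TCP slow start plus congestion avoidance (AIMD).
# Not the literal fix described above; names and constants are assumptions.

MSS = 1460  # assumed maximum segment size, in bytes


class CongestionWindow:
    def __init__(self):
        self.cwnd = 1 * MSS        # start with one segment in flight
        self.ssthresh = 64 * 1024  # slow-start threshold, in bytes

    def on_ack(self):
        """Grow the window: exponentially below ssthresh, roughly linearly above."""
        if self.cwnd < self.ssthresh:
            self.cwnd += MSS                     # slow start: one MSS per ACK
        else:
            self.cwnd += MSS * MSS // self.cwnd  # congestion avoidance: ~one MSS per RTT

    def on_loss(self):
        """React to congestion: halve the threshold and restart slow start."""
        self.ssthresh = max(self.cwnd // 2, 2 * MSS)
        self.cwnd = 1 * MSS
```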
EET: And this led to IETF's interest in more study of transport?
Jacobson: We got the first data when the organization that would become the IETF, the Gateway Algorithms and Data Structures Task Force, or GADS, held its second meeting. The Arpa program manager said, "If you'd like to pursue that kind of work, we'd be interested in funding it." What a wonderful green light! It became the basis of our Network Research Group in 1985. We took the techniques we'd learned in studying accelerator control systems and applied them to more generalized IP networks.
At LBL, everything was custom, nothing was off the shelf, and we had to figure out what was wrong when networks failed. We had to discover robust ways of getting failure signatures, looking for patterns in packet timing. In many instances, simple diagnostics showed large-scale patterns, which often meant that fixes could be smaller and simpler than anticipated. This got people excited because the diagnostic methods could be generalized across many transport and network layer problems.
EET: The IETF was evolving at that time into a more structured set of groups working on several layers.
Jacobson: Dave Mills at the University of Delaware had been instrumental in getting the original GADS group together. He was the author of the "fuzzball" router code used for many Internet studies, and he was pushing empirical-data-based evaluation of problems. At this point, I was used to standing up in front of large groups with a diverse range of people (researchers, engineers, funding folks from government agencies) and explaining data in a way that was meaningful to everyone. That experience moved the IETF as an organization toward an increased emphasis on writing up results.
This was at a time when David Clark [currently at MIT] gave the IETF its motto of "rough consensus and running code," which meant that work was not to be based on theory alone, but on trying things out based on code you'd developed. In those days, of course, the Net was much more of a research vehicle than it is today. Since it wasn't a necessary part of society, you could perform tests on a work in progress.
The work on improving TCP was the result of responding to real-world problems, and the problems we ran into were almost all problems of scale, whether in network size or in bandwidth. Every time a part of the network got ratcheted up a notch, new problems had to be addressed.
EET: In the early days, Internet traffic was almost synonymous with best-effort data. How early did you get interested in low-latency traffic such as voice?
Jacobson: The first work I knew of in packet voice went back to the early '80s, funded by Arpa using networks at BBN [Bolt, Beranek and Newman, home of the early Internet experiments] and ISI [the University of Southern California's Information Sciences Institute]. We had some actual conferences over the Net in 1985, but the bandwidth simply wasn't available to make it practical. Reliable delivery came down to scheduling problems, and those were among the hardest problems we knew.
The issue was not just bandwidth, but the compute cycles of high-end workstations or desktops and how the systems could break data into manageable chunks. By the early 1990s, bandwidth to the desktop was 10 Mbits per second, and the improved compute cycles made it possible to reduce audio and data into chunks appropriate for Internet conferencing. That's when the research began to be viable.
In 1989, a group that included MIT, BBN, ISI, the University of Delaware, Xerox Parc and LBL talked Arpa into building a testbed network for audio/videoconferencing and associated quality-of-service research. Steve Deering at Xerox Parc used the testbed to construct an IP multicast infrastructure that eventually evolved into the Mbone [multicast backbone].
[LBL colleague] Steve McCanne and I wrote AV conferencing tools to prototype the protocols that eventually became RTP and SIP/SDP. BBN had been working on a circuit-oriented QoS protocol called ST-2, which no one else was very fond of. MIT and ISI developed a more IP-friendly QoS that eventually became IntServ [integrated services]. Our group at LBL prototyped a third approach that eventually turned into DiffServ [differentiated services].
EET: This took place at a time when the telephony community was really pushing for asynchronous transfer mode, which was supposed to be the perfect compromise between circuit and packet, using a special 53-byte cell in a virtual circuit. What was your feeling about ATM at the time?
Jacobson: It was an ongoing topic of conversation among IETF members, although it was being developed outside the Internet community, and you had your ATM bigots and IP bigots. From my point of view, ATM was a link-layer technology, and IP of course could run on top of a link layer, but the circuit-oriented developers had interpreted the link layer as the network. The wires are not the network.
Because of the circuit-oriented background of ATM developers, they had bought into the telco religion that QoS equals scheduling. We know the problems of scheduling algorithms. If you go down that path, it's a highway to disaster.
Telephony had always been bandwidth-limited using point-to-point wires, and whenever the telco community thought about QoS and aggregation, everything came down to individual conversations in a 56-kbit channel. They could be talking about a 13-terabit wire and still want to break it down into 56-kbit chunks. In the IP world, it's all just bits.
Some ATM developers criticized priority queuing for IP as a QoS method at the time, but priority queuing is a simple, local problem, whereas scheduling is a very hard, global problem.
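That contrast can be made concrete: a strict-priority queue is a decision each interface makes on its own, with no knowledge of other routers or of end-to-end reservations. A minimal Python sketch, with class and method names that are illustrative rather than taken from any standard or product:

```python
# Minimal sketch of strict-priority queuing: a purely local, per-interface
# decision, requiring no coordination with the rest of the network.
from collections import deque


class PriorityQueuedPort:
    def __init__(self, num_classes=3):
        # Queue 0 is the highest priority (e.g. voice); the last is best-effort.
        self.queues = [deque() for _ in range(num_classes)]

    def enqueue(self, packet, traffic_class):
        self.queues[traffic_class].append(packet)

    def dequeue(self):
        """Always serve the highest-priority queue that has a packet waiting."""
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None  # nothing to send
```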
A few years ago, I had this epiphany where I realized that when an IP or Ethernet person says bandwidth, they mean something totally different than when a telephony person says bandwidth. If you grew up looking at Sonet from a circuit perspective, an OC-48 fiber for you means 40,000 time slots. You had to be concerned with time-slot interchanges and reservations, so rolling out an OC-48 fiber represented a huge cost.
The thing is, the vast bulk of that cost is the cost of switching, not the cost of bandwidth. The only cost for that kind of fiber bandwidth in the IP world is the cost of the laser and the amplifier.
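The 40,000-slot figure is roughly the raw line rate divided by a 64-kbit/s DS0 voice time slot (the 56-kbit figure mentioned earlier is the usable payload of such a channel). A quick back-of-the-envelope check, taking the nominal OC-48 rate of about 2.488 Gbit/s as an assumption:

```python
# Rough check of the "40,000 time slots" figure for an OC-48 link.
oc48_bps = 2.488e9  # nominal OC-48 line rate, about 2.488 Gbit/s
ds0_bps = 64e3      # one DS0 voice time slot, 64 kbit/s

slots = oc48_bps / ds0_bps
print(f"{slots:,.0f} DS0 time slots")  # about 38,875, i.e. roughly 40,000
```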
EET: So all the late-1990s studies of QoS involved people speaking different languages, coming from different perspectives.
Jacobson: QoS has been an area of immense frustration for me. We're suffering death by 10,000 theses. It seems to be a requirement of thesis committees that a proposal must be sufficiently complicated for a paper to be accepted. Look at Infocom, look at IEEE papers; it seems as though there are 100,000 complex solutions to simple priority-based QoS problems.
The result is a vastly worse signal-to-noise ratio. The working assumption is that QoS must be hard, or there wouldn't be 50,000 papers on the subject. The telephony journals assume this as a starting point, while the IP folks feel that progress in QoS comes from going out and doing something.
EET: And though packets declared victory over circuits, there seems to be renewed interest in giving IP as many circuit-like characteristics as possible.
Jacobson: I hope that the circuit obsession is transitional. Anytime you try to apply scheduling to a problem to put strict bounds on latency, the advantages are not worth the cost of implementation. Strict guarantees gain you at best 100 microseconds in networks where the intrinsic jitter from the thermal conditions of the planet is 300 microseconds.
EET: Since joining Packet Design, you've worked on a new transport protocol for the Border Gateway Protocol: BGP Scalable Transport, or BST. Judy Estrin, Packet Design's CEO, said when BST was proposed that a new transport might not be accepted by the router community at large, but that it would be worthwhile to consider the use of flooding to allow better scaling at the transport layer.
Jacobson: There are many advantages to looking at reliable multicast and its relation to flooding. Steve McCanne has done a lot of work on scalable, reliable multicast. We can apply this to services using a point-to-point stateful protocol. If you need to virtualize a service, you can do it with reliable multicast and flooding.
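The interview doesn't describe BST's mechanics, but the flooding idea itself is familiar from link-state routing: forward each new announcement to every neighbor except the one it came from, and use a sequence number to suppress duplicates, so the information propagates as long as any path exists. A toy sketch under those assumptions; it illustrates generic flooding, not Packet Design's actual protocol:

```python
# Toy flooding sketch: each node re-forwards announcements it hasn't seen yet,
# so an update reaches every node as long as some path to it exists.
# This illustrates generic flooding, not BGP Scalable Transport itself.


class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []  # other Node objects
        self.seen = {}       # origin name -> highest sequence number seen

    def receive(self, origin, seq, payload, came_from=None):
        if self.seen.get(origin, -1) >= seq:
            return  # duplicate or stale announcement: drop it, don't re-flood
        self.seen[origin] = seq
        for peer in self.neighbors:
            if peer is not came_from:
                peer.receive(origin, seq, payload, came_from=self)

    def originate(self, seq, payload):
        self.receive(self.name, seq, payload)
```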
EET: Let's relate this to the complex tasks being demanded of IP subnets these days, such as the creation of multiple virtual private networks operating from both the IPsec [IP Security] and SSL [Secure Sockets Layer] layers. Is too much being demanded of IP?
Jacobson: IP is exactly the right vehicle for such virtualization complexity. There are easy ways to create such virtual networks, and it's damned hard in a circuit-oriented system. When we did a demo of our BST concepts at the October 2002 North American Network Operators Group meeting, we had three machines at the demo, and we invited attendees to pull cables, bring systems down, you name it, provided one system was always operating. As long as one system was up, we never stopped peering.
EET: Before I conclude, I wanted to bring up your role as chief scientist for one of the Packet Design-affiliated companies, Precision I/O. Your TCP experience clearly applies, but you're looking at improving server performance within the data center, which seems new.
Jacobson: It's certainly a different research area coming out of a different space. I always felt that the host-to-network interface had been really botched historically. The way bits get from the wire to the application and from the application to the wire is not very efficient.
The Internet gets a lot of its scalability and efficiency by being built from paired control and data flows, much like a physicist thinks of the universe being built from paired particles and antiparticles. Before data can flow to a destination, there first has to be a control flow (routing) that goes out from the destination. Since all the heavy lifting of validation, authorization, aggregation and the like is done on the control flow, the packet forwarding that represents the data flow can be very simple and very fast. The only part of the architecture that doesn't follow the paired control-and-data model is the application-to-wire interface.
Since an application doesn't get to tell the network what it wants, almost anything can get in the door and a host is stuck making complex decisions about packet disposition on every packet arrival. This is inefficient; it slows down the host and leads to a lot of junk in the network, like firewalls and load balancers.
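One way to read the point above is that the host lacks a control path: no per-flow state is established up front, so every arriving packet triggers a full policy decision. A hedged sketch of what restoring that split might look like, with a flow key and handler that are purely illustrative and not Precision I/O's design:

```python
# Illustrative control/data split on the host side: admit a flow once on the
# control path, then dispatch its packets with a cheap lookup on the data path.
# The flow key, table and handler are assumptions made for this sketch only.

flow_table = {}  # (src, dst, sport, dport, proto) -> handler callback


def admit_flow(flow_key, handler):
    """Control path: validation and authorization happen once, up front."""
    flow_table[flow_key] = handler


def on_packet(flow_key, packet):
    """Data path: a simple lookup replaces per-packet policy decisions."""
    handler = flow_table.get(flow_key)
    if handler is None:
        return False  # unsolicited traffic carries no flow state, so drop it
    handler(packet)
    return True
```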
What we're trying to do at Precision I/O is make the host look like the rest of the Internet. In 1988, Dave Clark, [then-MIT professor] Howard Salwen and I had done an analysis of protocol processing that led to the first generation of TCP offload engines. The current generation of TOEs cites the paper I had done analyzing direct data placement, but that second generation we see today does not really address the problem of improving host-to-network interfaces for better server performance. This is what we are trying to solve today.
- Lawrence Berkeley Lab, 1974-98: Research scientist, Real-Time Controls Group; group leader, Network Research Group, Information and Computer Sciences Division
- Cisco Systems Inc., 1998-2000: chief scientist
- Packet Design LLC, since 2000: chief scientist
- Precision I/O (Packet Design spin-off), since 2002: chief scientist
Awards and honors
- ACM SIGCOMM Award recipient, 2001
- IEEE Koji Kobayashi Computers and Communications Award recipient, 2002
- Author of several dozen IETF papers and RFCs
- Author of RFC 1144, "Compressing TCP/IP Headers for Low-Speed Serial Links," a.k.a. Van Jacobson header compression