
Queue
www.acmqueue.com
December/January 2006-2007 Vol. 4 No. 10

Plotting the Internet's Future
Calling All Smartbots
Architecture's Renaissance
Multithreading for Mere Mortals
Virtualization Comes of Age
The Hennessy-Patterson Interview
CONTENTS
DECEMBER/JANUARY 2006-2007 VOL. 4 NO. 10
FOCUS
COMPUTER ARCHITECTURE

Unlocking Concurrency 24
Ali-Reza Adl-Tabatabai, Intel,
Christos Kozyrakis, Stanford University,
and Bratin Saha, Intel
Can transactional memory ease
the pain of multicore programming?

The Virtualization Reality 34


Simon Crosby, XenSource and
David Brown, Sun Microsystems
A look at hypervisors and the future
of virtualization.

Better, Faster, More Secure 42


Brian Carpenter, IBM and IETF
What’s ahead for the Internet’s fundamental
technologies? The IETF chair prognosticates.

2 December/January 2006-2007 ACM QUEUE rants: feedback@acmqueue.com


CONTENTS
DEPARTMENTS

EDITOR'S NOTE 8
Forward Thinking
Charlene O'Hanlon, ACM Queue

NEWS 2.0 10
Taking a second look at the news so you don't have to.

WHAT'S ON YOUR HARD DRIVE? 11
Visitors to our Web site are invited to tell us about the tools they love—and the tools they hate.

KODE VICIOUS 12
Peerless P2P
George V. Neville-Neil, Consultant

INTERVIEW
A CONVERSATION WITH JOHN HENNESSY AND DAVID PATTERSON 14
The Berkeley-Stanford duo who wrote the book (literally) on computer architecture discuss current innovations and future challenges.

BOOK REVIEWS 49

CALENDAR 50

CURMUDGEON 56
Will the Real Bots Stand Up?
Stan Kelly-Bootle, Author



Publisher and Editor
Charlene O'Hanlon, cohanlon@acmqueue.com

Editorial Staff
Executive Editor: Jim Maurer, jmaurer@acmqueue.com
Managing Editor: John Stanik, jstanik@acmqueue.com
Copy Editor: Susan Holly
Art Director: Sharon Reuter
Production Manager: Lynn D'Addesio-Kraus
Editorial Assistant: Michelle Vangen
Copyright: Deborah Cotton

Sales Staff
National Sales Director: Ginny Pohlman, 415-383-0203, gpohlman@acmqueue.com
Regional Eastern Manager: Walter Andrzejewski, 207-763-4772, walter@acmqueue.com

Contact Points
Queue editorial: queue-ed@acm.org
Queue advertising: queue-ads@acm.org
Copyright permissions: permissions@acm.org
Queue subscriptions: orders@acm.org
Change of address: acmcoa@acm.org

Editorial Advisory Board
Eric Allman, Charles Beeler, Steve Bourne, David J. Brown, Terry Coatta, Mark Compton, Stu Feldman, Ben Fried, Jim Gray, Wendy Kellogg, Marshall Kirk McKusick, George Neville-Neil

Guest Expert
Kunle Olukotun

ACM Headquarters
Executive Director and CEO: John White
Deputy Executive Director and COO: Patricia Ryan
Director, ACM U.S. Public Policy Office: Cameron Wilson
Director, Office of Information Systems: Wayne Graves
Director, Financial Operations Planning: Russell Harris
Director, Office of Membership: Lillian Israel
Director, Office of Publications: Mark Mandelbaum
Deputy Director, Electronic Publishing: Bernard Rous
Deputy Director, Magazine Development: Diane Crawford
Publisher, ACM Books and Journals: Jono Hardjowirogo
Director, Office of SIG Services: Donna Baglio
Assistant Director, Office of SIG Services: Erica Johnson

Executive Committee
President: Stuart Feldman
Vice-President: Wendy Hall
Secretary/Treasurer: Alain Chesnais
Past President: Dave Patterson
Chair, SIG Board: Joseph Konstan

For information from Headquarters: (212) 869-7440

ACM U.S. Public Policy Office: Cameron Wilson, Director
1100 17th Street, NW, Suite 507, Washington, DC 20036 USA
+1-202-659-9711–office, +1-202-667-1066–fax, wilson_c@acm.org

ACM Copyright Notice: Copyright © 2006 by Association for Computing Machinery, Inc. (ACM). Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to republish from: Publications Dept. ACM, Inc. Fax +1 (212) 869-0481 or e-mail <permissions@acm.org>.

For other copying of articles that carry a code at the bottom of the first or last page or screen display, copying is permitted provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, 508-750-8500, 508-750-4470 (fax).

ACM Queue (ISSN 1542-7730) is published ten times per year by the ACM, 2 Penn Plaza, Suite 701, New York, NY 10121-0701. POSTMASTER: Please send address changes to ACM Queue, 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA. Printed in the U.S.A.

The opinions expressed by ACM Queue authors are their own, and are not necessarily those of ACM or ACM Queue. Subscription information available online at www.acmqueue.com.



Forward Thinking

from the editors

Charlene O'Hanlon, ACM Queue

Technology KNOWS NO FEAR.

I am of the opinion that humans are not flexible creatures. We resist change like oil resists water. Even if a change is made for the good of humankind, if it messes around with our daily routine, then our natural instinct is to fight the change like a virus.

Let's face it, all of us thrive on routine—what time we get up, how we brush our teeth, where we sit on the train, what we eat for lunch—and for some it takes a lot to break the routine. If you don't agree, take a look at your life. How many of you regularly perform some task that you dislike (backing up your hard drive, going to the same boring job, eating liver every Tuesday night) simply because you don't want to face the alternative (a hard-drive crash, no extra money for new CDs, the chance that your iron level will dip so low you'll end up in the hospital getting mass blood transfusions)?

I grew up in a household in which Saturday was cleaning day and everyone was forced to pitch in, so as a result there was a time not too long ago when I was absolutely stringent about keeping a perfectly clean house. As I've gotten older and somewhat wiser, however, I've started slacking off somewhat in the housecleaning department. A creature of habit, I used to begin my picking-up process in earnest every night at 9:30, darting in and out of every room in the house like a dervish and cleaning up the detritus of the day. Then one night, out of pure exhaustion, I just didn't. And I woke up the next morning still alive and healthy. My house was a little out of order, but it wasn't anything I couldn't handle. Since then I've cut down my dervish episodes to three a week, and it suits me well (I'm also a little calmer now).

Baby steps, I know. But for some it takes baby steps to precede the big steps. But because this is December and a new year—and the chance to make those dreaded New Year's resolutions—is just weeks away, I've decided that 2007 will be the year I make some real changes in my life. I don't just mean switching laundry detergents, but real change. And if I fail in my attempts, then I will work harder to make my changes successful. I know there will be difficulties, both internal and external, but I will face the changes and the challenges head on, embracing the changes rather than fighting them.

Can we say the same for our industry in the next year? Can technology face the changes and adapt accordingly? Can we force an evolution, or will it come naturally? Charles Darwin said living things must adapt or die, but I wonder whether the same applies to technology. Indeed, we humans are the ones forcing the change—after all, technology does not create itself—but are we moving along a path in which one day technology will be responsible for its own evolution?

It's a thought that is both thrilling and scary—the kind of stuff that Michael Crichton novels are made of. Some may scoff and say that humans ultimately have control over the amount of intelligence any machine has, and that we will always be superior. But I would point out that humans are often held back by the one thing that technology knows nothing about: fear.

A certain amount of fear is healthy; fear is what keeps us from jumping off a cliff without a bungee cord just to see what it feels like. But too much fear can prevent us from discovering our true talents and best assets—fear of the unknown, fear of being ridiculed, fear of failure. Call me crazy, but I'm sure a Web server doesn't care whether it is being laughed at.

I, for one, can envision a day when technology becomes smarter than humans. I think we will reach that threshold when man and machine possess equal intelligence, and then technology will evolve to surpass man simply because we humans can't get past our fears. Which may be a good thing, depending on how one looks at it. I, for one, would never wish humankind to lose its humanity for the sake of lightning-fast decisions or a better way to build a widget. Fear, along with all our myriad emotions, is what makes us human.

You can't say that about a Web server. Q

CHARLENE O'HANLON, editor of Queue, is in for some big adventures in 2007. Stick around and see for yourself. Meanwhile, send your comments to her at cohanlon@acmqueue.com.



news 2.0
Taking a second look at the news so you don't have to.

Fox and the Weasel
Capitalizing on the growing popularity of Mozilla's Firefox, many Linux distributors now package the open source Web browser with their Linux code. According to Mozilla's licensing policies, distributors may package the Firefox code with the Firefox name and logo, provided that Mozilla approves any changes made to the code. Mozilla wants to protect its trademark and prevent the confusion that might ensue if there were many separate forks of Firefox that all used the Firefox name and logo.

Debian, a Linux distribution closely aligned with the free software movement, is butting heads with Mozilla over these requirements. The folks at Debian want to package a version of Firefox, but they object to using the logo because it's trademarked and therefore conflicts with Debian's free-use ethos. They also object to Mozilla's code approval process, which could disqualify Debian's browser from any association with the Firefox brand.

So what's a self-respecting free software advocate to do? One solution would be for Debian to adopt the GNU fork of Firefox, which, in obvious tribute to its parent, is cutely named IceWeasel. Another option would be for Debian to apply the IceWeasel name and logo, which are not trademarked, to its own Firefox code.
WANT MORE?
http://www.internetnews.com/dev-news/article.php/3636651

Down on the Wireless Farm
As Queue reported in its September 2006 issue, compliance is a growing challenge for enterprises that's creating business opportunities for those savvy enough to sort it out. Lest we get too bogged down in SOX and HIPAA and Basel II, however, we must remember that compliance with government mandates is a challenge for all industries. For example, farmers across the globe must comply with government reporting requirements to verify the safety of the food they produce. European Union farmers must keep detailed records about their cattle—everything from where they're grazing to their health problems.

Farmers are turning to technology to help them comply. Companies such as Ireland's FarmWizard are seizing the opportunity to provide solutions. FarmWizard allows cattle farmers to manage important farming data right from the cow pasture. Farmers can input, view, and manage information using wireless devices equipped with a Web browser. FarmWizard's wirelessly accessed hosted service shows that this new breed of "Agri-IT" applications closely aligns with computing trends seen in other sectors.
WANT MORE?
http://www.vnunet.com/computing/news/2167254/handhelds-collect-farming

Second-Life Commerce Meets First-Life IRS
It's becoming increasingly difficult to draw boundaries between the imaginary and the real. Immersive online simulations such as Second Life and World of Warcraft have evolved virtual exchange systems that closely resemble real-world commerce. Players looking for an edge in these games can head to eBay, where valuable items can be bought and sold with real currency, with the actual exchange of goods occurring in the online gaming world.

Congress has noticed all this commerce and is evaluating its policies for governing these virtual-to-real-world transactions. After all, any transaction occurring in a real marketplace using real money reasonably could be subject to taxation, regardless of whether the goods exchanged are tangible or imaginary. But things become complex when you consider the potential real-world value of virtual goods traded in cyberspace. If one person sells a deed to some Second-Life property on eBay, while someone else, acting as an avatar online, completes the same transaction using Second Life's internal Linden dollars, is the first transaction taxable and the second one not taxable?

The problem for the IRS is that while these games are quite sophisticated, their economic systems lack the structures and institutions, such as a stock market, that real-world tax law relies on. If the lack of these features is what's keeping taxes out of virtual worlds, it seems unlikely game developers will add them anytime soon.
WANT MORE?
http://today.reuters.com/news/ArticleNews.aspx?type=technologyNews&storyID=2006-10-16T121700Z_01_N15306116_RTRUKOC_0_US-LIFE-SECONDLIFE-TAX.xml Q



What's on Your Hard Drive?

reader files

As the year draws to an end, we would like to thank all of our readers who have submitted to WOYHD. Over the past 12 months we've seen a wide variety of tools mentioned, and, come 2007, we would like to see a lot more of the same. So log on to our Web site at http://www.acmqueue.com and send us your rants, raves, and more new tools that you absolutely can't live without—or can't stand to use. As further incentive, if we publish your submission, you'll be starting off the New Year with a brand new Queue coffee mug!

Who: Charles Moore
What industry: Consulting and systems integrator
Job title: Software engineer
Flavor: Develops on Windows for Windows
Tool I love! Beyond Compare. I love the abilities of this program. I use it all the time to reconcile multiple versions of files. It's particularly useful to determine what changed between different releases—better than most versioning systems' compares.
Tool I hate! Microsoft Office. It never works the way you want it to. Any option that is available to make it do what you want (and how do you find out about it?) is usually several layers down under some unrelated menu—sometimes even in another system application! Many features were either incorrectly implemented, not thought out, incompatible with other features, or just plain don't work. And you don't have—or aren't allowed—any other option.

Who: Shyam Santhanam
What industry: ISP/Telecommunications, energy, cable, utilities
Job title: Software engineer
Flavor: Develops on Linux for Linux
Tool I love! Eclipse. I love the extensibility! It's a jack-of-all-trades and master of each. Also the intuitive UI and stability are great. Eclipse is doing things the way IDEs should instead of how they "have been" in the past.
Tool I hate! Make. The complex and clumsy syntax of Make files and their widespread acceptance in Linux (Unix) development is horrible. If you've ever tried tracing down a linker error originating in a 500-line Makefile with nth-level nested expansions, then you know what I mean.

Who: Leon Woestenberg
What industry: Broadcasting
Job title: Senior designer
Flavor: Develops on Linux for Linux
Tool I love! OpenEmbedded. The world of cross-compilation is cruel, especially if your system is a complex of different external open source tools together with your own tools. OpenEmbedded is the platform that solves the subtleties that would have cost me a lot of time to get right. I know—I have been there before.
Tool I hate! VisualWhatever. I do not think dragging components into a view, setting some attributes, and then hooking them up is anything like the way systems should be designed. If anyone thinks this is top-down design, think again.

Who: Mark Westwood
What industry: Oil and gas
Job title: Principal software engineer
Flavor: Develops on Linux for Linux
Tool I love! XEmacs. We have a longtime love affair; she is so much more than an editor to me. My wife doesn't understand me this well! Compilers come and go, debuggers are transient, but an editor is for life.
Tool I hate! Anything GUI. I can write a finite-difference time-domain Maxwell equation solver in Fortran in a quarter of the time it takes my users to make up their minds whether a dialog box should be shifted two pixels to the left or three pixels down. Computers are for number crunching!

more queue: www.acmqueue.com ACM QUEUE December/January 2006-2007 11


Peerless P2P

kode vicious

A koder with attitude, KV ANSWERS YOUR QUESTIONS. MISS MANNERS HE AIN'T.

Peer-to-peer networking (better known as P2P) has two faces: the illegal file-sharing face and the legitimate group collaboration face. While the former, illegal use is still quite prevalent, it gets an undue amount of attention, often hiding the fact that there are developers out there trying to write secure, legitimate P2P applications that provide genuine value in the workplace. While KV probably has a lot to say about file sharing's dark side, it is to the legal, less controversial incarnation of P2P that he turns his attention to this month. Take it away, Vicious…

Got a question for Kode Vicious? E-mail him at kv@acmqueue.com—if you dare! And if your letter appears in print, he may even send you a Queue coffee mug, if he's in the mood. And oh yeah, we edit letters for content, style, and for your own good!

Dear KV,
I've just started on a project working with P2P software, and I have a few questions. Now, I know what you're thinking, and no this isn't some copyright-violating piece of kowboy kode. It's a respectable corporate application for people to use to exchange data such as documents, presentations, and work-related information.
My biggest issue with this project is security—for example, accidentally exposing our users' data or leaving them open to viruses. There must be more things to worry about, but those are the top two.
So, I want to ask, "What would KV do?"
Unclear Peer

Dear UP,
What would KV do? KV would run, not walk, to the nearest bar and find a lawyer. You can always find lawyers in bars, or at least I do; they're the only ones drinking faster than I am. The fact that you believe your users will use your software only for your designated purpose makes you either naive or stupid, and since I'm feeling kind today, I'll assume naive. So let's assume your company has lawyers to protect them from the usual charges of providing a system whereby people can exchange material that perhaps certain other people, who also have lawyers, consider wrong to exchange. What else is there to worry about? Plenty.

At the crux of all file-sharing systems—whether they are peer-to-peer, client/server, or what have you—is the type of publish/subscribe paradigm they follow. The publish/subscribe model defines how users share data. The models follow a spectrum from low to high risk. A high-risk model is one in which the application attempts to share as much data as possible, such as sharing all data on your disk with everyone as the basic default setting. Laugh if you like, but you'll cry when you find out that lots of companies have built just such systems, or systems that are close to being as permissive as that.

Here are some suggestions for building a low-risk peer-to-peer file-sharing system.

First of all, the default mode of all such software should be to deny access. Immediately after installing the software, no new files should be available to anyone. There are several cases in which software did not obey this simple rule, so when a nefarious person wanted to steal data, he or she would trick someone into downloading and installing the file-sharing software. This is often referred to as a "drive-by install." The attacker would then have free access to the victim's computer or at least to the My Documents or similar folder.

Second, the person sharing the files—that is, the sharer—should have the most control over the data. The person connecting to the sharer's computer should be able to see and copy only the files that the sharer wishes that person to see and copy. In a reasonably low-risk system, the sharing of data would have a timeout such that unless the requester got the data by a certain time (say, 24 hours), the data would no longer be available. Such timeouts can be implemented by having the sharer's computer generate a one-time use token containing a timeout that the requester's computer must present to get a particular file.
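KV's one-time token suggestion is concrete enough to sketch. The fragment below is a hypothetical illustration, not code from any real file-sharing product; every name in it (mint_token, redeem_token, the 24-hour default) is invented for the example. The sharer signs a small claims blob carrying a file ID, an expiry time, and a nonce, and refuses any request whose token is expired, altered, or already spent.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time

SECRET = secrets.token_bytes(32)   # sharer's private signing key
_redeemed = set()                  # nonces already used (enforces one-time use)

def mint_token(file_id, ttl_seconds=24 * 3600):
    """Mint a signed token granting access to one file until it expires."""
    payload = json.dumps({
        "file": file_id,
        "exp": time.time() + ttl_seconds,  # absolute expiry timestamp
        "nonce": secrets.token_hex(8),     # unique per token
    }).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "." +
            base64.urlsafe_b64encode(sig).decode())

def redeem_token(token, file_id):
    """Return True only for an untampered, unexpired, unused token."""
    try:
        p64, s64 = token.split(".")
        payload = base64.urlsafe_b64decode(p64)
        sig = base64.urlsafe_b64decode(s64)
    except ValueError:
        return False
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return False                       # signature mismatch: tampered
    claims = json.loads(payload)
    if claims["file"] != file_id or time.time() > claims["exp"]:
        return False                       # wrong file, or past the timeout
    if claims["nonce"] in _redeemed:
        return False                       # token already spent
    _redeemed.add(claims["nonce"])
    return True
```

In a real deployment the sharer would keep SECRET and the redeemed-nonce set on its own machine, so only it can mint or honor tokens; the requester just stores the opaque string and presents it with the file request.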



Third, the system should be slow to open up access. Although we don't want the user to have to say OK to everything—because eventually the user will just click OK without thinking—you do want a system that requires user intervention to give more access.

Fourth, files should not be stored in a known or easily guessable default location. Sharing a well-known folder such as My Documents has gotten plenty of people into trouble. The best way to store downloaded or shared files is to have the file-sharing application create and track randomly named folders beneath a well-known location in the file system. Choosing a reasonably sized random string of letters and digits as a directory name is a good practice. This makes it harder for virus and malware writers to know where to go to steal important information.
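That randomly named folder advice reduces to a few lines. Again, this is a sketch under assumptions rather than code from any particular application; make_share_dir and its parameters are invented for the illustration.

```python
import secrets
import string
from pathlib import Path

def make_share_dir(base, name_len=16):
    """Create a randomly named folder under a well-known base directory.

    Malware that knows the base location still cannot guess where the
    shared files actually live, because the leaf name is unpredictable.
    """
    alphabet = string.ascii_lowercase + string.digits
    name = "".join(secrets.choice(alphabet) for _ in range(name_len))
    target = Path(base) / name
    target.mkdir(parents=True, exist_ok=False)  # fail rather than reuse a folder
    return target
```

Using the secrets module rather than random matters here: the point is that the directory name be unpredictable to an attacker, not merely varied from run to run.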
Fifth, and last for this particular letter, the sharing should be one-to-one, not one-to-many. Many systems share data one-to-many, including most file-swapping applications, such that anyone who can find your machine can get at the data you are willing to share. Global sharing should be the last option a user has, not the first. The first option should be to a single person, the second to a group of people, and the last, global.

You may note that a lot of this advice is in direct conflict with some of the more famous file-sharing, peer-to-peer systems that have been created in the past few years. This is because I have been trying to show you a system that allows for data protection while data is being shared. If you want to create an application that is as open—and as dangerous—as Napster or its errant children were and are, then that's a different story. From the sound of your letter, however, that is not what you want.

Other things you will have to worry about include the security of the application itself. A program that is designed to take files from other computers is a perfect vector for attacks by virus writers. It would be unwise—well, actually, it would be incredibly stupid—to write such a program so that it executes or displays files immediately after transfer without asking the user first. I have to admit that answering yes to the question, "Would you like to run this .exe file?" on Windows is about the same as asking, "Would you like me to pull the trigger?" in a game of Russian roulette.

Another open research area, er, I mean, big headache, which I'll not get into here, is the authentication system itself. Outside of all the other advice I just gave, this problem is itself quite thorny. How do I know that you are you? How do you know that I am me? Perhaps I am the Walrus, except, wait, the Walrus was Paul.

Well, I believe you have enough to think about now. I suggest you sleep on it and wake up screaming, just like...
KV

KODE VICIOUS, known to mere mortals as George V. Neville-Neil, works on networking and operating system code for fun and profit. He also teaches courses on various subjects related to programming. His areas of interest are code spelunking, operating systems, and rewriting your bad code (OK, maybe not that last one). He earned his bachelor's degree in computer science at Northeastern University in Boston, Massachusetts, and is a member of ACM, the Usenix Association, and IEEE. He is an avid bicyclist and traveler who has made San Francisco his home since 1990.
© 2006 ACM 1542-7730/06/1200 $5.00

Coming in February:
Secure Open Source
Open Source vs. Closed Source Security

Vulnerability Management

Updates That Don’t Go Boom!



A Conversation with John Hennessy and David Patterson

interview

Photography by Jacob Leverich

They wrote the book ON COMPUTING.

As authors of the seminal textbook, Computer Architecture: A Quantitative Approach (4th Edition, Morgan Kaufmann, 2006), John Hennessy and David Patterson probably don't need an introduction. You've probably read them in college or, if you were lucky enough, even attended one of their classes. Since rethinking, and then rewriting, the way computer architecture is taught, both have remained committed to educating a new generation of engineers with the skills to tackle today's tough problems in computer architecture, Patterson as a professor at Berkeley and Hennessy as a professor, dean, and now president of Stanford University.

In addition to teaching, both have made significant contributions to computer architecture research, most notably in the area of RISC (reduced instruction set computing). Patterson pioneered the RISC project at Berkeley, which produced research on which Sun's Sparc processors (and many others) would later be based. Meanwhile, Hennessy ran a similar RISC project at Stanford in the early 1980s called MIPS. Hennessy would later commercialize this research and found MIPS Computer Systems, whose RISC designs eventually made it into the popular game consoles of Sony and Nintendo.

Interviewing Hennessy and Patterson this month is Kunle Olukotun, associate professor of electrical engineering and computer science at Stanford University. Olukotun led the Stanford Hydra single-chip multiprocessor



research project, which pioneered multiple processors on a single silicon chip. Technology he helped develop and commercialize is now used in Sun Microsystems's Niagara line of multicore CPUs.

KUNLE OLUKOTUN I want to start by asking why you decided to write Computer Architecture: A Quantitative Approach.
DAVID PATTERSON Back in the 1980s, as RISC was just getting under way, I think John and I kept complaining to each other about the existing textbooks. I could see that I was going to become the chair of the computer science department, which I thought meant I wouldn't have any time. So we said, "It's now or never."
JOHN HENNESSY As we thought about the courses we were teaching in computer architecture—senior undergraduate and first-level graduate courses—we were very dissatisfied with what resources were out there. The common method of teaching a graduate-level, even an introductory graduate-level computer architecture course, was what we referred to as the supermarket approach. The course would consist of selected readings—sometimes a book, but often selected readings. Many people used [Dan] Siewiorek, [Gordon] Bell, and [Allen] Newell (authors of Computer Structures, McGraw-Hill, 1982), which were essentially selected readings. Course curricula looked as though someone had gone down the aisle and picked one selection from each aisle, without any notion of integration of the material, without thinking about the objective, which in the end was to teach people how to design computers that would be faster or cheaper, and with better cost performance.
KO This quantitative approach has had a significant impact on the way that the industry has designed computers and especially the way that computer research has been done. Did you expect your textbook to have the wide impact that it had?
JH The publisher's initial calculation was that we needed to sell 7,000 copies just to break even, and they thought we had a good shot at getting to maybe 10,000 or 15,000. As it turned out, the first edition sold well over 25,000. We didn't expect that.
DP This was John's first book, but I had done several books before, none of which was in danger of making me money. So I had low expectations, but I think we were shooting for artistic success, and it turned out to be a commercial success as well.
JH The book captured a lot of attention both among aca- […] it in its company store for employees. I think what also surprised us is how quickly it caught on internationally. We're now in at least eight languages.
DP I got a really great compliment the other day when I was giving a talk. Someone asked, "Are you related to the Patterson, of Patterson and Hennessy?" I said, "I'm pretty sure, yes, I am." But he says, "No, you're too young." So I guess the book has been around for a while.
JH Another thing I'd say about the book is that it wasn't until we started on it that I developed a solid and complete quantitative explanation of what had happened in the RISC developments. By using the CPI formula

Execution Time/Program = Instructions/Program x Clocks/Instruction x Time/Clock

we could show that there had been a real breakthrough in terms of instruction throughput, and that it overwhelmed any increase in instruction count. With a quantitative approach, we should be able to explain such insights quantitatively. In doing so, it also became clear how to explain it to other people.
DP The subtitle, Quantitative Approach, was not just a casual additive. This was a turn away from, amazingly, people spending hundreds of millions of dollars on somebody's hunch of what a good instruction set would be—somebody's personal taste. Instead, there should be engineering and science behind what you put in and what you leave out. So, we worked on that title.
We didn't quite realize—although I had done books before—what we had set ourselves up for. We both took sabbaticals, and we said, "Well, how hard can it be? We can just use the lecture notes from our two courses." But, boy, then we had a long way to go.
JH We had to collect data. We had to run simulations. There was a lot of work to be done in that first book. In the more recent edition, the book has become sufficiently well known that we have been able to enlist other people to help us collect data and get numbers, but in the first one, we did most of the work ourselves.
DP We spent time at the DEC Western Research Lab, where we hid out three days a week to get together and talk. We would write in between, and then we would go there and spend a lot of time talking through the ideas.
We made a bunch of decisions that I think are unchanged in the fourth edition of the book. For example, an idea has to be in some commercial product before we put it into the book. There are thousands of ideas, so
demics using it in classroom settings and among practic- how do you pick? If no one has bothered to use it yet,
ing professionals in the field. Microsoft actually stocked then we’ll wait till it gets used before we describe it.
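The CPI identity Hennessy quotes above rewards a quick worked example. The sketch below is not from the interview; the instruction counts, CPI values, and clock rate are invented purely for illustration. It shows how a design that executes more instructions can still win on execution time when its clocks per instruction drop far enough, which is the quantitative heart of the RISC argument.

```java
// Illustrative use of the CPI formula quoted above:
//   time/program = (instructions/program) x (clocks/instruction) x (time/clock)
// All machine parameters here are invented for the example.
public class CpiExample {
    static double executionTimeSec(double instructions, double cpi, double clockHz) {
        // time/clock is 1 / clockHz
        return instructions * cpi / clockHz;
    }

    public static void main(String[] args) {
        // Hypothetical complex-instruction design: fewer instructions, many clocks each.
        double cisc = executionTimeSec(1.0e9, 6.0, 50.0e6);  // 120 s
        // Hypothetical RISC-style design: 30% more instructions, far fewer clocks each.
        double risc = executionTimeSec(1.3e9, 1.5, 50.0e6);  // 39 s
        System.out.printf("complex: %.0f s, RISC-style: %.0f s, speedup: %.2fx%n",
                cisc, risc, cisc / risc);
    }
}
```

Despite a 30 percent higher instruction count, the much lower CPI yields roughly a threefold overall speedup, which is exactly the kind of breakthrough in instruction throughput the formula makes visible.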

16 December/January 2006-2007 ACM QUEUE rants: feedback@acmqueue.com


KO Do you think that limits the forward-looking nature of the book?

DP I think it probably does, but on the other hand, we’re less likely to put a bunch of stuff in that ends up being thrown away.

JH On balance, our approach has probably benefited us more often than it has hurt us. There are a lot of topics that became very faddish in the architecture research community but never really emerged.

For example, we didn’t put trace caches in the book when they were first just an academic idea; by the time we did put them in, it was already clear that they were going to be of limited use, and we put in a small amount of coverage. That’s a good example of not jumping the gun too early.

DP I think value prediction was another. There was tremendous excitement about its potential, and it ended up having limited applicability.

KO You delayed the third edition for Itanium, right?

JH I think our timing worked out right. It just goes to show the value of the quantitative approach. I think you can make a lot of pronouncements about an architecture, but when the rubber meets the road, does it perform or not?

DP One of the fallacies and pitfalls to consider is that you shouldn’t be comparing your performance to computers of today, given Moore’s law. You should be comparing yourself to performances at the time the computers come out. That relatively straightforward observation was apparently, to many people in marketing departments and to executives at computer companies, a surprising observation.

The Itanium was very late, which is one of its problems.

KO You’ve made a commitment to keeping the text up-to-date, so will there be a fifth edition?

DP It’s actually a lot more than four editions. We originally wanted to write the book for graduate students, and then our publisher said, “You need to make this information available for undergraduates.”

JH We thought somebody else would write an undergraduate textbook, but nobody did.

DP So we’ve now done three editions of the undergraduate book and four editions of the senior/graduate book.

JH What makes it so much work is that in each edition, 60 percent of the pages are essentially new. Something like 75 percent are substantially new, and 90 percent of them are touched in that process, not counting appendices. We replan each book every single time. It’s not a small undertaking.

DP I’m pretty proud of this latest edition. We felt really good about the first edition, but then I think some of the editions just got really big. This one, we’ve put on a diet and tried to concentrate on what we think is the essence of what’s going on, and moved the rest of the stuff into the CD and appendices.

KO How would you characterize the current state of computer architecture? Could you talk about the pace of innovation, compared with what it was in the past?

JH I think this is nothing less than a giant inflection point, if you look strictly from an architectural viewpoint—not a technology viewpoint. Gordon Bell has talked eloquently about defining computers in terms of what I might think of as technology-driven shifts. If you



interview

look at architecture-driven shifts, then this is probably only the fourth. There’s the first-generation electronic computers. Then I would put a sentinel at the IBM 360, which was really the beginning of the notion of an instruction-set architecture that was independent of implementation.

I would put another sentinel marking the beginning of the pipelining and instruction-level parallelism movement. Now we’re into the explicit parallelism multiprocessor era, and this will dominate for the foreseeable future. I don’t see any technology or architectural innovation on the horizon that might be competitive with this approach.

DP Back in the ’80s, when computer science was just learning about silicon and architects were able to understand chip-level implementation and the instruction set, I think the graduate students at Berkeley, Stanford, and elsewhere could genuinely build a microprocessor that was faster than what Intel could make, and that was amazing.

Now, I think today this shift toward parallelism is being forced not by somebody with a great idea, but because we don’t know how to build hardware the conventional way anymore. This is another brand-new opportunity for graduate students at Berkeley and Stanford and other schools to build a microprocessor that’s genuinely better than what Intel can build. And once again, that is amazing.

JH In some ways it’s déjà vu, much as the early RISC days relied on collaboration between compiler writers and architects and implementers and even operating-system people in the cases of commercial projects. It’s the same thing today because this era demands a level of collaboration and cross-disciplinary problem solving and design. It’s absolutely mandatory. The architects can’t do it alone.

Once ILP (instruction-level parallelism) got rolling, at least in the implicit ILP approaches, the architects could do most of the work. That’s not going to be true going forward.

DP This parallelism challenge involves a much broader community, and we have to get into applications and language design, and maybe even numerical analysis, not just compilers and operating systems. God knows who should be sitting around the table—but it’s a big table.

Architects can’t do it by themselves, but I also think you can’t do it without the architects.

KO One of the things that was nice about RISC is that with a bunch of graduate students, you could build a 30,000- or 40,000-transistor design, and that was it. You were done.

DP By the way, that was a lot of work back then. Computers were a lot slower!

JH We were working with hammers and chisels.

DP We were cutting Rubylith with X-acto knives, as I remember.

KO Absolutely. So today, if you really want to make an impact, it’s very difficult to actually do VLSI (very large scale integration) design in an academic setting.

JH I don’t know that that’s so true. It may have gotten easier again. One could imagine designing some novel multiprocessor starting with a commercial core, assuming that commercial core has sufficient flexibility. You can’t design something like a Pentium 4, however. It’s completely out of the range of what’s doable.

DP We recently painfully built a large microprocessor. At the ISCA (International Symposium on Computer Architecture) conference in 2005, a bunch of us were in the hallway talking about exactly this issue. How in the world are architects going to build things when it’s so hard to build chips? We absolutely have to innovate, given what has happened in the industry and the potential of this switch to parallelism.

That led to a project involving 10 of us from several leading universities, including Berkeley, Carnegie-Mellon, MIT, Stanford, Texas, and Washington. The idea is to use FPGAs (field programmable gate arrays). The basic bet is that FPGAs are so large we could fit a lot of simple processors on an FPGA. If we just put, say, 50 of them together, we could build 1,000-processor systems from FPGAs.

FPGAs are close enough to the design effort of hardware, so the results are going to be pretty convincing. People will be able to innovate architecturally in this FPGA and will be able to demonstrate ideas well enough that we could change what industry wants to do.

We call this project Research Accelerator for Multiple Processors, or RAMP. There’s a RAMP Web site (http://ramp.eecs.berkeley.edu).

KO Do you have industry partners?

DP Yes, we’ve got IBM, Sun, Xilinx, and Microsoft. Chuck Thacker, Technical Fellow at Microsoft, is getting Microsoft back into computer architecture, which is another reflection that architecture is exciting again. RAMP is one of his vehicles for doing architecture research.

JH I think it is time to try. There are challenges, clearly, but the biggest challenge by far is coming up with sufficiently new and novel approaches. Remember that this era is going to be about exploiting some sort of explicit parallelism, and if there’s a problem that has confounded computer science for a long time, it is exactly that. Why did the ILP revolution take off so quickly? Because programmers didn’t have to know about it. Well, here’s an approach where I suspect any way you encode parallelism, even if you embed the parallelism in a programming language, programmers are going to have to be aware of it, and they’re going to have to be aware that memory has a distributed model and synchronization is expensive and all these sorts of issues.

DP That’s one of the reasons we’re excited about what the actual RAMP vision is: Let’s create this thing where the architects supply the logic design, and it’s inexpensive and runs not as fast as the real chip but fast enough to run real software, so we can put it in everybody’s hands and they can start getting experience with a 1,000-processor system or a lot bigger than you can buy from Intel. Not only will it enable research, it will enable teaching. We’ll be able to take a RAMP design, put it in the classroom, and say, “OK, today it’s a shared multiprocessor. Tomorrow it has transactional memory.” The plus side with FPGAs is that if somebody comes up with a great idea, we don’t have to wait four years for the chips to get built before we can start using it. We can FTP the designs overnight and start trying it out the next day.

KO I think FPGAs are going to enable some very interesting architecture projects.

DP Architecture is interesting again. From my perspective, parallelism is the biggest challenge since high-level programming languages. It’s the biggest thing in 50 years because industry is betting its future that parallel programming will be useful.

Industry is building parallel hardware, assuming people can use it. And I think there’s a chance they’ll fail since the software is not necessarily in place. So this is a gigantic challenge facing the computer science community. If we miss this opportunity, it’s going to be bad for the industry.

Imagine if processors stop getting faster, which is not impossible. Parallel programming has proven to be a really hard concept. Just because you need a solution doesn’t mean you’re going to find it.
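Patterson’s warning that needing a solution does not guarantee finding one has a standard quantitative face in Amdahl’s law. The law is not quoted in the conversation, and the serial fractions below are invented for illustration; the point is how little of a 1,000-processor machine survives even a small serial residue.

```java
// Amdahl's law: speedup(N) = 1 / (s + (1 - s) / N),
// where s is the fraction of the work that stays serial.
// The serial fractions below are invented for illustration.
public class AmdahlExample {
    static double speedup(double serialFraction, int processors) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / processors);
    }

    public static void main(String[] args) {
        for (double s : new double[] {0.10, 0.01}) {
            System.out.printf("serial fraction %.0f%%: 1,000 processors give %.1fx%n",
                    s * 100, speedup(s, 1000));
        }
    }
}
```

With 10 percent of the work serial, 1,000 processors deliver less than a 10x speedup; even at 1 percent serial, they deliver roughly 91x rather than 1,000x.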




JH If anything, a bit of self-reflection on what happened in the last decade shows that we—and I mean collectively the companies, research community, and government funders—became too seduced by the ease with which instruction-level parallelism was exploited, without thinking that the road had an ending. We got there very quickly—more quickly than I would have guessed—but now we haven’t laid the groundwork. So I think Dave is right. There’s a lot of work to do without great certainty that we will solve those problems in the near future.

KO One of the things that we had in the days when you were doing the RISC research was a lot of government funding for this work. Do we have the necessary resources to make parallelism what we know it has to be in order to keep computer performance going?

DP I’m worried about funding for the whole field. As ACM’s president for two years, I spent a large fraction of my time commenting about the difficulties facing our field, given the drop in funding by certain five-letter government agencies. They just decided to invest it in little organizations like IBM and Sun Microsystems instead of the proven successful path of universities.

JH DARPA spent a lot of money pursuing parallel computing in the ’90s. I have to say that they did help achieve some real advances. But when we start talking about parallelism and ease of use of truly parallel computers, we’re talking about a problem that’s as hard as any that computer science has faced. It’s not going to be conquered unless the research program has a level of long-term commitment and has sufficiently significant segments of strategic funding to allow people to do large experiments and try ideas out.

DP For a researcher, this is an exciting time. There are huge opportunities. If you discover how to efficiently program a large number of processors, the world is going to beat a path to your door. It’s not such an exciting time to be in industry, however, where you’re betting the company’s future that someone is going to come up with the solution.

KO Do you see closer industry/academic collaboration to solve this problem? These things wax and wane, but given the fact that industry needs new ideas, then clearly there’s going to be more interest in academic research to try to figure out where to go next.

JH I would be panicked if I were in industry. Now I’m forced into an approach that I haven’t laid the groundwork for, it requires a lot more software leverage than the previous approaches, and the microprocessor manufacturers don’t control the software business, so you’ve got a very difficult situation.

It’s far more important now to be engaging the universities and working on these problems than it was, let’s say, helping find the next step in ILP. Unfortunately, we’re not going to find a quick fix.

DP RAMP will help us get to the solution faster than without it, but it’s not like next year when RAMP is available, we’ll solve the problem six months later. This is going to take a while.

For RISC, the big controversy was whether or not to change the instruction set. Parallelism has changed the programming model. It’s way beyond changing the instruction set. At Microsoft in 2005, if you said, “Hey, what do you guys think about parallel computers?” they would reply, “Who cares about parallel computers? We’ve had 15 or 20 years of doubling every 18 months. Get lost.” You couldn’t get anybody’s attention inside Microsoft by saying that the future was parallelism.

In 2006, everybody at Microsoft is talking about parallelism. Five years ago, if you had this breakthrough idea in parallelism, industry would show you the door. Now industry is highly motivated to listen to new ideas.

So they are a ready market, but I just don’t think industry is set up to be a research funding agency. The one organization that might come to the rescue would be the SRC (Semiconductor Research Council), which is a government/semiconductor industry joint effort that funnels monies to some universities. That type of an organization is becoming aware of what’s facing the microprocessor and, hence, semiconductor industry. They might be in position to fund some of these efforts.

KO There are many other issues beyond performance that could impact computer architecture. What ideas are there in the architecture realm, and what sort of impact are these other nonperformance metrics going to have on computing?

JH Well, power is easy. Power is performance. Completely interchangeable. How do you achieve a level of improved efficiency in the amount of power you use? If I can improve performance per watt, I can add more power and be assured of getting more performance.

DP It’s something that has been ignored so far, at least in the data center.

JH I agree with that. What happened is we convinced ourselves that we were on a long-term road with respect to ILP that didn’t have a conceivable end, ignoring the fact that with every step on the road we were achieving lower levels of efficiency and hence bringing the end of that road closer and closer. Clearly, issues of reliability matter a lot, but as the work at Berkeley and other places has shown, it’s a far more complicated metric than just looking at a simple notion of processor reliability.

DP Yes, I guess what you’re saying is, performance per watt is still a quantitative and benchmarkable goal. Reliability is a lot harder. We haven’t successfully figured out thus far how to insert bugs and things and see how things work. Now, that’s something we talked about at Berkeley and never found a good vehicle for.

I’m personally enthusiastic about the popularity of virtual machines for a bunch of reasons. In fact, there’s a new section on virtual machines in our latest book.

JH Whether it’s reliability or security, encapsulation in some form prevents a failure from rippling across an entire system. In security, it’s about containment. It’s about ensuring that whenever or wherever attacks occur, they’re confined to a relatively small area.

DP We could use virtual machines to do fault insertion. What we’re doing right now at Berkeley is looking into using virtual machines to help deal with power. We’re interested in Internet services. We know that with Internet services, the workload varies by time of day and day of the week. Our idea is when the load goes down, move the stuff off some of the machines and turn them off. When the load goes up, turn them on and move stuff to them, and we think there will be surprisingly substantial power savings with that simple policy.

KO People will come up with new ideas for programming parallel computers, but how will they know whether these ideas are better than the old ideas?

DP We always think of the quantitative approach as pertaining to hardware and software, but there are huge fractions of our respective campuses that do quantitative work all the time with human beings. There are even elaborate rules on human-subject experiments.

It would be new to us to do human-subject experiments on the ease of programming, but there is a large methodology that’s popular on campuses that computer science uses only in HCI (human-computer interaction) studies. There are ways to do that kind of work. It will be different, but it’s not unsolvable.

KO Would you advocate more research in this area of programmability?

DP Yes. I think if you look at the history of parallelism, computer architecture often comes up with the wild idea of how to get more peak performance out of a certain





fixed hardware budget. Then five or 10 years go by where a bunch of software people try to figure out how to make that thing programmable, and then we’re off to the next architecture idea when the old one doesn’t turn out. Maybe we should put some science behind this, trying to evaluate what worked and what didn’t work before we go onto the next idea.

My guess is that’s really the only way we’re going to solve these problems; otherwise, it will just be that all of us will have a hunch about what’s easier to program.

Even shared memory versus message passing—this is not a new trade-off. It has been around for 20 years. I’ll bet all of us in this conversation have differing opinions about the best thing to do. How about some experiments to shed some light on what the trade-offs are in terms of ease of programming of these approaches, especially as we scale?

If we just keep arguing about it, it’s possible it will never get solved; and if we don’t solve it, we won’t be able to rise up and meet this important challenge facing our field.

KO Looking back in history at the last big push in parallel computing, we see that we ended up with message passing as a de facto solution for developing parallel software. Are we in danger of that happening again? Will we end up with the lowest common denominator—whatever is easiest to do?

JH The fundamental problem is that we don’t have a really great solution. Many of the early ideas were motivated by observations of what was easy to implement in the hardware rather than what was easy to use: how we’re going to change our programming languages; what we can do in the architecture to mitigate the cost of various things, communication in particular, but synchronization as well.

Those are all open questions in my mind. We’re really in the early stages of how we think about this. If it’s the case that the amount of parallelism that programmers will have to deal with in the future will not be just two or four processors but tens or hundreds and thousands for some applications, then that’s a very different world than where we are today.

DP On the other hand, there’s exciting stuff happening in software right now. In the open source movement, there are highly productive programming environments that are getting invented at pretty high levels. Everybody’s example is Ruby on Rails, a pretty different way to learn how to program. This is a brave new world where you can rapidly create an Internet service that is dealing with lots of users.

There is evidence of tremendous advancement in part of the programming community—not particularly the academic part. I don’t know if academics are paying attention to this kind of work or not in the language community, but there’s hope of very different ways of doing things than we’ve done in the past.

Is there some way we could leverage that kind of innovation in making it compatible with this parallel future that we’re sure is out there? I don’t know the answer to that, but I would say nothing is off the table. Any solution that works, we’ll do it.

KO Given that you won’t be able to buy a microprocessor with a single core in the near future, you might be optimistic that the proliferation of these multicore parallel architectures will enable the open source community to come up with something interesting. Is that likely?

DP Certainly. What I’ve been doing is to tell all my colleagues in theory and software, “Hey, the world has changed. The La-Z-Boy approach isn’t going to work anymore. You can’t just sit there, waiting for your single processor to get a lot faster and your software to get faster, and then you can add the feature sets. That era is over. If you want things to go faster, you’re going to have to do parallel computing.”

The open source community is a real nuts-and-bolts community. They need to get access to parallel machines to start innovating. One of our tenets at RAMP is that the software people don’t do anything until the hardware shows up.

JH The real change that has occurred is the free software movement. If you have a really compelling idea, your ability to get to scale rapidly has been dramatically changed.

DP In the RAMP community, we’ve been thinking about how to put this in the hands of academics. Maybe we should be putting a big RAMP box out there on the Internet for the open source community, to let them play with a highly scalable processor and see what ideas they can come up with.

I guess that’s the right question: What can we do to engage the open source community to get innovative people, such as the authors of Ruby on Rails and other innovative programming environments? The parallel solutions may not come from academia or from research labs as they did in the past. Q

LOVE IT, HATE IT? LET US KNOW
feedback@acmqueue.com or www.acmqueue.com/forums

© 2006 ACM 1542-7730/06/1200 $5.00



FOCUS: Computer Architecture

UNLOCKING CONCURRENCY
ALI-REZA ADL-TABATABAI, INTEL
CHRISTOS KOZYRAKIS, STANFORD UNIVERSITY
BRATIN SAHA, INTEL

Multicore programming with transactional memory

Multicore architectures are an inflection point in mainstream software development because they force developers to write parallel programs. In a previous article in Queue, Herb Sutter and James Larus pointed out, “The concurrency revolution is primarily a software revolution. The difficult problem is not building multicore hardware, but programming it in a way that lets mainstream applications benefit from the continued exponential growth in CPU performance.”1 In this new multicore world, developers must write explicitly parallel applications that can take advantage of the increasing number of cores that each successive multicore generation will provide.

Parallel programming poses many new challenges to the developer, one of which is synchronizing concurrent access to shared memory by multiple threads. Programmers have traditionally used locks for synchronization, but lock-based synchronization has well-known pitfalls. Simplistic coarse-grained locking does not scale well, while more sophisticated fine-grained locking risks introducing deadlocks and data races. Furthermore, scalable libraries written using fine-grained locks cannot be easily composed in a way that retains scalability and avoids deadlock and data races.

TM (transactional memory) provides a new concurrency-control construct that avoids the pitfalls of locks and significantly eases concurrent programming. It brings to mainstream parallel programming proven concurrency-control concepts used for decades by the database community. Transactional-language constructs are easy to use and can lead to programs that scale. By avoiding deadlocks and automatically allowing fine-grained concurrency, transactional-language constructs enable the programmer to compose scalable applications safely out of thread-safe libraries.

Although TM is still in a research stage, it has increasing momentum pushing it into the mainstream. The recently defined HPCS (high-productivity computing system) languages—Fortress from Sun, X10 from IBM, and Chapel from Cray—all propose new constructs for transactions in lieu of locks. Mainstream developers who are early adopters of parallel programming technologies have paid close attention to TM because of its potential for improving programmer productivity; for example, in his keynote address at the 2006 POPL (Principles of Programming Languages) symposium, Tim Sweeney of Epic Games pointed out that “manual synchronization…is hopelessly intractable” for dealing with concurrency in game-play simulation and claimed that “transactions are the only plausible solution to concurrent mutable state.”2

Despite its momentum, bringing transactions into the mainstream still faces many challenges. Even with transactions, programmers must overcome parallel programming challenges, such as finding and extracting parallel tasks and mapping these tasks onto a parallel architecture for efficient execution. In this article, we describe how


transactions ease some of the challenges programmers face using locks, and we look at the challenges system designers face implementing transactions in programming languages.

PROGRAMMING WITH TRANSACTIONS
A memory transaction is a sequence of memory operations that either executes completely (commits) or has no effect (aborts).3 Transactions are atomic, meaning they are an all-or-nothing sequence of operations. If a transaction commits, then all of its memory operations appear to take effect as a unit, as if all the operations happened instantaneously. If a transaction aborts, then none of its stores appear to take effect, as if the transaction never happened.

A transaction runs in isolation, meaning it executes as if it’s the only operation running on the system and as if all other threads are suspended while it runs. This means that the effects of a memory transaction’s stores are not visible outside the transaction until the transaction commits; it also means that there are no other conflicting stores by other transactions while it runs.

Transactions give the illusion of serial execution to the programmer, and they give the illusion that they execute as a single atomic step with respect to other concurrent operations in the system. The programmer can reason serially because no other thread will perform any conflicting operation.

Of course, a TM system doesn’t really execute transactions serially; otherwise, it would defeat the purpose of parallel programming. Instead, the system “under the hood” allows multiple transactions to execute concurrently as long as it can still provide atomicity and isolation for each transaction. Later in this article, we cover how an implementation provides atomicity and isolation while still allowing as much concurrency as possible.

The best way to provide the benefits of TM to the programmer is to replace locks with a new language construct such as atomic { B } that executes the statements in block B as a transaction. A first-class language construct not only provides syntactic convenience for the programmer, but also enables static analyses that provide compile-time safety guarantees and enables compiler optimizations to improve performance, which we touch on later in this article.

Figure 1 illustrates how an atomic statement could be introduced and used in an object-oriented language such as Java. The figure shows two different implementations of a thread-safe map data structure. The code in section A of the figure shows a lock-based map using Java’s synchronized statement. The get() method simply delegates the call to an underlying non-thread-safe map
FIG 1: Lock-based vs. Transactional Map Data Structure

A
class LockBasedMap implements Map
{
  Object mutex;
  Map m;
  LockBasedMap(Map m) {
    this.m = m;
    mutex = new Object();
  }
  public Object get() {
    synchronized (mutex) {
      return m.get();
    }
  }
  // other Map methods
  ...
}

B
class AtomicMap implements Map
{
  Map m;
  AtomicMap(Map m) {
    this.m = m;
  }
  public Object get() {
    atomic {
      return m.get();
    }
  }
  // other Map methods
  ...
}
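The atomic construct in figure 1B is not yet available in standard Java, but the coarse-grained pattern in figure 1A runs on any JVM today. The sketch below is our own adaptation, not code from the article: the class is made generic and a small multithreaded harness is added. Because every call funnels through the single mutex, no updates are lost, at the cost of serializing all access.

```java
import java.util.HashMap;
import java.util.Map;

// Runnable variant of figure 1A: every call goes through one mutex, so at most
// one thread at a time touches the underlying (non-thread-safe) map.
class LockBasedMap<K, V> {
    private final Object mutex = new Object();
    private final Map<K, V> m;

    LockBasedMap(Map<K, V> m) { this.m = m; }

    public V get(K key) {
        synchronized (mutex) { return m.get(key); }
    }

    public V put(K key, V value) {
        synchronized (mutex) { return m.put(key, value); }
    }

    public int size() {
        synchronized (mutex) { return m.size(); }
    }
}

public class CoarseGrainedDemo {
    public static void main(String[] args) throws InterruptedException {
        LockBasedMap<Integer, Integer> map = new LockBasedMap<>(new HashMap<>());
        Thread[] workers = new Thread[4];
        for (int t = 0; t < 4; t++) {
            final int base = t * 1000;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) map.put(base + i, i);
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();
        // 4 threads x 1000 distinct keys: no updates are lost under the coarse lock.
        System.out.println(map.size()); // prints 4000
    }
}
```

Under a TM system, replacing each synchronized block with an atomic block would keep this safety while allowing non-conflicting calls to proceed in parallel.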



implementation, first wrapping the call in a synchronized statement. The synchronized statement acquires a lock represented by a mutex object held in another field of the synchronized hash map. This same mutex object guards all the other calls to this hash map.

Using locks, the programmer has explicitly forced all threads to execute any call through this synchronized wrapper serially. Only one thread at a time can call any method on this hash map. This is an example of coarse-grained locking. It's easy to write thread-safe programs in this way—you simply guard all calls through an interface with a single lock, forcing threads to execute inside the interface one at a time.

Part B of figure 1 shows the same code, using transactions instead of locks. Rather than using a synchronized statement with an explicit lock object, this code uses a new atomic statement. This atomic statement declares that the call to get() should be done atomically, as if it were done in a single execution step with respect to other threads. As with coarse-grained locking, it's easy for the programmer to make an interface thread safe by simply wrapping all the calls through the interface with an atomic statement. Rather than explicitly forcing one thread at a time to execute any call to this hash map, however, the programmer has instead declared to the system that the call should execute atomically. The system now assumes responsibility for guaranteeing atomicity and implements concurrency control under the hood.

Unlike coarse-grained locking, transactions can provide scalability as long as the data-access patterns allow transactions to execute concurrently. The transaction system can provide good scalability in two ways:
• It can allow concurrent read operations to the same variable. In a parallel program, it's safe to allow two or more threads to read the same variable concurrently. Basic mutual exclusion locks don't permit concurrent readers; to allow concurrent readers, the programmer has to use special reader-writer locks, increasing the program's complexity.
• It can allow concurrent read and write operations to different variables. In a parallel program, it's safe to allow two or more threads to read and write disjoint variables concurrently. A programmer can explicitly code fine-grained disjoint access concurrency by associating different locks with different fine-grained data elements. This is usually a tedious and difficult task, however, and risks introducing bugs such as deadlocks and data races. Furthermore, as we show in a later example, fine-grained locking does not lend itself to modular software engineering practices: In general, a programmer can't take software modules that use fine-grained locking and compose them together in a manner that safely allows concurrent access to disjoint data.

Transactions can be implemented in such a way that they allow both concurrent read accesses, as well as concurrent accesses to disjoint, fine-grained data elements (e.g., different objects or different array elements).
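The reader-writer locks mentioned in the first bullet exist in Java as ReentrantReadWriteLock. A minimal sketch (the wrapper class and its names are ours, for illustration only): multiple threads may hold the read lock at once, while the write lock remains exclusive.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Reader-writer locking: concurrent readers are allowed, writers are exclusive.
public class RWLockedMap {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, Integer> m = new HashMap<>();

    public Integer get(String key) {
        lock.readLock().lock();          // shared mode: many readers at once
        try { return m.get(key); }
        finally { lock.readLock().unlock(); }
    }

    public void put(String key, Integer value) {
        lock.writeLock().lock();         // exclusive mode: blocks readers and writers
        try { m.put(key, value); }
        finally { lock.writeLock().unlock(); }
    }

    public static void main(String[] args) {
        RWLockedMap map = new RWLockedMap();
        map.put("x", 1);
        System.out.println(map.get("x")); // prints 1
    }
}
```

This is exactly the extra complexity the text refers to: the programmer must choose and manage two lock modes by hand, whereas a TM system grants concurrent readers automatically.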
FIG 2: Performance of Transactions vs. Locks. The chart plots time (s) against number of threads (1, 2, 4, 8, 16) for three HashMap versions: synch (coarse), synch (fine), and atomic.

Using transactions, the programmer gets these forms of concurrency without having to code them explicitly in the program.

It is possible to write a concurrent hash-map data structure using locks so that you get both concurrent read accesses and concurrent accesses to disjoint data. In fact, the recent Java 5 libraries provide a version of HashMap, called ConcurrentHashMap, that does exactly this. The code for ConcurrentHashMap, however, is significantly longer and more complicated than the version



using coarse-grained locking. The algorithm was designed by threading experts and it went through a comprehensive public review process before it was added to the Java standard. In general, writing highly concurrent lock-based code such as ConcurrentHashMap is very complicated and bug prone and thereby introduces additional complexity to the software development process.

Figure 2 compares the performance of the three different versions of HashMap. It plots the time it takes to complete a fixed set of insert, delete, and update operations on a 16-way SMP (symmetric multiprocessing) machine.4 As the numbers show, the performance of coarse-grained locking does not improve as the number of processors increases, so coarse-grained locking does not scale. The performance of fine-grained locking and transactional memory, however, improves as the number of processors increases. So for this data structure, transactions give you the same scalability and performance as fine-grained locking but with significantly less programming effort. As these numbers demonstrate, transactions delegate to the runtime system the hard task of allowing as much concurrency as possible.

Although highly concurrent libraries built using fine-grained locking can scale well, a developer doesn't necessarily retain scalability after composing larger applications out of these libraries. As an example, assume the programmer wants to perform a composite operation that moves a value from one concurrent hash map to another, while maintaining the invariant that threads always see a key in either one hash map or the other, but never in neither. Implementing this requires that the programmer resort to coarse-grained locking, thus losing the scalability benefits of a concurrent hash map (figure 3A). To implement a scalable solution to this problem, the programmer must somehow reuse the fine-grained locking code hidden inside the implementation of the concurrent hash map. Even if the programmer had access to this implementation, building a composite move operation out of it risks introducing deadlock and data races, especially in the presence of other composite operations.

Transactions, on the other hand, allow the programmer to compose applications out of libraries safely and still achieve scalability. The programmer can simply wrap a transaction around the composite move operation (figure 3B). The underlying TM system will allow two threads to perform a move operation concurrently as long as the two threads access different hash-table buckets in both underlying hash-map structures. So transactions allow a programmer to take separately authored scalable software components and compose them together into larger components, in a way that still provides as much concurrency as possible but without risking deadlocks because of concurrency control.

By providing a mechanism to roll back side effects, transactions enable a language to provide failure atomicity. In lock-based code, programmers must make sure that exception handlers properly restore invariants before releasing locks. This requirement often leads to complicated exception-handling code because the programmer must not only make sure that a critical section catches and handles all exceptions, but also track the state of the data structures used inside the critical section so that the exception handlers can properly restore invariants. In a transaction-based language, the atomic statement can roll back all the side effects of the transaction (automatically restoring invariants) if an uncaught exception propagates out of its block. This significantly reduces the amount of exception-handling code and improves robustness, as uncaught exceptions inside a transaction won't compromise a program's invariants.

FIG 3: Thread-safe Composite Operation

A
move(Object key) {
  synchronized(mutex) {
    map2.put(key, map1.remove(key));
  }
}

B
move(Object key) {
  atomic {
    map2.put(key, map1.remove(key));
  }
}
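Figure 3B needs TM support, but figure 3A can be run as-is with a small harness around it (the surrounding class, types, and main are ours; the move body follows the figure). The coarse lock makes the composite operation atomic, so an observer can never see the key in both maps or in neither.

```java
import java.util.HashMap;
import java.util.Map;

// Figure 3A as runnable code: one coarse-grained mutex guards the composite
// move, preserving the "key lives in exactly one map" invariant.
public class CompositeMove {
    static final Object mutex = new Object();
    static final Map<String, Integer> map1 = new HashMap<>();
    static final Map<String, Integer> map2 = new HashMap<>();

    static void move(String key) {
        synchronized (mutex) {
            map2.put(key, map1.remove(key));
        }
    }

    public static void main(String[] args) {
        map1.put("k", 42);
        move("k");
        // After the move, the key is gone from map1 and present in map2.
        System.out.println(map1.containsKey("k") + " " + map2.get("k")); // prints false 42
    }
}
```

The cost, as the text notes, is that all moves serialize on one lock even when they touch unrelated buckets; a transaction around the same two calls would keep the invariant while letting disjoint moves run concurrently.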



TRANSACTIONS UNDER THE HOOD
Transactional memory transfers the burden of concurrency management from the application programmers to the system designers. Under the hood, a combination of software and hardware must guarantee that concurrent transactions from multiple threads execute atomically and in isolation. The key mechanisms for a TM system are data versioning and conflict detection.

As transactions execute, the system must simultaneously manage multiple versions of data. A new version, produced by one of the pending transactions, will become globally visible only if the transaction commits. The old version, produced by a previously committed transaction, must be preserved in case the pending transaction aborts. With eager versioning, a write access within a transaction immediately writes to memory the new data version. The old version is buffered in an undo log. If the transaction later commits, no further action is necessary to make the new versions globally visible. If the transaction aborts, the old versions must be restored from the undo log, causing some additional delay. To prevent other code from observing the uncommitted new versions (loss of atomicity), eager versioning requires the use of locks or an equivalent hardware mechanism throughout the transaction duration.

Lazy versioning stores all new data versions in a write buffer until the transaction completes. If the transaction commits, the new versions become visible by copying from the write buffer to the actual memory addresses. If the transaction aborts, no further action is needed as the new versions were isolated in the write buffer. In contrast to eager versioning, the lazy approach is subject to loss of atomicity only during the commit process. The challenges with lazy versioning, particularly for software implementations, are the delay introduced on transaction commits and the need to search the write buffer first on transaction reads to access the latest data versions.

A conflict occurs when two or more transactions operate concurrently on the same data with at least one transaction writing a new version. Conflict detection and resolution are essential to guarantee atomic execution. Detection relies on tracking the read set and write set for each transaction, which, respectively, include the addresses it read from and wrote to during its execution. We add an address to the read set on the first read to it within the transaction. Similarly, we add an address to the write set on the first write access.

Under pessimistic conflict detection, the system checks for conflicts progressively as transactions read and write data. Conflicts are detected early and can be handled either by stalling one of the transactions in place or by aborting one transaction and retrying it later. In general, the performance of pessimistic detection depends on the set of policies used to resolve conflicts, which are typically referred to as contention management. A challenging issue is the detection of recurring or circular conflicts between multiple transactions that can block all transactions from committing (lack of forward progress).

The alternative is optimistic conflict detection that assumes conflicts are rare and postpones all checks until the end of each transaction. Before committing, a transaction validates that no other transaction is reading the data it wrote or writing the data it read. The drawback to optimistic detection is that conflicts are detected late, past the point a transaction reads or writes the data. Hence, stalling in place is not a viable option for conflict resolution and may waste more work as a result of aborts. On the other hand, optimistic detection guarantees forward progress in all cases by simply giving priority to the committing transaction on a conflict. It also allows for additional concurrency for reads as conflict checks for writes are performed toward the end of each transaction. Optimistic conflict detection does not work with eager versioning.

The granularity of conflict detection is also an important design parameter. Object-level detection is close to the programmer's reasoning in object-oriented environments. Depending on the size of objects, it may also reduce overhead in terms of space and time needed for conflict detection. Its drawback is that it may lead to false conflicts, when two transactions operate on different fields within a large object such as a multidimensional array. Word-level detection eliminates false conflicts but requires more space and time to track and compare read sets and write sets. Cache-line-level detection provides a compromise between the frequency of false conflicts and time and space overhead. Unfortunately, cache lines and words are not language-level entities, which makes it difficult for programmers to optimize conflicts in their code, particularly with managed runtime environments that hide data placement from the user.

A final challenge for TM systems is the handling of nested transactions. Nesting may occur frequently, given the trend toward library-based programming and the fact that transactions can be composed easily and safely. Early systems automatically flattened nested transactions by subsuming any inner transactions within the outermost. While simple, the flattening approach prohibits explicit transaction aborts, which are useful for failure atomicity on exceptions. The alternative is to support partial



rollback to the beginning of the nested transaction when a conflict or an abort occurs during its execution. It requires that the version management and conflict detection for a nested transaction are independent from that for the outermost transaction. In addition to allowing explicit aborts, such support for nesting provides a powerful mechanism for performance tuning and for controlling the interaction between transactions and runtime or operating system services.5

It is unclear which of these options leads to an optimal design. Further experience with prototype implementations and a wide range of applications is needed to quantify the trade-offs among performance, ease of use, and complexity. In some cases, a combination of design options leads to the best performance. For example, some TM systems use optimistic detection for reads and pessimistic detection for writes, while detecting conflicts at the word level for arrays and at the object level for other data types.6 Nevertheless, any TM system must provide efficient implementations for the key structures (read set, write set, undo log, write buffer) and must facilitate the integration with optimizing compilers, managed runtimes, and existing libraries. The following sections discuss how these challenges are addressed with software and hardware techniques.

SOFTWARE TRANSACTIONAL MEMORY
STM (software transactional memory) implements transactional memory entirely in software so that it runs on stock hardware. An STM implementation uses read and write barriers (that is, inserts instrumentation) for all shared memory reads and writes inside transactional code blocks. The instrumentation is inserted by a compiler and allows the runtime system to maintain the metadata that is required for data versioning and conflict detection.

Figure 4 shows an example of how an atomic construct could be translated by a compiler in an STM implementation. Part A shows an atomic code block written by the programmer, and part B shows the compiler instrumenting the code in the transactional block. We use a simplified control flow to ease the presentation. The setjmp function checkpoints the current execution context so that the transaction can be restarted on an abort. The stmStart function initializes the runtime data structures. Accesses to the global variables a and b are mediated through the barrier functions stmRead and stmWrite. The stmCommit function completes the transaction and makes its changes visible to other threads. The transaction gets validated periodically during its execution, and if a conflict is detected, the transaction is aborted. On an abort, the STM library rolls back all the updates performed by the transaction, uses a longjmp to restore the context saved at the beginning of the transaction, and reexecutes the transaction.

Since TM accesses need to be instrumented, a compiler needs to generate an extra copy of any function that may be called from inside a transaction. This copy contains instrumented accesses and is invoked when the function is called from within a transaction. The transactional code can be heavily optimized by a compiler—for example, by eliminating barriers to the same address or to immutable variables.7

FIG 4: Translating an Atomic Construct for STM

A: User Code
int foo (int arg)
{
  …
  atomic
  {
    b = a + 5;
  }
  …
}

B: Compiled Code
int foo (int arg)
{
  jmpbuf env;
  …
  do {
    if (setjmp(&env) == 0) {
      stmStart();
      temp = stmRead(&a);
      temp1 = temp + 5;
      stmWrite(&b, temp1);
      stmCommit();
      break;
    }
  } while (1);
  …
}
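Figure 4's compiled code uses C's setjmp/longjmp to restart an aborted transaction. The same retry structure can be sketched in Java, with an exception standing in for longjmp. The stm* methods below are single-threaded stand-in stubs of our own, meant only to show the control flow a compiler would emit, not a real STM runtime.

```java
// Java rendering of figure 4B's control flow: the transaction body runs in a
// retry loop; an abort (here, an exception) rolls back and reexecutes the body.
public class StmTranslation {
    static class AbortException extends RuntimeException {}

    static int a = 10, b = 0;

    static void stmStart() { /* initialize per-transaction metadata */ }
    static int stmRead(String var) { return var.equals("a") ? a : b; }
    static void stmWrite(String var, int v) { if (var.equals("b")) b = v; else a = v; }
    static void stmCommit() { /* validate the read set, publish the writes */ }

    // What the compiler would emit for: atomic { b = a + 5; }
    static void foo() {
        while (true) {
            try {
                stmStart();
                int temp = stmRead("a");
                stmWrite("b", temp + 5);
                stmCommit();
                break; // committed: leave the retry loop
            } catch (AbortException e) {
                // conflict detected: state has been rolled back, loop retries
            }
        }
    }

    public static void main(String[] args) {
        foo();
        System.out.println(b); // prints 15
    }
}
```

A real STM's barriers would also maintain the read set, write set, and undo log described below; here they are elided to keep the compiled-code shape visible.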



The read and write barriers operate on transaction records, pointer-size metadata associated with every piece of data that a transaction may access. The runtime system also maintains a transaction descriptor for each transaction. The descriptor contains its transaction's state such as the read set, the write set, and the undo log for eager versioning (or the write buffer for lazy versioning). The STM runtime exports an API that allows other components of the language runtime, such as the garbage collector, to inspect and modify the contents of the descriptor, such as the read set, write set, or undo log. The descriptor also contains metadata that allows the runtime system to infer the nesting depth at which data was read or written. This allows the STM to partially roll back a nested transaction.8

The write barrier implements different forms of data versioning and conflict detection for writes. For eager versioning (pessimistic writes) the write barrier acquires an exclusive lock on the transaction record corresponding to the updated memory location, remembers the location's old value in the undo log, and updates the memory location in place. For lazy versioning (optimistic writes) the write barrier stores the new value in the write buffer; at commit time, the transaction acquires an exclusive lock on all the required transaction records and copies the values to memory.

The read barrier also operates on transaction records for detecting conflicts and implementing pessimistic or optimistic forms of read concurrency. For pessimistic reads the read barrier simply acquires a read lock on the corresponding transaction record before reading the data item. Optimistic reads are implemented by using data versioning; the transaction record holds the version number for the associated data.9

STM implementations detect conflicts in two cases: the read or write barrier finds that a transaction record is locked by some other transaction; or in a system with optimistic read concurrency, the transaction finds, during periodic validation, that the version number for some transaction record in its read set has changed. On a conflict, the STM can use a variety of sophisticated conflict resolution schemes such as causing transactions to back off in a random manner, or aborting and restarting some set of conflicting transactions.

STMs allow transactions to be integrated with the rest of the language environment, such as a garbage collector. They allow transactions to be integrated with tools, such as debuggers. They also allow accurate diagnostics for performance tuning. Finally, STMs avoid baking TM semantics prematurely into hardware.

STM implementations can incur a 40-50 percent overhead compared with lock-based code on a single thread. Moreover, STM implementations incur additional overhead if they have to guarantee isolation between transactional and nontransactional code. Reducing this overhead is an active area of research. Like other forms of TM, STMs don't have a satisfactory way of handling irrevocable actions such as I/O and system calls, nor can they execute arbitrary precompiled binaries within a transaction.

HARDWARE ACCELERATION FOR TM
Transactional memory can also be implemented in hardware, referred to as HTM (hardware transactional memory). An HTM system requires no read or write barriers within the transaction code. The hardware manages data versions and tracks conflicts transparently as the software performs ordinary read and write accesses. Apart from reducing the overhead of instrumentation, HTM systems do not require two versions of the functions used in transactions and work with programs that call uninstrumented library routines.

HTM systems rely on the cache hierarchy and the cache coherence protocol to implement versioning and conflict detection. Caches observe all reads and writes issued by the processors, can buffer a significant amount of data, and are fast to search because of their associative organization. All HTM systems modify the first-level caches, but the approach extends to lower-level caches, both private and shared.

To track the read set and write set for a transaction, each cache line is annotated with R and W tracking bits that are set on the first read or write to the line, respectively. When a transaction commits or aborts, all tracking bits are cleared simultaneously using a gang or flash reset operation.

Caches implement data versioning by storing the working set for the undo log or the data buffer for the transactions. Before a cache write under eager versioning, we check if this is the first update to the cache line within this transaction (W bit reset). In this case, the cache line and its address are added to the undo log using additional writes to the cache. If the transaction aborts, a hardware or software mechanism must traverse the log and restore the old data versions.10

In lazy versioning, a cache line written by the transaction becomes part of the write buffer by setting its W bit.11 If the transaction aborts, the write buffer is instantaneously flushed by invalidating all cache lines with the W bit set. If the transaction commits, the data in the write buffer becomes instantaneously visible to the rest of the system by resetting the W bits in all cache lines.
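The R and W tracking bits and the gang reset described above can be modeled in a few lines. This toy sketch (all names are ours) tags cache lines on first transactional access, answers coherence lookups, and flash-clears all bits on commit or abort; its conflict rule anticipates the coherence-based check discussed in the next section (a remote exclusive request conflicts with our R or W bit, a remote shared request only with our W bit).

```java
import java.util.Arrays;

// Toy model of HTM tracking bits: per-cache-line R and W bits set on first
// transactional access, gang-cleared on commit or abort.
public class TrackingBits {
    static final int LINES = 8;
    static final boolean[] r = new boolean[LINES];
    static final boolean[] w = new boolean[LINES];

    static void txRead(int line)  { r[line] = true; }
    static void txWrite(int line) { w[line] = true; }

    // Remote exclusive (write) request conflicts with our R or W bit;
    // remote shared (read) request conflicts only with our W bit.
    static boolean conflicts(int line, boolean remoteWantsExclusive) {
        return remoteWantsExclusive ? (r[line] || w[line]) : w[line];
    }

    static void commitOrAbort() { // flash reset: clear every tracking bit at once
        Arrays.fill(r, false);
        Arrays.fill(w, false);
    }

    public static void main(String[] args) {
        txRead(3);
        txWrite(5);
        System.out.println(conflicts(3, true) + " " + conflicts(3, false)
                + " " + conflicts(5, false)); // prints true false true
        commitOrAbort();
        System.out.println(conflicts(5, true)); // prints false
    }
}
```

In real hardware the "lines" are found associatively by address and the bits live alongside the cache tags, which is what makes the lookup and the gang clear essentially free.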



To detect conflicts, the caches must communicate their read sets and write sets using the cache coherence protocol implemented in multicore chips. Pessimistic conflict detection uses the same coherence messages exchanged in existing systems.12 On a read or write access within a transaction, the processor will request shared or exclusive access to the corresponding cache line. The request is transmitted to all other processors that look up their caches for copies of this cache line. A conflict is signaled if a remote cache has a copy of the same line with the R bit set (for an exclusive access request) or the W bit set (for either request type). Optimistic conflict detection operates similarly but delays the requests for exclusive access to cache lines in the write set until the transaction is ready to commit. A single, bulk message is sufficient to communicate all requests.13

Even though HTM systems eliminate most sources of overhead for transactional execution, they nevertheless introduce additional challenges. The modifications HTM requires in the cache hierarchy and the coherence protocol are nontrivial. Processor vendors may be reluctant to implement them before transactional programming becomes pervasive. Moreover, the caches used to track the read set, write set, and write buffer for transactions have finite capacity and may overflow on a long transaction.

Long transactions may be rare, but they still must be handled in a manner that preserves atomicity and isolation. Placing implementation-dependent limits on transaction sizes is unacceptable from the programmer's perspective. Finally, it is challenging to handle the transaction state in caches for deeply nested transactions or when interrupts, paging, or thread migration occur.14

Several proposed mechanisms virtualize the finite resources and simplify their organization in HTM systems. One approach is to track read sets and write sets using signatures based on Bloom filters. The signatures provide a compact yet inexact (pessimistic) representation of the sets that can be easily saved, restored, or communicated if necessary. The drawback is that the inexact representation leads to additional, false conflicts that may degrade performance. Another approach is to map read sets, write sets, and write buffers to virtual memory and use hardware or firmware mechanisms to move data between caches and memory on cache overflows.

An alternative virtualization technique is to use a hybrid HTM-STM implementation. Transactions start using the HTM mode. If hardware resources are exceeded, the transactions are rolled back and restarted in the STM mode.15 The challenge with hybrid TM is conflict detection between software and hardware transactions. To avoid the need for two versions of the code, the software mode of a hybrid STM system can be provided through the operating system with conflict detection at the granularity of memory pages.16

A final implementation approach is to start with an STM system and provide a small set of key mechanisms that targets its main sources of overhead.17 This approach is called HASTM (hardware-accelerated STM). HASTM introduces two basic hardware primitives: support for detecting the first use of a cache line, and support for detecting possible remote updates to a cache line. The two primitives can significantly reduce the read-barrier instrumentation overhead in general and the read-set validation time in the case of optimistic reads.

CONCLUSIONS
Composing scalable parallel applications using locks is difficult and full of pitfalls. Transactional memory avoids many of these pitfalls and allows the programmer to compose applications safely and in a manner that scales. Transactions improve the programmer's productivity by shifting the difficult concurrency-control problems from the application developer to the system designer.

In the past three years, TM has attracted a great deal of research activity, resulting in significant progress.18 Nevertheless, before transactions can make it into the mainstream as first-class language constructs, there are many open challenges to address.

Developers will want to protect their investments in existing software, so transactions must be added incrementally to existing languages, and tools must be developed that help migrate existing code from locks to transactions. This means transactions must compose with existing concurrency features such as locks and threads. System calls and I/O must be allowed inside transactions, and transactional memory must integrate with other transactional resources in the environment. Debugging and tuning tools for transactional code are also challenges, as transactions still require tuning to achieve scalability and concurrency bugs are still possible using transactions.

Transactions are not a panacea for all parallel programming challenges. Additional technologies are needed to address issues such as task decomposition and mapping. Nevertheless, transactions take a concrete step toward making parallel programming easier. This is a step that will clearly benefit from new software and hardware technologies. Q

AUTHORS' NOTE
For extended coverage on the topic, refer to the slides from the PACT '06 (Parallel Architectures and Compilation Techniques) tutorial, "Transactional Programming in a Multicore Environment," available at http://csl.stanford.edu/~christos/publications/tm_tutorial_pact2006.zip.

REFERENCES
1. Sutter, H., Larus, J. 2005. Software and the concurrency revolution. ACM Queue 3 (7).
2. Sweeney, T. 2006. The next mainstream programming languages: A game developer's perspective. Keynote speech, Symposium on Principles of Programming Languages. Charleston, SC (January).
3. Herlihy, M., Moss, E. 1993. Transactional memory: Architectural support for lock-free data structures. In Proceedings of the 20th Annual International Symposium on Computer Architecture. San Diego, CA (May).
4. Adl-Tabatabai, A., Lewis, B.T., Menon, V.S., Murphy, B.M., Saha, B., Shpeisman, T. 2006. Compiler and runtime support for efficient software transactional memory. In Proceedings of the Conference on Programming Language Design and Implementation. Ottawa, Canada (June).
5. McDonald, A., Chung, J., Carlstrom, B.D., Cao Minh, C., Chafi, H., Kozyrakis, C., Olukotun, K. 2006. Architectural semantics for practical transactional memory. In Proceedings of the 33rd International Symposium on Computer Architecture. Boston, MA (June).
6. Saha, B., Adl-Tabatabai, A., Hudson, R., Cao Minh, C., Hertzberg, B. 2006. McRT-STM: A high-performance software transactional memory system for a multicore runtime. In Proceedings of the Symposium on Principles and Practice of Parallel Programming. New York, NY (March).
7. See reference 4.
8. See reference 6.
9. See reference 6.
10. Moore, K., Bobba, J., Moravan, M., Hill, M., Wood, D. 2006. LogTM: Log-based transactional memory. In Proceedings of the 12th International Conference on High-Performance Computer Architecture. Austin, TX (February).
11. Hammond, L., Carlstrom, B., Wong, V., Chen, M., Kozyrakis, C., Olukotun, K. 2004. Transactional coherence and consistency: Simplifying parallel hardware and software. IEEE Micro 24 (6).
12. See reference 10.
13. See reference 11.
14. Chung, J., Cao Minh, C., McDonald, A., Skare, T., Chafi, H., Carlstrom, B., Kozyrakis, C., Olukotun, K. 2006. Tradeoffs in transactional memory virtualization. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems. San Jose, CA (October).
15. Damron, P., Fedorova, A., Lev, Y., Luchangco, V., Moir, M., Nussbaum, D. 2006. Hybrid transactional memory. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems. San Jose, CA (October).
16. See reference 14.
17. Saha, B., Adl-Tabatabai, A., Jacobson, Q. 2006. Architectural support for software transactional memory. In Proceedings of the 39th International Symposium on Microarchitecture. Orlando, FL (December).
18. Transactional Memory Online Bibliography; http://www.cs.wisc.edu/trans-memory/biblio/.

LOVE IT, HATE IT? LET US KNOW
feedback@acmqueue.com or www.acmqueue.com/forums

ALI-REZA ADL-TABATABAI is a principal engineer in the Programming Systems Lab at Intel Corporation. He leads a team developing compilers and scalable runtimes for future Intel architectures. His current research concentrates on language features supporting parallel programming for future multicore architectures.
CHRISTOS KOZYRAKIS (http://csl.stanford.edu/~christos) is an assistant professor of electrical engineering and computer science at Stanford University. His research focuses on architectures, compilers, and programming models for parallel computer systems. He is working on transactional memory techniques that can greatly simplify parallel programming for the average developer.
BRATIN SAHA is a senior staff researcher in the Programming Systems Lab at Intel Corporation. He is one of the architects for synchronization and locking in the next-generation IA-32 processors. He is involved in the design and implementation of a highly scalable runtime for multicore processors. As a part of this he has been looking at language features, such as transactional memory, to ease parallel programming.
© 2006 ACM 1542-7730/06/1200 $5.00

more queue: www.acmqueue.com ACM QUEUE December/January 2006-2007 33


FOCUS: Computer Architecture

The Virtualization Reality
A number of important challenges are associated with the deployment and configuration of contemporary computing infrastructure. Given the variety of operating systems and their many versions—including the often-specific configurations required to accommodate the wide range of popular applications—it has become quite a conundrum to establish and manage such systems.

Significantly motivated by these challenges, but also owing to several other important opportunities it offers, virtualization has recently become a principal focus for computer systems software. It enables a single computer to host multiple different operating system stacks, and it decreases server count and reduces overall system complexity. EMC’s VMware is the most visible and early entrant in this space, but more recently XenSource, Parallels, and Microsoft have introduced virtualization solutions. Many of the major systems vendors, such as IBM, Sun, and Microsoft, have efforts under way to exploit virtualization. Virtualization appears to be far more than just another ephemeral marketplace trend. It is poised to deliver profound changes to the way that both enterprises and consumers use computer systems.

What problems does virtualization address, and moreover, what will you need to know and/or do differently to take advantage of the innovations that it delivers? In this article we provide an overview of system virtualization, taking a closer look at the Xen hypervisor and its paravirtualization architecture. We then review several challenges in deploying and exploiting computer systems and software applications, and we look at IT infrastructure management today and show how virtualization can help address some of the challenges.

A POCKET HISTORY OF VIRTUALIZATION
All modern computers are sufficiently powerful to use virtualization to present the illusion of many smaller VMs (virtual machines), each running a separate operating system instance. An operating system virtualization environment provides each virtualized operating system (or guest) the illusion that it has exclusive access to the underlying hardware platform on which it runs. Of course, the virtual machine itself can offer the guest a different view of the hardware from what is really available, including CPU, memory, I/O, and restricted views of devices.

Virtualization has a long history, starting in the mainframe environment and arising from the need to provide isolation between users. The basic trend started with time-sharing systems (enabling multiple users to share a single expensive computer system), aided by innovations in operating system design to support the idea of processes that belong to a single user. The addition of user and supervisor modes on most commercially relevant

SIMON CROSBY, XENSOURCE and DAVID BROWN, SUN MICROSYSTEMS


Are hypervisors the new foundation for system software?



processors meant that the operating system code could be protected from user programs, using a set of so-called “privileged” instructions reserved for the operating system software running in supervisor mode. Memory protection and, ultimately, virtual memory were invented so that separate address spaces could be assigned to different processes to share the system’s physical memory and ensure that its use by different applications was mutually segregated.

These initial enhancements could all be accommodated within the operating system, until the day arrived when different users, or different applications on the same physical machine, wanted to run different operating systems. This requirement could be satisfied only by supporting multiple VMs, each capable of running its own operating system. The virtualization era (marked by IBM’s release of VM for the System/360 in 1972) had dawned.

VIRTUALIZATION BASICS
Operating system virtualization is achieved by inserting a layer of system software—often called the hypervisor or VMM (virtual machine monitor)—between the guest operating system and the underlying hardware. This layer is responsible for allowing multiple operating system images (and all their running applications) to share the resources of a single hardware server. Each operating system believes that it has the resources of the entire machine under its control, but beneath its feet the virtualization layer, or hypervisor, transparently ensures that resources are properly and securely partitioned between different operating system images and their applications. The hypervisor manages all hardware structures, such as the MMU (memory management unit), I/O devices, and DMA (direct memory access) controllers, and presents a virtualized abstraction of those resources to each guest operating system.

EMULATED VIRTUALIZATION
The most direct method of achieving virtualization is to provide a complete emulation of the underlying hardware platform’s architecture in software, particularly involving the processor’s instruction set architecture. For the x86 processor, the privileged instructions—used exclusively by the operating system (for interrupt handling, reading and writing to devices, and virtual memory)—form the dominant class of instructions requiring emulation. By definition, a user program cannot execute these instructions. One technique to force emulation of these instructions is to execute all of the code within a virtual machine, including the operating system being virtualized, as user code. The resident VMM then handles the exception produced by the attempt to execute a privileged instruction and performs the desired action on behalf of the operating system.

While some CPUs were carefully architected with operating system virtualization in mind (the IBM 360 is one such example), many contemporary commodity processor architectures evolved from earlier designs, which did not anticipate virtualization. Providing full virtualization of a processor in such cases is a challenging problem, often resulting in so-called “virtualization holes.” Virtualization of the x86 processor is no exception. For example, certain instructions execute in both user mode and supervisor mode but produce different results, depending on the execution mode. A common approach to overcome these problems is to scan the operating system code and modify the offending instruction sequences, either to produce the intended behavior or to force a trap into the VMM. Unfortunately, this patching and trapping approach can cause significant performance penalties.

PARAVIRTUALIZATION
An alternative way of achieving virtualization is to present a VM abstraction that is similar but not identical to the underlying hardware. This approach has been called paravirtualization.

In lieu of a direct software emulation of the underlying hardware architecture, the concept of paravirtualization is that a guest operating system and an underlying hypervisor collaborate closely to achieve optimal performance. Many guest operating system instances (of different configurations and types) may run atop the one hypervisor on a given hardware platform. This offers improved performance, although it does require modifications to the guest operating system. It is important to note, however, that it does not require any change to the ABI (application binary interface) offered by the guest system; hence, no modifications are required to the guest operating system’s applications.

In many ways this method is similar to the operating

system virtualization approach of VM for the IBM 360 and 370 mainframes.1,2 Under pure virtualization, you can run an unmodified operating-system binary and unmodified application binaries, but resource consumption management and performance isolation are problematic—one guest operating system and/or its apps could consume all physical memory and/or cause thrashing, for example. The paravirtualization approach requires some work to port each guest operating system, but rigorous allocation of hardware resources can then be done by the hypervisor, ensuring proper performance isolation and guarantees.

The use of paravirtualization and the complementary innovation of processor architecture extensions to support it (particularly those recently introduced in both the Intel and AMD processors, which eliminate the need to “trap and emulate”) now permit high-performance virtualization of the x86 architecture.

PARAVIRTUALIZATION AND THE XEN HYPERVISOR
An example of paravirtualization as applied on the x86 architecture is the Xen hypervisor (figure 1). Xen was initially developed by Ian Pratt and a team at the University of Cambridge in 2001-02, and has subsequently evolved into an open source project with broad involvement.

Any hypervisor (whether it implements full hardware emulation or paravirtualization) must provide virtualization for the following system facilities:
• CPUs (including multiple cores per device)
• Memory system (memory management and physical memory)
• I/O devices
• Asynchronous events, such as interrupts

Let’s now briefly examine Xen’s approach to each of these facilities. (For further detail, we recommend the excellent introduction to and comprehensive treatment of Xen’s design and principles presented in Pratt et al.’s paper.3)

CPU AND MEMORY VIRTUALIZATION
In Xen’s paravirtualization, virtualization of CPU and memory and low-level hardware interrupts are provided by a low-level efficient hypervisor layer that is implemented in about 50,000 lines of code. When the operating system updates hardware data structures, such as the page table, or initiates a DMA operation, it collaborates with the hypervisor by making calls into an API that is offered by the hypervisor.

This, in turn, allows the hypervisor to keep track of all changes made by the operating system and to optimally decide how to manage the state of hardware data structures on context switches. The hypervisor is mapped into the address space of each guest operating system, meaning that there is no context-switch overhead between the operating system and the hypervisor on a hypercall.

Finally, by cooperatively working with the guest operating systems, the hypervisor gains insight into the intentions of the operating system and can make it aware that it has been virtualized. This can be a great advantage to the guest operating system—for example, the hypervisor can tell the guest that real time has passed between its last run and its present run, permitting it to make smarter rescheduling decisions to respond appropriately to a rapidly changing environment.

Xen makes a guest operating system (running on top of the VMM) virtualization-aware and presents it with a slightly modified x86 architecture, provided through the so-called hypercall API. This removes any difficult and costly-to-emulate privileged instructions and provides equivalent, although not identical, functionality with explicit calls into the hypervisor. The operating system must be modified to deal with this change, but in a well-structured operating system, these changes are limited to its architecture-dependent modules, most typically a fairly small subset of the complete operating system implementation. Most importantly, the bulk of the operating system and the entirety of application programs remain unmodified.

[Figure 1. Paravirtualization and the Xen hypervisor: a small hypervisor runs directly on the hardware and exposes a hypercall API; domain 0 (the root partition, containing management code and device drivers) and guest VMs such as Linux and Windows run above it; guest OSes cooperate with the hypervisor for resource management and I/O; device drivers run outside the hypervisor.]



For Linux, the Xen hypercall API takes the form of a jump table populated at kernel load time. When the kernel is running in a native implementation (i.e., not atop a paravirtualizing hypervisor), the jump table is populated with default native operations; when the kernel is running on Xen, the jump table is populated with the Xen hypercalls. This enables the same kernel to run in both native and virtualized forms, with the performance benefits of paravirtualization but without the need to recertify applications against the kernel.

Isolation between virtual machines (hence, the respective guest operating systems running within each) is a particularly important property that Xen provides. The physical resources of the hardware platform (such as CPU, memory, etc.) are rigidly divided between VMs to ensure that they each receive a guaranteed portion of the platform’s overall capacity for processing, memory, I/O, and so on. Moreover, as each guest is running on its own set of virtual hardware, applications in separate operating systems are protected from one another to almost the same degree that they would be were they installed on separate physical hosts. This property is particularly appealing in light of the inability of current operating systems to provide protection against spyware, worms, and viruses. In a system such as Xen, nontrusted applications considered to pose such risks (perhaps such as Web browsers) may be seconded to their own virtual machines and thus completely separated from both the underlying system software and other more trusted applications.

I/O VIRTUALIZATION
I/O virtualization in a paravirtualizing VMM such as Xen is achieved via a single set of drivers. The Xen hypervisor exposes a set of clean and simple device abstractions, and a set of drivers for all hardware on the physical platform is implemented in a special domain (VM) outside the core hypervisor. These drivers are offered via the hypervisor’s abstracted I/O interface for use within other VMs, and thus are used by all guest operating systems.4

In each Xen guest operating system, simple paravirtualizing device drivers replace hardware-specific drivers for the physical platform. Paravirtualizing drivers are independent of all physical hardware but represent each type of device (e.g., block I/O, Ethernet). These drivers enable high-performance, virtualization-safe I/O to be accomplished by transferring control of the I/O to the hypervisor, with no additional complexity in the guest operating system. It is important to note that the drivers in the Xen architecture run outside the base hypervisor, at a lower level of protection than the core of the hypervisor itself. The hypervisor is thus protected from bugs and crashes in device drivers (they cannot crash the Xen VMM) and can use any device drivers available on the market. Also, the virtualized operating system image is much more portable across hardware, since the low levels of the driver and hardware management are modules that run under control of the hypervisor.

In full-virtualization (emulation) implementations, the platform’s physical hardware devices are emulated, and the unmodified binary for each guest operating system is run, including the native drivers it contains. In those circumstances it is difficult to restrict the respective operating system’s use of the platform’s physical hardware, and one virtual machine’s runtime behaviors can significantly impact the performance of the others. Since all physical access to hardware is managed centrally in Xen’s approach to I/O virtualization, resource access by each guest can be marshaled. This provides the consequential benefit of performance isolation for each of the guest operating systems.

Those who have experience with microkernels will likely find this approach to I/O virtualization familiar. One significant difference between Xen and historical work on microkernels, however, is that Xen has relaxed the constraint of achieving a complete and architecturally pure emulation of the x86 processor’s I/O architecture. Xen uses a generalized, shared-memory, ring-based I/O communication primitive that is able to achieve very high throughputs by batching requests. This I/O abstraction has served well in ports to other processor architectures, including the IA-64 and PowerPC. It also affords an innovative means to add features into the I/O path, by plumbing in additional modules between the guest virtual device and the real device driver. One example in the network stack is the support of full OSI layer 2 switching, packet filtering, and even intrusion detection.

HARDWARE SUPPORT FOR VIRTUALIZATION
Recent innovations in hardware, particularly in CPU, MMU, and memory components (notably the hardware

virtualization support presently available in the Intel VT-x and AMD-V architectures, offered in both client and server platforms), provide some direct platform-level architectural support for operating system virtualization. This has enabled near bare-metal performance for virtualized guest operating systems.

Xen provides a common HVM (hardware virtual machine) abstraction to hide the minor differences between the Intel and AMD technologies and their implementations. HVM offers two key features: First, for unmodified guest operating systems, it avoids the need to trap and emulate privileged instructions in the operating system, by enabling guests to run at their native privilege levels, while providing a hardware vector (called a VM EXIT) into the hypervisor whenever the guest executes a privileged instruction that would unsafely modify the machine state. The hypervisor begins execution with the full state of the guest available to it and can rapidly decide how best to deal with the reason for the VM EXIT. Today’s hardware takes about 1,000 clock cycles to save the state of the currently executing guest and to transition into the hypervisor, which offers good, though not outstanding, performance.

A second feature of the HVM implementations is that they offer guest operating systems running with a paravirtualizing hypervisor (in particular, their device drivers) new instructions that call directly into the hypervisor. These can be used to ensure that guest I/O takes the fastest path into the hypervisor. Paravirtualizing device drivers, inserted into each guest operating system, can then achieve optimal I/O performance, even though neither Intel’s nor AMD’s virtualization extension for the x86 (Intel VT and AMD-V, respectively) offers particular performance benefits to I/O virtualization.

VIRTUALIZATION AS A SOLUTION
A number of chronic challenges are associated with deployment and management of computer systems and their applications, especially in the modern context of larger-scale, commercial, and/or enterprise use. Virtualization provides an abstraction from the physical hardware, which breaks the constraint that only a single instance of an operating system may run on a single hardware platform. Because it encapsulates the operating environment, virtualization is a surprisingly powerful abstraction.

SERVER VIRTUALIZATION
The past decade has witnessed a revolutionary reduction in hardware costs, as well as a significant increase in both capacity and performance of many of the basic hardware platform constituents (processors, storage, and memory). Ironically, in spite of the corresponding widespread adoption of these now relatively inexpensive, x86-based servers, most enterprises have seen their IT costs and complexity escalate rapidly.

While the steady march of Moore’s law has markedly decreased hardware’s cost of acquisition, the associated proliferation of this inexpensive computing has led to tremendous increases in complexity—with the costs of server configuration, management, power, and maintenance dwarfing the basic cost of the hardware. Each server in the data center costs an enterprise on average $10,000 per year to run when all of its costs—provisioning, maintenance, administration, power, real estate, hardware, and software—are considered. In addition, the artifacts of current operating-system and system-software architecture result in most servers today running at under 10 percent utilization.

Several opportunities arise directly from the rapid performance and capacity increase seen in the contemporary commodity hardware platforms. Last decade’s trend in commercial IT infrastructure was an expanding hardware universe: achieving performance and capacity by “horizontal” scaling of the hardware. Given the dramatic performance available on a single commodity box today, we may now be witnessing a contraction of this universe—still a horizontal trend, but in reverse. Whereas it may have required many servers to support enterprise-wide or even department-wide computing just five years ago, virtualization allows many large application loads to be placed on one hardware platform, or a smaller number of platforms. This can cut both per-server capital cost and the overall lifetime operational costs significantly.

The 10-percent utilization statistic reveals that server consolidation can achieve a tenfold savings in infrastructure cost, not simply through reduced CPU count but more importantly through its consequent reductions in switching, communication, and storage infrastructure, and power and management costs. Since virtualization allows multiple operating system images (and the applications associated with each that constitute software services) to share a single hardware server, it is a basic enabler for server consolidation.

The virtual I/O abstraction is another important component of server virtualization. In the past, when multiple servers and/or multiple hardware interfaces per server were used to support scalability, physical hardware devices could be individually allotted to guarantee a certain performance, specific security properties, and/or other configuration aspects to individual operating-system and application loads. Nowadays, a single device may have significantly higher performance (e.g., the transition from Fast Ethernet to inexpensive Gigabit or even 10-Gigabit network interface cards), and just one or a much smaller number of physical devices will likely be present on a single server or server configuration.

In such configurations, where individual physical hardware devices are shared by multiple hosted VMs on a single server, ensuring that there is proper isolation between their respective demands upon the shared hardware is critical. Strict allocation of the shared CPU, memory, and I/O resources, as well as the assurance of the security of both the platform and the guests, are key requirements that fall on the hypervisor.

Beyond its immediate application for server consolidation, server virtualization offers many further benefits that derive from the separation of virtual machines (an operating system and its applications) from physical hardware. These benefits (several of which have yet to be exploited fully in application) include dynamic provisioning, high availability, fault tolerance, and a “utility computing” paradigm in which compute resources are dynamically assigned to virtualized application workloads.

VIRTUAL APPLIANCES
Once an operating system and its applications have been encapsulated into a virtual machine, the VM can be run on any computer with a hypervisor. The ability to encapsulate all states, including application and operating-system configuration, into a single, portable, instantly runnable package provides great flexibility. For a start, the application can be provisioned and the VM saved in a “suspended” state, which makes it instantly runnable without further configuration. The image of one or more applications that have been properly configured in a VM and are ready to run can be saved, and this may then be used as a highly portable distribution format for a software service.

The administrative tasks of installing and configuring an operating system and the necessary applications prior to instantiating and launching a software service on a platform are no longer needed. The preconfigured and saved VM image is simply loaded and launched. VMware led the industry with its appliance concept, which aims to use packaged VMs as a new software distribution technique. VMware offers more than 200 prepackaged appliances from its Web site.

Within the enterprise, the packaged VM offers additional benefits: Software delivered by an engineering group can be packaged with the operating system it requires and can be staged for testing and production as a VM. Easily and instantly provisioned onto testing equipment, the application and the operating system against which it is certified can be quickly tested in a cost-efficient environment before being made available as a packaged VM, ready for deployment into production.

A key problem in the data center is the ability to get new applications quickly into production. New applications typically take 60 to 90 days to qualify. To make it from testing into the data center, IT staff must acquire a new machine, provision it with an operating system, install the application, configure and test the setup for the service in question, and only then, once satisfied, rack the resulting server in the data center.

This packaging approach provides an avenue to a solution. Once new software has been packaged as an appliance, it can be deployed and run instantly on any existing server in the data center that has sufficient capacity to run it. Any final testing or qualification can still be done before the service is made available for production use if required, but the lead times to acquire, install, and/or customize new hardware at its point of use are removed.

LIVE RELOCATION
Virtual appliances accelerate software provisioning and portability. Live relocation—the ability to move a running VM dynamically from one server to another, without stopping it—offers another benefit: When coupled with load-balancing and server resource optimization software, this provides a powerful tool for enabling a “utility computing” paradigm. When a VM is short of resources, it can be relocated dynamically to another machine with more resources. When capacities are stretched, additional copies of an existing VM can be cloned rapidly and deployed to other available hardware resources to increase overall service capacity. Instantaneous load considerations are a notorious challenge in the IT administrative world. Grid engines, as applied on distributed virtualized

servers, where spare resources are held in reserve, can be used to spawn many instances of a given application dynamically to meet increased load or demand.

CLIENT SECURITY AND MOBILITY
On the client, virtualization offers various opportunities for enhanced security, manageability, greater worker mobility, and increased robustness of client devices. Virtualization of clients is also made possible through the hosting of multiple client operating system instances on a modern server-class system. Offering each client environment as a virtualized system instance located on a server in the data center provides the user with a modern-day equivalent of the thin client. Mobility of users is a direct result of their ability to access their virtualized workload remotely from any client endpoint. Sun’s Sun Ray system is an example of one such implementation.

Increased security of data, applications and their context of use, and reduced overall cost of administration for client systems are important aspects of this technology. Enhanced reliability and security can be achieved, for example, by embedding function-specific, hidden VMs on a user’s PC, where the VM has been designed to monitor traffic, implement “embedded IT” policies, or the like. The packaging of applications and operating-system images into portable appliances also provides a powerful metaphor for portability of application state: Simply copying a suspended VM to a memory stick allows the user to carry running applications to any virtualization-ready device. VMware’s free Player application is a thin, client-side virtualization “player” that has the ability to execute a packaged VM. Examples include prepackaged secure Web browsers that can be discarded after per-session use (to obtain greater security) and secured, user-specific or enterprise-specific applications.

CONCLUSION
The use of virtualization portends many further opportunities for security and manageability on the client. The examples presented here only begin to illustrate the ways in which virtualization can be applied. Virtualization represents a basic change in the architecture of both systems software and the data center. It offers some important opportunities for cost savings and efficiency in computing infrastructure, and for centralized administration and management of that infrastructure for both servers and clients. We expect it to change the development, testing, and delivery of software fundamentally, with some immediate application in the commercial and enterprise context. Q

ACKNOWLEDGMENTS
We are particularly indebted to the team at the University of Cambridge, including Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Derek McAuley, Rolf Neugebauer, Ian Pratt, Andrew Warfield, and Matt Williamson, who have developed and evolved the Xen system. This article reports on their work.

REFERENCES
1. Gum, P. H. 1983. System/370 extended architecture: Facilities for virtual machines. IBM Journal of Research and Development 27(6): 530-544.
2. Seawright, L., MacKinnon, R. 1979. VM/370—a study of multiplicity and usefulness. IBM Systems Journal 18(1): 4-17.
3. Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., Warfield, A. 2003. Xen and the art of virtualization. In Proceedings of the 19th ACM SOSP (October): 164-177.
4. Fraser, K., Hand, S., Neugebauer, R., Pratt, I., Warfield, A., Williamson, M. 2004. Safe hardware access with the Xen virtual machine monitor. Cambridge, UK: University of Cambridge Computer Laboratory; www.cl.cam.ac.uk/research/srg/netos/papers/2004-oasis-ngio.pdf.

LOVE IT, HATE IT? LET US KNOW
feedback@acmqueue.com or www.acmqueue.com/forums

SIMON CROSBY is CTO of XenSource, where he is responsible for XenEnterprise R&D, technology leadership, and product management, and maintaining a close affiliation with the Xen project run by Ian Pratt, the founder of XenSource. Crosby was a principal engineer at Intel, where he led research in distributed autonomic computing and platform security and trust. Before Intel, Simon founded CPlane Inc., a network optimization software vendor. He was a tenured faculty member at the University of Cambridge, where he led research on network performance and control, and multimedia operating systems.
DAVID BROWN is a member of the Solaris Engineering group at Sun Microsystems. He led the Solaris ABI compatibility program and more recently has worked on several projects to support Sun’s AMD x64- and Intel-based platforms. Earlier he was a founder of Silicon Graphics and the Workstation Systems Engineering group at Digital Equipment Corporation. He introduced and described the unified memory architecture approach for high-performance graphics hardware in his Ph.D. dissertation at the University of Cambridge.
© 2006 ACM 1542-7730/06/1200 $5.00

more queue: www.acmqueue.com ACM QUEUE December/January 2006-2007 41


BETTER,
FASTER,
MORE
SECURE
BRIAN CARPENTER, IBM and INTERNET ENGINEERING TASK FORCE
Who’s in charge of the Internet’s future?

Since I started a stint as chair of the IETF (Internet Engineering Task Force)
in March 2005, I have frequently been asked, “What’s coming next?” but I
have usually declined to answer. Nobody is in charge of the Internet, which
is a good thing, but it makes predictions difficult (and explains why this
article starts with a disclaimer: It represents my views alone and not those
of my colleagues at either IBM or the IETF).
The reason the lack of central control is a good thing is that it has
allowed the Internet to be a laboratory for innovation throughout its
life—and it’s a rare thing for a major operational system to serve as its own
development lab. As the old metaphor goes, we frequently change some of
the Internet’s engines in flight.
This is possible because of a few of the Internet’s basic goals:
• Universal connectivity—anyone can send packets to anyone.
• Applications run at the edge—so anyone can install and offer services.
• “Cheap and cheerful” core technology—so transmission is cheap.
• Natural selection—no grand plan, but good technology survives and the
rest dies.
Of course, this is an idealistic view. In recent years, firewalls and network
address translators have made universal connectivity sticky. Some telecom-
munications operators would like to embed services in the network. Some
transmission technologies try too hard, so they are not cheap. Until now,
however, the Internet has remained a highly competitive environment and
natural selection has prevailed, even though there have been attempts to
protect incumbents by misguided regulation.
In this environment of natural selection, predicting technology trends
is very hard. The scope is broad—the IETF considers specifications for how



IP runs over emerging hardware media, maintenance and improvements to IP itself and to transport protocols including the ubiquitous TCP, routing protocols, basic application protocols, network management, and security. A host of other standards bodies operate in parallel with the IETF.

To demonstrate the difficulty of prediction, let's consider only those ideas that get close enough to reality to be published within the IETF; that's about 1,400 new drafts per year, of which around 300 end up being published as IETF requests for comments (RFCs). By an optimistic rough estimate, at most 100 of these specifications will be in use 10 years later (i.e., 7 percent of the initial proposals). Of course, many other ideas are floated in other forums such as ACM SIGCOMM. So, anyone who agrees to write about emerging protocols has at least a 93 percent probability of writing nonsense.

What would I have predicted 10 years ago? As a matter of fact, I can answer that question. In a talk in May 1996 I cautiously quoted Lord Kelvin, who stated in 1895 that "heavier-than-air flying machines are impossible," and I incautiously predicted that CSCW (computer-supported collaborative work), such as packet videoconferencing and shared whiteboard, would be the next killer application after the Web, in terms of bandwidth and realtime requirements. I'm still waiting.

A little earlier, speaking to an IBM user meeting in 1994 (before I joined IBM), I made the following specific predictions:
• Desktop client/server is the whole of computing. The transaction processing model is unhelpful.
• Cost per plug of LAN will increase.
• Internet and IPX will merge and dominate.
• Desktop multimedia is more than a gimmick, but only part of desktop computing.
• Wireless mobile PCs will become very important.
• Network management (including manageable equipment and cabling) is the major cost.
Well, transaction processing is more important in 2006 than it has ever been, and IPX has just about vanished. The rest, I flatter myself, was reasonably accurate.

All of this should make it plain that predicting the future of the Internet is a mug's game. This article focuses on observable challenges and trends today.

HALT, WHO GOES THERE?
The original Internet goal that anyone could send a packet to anyone at any time was the root of the extraordinary growth observed in the mid-1990s. To quote Tim Berners-Lee, "There's a freedom about the Internet: As long as we accept the rules of sending packets around, we can send packets containing anything to anywhere."1 As with all freedoms, however, there is a price. It's trivial to forge the origin of a data packet or of an e-mail message, so the vast majority of traffic on the Internet is unauthenticated, and the notion of identity on the Internet is fluid. Anonymity is easy. When the Internet user community was small, it exerted enough social pressure on miscreants that this was not a major problem area. Over the past 10 years, however, spam, fraud, and denial-of-service attacks have become significant social and economic problems. Thus far, service providers and enterprise users have responded largely in a defensive style: firewalls to attempt to isolate themselves, filtering to eliminate unwanted or malicious traffic, and virtual private networks to cross the Internet safely.

These mechanisms are not likely going away, but what seems to be needed is a much more positive approach to security: Identify and authenticate the person or system you are communicating with, authorize certain actions accordingly, and if needed, account for usage. The term of art is AAA (authentication, authorization, accounting).

AAA is needed in many contexts and may be needed at several levels for the same user session. For example, a user may first need to authenticate to the local network provider. A good example is a hotel guest using the hotel's wireless network. The first attempt to access the Internet may require the user to enter a code supplied by the front desk. In an airport, a traveler may have to supply a credit card number to access the Internet or use a preexisting account with one of the network service providers that offer connectivity. A domestic ADSL customer normally authenticates to a service provider, too. IETF protocols such as EAP (Extensible Authentication Protocol) and RADIUS (Remote Authentication Dial-in User Service) are used to mediate these AAA interactions. This form of AAA, however, authenticates the user only as a sender and receiver of IP packets, and it isn't used at all where free service is provided (e.g., in a coffee shop).

Often (e.g., for a credit card transaction) the remote server needs a true identity, which must be authenticated by some secret token (in a simple solution, a PIN code transmitted over a secure channel). But the merchant who takes the money may not need to know that true identity, as long as a trusted financial intermediary verifies it. Thus, authentication is not automatically the enemy of privacy.

Cryptographic authentication is a powerful tool. Just as it can be used to verify financial transactions, it can in theory be used to verify any message on the Internet. Why, then, do we still have spoofed e-mail and even spoofed individual data packets?

For individual data packets, there is a solution known as IPsec (IP security), which is defined in a series of IETF specifications and widely (but not universally) implemented. It follows the basic Internet architecture known as the end-to-end principle: Do not implement a function inside the network that can be better implemented in the two end systems of a communication. For two systems to authenticate (or encrypt) the packets they send to each other, they have only to use IPsec and to agree on the secret cryptographic keys. So why is this not in universal usage? There are at least three reasons:
• Cryptographic calculations take time during the sending and receiving of every packet. This overhead is not always acceptable except for very sensitive applications.
• Management of cryptographic keys has proved to be a hard problem, and usually requires some sort of preexisting trust relationship between the parties.
• Traversing firewalls and network address translators adds complexity and overhead to IPsec.

Thus, IPsec deployment today is limited mainly to virtual private network deployments where the overhead is considered acceptable, the two ends are part of the same company so key management is feasible, and firewall traversal is considered part of the overhead. More general usage of IPsec may occur as concern about malware within enterprise networks rises and as the deployment of IPv6 reduces the difficulties caused by network address translation.

For e-mail messages, mechanisms for authentication or encryption of whole messages have existed for years (known as S/MIME and PGP). Most people don't use them. Again, the need for a preexisting trust relationship appears to be the problem. Despite the annoyance of spam, people want to be able to receive mail from anybody without prior arrangement. Operators of Internet services want to receive unsolicited traffic from unknown parties; that's how they get new customers. A closed network may be good for some purposes, but it's not the Internet.

It's worth understanding that whereas normal end users can at worst send malicious traffic (such as denial-of-service attacks, viruses, and fraudulent mail), an ISP can in theory spy on or reroute traffic, or make one server simulate another. A hardware or software maker can in theory insert "back doors" in a product that would defeat almost any security or privacy mechanism. Thus, we need trustworthy service providers and manufacturers, and we must be very cautious about downloaded software.

To summarize the challenges in this area:
• How can identity be defined, authenticated, and kept private?
• How can trust relationships be created between arbitrary sets of parties?
• How can cryptographic keys be agreed upon between the parties in a trust relationship?
• How can packet origins be protected against spoofing at line speed?
• How can we continue to receive messages from unknown parties without continuing to receive unwanted messages?

The IETF is particularly interested in the last three questions. Work on key exchange has resulted in the IKEv2 standard (Internet key exchange, version 2), and work continues on profiles for use of public-key cryptography with IPsec and IKEv2. At the moment, the only practical defense against packet spoofing is to encourage ingress filtering by ISPs; simply put, that means that an ISP should discard packets from a customer's line unless they come from an IP address assigned to that customer. This eliminates spoofing only if every ISP in the world plays the game, however. Finally, the problem of spam prevention remains extremely hard; there is certainly no silver bullet that will solve this problem. At the moment the IETF's contribution is to develop the DKIM (Domain Keys Identified Mail) specification. If successful, this effort will allow a mail-sending domain to take responsibility, using digital signatures, for having taken part in the transmission of an e-mail message and to publish "policy" information about how it applies those signatures. Taken together, these measures will assist receiving domains in detecting (or ruling out) certain forms of spoofing as they pertain to the signing domain.

We are far from done on Internet security. We can expect new threats to emerge constantly, and old threats will mutate as defenses are found.

WHERE DID MY PACKET GO?
The Internet has never promised to deliver packets; technically, it is an "unreliable" datagram network, which may and does lose a (hopefully small) fraction of all packets. By the end-to-end principle, end systems are required to detect and compensate for missing packets. For reliable data transmission, that means retransmission, normally performed by the TCP half of TCP/IP. Users will see such retransmission, if they notice it at all, as a performance glitch. For media streams such as VoIP, packet loss will often be compensated for by a codec—but a burst of packet loss will result in broken speech or patchy video. For this reason, the issue of QoS (quality of service) came to the fore some years ago, when audio and video codecs first became practical. It remains a challenge.

One aspect of QoS is purely operational. The more competently a network is designed and managed, the better the service will be, with more consistent performance and fewer outages. Although unglamorous, this is probably the most effective way of providing good QoS.

Beyond that, there are three more approaches to QoS, which can be summarized as:
• Throw bandwidth at the problem.
• Reserve bandwidth.
• Operate multiple service classes.

The first approach is based on the observation that both in the core of ISP networks and in properly cabled business environments, raw bandwidth is cheap (even without considering the now-historical fiber glut). In fact, the only place where bandwidth is significantly limited is in the access networks (local loops and wireless networks). Thus, most ISPs and businesses have solved the bulk of their QoS problem by overprovisioning their core bandwidth. This limits the QoS problem to access networks and any other specific bottlenecks.

The question then is how to provide QoS management at those bottlenecks, which is where bandwidth reservations or service classes come into play. In the reservation approach, a session asks the network to assign bandwidth all along its path. In this context, a session could be a single VoIP call, or it could be a semi-permanent path between two networks. This approach has been explored in the IETF for more than 10 years under the name of "Integrated Services," supported by RSVP (Resource Reservation Protocol). Even with the rapid growth of VoIP recently, RSVP has not struck oil—deployment seems too clumsy. A related approach, however—building virtual paths with guaranteed bandwidth across the network core—is embodied in the use of MPLS (MultiProtocol Label Switching). In fact, a derivative of RSVP known as RSVP-TE (for traffic engineering) can be used to build MPLS paths with specified bandwidth. Many ISPs are using MPLS technology.

MPLS does not solve the QoS problem in the access networks, which by their very nature are composed of a rapidly evolving variety of technologies (ADSL, CATV, various forms of Wi-Fi, etc.). Only one technology is common to all these networks: IP itself. Therefore, the final piece of the QoS puzzle works at the IP level. Known as Differentiated Services, it is a simple way of marking every packet for an appropriate service class, so that VoIP traffic can be handled with less jitter than Web browsing, for example. Obviously, this is desirable from a user viewpoint, and it's ironic that the more extreme legislative proposals for so-called "net neutrality" would ostensibly outlaw it, as well as outlawing priority handling for VoIP calls to 911.

The challenge for service providers is how to knit the four QoS tools (competent operation, overprovision of bandwidth, traffic engineering, and differentiated services) into a smooth service offering for users. This challenge is bound up with the need for integrated network management systems, where not only the IETF but also the DMTF (Distributed Management Task Force), TMF (TeleManagement Forum), ITU (International Telecommunication Union), and other organizations are active. This is an area where we have plenty of standards, and the practical challenge is integrating them.

However, the Internet's 25-year-old service model, which allows any packet to be lost without warning, remains; and transport and application protocols still have to be designed accordingly.

BACK TO THE FUTURE
As previously mentioned, MPLS allows operators to create virtual paths, typically used to manage traffic flows across an ISP backbone or between separate sites in a large corporate network. At first glance, this revives an old controversy in network engineering—the conflict between datagrams and virtual circuits. More than three decades ago this was a major issue. At that time, conventional solutions depended on end-to-end electrical circuits (hardwired or switched, and multiplexed where convenient).
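The practical difference between the two models shows up in what a forwarding element must do for each packet: a datagram router performs a longest-prefix match against its full routing table, while a virtual-circuit switch such as an MPLS node does a single exact-match label swap along a path that was set up in advance. A toy sketch makes the contrast concrete (Python; all prefixes, labels, and interface names here are invented purely for illustration):

```python
import ipaddress

# Datagram forwarding: longest-prefix match over a routing table.
# Every router along the path repeats this decision for every packet,
# and the table must hold a route for every reachable prefix.
routing_table = {
    ipaddress.ip_network("10.0.0.0/8"): "if0",
    ipaddress.ip_network("10.1.0.0/16"): "if1",
    ipaddress.ip_network("10.1.2.0/24"): "if2",
}

def ip_lookup(dst: str) -> str:
    addr = ipaddress.ip_address(dst)
    matches = [net for net in routing_table if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)  # most specific wins
    return routing_table[best]

# Virtual-circuit forwarding: exact-match label swap, as in MPLS.
# The path was established once by signaling; per-packet work is one lookup.
label_table = {
    17: (42, "if1"),  # incoming label 17 -> outgoing label 42 via if1
    23: (99, "if2"),
}

def label_lookup(label: int) -> tuple:
    return label_table[label]

print(ip_lookup("10.1.2.3"))  # falls in all three prefixes; the /24 wins
print(label_lookup(17))
```

The label table needs entries only for the paths set up through this particular switch, whereas the datagram table must know a route for everything, which is part of why the growth of the global routing table remains such a persistent worry.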


The notion of packet switching, or datagrams, was introduced in 1962 by Paul Baran and became practicable from about 1969 when the ARPANET started up. To the telecommunications industry, it seemed natural to combine the two concepts (i.e., send the packets along a predefined path, which became known as a virtual circuit). The primary result was the standard known as X.25, developed by the ITU.

To the emerging computer networking community, this seemed to add pointless complexity and overhead. In effect, this controversy was resolved by the market, with the predominance of TCP/IP and the decline of X.25 from about 1990 onward. Why then has the virtual circuit approach reappeared in the form of MPLS?

First, you should understand that MPLS was developed by the IETF, the custodian of the TCP/IP standards, and it is accurate to say that the primary target for MPLS is the transport of IP packets. Data still enters, crosses, and leaves the Internet encased in IP packets. Within certain domains, however, typically formed by single ISPs, the packets will be routed through a preexisting MPLS path. This has two benefits:
• At intermediate switches along the path, switching will take place at full hardware speed (today electronically, tomorrow optically), without any complex routing decisions being made at line speed.
• The path itself is established with whatever security, bandwidth, and QoS characteristics it is considered to need, using network management techniques known collectively as "traffic engineering."

Note that not all experts are convinced by these benefits. Modern IP routers are hardly slow, and as previously noted, QoS may in practice not be a problem in the network core. Most ISPs insist on the need for such traffic engineering, however.

Even with MPLS virtual paths, the fundamental unit of transmission on the Internet remains a single IP packet.

THE EVERLASTING PROBLEM
In 1992, the IETF's steering group published a request for comments2 that focused on two severe problems facing the Internet at that time (just as its escape from the research community into the general economy was starting): first, IP address space exhaustion; and second, routing table explosion. The first has been contained by a hack (network address translation) and is being solved by the growing deployment of IPv6 with vastly increased address space. The second problem is still with us. The number of entries in the backbone routing tables of the Internet was below 20,000 in 1992 and is above 250,000 today (see figure 1).

[FIG 1: Growth of the BGP Table, 1994 to Present. The chart plots active BGP entries (FIB) by date from '95 through '06, on a scale of 0 to 300,000.]

This is a tough problem. Despite the optimistic comment about router speed, such a large routing table needs to be updated dynamically on a worldwide basis. Furthermore, it currently contains not only one entry for each address block assigned to any ISP anywhere in the world, but also an entry for every user site that needs to be "multihomed" (connected simultaneously to more than one ISP for backup or load-sharing). As more businesses come to rely on the network, the number of multihomed sites is expected to grow dramatically, with a serious estimate of 10 million by 2050. A routing table of this size is not considered feasible. Even if Moore's law solves the storage and processing challenge at reasonable cost, the rate of change in a table of that size could greatly exceed the rate at which routing updates could be distributed worldwide. Although we have known about this problem for more than 10 years, we are still waiting for the breakthrough ideas that will solve it.

MULTIPLE UNIVERSES?
The telecommunications industry was fundamentally surprised by the Internet's success in the 1990s and then fundamentally shaken by its economic consequences. Only now is the industry delivering a coherent response, in the form of the ITU's NGN (Next Generation Networks) initiative launched in 2004. NGN is to a large extent founded on IETF standards, including IP, MPLS, and SIP (Session Initiation Protocol), which is the foundation of standardized VoIP and IMS (IP Multimedia Subsystem). IMS was developed for third-generation cellphones but is now the basis for what ITU calls "fixed-mobile convergence." The basic principles of NGN are:3
• IP packet-based transport using MPLS
• QoS-enabled
• Embedded service-related functions—layered on top of transport or based on IMS
• User access to competing service providers
• Generalized mobility

At this writing, the standardization of NGN around these principles is well advanced. Although it is new for the telecommunications industry to layer services on top rather than embedding them in the transport network, there is still a big contrast with the Internet here: Internet services are by definition placed at the edges and are not normally provided by ISPs as such. The Internet has a history of avoiding monopoly deployments; it grows by spontaneous combustion, which allows natural selection of winning applications by the end users. Embedding service functions in the network has never worked in the past (except for directories). Why will it work now?

COME JOIN THE DANCE
It should be clear from this superficial and partial personal survey that we are still having fun developing the technology of the Internet, and that the party is far from over. The Internet technical community has succeeded by being open—and open-minded. Any engineer who wants to join in can do so. The IETF has no membership requirements; anyone can join the mailing list of any working group, and anyone who pays the meeting fee can attend IETF meetings. Decisions are made by rough consensus, not by voting. The leadership committees in the IETF are drawn from the active participants by a community nomination process. Apart from meeting fees, the IETF is supported by the Internet Society.

Any engineer who wants to join in can do so in several ways: by supporting the Internet Society (http://www.isoc.org), by joining IETF activities of interest (http://www.ietf.org), or by contributing to research activities (http://www.irtf.org and, of course, ACM SIGCOMM at http://www.acm.org/sigs/sigcomm/). Q

REFERENCES
1. Berners-Lee, T. 1999. Weaving the Web. San Francisco: HarperCollins.
2. Gross, P., Almquist, P. 1992. IESG deliberations on routing and addressing. RFC 1380 (November). DDN Network Information Center; http://www.rfc-archive.org/getrfc.php?rfc=1380.
3. Based on a talk by Keith Knightson. 2005. Basic NGN architecture principles and issues; http://www.itu.int/ITU-T/worksem/ngn/200505/program.html.

ACKNOWLEDGMENTS
Thanks to Bernard Aboba and Stu Feldman for valuable comments on a draft of this article.

LOVE IT, HATE IT? LET US KNOW
feedback@acmqueue.com or www.acmqueue.com/forums

BRIAN E. CARPENTER is an IBM Distinguished Engineer working on Internet standards and technology. Based in Switzerland, he became chair of the IETF (Internet Engineering Task Force) in March 2005. Before joining IBM, he led the networking group at CERN, the European Laboratory for Particle Physics, from 1985 to 1996. He served from March 1994 to March 2002 on the Internet Architecture Board, which he chaired for five years. He also served as a trustee of the Internet Society and was chairman of its board of trustees for two years until June 2002. He holds a first degree in physics and a Ph.D. in computer science, and is a chartered engineer (UK) and a member of the IBM Academy of Technology.
© 2006 ACM 1542-7730/06/1200 $5.00


book reviews

Sustainable Software Development: An Agile Perspective
Kevin Tate, Addison-Wesley Professional, 2005, $39.99, ISBN: 0321286081

Our software engineering community has for decades flirted with the idea of applying the rigor of other engineering disciplines to the development of software. This book boldly argues against this metaphor. Buildings are expensive to modify and typically static, whereas software is cheap to modify and evolves over its lifetime.

Instead, author Kevin Tate argues that an appropriate metaphor is a coral reef: an ecosystem of developers, customers, suppliers, distributors, and competitors that live on top of the software, in the same way that a reef's organisms live around the coral. Both the coral and the software evolve with their surrounding ecosystems.

The book distinguishes itself from other agile programming books by taking a wider view of the field, covering not only the project management side of agile practices, but also developer collaboration and technical excellence. It starts by arguing that the goal of sustainability comes into play by recognizing that a project's progress depends on the competition between negative stresses (user requirements, disruptive technologies and business models, external dependencies, competition, and cost management) and positive controls (collaboration, methodology, expertise, decision making, leadership, culture, and simplicity). When the negative stresses outweigh the counteraction of a project's controls, the project enters into a death spiral of diminishing productivity.

The remainder of the book is organized around a chapter for each of the four principles that should guide sustainable development: defect prevention, a working product, emphasis on design, and continual refinement. With another apt metaphor, Tate advises developers to juggle the four principles of sustainable development while working on product features and fixing bugs.

The text is full of interesting ideas and illuminating sidebars discussing real-world cases, but as a result the reader can occasionally get lost among them, losing focus on the argument and the course of thought. Nevertheless, this is a book that both developers and managers will appreciate and value. Its advice is important, understandable, and practical: a gift to the software engineering community. —D. Spinellis

Hacking Exposed: Web Applications, 2nd edition
Joel Scambray, Mike Shema, Caleb Sima, McGraw-Hill Osborne Media, 2006, $49.99, ISBN: 0072262990

Many years ago, the "Hacking Exposed" book series started covering security from a hacker's perspective. Since the security landscape has become more complex, the series now covers the multiple facets of network and system security, and includes books on specific systems (Linux, Windows, Cisco), as well as wireless networks (forthcoming in 2007).

This book is dedicated to the security of Web applications and associated service deployment architectures. It is written from an attacker's point of view and follows the basic steps that an attacker takes. It starts with a description of the reconnaissance phase (fingerprinting the application and the supporting Web server) and moves on to more intrusive attacks, roughly divided into those against authentication methods, bypassing authorization mechanisms, abusing the input-validation procedures, and stealing sensitive information. These subjects are presented well, both from a conceptual point of view and through examples drawn from real-world cases. There is also a relatively short chapter addressing the security and vulnerabilities of Web services.

Hacking on the server side is only one viewpoint of Web security. The typical end user may also get hacked just by visiting malicious or compromised Web sites. One chapter in the book reviews the most famous exploits and vulnerabilities related to these issues.

This book meets high expectations. It is fun and easy to read. It covers in sufficient depth the technical details and underlying system and software-specific issues. The technical level is somewhere between intermediate and advanced, thus appealing to a broad range of readers. Webmasters will learn how to check their servers for the most common security flaws, programmers will appreciate the contents on securing their code, and typical readers will get a comprehensive picture of the status of Web security today. I recommend this truly exceptional book to all of these readers. —Radu State

Reprinted from Computing Reviews, © 2006 ACM, http://www.reviews.com


calendar

To announce an event, e-mail queue-ed@acm.org or fax +1-212-944-1318

DECEMBER

LISA (Large Installation System Administration) Conference
Dec. 3-8, 2006, Washington, D.C.
http://www.usenix.org/events/lisa06/

Web Builder 2.0
Dec. 4-6, 2006, Las Vegas, Nevada
http://www.ftponline.com/conferences/webbuilder/2006/

ICSOC (International Conference on Service-Oriented Computing)
Dec. 4-7, 2006, Chicago, Illinois
http://www.icsoc.org/

Search Engine Strategies
Dec. 4-7, 2006, Chicago, Illinois
http://searchenginestrategies.com/sew/chicago06/

XML Conference
Dec. 5-7, 2006, Boston, Massachusetts
http://2006.xmlconference.org/

The Spring Experience
Dec. 7-10, 2006, Hollywood, Florida
http://thespringexperience.com/

Web Design World
Dec. 11-13, 2006, Boston, Massachusetts
http://www.ftponline.com/conferences/webdesignworld/2006/boston/

JANUARY

Macworld
Jan. 8-12, 2007, San Francisco, California
http://www.macworldexpo.com/launch/

POPL (Symposium on Principles of Programming Languages)
Jan. 17-19, 2007, Nice, France
http://www.cs.ucsd.edu/popl/07/

IUI (International Conference on Intelligent User Interfaces)
Jan. 28-31, 2007, Honolulu, Hawaii
http://www.iuiconf.org/

FEBRUARY

Designing and Building Ontologies
Feb. 5-8, 2007, Washington, D.C.
http://www.wilshireconferences.com/seminars/Ontologies/

RSA Conference
Feb. 5-9, 2007, San Francisco, California
http://www.rsaconference.com/2007/us/

SCALE 5x (Southern California Linux Expo)
Feb. 10-11, 2007, Los Angeles, California
http://www.socallinuxexpo.org/scale5x/

FAST (Usenix Conference on File and Storage Technologies)
Feb. 13-16, 2007, San Jose, California
http://www.usenix.org/events/fast07/

LinuxWorld OpenSolutions Summit
Feb. 14-15, 2007, New York, New York
http://www.linuxworldexpo.com/live/14/

Gartner Business Process Management Summit
Feb. 26-28, 2007, San Diego, California
http://www.gartner.com/2_events/conferences/bpm3.jsp

Black Hat Briefings and Trainings
Feb. 26-Mar. 1, 2007, Washington, D.C.
http://www.blackhat.com/html/bh-dc-07/bh-dc-07-index.html

ETel (Emerging Telephony Conference)
Feb. 27-Mar. 1, 2007, Burlingame, California
http://conferences.oreillynet.com/etel2007/

MARCH

DAMA (Data Management Association) International Symposium and Wilshire Meta-Data Conference
Mar. 4-8, 2007, Boston, Massachusetts
http://www.wilshireconferences.com/MD2007/

Game Developers Conference
Mar. 5-9, 2007, San Francisco, California
http://www.gdconf.com/

TheServerSide Java Symposium
Mar. 21-23, 2007, Las Vegas, Nevada
http://javasymposium.techtarget.com/lasvegas/



CLASSIFIED

St. Mary's College of Maryland — Tenure-Track Assistant Professor Positions
Two assistant-level tenure-track positions in Computer Science at St. Mary's College of Maryland—a Public Liberal Arts College—starting Fall 2007. Industrial experience and a demonstrated ability to attract and retain students from underrepresented groups are desired. Further details at: http://www.smcm.edu/nsm/mathcs/cs07.html. AA/EOE.

Netflix
Netflix is looking for great engineers! Do you want to work with talented people who are motivated by making a difference and interested in solving tough problems? Are you a web-savvy software engineer, developer or designer? No matter how your mind works, we have a job opening for you. For more details, please come to our website at http://www.netflix.com/Jobs and submit an application.

AAI Services Corporation — Sr Software Engineer
AAI Services Corporation is seeking Software Eng 4 at McLean, VA location. BS and 10 years C++ and some Linux required. OpenGL or 3D graphics preferred. US Citizen, EOE. Comprehensive benefits, competitive salary.

The D. E. Shaw Group — Software Developer
The D. E. Shaw group is looking for top-notch, innovative software developers to help it expand its tech venture and proprietary trading activities. We're a global investment and technology development firm with approximately US $25 billion in aggregate investment capital and a decidedly different approach to doing business. The application of advanced technology is an integral part of virtually everything we do, from developing computationally intensive strategies for trading in securities markets around the globe to designing a supercomputer intended to fundamentally transform the process of drug discovery. Developers at the firm work on a variety of interesting technical projects including real-time data analysis, distributed system development, and the creation of tools for mathematical modeling. They also enjoy access to some of the most advanced computing resources in the world. If you're interested in applying your intellect to challenging problems of software architecture and engineering in a stimulating, fast-paced environment, then we'd love to see your resume. To apply, e-mail your resume to ACM-SNowak@career.deshaw.com. EOE.

Winona State University — Computer Science Department
The Computer Science Department at Winona State University invites applications for a tenure-track faculty position on its Rochester campus, to begin Fall 2007. A PhD or ABD in Computer Science or a closely related field required. We are particularly interested in candidates with specialization in bioinformatics, biomedical informatics, database, and data mining; candidates in all areas of CS will be considered and are encouraged to apply. Rochester, a vibrant, diverse city located in SE Minnesota, offers many opportunities for collaboration and/or employment for partners at Mayo Clinic, 20+ software companies including IBM, and Rochester Community and Technical College, among others. Review begins 1/16/07. For a full position description and application procedure, see http://www.winona.edu/humanresources. AA/EOE

University of North Carolina at Charlotte — Department of Software and Information Systems
Two tenure-track faculty positions available at the associate/assistant professor level. The Department is dedicated to research and education in Computing with emphasis in Information Security & Assurance and Information Integration & Environments. The Department offers degrees at the Bachelor, Master, and Ph.D. levels. Faculty candidates with strong research expertise in Software Engineering, Trusted Software Development, Trusted Information Infrastructures, and Information Security and Privacy are encouraged to apply. Highly qualified candidates in other areas will also be considered. Salary will be highly competitive. Applicants must have a Ph.D. in Computer Science, Information Technology, Software Engineering, or a related field, as well as a strong commitment to research and education. For further details please visit http://www.sis.uncc.edu. Application review will start in January 2007. Please send a detailed CV together with four references, copies of scholarly publications, and other supporting documents to search-sis@uncc.edu. All materials need to be electronically submitted as separate PDF file attachments. References must be sent directly. Women, minorities and individuals with disabilities are encouraged to apply. UNC Charlotte is an Equal Opportunity/Affirmative Action employer.

American University of Beirut — Department of Computer Science
The Department of Computer Science at the American University of Beirut invites applications for faculty positions at all levels. Candidates should have a Ph.D. in computer science or a related discipline, and a strong research record. All positions are normally at the Assistant Professor level to begin September 15, 2007, but appointments at higher ranks and/or visiting appointments may also be considered. Appointments are for an initial period of three years. The usual teaching load is not more than nine hours a week. Sabbatical visitors are welcome. The language of instruction is English. For more information please visit http://www.aub.edu.lb/~webfas/ Interested applicants should send a letter of application and a CV, and arrange for three letters of reference to be sent to: Dean, Faculty of Arts and Sciences, American University of Beirut, c/o New York Office, 3 Dag Hammarskjold Plaza, 8th Floor, New York, NY 10017-2303, USA or Dean, Faculty of Arts and Sciences, American University of Beirut, P.O.Box 11-0236, Riad El-Solh, Beirut 1107 2020, Lebanon. Electronic submissions may be sent to: as_dean@aub.edu.lb. All application materials should be received by December 29, 2006. The American University of Beirut is an Affirmative Action, Equal Opportunity Employer.
curmudgeon

Continued from page 56

with what seem now to be ridiculously frugal resources. Maurice Wilkes, David Wheeler, and Stan Gill had written the first book on programming.4 This revered, pioneering trio are generally acknowledged as the co-inventors of the subroutine and relocatable code. As with all the most sublime of inventions, it's difficult to imagine the world without a call/return mechanism. Indeed, I meet programmers, whose parasitic daily bread is earned by invoking far-flung libraries, who have never paused to ponder with gratitude that the subroutine concept needed the brightest heaven of invention. Although no patents for the basic subroutine mechanism were sought (or even available) back then, a further sign of changing times is that patents are now routinely [sic] awarded for variations on the call/return mechanism, as well as for specific subroutines.5

David Wheeler died suddenly in 2004 after one of his daily bicycle rides to the Cambridge Computer Labs. It's quite Cantabrigian to "die with your clips on." I had the sad pleasure of attending David's memorial service and learning more of his extensive work in many areas of computing.6

Other innovations from the Cambridge Mathematical Laboratory in the early 1950s included Wilkes's paper introducing the concept of microprogramming. On a more playful note was the XOX program written by my supervisor A. S. (Sandy) Douglas. This played (and never lost!) tic-tac-toe (also known as OXO)—a seemingly trivial pursuit, yet one with enormous, unpredicted consequences. XOX was the very first computer game with an interactive CRT display, the challenge being not the programming logic, of course, but the fact that the CRT was designed and wired for entirely different duties. Little could anyone guess then that games and entertainment would become the dominant and most demanding applications for computers. Can anyone gainsay this assertion? One would need to add up all the chips, MIPS, terabytes, and kid-hours (after defining kid), so I feel safe in my claim. Discuss! If you insist, I can offer a weaselly cop-out: Games and entertainment are now among the most dominant and demanding applications for computers.

Cue in some computer-historic Pythonesque clichés: "We had it tough in them days, folks. The heat, mud, dust, and flies. Try telling the young'uns of today—they just don't believe yer. And did I mention the heat? 3,000 red-hot Mullard valves. All of us stripped down t' waist—even t' men!" Then an older old soldier would intervene: "512 words? You were lucky! All we had were two beads on a rusty abacus!" More ancient cries of disbelief: "An abacus? Sheer luxury! We had to dig out our own pebbles from t' local quarry. We used to dream of having an abacus..."

In truth, adversity did bring its oft-touted if not so sweet usages. Programming at the lower levels with limited memory constantly "focused the mind"—you were nearer the problem, every cycle had to earn its keep, and every bit carry [sic!] its weight in expensive mercury, as it were. The programming cycle revolved thus: hand-write the code on formatted sheets; punch your tape on a blind perforator (the "prayer" method of verification was popular, whence quips about the trademark Creed); select and collate any subroutines (the library was a set of paper tapes stored in neat white boxes); wait in line at the tape reader (this was before the more efficient "cafeteria" services were introduced); then finally collect and print your output tape (if any). All of which combined to impose a stricter discipline on what we now call software development. More attention perhaps than in these agile, interactive days was given to the initial formulation of the program, including "dry runs" on Brunsviga hand calculators. Indeed, the name of our discipline was numerical analysis and automatic computing, only later to be called computer science.7

EDSAC designer Professor (now Sir) Maurice Wilkes was quoted by the Daily Mail, October 1947:

"The brain will carry out mathematical research. It may make sensational discoveries in engineering, astronomy, and atomic physics. It may even solve economic and philosophical problems too complicated for the human mind. There are millions of vital questions we wish to put to it."

A few years later, the Star (June 1949) was reporting:

"The future? The 'brain' may one day come down to our level and help with our income-tax and bookkeeping calculations. But this is speculation and there is no sign of it so far."

Allowing for journalistic license, one can spot early differences between how computing was expected to evolve and how, in fact, things turned out. The enormous impact on scientific research did come about and continues to grow, but the relative pessimism about commercial applications quickly vanished. Indeed, soon after the June 1949 quote ("no sign of it so far"), the UK's leading caterers, food manufacturers, and tea-shop chain, J. (Joe) Lyons & Co., embarked on its LEO (Lyons Electronic Office) project, a business computer based directly on Wilkes's EDSAC designs (with appropriate financial support). I recall visits by the Joe Lyons "suits," who

explained that there was more to their business computing needs than multiplying NumberOfBuns by UnitBunPrice. LEO was running complex valuation, inventory, and payroll applications by 1951, an important root of the global expansion of commercial IT subsequently dominated by IBM.

I now move from praising the pleasantly unpredicted iPod-in-my-pocket to complaining about the overconfident predictions that continue to elude us. Otherwise, this column will fail to qualify as "curmudgeonly." Reconsider the 1947 proposition: "It [meaning IT!] may even solve economic and philosophical problems too complicated for the human mind." One can agree that many complicated "economic" problems have succumbed. Here one must think beyond cutting payroll checks and the inventory control of tea cakes. Although such tasks can be quite complex because of sheer size and the quirks of a volatile real-world rule-set, they remain essentially mechanizable given the certifiable [sic] patience of nitpicking programmers.8

Moving to the wider field of "economic" problems that are truly "too complicated for the human mind," we can acknowledge the progress made in the computer-modeling of such domains as global trade, energy consumption, and climate change. These systems stretch the normal physical laws of cause and effect by having to cope with chaos (the susceptibility to small input changes and measurement accuracies) and the nondeterministic influences of human behavior. Nevertheless, these models are useful in testing possible outcomes for given policy decisions.

Interestingly, Professor Steve Rayner (director of the James Martin Institute, Oxford University) calls the global environmental problem "wicked," a technical term for those complex problems that demand "clumsy" solutions. Clumsy is also a technical term! Briefly, clumsy solutions combine the diverse solutions hinted at by the various competing models in an undogmatic way (see http://www.unsw.edu.au/news/pad/articles/2006/jul/jack_beale_lecture.html). To this extent, the prediction that computers "may even solve economic problems too complicated for the human mind" is unfolding with promise, although the emphasis should really be on "helping us to solve."

Finally, what can we say about the 1947 hope that computers "may even solve philosophical problems too complicated for the human mind." Je dis que NON! The AI and robotics dreams continue to frustrate us even in the reduced realm of simulating what all human minds (and bodies) do naturally as a matter of course without any "complicated" introspection! In particular, I mean the sine-qua-non acquisition and command of natural language. The claimed progress certainly excites some Aivatarians (the folks who apply AI techniques to construct artificial avatars). Chomsky's analogy is telling: The long-jumpers increase their leaps each year, and claim that one day soon they will achieve bird-like flight. Or the parrots, taught to... er... parrot an increasing number of words, will one day (to paraphrase Bertrand Russell) tell you that their parents were poor but honest.

The ACM TechNews (Oct. 20, 2006) reports the creditable achievements of the aivatar (AI avatar) George, programmed by Rollo Carpenter of Televisual. George is a chatbot, a term adding to the glut of bots, derived from the Czech Karel Čapek's play R.U.R. (Rossum's Universal Robot). Robot (coined by Čapek's brother) comes from the Slavonic root for work, but we must not overlook the political satire: robota meant statute labor in the days of Uncle Joe. We now have as many bots in the computer lexicon as wares. Thus, genetically modified veggies are grobots, and humans are knobots (compare with wetware for that mushy stuff between our ears).

George is said to "speak" some 40 languages, having had more than 11 million "conversations" with human online text messages via http://www.jabberwacky.com (note the deliberate variant spelling from Lewis Carroll's poem "Jabberwocky." The hint may be that WACI is emerging as an acronym for wild and crazy ideas).

My own brief exchange with George reminds me of those early versions of Doctor but clearly with the richer range you might expect from a larger database. George's
replies lack any real signs of human intelligence. Discussing Democratic presidential candidates (George's idea), I entered "Hillary Clinton." George replied, "Hilary Clinton is a senator." I said, "She's also a candidate for president." George replied, "Oh yes, I know!!!" But when I asked, "Is she Hilary or Hillary?", George answered, "This does not make sense—how about math?" I said, "Nyet," and George answered: "You're right, that's Russian."

The site has a disclaimer: "Jabberwacky learns the behavior and words of its users. It may use language and produce apparent meanings that some will consider inappropriate. Use this site with discretion, and entirely at your own risk."

The point has long been made that human knowledge and learning relies deeply on our having a corporeal entity able to explore three-dimensional space and some cognitive "devices" to acquire the notions of both place and time. Chatbots that reside in static hardware are rather limited in this respect. Hence the need for a mobile bot, whether it has humanoid features such as Rossum's original robots or not. From "embedded systems" to "embodied"?

For limited, repetitive actions in tightly controlled environments, tremendous progress has been made, as in motor-car assembly. As a devout soccer fan, I'm intrigued by the possibilities of two teams of 11 robots playing the Beautiful Game. In 1997, Hiroaki Kitano launched the annual Robot World Cup, or RoboCup for short.9 (The name Ballbot has been taken up elsewhere for a robot that moves around rather like a gymnast walking while balanced on a large sphere; see http://www.post-gazette.com/pg/06235/715415-96.stm.) By 2002, 29 different countries had entered the RoboCup staged in Fukuoka, Japan, attracting a total audience of 120,000 fans. Peter Seddon describes the game between the Baby Tigers (Japan) and the Dirty Dozen (Germany) as "looking remarkably like a contest between toasters on wheels, while the Four-Legged League (RoboMutts) appeared to spend most of the time sniffing each other's shorts."

Kitano remains optimistic that by 2050 a team of autonomous bots will beat the human World Cup Champions. That's a prediction that's difficult to gainsay. Kitano points out that 50 years after EDSAC, the IBM Deep Blue beat world chess champion Garry (or some prefer Gary) Kasparov. Seddon argues that playing a symbolic "computable" game like chess cannot be compared with the physical complexities of soccer, where the rules appear simple but defy algorithmic precision. "He was bleedin' off-side!" "Oh no, he bleedin' wasn't!"

Seddon reckons that the chances of Kitano's prophecy coming true are about the same as Beckham ever becoming world chess champion. That honor, by the way, has just been achieved by the Russian Vladimir Kramnik but not without some all-too-human, sordid altercations. His rival, the Bulgarian Veselin Topalov, objected to Kramnik's frequent trips to the restroom (or, in chess notation, K x P?). The ultimate insult was that Kramnik had been consulting a computer hidden in the Gents. This was convincingly disproved, but one further irony ensued. Since the normal games ended in a points tie, a soccer-like extra-time had to be played: four games of nail-biting rapid play, which, to the relief of all fair-chess lovers, was won by Kramnik. Q

REFERENCES
1. You, too, can relive those heroic days. Martin Campbell-Kelly (no relation) of Warwick University offers an EDSAC simulator for the PC and Mac; http://www.dcs.warwick.ac.uk/~edsac/Software/EdsacTG.pdf. This site will also point you to the vast EDSAC bibliography.
2. I'm lying for a cheap laugh. In fact, I've never knowingly stolen a file of any kind. As a member of ASCAP (American Society of Composers, Authors, and Publishers), I urge you all to obey the IP (intellectual property) protocols.
3. EOF, possibly overloaded from EndOfFile to ExtremelyOldFart, started life as plain OF (OldFart) in the Jargon File and subsequent versions of the Eric Raymond/Guy Steele Hacker's Dictionary. In the 1980s OF was generally applied (with pride or sarcasm) to those with more than about 25 years in the trenches. It now seems appropriate to define EOFs by stretching the time served to "more than about 50 years."
4. Wilkes, M. V., Wheeler, D. J., Gill, S. 1951. The Preparation of Programs for an Electronic Digital Computer.

New York: Addison-Wesley.
5. For the effective nullification in 1994 of the Supreme Court's 1972 Gottschalk v. Benson decision, which had excluded mathematical algorithms from patent applications, see http://www.unclaw.com/chin/scholarship/software.htm. Andrew Chin discusses the vital topic of computational complexity and its impact on patent-law complexity! Claims made for "effective" algorithms can run afoul of well-known computer-scientific theorems.
6. http://www.cl.cam.ac.uk/UoCCL/misc/obit/wheeler.html.
7. Professor P. B. Fellgett once asked the possibly rhetorical question, "Is computer science?" Professor Dijkstra was dubious, posing the counter-riddle, "Is typewriter science?" thereby proving the correctness of the name "computing science."
8. Readers may have their own counter-anecdotes where apparently trivial business functions turn out to be provably noncomputable. I exclude the epistemological problems of Lex Coddonis: the axioms for normalizing a 100-percent-conforming relational database. I recall programming a UK County Council payroll on the IBM 650, where the police were given a supposedly tax-free clothing allowance that another rule declared taxable, leading to a later refund of the tax paid that was itself taxed and later refunded, ad almost infinitum.
9. Seddon, P. 2005. The World Cup's Strangest Moments. Chrysalis Books.

LOVE IT, HATE IT? LET US KNOW
feedback@acmqueue.com or www.acmqueue.com/forums

STAN KELLY-BOOTLE (http://www.feniks.com/skb/; http://www.sarcheck.com), born in Liverpool, England, read pure mathematics at Cambridge in the 1950s before tackling the impurities of computer science on the pioneering EDSAC I. His many books include The Devil's DP Dictionary (McGraw-Hill, 1981), Understanding Unix (Sybex, 1994), and the recent e-book Computer Language—The Stan Kelly-Bootle Reader (http://tinyurl.com/ab68). Software Development Magazine has named him as the first recipient of the new annual Stan Kelly-Bootle Eclectech Award for his "lifetime achievements in technology and letters." Neither Nobel nor Turing achieved such prized eponymous recognition. Under his nom-de-folk, Stan Kelly, he has enjoyed a parallel career as a singer and songwriter.
© 2006 ACM 1542-7730/06/1200 $5.00

curmudgeon
Will the Real Bots Stand Up?
From EDSAC to iPod—Predictions Elude Us
Stan Kelly-Bootle, Author

When asked which advances in computing technology have most dazzled me since I first coaxed the Cambridge EDSAC 1 1 into fitful leaps of calculation in the 1950s, I must admit that Apple's iPod sums up the many unforeseen miracles in one amazing, iconic gadget. Unlike those electrical nose-hair clippers and salt 'n' pepper mills (batteries not included) that gather dust after a few shakes, my iPod lives literally near my heart, on and off the road, in and out of bed like a versatile lover—except when it's recharging and downloading in the piracy of my own home.2

I was an early iPod convert and remain staggered by the fact that I can pop 40 GB of mobile plug-and-play music and words in my shirt pocket. I don't really mind if the newer models are 80 GB or slightly thinner or can play movies; 40 GB copes easily with my music and e-lecture needs. Podcasts add a touch of potluck and serendipity-doo-dah. Broadcasts from the American public radio stations that I've missed since moving back to England now reach my iPod automatically via free subscriptions and Apple's iTunes software. I've learned to live with that pandemic of "i-catching" prefixes to the point where I've renamed Robert Graves's masterwork "iClaudius," but I digress.

The functional "completeness" of the audio iPod stems from its ideal marriage of hardware and software. The compactness is just right, respecting the scale of human manipulations. The Dick Tracy wristwatch vade mecum failed through over-cram and under-size. The iPod succeeds with a legible alphanumeric screen and that senile-proof, uncluttered, almost minimal, click-wheel user interface. This avoids the input plague of most portable gadgets such as phones, calculators, and PDAs: the minuscule keyboards and buttons. I hasten to deflect the wrath of my daughter-in-law Peggy Sadler and all who have mastered and swear by the Palm Pilot stylus! The click wheel offers circular, serial access to and selection of your titles, but that's a decent compromise when you ponder the problems of searching by keywords. Spoken commands remain, as always, waiting for the next reassuring "breakthrough." I'll return anon to other Next-Big-Fix-Release promises.

Meanwhile, adding still-life pictures, such as cover art, may retain the iPod's simple "completeness," but pushing the device to TV seems to me to break the spell of sound gimcrackery [sic]. Peering at tiny moving pictures is a pointless pain, whereas even modestly priced earphones provide the superb hi-fi we used to dream about when growing up.

The near-exponential improvement of every computing power-performance parameter—physical size, clock speed, storage capacity, and bandwidth, to name the obvious features—is now a cliché of our fair trade. Yet even my older readers3 may need reminding just how bleak things were almost 60 years ago as the world's first stored-program machine (note the Cambridge-chauvinistic singular) moved into action.

The house-size EDSAC was effectively a single-user personal computer—a truly general computing factotum, but as Rossini's Figaro warns: Ahime, che furia! Ahime, che folla! Uno alla volta, per carità! (Heavens, what mayhem! Goodness, what crowds! One at a time, for pity's sake!)

Originally (1947) EDSAC boasted [sic] 512 words of main memory stored in 16 ultrasonic mercury-delay-line tanks, cleverly known as "long" tanks because they were longer than the short tanks used for registers. On the bright side, as we used to quip, each of the 512 words was 18 bits! Forget the word count, feel the width! Alas, for technical reasons, only 17 of the 18 bits were accessible. By 1952, the number of long tanks had doubled, providing a dizzy total of 1-KB words. Input/output was via five-track paper tape, which therefore also served as mass [sic again] storage. Subject only to global timber production, one might see this as virtually unlimited mass storage, although access was strictly slow-serial via 20-characters-per-second tape readers and 10-characters-per-second teletype printers. (By 1958, with EDSAC 2 taking over, paper tape and printer speeds had risen and magnetic tapes had become the standard backup and mass storage medium.)

Although hindsight and nostalgia can distort, one still looks back with an old soldier's pride at the feats achieved
Continued on page 52
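For scale, the memory figures quoted in the column (512 words of 18 bits with 17 usable in 1947, doubling to 1,024 words by 1952) can be set against the author's 40-GB iPod with a back-of-envelope sketch. Only the numbers stated in the text are used; the conversion to bytes and the decimal reading of "40 GB" are my assumptions.

```python
# Back-of-envelope comparison of EDSAC's usable main memory with a
# 40-GB iPod, using the figures quoted in the column: 512 words in
# 1947 (17 usable bits each), 1,024 words by 1952.

WORD_BITS_USABLE = 17  # 18-bit words, but only 17 bits were accessible

def edsac_bytes(words: int) -> float:
    """Usable main-memory capacity in bytes for a given word count."""
    return words * WORD_BITS_USABLE / 8

edsac_1947 = edsac_bytes(512)    # 1,088 bytes
edsac_1952 = edsac_bytes(1024)   # 2,176 bytes
ipod_bytes = 40 * 10**9          # 40 GB, read as decimal gigabytes

print(f"EDSAC 1947: {edsac_1947:,.0f} bytes")
print(f"EDSAC 1952: {edsac_1952:,.0f} bytes")
print(f"iPod / EDSAC 1952: {ipod_bytes / edsac_1952:,.0f}x")
```

By this rough reckoning the shirt-pocket iPod holds on the order of 18 million times the main memory of the 1952 machine, which is the column's point in one number.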
