
SMART
MACHINES
IBM’s Watson and the Era
of Cognitive Computing

JOHN E. KELLY III STEVE HAMM


Columbia University Press
Publishers Since 1893
New York Chichester, West Sussex
cup.columbia.edu
Copyright © 2013 IBM Corporation
All rights reserved

Library of Congress Cataloging-in-Publication Data


Kelly, John E., III (John Edward), 1954–
IBM’s Watson and the era of cognitive computing / John E. Kelly III
and Steve Hamm.
pages cm
Includes bibliographical references.
ISBN 978-0-231-16856-4 (cloth : alk. paper) — ISBN 978-0-231-53727-8 (e-book)
1. Expert systems (computer science) 2. Artificial intelligence.
3. Watson (Computer) I. Hamm, Steve. II. Title.
QA76.76.E95K434 2014
006.3'3—dc23
2013026154

Columbia University Press books are printed on permanent and durable acid-free paper.
This book is printed on paper with recycled content.
Printed in the United States of America

c 10 9 8 7 6 5 4 3 2 1

Cover illustration: Courtesy of International Business Machines Corporation,
© International Business Machines Corporation
Cover design: Lisa Hamm

References to websites (URLs) were accurate at the time of writing.
Neither the author nor Columbia University Press is responsible
for URLs that may have expired or changed since the
manuscript was prepared.
Contents

Preface by John E. Kelly III

1 A New Era of Computing

2 Building Learning Systems

3 Handling Big Data

4 Augmenting Our Senses

5 Designing Data-centric Computers

6 Inventing a New Physics of Computing

7 Imagining the Cognitive City

Coda: An Alliance of Human and Machine

Notes
Preface

John E. Kelly III

When I was a kid, my dad worked at General Electric’s
R&D lab in Niskayuna, N.Y. I would visit and watch
him work with vacuum tubes, which looked like light-
bulbs and directed electrical current in all sorts of devices,
from radios and TVs to radar and computers. At the time,
I didn’t fully understand how vacuum tubes worked, but
those visits inspired me to study science and, ultimately,
to get two degrees in physics and my Ph.D. in materials
engineering. I later came to realize that I had witnessed
one of the important transitions in the history of technol-
ogy. While my dad was showing me vacuum tubes, other
engineers were experimenting with the vacuum tube’s suc-
cessor, the transistor, which ultimately ushered in modern
electronics and personal computing.
Now we are at the dawn of a much bigger shift in the
evolution of technology—a new era affecting nearly every
aspect of the field. The changes that are coming over the
next two decades will transform the way we live and work
just as the computing revolution has transformed the
human landscape over the past half century. At IBM, we
call this the era of cognitive computing.
I have had the pleasure of participating in the march of
progress in computer science since I joined IBM in 1980 as
an engineer. I’ve led organizations that contributed major
breakthroughs in science and technology, including the
company’s microelectronics division and, presently, IBM
Research, one of the most extensive corporate scientific-
research organizations in the world, with 3,000 scientists
and engineers at 12 labs in 10 countries.
Today, society is faced with a host of opportunities and
challenges that require a new generation of technologies
and a rewriting of the rules of computing. The availabil-
ity of huge amounts of data should help people to better
understand complex situations. In reality, though, more
data often lead to more confusion. We make too many
decisions with irrelevant or incorrect information, or with
data that represent only part of the picture. We need a new
generation of tools—cognitive technologies—that help
us to penetrate complexity and comprehend the world
around us so that we can make better decisions and live
more successfully and sustainably. Yet some of the tech-
niques of computer science and engineering are reaching
their limits. The technology industry must change the way
it designs and uses computers and software if it is to con-
tinue to make progress in how people work and live.
The vision of the future of information technology
presented in this brief book is the result of a large group
effort. Several years ago, I asked a handful of IBM scien-
tists to look decades into the future and sketch out how
computing will change. Their work sparked discussions
and debates among a wide range of IBMers, including the
company’s top executives, which, ultimately, evolved into
a shared belief about the future of computing.
We want to expose our thoughts to others and engage in
a conversation, so I asked Steve Hamm, a former journal-
ist who is now a writer at IBM, to help lay out our vision
in this book. Steve has been writing and editing stories
about the tech industry for nearly twenty-five years and
has witnessed the beginnings of several waves of innova-
tion. For instance, in 1994 he visited the founding engi-
neers of Mosaic Communications (later to be renamed
Netscape) shortly after they set up shop in a tiny office on
Castro Street in Mountain View, Calif. That was the dawn
of the Internet era. Since then he has written hundreds
of stories about innovation. In addition, he has authored
books about the rise of the Indian tech industry, mobile
computing, and the future of work. With that kind of deep
experience, Steve’s a natural partner for me in this project.
He interviewed dozens of IBM’s scientists and engineers
plus numerous outside experts to flesh out the vision and
shape the narrative.
The creation of this new era of computing is a monu-
mental endeavor, and, while IBM possesses a vision of the
future and a broad portfolio of expertise, no company can
take on this sort of challenge alone. So we look to our cli-
ents, university researchers, government policy makers,
industry partners, and entrepreneurs—indeed, the entire
tech industry—to take this journey with us. Already, other
companies, such as Google and Microsoft, are carving out
roles in the era of cognitive computing. The European
Union, with its Human Brain Project, is coordinating cog-
nitive scientists worldwide in an effort to map the inner
workings of the brain and mimic them in computing.
Among the leading universities in the field are MIT, Carn-
egie Mellon, and Stanford. Many other organizations are
also in the game.
With this book, which is aimed at a broad audience
rather than just the technical community, we hope to
greatly expand the search for answers by stimulating new
thinking within industry, government, and academia.
And, just as importantly, we hope to inspire university and
high school students to pursue careers in science, tech-
nology, engineering, and mathematics. Together, we can
drive the exploration and invention that will shape society,
the economy, and business for the next fifty years.

Smart Machines
1
A New Era of Computing


IBM’s Watson computer created a sensation when it
bested two past grand champions on the TV quiz show
Jeopardy! Tens of millions of people suddenly understood
how “smart” a computer could be. This was no mere par-
lor trick; the scientists who designed Watson built upon
decades of research in the fields of artificial intelligence
and natural-language processing and produced a series
of breakthroughs. Their ingenuity made it possible for a
system to excel at a game that requires both encyclope-
dic knowledge and lightning-quick recall. In preparation
for the match, the machine ingested millions of pages of
information. On the TV show, first broadcast in February
2011, the system was able to search that vast storehouse
in response to questions, size up its confidence level,
and, when sufficiently confident, beat the humans to the
buzzer. After more than five years of intense research and
development, a core team of about twenty scientists had
made a very public breakthrough. They demonstrated
that a computing system—using traditional strengths
and overcoming assumed limitations—could beat expert
humans in a complex question-and-answer competition
using natural language.
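The buzzer decision described above can be pictured as a simple confidence gate: answer only when the best candidate clears a bar. The Python sketch below is an illustration of that idea only, with an invented threshold, function name, and candidate scores; it is not Watson's actual code.

    # Illustrative sketch: buzz only when the top candidate is confident enough.
    def should_buzz(candidates, threshold=0.5):
        # candidates: (answer, confidence) pairs produced by upstream search.
        best, confidence = max(candidates, key=lambda pair: pair[1])
        return (best, confidence) if confidence >= threshold else None

    # Three hypothetical scored candidates for a single clue.
    print(should_buzz([("Al Capone", 0.12), ("aluminum", 0.91), ("Alabama", 0.08)]))
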
Now IBM scientists and software engineers are busy
improving the Watson technology so it can take on much
bigger and more useful tasks. The Jeopardy! challenge
was relatively limited in scope. It was bound by the rules
of the game and the fact that all the information Wat-
son required could be expressed in words on a page. In
the future, Watson will take on more open-ended prob-
lems. It will ultimately be able to interpret images, num-
bers, voices, and sensory information. It will participate
in dialogue with human beings aimed at navigating vast
quantities of information to solve extremely complicated
yet common problems. The goal is to transform the way
humans get things done, from health care and education
to financial services and government.
One of the next challenges for Watson is to help doc-
tors diagnose diseases and assess the best treatments for
individual patients. IBM is working with physicians at
Cleveland Clinic and Memorial Sloan-Kettering Cancer
Center in New York to train Watson for this new role.
The idea is not to prove that Watson could do the work
of a doctor but to make Watson a useful aid to a physi-
cian. The Jeopardy! challenge pitted man against machine;
with Watson and medicine, man and machine are taking
on a challenge together—and going beyond what either
could do on its own. It’s impossible for even the most
accomplished doctors to keep up with the explosion of
new knowledge in their fields. Watson can keep up to date,
though, and provide doctors with the information they
need. Diseases can be freakishly complicated, and they
express themselves differently in each individual. Within
the human genome, there are billions of combinations of
variables that can figure in the course of a disease. So it’s
no wonder that an estimated 15 to 20 percent of medical
diagnoses are inaccurate or incomplete.1 Doctors know a
lot about diseases and the practice of medicine. What they
need help with is using evidence-based medicine to better
evaluate and treat individuals.
Dr. Larry Norton, a world-renowned oncologist at
Memorial Sloan-Kettering Cancer Center who is helping
to train Watson, believes the computer will be able to
synthesize encyclopedic medical and patient informa-
tion to help physicians more quickly and easily identify
treatment options for complex health conditions. “This is
more than a machine,” Larry says. “Computer science is
going to evolve rapidly and medicine will evolve with it.
This is coevolution. We’ll help each other.”2

The Coming Era of Cognitive Computing

Watson’s potential to help with health care is just one
of the possibilities opening up for next-generation tech-
nologies. Scientists at IBM and elsewhere are pushing
the boundaries of science and technology fields ranging
from nanotechnology to artificial intelligence with the
goal of creating machines that do much more than calcu-
late and organize and find patterns in data—they sense,
learn, reason and interact naturally with people in power-
ful new ways. Watson’s exploits on TV were one of the first
steps into a new phase in the evolution of information
technology—the era of cognitive computing.
During this era, humans and machines will become
more interconnected. Thomas Malone, director of the
MIT Center for Collective Intelligence, says a big ques-
tion for researchers as the era of cognitive computing
unfolds is: How can people and computers be connected
so that collectively they act more intelligently than any
person, group, or computer has ever done before?3 This
avenue of thought stretches back to the computing pio-
neer J. C. R. Licklider, who led the U.S. government proj-
ect that evolved into the Internet. In 1960 he authored a
paper, “Man-Computer Symbiosis,” where he predicted
that “in not too many years, human brains and comput-
ing machines will be coupled together very tightly and the
resulting partnership will think as no human brain has
ever thought and process data in a way not approached
by the information-handling machines we know today.”4
That time is fast approaching.
The new era of computing is not just an opportunity for
society; it’s also a necessity. Only with the help of smart
machines will we be able to deal adequately with the
exploding complexity of today’s world and successfully
address interlocking problems like disease and poverty
and stress on natural systems. Computers today are bril-
liant idiots. They have tremendous capacities for storing
information and performing numerical calculations—far
superior to those of any human. Yet when it comes to
another class of skills, the capacities for understanding,
learning, adapting, and interacting, computers are woe-
fully inferior to humans; there are many situations where
computers can’t do a lot to help us.
Up until now, that hasn’t mattered much. Over the past
sixty-plus years, computers have transformed the world
by automating defined tasks and processes that can be
codified in software programs in series of procedural “if
A, then B” statements—expressing logic or mathematical
equations. Faced with more complex tasks or changes in
tasks, software programmers add to or modify the steps
in the operations they want the machine to perform. This
model of computing—in which every step and scenario is
determined in advance by a person—can’t keep up with the
world’s evolving social and business dynamics or deliver
on its potential. The emergence of social networking, sen-
sor networks, and huge storehouses of business, scientific,
and government records creates an abundance of infor-
mation tech-industry insiders call “big data.” Think of it
as a parallel universe to the world of people, places, things,
and their interrelationships. This digital universe is grow-
ing at about 60 percent each year.5
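To see concretely what “if A, then B” programming means, consider the minimal, hypothetical Python routine below: every situation it handles had to be anticipated and written down by a programmer, which is exactly the property that fails to scale to big data.

    # A hand-coded "if A, then B" program: all behavior is fixed in advance.
    def route_request(subject):
        if "password" in subject.lower():
            return "account-security team"
        if "invoice" in subject.lower():
            return "billing team"
        # Anything unanticipated needs a programmer to add another rule.
        return "general queue"

    print(route_request("Reset my password"))         # account-security team
    print(route_request("My phone keeps rebooting"))  # falls through: no rule yet
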
The volume of data creates the potential for people to
understand the environment around us with a depth and
clarity that was simply not possible before. Governments
and businesses struggle to come to grips with complex sit-
uations, such as the inner workings of a city or the behav-
ior of global financial markets. In the cognitive era, using
the new tools of decision science, we will be able to apply
new kinds of computing power to huge amounts of data
and achieve deeper insight into how things really work.
Armed with those insights, we can develop strategies and
design systems for achieving the best outcomes—taking
into account the effects of the variable and the unknow-
able. Think of big data as a natural resource waiting to
be mined. And in order to tap this vast resource, we need
computers that “think” and interact more like we do.
The human brain evolved over millions of years to
become a remarkable instrument of cognition. We are
capable of sorting through multitudes of sensory impres-
sions in the blink of an eye. For instance, faced with the
chaotic scene of a busy intersection, we’re able to instantly
identify people, vehicles, buildings, streets, and sidewalks
and see how they relate to one another. We can recognize
and greet a friend we haven’t seen for ten years even while
sensing and prioritizing the need to avoid stepping in
front of a moving bus. Today’s computers can’t do that.
With the exception of robots, tomorrow’s computers
won’t need to navigate in the world the way humans do.
But to help us think better they will need the underlying
humanlike characteristics—learning, adapting, interact-
ing, and some form of understanding—that make human
navigation possible. New cognitive systems will extract
insights from data sources that are almost totally opaque
today, such as population-wide health-care records, or
from new sources of information, such as sensors moni-
toring pollution in delicate marine environments. Such
systems will still sometimes be programmed by people
using “if A, then B” logic, but programmers won’t have
to anticipate every procedure and every rule. Instead,
computers will be equipped with interpretive capabilities
that will let them learn from the data and adapt over time
as they gain new knowledge or as the demands on them
change.
The goal isn’t to replicate human brains, though. This
isn’t about replacing human thinking with machine
thinking. Rather, in the era of cognitive systems, humans
and machines will collaborate to produce better results,
each bringing their own superior skills to the partnership.
The machines will be more rational and analytic—and,
of course, possess encyclopedic memories and tremen-
dous computational abilities. People will provide exper-
tise, judgment, intuition, empathy, a moral compass, and
human creativity.
To understand what’s different about this new era, it
helps to compare it to the two previous eras in the evolu-
tion of information technology. The tabulating era began
in the nineteenth century and continued into the 1940s.
Mechanical tabulating machines automated the process
of recording numbers and making calculations. They were
essentially elaborate mechanical abacuses. People used
them to organize data and make calculations that were
helpful in everything from conducting a national popula-
tion census to tracking the performance of a company’s
sales force. The programmable computing era—today’s
technologies—emerged in the 1940s. Programmable
machines are still based on a design laid out by the Hun-
garian American mathematician John von Neumann.
Electronic devices governed by software programs per-
form calculations, execute logical sequences of steps, and
store information using millions of zeros and ones. Sci-
entists built the first such computers for use in decrypt-
ing encoded messages in wartime. Successive generations
of computing technology have enabled everything from
space exploration to global manufacturing-supply chains
to the Internet.
Tomorrow’s cognitive systems will be fundamentally
different from the machines that preceded them. While
traditional computers must be programmed by humans
to perform specific tasks, cognitive systems will learn
from their interactions with data and humans and be
able to, in a sense, program themselves to perform new
tasks. Traditional computers are designed to calculate
rapidly; cognitive systems will be designed to draw infer-
ences from data and pursue the objectives they were given.
Traditional computers have only rudimentary sensing
capabilities, such as license-plate-reading systems on toll
roads. Cognitive systems will augment our hearing, sight,
taste, smell, and touch. In the programmable-computing
era, people have to adapt to the way computers work. In
the cognitive era, computers will adapt to people. They’ll
interact with us in ways that are natural to us.
Von Neumann’s architecture has persisted for such a
long time because it provides a powerful means of per-
forming many computing tasks. His scheme called for the
processing of data via calculations and the application
of logic in a central processing unit. Today, the CPU is a
microprocessor, a stamp-sized sliver of silicon and metal
that’s the brains of everything from smartphones and
laptops to the largest mainframe computers. Other major
components of the von Neumann design are the memory,
where data are stored in the computer while waiting to be
processed, and the technologies that bring data into the
system or push it out. These components are connected
to the central processing unit via a “bus”—essentially a
highway for data. Most of the software programs written
for today’s computers are based on this architecture.
But the design has a flaw that makes it inefficient: the
von Neumann bottleneck. Each element of the process
requires multiple steps where data and instructions are
moved back and forth between memory and the CPU.
That requires a tremendous amount of data movement
and processing. It also means that discrete processing
tasks have to be completed linearly, one at a time. While
we have introduced some parallelism, it’s not enough. For
decades, computer scientists have been able to rapidly
increase the capabilities of CPUs by making them smaller
and faster. But we’re reaching the limits of our ability to
make those gains at a time when we need even more com-
puting power to deal with complexity and big data. And
that’s putting unbearable demands on today’s computing
technologies—mainly because today’s computers require
so much energy to perform their work.
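The bottleneck is easy to see in miniature. The toy machine below (a sketch with an invented three-instruction program, not any real instruction set) counts how often instructions and data must cross the bus between memory and the processor, and everything happens strictly one step at a time.

    # A toy von Neumann machine: one CPU register, one memory, one shared bus.
    memory = {"a": 2, "b": 3, "result": 0}
    program = [("LOAD", "a"), ("ADD", "b"), ("STORE", "result")]

    accumulator = 0
    bus_transfers = 0
    for opcode, address in program:   # instructions execute serially
        bus_transfers += 1            # fetching each instruction crosses the bus
        if opcode == "LOAD":
            accumulator = memory[address]
        elif opcode == "ADD":
            accumulator += memory[address]
        elif opcode == "STORE":
            memory[address] = accumulator
        bus_transfers += 1            # so does each operand or result

    print(memory["result"], "computed;", bus_transfers, "trips across the bus")
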
What’s needed is a new architecture for computing,
one that takes more inspiration from the human brain.
Data processing should be distributed throughout the
computing system rather than concentrated in a CPU. The
processing and the memory should be closely integrated
to reduce the shuttling of data and instructions back and
forth. And discrete processing tasks should be executed
simultaneously rather than serially. A cognitive computer
employing these systems will respond to inquiries more
quickly than today’s computers; less data movement will
be required and less energy will be used.
Today’s von Neumann–style computing won’t go away
when cognitive systems come online. New chip and com-
puting technologies will extend its life far into the future.
In many cases, the cognitive architecture and the von Neu-
mann architecture will be employed side by side in hybrid
systems. Traditional computing will become ever more
capable while cognitive technologies will do things that
were not possible before. Already, cloud, social network-
ing, mobile, and new ways to interact with computing
from tablets to glasses are fueling the desire for cogni-
tive systems that will, for example, both harvest insights
from social networks and enhance our experiences
within them.
Should we fear the cognitive machines? MIT profes-
sors Erik Brynjolfsson and Andrew McAfee warn in their
book, Race Against the Machine, that one of the side effects
of this generation of advances in computing is that they are
coming at the expense of existing jobs. We believe, though,
that the most important effect of these technologies will
be in assisting people to do what they are unable to do
today, vastly expanding the problems we can solve and
creating new spheres of innovation for every industry.
And like previous eras of computing, this will take a tre-
mendous amount of innovation over decades. “These
new capabilities will affect everything. It will be like the
discovery of DNA,” predicts Ralph Gomory, a pioneer of
applied mathematics who was director of IBM Research
in the 1970s and 1980s and later head of the Alfred P. Sloan
Foundation.6

How Cognitive Systems Will Help Us Be Smarter

As smart as human beings are, there are many things that
we can’t do or simply can’t process in time to affect the
outcome of a situation. Cognitive systems in many cases
help us overcome our limitations.

Complexity

We have difficulty rapidly processing large amounts of
information. We also have problems understanding the
interactions among elements of large systems, such as the
interplay of chemical compounds in the human body or
the dynamics of financial markets. With cognitive com-
puting, we will be able to harvest insights from huge quan-
tities of data to handle complex situations, make more
accurate predictions about the future, and better antici-
pate the unintended consequences of actions.
City mayors, for instance, already can begin to make
sense of the interrelationships among urban subsystems—
everything from electrical grids to weather to subways to
demographic trends to issues reported or expressed by
citizens. One example is monitoring social media during
a major storm to spot patterns of words and images that
indicate critical problems in particular neighborhoods.
Much of this information will come from sensors—video
cameras, instruments that detect motion, and devices that
spot anomalies. Mobile phones will also be used as ano-
nymized sensors that help city planners understand the
movements of people and accurately predict the effects
and financial impact of various actions.

Expertise

With the help of cognitive systems, we will be able to see
the big picture and make better decisions. This is espe-
cially important when experience in an area is limited or
we’re trying to address problems that cut across profes-
sional or practical domains.
For instance, police are beginning to gather crime
statistics and combine them with information about
demographics, events, building blueprints, and weather
to produce better analysis and safer cities. Armed with
abundant data, police chiefs can set strategies and deploy
resources more effectively—even predicting where and
when crimes are likely to happen. Patrol officers ben-
efit by gaining a wealth of information about unfamiliar
locations; situational intelligence will be extremely use-
ful when they knock on someone’s door. The ability to
achieve such comprehensive understanding of situations
at every level will be an essential tool for managing a city
and will become one of the most important factors in the
economic growth and competitiveness of cities.

Objectivity

We all possess biases based on personal experience, pro-
fessional background, and intuition about what works and
what doesn’t, as well as the influence of group dynamics.
Cognitive systems can make it possible for us to be more
objective in our decision making.
Corporations will become socially networked busi-
nesses made up of people and systems in collaboration.
Sophisticated analytic engines will understand how an
organization works, its competitive environment, capabil-
ities, resources, and ecosystem of partners. Professionals
will experience data in new ways and will be prompted to
consider a number of hypotheses based on inferences and
scenarios constructed by a system using relevant contex-
tual data. The system will make hard-to-spot connections
and help guide individuals in achieving business goals.

Imagination

Professional background, training, and experience make
it difficult to envision dramatically different settings and
ranges of choice. Cognitive systems will help us discover
and explore new and contrarian ideas.
Today, it takes ten to fifteen years and up to $1 billion
to bring a single new drug to market. The research-and-
development teams of pharmaceutical companies are
beginning to use cognitive systems to explore existing
biological information in new ways and to discover new
treatments utilizing compounds that have already been
proven safe in other applications. Furthermore, cognitive
systems can spot hidden patterns in the text of thousands
of published articles to help identify new opportunities
for drug treatments. Using computer modeling and simu-
lation, researchers can run experiments to validate these
findings “in silico” much faster than can be done in a life
sciences “wet” lab. With the aid of cognitive machines,
researchers and engineers will be able to explore millions
of combinations in ways that could shave years and hun-
dreds of millions of dollars from the development process.

Senses

We can only take in and make sense of so much raw, phys-
ical information. With cognitive systems, computer sen-
sors combined with analytics software will vastly extend
our ability to gather and process such information.
Imagine a world where individuals carry their own per-
sonal cognitive system in a handheld device. These per-
sonal cognitive assistants would carry on conversations
with us and acquire knowledge about us, in part, from
observing what we see, say, touch, and type so they can
better anticipate our wishes. In addition, the assistant
might be able to use sophisticated sensing to monitor
threats to a person’s well-being. If there’s carbon monox-
ide in a room, for example, the device might be able to
alert its user.
Over time, humans have evolved to be more success-
ful as a species. We continually adapt to overcome our
limitations. This partnership with computers is simply
the latest step in a long process of adaptation. Just as the
personal computer, the Internet, mobile communications,
and social networking have given rise to tens of thousands
of software applications, Web services, and smartphone
apps, the cognitive era will produce a similar explosion of
creativity.
Think of the coming technologies as cognitive apps. At
the enterprise level, corporations might use these apps for
handling mergers and acquisitions, crisis management,
competitive analysis, and product design. The technolo-
gies already exist to assist a team within a company that’s
in charge of sizing up acquisition candidates using a cog-
nitive M&A app. The team can augment its knowledge of
potential targets, many of which are private companies, by
gaining a deeper understanding of data, including transac-
tions and partnerships in the public domain, intellectual
property filings, and insights from Web sharing and other
social channels. The cognitive software maps the links
and the nature of the interactions. The M&A team will also
track the performance of previously acquired companies.
Those insights, constantly updated and integrated in a
learning system, will help the team identify risks and syn-
ergies and become smarter with each acquisition.
Moreover, such apps delivered from the cloud will
help individuals tackle highly complex though com-
mon decisions, for example, selecting a college, making
investments, choosing among insurance options, and
purchasing a car or home. Say you’re starting a family
and you and your spouse want to move out of your tiny
apartment into a house in a neighborhood or town with
good public schools. The cognitive house-hunting app will
start by understanding your needs and sizing up places
to live, school systems, commuting factors, and costs of
living. The app will even tap into social networks to dis-
cover hidden truths, such as the fact that the property a
house was built on was once part of a landfill. You will be
able to conduct an ongoing conversation with the app as
you learn more about what’s available and the app learns
more about you and the things you care about. Then,
when you begin shopping for houses and mortgages, the
app will guide you through the thicket of information
and considerations that will inform your decisions. And,
finally, when the two of you are standing in the foyer of
your dream house wondering if you really can afford it,
and you’re under pressure to make a bid immediately, you
can consult your trusted app for advice on one of the most
critical decisions you will ever make.

Technology Breakthroughs:
Opportunities and Necessities

Much of the progress in science and technology comes in
small increments. Scientists and engineers build on top
of the innovations that came before. Consider the tablet
computer. The first such devices appeared on the scene
back in the 1980s. They had large, touch-sensitive screens
but weighed nearly five pounds and were an inch and a half
thick. They were more like bricks than books, and about all
you could do with them was scrawl brief memos and fill out
forms. After thirty years of gradual improvements, we have
slim, light, powerful tablets that combine the features of a
telephone, a personal computer, a television, and more.
There’s nothing wrong with incremental innovation.
It’s absolutely necessary, and, sometimes, its results are
both delightful and transformational. A prime example is
the iPhone. With its superior navigation and abundance
of easy-to-use applications, this breakthrough product
spawned a burst of smartphone innovation, which com-
bined with the social-networking phenomenon to pro-
duce a revolutionary shift in global human behavior. Yet,
technologically, the iPhone was built on top of many
smartphone advances that preceded it.
New waves of progress, however, require truly dis-
ruptive innovations—things like the transistor, the micro-
chip, and the first programmable computers. These are
the advances that fundamentally change our world.
Today, many of the core technologies that provide the
basic functions for traditional computers are mature; they
have been in use for decades. In some cases, each wave
of improvements is less profound than the wave that pre-
ceded it. We’re reaching the point of diminishing returns.
Yet the demands on computing technology are growing
exponentially.
Soon incremental innovation will no longer be suffi-
cient. People who demand the most from computers are
already running into the limits of today’s circuitry. Michel
McCoy, director of the Simulation and Computing Pro-
gram at the U.S. Lawrence Livermore National Laboratory,
is among those calling for a nationwide initiative involv-
ing national laboratories and businesses to come up with
radical new approaches to microprocessor and computer-
system design and software programming. “In a sense,
everything we’ve done up until this point has been easy,”
he says. “Now we have reached a physics-dominated
threshold in the design of microprocessors and comput-
ing systems which, unless we do something about it, is
essentially going to stagnate progress.”7
We need more radical innovations. In the years ahead, a
number of fundamental advances in science and technol-
ogy will be required to make progress. In order to appre-
ciate the scope and breadth of these innovations, we can
conceptualize a cognitive system as one of those colorful
Russian wooden dolls where each of the smaller dolls
nests inside a slightly larger one.
The top layer is the way we interact with computers and
get them to do what we want. The big innovation at this
outer layer is “learning systems,” which we will explore
more deeply in chapter 2. The goal is to create machines
that do not require as much programming by humans.
Instead they’ll be “taught” by people, who will set objec-
tives for them. As they learn, the machines will work out
the details of how to meet those goals.
The next layer represents how we organize and interpret
data, which we’ll discuss in chapter 3. Today’s databases
do an excellent job of organizing information in columns
and rows. Tomorrow’s are being designed to manage huge
volumes of different kinds of data, place information in
context, and crunch data in real time.
The next layer represents the architecture or design of
systems—how we fit together all the physical components
that make up a computer. The challenge here, which we
address in chapter 5, is creating data-centric computers.
The designers of computing systems have long treated
logic and memory as separate elements. Now, they will
meld the components together, first, on circuit boards
and, later, on single microchips. Also, they’ll move the
processing to the data, rather than vice versa.
Finally, in the innermost layer is nanotechnology
employed in building core components of a cognitive sys-
tem, where we manipulate matter at the molecular and
atomic scale. In chapter 6, we’ll explore what it will take to
invent a new physics of computing. To overcome the lim-
its of today’s microchip technology, scientists must shift
to new nanomaterials and new approaches to switching
from one digital state to another. Possibilities include har-
nessing quantum mechanics or chips driven by “synapses
and neurons” for data processing.

A New Culture of Innovation

We’re still in the early stages of the emergence of this new
era of computing. Progress will require a willingness to
make big bets, take a long-term view, and engage in open
collaboration. We’ll explore the elements of the culture of
innovation in each of the subsequent chapters in what we
call the journeys of discovery. An absolutely critical aspect
of the culture of innovation will be the ambition and
capabilities of the inventors themselves. For rapid prog-
ress to be made in the new era of computing, young people
must be inspired to become scientists, and they must be
educated by teachers using superior tools and techniques.
They have to be rewarded and given opportunities to chal-
lenge everything we think we know about how the world
works. It requires dedication and investment by all of soci-
ety’s institutions, including families, local communities,
governments, universities, and businesses.
When we ask scientists at IBM Research what motivates
them, the answer is often that they want to change the
world—not in minuscule increments but in great leaps
forward. Mark Ritter, a senior manager in IBM Research’s
Physical Sciences Department, leads an effort, inspired
by the human brain, to rethink the entire architecture
of computing for the era of cognitive systems. As a child,
Mark, whose father was a plumber, had an intense curios-
ity about how things work on a fundamental level. It was
his good fortune that his grandparents, who lived near
his family in Grinnell, Iowa, had two neighbors who were
physics professors at Grinnell College. One of the physi-
cists, whom Mark pestered with science questions while
the neighbor repaired his VW in the driveway, lent Mark a
book on particle physics when he was about twelve years
old. As a teenager, Mark bicycled over to the campus to
attend physics lectures. He built a simple gas laser in the
basement of his home. It was the beginning of decades
of inquiry into how things work and how they can work
better. A few years ago, after more than twenty years in
IBM Research, Mark and his colleagues recognized that
the computing model designed for mid-twentieth-century
demands was running out of gas. “This is the most excit-
ing time in my career,” Mark says. “The old ways of doing
things aren’t going to solve efficiently the big, real-world
problems we face.”
For his part, Dr. Larry Norton of Memorial Sloan-
Kettering Cancer Center is driven to transform the way
medicine is practiced. Ever since he can remember, he
was motivated by the desire to do something with his life
that would improve the world. Born in 1947, he grew up
at a time when people saw science as a powerful means
of solving humanity’s problems. He recalls a thought-
crystallizing experience when he was an undergraduate
at the University of Rochester. He lived in a dorm where
students often gathered in the mornings for freewheeling
discussions of politics, values, and ethics. He was already
contemplating a career in medicine, and the topic that day
was, if you were a doctor and had done everything medical
science could offer to save a patient but she died anyway,
how would you feel? The students were split. “I realized I
would feel terrible about it,” he says. “Offering everything
available isn’t enough. I should have done better. And
since, because of limitations in the world’s knowledge, I
couldn’t do better, I should be involved in moving things
forward.”
During his forty-year career, Larry has been an innova-
tor in cancer treatment. Among his contributions is the
central role he played in developing the Norton-Simon
hypothesis, a once-revolutionary but now widely used
approach to chemotherapy. In Larry’s years as a clinician,
he has saved the lives of many patients, but, of course,
some of his patients have died. Those deaths haunt him.
He believes that he owes it to those people and to their chil-
dren to improve the treatment of cancer and, ultimately, to
help eradicate the disease. He sees his work with Watson
as another way to contribute. Through a machine, he can
share his knowledge and expertise with other physicians.
He can help save cancer victims from a distance—people
he has never met.
You will meet a host of innovators in this book. There’s
Dharmendra Modha, the IBM Research manager who is
leading a team of IBM and university scientists in a quest
to mimic the human brain. And there’s Murali Ramana-
than, a professor of pharmaceutical sciences and neu-
rology at State University of New York, Buffalo, who is
studying the role of genetic and environmental factors
in multiple sclerosis. These innovators are remarkable
people. We depend on them to produce the surprising
advances that knock the world off kilter and, ultimately,
have the potential to make it a better place. We will need
many of them to make the transition to the era of cogni-
tive systems. In the end, this era is not about machines but
about the people who design and use them.

2
Building Learning Systems


In November 2009, when IBM’s Watson was under devel-
opment for its showdown on Jeopardy!, the machine
made one laughable mistake after another in test matches.
In one particularly funny instance, it was prompted to
identify what the “Al” in the company name Alcoa stands
for. It fired back, “What is Al Capone?” Everybody in the
room cracked up. The machine had confused the first two
letters of aluminum with the name of a famous gangster.1
No harm was done. One of the main purposes of these
training sessions was for Watson to make mistakes so
that the scientists on the technology development team
could tweak the algorithms used by the computer and
prevent the same mistakes from happening again. It was
a painstaking, manual process. But there was no way that
the scientists could anticipate every dilemma that Watson
might be confronted with and write a rule to deal with it,
so, as they developed Watson, they realized they would
need to invent a system that would enable Watson to learn
on its own.
Their invention represented a major advance in the
science of machine learning, a branch of artificial intel-
ligence that focuses on building systems that learn from
data. The field was first defined in 1959 by IBM scientist
Arthur Samuel and has found plenty of uses over the
years—including common applications such as optical
character recognition and e-mail spam filters. Such sys-
tems are trained to recognize repeated patterns in words
or shapes and to react in a certain way when they encoun-
ter them again.
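The spam filter mentioned above is a good miniature of this train-then-recognize pattern. The sketch below is a deliberately naive word-counting version, written only to show the idea; production filters use far more sophisticated statistics.

    # Naive sketch: learn word counts from labeled mail, then score new mail.
    from collections import Counter

    spam_words, ham_words = Counter(), Counter()

    def train(text, is_spam):
        (spam_words if is_spam else ham_words).update(text.lower().split())

    def looks_like_spam(text):
        words = text.lower().split()
        return sum(spam_words[w] for w in words) > sum(ham_words[w] for w in words)

    train("win a free prize now", is_spam=True)
    train("meeting notes attached", is_spam=False)
    print(looks_like_spam("claim your free prize"))  # True, learned from examples
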
Watson takes machine learning to a new level. In cre-
ating the technology for Watson, called DeepQA, which
includes the learning capability, the developers provided
the machine with a large corpus of unstructured informa-
tion and the algorithms to extract knowledge from it. They
trained the machine to identify categories of information,
such as famous people, locations, and relationships, and
also to parse language. Then they created a set of statisti-
cal methods for learning how words are used in different
contexts. This combination of techniques allows Watson
to learn from the data rather than just follow detailed
instructions. In a sense, Watson learns the way people
learn, by experiencing a lot of things and drawing infer-
ences and lessons from those encounters.
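One way to picture the statistical side of this: many separate algorithms each score a candidate answer, and weights learned from training data fold those scores into a single confidence. The logistic combination below is a generic sketch with invented weights and feature names, not DeepQA's actual model.

    import math

    def confidence(evidence_scores, weights, bias=-1.0):
        # Weighted sum of per-algorithm scores, squashed to a 0-1 confidence.
        z = bias + sum(w * s for w, s in zip(weights, evidence_scores))
        return 1.0 / (1.0 + math.exp(-z))

    # Hypothetical scores: answer-type match, passage support, popularity prior.
    print(round(confidence([0.9, 0.7, 0.2], weights=[1.5, 1.2, 0.5]), 3))
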
Watson’s learning abilities represent a major step for-
ward in the new era of computing. Today, tech companies,
corporations, and governments employ armies of software
programmers to create specific applications to accom-
plish specific goals. Those programmers write millions of
lines of detailed instructions telling computers what to do,
following the linear and deterministic approach charted
out by von Neumann at the dawn of the programmable
computing era.
But this approach simply isn’t sufficient anymore. We
need to design and build a new class of computers that use
Watson-style learning systems in order to help us deal with
the complexities of the modern world. These systems will
be able to learn from both structured and unstructured
data, discover important correlations, create hypotheses
for those correlations, and suggest actions that produce
better outcomes. In addition, they’ll have feedback loops
where they measure the results of their work and modify
the way they do things based on that feedback. Once peo-
ple design and program them and train them, they’ll do
the rest—essentially reprogramming themselves as they
interact and learn.
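Such a feedback loop can be reduced to three verbs: act, measure, adjust. In the sketch below, everything (the belief value, the learning rate, the measured outcomes) is invented for illustration; the point is only that the behavior shifts from feedback, not from reprogramming.

    belief = 0.9            # the system's current estimate of an action's value
    LEARNING_RATE = 0.3

    def incorporate_feedback(measured_outcome):
        global belief
        # Nudge the estimate toward what was actually measured.
        belief += LEARNING_RATE * (measured_outcome - belief)

    for outcome in [1.0, 0.0, 0.0, 0.0]:  # measured results of four actions
        incorporate_feedback(outcome)
    print(round(belief, 2))               # the system now values the action less
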
We’re in the early stages of the evolution of learning
systems, but it’s possible to anticipate how things will
develop—if not the exact timing and details of the tech-
nology. The first stage is static learning. The version of
the Watson system that played on Jeopardy! is in this cat-
egory, according to IBM Research executive Dario Gil, who
headed a team that mapped out the progression. At this
point, it’s still necessary to assemble teams of technology
and domain experts to provide a learning system with the
rules and information it needs to perform a certain well-
defined task—then train it to use the information. One
example is the customer-service call center. Today, when
consumers call companies for help with technology prob-
lems, representatives read from scripts and search rigid
knowledge-management systems for answers. The pro-
cess is too often frustrating for the consumer and expen-
sive for the company. Now, companies are training Watson
to magnify the capabilities of their representatives. The
systems will ingest vast amounts of information that’s
useful in solving a specific set of problems, say, glitches
involving smart phones. The cognitive system will learn
from interactions, steadily improving the advice the rep-
resentatives give to consumers. In addition, there will be a
self-service version of Watson that helps consumers with
their problems and concerns directly.
In a second stage, dynamic learning, machines will
continually mine information from a variety of sources
(e.g., textual, numerical, visual, sensory). Also, scientists
will create a variety of modes that the machines can use to
communicate, including spoken conversations. Dynamic
learning systems will expand their hypothesis generation
across multiple domains that frequently intersect with one
another, such as economics, business, and social sciences.
How “smart” will learning systems get? It’s possible that
they’ll be able to achieve a deep comprehension of natural
language, images, and other sensory input. Already, pro-
grams exist that make it possible for machines to engage
in scientific discovery, and, at the far end of the arc of
time and imagination, they’ll likely be able to take on
complex problems that they identify themselves.2 They’ll
read a host of scientific, social science, and technical pub-
lications; “watch” TV shows and read news reports; and
perhaps even monitor environmental shifts like climate
change. Years from now, when research scientists stroll
into their offices they will greet systems that have access
to vast knowledge resources and can use them to come up
with new hypotheses.

Journey of Discovery:
The Path to Watson

The Watson project got its start in a surprising way. In
the fall of 2004, IBM’s head of computing systems soft-
ware, Charles Lickel, traveled from his home in Tucson
to spend the day with a small team he managed at an
IBM facility in Poughkeepsie, New York. At the end of the
workday, the team gathered at the nearby Sapore Steak-
house for dinner. They were bemused when, at seven
p.m. sharp, many of the diners abruptly got up from
their tables, rushed into the bar, and clustered excitedly
around the TVs. One of Charles’s guys explained that they
were watching long-time champion Ken Jennings defend
his title on Jeopardy!3
Charles hadn’t followed Jeopardy! for years, but the
scene made an impression on him. A few months later,
research director Paul Horn asked his lieutenants to think
up a high-profile project that the lab could take on that
would demonstrate IBM’s scientific and technological
prowess. The company calls these its “grand challenges.”
The previous grand challenge had been a huge success:
IBM’s Deep Blue computer had beaten the world’s top
chess grand master in a highly publicized match in the
mid-1990s. But a lot of time had passed since that victory.
During one of the brainstorming sessions aimed at picking
the company’s next grand challenge, Charles suggested
building a computer that could compete on Jeopardy!
IBM has long used man-versus-machine games to moti-
vate scientists, focus research, and engage the public. In
the early 1960s, IBM researcher Arthur Samuel, the AI pio-
neer, created one of the first computer programs capable
of learning when he wrote a checkers-playing program
designed to run on the 701, IBM’s first commercial com-
puter. Samuel challenged one of the top U.S. checkers
champions to a match—and won. IBM researcher Gerry
Tesauro in the late 1980s developed a program called TD-
Gammon, which used a technique called temporal differ-
ence learning to teach itself how to play backgammon. It
was competitive in matches with some of the world’s top
backgammon players.4
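The temporal-difference idea behind TD-Gammon fits in one update rule: nudge the estimated value of a position toward the reward received plus the estimated value of the position that followed. Below is the textbook tabular form of that rule; TD-Gammon itself learned its values with a neural network rather than a table, and the state names here are invented.

    # Tabular TD(0): V(s) <- V(s) + alpha * (reward + gamma * V(s') - V(s))
    def td_update(V, state, reward, next_state, alpha=0.1, gamma=1.0):
        v, v_next = V.get(state, 0.0), V.get(next_state, 0.0)
        V[state] = v + alpha * (reward + gamma * v_next - v)

    V = {}
    td_update(V, "midgame", reward=0.0, next_state="endgame")  # self-play move
    td_update(V, "endgame", reward=1.0, next_state="won")      # the game was won
    print(V)  # position values learned from the game's own outcome
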
Later, IBM turned the contests into true spectator sports.
The 1997 match between Deep Blue and the reigning world
chess champion, Garry Kasparov, was streamed live on a
website, and millions of chess enthusiasts watched every
move. The Deep Blue program didn’t require much learn-
ing since chess is such a highly structured game, but,
according to Murray Campbell, one of the researchers
on the Deep Blue team, the match demonstrated that a
combination of clever engineering and sophisticated algo-
rithms could rival the performance of a top human expert
in a specific domain of human achievement. “It made
everybody understand in a clear way that problems previ-
ously considered too hard for a computer could now be
tackled,” Murray says.5
Of course, while IBM has played an important role in
the evolution of artificial intelligence and learning sys-
tems, many other scientists and organizations have made
major contributions. John McCarthy, a professor at Stan-
ford University, coined the term “artificial intelligence” in
1955, and Marvin Minsky, a professor at MIT, has produced
a long series of advances in the field since the 1950s. He’s
now focusing on giving machines the ability to perform
humanlike commonsense reasoning.6 Today, Andrew Ng
of Stanford is leading a team of scientists in an attempt to
create algorithms that can learn based on principles that
the brain might also employ.
The field of artificial intelligence has advanced in starts
and stops. Periods of soaring optimism have been fol-
lowed by so-called AI winters, when seemingly promis-
ing avenues of research failed to produce the anticipated
results. Put simply, this is hard stuff. So it’s no surprise
that when Charles Lickel proposed Jeopardy! as the next
grand challenge, his suggestion was met initially with reac-
tions ranging from skepticism to outright derision. But he
quickly won Paul Horn’s support. Paul thought the proj-
ect could be very exciting—both to computer scientists
and the public at large.7 In mid-2006 Charles gave the go-
ahead to researcher David Ferrucci, who was an enthusias-
tic evangelist for the project, to explore whether building
a machine that could win on Jeopardy! was even plausible.
Progress was extremely slow at first. For several years,
Dave had been leading a group of researchers who had
produced good results with question-answering technolo-
gies. However, their performance gains had plateaued. So
Dave and a small team cast around for the right technol-
ogy and strategy with which to approach the Jeopardy!
challenge. By mid-2007, though, Dave was convinced the
job could be done, and Charles and Paul gave the proj-
ect their blessing.8 However, there was a danger that the
project would never actually be realized as Paul retired
from IBM soon after and became senior vice provost for
research at New York University. His replacement as head
of IBM Research was one of the authors of this book, John
Kelly. Dave was terrified that John would kill the project,
and his fears were not assuaged when, at their first meet-
ing, John expressed deep reservations. At the time, the sys-
tem, then called Blue J, was only able to answer 30 percent
of previous Jeopardy! questions accurately. John told them:
“Guys, we can’t put the IBM brand on TV with 30 percent
accuracy.”
Some of the team members were discouraged. But Dave
told them that he was convinced they could achieve their
goals. He said he was so committed to the project that he
was willing to risk failure and public humiliation. And he
asked each of the twelve scientists who were then on the
team to fully commit themselves, as well. “Are you ready
for this?” he asked.9
They were. Over the following weeks and months they
gradually improved the technology to the point where,
even though the gap was still huge, John was convinced
they could build a machine that could compete at the
highest levels on Jeopardy! He saw that the effort could
boost IBM’s reputation as an innovator and that the sci-
ence would be transformative, opening up massive new
opportunities for computing. He said: “I’ll give you what-
ever you need in the way of resources, but, if we’re going to
put the IBM brand on national TV, we must win. We must
win.” That launched the intense effort that led ultimately
to Watson’s victory.
On January 14, 2011, the Jeopardy! contest was conducted
on stage in the IBM Research auditorium. Top IBM execu-
tives, research scientists, and guests packed the room.
Emotions ran high. John told the audience that he did not
know if Watson would win, but he believed that the con-
test represented an important moment in the history of
computing. He said, “It’s not a matter of if a computer will
one day win at Jeopardy!, but when.”
John was extremely nervous as the games began. He
prayed that the system wouldn’t crash and that Watson
wouldn’t make any embarrassing mistakes. During the
final game, when it became clear to him that Watson
would emerge victorious, he glanced across the room and
caught Dave Ferrucci’s eye. Both smiled. They knew it was
done. They had bet big and won.

The Next Steps for Watson

Watson was not designed just for playing Jeopardy! From
the start, the goal was to create a technology platform that
could be adapted to a wide variety of uses—a practical
tool with the potential to transform business and society.
Rather than trying to create a monolithic set of rules for
analyzing data, the development team used many simpler
algorithms that could be added to or mixed and matched
depending on the task before Watson. They created an
analytics program for weighing and evaluating evidence
and conclusions, no matter what domain. And they made
it possible for experts in particular domains to con-
tribute knowledge to the program. The result: “We created
an architecture of discovery,” says Eric Brown, who heads
IBM Research’s Watson team.10
Today, IBM is adapting Watson to a number of indus-
tries. A team of software developers, with considerable
help from experts in various domains, is crafting Watson
solutions for health care and other industries. They’re
also creating applications for cross-industry uses, such as
customer-service management.
WellPoint, one of the nation’s largest health-benefits
providers, worked with IBM to produce a capability aimed at
improving the efficiency and effectiveness of health-care
payer operations—focused initially on the process of treat-
ment preapproval management. To expand Watson’s capa-
bilities, nurse clinicians trained Watson to understand the
U.S. medical treatment code system and WellPoint’s medi-
cal policies, clinical guidelines, and process for reviewing
members’ requests for treatment authorization. As of early
2013, Watson had absorbed more than 25,000 test cases and
1,500 of WellPoint’s real-life cases and gained the ability to
interpret the meaning of queries and analyze them. Then
WellPoint and IBM co-developed two integrated applica-
tions, one for use by payers, such as WellPoint’s health
plan subsidiaries, and the other for use by medical-care
providers. In one continuous process, physicians can
recommend treatments for specific patients and submit
those recommendations to health-care payers, who can
quickly approve the treatments. The system is now being
tested by WellPoint’s health plans and several hundred
health-care providers.11
A major cancer research center in the southwest is work-
ing with IBM to use Watson technologies as a medical-
research application to help its physicians choose the
best treatments for patients who have unusual conditions.
There’s a fairly large population of cancer patients for
whom reliable treatments have not yet been found. Here,
researchers for the cancer center are collaborating with
IBM engineers to feed into Watson vast quantities of dis-
ease, treatment, and outcome data and analyze it to find
hidden patterns and correlations. Do people with a combi-
nation of genetic dispositions and health problems react
better to a certain treatment? Would a drug that’s used
now for one cancer be useful for another? Or would a com-
bination of drugs help out? The hope is that the system
will give physicians a huge head start in identifying effec-
tive treatments, testing them in clinical trials, publishing
the results, and quickly getting solutions into the hands of
physicians all over the world.
ANZ, a financial-services firm headquartered in Aus-
tralia and with operations in thirty-three countries, is
working exclusively with IBM to explore the possibilities
of using the Watson technology in the financial-services
industry. The goal is for ANZ to provide its wealth manag-
ers with a cognitive assistant to help them better advise
the bank’s 2 million wealth-management clients. Here’s
a scenario the bank has worked up: Imagine that you’re
talking to your advisor about retirement planning. Amid
all of the uncertainty in today’s markets, you’re not sure
how to coordinate your investments, retirement sav-
ings, and insurance coverage in a way that will result in
a secure, well-funded retirement. The advisor, aided by
Watson, will be able to quickly develop a custom-tailored
plan to fit your needs.12
While this work is going on, the IBM Research team is
developing new capabilities that will be useful no matter
what the specific application is. The goal is to make Wat-
son better at learning and at interacting with humans. On
Jeopardy!, Watson could respond to a clue with the answer
it was most confident in. In the next-generation Watson
technology, people will be able to engage Watson in ongo-
ing dialogues aimed at finding the most useful insights.
After receiving an initial query, Watson will be able to ask
for additional information to help it understand more pre-
cisely what the human being wants to know. The learning
capability includes a series of steps: hypothesis generation
from evidence, hypothesis ranking, question generation,
and answer acquisition. In addition to getting answers
from humans it’s interacting with, Watson will be able to
tap a directory of experts for advice or even learn through
crowd-sourcing. As it acquires answers, it will build a col-
lection of learned axioms that strengthen its command of
given domains.
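
To make that loop concrete, here is a minimal sketch in Python. The
function names, the toy scoring rule, and the confidence threshold are
all invented for illustration; this is a guess at the shape of the idea,
not IBM’s actual architecture.

```python
# A toy sketch of the learning loop described above: hypothesis
# generation, ranking, question generation, and answer acquisition.
# All names, data structures, and scores here are invented.

def generate_hypotheses(query, evidence):
    """Propose candidate answers with a naive confidence score."""
    hypotheses = []
    for candidate, passages in evidence.items():
        support = sum(1 for p in passages if query.lower() in p.lower())
        hypotheses.append((candidate, support / max(len(passages), 1)))
    return hypotheses

def rank(hypotheses):
    """Order candidates from most to least confident."""
    return sorted(hypotheses, key=lambda h: h[1], reverse=True)

def next_question(ranked, threshold=0.8):
    """If confidence is low, ask the human for more information."""
    if not ranked or ranked[0][1] < threshold:
        return "Could you tell me more about what you want to know?"
    return None  # confident enough to answer directly

learned_axioms = []  # answers acquired from humans, experts, or the crowd

def acquire_answer(answer):
    """Acquired answers become axioms that strengthen a domain."""
    learned_axioms.append(answer)

# One turn of an ongoing dialogue:
evidence = {"erlotinib": ["EGFR mutations often respond to erlotinib"],
            "chemotherapy": ["standard chemotherapy for lung cancer"]}
ranked = rank(generate_hypotheses("EGFR", evidence))
follow_up = next_question(ranked)  # None here: the top answer is strong
```
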
Other improvements to Watson have already arrived. People
can now view the logic and evidence on which Watson
bases the options it presents. Watson is now able to digest
not just textual information but also structured statisti-
cal data, such as electronic medical records. A different
group at IBM is working on natural-language-processing
technology that will allow people to engage in spoken
conversations with Watson. At the highest level, many of
the changes are aimed at moving Watson from answering
specific questions to dealing with complex and incom-
plete problem scenarios—the way humans experience
things.
In fact, as people in particular professions and indus-
tries experiment with Watson, they find that the basic
question-and-answer capabilities, while useful, are not
the most valuable aspects of the systems. Rather, they see
Watson as a discovery tool that doesn’t just make them
better at what they already do; through human-machine
interactions, it helps them think differently and work dif-
ferently. David Nahamoo, who heads up IBM Research’s
speech group, says it this way: “Traditional computing sys-
tems are systems of record. The computers of the future
will be systems of engagement. They’re about dynamic
knowledge, not static knowledge.”13
Language is the key that makes it possible for machines
to help us better understand our world and to engage with
us. But don’t think of language only in the most obvious
sense—English, Chinese, Spanish, and the like. Every
domain of knowledge and expertise has its own special-
ized language, such as chemical symbols, medical images,
or the specific meaning embedded in arcane legal terms.
And that poses particular challenges for computers. As we
ask computers to understand and reason more like we do,
we must teach them not just the structure and vocabu-
lary of spoken and written languages but the words and
concepts that are particular to professional and business
domains. When they have that kind of capability, they can
begin to offer us not just answers to questions but fresh
insights that might not have occurred to us on our own.
Think, for example, about a challenge confronting a
corporate marketing department. Time and again, as
the marketing team tries out new techniques in a world
that’s quickly being transformed by digital media, social
networking, and mobile communications, it finds that
its campaigns are not effective. But what if a cognitive
assistant could help? The system could examine past cam-
paigns in detail, looking for common patterns that might
help the team spot what it’s doing wrong. At the same
time, the system can crawl the world of information for
case studies of successful digital-marketing campaigns
and spot the patterns that point to success. In order to
do this kind of work, the system will need to comprehend
the language and incorporate knowledge from a variety
of specialized domains, from classic marketing to search-
engine optimization and from behavioral psychology to
demographics and statistics. Everything that has a mean-
ing within a profession or domain must have the same
meaning to the system if it is to collaborate with people
at a deep level.
One of the most promising domains for this kind of
human-machine interaction is the health-care field. That’s
where the IBM partnerships with Memorial Sloan-Ketter-
ing Cancer Center and Cleveland Clinic come in.
Eric Brown’s team at IBM Research is working with
Cleveland Clinic on two Watson projects. First, they’re
embedding Watson into the curriculum of the clinic’s
Lerner College of Medicine. Physicians, faculty, and medi-
cal students there are helping to adapt Watson for use as
a collaborative learning tool to assist with medical infor-
mation exploration and training in clinical reasoning. In
addition, Cleveland Clinic is using Watson to help unlock
hidden insights in the unstructured data in anonymized
electronic medical records, with the goal of helping physi-
cians make more informed decisions about patient care.
Leaders at Lerner are among those who believe that med-
icine must be taught differently—through problem-based
learning and simulation laboratories where small groups of
students solve sample cases. The medical school’s students
use a manual visualization technique to work their way
through chains of evidence, inferences, and beliefs about
what’s wrong with a patient and what should be done to
help. They draw circles and arrows with pencil and paper
to represent pieces of evidence and potential conclusions.
The IBM Research team adapted the technique to cre-
ate a user interface called WatsonPaths, an interactive pro-
gram for teaching Watson how physicians solve problems.
Now, working with WatsonPaths, faculty and students will
be able to review the evidence the computer presents and
the inferences it draws. They will type in additional infor-
mation and insights. As the faculty members and students
interact with the machine, the arrows that represent the
strongest connections will become thicker and the circles
that represent important evidence and strong conclusions
will become more pronounced. In the background, Watson
will learn how to interpret the way physicians talk and write
about their craft. The students will train the machine and,
at the same time, the machine will help train the students.
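
One way to picture the data structure behind such an interface is a
weighted evidence graph. The sketch below is an assumption about the
shape of the idea, not IBM’s design; the node names, update rule, and
numbers are invented.

```python
# Illustrative sketch of an evidence graph like the one WatsonPaths
# draws: nodes are evidence or conclusions, edges are inferences whose
# weights "thicken" as faculty and students confirm them.

from collections import defaultdict

class EvidenceGraph:
    def __init__(self):
        self.edges = defaultdict(float)  # (evidence, conclusion) -> strength

    def infer(self, evidence, conclusion, weight=0.1):
        """The machine proposes an inference with an initial strength."""
        self.edges[(evidence, conclusion)] += weight

    def confirm(self, evidence, conclusion, boost=0.3):
        """A physician or student endorses it: the arrow thickens."""
        key = (evidence, conclusion)
        self.edges[key] = min(1.0, self.edges[key] + boost)

    def strongest(self, n=3):
        """The thickest arrows: the chain of reasoning to display."""
        return sorted(self.edges.items(), key=lambda kv: kv[1], reverse=True)[:n]

g = EvidenceGraph()
g.infer("blood in phlegm", "pulmonary involvement")
g.confirm("blood in phlegm", "pulmonary involvement")  # strength now 0.4
```
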
Just as developing a version of Watson for Jeopardy!
helped focus the early research efforts on the technology,
the team has set a new goal that’s even more challenging:
making Watson capable of passing the online section of the U.S.
Medical Licensing Exam, the professional exam that physi-
cians with an MD are required to pass before they’re permit-
ted to practice medicine. It includes sections on anatomy,
pharmacology, pathology, behavioral science, genetics, and
other medical disciplines. The goal is for Watson to be
capable of passing the exam as a measure of progress,
though the system will not actually take it. Professors and
students at Lerner are helping Watson achieve this capabil-
ity. “Problem-based learning techniques for teaching medi-
cal students align with the way doctors think. In addition,
medical students and physicians need help in harnessing
the volumes of evidence now available,” says Dr. James
Young, executive dean of the Cleveland Clinic Lerner Col-
lege of Medicine of Case Western Reserve University. “We
are helping to train Watson so that eventually Watson will
be a partner in the delivery of healthcare for our patients.”
Meanwhile, Memorial Sloan-Kettering Cancer Center,
WellPoint, and IBM are developing a Watson-powered prod-
uct called Interactive Care Insights for Oncology to help
oncologists access the latest information on treatments.
MSKCC is helping provision Watson with the body of
knowledge it will need to serve as a valuable resource
to help physicians make evidence-based decisions. The
corpus includes everything from case histories and vast
libraries of medical literature to clinical guidelines, best
practices from top physicians, and reports from drug tri-
als. Watson has by now ingested 600,000 pages of medical
evidence, 2 million pages of text from 42 medical journals
and clinical trials, and several thousand case histories.
The physicians are using all that data to train Watson to
make treatment recommendations that are aligned with
best practices. Once oncologists from MSKCC and com-
munity-based oncology centers begin testing the product,
additional learning will take place. Think of it as on-the-
job training for Watson. The more physicians interact with
the machine, the smarter it will get.
WellPoint has selected several community-based oncol-
ogy centers to test the Interactive Care Insights for Oncol-
ogy technology, including WESTMED Medical Group in
Purchase, N.Y. By evaluating the system’s performance
against their existing patient records, they can see how
well it works and provide feedback to make it more usable.
Once the application is operational, oncologists at WEST-
MED and other beta sites will be able to use it to tap into
the expertise and knowledge base of MSKCC and other
sources, resources they would not have had such ready
access to otherwise.
A key element of preparing Watson for this new role as a
physician’s helper is training it to interact with people the
way that’s most natural for humans. Earlier attempts
at creating support systems for physicians fell short partly
because they forced the doctors to adapt to the ways of the
computer. Watson’s user interface is being designed to
work on different types of computers, including desktops,
laptops, or even tablets that doctors can easily carry around
with them. Its procedures are based on the way physicians
like to work. Getting things done should be far simpler than
is typical in today’s human-machine interactions.

Scenario: Physician’s Assistant

Dr. Mark Kris, the chief of thoracic oncology at Memorial
Sloan-Kettering Cancer Center, has developed a detailed
scenario describing how he thinks Watson could be used
by oncologists:

Preparation: The oncologist is preparing to treat a
Japanese-American woman who has lung cancer. Ahead
of their first meeting, he reviews her personal informa-
tion and medical records, including diagnosis and test
results—presented to him by the Watson interface on
his tablet computer. He has already spoken to the refer-
ring physician. Armed with all of this information, he
begins developing a hypothesis.
But the oncologist wants a second opinion—from
Watson. So he types in a request for Watson to use all of
the information at its disposal to evaluate the treatment
options for this patient. Then he taps the “submit” but-
ton. In a matter of seconds, Watson pulls together the
patient information, medical records, relevant medical
literature, and treatment results based on MSKCC’s ex-
perience. Watson points out that women from Japan
who are nonsmokers are fairly likely to have a mutation
in a gene, EGFR. If so, a drug, Erlotinib, shrinks the can-
cer 95 percent of the time. Watson suggests a test to see
if the patient has the gene mutation.
Watson suggests other tests as well, and, after re-
viewing the options, the oncologist can tap a few but-
tons to submit a request to the patient’s insurance car-
rier for preauthorization of the tests.
First Consultation: The physician meets with the pa-
tient in his office. He talks to her about her condition
and asks her about her treatment preferences. She tells
him that she would prefer to avoid a treatment that
would result in loss of hair. It would be too disturbing
to her children. After the meeting, the doctor types his
consultation notes into Watson and orders the new tests.
Review of Test Results: A few days have passed since
the first visit, and the oncologist is reviewing the test
results in Watson. It turns out that the patient has one
of the EGFR mutations, but it’s a mutation for which
Erlotinib will not work.
Second Consultation: During the second consulta-
tion with his patient, the physician explains his delib-
erations. He also asks the patient if she has noticed
any changes in her condition since they last met. She
says yes. Two days earlier, she began noticing blood in
her phlegm. The physician asks Watson how that new
piece of information affects its suggested treatment
options. The program comes back with a very different
set of options, ranked on its confidence level.
Watson has 90 percent confidence in the top option
on the list, a cocktail of two often-used chemotherapy
drugs. In addition, Watson points out that the treat-
ment is less likely than others to cause loss of hair.
After discussing the treatment with his patient and
securing her agreement, the doctor taps a button on
the Watson interface to request preauthorization of the
treatment from the insurance company. All of the evi-
dence that he and Watson have gathered is forwarded
to the insurance company to support the request.

The system works for all the parties involved. The phy-
sician gets access to a tremendous amount of up-to-date
information that he might not have gathered and digested
on his own. The patient gets a carefully chosen treatment
that’s likely to be the most beneficial and that may spare
her unpleasant side effects. She also doesn’t have to wait
long to begin treatment. The system works for the insur-
ance company, too. It gets the evidence it needs to con-
fidently give the go-ahead for a treatment regimen. And
society gets better health care that’s more efficiently and
affordably delivered. “We’re making evidence-based deci-
sions. Watson proves it,” says Mark. “It’s going to be much
better for all of us.”14

3
Handling Big Data

The global financial crisis of 2008 had many causes, but,
boiled down to the essentials, it comes to this: Insti-
tutions and individuals didn’t fully appreciate the risks
they were taking. Lenders issued high-risk home loans
and securitized them without performing enough due
diligence. Investment banks packaged and resold them
in ways that obscured the underlying risks. Credit-rating
agencies gave the securities their seal of approval. Inves-
tors around the globe snapped them up. Government reg-
ulators failed to spot flaws in the system.1
The crisis was a failure of information. Though institu-
tions and individuals had access to immense quantities of
data about both the financial system as a whole and their
own parts in it, they were not capable of seeing what was
going on well enough to make prudent decisions. Major
changes are needed to prevent this kind of thing from hap-
pening again, including more effective regulatory moni-
toring and more responsible behavior by participants in
the system. But there’s an additional element that could
help prevent such a calamity: the emergence of technol-
ogy designed to help people understand the workings of
the entire system more deeply so they can better assess
the risks they face. Think of it as an early-warning system.
The financial crisis is a dramatic example of where tech-
nologists have the ability to harness data to help solve a
complex and critical problem.
Remember the metaphor of the nested Russian dolls
that we used earlier to illustrate the layers of cognitive
computing? In chapter 2, we described the outermost
layer—the transformation of computers into powerful
learning systems. Chapter 3 explores the doll that nests
inside that one—the new approach that these computers
will need for managing and analyzing data.
Today, we are witnessing the emergence of a new force in
society and business: big data. Organizations and individ-
uals are faced with a torrent of data, everything from struc-
tured information such as transaction records to a wide
variety of unstructured information—still images, video,
audio, and sensor data. The biggest new source of data is
the so-called Internet of things, data produced by sensors
and harvested via the Internet. The sensors involved range
from the RFID tags that retailers use to track merchandise
to video cameras that capture the flow of traffic. Every day,
as a group, human beings generate about 3 exabytes of
computer data—a prodigious output that is expected to
produce a data universe of 40 zettabytes of digital stuff by
2020.2 A zettabyte is a decidedly big number: a 1 followed
by 21 zeros. One zettabyte of storage would hold 250 billion
two-hour HD movies. This flood of information should be
tremendously valuable. When you get down to it, big data
is nothing less than the digital expression of life in the
raw, in all of its richness and ambiguity. As we mentioned
in chapter 1, it’s useful to think of all of these data as a
new natural resource, potentially even more valuable than
resources like oil, coal, or natural gas because data are an
essential ingredient in everything we do.
But a resource isn’t worth much unless you can take full
advantage of it. Today, less than 1 percent of the digital data
that have been collected is actually analyzed.3 This profu-
sion of data is difficult to capture, make sense of, and move
around. And, unfortunately, today’s computing systems
aren’t up to the task of handling all of this raw informa-
tion in an efficient and affordable way. That’s why we need
to develop new systems for managing big data. We need a
new generation of data storage, management, and analyt-
ics tools that will improve our ability to gather, meld, and
make sense of huge amounts of data and, in some cases, to
perform a complex ballet of tasks in real time.
The new generation of tools must be designed to han-
dle the four Vs of big data: volume, variety, velocity, and
veracity.

• Volume: The sheer amount of data we’re gathering is
a challenge in itself. While the cost of transporting and
storing a unit of data declines steadily, the amount we col-
lect is increasing at a much faster rate, so we have to find
new ways of moving and storing it.
• Variety: Until fairly recently, most business and gov-
ernment information was stored in simple spreadsheets or
relational databases made up of columns and rows—eas-
ily accessed using standard queries. But a host of newer
data types, including video, geospatial information, web-
pages, and speech, require new ways of managing data.
• Velocity: In some scenarios, dealing with the velocity
of data will be essential. So rather than first gathering and
storing data in databases, we must develop effective ways
of analyzing data in motion, drawing actionable insights,
and keeping only the bits that are essential.
• Veracity: Much of the new data we have access to now
are not precise or certain—or clearly organized. A lot of it
is just so much noise. The challenge is to make sure that
we’re getting the right information, that it’s accurate, and
that we’re able to draw the right conclusions from it.

How to Solve the Big Data Problem

A tremendous amount of progress has been made in the
way we handle data since the dawn of the programmable
computing era. The first commercial disk drive, intro-
duced by IBM in 1956, was the size of a small car and stored
just five megabytes of data. Today, a standard laptop com-
puter comes with a two-and-a-half-inch disk drive capable
of storing 1 terabyte of data, one million megabytes. Early
database-management systems stored information in
lists organized by categories and subcategories. The com-
puter user had to manually navigate through a series of
lists to find a particular nugget of information. In contrast,
today’s relational databases store information in columns
and rows that can be searched with standard query lan-
guages. Early computers calculated probabilities of partic-
ular outcomes based on statistical analysis of data related
to relatively narrow problems. Today’s advanced analytics
software programs find patterns in large sets of data and
extract meaning from them.
In order to handle the four Vs of big data, though, major
disruptive advances will be required in the storage, man-
agement, and analysis of data.

Volume

One of the most significant breakthroughs we see coming
for storage of huge amounts of data is the ability to use
cognitive techniques to more efficiently manage the flow
of data within computing systems. Today, data are stored
using a variety of technologies, depending on how often
access is needed and the cost of the storage medium. Mag-
netic tape is the cheapest form of storage, but data retrieval
is slow. Disk drives are more expensive but faster to access.
Data storage on memory chips is lightning quick but even
more expensive. Typically, database administrators move
data from one technology to another based on rigid rules
and schedules.
But that approach isn’t satisfactory in the era of big
data. So scientists at the IBM Research lab in Zurich,
Switzerland, are developing a sophisticated system for
optimizing the storage of data for particular uses by con-
sidering how much it costs to store the data and how
quickly the computer system will need to fetch specific
pieces of information. The system will learn by study-
ing the application it’s being asked to manage. Then it
will make adjustments as it monitors the application in
action—putting bits of data in the right places at the right
times.4
For instance, an application for monitoring security on
a city street will store most of the video and audio it gath-
ers on tape or disk drives. But when it picks up anoma-
lies that might indicate security threats, it will store data
about them on memory chips and revisit that information
frequently as additional sensor data are gathered. Only
the most relevant data are stored on the most expensive
media, and only for the time they are needed.
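
A highly simplified sketch of such a cost-aware placement policy might
look like the following; the tier names, costs, and access thresholds
are invented stand-ins, not the Zurich system’s actual parameters.

```python
# Toy tiering policy: put each object on the cheapest storage tier that
# can still keep up with how often the object is actually accessed.

TIERS = [  # (name, cost per GB, accesses per day the tier serves well)
    ("tape", 0.01, 1),
    ("disk", 0.05, 100),
    ("memory", 1.00, float("inf")),
]

def place(access_rate_per_day):
    """Choose the cheapest tier whose speed matches observed demand."""
    for name, cost, max_rate in TIERS:
        if access_rate_per_day <= max_rate:
            return name

def rebalance(objects):
    """Re-run placement as the system learns each object's pattern."""
    return {obj: place(rate) for obj, rate in objects.items()}

# Routine street video stays cold; an anomaly's data moves to memory.
print(rebalance({"routine_footage": 0.2, "anomaly_clip": 5000}))
# {'routine_footage': 'tape', 'anomaly_clip': 'memory'}
```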

Variety

Over the past decade or so, computer scientists and math-
ematicians have become quite proficient at handling spe-
cific types of data by using specialized tools that do one
thing very well. For example, a sales manager can employ
data analytics software to reveal past sales trends and fore-
cast future sales, using a specific tool for a specific data
type. But that approach doesn’t work for complex opera-
tional challenges such as managing cities, global supply
chains, or power grids, where many interdependencies
exist and many different kinds of data have to be taken
into consideration.
What’s required is a holistic approach to data manage-
ment and analysis. You have to be able to mash together
different kinds of data and act on them with different
kinds of tools. An early example of this approach is a proj-
ect that IBM researchers in New York and Brazil completed
on behalf of the city of Rio de Janeiro. Rio has recurring
flooding and landslide problems in many hilly neighbor-
hoods. The researchers used data describing the physics
of the atmosphere to create a mathematical model of how
storms are likely to unfold in Rio. With it, they can pre-
dict up to forty hours ahead of time how much rain will
fall in a particular location with 90 percent accuracy. They
also used topographical data and historical flood informa-
tion to create a model of how the water would flow in the
streets once it hit the ground. After combining the two
models, they were able to predict flooding and landslide
conditions. They also consider factors such as traffic pat-
terns on city streets. In this way, they’re making it possi-
ble for the city to anticipate flooding and landslides and
put resources in place ahead of time to deal with them—
including deciding where to deploy emergency crews and
where to open shelters. By combining the different data
types and techniques in this way, they created a system
that gathers a wide variety of information, evaluates it sys-
tematically, and reaches conclusions the way humans do.
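
In code, the heart of the idea is simply that the output of one model
becomes the input of the next. Here is a toy sketch of that chaining;
every formula and constant below is an invented placeholder, not the
researchers’ actual models.

```python
# Chain a stand-in weather model into a stand-in surface-flow model,
# then combine their outputs into a single landslide-risk score.

def rainfall_forecast(pressure_drop_hpa):
    """Pretend atmospheric model: bigger pressure drops, heavier rain."""
    return max(0.0, 12.0 * pressure_drop_hpa)  # forecast rain in mm

def runoff(rain_mm, drainage_capacity_mm):
    """Pretend topographic model: water the drains cannot carry."""
    return max(0.0, rain_mm - drainage_capacity_mm)

def landslide_risk(runoff_mm, soil_saturation):
    """Merge both models' outputs into one 0-to-1 risk score."""
    return min(1.0, 0.005 * runoff_mm * soil_saturation)

rain = rainfall_forecast(pressure_drop_hpa=8.0)   # 96 mm forecast
flow = runoff(rain, drainage_capacity_mm=40.0)    # 56 mm of runoff
print(landslide_risk(flow, soil_saturation=0.9))  # 0.252
```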

Velocity

Most of today’s computing tasks involve data that have
been gathered and stored in databases. The data make
a stationary target. But, increasingly, vitally important
insights can be gained from analyzing information that’s
on the move. This technology can be used for everything
from real-time stock trading to drilling for oil a half-mile
below the surface of the ocean to monitoring patients’
health in hospitals. Consider the benefits of real-time
monitoring of the vital signs of preemies in a neonatal
ward. Babies don’t show signs of infection the way adults
do. The danger is that they will become very sick before the
problem is detected and die before an effective treatment
can be administered. But by monitoring their vital signs
and detecting subtle patterns that indicate an infection
has set in, physicians can spot problems and begin treat-
ments twelve to twenty-four hours earlier—potentially
saving lives.
This approach is called streams analytics. Rather than
placing the data in a database first, the computer ana-
lyzes it as it comes in from a variety of sources, continu-
ally refining its understanding of the data as conditions
change. This is the way humans process information.
Streams analytics programs are extremely sophisticated.
Each kind of data must be monitored on its own, trans-
lated so it is compatible with other data types, integrated
with other data, and analyzed again in the context of the
overall flow of information. The translation part is partic-
ularly challenging. Spoken words, for instance, have to be
converted to machine-interpretable form.
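
Stripped of all the real sophistication the text describes, the
essential pattern is a rolling summary that is updated as each reading
arrives, with the raw stream itself discarded. A minimal sketch, with
invented thresholds and readings:

```python
# Analyze data in motion: keep only a small window of recent readings
# and flag values that deviate sharply from the recent norm.

from collections import deque

class StreamMonitor:
    def __init__(self, window=60, sigma=3.0):
        self.window = deque(maxlen=window)  # only the recent past is kept
        self.sigma = sigma

    def observe(self, value):
        """Ingest one reading; return True if it looks anomalous."""
        alert = False
        if len(self.window) >= 10:
            mean = sum(self.window) / len(self.window)
            var = sum((x - mean) ** 2 for x in self.window) / len(self.window)
            alert = abs(value - mean) > self.sigma * max(var ** 0.5, 1e-9)
        self.window.append(value)
        return alert

monitor = StreamMonitor()
heart_rates = [120, 122, 121, 119, 120, 121, 122, 120, 119, 121, 160]
alerts = [monitor.observe(r) for r in heart_rates]  # last reading alerts
```
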
In the coming years, streams programs will become
more cognitive and adaptive, explains IBM researcher
Nagui Halim, head of the streams technology project.
He foresees a time when managers will provide a basic
description of what they want a system to do and the
system will follow those specifications to design an
application made up of preprogrammed components
and then launch the application to get the job done. As
the applications gain real-world experience with the data,
the software will continuously sharpen its understanding
of what people are looking for and refine the answers it
provides.5

Veracity

Organizations face huge challenges as they attempt to get
their arms around the complex interactions between nat-
ural and human-made systems. The enemy is uncertainty.
In the past, since computing systems didn’t handle uncer-
tainty well, the tendency was to pretend that it didn’t exist.
Today, it’s clear that that approach won’t work anymore.
So rather than trying to eliminate uncertainty, people
have to embrace it.
One strategy is to build analytics software that makes it
possible to identify an optimal way of getting things done
based on mathematical probabilities while taking uncer-
tainty into account. The technique is called stochastic
optimization. It has been used for some time but is being
further developed to handle the problems of big data. The
optimization discipline has its roots in the early days of
computing. One of the pioneers in the field was Ralph
Gomory, who became the head of IBM Research. After
he got his Ph.D. in mathematics from Princeton in 1954
and served in the U.S. Navy, the navy hired him as a con-
sultant and asked him to use mathematical techniques
to help design the optimal naval taskforce—how many
aircraft carriers, battleships, and so on. He invented a new
method, called integer programming, to help solve the
problem. Initially, he did this work with pencil and paper,
then, with a mechanical calculating machine. Finally, he
was invited by the Rand Corporation, a think tank in Cali-
fornia, to use their large-scale computer, one of only a few
that existed at the time.6
Ralph created a program that could compute an opti-
mal design given a specific set of choices. But today we’re
faced with immensely complicated problems that contain
so many possibilities that it would take a long time, even
with a powerful computer, to explore all of them using
Ralph’s deterministic techniques. In fact, solving some of
the more complex problems would require more comput-
ing power than exists on earth. That’s why in the era of
big data we need to use stochastic optimization to get our
answers.
The term stochastic is derived from a Greek word for
“aim.” It refers to the fact that when an archer shoots a
series of arrows at a target, most will miss the bull’s-eye
and will create a cluster around it. Where each arrow falls
depends on a combination of factors, including the skill
and visual acuity of the archer, his emotions, the quality
of the bow and arrow, weather conditions, and the dis-
tance from the target. Stochastic optimization takes into
account such variables.
To perform stochastic optimization, computer scien-
tists use probability theory, the branch of mathematics
concerned with the analysis of random phenomena, to
analyze complex situations and figure out how to achieve
the best outcome while allowing for the effects of vari-
ability. The first step is building a mathematical model—
translating a real-world situation into a formula. They use
probability theory to identify the most likely effects of the
variables and then perform a series of analyses or simulations,
based on a subset of the possibilities, to determine the
most likely outcomes. Previously, using less sophisticated
techniques, it might take days of computation to arrive at
a useful result; now it can be done in seconds. In addition,
as models are tested against real-world outcomes, they
learn and get better over time.
To see how stochastic optimization works, consider
an electricity grid. Today, many electrical-distribution
systems are outfitted with smart meters that make it pos-
sible for consumers and operators of the system to know
how much energy is being used in real time. Based on
that information, consumers can make informed choices
about their consumption levels, and operators can pre-
dict with a reasonable certainty what demand will be at
a specific time. But what happens if the weather changes
suddenly or there’s a failure in the system and power is in
short supply? Typically, the operator would go to the spot
energy market and pay a premium for additional power.
In a worst-case scenario, the utility might not be able to
meet demand, resulting in service brownouts.
Using stochastic optimization, mathematicians at IBM
Research have built a computing system that can pre-
dict what incentives would be required for consumers to
reduce demand in response to a sudden change in supply.
Rather than having to decide what to do on a case-by-case
basis, consumers can preprogram their smart energy-
management systems to respond instantly and to dial
back energy use under certain circumstances. Over time,
cognitive computer programs will be designed that learn
from the ongoing interactions and continuously optimize
the incentive systems.7 The consumer’s devices will be
in touch with the operator’s devices, and, as a result of
their automated negotiations, the operators will be able
to avoid paying a premium for energy on the spot market
and, in some cases, to prevent service outages.
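
A bare-bones sketch of the stochastic-optimization pattern behind an
example like this one: sample many possible shortfall scenarios, then
choose the incentive level with the lowest expected cost. The prices,
distributions, and response model are all invented.

```python
import random

random.seed(42)
# Uncertain world: how many megawatts short might the grid be?
scenarios = [max(0.0, random.gauss(50, 20)) for _ in range(10_000)]

SPOT_PRICE = 300.0  # invented premium price on the spot energy market

def cost(incentive, shortfall):
    """Consumers shed load in proportion to the incentive offered;
    whatever they don't shed must be bought at the spot premium."""
    shed = min(shortfall, 2.0 * incentive)
    return incentive * shed + SPOT_PRICE * (shortfall - shed)

def expected_cost(incentive):
    """Average the cost over every sampled scenario."""
    return sum(cost(incentive, s) for s in scenarios) / len(scenarios)

# Search candidate incentive levels for the lowest *expected* cost,
# rather than solving deterministically for one assumed future.
best_incentive = min(range(0, 101, 5), key=expected_cost)
```
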
Stochastic optimization addresses some of the chal-
lenges posed by the uncertainty inherent in big data, but
not all of them. Machines, on their own, don’t necessarily
recognize discrete pieces of information that are related
to one another but are scattered among multiple data
storehouses. How do they sort through all the evidence
and draw conclusions? They must learn to understand
context. Today, the most advanced approach to achieving
that aim is an emerging class of technology called entity
analytics. The software collects diverse data about real-
world entities (people, places, and things) and makes it
possible to identify who or what they are and to figure out
the relationships among them.
One of the pioneers of the field of contextual analytics
is Jeff Jonas, the chief scientist of IBM Entity Analytics. Jeff
took an unusual path to get to IBM. A serial entrepreneur
for two decades, he developed the entity analytics technol-
ogy to help casinos quickly detect the presence of known
cheaters and card counters. His big breakthrough came
after a casino asked him to find a way to ensure it was not
doing business with people who had been barred by the
gaming regulators. At the time, this casino had eighteen
different kinds of watch lists and information available
from other internal sources, ranging from job applica-
tions and hotel reservations to enrollment in the loyalty
club. Jeff’s technology, known as NORA, for Non-Obvious
Relationship Awareness, helped casinos make connec-
tions among related pieces of information, which, among
other things, helped them spot the notorious MIT card-
counting team that was featured in the best-selling book
Bringing Down the House and the movie 21. IBM bought
Jeff’s company, Systems Research & Development, and
since then he has added new features, including the abil-
ity to analyze streams of information in real time.
Jeff’s software essentially builds and maintains a giant
jigsaw puzzle on the fly, making connections between
newly acquired data and older information. With each
additional piece of information, more context accumu-
lates, and the software’s insights and understanding
improve. The technology might conclude, for example,
that the Jonathan W. Smith who is applying for a credit
card in New York City is the Jon Smith who was convicted
of credit card fraud in Jersey City five years earlier. The
system draws a conclusion based on data matches, such
as date of birth, previous addresses, phone numbers, and
other bits of data.
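
A drastically simplified sketch of that kind of matching: weight the
identifying fields two records share, and declare a match past a
threshold. The fields, weights, records, and threshold are invented,
and real entity analytics is far subtler.

```python
# Score how strongly two records point at the same real-world person.

WEIGHTS = {"dob": 0.4, "phone": 0.3, "prev_address": 0.3}

def match_score(a, b):
    """Sum the weights of identifying fields the two records share."""
    return sum(w for field, w in WEIGHTS.items()
               if a.get(field) and a.get(field) == b.get(field))

applicant = {"name": "Jonathan W. Smith", "dob": "1975-03-02",
             "phone": "212-555-0147", "prev_address": "Jersey City"}
convict = {"name": "Jon Smith", "dob": "1975-03-02",
           "phone": "212-555-0147", "prev_address": "Jersey City"}

same_entity = match_score(applicant, convict) >= 0.7  # True here

# New evidence can also undo a match: re-scoring after a field is
# corrected may drop the pair back below the threshold.
```
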
The system is quite nimble. New data can be ingested
in real time, and new insights can be drawn from them
immediately. It can even change its mind about a conclu-
sion it has reached based on new evidence. And the system
can even spot connections that are meaningful and alert
people to them before they have to ask. Says Jeff: “Because
there are not enough humans to dream up and ask every
question every day, the data must find the data and the
things that are relevant must find you.”8
We’re still in the early days of being able to gather and
understand context in effective ways. The IBM Research
scientist Sam Adams foresees the emergence of comput-
ing systems that will gather huge amounts of information
from a large number of sources, find patterns and cor-
relations and linkages in it, and then apply specialized
information-management software that Sam calls context
engines. The process is like finding many needles in a field
full of haystacks, combining those needles, then search-
ing the smaller but still substantial collection for the nee-
dles made of gold—the truly valuable slivers of insight.9
All of this may sound frightfully inefficient, but it’s the
way we humans make sense of the world around us. We
have experiences, gather sense- and language-based infor-
mation, see patterns, learn from successes and mistakes,
draw conclusions, and act on them. Increasingly, machines
will operate in a similar way. Computing systems will need
different kinds of data-organizing structures to accom-
plish this. A technology called the graph database could
become essential. This allows people to envision a web of
relationships that lies hidden in collections of unstruc-
tured data. For instance, consider how the technology
could be used in mining insights from social networks.
Jim Smith, who lives in New York City, has friendships on
social-networking sites with a number of people who, like
him, list Bruce Springsteen among their favorite perform-
ers. Several of the others live in Buffalo and went to the
same university as Jim, the State University of New York at
Buffalo. Some of the others have indicated on their social
networks that they “like” the Chipotle restaurant chain.
Springsteen is performing in Buffalo next month. Armed
with this web of information, an Internet travel site might
consider offering Jim an airline and entertainment pack-
age at a discount combining airfare to Buffalo, a night
in a hotel, two tickets to the Springsteen concert, and a
coupon for dinner at Chipotle. The travel site’s computer
system will use the graph database in much the same way
that a human being makes connections among his or her
relationships and interests.
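
A toy sketch of how such a web of relationships might be stored and
walked; the friend named here is invented, the representation is a
generic adjacency list, and no particular graph-database product is
implied.

```python
# Store entities and typed relationships, then walk the graph to
# surface a bundle of related facts like the offer imagined above.

from collections import defaultdict

graph = defaultdict(list)  # node -> [(relation, node), ...]

def add(subject, relation, obj):
    graph[subject].append((relation, obj))

add("Jim Smith", "friend_of", "Ann")
add("Ann", "lives_in", "Buffalo")
add("Jim Smith", "likes", "Bruce Springsteen")
add("Ann", "likes", "Chipotle")
add("Bruce Springsteen", "performs_in", "Buffalo")

def neighbors(node, relation):
    return [o for r, o in graph[node] if r == relation]

# Find cities where an artist Jim likes is playing and a friend lives.
for artist in neighbors("Jim Smith", "likes"):
    for city in neighbors(artist, "performs_in"):
        friends = [f for f in neighbors("Jim Smith", "friend_of")
                   if city in neighbors(f, "lives_in")]
        if friends:
            print(f"Offer Jim a {city} package: {artist} show, "
                  f"plus a visit with {', '.join(friends)}")
```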

The Emergence of Pervasive Analytics

We’re at an inflection point in the evolution of data ana-
lytics. The combination of massive amounts of informa-
tion and the tools to deal with it creates a new opportunity
that has huge implications for business, society, and
individuals. Today, computers are everywhere, thanks,
in large part, to the revolution in mobile communica-
tions that has brought us all manner of smartphones and
digital tablets. Now we’re on the front edge of a second
wave of everywhere-ness as data analytics also becomes
pervasive. Thanks to mobile communications, we no
longer need to be tethered to the monitor and keyboard
to get new insights because analytics allows individuals
and organizations to have the information they need
when they need it. The insights from analytics will soon
no longer be reserved for an elite few. In the future, these
insights will be available to just about anybody.
Pervasive analytics will enrich our private lives, as well.
People love the apps on their smartphones that allow them
to ask a simple question, such as, “Where’s the nearest
pizzeria?” and get the answer via a friendly robotic voice
in a matter of seconds. But what if you could ask more dif-
ficult questions, such as, “Can I afford to pay for the car?”
with the implicit assumption that the app already knows
you need to help out with your kid’s college tuition next
year, or, “Is it worth fixing my car?” knowing that the app
is tuned in to the mileage, maintenance record, and book
value of your car. Suddenly, you can get not just handy
information but useful insights that help you conduct
your life more successfully.

Scenario: The Cognitive Enterprise

In the coming era of cognitive systems, organizations
will acquire powerful new capabilities for using big data
to make better decisions. Dario Gil, an executive at IBM
Research who develops technology solutions for the
energy and natural resources industries, believes that
for organizations to maximize gains from cognitive sys-
tems, they will need to transform the way they operate to
become more dynamic, transparent, and collaborative.
And he believes the spaces in which they operate will influ-
ence whether they can achieve the transformation.
That’s why he designed the Cognitive Enterprise Lab,
a special workspace at the IBM Research lab in Yorktown
Heights, N.Y. As Dario sees it, the way companies do busi-
ness is expressed in the way they organize their work envi-
ronments. And most companies organize things in the
same ways—with private offices for executives, cubicles
for everybody else, conference rooms for group meetings,
the boardroom, and, in some cases, special rooms for
teleconferencing. Over time, the ways that people behave
and interact in these rooms become codified. Traditional
power hierarchies often determine who sits where and
who says what. The status quo is reinforced. Think of it
this way: ergonomics is destiny. The Cognitive Enter-
prise Lab is designed to feel different from other work
environments, signaling that the old rules of interaction
have been suspended. The room is not even rectangular;
instead, one of the four walls slants at an angle. The chairs
and tables can easily be reconfigured depending on how
the space is to be used.
But the technology is what truly sets the space apart.
One entire wall is covered with a large high-definition
visualization screen upon which presentations and
graphics can be displayed and manipulated using hand
gestures. Other screens designed for aiding group inter-
actions can be positioned around the room by moving
them via tracks suspended from the ceiling. Cameras
capture speakers for video conferencing no matter where
they sit or stand in the room—and also record facial
expressions and body language. (The video might be ana-
lyzed later by the computer system, and insights drawn
from it can be given back to the speakers so they see how
people reacted to specific moments of a presentation.)
Acoustic arrays help with speech recognition and pro-
cessing. The idea is to create the richest possible environ-
ment for people’s interactions with data and one another
in order to enhance the group’s collective intelligence—
a term used by MIT’s Thomas Malone. Dario says, “The
goal is to have the system partner with the humans
inside the cognitive environments to transform the deci-
sion-making activities of teams that are dealing with very
complex problems.”10
Imagine how a large oil company might use cognitive
capabilities in such a space a few years from now. A criti-
cal decision-making juncture for the oil-exploration busi-
ness is whether to bid to secure leases on new oil fields.
Today, typically, the analysis that leads to such decisions
is performed in relative isolation by a handful of scientists,
financial analysts, and executives who rely on information
from seismic studies and publications from scholarly
and trade journals. The information is then combined
with their own experience-driven hunches about where
rich and accessible oil deposits might be located. Geolo-
gists, geophysicists, petroleum engineers, and financial
analysts each have their own sources of data, expressed in
the vocabularies of their domains. The decision maker in
charge of the bid speaks to each group of specialists sepa-
rately to gather information and advice.
But what if all of the parties to the decision were united
in an environment such as the Cognitive Enterprise Lab?
There, the group could carry on its conversations in part-
nership with a learning system that has access to mil-
lions of pages of scientific and industry documents, data
about the fields being considered, and information about
the company’s financial resources and about current and
future market scenarios. The system could integrate the
data, analytics, and simulations from different sources
so the group can perform cross-domain analysis of the
options. When considering how much oil will be avail-
able in a particular location and how much it will cost to
extract, members can call up visualizations of the scien-
tific data on the shared digital wall or personal screens,
where the data are expressed in an individually tailored
way. Because everyone is talking together and viewing and
interacting with the data at the same time, it’s less likely
that there will be misunderstandings or that anybody will
make recommendations or decisions that are not fully
supported by the evidence.
This environment could be used in a wide variety of
situations. A group of doctors could gather to evaluate a
particularly complex case. Public health officials could
manage a pandemic. Venture capitalists could size up a
new category of investments. Consumer-electronics com-
panies could plan the launch of new gizmos. Government
bodies could develop new legislation. City managers
could respond to natural disasters. In the Cognitive Enter-
prise Lab, it is more likely that the best ideas—backed by
evidence—will win.
Dario’s lab illustrates a key point about the era of cogni-
tive systems. No longer will the deep analysis of complex
situations belong only to data specialists. Instead, prob-
lem solvers of many different talents will be able to inter-
act with one another and with data in ways that were not
possible before.

Journey of Discovery:
The Ultimate Big Data Challenge

The Square Kilometre Array is one of the most ambitious
scientific projects ever undertaken. Its organizers plan on
setting up a massive radio telescope made up of more than
half a million antennas spread out across vast swaths of
Australia and southern Africa. When it’s completed in 2024
or so, astronomers will be able to analyze the data to bet-
ter comprehend the history of the universe and the nature
of matter. The SKA is a scientific sibling to the Large Had-
ron Collider in Switzerland. There, scientists are studying
the tiniest elemental particles for answers to some of the
fundamental riddles of existence. In contrast, “Our labora-
tory is the whole universe,” says Marco de Vos, managing
director of ASTRON, the Netherlands Institute for Radio
Astronomy, which, along with IBM, is proposing an infor-
mation-technology system to manage the SKA’s data.11
The SKA is an iconic example of the need in the coming
era for what we call radical collaboration. It’s such a com-
plex project, requiring so many skills and so much invest-
ment, that many participants are required. Ten nations
are backing the project, and thousands of scientists and
hundreds of companies are ultimately expected to play
roles in designing, building, and operating it. IBM and
ASTRON are melding their skills and resources in a true
partnership, and they have invited other organizations to
participate. Already, Square Kilometre Array South Africa,
an agency of the South African government, has joined.
In addition, ASTRON and IBM plan on tapping a global
network of hundreds of subject experts from industry and
academia to help design the system.
The SKA is the ultimate big data challenge. The tele-
scope will collect a veritable deluge of radio signals from
outer space—amounting to fourteen exabytes of digital
data per day. (The data collected by the SKA in a single
day would take nearly two million years to play back on an
iPod.)12 Because the telescope is to be made from so many
individual antennas, the antennas are to be so widely scat-
tered, and such a large volume of data is being gathered,
a novel computing system must be developed to manage
the process of gathering, storing, and analyzing data from
end to end.
ASTRON and IBM have combined forces on a five-year
initiative called DOME to create such a system. Their work
together began years before, when they collaborated on
the information-technology system for a radio telescope
called LOFAR. LOFAR is radically different from conven-
tional radio telescopes. Unlike the traditional large steel-
mesh dishes pointed at the heavens, the more than 10,000
LOFAR antennas, which act like a single virtual telescope,
are low profile and low cost. The antennas are set up in
clusters on vast fields in the Netherlands and elsewhere
in northern Europe. The antennas don’t move. The sig-
nals gathered by them are combined in a clever way to tar-
get objects of interest in the sky. ASTRON hopes LOFAR,
which is now up and running, will be used as a model for
the SKA telescope. It will be the test for the DOME proj-
ect and one of the pilot projects testing the SKA antenna
technology.
During a recent visit to one of the LOFAR fields,
ASTRON’s Albert-Jan Boonstra, a codirector of the DOME
project, explained how the equipment works. One antenna
type is made up of five-foot-tall posts that are held in place
by four wires, which are also key elements of the antenna
systems. A small disk at the top of the post contains elec-
tronics that amplify the radio waves and then transmit the
data along underground cables to a nearby metal box the
size of a small car, where more processing takes place. The
other antennas are spear-shaped metal pieces arrayed in
clusters on electronic circuit boards and covered with Sty-
rofoam sheets and plastic tarps. Most of the data is sent
by fiber-optic cables to an IBM Blue Gene supercomputer
at the nearby University of Groningen, where it is filtered and
correlated.13
Albert-Jan grew up in a small village in the Netherlands
and has been obsessed with space since watching the
Apollo moon landing on TV as a ten-year-old. Inspired, he
and a friend made their own rocket out of tin cans and
wires, with gasoline for fuel—only to see it explode and
burn on the launch pad. “I’m fascinated with the idea that
we’re tiny humans in an endless, nearly empty universe. I
wanted to see how far we can look into space,” he says.
But Albert-Jan knows he won’t be able to penetrate deep
into space without some significant advances in computer
science. In fact, the DOME project will require more than
a half-dozen major breakthroughs because of the huge
amount of data and the fact that the antennas are to be
scattered over such a wide area. The signals from outer
space are a combination of valuable data and meaningless
noise and have to be processed to sort out the useful stuff.
To control costs, designers of the computing system have
to figure out how to minimize the amount of energy used
for processing data. At the same time, since so much of
the energy in computing is required to move data around,
they have to discover ways to move the data as little as pos-
sible. Faced with these challenges, Albert-Jan and Marco
engaged IBM Research–Zurich in a sort of show-and-tell
for emergent technologies. In a series of meetings held
at the lab, IBMers presented twenty proposals, of which
ASTRON ultimately accepted seven, each intended to over-
come a major information-technology hurdle posed by
the SKA.
While these technology projects are focused on the
SKA, they could help solve problems across a wide array
of computer science fields, from data management and
analytics to computer system design and chip design. The
SKA “is an extreme situation,” says Ton Engbersen of IBM
Research–Zurich, the codirector of the DOME project.
“When we solve it, we’ll have a good handle on solving the
other big data challenges.”14
Several of the DOME subprojects feature cognitive com-
puting technologies. The most audacious of them, called
Algorithms & Machines, is focused on creating an ultra-
sophisticated software program that will help the team
design the entire DOME system in a holistic manner—so
the technology can handle the extreme data-processing
demands of the SKA without breaking the international
organization’s budget. It’s one of the most demanding sys-
tem-optimization challenges ever—the great-grandchild
of Ralph Gomory’s U.S. Navy fleet-optimization project in
the 1950s.
Algorithms & Machines is the brainchild of Ronald
Luijten, a Dutchman who has worked at IBM Research–
Zurich for twenty-eight years. He’s a student of computer
history and was impressed when the supercomputing pio-
neer Seymour Cray recounted in a speech years ago that
he designed the architecture for the Cray 1 supercomputer
in the early 1970s by essentially locking five engineers in
an office with him and a white board for two weeks. Such
a feat would be impossible today. Ronald figures that it
would require upward of 250 experts in a wide array of
computer science domains to design the architecture of
the DOME system. Hiring all those people and bringing
them together isn’t practical. So instead with Algorithms
& Machines, he and his team are gathering all of the per-
tinent knowledge in a repository, setting the parameters
for the entire computing system, and creating optimiza-
tion algorithms. The system will learn what they want and
prepare a recommendation on how to fulfill their needs.
Ronald sees this effort as nothing less than an attempt to
map out the contours of the new era of computing. “This
design-exploration tool will help us figure out how to
design everything from the chip of the future to the data-
center of the future,” he says.
Big data creates gigantic challenges, but it also offers
the potential of tremendous advances in the ways human
beings use information. In these huge data-intensive
projects—the SKA, Watson—global communities of sci-
entists and domain specialists are gathering large data
sets and building sophisticated models for understanding
natural and human phenomena. In a sense, with this data
and with the tools to understand it, we’re creating a collec-
tive intelligence that can be shared and used by many for
the betterment of humankind. That’s the ultimate prom-
ise of big data.

4
Augmenting Our Senses

Watson’s performance on Jeopardy! was remarkable.
The machine searched through millions of pages
of information for possible answers and came up with
solutions in about three seconds. But Watson was only
searching one kind of data—textual information. All Wat-
son could do was read and reason. But what if it possessed
machine equivalents of all of the human senses? What if it
could see, hear, touch, taste, and smell?
These new capabilities are crucial because of the explo-
sion of new kinds of data. The rapid growth of low-cost
sensors makes it possible to monitor everything from
leaks in water systems, to the performance of electrical
grids, to the location and movement of merchandise in
stores and warehouses. The proportion of the world’s data
that comes from such sensors is expected to increase from
11 percent in 2005 to 42 percent in 2020. At the same time,
the amount of video, still images, and audio content on
the Web is exploding. It’s estimated that seventy-two hours
of video content are uploaded to YouTube every minute.1
Yet most images and audio on the Web are searchable
only via metadata—words that are manually typed into
forms by humans. So it’s vital to develop systems that can
recognize images and sounds more like humans do. As
the world becomes more complex and as its complexity
becomes increasingly decipherable via analytics, we will
need machines that can comprehend what’s going on in
the physical world and provide an expansion of human
senses.

Sensing the Future

It’s also going to be mind-blowing. At IBM Research, just
for fun, we recently conducted an exercise where we asked
specialists in each of the machine-sensing fields to imag-
ine advances that would likely come in the next five years.
They envisioned a handful of scenarios. Some of them are
practical; others, intriguing:

Sight

Computers will help us sort through the 500 billion photos
that are taken each year, finding relevant images not only
through metadata but also by understanding what’s going
on in the images. We’ll train the computers to know what
to look for, and they will learn more on their own as they
interact with the images and information pertaining to
them. Imagine a computer program that combs through
all of the family photos you and your relatives have taken
and assembles a family album, in chronological order,
complete with descriptions of the who, what, and where
of each photo.

Hearing

Computers will help us understand what we hear. We
will record sounds that babies make when they’re hun-
gry, sleepy, in need of a pat on the back, or in distress. Then
the system will map the sounds to brain, heart, digestion,
and lung activity. As a result, we’ll be able to translate
their cries and whimpers into messages adults can readily
respond to, extracting language from prelinguistic sounds.

Smell

Computers will be able to detect diseases via smell. Simi-
lar to how a breathalyzer can detect alcohol from a breath
sample, sensors in hospital rooms or even smartphones
will be able to identify molecules in the air and match
them with known biomarkers associated with liver and
kidney disease, diabetes, and tuberculosis. People won’t
have to wait for their annual physical to learn that some-
thing might be amiss. They’ll get an early warning.

Touch

You will be able to reach out and touch through your smart-
phone. When you’re shopping online, haptic technolo-
gies, which use electro-mechanical means to simulate the
feel of things, will make it possible for you to distinguish
among different materials, textures, and weaves. You’ll be
able to feel a sweater, jacket, or upholstery sample right
through the screen.

Taste

Using human beings as sensors, we're exploring the use
of data analytics and learning technologies to create
recipes that both taste good and are good for you. The
technology analyzes food in terms of how chemical com-
pounds interact with one another coupled with knowl-
edge about which combinations produce pleasant tastes.
Then it matches those factors with the health charac-
teristics of the compounds. The results could not only
delight people but help them lose weight and live longer
and happier lives. The technology represents an advance
in the field of computational creativity. It's the first time
that chemistry has been combined with cultural artifacts
(recipes) and psychological principles to create some-
thing that’s brand new.

3O3O3O3

Today, robots with rudimentary sensing abilities generate


a lot of excitement in the popular media. The Mars rover
Curiosity operates on the surface of the Red Planet semi-
autonomously and "sniffs" the air for methane gas, search-
ing for signs of life. Experimental self-driving vehicles
are capable of using vision and other senses to navigate
through cities without colliding with buses, running over
pedestrians, or getting speeding tickets. Yet, as of now,
robots remain firmly in the von Neumann computing par-
adigm. They must be programmed in advance by people
to deal with nearly every situation they encounter.
Already, learning systems are playing important roles
in advances in speech recognition and image recognition.
Some voice-recognition systems, for instance, get better
at understanding an individual’s manner of speech the
more they interact. But the scientific community is just at
the beginning of making machines that learn like humans
do, so we’re just scratching the surface of the potential of
computer sensing. “The excitement with machine learn-
ing is figuring out how we build a computer to recognize
the things in the world and name them,” says John Smith,
a senior manager in intelligent information management
at IBM Research. “A machine needs to understand some-
thing about the world to understand pictures and videos,
and one way that it learns about the world is by examining
pictures and videos.”2
All sensing and learning technologies have basic ele-
ments in common, starting with the sensor, a device or
capability that can detect something about the physical
world. Then there’s the ability to extract features from the
sensory input, such as colors or edges in visual data or
phonemes in voice recognition. The next step is modeling.
The computing systems use a variety of analytical tools to
group together features that constitute an entity, such as
a person, a building, a song, or an odor. Then they apply
semantics to place the entities in categories, such as loca-
tions, events, and behaviors.
Machine learning is employed at each step. The system
is instructed by the software to recognize features, enti-
ties, and semantic categories. Ultimately, because we can’t
teach machines to identify everything and every relation-
ship among things, they have to be capable of inductive
reasoning, in which they derive propositions from specific
examples. That’s the point at which the cognitive systems
begin to advance beyond the confines of von Neumann
computing. They don’t just do what they are told; they fig-
ure out what to do on their own.
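
To make the pipeline just described concrete, here is a
minimal sketch in Python of its four stages: sensing,
feature extraction, modeling, and semantics. The "sensor,"
the features, the rule, and the labels are all invented
for illustration; a real cognitive system would use learned
models at every stage rather than hand-written thresholds.

    # A toy sense -> features -> entity -> semantics pipeline.
    # All data, thresholds, and labels are illustrative only.

    def sense():
        """Stand-in for a sensor: returns raw audio samples."""
        return [0.1, 0.9, 0.8, 0.2, 0.85, 0.9]

    def extract_features(samples):
        """Pull simple features out of the raw sensory input."""
        mean = sum(samples) / len(samples)
        spread = max(samples) - min(samples)
        return {"loudness": mean, "spread": spread}

    def model_entity(features):
        """Group features into a candidate entity. A real system
        would use a trained classifier, not a fixed rule."""
        if features["loudness"] > 0.5 and features["spread"] > 0.5:
            return "siren"
        return "background noise"

    def apply_semantics(entity):
        """Place the entity in a category, such as an event."""
        categories = {"siren": "emergency event",
                      "background noise": "ambient sound"}
        return categories.get(entity, "unknown")

    features = extract_features(sense())
    entity = model_entity(features)
    print(entity, "->", apply_semantics(entity))
    # prints: siren -> emergency event
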
In order to make this great leap and become true think-
ing machines, the cognitive systems of the future will inte-
grate information from multiple sensing technologies.
Today, as scientists labor to create machine technologies
to augment our senses, there’s a strong tendency to view
each sensory field in isolation as specialists focus only on
a single sensory capability. Experts in each sense don’t
read journals devoted to the other senses, and they don't
attend one another’s conferences. Even within IBM, our
specialists in different sensing technologies don’t inter-
act much. Yet if machines are to help humans understand
the world, they have to make sense of it and communi-
cate about it in a way that’s familiar and comprehensible
to humans. This integration of data from various sensing
technologies is beginning to happen in multimedia and
visual analytics, where vision and sound are correlated.
But that’s just the start of what will be required in the next
era of computing.
Imagine a security system in a major port city a dozen
years from now. On the docks, video cameras record the
movements of people and equipment and ships. Micro-
phones record sounds. Sensing devices sniff for smoke or
noxious gases. It would be prohibitively expensive to have
people constantly monitoring all of these streams of infor-
mation, but computers can integrate the information in
real time and detect patterns and correlations that indicate,
say, that an arsonist has boarded a ship late at night, broken
a window, and set a fire. The authorities might be alerted so
quickly that they arrive on the scene not only before the fire
has spread but also before the arsonist has fled.
All of these new tasks will require a tremendous amount
of data processing. Millions of bits of digital information
are contained even in a single frame. A computer has to
extract various features, classify them, and build a seman-
tic understanding to make sense of that single image—
and a full video contains thousands of these. That’s a lot
of data to process.
Yet we constantly size up the world around us as a mat-
ter of routine. The human brain is a marvel. A mere 20
watts of power is required to run the 22 billion neu-
rons in a brain that’s roughly the size of a grapefruit. To
field a conventional computer with comparable cogni-
tive capacity would require gigawatts of electricity and a
machine the size of a football field. So, clearly, something
has to change fundamentally in computing for sensing
machines to help us make use of the millions of hours of
video, billions of photographs, and countless sensory sig-
nals that surround us.
In order to move forward, we need a new kind of chip,
now on the drawing boards, that embodies the principles
of cognitive computing in the realm of the senses. The goal
is not to reproduce a human brain in electronic circuitry;
rather, it is to take inspiration from the way the brain
works and use it as a model for designing chips and sys-
tems. By combining the principles of neuroscience with
techniques from nanotechnology, scientists can design
chips and systems that not only analyze massive amounts
of complex information from multiple sensory modalities
at once, like brains do, but also can dynamically repro-
gram themselves as they interact with their environments.
So how would a brain-based technology like this work?
In the simplest terms, the brain operates like this: Spe-
cialized cells, the neurons, receive, integrate and send
messages, which are electrochemical pulses of varying
amplitudes. The messages are passed along axons, the
paths between the neurons’ cell bodies, to the synapses,
the junctions between neurons. Through this messaging,
neurons collaborate to form networks that are capable
of sophisticated information processing.3 This network
of neurons is the organic circuitry of the brain, and the
neurosynaptic chips now under development mimic that
design.
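
One common way to capture that behavior in software is
the classic "leaky integrate-and-fire" neuron model, a
textbook simplification sketched below with made-up
constants, not IBM's actual chip design. Each simulated
neuron accumulates weighted input pulses, its charge leaks
away between time steps, and it fires a spike of its own
when the charge crosses a threshold.

    # Minimal leaky integrate-and-fire neuron; the weights,
    # leak rate, and spike trains are illustrative only.

    LEAK = 0.9        # fraction of charge kept each time step
    THRESHOLD = 1.0   # membrane potential needed to fire

    def simulate(spike_trains, weights, steps):
        """spike_trains: one 0/1 list per input synapse."""
        potential = 0.0
        fired_at = []
        for t in range(steps):
            potential *= LEAK                  # charge leaks away
            for train, w in zip(spike_trains, weights):
                potential += w * train[t]      # integrate inputs
            if potential >= THRESHOLD:         # fire and reset
                fired_at.append(t)
                potential = 0.0
        return fired_at

    inputs = [[1, 1, 0, 1, 1], [0, 1, 1, 1, 0]]
    print(simulate(inputs, weights=[0.5, 0.4], steps=5))
    # prints: [1, 3]
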
Conventional chips are limited by the fact that a sin-
gle transistor can communicate with a maximum of four
other transistors at the same time. In neurosynaptic
computing, each tiny electronic element speaks to thou-
sands of others. The chip circuitry is networked like the
brain. Further, logic and memory are intermixed within a
single chip, breaking the von Neumann bottleneck that we
talked about in chapter 1. That allows computing systems
based on the cognitive chips to store enormous amounts
of information and act on it efficiently.
There’s a debate in the scientific community, including
within IBM, about whether it will be better to use a digital
or analog approach to designing cognitive chips. Digital
chips process information in ones and zeros. They com-
pute using mathematics. But some researchers, includ-
ing IBM’s Mark Ritter, argue that the best approach for
designing neurosynaptic chips will be to make them ana-
log, meaning they will transmit information via electric
pulses of varying amplitude, similar to the messages in
the brain. In that scenario, the elements representing neu-
rons will “remember” what happened to them by storing
information about the strength and routing of the pulses
they receive and transmit. At the elemental level, these
mechanisms will make it possible for the machine to
think. “This is how we’ll make machines that truly learn—
that understand and digest data like the brain does,” Mark
says.4

Journey of Discovery:
A Chip Modeled on the Brain

On January 13, 2011, just days before the first broadcast
of Watson on Jeopardy!, a box arrived at IBM Research–
Almaden in San Jose, Calif. It had been mailed from the
IBM chip fabrication plant in East Fishkill, N.Y. Inside the
box, cushioned by Styrofoam squiggles, were 200 experi-
mental neurosynaptic computing chips modeled on the
human brain.
Dharmendra Modha, the manager of cognitive com-
puting at the Almaden lab, and John Arthur, one of the
research scientists on his team, were thrilled as they cut
open the box and lifted out its precious cargo. To be sure,
the chips weren’t very smart compared to a human brain.
Each had approximately the intelligence of a worm, in that
its number of artificial neurons was comparable to the
number of neurons in an actual worm. But they demonstrated
that chips could be made that, to an extent, re-create the interaction
of neurons and synapses in the brain through the combi-
nation of advanced mathematical algorithms and silicon
circuitry.
It was a significant step toward their ultimate goal: to
create a cognitive machine that can sense and draw con-
clusions from the things that happen in the world—not
through the abstraction of numbers but more directly, as
humans do via our senses. “This is not a science project.
It’s a quest,” says Dharmendra. “It involves many people,
many years, and lots of unknowns. But if we can engineer
a computer to exhibit the versatility and robustness of the
mind, it would have huge ramifications for society.”5
Their cognitive chips were the first harvest of the
SyNAPSE (Systems of Neuromorphic Adaptive Plastic Scal-
able Electronics) project, a collaboration of the Defense
Advanced Research Projects Agency (DARPA), IBM, and
faculty members from several universities, among them
Columbia, Cornell, Stanford, and the University of
Wisconsin. It illustrates yet another key element of suc-
cessful innovation in the cognitive era. Researchers have
to keep an open mind. They can’t stubbornly pursue a
path because they’re invested in it personally. They have
to be open to making drastic changes.
Dharmendra grew up in Mumbai, India, and his early
education came in a vocational school. But after a tutor
took interest in him and gave him his first lessons in trigo-
nometry, he came to believe that he could go far in life if
he put his mind to it. He eventually got an undergradu-
ate degree from the prestigious IIT Bombay and a Ph.D. in
electrical engineering from UC San Diego, where, among
other things, he studied the mathematics of neural net-
works, one of the foundations of cognitive science. Dhar-
mendra landed a position as a researcher at the Almaden
lab, where he specialized in writing algorithms for stor-
age devices. When another company tried to recruit him,
Dharmendra instead went to lab director Mark Dean with
a radical proposition: he wanted to build a brainlike com-
puter.6 Dharmendra convinced Mark to permit him to
organize a conference on cognitive computing in 2006
where leaders in the field—including Nobel laureate Ger-
ald Edelman—presented their latest insights. Dharmen-
dra came out of that event convinced that it was possible
to, in a sense, reverse-engineer the brain.
At about the same time, Henry Markram and his team
at the Swiss Federal Institute of Technology conducted
an experiment where they simulated a slice of the brain
on a supercomputer—which led a few years later to the
European Union’s Human Brain Project. The race was on

3 79 3
Augmenting Our Senses

to create a virtual brain within a computer. Dharmendra’s


big break came in 2008, when DARPA launched SyNAPSE.
That program would provide funding for Dharmendra’s
project beyond what IBM would invest.
The goal was to design electronic circuitry that would be
based on the workings of the brain. It was a difficult jour-
ney. Realizing that the brain couldn’t be easily mapped to
today’s chips, which are designed and manufactured using
an approach called complementary metal-oxide-semi-
conductor (CMOS), the team planned on inventing a new
technology that would mimic the functions of neurons
and synapses. They intended to use the analog approach,
rather than digital, because biology is analog. But, after
a time, it became clear to Dharmendra that to develop a
new technology rivaling CMOS was an effort that would
take too long and cost too much to meet the deadlines and
budgets of the DARPA project. On December 16, 2009, he
arrived home from work feeling like a defeated man. Tears
welled up in his eyes as he told his wife, “SyNAPSE is toast.
It’s over.”
Fortunately, that was not the case. A few weeks later,
Dharmendra was struck with a serious case of the flu that
left him feverish and in bed. He couldn’t even muster the
energy to answer e-mail. But being stuck at home gave
him a lot of time to think over the SyNAPSE dilemma. He
concluded that while it was futile in the short term to try
to invent a new technology for a cognitive machine, that
didn’t mean the project should be abandoned. Instead,
the team needed to refocus on CMOS chip technology
and on digital circuitry rather than analog circuitry. They
would create an entirely new non–von Neumann archi-
tecture in both silicon and software that would simulate
the functions of neurons and synapses. Using that archi-
tecture, they would produce chips for sense-making tasks
that would be vastly more efficient than today’s standard
digital processors.7
Dharmendra called a meeting of all of the participants
in the project. On January 21, 2010, about twenty-five peo-
ple crowded into the library at the Almaden lab. Others
were on the phone from IBM labs in New York and from
the universities. He knew that it would be difficult to con-
vince some team members that his new approach was
the way to go. To do so, he would have to get people to
abandon emotional attachments to fiercely held beliefs,
primarily the thought that the project could only succeed
through a new technology and a new kind of analog chip.
So he decided to use a brainstorming exercise called “six
thinking hats.” The technique requires participants in a
contentious discussion to identify, using colors, the essen-
tial nature of the points they make in discussions, with six
different colors standing for facts, emotions, and so on.8
After Dharmendra laid out his proposal, an intense dis-
cussion ensued. Some of the scientists argued that they
should continue with the long-term exploration even
though it would not be possible to meet DARPA’s deadlines.
But Dharmendra made a convincing argument that they
needed to adopt a strategy that could succeed within the
DARPA timeframe. He argued that, in the future, armed
with more knowledge, they’d be able to continue the quest
for the novel technology. Also, he pointed out that by using
CMOS and digital circuitry, they would be able to test their
designs thoroughly by simulating them in software using
supercomputers before going to all the work of designing
and manufacturing a physical chip. “Learn through simu-
lation. Burn on the chip,” he told them. People gradually
came around to this way of thinking. Using the "six think-
ing hats” technique made it easier to separate gut feelings
from rational arguments. The exercise “gave us structure
and clarity, and made it easier to solve things,” says Bill
Risk, an IBM Research physicist who is one of the project
managers for SyNAPSE.9
Out of that meeting came agreement on the way for-
ward. Soon afterward, on February 17, 2010, DARPA enthu-
siastically endorsed the new direction. Within a year, the
team had designed the chips that arrived in the box from
East Fishkill. Rather than processing data sequentially,
as in classic von Neumann computing, the chips mimic
the brain’s event-driven, distributed, and parallel process-
ing. Using those chips, team members were able to write
applications, including teaching the system to play Pong,
one of the earliest computer games, which demonstrated
capabilities such as navigation, machine vision, and pat-
tern recognition.10
In mid-2013, IBM produced a second generation of
chips, with the design name TrueNorth. They possess
approximately the same number of neurons as a bum-
blebee. If you put one hundred of the chips together in
a computing system, you’re got something that occupies
less than two liters of volume and consumes less energy
than a lightbulb. Dharmendra calls it a "brain box." The
coming generations of brain boxes will be able to combine
information from different sensors and draw conclusions
from it, much as humans combine what they see and hear
to understand what’s going on around them.11
Dharmendra’s team has developed a number of sce-
narios to help people understand how their technologies
might be employed, using anywhere from a single chip to
dozens of them. They worked with industrial designers to
create physical models of potential devices:

• Assistive Vision, eyeglasses designed to help peo-
ple with impaired vision. The glasses would be outfitted
with video cameras that can identify obstacles, dangers,
and objects the wearers might be looking for. The glasses
would plot a pathway and tell the wearer how to navigate
via embedded speakers or ear buds.
• Tumbleweed, a robotic ball capable of rolling
through a disaster scene to find injured people. The ball,
perhaps the size of a volleyball, would have numerous vid-
eo cameras embedded in it, and would concentrate on the
images that are most useful. The cognitive system would
evaluate the video streams and other sensor outputs to
find injured individuals and send messages to the authori-
ties reporting their location via a radio transmitter.
• Conversation Flower, a device that captures both im-
ages and sounds during business meetings. The device,
with petal-like elements extending from a central core,
would sit in the middle of a conference room table and
monitor the proceedings. The flower might be designed to
give visual cues to the participants in meetings, opening if
the conversation is dynamic and useful and closing if it is
boring and repetitive.12

The cognitive chips and Watson are complementary
technologies. For the sake of simplicity, you can think of
them as the right brain and left brain of the era of cognitive
systems. Watson, the left brain, focuses on language and
analytical thinking. The cognitive chips address senses
and pattern recognition. Over the coming years, IBM sci-
entists hope to meld the Watson and TrueNorth capabili-
ties together to create a holistic computing intelligence.

Scenario: Medical Imaging Assistant

Over the past century, tremendous progress has been made
in technologies designed to help spot evidence of disease
inside the human body. X rays, ultrasound, computed
tomography, nuclear medicine, and magnetic-resonance
imaging make it easier to identify and treat everything
from heart disease and breast cancer to complex gastro-
intestinal maladies. Yet innovation has lagged when it
comes to helping physicians interpret the information cap-
tured in such images. They must routinely look through
hundreds or even thousands of images to find the ones
that are relevant and telling. And, in the process, they may
miss signs of diseases they’re not explicitly looking for.
Imagine a new cognitive technology, a medical-imaging
assistant, that can automate the process of examining
medical images and also spot evidence of disease that
physicians might miss. The technology doesn't replace
radiologists; it relieves stress and eyestrain and helps
make them better at what they do. Today, a radiologist
typically reports for work at a hospital or clinic and plows
through two dozen cases during a shift. Often, all she
knows about each patient is a tentative diagnosis and
instructions on what to look for in the images. Radiolo-
gists are highly skilled people who are required to do too
much manual inspecting.
The medical-imaging assistant will help transform the
way radiologists do their jobs. When a specialist begins
considering a cardiology case, for instance, she will be
presented with a digital-image storyboard created by the
computer that shows the most significant twenty-five
images in a series. Write-ups created by the computer will
explain why each frame has been selected for close exami-
nation. These will alert the radiologist when evidence of
disease is found, whether or not it is the particular disease
being sought. The medical imaging assistant will have
been trained by examining millions of images along with
the diagnoses, treatments, and outcomes associated with
each of them. It will have a huge memory packed with
encyclopedic medical information, medical guidelines,
and rules established by the hospital or medical group.
Even more importantly, the system will make all of the
health-care information about the particular patient avail-
able to the radiologist, including his family history and
entire personal medical record. The assistant will offer
up annotated summaries of the results of other tests. The
radiologist will be able to carry on a spoken dialogue with
the system to delve deeper into the particulars of each
case. In this way, the radiologist gets a full picture of the
patient and his problems.
The automation inherent in this system could make
health care much more efficient. A coronary angiogram
is typically made up of 25 short video sequences, contains
about 5,000 images, and takes a radiologist about 30 min-
utes to examine. Using the medical-imaging assistant, the
time spent looking at images could be reduced to perhaps
five minutes. The impact on patient outcomes could be
even more significant. A radiologist supplied with all of
the contextual information and given the ability to explore
the case through a conversation with the cognitive system
is likely to be more effective at providing sage advice to
treating physicians.

5
Designing Data-centric Computers

All most people know about Watson's hardware is
what they saw on TV: the machine's cool, purple
façade, the avatar that stood between the human contes-
tants on Jeopardy! and Watson’s quirky robotic voice. But
there’s more to the machine than its TV persona. If you
were to take a tour inside the original Watson computer at
the IBM Research lab in Yorktown Heights, N.Y., you would
better understand what made its performance on Jeopardy!
possible—and why computers must be redesigned for the
new era of cognitive computing.
The computer takes up a corner of a datacenter on
the lab’s second floor. Two parallel rows of refrigerator-
sized metal racks are loaded with ninety-two servers. A
door at one end turns the space between the racks into
a room within a room. At the time of the Jeopardy! con-
test, the microprocessors were running twice as fast as
the next-fastest general purpose computer on the planet.1
Air-conditioning fans and smaller fans within the servers
create a roar that’s nearly deafening. The microprocessors
give off an immense amount of heat. And that's not a good
thing. All the heat and wind signals just how much energy
Watson consumes—85,000 watts of electricity when it’s
running at full throttle. That’s enough to keep the lights
on in a small town. In comparison, the human brain con-
sumes just 20 watts of power. Unless we can make com-
puters many orders of magnitude more energy efficient,
we’re not going to be able to use them extensively as our
intelligent assistants. Computing intelligence will be too
costly to be practical. Scientists at IBM Research believe
that to make computing sustainable in the era of big data,
we will need a different kind of machine—the data-centric
computer.
Today’s computers are processor-centric. The micro-
processor, which is the central processing unit in the von
Neumann architecture, is where much of the action hap-
pens in computing. Working hand in hand with the oper-
ating system, the microprocessor sends out instructions
to various components within the computer, requesting
data from where it’s stored, including memory chips and
disk drives. If the computer is part of a larger network,
the processor fetches data from storage systems located
out on the network. A new system design is needed to
greatly reduce the amount of movement required. The
new design puts the data at the center, rather than the
microprocessor. This design will accomplish several ben-
eficial things at once, according to Bijan Davari, an IBM
fellow who operates at the cutting edge of system design.
Machines will perform computations faster, make sense
of large amounts of data, and be more energy efficient.
The race to make ever more capable computers began
in earnest in the 1970s, when engineers began packing
more and more transistors on a single chip. In the com-
puter industry, the term "scaling" refers to the ability to add
more resources, for example, more electronic circuitry
and storage capacity. The process of packing more tran-
sistors on a chip is called “scaling down.” In addition,
over the years, as businesses and governments needed
ever more data-processing power, computer companies
built server computers that were designed specifically to
handle large and complex tasks. They pursued two strat-
egies. One was to combine more processing power and
other resources within a single machine—called “scaling
up.” The IBM mainframe is an example of this approach.
Another is “scaling out,” which is stringing many servers
together so they behave like a single large system. Exam-
ples of this are today’s supercomputers and Google’s vast
server farms.2
IBM scientists believe an essential element of data-
centric computers will be a new kind of scaling: “scal-
ing in.” Engineers will integrate memory with logic
and messaging in dense, three-dimensional packages.
Today’s memory chips are single layers of silicon with
large amounts of data-storing circuitry packed into them.
They’re used to store data temporarily and serve them up
to the microprocessor when requested. The chips take up
space in a computer and require a lot of data movement.
But what if you were to stack memory chips like so many
pancakes? They’d take up much less space, and much less
data movement would be required. That's exactly what
IBM and other computer industry leaders are doing with a
concept called the hybrid memory cube.
This will be an amazing little device. In a single cube, you
will be able to stack many layers of memory chips. The lay-
ers are cemented together and connected electronically by
tiny holes that extend vertically through the stack and are
filled with copper. A layer of logic circuitry on the bottom of
the cube exposes the memory resources of the entire cube
for use by the microprocessor. The design will require 90
percent less space than today’s memory chips and 70 per-
cent less energy.3 In time, memory cubes will likely have
their own processors attached to them. Logic and memory
will be one, greatly reducing the von Neumann bottleneck.
This technology will have a profound effect on the
way computing is done. Giant supercomputers, used for
everything from oil exploration to crash tests in the auto
industry, will shrink in size and will use much less energy.
The cubes will be used in next-generation server farms to
greatly reduce the amount of space and electricity required.
Further down the road, the technology will make it pos-
sible to pack an immense amount of computing resources
into smartphones, tablets, and other portable devices.4
A second key element of a data-centric computer will
be the way processing is done within the system. Today, a
computer is equipped with a microprocessor, or CPU, that
performs all or most of the computation. All of the data
that the CPU acts upon must be transported from where
they are stored to the microprocessor then back again. In
some cases, data are transported hundreds or even thou-
sands of miles. A data-centric computer would distribute
processor chips throughout the system to greatly reduce
the movement of data.
This concept is being tried out today on specialized
computing systems that are used to analyze huge volumes
of data. Murali Ramanathan, a neuroscientist at the State
University of New York at Buffalo, is in charge of one such
project. His research focuses on spelling out the role of
genetic and environmental factors in multiple sclerosis.
MS is a cruel disease that most often strikes young adults.
The body’s own immune system attacks the brain and spi-
nal cord, causing physical disabilities and cognitive prob-
lems. The cause isn’t known, and there is no cure. Murali’s
goal is to find a cure for the disease—or discover a way
to prevent it in the first place. What makes his analysis
so challenging is the fact that there are so many potential
combinations of genes that could contribute to MS and,
when genetic information is combined with environmen-
tal factors such as diet and smoking, the number of pos-
sible causes increases exponentially.
This is a data-intensive problem. Problems like these
can’t be addressed effectively with conventional super-
computing, where massive parallelism is the key con-
cept. A machine breaks the problem into small chunks
and farms them out to thousands of microprocessors to
address in parallel, then combines the results of those
calculations to derive an answer. To deal with tasks like
Murali’s, it makes sense to meld parallelism with the strat-
egy of moving some of the processing closer to the data.5
Murali and his colleagues acquired a new machine
specifically designed for data-intensive analysis. The
refrigerator-size machine uses specialized processor chips,
called field-programmable gate arrays, to filter the data on
storage disks before passing along only the relevant pieces
of information to the main processors. The chips can be
custom programmed after they’re in the machine to han-
dle certain kinds of queries, and, as a result, they reject
as much as 90 percent of the data. This means that the
system performs some of the analysis as the data are mov-
ing off the disks, rather than handling all of it on the main
processor or, in some cases, shipping it off to specialized
analytics computers. As a result, the data do not travel as
far, and less energy is required to move bits around. The
work can be done faster and more efficiently.6
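
The general pattern, pushing a cheap filter down to where
the data live and shipping only the survivors to the
expensive analysis, can be sketched in a few lines of
Python. The generator below plays the role of the filtering
chips sitting next to the disks; the records and the filter
rule are invented for illustration.

    # Near-data filtering: reject irrelevant records at the
    # "storage" layer so the "main processor" sees only the
    # data worth analyzing. All records here are synthetic.

    def storage_scan(disk, predicate):
        """Runs next to the disk; yields only passing records."""
        for record in disk:
            if predicate(record):
                yield record   # only relevant data cross the bus

    def main_processor(records):
        """The expensive step, applied to a fraction of the data."""
        return sum(r["value"] for r in records)

    disk = [{"gene": "g%d" % i, "value": i % 10}
            for i in range(1_000_000)]
    relevant = storage_scan(disk, lambda r: r["value"] >= 9)
    print(main_processor(relevant))
    # 90 percent of the records never leave "storage"
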
To get an idea of how effective this kind of comput-
ing can be, consider the results that Murali and his col-
leagues are achieving. Shortly after they installed the new
computer, they ran a problem that contained 100,000
genetic and environmental variables. That means there
were nearly 5 billion two-predictor combinations and 100
trillion three-predictor combinations to be evaluated. Yet
it took just eleven minutes to do the analysis—compared
to twenty-seven hours on their conventional computer.
Murali was sitting at his desk on the SUNY campus when
he received an e-mail containing the summary results. “I
was very, very excited. This can open the door for us,” he
says. “We’re on the cusp of being able to solve problems
that were previously unsolvable.”
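
Those combination counts are easy to verify: with 100,000
variables there are "n choose k" ways to pick k predictors,
which Python's standard library computes directly.

    from math import comb

    n = 100_000
    print(comb(n, 2))   # 4,999,950,000: nearly 5 billion pairs
    print(comb(n, 3))   # about 1.7e14: on the order of the
                        # 100 trillion triples cited above
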
No matter where the data are crunched in a comput-
ing system, it will be increasingly essential to move data
around more quickly and efficiently. This is the third key
dimension of designing data-centric computers. The most
promising means of moving data faster is by harnessing
photonics, the generation, transmission, and processing
of light waves. Already, telephone companies and broad-
band communications providers use pulses of light travel-
ing via fiber-optic cables to speed the transmission of data,
voice communications, and multimedia content such as
movies and TV shows over long distances. In addition,
fiber-optic technologies move data within buildings and
between racks of machines in computer datacenters. Next,
photonics will be used to connect servers in a single rack
and chips on a circuit board. Ultimately, the technology
will be used within the chips themselves.
In a decade-long initiative, a team of scientists at IBM
Research has achieved breakthroughs in nanophotonics
that could greatly improve the performance of all kinds
of computers. Their newest technology, called CMOS inte-
grated nanophotonics, combines traditional electrical
components with optical components on a single silicon
chip. In today’s advanced computers, optical elements
are placed on separate chips, which impedes the perfor-
mance of the computers and adds to costs. By combin-
ing optical and electrical elements on a single chip, the
researchers hope to make it possible for manufacturers
to produce large quantities of such devices through pro-
duction processes that are already in use. Their work has
now advanced beyond the pure science stage into product
development, with a goal of reaching the market within
the next few years. This melding of light and electric-
ity will help computers move data around with a speed
and energy efficiency that is impossible using today's
technologies.

Scenario: Adding Intelligence to the Electric Grid

Today’s “smart grids” aren’t actually very bright. Most


electric utility systems that claim the “smart” label use
advanced metering systems that enable customers to
monitor their electricity use and enable two-way commu-
nications between the meter and the utility provider. The
systems make it possible for utility companies to set pric-
ing that varies by the season and the time of day, providing
discounts to customers who shift energy use to off-peak
periods. Useful, yet there’s so much more that can be done
to make utility grids smarter. One of the most important
elements will be distributing intelligence throughout the
grid by moving a lot of analysis to where the data are gen-
erated, that is, moving the data processing to the data.
A utility grid is a complex system that connects sources
of energy generation through a distribution network with
users of energy, both commercial and residential. Grids
are becoming more complex as more sources of energy
are added to the mix, including wind and solar; as more
consumers produce their own energy; and as new uses of
electricity spread, such as the charging of electric vehicles.
Sophisticated information-technology networks need to
be built to monitor and manage the grids, better balanc-
ing supply and demand, coordinating transactions among
the diverse parties involved, and optimizing the grid so it's
both efficient and resilient.
Potentially, electrical grids could be managed by large,
integrated computing systems—cognitive systems—that
stretch from the home appliances to wind farms off the
coast. Today’s grids are equipped with many sensors that
measure not just generation, transmission, and con-
sumption but more detailed information such as prob-
lems with particular transformers or power lines. Data
collected from equipment scattered throughout the sys-
tem are transmitted to central computers to be recorded
and analyzed. But, in the future, with more data pouring
in from sophisticated sensors across the entire network,
this approach will require more energy, bandwidth, and
storage than is practical. Some of the analysis will have to
move to the data.
In a decade or so, intelligent devices capable of data
analysis will be embedded in every piece of equipment
and appliance connected to the grid. These devices will
collect detailed information on energy consumption
and help us manage our energy use. Intelligent software
agents installed in the devices will control how appliances
function, based initially on preset parameters. For exam-
ple, in a family home, an air-conditioning system might
be set up to reduce the temperature beginning one hour
before the family members arrive home at the end of the
day. But the system might learn from motion sensors that
one resident is coming back earlier and working in the
home office. It may begin to cool earlier—but just in the
part of the house that person frequents. The intelligent
agent could also poll family members, perhaps via their
smartphones, to see how satisfied they are with the heat-
ing and cooling setup, and, based on their reactions, make
modifications.
In another scenario, several families in a neighborhood
might switch to electric vehicles. Standard practice would
be to charge the vehicles’ batteries at night, but if mul-
tiple households charge their vehicles at the same time
it will overtax the neighborhood transformer. To avoid
power outages, intelligent devices in the transformer and
homes will detect sudden increases in demand for vehicle
charging and will coordinate with one another to sched-
ule charging sessions for the various EVs at different times
during the night.7
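
A hedged sketch of that coordination logic: each vehicle
asks for some hours of charging overnight, and the devices
agree on a schedule that never exceeds the transformer's
spare capacity. The capacities, loads, and the simple
first-fit strategy below are illustrative assumptions, not
a real utility protocol.

    # Toy overnight-charging scheduler: stagger EV sessions so
    # the neighborhood transformer is never overloaded.

    TRANSFORMER_KW = 14   # spare capacity (assumed)
    CHARGER_KW = 7        # draw of one charging car (assumed)
    HOURS = list(range(22, 30))   # 10 p.m. through 5 a.m.

    def schedule(requests):
        """requests maps each vehicle to hours of charge needed."""
        load = {h: 0 for h in HOURS}
        plan = {}
        for vehicle, needed in requests.items():
            free = [h for h in HOURS
                    if load[h] + CHARGER_KW <= TRANSFORMER_KW]
            if len(free) < needed:
                raise RuntimeError("no capacity for " + vehicle)
            for h in free[:needed]:
                load[h] += CHARGER_KW
            plan[vehicle] = free[:needed]
        return plan

    print(schedule({"EV-1": 4, "EV-2": 4, "EV-3": 4}))
    # each car gets four hours; no hour ever exceeds 14 kW
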
Putting intelligence in the network has the potential to
transform the way energy is generated and used. This will
make it easier for new sources of energy to be integrated
into the system. It will provide strong incentives for con-
servation, hasten repairs after power outages, and make it
possible for societies to expand electricity services with-
out needing to make massive investments in new genera-
tion plants. When all of this comes about, smart grids will
begin to truly deserve the name.

Journey of Discovery: Rethinking
How Computers Are Designed

How much does it cost to move a single bit of data from
point of origin to point of computation? Apparently, until
recently, nobody bothered to try to answer that question—
perhaps because it didn’t seem consequential. But now, in
the era of big data, computer scientists see that when you
move a massive amount of data over long distances—or
even over short distances many times—the costs quickly
add up. So that’s why David Turek, IBM’s vice president
of exascale computing, asked the question of a roomful of
IBM computing systems experts in 2011. “Fortunately, they
were willing to listen to a stupid question,” he says. “In the
data world, it’s becoming a big issue, and we have to come
up with a way of quantifying it.”8
That simple question helped focus an initiative at IBM
that’s aimed at fundamentally re-architecting computing
systems. It’s the most radical rethinking of how computers
should be designed since IBM's introduction of its System/360
family of computers in the early 1960s. The project is
an example of the new culture of innovation, demonstrat-
ing a willingness to abandon old ways of seeing things and
start clean. The initiative is called Data Centric Deep Com-
puting, or DC2.
IBM fellow Jim Kahle is responsible for piloting DC2—
and for answering Dave Turek’s question. He has played
leading roles in some of IBM’s most significant advances
in chip design over the past three decades. They include
the Cell processor, a chip that was employed in the Sony
PlayStation 3 game console, the IBM Roadrunner super-
computer, and advanced digital TV sets. Yet the Cell chip
did not become as widely used throughout computing as
Jim had hoped. With DC2, he dreams of creating a revo-
lutionary design that, ultimately, permeates the entire
computing world, starting with high-performance com-
puting and filtering out to everything else.9
Jim was born in Venezuela, where his father worked in
the oil business, and the family later moved to Indonesia
before Jim came back to the United States to study com-
puter science at Rice University. Jim credits his childhood
experience with diverse cultures with preparing him to run
large and complex engineering projects where close col-
laboration is essential. (At its peak, the Cell development
team was 450 people strong and included engineers not
just from IBM but from Sony and Toshiba.) Jim reached
out to people with a wide range of expertise to make sure
new ideas came into the discussion. He’s doing the same
thing now with DC2.
In the early months of the DC2 project, Jim convened large
gatherings of experts on silicon semiconductors, photon-
ics, memory technologies, computer architecture, software,
mathematics, and analytics. They began by describing to
one another the new uses for computing that they see com-
ing in the next decade and then explained how they’re pre-
paring for them. Based on that input, Jim is now mapping
a new computer systems architecture for the cognitive era
and a process for achieving it. “We have to figure out how to
reorganize ourselves, which is not easy culturally. We have
to steer the boat in another direction,” Jim says.
He and his crew haven’t yet finished the discovery phase,
but, along the way, they have developed some firm ideas
about the principles that will guide them to the end of the
journey. The cornerstone of their thinking is the belief that
there must be a shift from processor-centric computing
to data-centric computing. Jim remembers the day in late
2011 when the idea of data-centric computing first came to
light. He and about a dozen scientists and engineers had
gathered in conference room 36-002 of the IBM Research
lab in Yorktown Heights—the DC2 war room. They were
stumped. They had a lot of great technology ideas but no
unifying principle. Then team member Ravi Nair, arriving
late to the meeting, rushed into the room and showed his
colleagues a simple drawing he had made. On the left of
the page was a graphic depiction of traditional computing
showing how data move across all the levels of the data-
storage hierarchy, from disk and tape storage to various
types of memory and finally to the processor. On the right,
he showed how processing could take place at each level
of the hierarchy and said energy use would decrease as a
result. Jim suddenly saw the light. The team would turn
computing inside out. The brains in a computer wouldn’t
be in one place anymore. They’d be all over the place. The
processing would move to the data.
They had their big insight, but how would it play out in
the detailed design of the computer? To get to that, Jim
had to answer the more basic question Dave Turek had
asked him a few weeks earlier: How much does it cost to
move a bit of data? Jim went off and did his homework and
came up with some useful comparisons. For instance, it
costs anywhere from 800 to 6,000 picocents (one picocent
is a trillionth of a penny) to transfer a single bit through a
network, depending on how far the bit travels. In contrast,
it costs roughly from 6 to 27 picocents to move data within
a single computer.10 So it's 200 to 1,000 times more costly
to move data a long distance than it is to perform the pro-
cessing next to the place the data are stored.
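
At those rates the totals add up quickly. A back-of-the-
envelope comparison, assuming a petabyte of data and
mid-range figures of 3,000 picocents per bit over a network
versus 15 picocents within a machine (both are assumed
midpoints of the ranges above, chosen only for illustration):

    # Cost of moving one petabyte at the per-bit rates above.
    # A picocent is a trillionth of a cent; the two rates are
    # assumed midpoints, not quoted figures.

    PICOCENT = 1e-12          # of a cent
    BITS_PER_PETABYTE = 8e15

    def move_cost_dollars(picocents_per_bit):
        cents = BITS_PER_PETABYTE * picocents_per_bit * PICOCENT
        return cents / 100

    print(move_cost_dollars(3000))  # over a network: ~$240,000
    print(move_cost_dollars(15))    # inside a machine: ~$1,200
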
The fact that the costs of data transfer are so variable in
different situations means that Jim and his team had to
carefully evaluate each kind of transfer with a cost-benefit
analysis. That led them to some interesting conclusions.
They decided that specialized processors attached to
components of the computing system would be assigned
to perform particular tasks. They’d do a slice of the
processing—the part that is optimally done next to the
data in memory, on disks, or on tape drives or at the point
where data flow into the system. In other cases, depend-
ing on the task at hand, most of the data may be moved to
the main processors.
Now the specialists on Jim’s team are going off to engi-
neer new components, and the software architects have to
design the enabling software. All this has to work together.
Designing a data-centric computer will take several years.
Ultimately, Jim and his team hope to produce computing
systems that recognize the tasks that are being assigned
to them and automatically parse out the various process-
ing tasks to different chips similar to the way a music com-
poser assigns notes and percussive sounds to different
instruments in an orchestra.
It will take a tremendous effort to redesign computers
to take on big data, but the potential payoff is great. It’s
possible that a decade or so from now the power of today’s
fastest supercomputer will fit in a box on your desk and
tomorrow’s fastest computers will be performing tasks we
can’t even imagine today.

3 100 3
6
Inventing a New Physics
of Computing

o3O3o

More than any other single technology, the microchip has
driven the tremendous improvements in comput-
ing performance over the past sixty-plus years in every-
thing from personal computing to consumer electronics
to space exploration. Every two years, scientists develop
techniques for putting twice as many transistors on a tiny
chip of silicon, doubling the chip’s capacity for storing or
processing information. The microchip is arguably the
most essential technology of the information age. Unfor-
tunately, this rocket ship’s ascent is slowing, and it’s in
danger of falling back toward earth. With each passing
year, the expected performance gains become harder to
achieve. And now scientists foresee a time, approximately
a decade from now, when progress will slow to the point
that it will begin to affect not only the computer industry
but the whole of society.
The microchip combines two of the crucial scientific
advances of the mid-twentieth century—the transistor
and the integrated circuit. The transistor, invented in 1947
by John Bardeen, Walter Brattain, and William Shockley
at AT&T’s Bell Labs, provides the switching capability
that’s required for computation and data storage. The
integrated circuit, invented separately by Jack Kilby at
Texas Instruments and Robert Noyce at Fairchild Semi-
conductor, packs numerous transistors on a substrate of
semiconducting material, enabling massive-scale pro-
duction of a wide variety of tiny electronic devices. Over
the years, silicon emerged as the semiconducting mate-
rial of choice. In the 1960s, the microchip pioneer Gor-
don Moore set the tempo of an entire industry when he
observed that the number of transistors on integrated cir-
cuits had been doubling roughly every year and predicted
that the pace of progress would continue. Ultimately,
he forecast that the density of transistors on ICs would
double every two years—a statement that has been codi-
fied as Moore’s Law.1 Robert Dennard of IBM Research,
who is best known for inventing the DRAM memory chip
in 1967, also led the team that pioneered the concept of
integrated circuit scaling. His rules provided a recipe for
miniaturization that engineers have followed ever since.
Today, after much innovation by a host of scientists, bil-
lions of transistors can be placed on a single chip. The
work that Bob launched enabled Moore’s Law to hold for
decades.
But there’s a problem: physics. One of the miracles of
the miniaturization of silicon-based semiconductors has
been that transistors actually perform better the smaller
you make them. But because of the laws of physics, we
have reached a point where transistor performance will
increase only marginally or not at all with each new wave
of miniaturization. Transistors have ceased to perform as
perfect switches. Electrical charges “leak” from one part
of the transistor to another when they’re extremely small
and the transistor does not perform well, which leads to
a significant increase in the power consumption of cur-
rent chips and might lead to more severe malfunctions in
future chips. The densely packed chips can heat to near
the point of failure—not a pleasant experience for the
computer user.
Chip designers and manufacturers have resorted to a
series of refinements and work-arounds to extend the life
of Moore’s Law and improve the performance of chips. In
the past few years, designers have changed some of the
materials they use to build transistors and have limited
the clock rate—the speed at which electrical pulses race
through chip circuitry. In order to make up for the loss of
speed, they have divided elements of the processors into
separate cores, where data processing is performed in par-
allel. Now, to counter the effects of the limits of physics,
additional layers of silicon and metal are being added to
create three-dimensional chips.
But as important as these advances are for enabling
progress in microelectronics, they don’t address the core
problem: atoms don’t “scale.” No one can make traditional
silicon devices whose essential elements approach the size
of a few atoms. Also, in devices that are just a dozen or so
atoms across, unpredictable things start to happen. They
cross the border between classical physics and quantum
mechanics. "We're very near the end of classical scaling.
It's a real discontinuity with huge implications," says Bob
Dennard.2
What’s required is a fundamental rethinking of micro-
electronics. The goal is to produce computer chips that
are much more densely packed than today’s and offer
much more functionality, without being so power hun-
gry. Toward that end, scientists are exploring a variety of
approaches, ranging from developing new semiconduc-
tor materials that would replace the silicon transistor to
designing new types of devices using the principles of
quantum physics or new circuits and computing strat-
egies that, for instance, take as their models the neu-
ral networks of the human brain. At IBM, we have a
name for this endeavor: the search for a new physics of
computing.
This journey will take computing from today’s top
microprocessors, with billions of transistors, to the nano-
systems of the future, which will consume 100 to 1,000
times less power and be far more efficient. Observers
have predicted that some day we’ll have chips with tril-
lions of transistors, but IBM’s top scientists believe that
if we succeed with the most ambitious new approaches,
it might actually require fewer switches to perform more
work. When scientists achieve these breakthroughs, it
will be as important a development as the invention of the
transistor itself.3
This search for a new physics of computing is vital to
the progress of electronics. We need to do more comput-
ing in smaller spaces using less electricity. But the effort is
also necessary because of the new aspirations we have for
computing. To make the most of the flood of big data, to
enable massive interconnectivity between systems and peo-
ple, and to produce the next advances in learning systems,
we need vastly more capable computers. If we are to take
on complex problems that involve the interplay of multiple
social, technical, natural, and economic systems, we need
computers that “think” differently at the nano level.
These efforts are all part of the quest to invent the
switch that will replace the transistor. These are early days.
Nobody knows for sure what the replacement will be. One
approach that seems especially promising to researchers at
IBM is a new semiconductor device category called piezo-
tronics. The idea is to manipulate sound waves in a tiny
piece of material that will serve as a switching element—
the piezoresistive element, perhaps twenty nanometers
thick. Applying a voltage changes the element's thickness,
and that mechanical change alters its electrical resistance.
The change in resistance constitutes a change in state—the
switch from zero to one that is
the basis for computation. These devices, theoretically,
could be packed densely on a chip and would require
much less energy than today’s transistors—about ten
times less power.4
Taking a longer view, there’s great potential in applying
the principles of quantum mechanics to create a new kind
of computational device. Quantum computing has been
a Holy Grail for researchers ever since Nobel Prize–win-
ning physicist Richard Feynman challenged the scientific
community in 1981 to build computers based on quan-
tum mechanics. A classical von Neumann–type computer
makes use of bits, where each bit represents either a one
or a zero. In contrast, a quantum bit, or qubit, can repre-
sent a one, a zero, or both at once. Therefore, two qubits
can be in the states 00, 01, 10, and 11 at the same time. For
each added qubit, the total number of potential states
doubles. Hence, the use of qubits in certain kinds of com-
putation could enable us to process exponentially larger
quantities of data than is possible with the same number
of conventional bits.5
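
The doubling is easy to see in a toy state-vector
simulator, an illustration of the arithmetic rather than of
any real quantum hardware. The sketch tracks one amplitude
per basis state, so an n-qubit register needs 2^n of them,
and a Hadamard gate applied to every qubit spreads the
amplitude across all 2^n states at once.

    import math

    def hadamard(state, target):
        """Apply a Hadamard gate to one qubit of a state vector."""
        h = 1 / math.sqrt(2)
        new = [0.0] * len(state)
        for i, amp in enumerate(state):
            j = i ^ (1 << target)   # index with target bit flipped
            if (i >> target) & 1 == 0:
                new[i] += h * amp   # |0> -> (|0> + |1>)/sqrt(2)
                new[j] += h * amp
            else:
                new[j] += h * amp   # |1> -> (|0> - |1>)/sqrt(2)
                new[i] -= h * amp
        return new

    n = 3
    state = [0.0] * (2 ** n)   # 2^n amplitudes for n qubits
    state[0] = 1.0             # start in |000>
    for q in range(n):
        state = hadamard(state, q)
    print(state)   # all 8 basis states share amplitude 1/sqrt(8)
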
For nearly two decades after Feynman issued his famous
challenge, the field of quantum computing remained pri-
marily theoretical. But in the 1990s and early 2000s, break-
throughs in algorithms and experimental techniques gave
researchers hopes that what had seemed like one bril-
liant physicist’s pipe dream could actually become reality.
And now new achievements are being reported every few
months. “The work we’re doing shows it’s no longer just a
brute-force physics experiment. It’s time to start creating
computing systems based on this science,” says Matthias
Steffen, the manager of an experimental quantum com-
puting project at IBM Research.6
Quantum computing has huge implications for the
field of data encryption because quantum computers can,
theoretically, factor large numbers like those used for mak-
ing sensitive data indecipherable to prying eyes. So, the
first application for quantum computing could be code
breaking—just as it was for the first programmable com-
puters in the 1940s. It would take the fastest contemporary
supercomputers billions of years to factor numbers hun-
dreds of digits long. But, theoretically, a quantum com-
puter could perform these tasks in hours or even minutes.
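A back-of-the-envelope sketch in Python suggests why the classical task is so hopeless. Naive trial division must test on the order of the square root of N candidate divisors; even granting a machine a trillion trials per second (our assumption, and real classical algorithms are faster, though still far from efficient), the times are astronomical:

    SECONDS_PER_YEAR = 3.15e7
    TRIALS_PER_SECOND = 1e12           # an optimistic classical machine

    for digits in (20, 100, 300):
        candidates = 10 ** (digits / 2)    # ~sqrt(10**digits) divisors to try
        years = candidates / TRIALS_PER_SECOND / SECONDS_PER_YEAR
        print("%3d digits -> ~%.0e years" % (digits, years))
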

While quantum machines seem unlikely to replace
general-purpose computers any time soon, these tech-
niques could affect other domains of computing in ways
not yet foreseeable. An especially intriguing target is help-
ing people to understand the interactions of complex sys-
tems of systems that underlie everything from the human
body to cities to the global financial industry. One thing
seems certain: only through fundamental breakthroughs
in physics will we be able to deal with so much complexity
and uncertainty on a planetary scale.

Scenario: Designing Products from the Molecule Up

In 1935, scientists at the E. I. DuPont Co. invented a syn-
thetic fabric, later named nylon, that became the first
commercially successful synthetic polymer. Nylon was
first used as a replacement for silk in women’s stockings
and parachutes but later came to be employed much more
broadly in everything from clothing to gears in machines.
The invention of nylon was a long and painstaking pro-
cess that stretched for five years as a team of DuPont sci-
entists experimented with one combination of molecules
after another until they discovered a combo that had all
the characteristics they desired in terms of usefulness,
manufacturability, and affordability.7 It was a triumph of
ingenuity.
But what if scientists and engineers could use comput-
ers to help them develop new materials—indeed, whole
products—from the molecule up? That way, they could
much more quickly identify promising materials, mech-
anisms, and designs, potentially saving a tremendous
amount of time and effort. If the computation, search, and
learning capabilities of IBM’s Watson could be applied
not to searching for answers that are already known, as
Watson does, but to a search for answers that are not yet
known, the result would be a discovery machine.
It’s possible that discovery machines like this will be
commonplace in the new era of computing. Already, phar-
maceutical companies use supercomputers to identify
promising molecules in the drug-discovery process. And
the U.S. Department of Energy’s national laboratories use
massively powerful racks of computers to simulate the
behavior of fissionable materials in the design of nuclear
weapons. But to be able to design products from the mol-
ecules up, computers will have to become much more
capable and computation will need to become much more
affordable than they are today.
In fact, these tasks will require several orders-of-mag-
nitude improvements in processing power and energy
efficiency. The machines will have to be 1,000 times more
powerful than today’s fastest supercomputers. If we were
to try to achieve that level of performance using the chip
designs and materials we have today, a computer would
require about 100 megawatts of electricity to run, which
would result in an annual electricity bill topping $100 mil-
lion.8 So the improvements we need can only be achieved
through fundamental changes at computing’s nanotech-
nology level.
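The arithmetic behind that estimate is easy to check. A few lines of Python reproduce it, assuming an industrial electricity rate of roughly twelve cents per kilowatt-hour (our figure, not one from the text):

    power_mw = 100                  # sustained draw of the hypothetical machine
    hours_per_year = 24 * 365       # 8,760
    kwh_per_year = power_mw * 1000 * hours_per_year
    rate = 0.12                     # assumed $/kWh
    print("annual bill: $%.0f million" % (kwh_per_year * rate / 1e6))
    # -> about $105 million
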

But brute-force computation is just part of what will be
needed to design products from the molecule up. Discov-
ery machines will also require a combination of sophis-
ticated simulation, machine learning, and analytics
software to get the job done. The machines will be capable
of ingesting all the scholarly articles, patent documents,
and basic scientific knowledge pertaining to a specific
area of enquiry. They’ll learn to be expert chemists, mate-
rials scientists, and physicists. Based on what has been
tried before and other knowledge they acquire, they’ll sim-
ulate millions of potential combinations of molecules and
make predictions about which combinations will achieve
the desired results—then make recommendations to the
real chemists, materials scientists, and physicists.
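A toy version of that simulate-predict-recommend loop might look like the following Python sketch. The scoring function stands in for a learned property model, and every name and number here is invented for illustration:

    import random

    def predicted_performance(recipe):
        # stand-in for a model trained on past experiments and the literature
        return sum(recipe) / len(recipe) + random.gauss(0, 0.05)

    # enumerate candidate molecular "recipes" (here, random 4-component mixes)
    candidates = [tuple(random.random() for _ in range(4))
                  for _ in range(100_000)]

    # rank them and surface a short list for the human chemists to verify
    shortlist = sorted(candidates, key=predicted_performance, reverse=True)[:3]
    for recipe in shortlist:
        print("recommend for lab synthesis:", [round(x, 2) for x in recipe])
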
Consider a scenario from the energy industry. The photo-
voltaic cells in solar-energy panels are made from elec-
tronic circuitry combined with crystalline silicon. These
cells are too expensive to enable the kind of mass deploy-
ment that will be necessary if solar energy is to play a
major role in moving the world away from fossil fuels. But
what if you could develop solar cells out of much cheaper
organic materials? A discovery machine could help by
coming up with a way to capture energy from light by
using an organic dye like chlorophyll, which is the mole-
cule that allows plants to absorb energy from light during
photosynthesis. Because every place on earth has differ-
ent solar characteristics, the machine would have to come
up with recipes for thousands of organic dyes that could
capture the maximum amount of energy in each location.
It’s mass invention at the nano-scale.

Jed Pitera, an IBM researcher who specializes in using
computer simulation to address fundamental questions
about biology and chemistry, has an interesting take on
the potential for discovery machines. Because they aren’t
humans, he says, they won’t be limited by the prejudices
that human engineers and scientists are burdened with
when the process of discovery takes them into the realm
of nanotechnology. “At the nano-scale, a lot of what peo-
ple think of as normal or intuitive tends to be wrong or
incomplete,” Jed says. “If we can build a cognitive comput-
ing system that starts with a blank slate, it could develop
a better understanding of how things work in that world.”9
So far, we’ve been talking about chemistry and physics
at the molecular level, but there’s another dimension of
computer-aided discovery: invention at the macro-scale.
Eventually, using a combination of high-performance
computation and data analytics, we will be able to simu-
late the operations of entire physical systems, such as a jet
engine or an entire aircraft. Products like these are made
up of countless materials and have multiple simultane-
ous physical processes. But the capability to simulate the
behavior of these systems while they’re still on the draw-
ing boards could cut years off the time it takes to develop a
new aircraft, nuclear power plant, or submarine.10
When companies can design entire products from the
molecules up, it will revolutionize innovation and produc-
tion, remaking entire industries and roiling economies.
But in order to be able to design products this way, scien-
tists will first have to design new computing components
from the atoms up.

Journey of Discovery: The New Physics of Computing

On March 2, 2011, Andreas Heinrich and his team of sci-
entists at IBM Research–Almaden accomplished a major
breakthrough in nanotechnology. They produced the
smallest device that can be used to reliably store a single
bit of magnetic information—albeit at low temperature.
(A bit is either a one or a zero.) It comprised just twelve
atoms.11 Today’s disk drives use about one million atoms
to store a bit. The team’s breakthrough has the potential
to make a big difference in the realm of data storage, mak-
ing it possible, for instance, to store an entire movie and
music collection on a charm-sized pendant. In addition,
the techniques could be used by scientists who are explor-
ing quantum computing.
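The scale of that improvement is easy to appreciate with one line of arithmetic (the one-million-atom figure is the text's; the rest is rounding):

    atoms_per_bit_today = 1_000_000   # a bit on a conventional disk drive
    atoms_per_bit_demo = 12           # the Almaden team's demonstration
    print("density gain: ~%dx" % (atoms_per_bit_today // atoms_per_bit_demo))
    # -> ~83,333x denser, which is how a film and music library
    #    could plausibly shrink to pendant size
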
Andreas was thrilled by the discovery. Sitting in his
team’s tiny laboratory in a building perched on top of Cal-
ifornia’s Santa Cruz Mountain range, he pressed the but-
ton on his computer mouse to reverse the magnetic states
of the twelve atoms and stared at a small computer screen
where the atoms showed up graphically as small bumps
on a flat surface. When the structure received an electri-
cal pulse from the tiny probe tip of the microscope, some
atoms grew visibly taller while others shrank, based on
their antiferromagnetic arrangement. Andreas switched
the magnetic state of the atoms back and forth for nearly
four hours. “It was basically, ‘Wow, this works.’ Do it again.
‘Wow, this works,’ ” he recalls.12 Such are the joys of a sci-
entist working at the edge of the known physical universe.

Andreas and his work illustrate another key element of
what we call the new culture of innovation: the willing-
ness to invest a tremendous amount of time and energy
in a project, to go deep on the science and see what it
teaches us. This approach to research produced the tran-
sistor and many other advances that came from corporate
laboratories in the twentieth century. But corporate labs
have withered as one company after another cut funding
for long-term, high-risk projects. At IBM, we believe that
it’s essential that corporations, working with their part-
ners in government and universities, restore their com-
mitment to fundamental scientific research—to boldness
and persistence.
IBM has operated at the front edge of nanotechnology
since the field’s earliest days. The first breakthrough came
at IBM Research–Zurich in 1978 after the lab hired the
recent Ph.D. Gerd Binnig and teamed him with the senior
researcher Heinrich Rohrer. Within a few months they got
the idea for the scanning tunneling microscope, which
can detect individual atoms by scanning over a surface
with a tiny electronic probe. The instrument takes advan-
tage of a phenomenon called tunneling, which causes
electrons to jump from the atoms on the surface to the
apex of a sharp tip placed in very close proximity. By mea-
suring the amount of tunneling current that occurred, the
scientists could plot a picture of the surface and the atoms
positioned on top of it.13 For the first time, scientists could
“see” landscapes composed of individual atoms. This was
the birth of nanotechnology. For their efforts, Gerd and
Heinrich won the Nobel Prize for Physics in 1986.

Another of Gerd’s inventions, the atomic force micro-
scope, allowed scientists to observe atoms within insula-
tor (nonconducting) materials and also, in a sense, to see
inside molecules. IBM researcher Gerhard Meyer, also at
the Zurich lab, is the senior member of a team that uses
the AFM to directly observe the chemical structure of a
molecule by imaging the chemical bonds and the elec-
trical charge distribution within the molecule. With this
capability, scientists are able to study chemistry on a
single-molecule level, holding great promise for atomic-
scale computing and energy storage.
Moving atoms came in 1989. A team of scientists at IBM
Research–Almaden headed by Don Eigler made improve-
ments to the STM, and Don became the first person to
manipulate an individual atom. A few weeks later, in a
marathon twenty-two-hour session in the lab, he arranged
thirty-five xenon atoms to spell out “IBM.” While the mov-
ing of a single atom was remarkable in itself, Don later
commented that there was an even more important result:
“The biggest change was it changed people’s perspective.
We showed that there was a world out there made of small
things, which we had never had access to, but we now
have direct access to. It changes the way we think about
things.”14
That world of small things now looms large on the
human landscape. Advances like this don’t usually come
quickly or predictably. Andreas had been experimenting
with the STM for ten years when the team achieved the
big magnetic-storage breakthrough. His most important
attribute, he says, is persistence. “Nobody tells us what to
pursue. We can explore what we want. The atoms tell us
what they want to do.”15
Andreas’s persistence originally won him his job at
IBM. While studying for his Ph.D. in physics from Ger-
many’s Göttingen University, he was so determined to
land a job on Don Eigler’s team at the Almaden lab that
he kept trying even after Don turned him down twice.
Each time, he proposed a new experiment using the
STM. He finally won Don over with a proposal to study
the spin effects of atoms on surfaces. Spin is a quantum-
mechanical property of particles such as electrons, a
phenomenon that could prove useful in next-generation
computing. Instead of switching the states of electronic
components by applying electrical pulses, you might
be able to do it by changing the spin, which would take
much less energy.
In order to do these experiments, the team had to build
another STM with new capabilities—a project that took
nearly two years. Over time, Andreas’s research began
to focus on the smallest number of atoms that
could retain the stable magnetic characteristics of clas-
sical physics. The creation of the smallest device that
could store a bit of magnetized data was a byproduct of
the team’s study, not the goal. That’s what Andreas means
when he says the atoms talk to him.
The newest member of the STM team is Susanne
Baumann, a twenty-five-year-old Swiss Ph.D. candidate.
She has the same intensity and persistence as Andreas.
Susanne recalls the day in second grade that she first
became obsessed with science. She was reading a book
that contained a description of atoms and a photograph
of Don Eigler’s tiny “IBM” logo. Susanne remembers run-
ning in shock to her mother and demanding to know:
“Atoms are in the air and in me and in metal. How can
this be?” Years later, when she was studying for her mas-
ter’s degree, she began pestering Don and, later, Andreas,
about giving her a job in the lab. (After Don retired from
IBM, Andreas succeeded him as head of the STM team.)
“I even offered to find my own money for the stay,” she
says. Finally, Andreas relented—based on the quality of
Susanne’s proposal: to experiment with ways to make it
easier for the STM to differentiate the atoms on different
parts of the surface. This research, she hopes, will help
the group make progress toward quantum computing.
That’s the team’s new quest. Now that they have discov-
ered that as few as twelve atoms can combine and retain
the effects of classical physics, they’re moving in the
opposite direction and asking how many atoms can be
combined while still achieving the behavior of quantum
mechanics. And they want to know how to make the quan-
tum effects more pronounced and how to control them.
The goal is to control a single quantum bit on a surface
within three years. Once they can do that, they believe
they’ll be able to assemble devices from qubits that can
potentially be used for computing.
Andreas and his team are in the early stages of their
exploration. It could take them ten years or more to pro-
duce a working prototype for quantum computing. But
since they know how to build structures from the atom up,
they have some advantages over scientists pursuing other
approaches. If they get this right, they may be able to
develop practical techniques that could be used for man-
ufacturing massive quantities of quantum-mechanical
devices.16 Those ultra-tiny devices could become the equiv-
alent of today’s microchips for the next era of computing.

7
Imagining the Cognitive City

After a fire broke out in a building visible from Yankee
Stadium during the 1977 World Series, broadcaster
Howard Cosell captured the zeitgeist of that moment in
New York City’s history with the on-air comment, “Ladies
and gentlemen, the Bronx is burning.”1 New York had
fallen into a decline from which many people worried it
would never recover. The city was near bankruptcy. Cor-
porations had fled to the suburbs. Crime was rampant. A
couple of years earlier, a sanitation workers’ strike had left
fetid mounds of garbage bags piled on the streets. It was a
city where nothing seemed to work anymore. Indeed, New
York had become emblematic of the fate of cities across
the globe. In developed economies, people fled to the sub-
urbs to escape crime, crowding, crumbling infrastructure,
pollution, and failing schools.
Today, in major world economies, young people, profes-
sionals, and empty nesters are migrating back into cities
in search of economic opportunities, cultural enrichment,
and fun. Meanwhile, in the emerging world, cities are
magnets for people from the countryside who dream of
brighter futures for themselves and their children. Esti-
mating that one million people move into cities each week,
experts predict that the global urban population will double
to 6.4 billion by 2050, or 70 percent of the total world
population.2 Yet, at the same time, many of the
problems that sent people fleeing from cities decades ago
remain, stubbornly resisting traditional fixes. It’s clear
now that the future of cities is the future of the planet, so
it’s essential that solutions to those problems be found.
Fortunately, cities are tremendous engines of innova-
tion. The urban center is where the dreamers, planners,
engineers, builders, social activists, and artists congre-
gate. The more diverse groups of people interact with one
another, the more likely it is that new ideas will germi-
nate and take root. This charged mixture, along with bold
leadership and new technology, could lead to a global
renaissance for cities, which could grow not just bigger
but better. According to Geoffrey West of the Santa Fe
Institute, cities are the sources of our problems but also
can be the sources of our solutions. To realize that poten-
tial, however, “we desperately need a serious scientific
theory of cities,” Geoffrey said at a TED Talk in Edin-
burgh, Scotland, in 2011.3 What would a scientific theory
of cities look like? IBMers who study cities describe it as
a large set of structures, patterns, and processes that pro-
vide a formal, quantitative approach to understanding the
complex systems of cities of all sizes. In this way, people
who are involved in running, planning, and building cit-
ies will be able to comprehend how the human and built
systems interplay with one another. Today, different role-
players in cities see the urban environment through their
own lenses—the planner, the architect, the builder, the
mayor, the transit boss, the financier, the retailer, the res-
taurateur, and the labor leader. They’re like medical spe-
cialists who have tremendously deep knowledge of one
aspect of human biology but don’t really understand how
the biological systems of the body interact. A science of
cities will provide that fundamental understanding of the
urban organism, a common language for discussing it,
and methods for diagnosing the problems of cities.
In the new era of computing, we believe that technology
will help people not only create a scientific theory of cities
but act on it. City leaders and residents alike will be able
to harness technologies to better understand how their
cities work—and why they don’t work better. And they’ll
be able to use that knowledge to make cities more conve-
nient, safer, and more economically vibrant.
These days, advanced data-analytics technologies are
used in cities for everything from developing crime-fight-
ing strategies to improving the efficiency of transit sys-
tems. In fact, technology has played a significant role in the
turnaround of New York City. In the early 1990s, the police
commissioner, William Bratton, initiated the CompStat pro-
gram, a detailed tracking of all crimes in the city coupled
with weekly police department meetings in local precincts
aimed at using the data to come up with improved policing
tactics. Since those early days, New York’s policing strate-
gies and crime-tracking technologies have become even
more sophisticated, and the crime rate has been dropping
for twenty years. The latest advance in technology for polic-
ing is a network of 3,000 cameras and hundreds of radiation
detectors deployed in Manhattan that look for packages
that might contain bombs. The system uses artificial intel-
ligence to review an immense amount of video imagery and
radiation data in real time so a suspicious package can be
spotted and checked out quickly by police.4
But smart policing is just one narrow slice of the vital
role that cognitive systems could play in the lives of cities.
City leaders and sociologists often talk about the impor-
tance of a holistic approach to solving a city’s problems.
Cognitive systems that learn, mine insights from moun-
tains of data, and penetrate complexity enable this holis-
tic approach toward the major issues cities face, from
crime to traffic to pollution to education. For example, the
manager of a city hospital who has identified juvenile obe-
sity and diabetes as growing health problems in the urban
communities could use cognitive computing to model a
solution. This would require gathering data on all the fac-
tors that affect local children, from their home activities
to the availability of healthy food in local markets and
restaurants, their transportation options, the proximity of safe
places to play, and the culture of their families. The com-
plexity of these data requires a cognitive system that can
refine and correlate the various inputs and allow the hos-
pital team to run simulations and develop a model. This
model points toward programs the hospital can initiate or
advocate to improve community health.5
Cognitive systems have the potential to knit together
all the systems in a city so they can be understood and
managed better. Over time, waves of technological inno-
vation have brought us the utilities that enable the mod-
ern city—water and sewer systems, electric grids, roads,
bridges, subways, communications infrastructure. In
addition, a multitude of city services emerged to provide
for the health, education, safety, and welfare of residents.
The computer era enabled us to monitor and control
some of these systems as discrete entities. Now it’s clear
that none of these can be viewed in isolation. Indeed, cit-
ies must be seen as complexes of interrelated systems. We
need a new utility: a cognitive fabric overlaying all the
other systems that can make sense of them and manage
them.
In chapter 3, we briefly mentioned a major project
undertaken by the city of Rio de Janeiro, Brazil. The city
has set up a state-of-the-art operations center, Centro de
Operacoes Rio, where the staff monitors dozens of data
feeds for information concerning weather, traffic, and
medical services and anticipates looming problems, put-
ting defenses in place to diminish their impact. Rio’s
operations center embodies the principle that only by
considering the human-made and natural systems of a
city in a holistic way can municipal leaders hope to man-
age the complexities of a modern metropolis.
The tropical climate that makes Rio so attractive also
means that it suffers from severe storms, which endanger
lives and property. The city leaders got a shocking wakeup
call on April 5, 2010, when torrential rains caused floods and
landslides that brought much of the city to a standstill and
killed more than seventy residents. Rio’s mayor, Eduardo
Paes, vowed that such a disaster would not happen again.
He moved decisively to bolster defenses against weather-
related disruptions with the new operations center. But the
facility does much more than respond to emergencies. By
coordinating the activities of more than thirty city and state
departments plus private utility and transportation compa-
nies, the operations center is the first such facility in the
world that is on the path to integrating all of the functions
of a city in a single digital command-and-control system.
Because data from all of the agencies and companies are
integrated, operations managers get a full picture of what’s
happening in the city at any time. They can prepare for a
large concert or sporting event in the same way they handle
emergencies. The agencies don’t just share data; represen-
tatives from the various departments actually sit side by
side in the operations center and look at live videos of city
streets and facilities or graphical representations of data
feeds, making group decisions on the fly.
The Netherlands is also moving boldly into the realm
of intelligent operations. With 20 percent of its land below
sea level, the country depends on a vast network of dykes
and sluices to hold back and divert sea, river, and rain water.
Yet global warming and the threat of rising sea levels and
droughts mean the Dutch can never be complacent. So
the government is launching an innovative collaboration
aimed at harnessing big data to improve management of
the water system while restraining costs.
IBM scientists believe that sophisticated learning sys-
tems will be useful in situations like those faced by the
Dutch. Authorities there still monitor the structural integ-
rity of the country’s dykes primarily through visual inspec-
tion every five years. But already they are experimenting
with continuous monitoring using satellites and sensors
embedded in the dykes. Cognitive systems will tap all of
those data, learn how dykes react under normal condi-
tions, and automatically spot anomalies that might signal
breaches or the need for maintenance.
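A minimal sketch of that learn-then-flag idea, with invented sensor numbers, might look like this in Python:

    import statistics

    # pretend history of one dyke sensor under normal conditions
    history = [10.1, 10.3, 9.9, 10.2, 10.0, 10.4, 10.1]
    mean = statistics.mean(history)
    sd = statistics.stdev(history)

    def is_anomalous(reading, k=3.0):
        # flag readings more than k standard deviations from normal
        return abs(reading - mean) > k * sd

    for reading in (10.2, 12.9):
        print(reading, "->",
              "alert maintenance" if is_anomalous(reading) else "normal")
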
The Digital Delta initiative brings together the Dutch
Ministry of Water, the Delfland local water authority, the
Deltares science institute, the University of Delft, IBM, and
other organizations. IBM’s role is to research how to inte-
grate and make available water-related data from more
than one hundred sources to scientists and businesses so
they can draw insights from them. The expectation is that
when people have easy access to relevant data and tools
for analyzing them, they will be able to reduce the time
it takes to develop new science and technology solutions
and reduce development costs.
This project could serve as a model for countries, regions,
and cities worldwide that face complex challenges from
climate change, natural disasters, rapid urbanization, and
other large-scale problems. The Dutch government didn’t
try to solve this problem by itself. Instead, it formed a
public-private partnership drawing on the expertise and
resources of government, local water authorities, universi-
ties, and private industry. The project could teach leaders
elsewhere around the globe valuable lessons about how to
cope with rising waters.

An Operating System for the City

You can think of the software undergirding Rio’s opera-
tions center and the Digital Delta project as the operat-
ing system for the city or country, just as Linux, Windows,
iOS, and Android are operating systems for computers
and smartphones. Operating systems not only make com-
mand-and-control connections among all of the functions
of the machines but also create a platform that enables a
tremendous amount of innovation. Makers of computer
hardware such as printers and publishers of software
applications such as word processors essentially plug
their products into the computer via the operating system.
Also, since so much basic functionality is packed into the
operating system, that frees other developers to focus on
making useful or fun products that fit on top.
The same will go for cities. They need operating sys-
tems that make it easier to share information and coordi-
nate functions. They also need operating systems to serve
as platforms for innovation, enabling city employees,
businesses, and citizens to comprehend more deeply the
workings of the city and to think up ideas that make cities
better places in which to live. IBM and other companies
are designing and building these urban operating systems.
In the meantime, another type of urban technology is
already flourishing: applications that make cities work
better or help people manage their lives in urban settings.
Think of them as city apps. Open-data initiatives and con-
tests launched by cities are giving rise to an explosion of
creativity—thousands of homemade, easy-to-use apps
that do everything from gathering public opinion on hot-
button issues to helping people plan the best way to use
public transit. Some of these city apps are quite sophis-
ticated. Streetline, a company based in San Francisco,
offers a parking-space locator service in a number of cit-
ies, including San Francisco and Los Angeles. Sensors are
embedded in the pavement, and drivers use their smart-
phones to find open spots, cutting down on the wasted
time and fuel that turn parking in cities into a nightmare
for many drivers. But most city apps are not connected
to one another via operating systems. They will be much
more useful when they share data and insights and can
be used in concert by city managers to take on previously
unsolvable challenges.

Discovering the Invisible Potential of Cities

When it comes to cognitive systems in cities, we’re just
beginning to scratch the surface of what is possible. As we
go forward, data will increasingly help us spot the invisible
signals being sent out by our cities. For instance, Arizona
State University scientists have created a program called
Hestia that makes it possible for city leaders to measure
how much greenhouse gas is being emitted from specific
areas of an urban landscape—right down to the street and
building level. That way they can make better-informed
decisions about how to reduce carbon dioxide emissions
and do their part in combating climate change.6

In Indianapolis, where the program is being tested,
Hestia gathers data from a host of sources, including traf-
fic counts, air pollution measurements, and tax-assessor
parcel information. It performs traffic simulations and
building-by-building energy-consumption modeling. All
of these data then go into a massive computer model of
the entire city. Hestia is able to capture the patterns of
emissions and display them in three-dimensional maps.
It’s immediately clear where the big problems are. The
ultimate goal is to map the carbon dioxide emissions in
all the major cities in the United States, which together
account for about 25 percent of all global CO2 emissions.
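In spirit, that data fusion reduces to rolling many per-source estimates up into a per-block map. A toy Python version (with invented field names, not Hestia's actual schema) looks like this:

    from collections import defaultdict

    readings = [
        {"block": (3, 7), "source": "traffic",  "tons_co2": 1.2},
        {"block": (3, 7), "source": "building", "tons_co2": 0.8},
        {"block": (4, 1), "source": "building", "tons_co2": 2.5},
    ]

    per_block = defaultdict(float)
    for r in readings:
        per_block[r["block"]] += r["tons_co2"]  # sum every source on the block

    for block, total in sorted(per_block.items()):
        print("block", block, "->", round(total, 2), "tons CO2")
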
Science must become an integral part of the rebirth of
cities. The old adage, “You can’t improve what you can’t
measure,” is as true today as it ever was. What’s different
is that we’re now getting our hands on immense amounts
of information we didn’t have access to before, and we
are getting new tools with new measuring capabilities we
could not have imagined previously. The city is becoming
a living laboratory for technological innovation.
One of IBM’s top city scientists is Guruduth Banavar,
who led much of the work in Rio. As chief technology offi-
cer of IBM’s Smarter Cities initiative, Guru spent several
years traveling the world, visiting cities and interacting
with their governments and business and civic leaders.
Now head of industries technologies at IBM Research,
Guru sees the problems and potential of cities in stark
terms: It’s a supply and demand equation. You’ve got to be
able to quantify the available resources—people, capital,
natural resources, land, goods, and services. And you have
to understand the current and future demands for those
resources. So the way to think about improving the opera-
tions of cities is through resource management, matching
supply with demand.7
This concept is well known to people in the electric util-
ity business. They create demand-response models with
data and analytics that make it possible to manage vast
grids with multiple types of supplies and myriad consum-
ers of electricity, as well as variables such as weather and
earthquakes.
Now, Guru says, we need to create much larger demand-
response models for improving the livability and eco-
nomic vitality of cities. He thinks of these models as
cognitive engines for city management. The first step is
to continuously gather vast quantities of data of all types
about available resources. This would essentially be an
evolving digital copy of the city in all its complexity. Analy-
sis of the data then spots patterns: changes in human
mobility, increased demand for water and electricity, and
so on. Simulations help anticipate changes in supply and
demand. “The system is learning as it goes,” Guru says.
“It’s aware of the past, the current state, and the potential
for the future, and it helps you manage for the present and
plan for the future.”
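Stripped to its essentials, one step of such a cognitive engine is a supply-demand comparison. The Python below is a deliberately tiny sketch, with invented resource names and numbers:

    supply   = {"water_megaliters": 420, "power_mwh": 980, "buses": 310}
    forecast = {"water_megaliters": 455, "power_mwh": 940, "buses": 345}

    for resource in supply:
        gap = forecast[resource] - supply[resource]  # demand minus supply
        status = "shortfall: act now" if gap > 0 else "ok"
        print(f"{resource}: gap {gap:+d} -> {status}")
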
Guru grew up in Bangalore, India, and spends a lot
of time in Indian cities today. He knows firsthand about
traffic congestion, crowding, poverty, inadequate infra-
structure, and the rapid migration of country people into
urban centers. For him, this quest is personal. He’s hope-
ful that with better systems for managing cities—cognitive
systems—there will come vast improvements in mass
transportation, freshwater supplies, waste-water treat-
ment, dependable electricity, education, health care, and
public safety. His vision of the cognitive city isn’t about
futuristic places where cars are suspended in the air and
humanoid robots hand out martinis as we zip through
our front doors on conveyor belts. It’s about cities that
improve the quality of life for billions of people and that
advance the sustainability of the planet.
Today, city planners and managers focus on informa-
tion that we can relatively easily capture and control. Much
of this information fits neatly in the tech industry’s con-
ventional databases and is reliable. In the future, our com-
puting systems will be able to face much more demanding
tasks. They’ll ingest large chunks of the world in raw form
and make sense of it in all of its complexity, sloppiness,
and uncertainty. In order to do this, our computers must
be capable of learning. They’ll be thinking all the time,
gathering and analyzing, going deeper and deeper, ready
to respond to humans’ questions and pose questions of
their own.
Up until now, we have used counting and calculating
to find precise answers to our questions. In the future,
how we frame the questions will define our successes.
The new era of computing will make it possible for us to
reshape our cities and our world based not on instinct and
tabulated data but on new approaches to problem solv-
ing and questioning. With these new cognitive tools, we
can grapple with the world as it really is: emergent, not
deterministic. And that will enable us to respond much
more intelligently and adaptively to its complexity and
unpredictability.8
In the future, cognitive systems will monitor a city’s
every twitch and tremor and predict what will happen
next. They will guide us to decisions and actions that pro-
duce the best outcomes for individual citizens and soci-
ety. But it won’t be enough to invent a new generation of
tools and new ways of managing the world. The people
factor must not be ignored. IBM has learned through hun-
dreds of interactions with cities in recent years that it’s
essential for leaders to engage with citizens if they hope
to get big things done. When they don’t, they’re likely to
meet resistance. When they do engage, they stand a bet-
ter chance of not just getting support but of enlisting the
creativity and energies of the people. The many online
systems for reporting potholes that cities have set up are a
good example of this concept in action. When given the
opportunity to report holes in pavement, citizens typically
respond enthusiastically. That can save the city public
works department a lot of time and expense in inspecting
for potholes, and analysis of the reports can help improve
preventive maintenance.
This kind of citizen engagement can have much more
profound effects as well. When citizens are well informed,
they understand better how a city operates and what it
takes to provide the services and the physical infrastructure
that they expect. Big data analytics can help bring them that
depth of understanding. Perhaps then they’ll be more sup-
portive of long-term investments in service and infrastruc-
ture improvements.

In order to make our cities even more livable and effec-
tive, we’ll also have to remake ourselves. To fulfill our cit-
ies’ potential, we’ll have to become more collaborative and
less narrowly self-interested. We’ll have to be more opin-
ion seeking and less opinion offering, more interested in
asking tomorrow’s questions than in giving yesterday’s
answers.

Journey of Discovery: Cities in Motion

Stand on a busy city street corner in the middle of the day
and you will witness a chaotic scene: thousands of people
walking, getting on and off buses, descending to subways,
riding in cars, and moving in and out of buildings. Where
did all these people come from? And where are they going?
Today, we can begin to comprehend the complexities of
cities in motion and use those insights to help design the
cities of the future.
Through an IBM Research project called Insights in
Motion, scientists have developed analytics software that
provides anonymized information about the movement
of massive numbers of people. This information can be
used by city managers to plan new transit routes, improve
the efficiency of current transit systems, and coordinate
the various transportation modes in order to make mov-
ing around in cities more convenient and comfortable.
This kind of rich data about the movement of people
comes as a revelation to city planners and researchers
alike. “It’s like a blind person for the first time opening
their eyes and seeing,” says Milind Naphade, leader of
Insights in Motion.9
The Insights in Motion project is remarkable in two
ways. It marks the first time that cities (including Dubuque,
Iowa, and Istanbul, Turkey) have used massive amounts of
data from cell-phone records to track the movement of citi-
zens without capturing personal information, thus protect-
ing privacy. It also illustrates the importance of developing
technology in close collaboration with the people who will
eventually use it. One of the key elements of innovation for
the era of cognitive systems will be the willingness to use
the real world, in all of its messiness and complexity, as a
living laboratory for developing new technologies.
Milind grew up in what he calls the “concrete jungle”
of Pune, India. He loved the fact that even though it was
a large city, he could get almost anywhere he wanted to
go on foot. Later, he studied for a Ph.D. in electrical engi-
neering at the University of Illinois at Urbana-Champaign.
The focus of his dissertation was on pattern recognition
in multimedia content, which, it turns out, has some
similarities to mapping the movements of people in cit-
ies. After working at IBM Research for several years, he got
interested in using data and advanced analytics tools to
solve transportation problems. His boss suggested that he
find a city that would be willing to let him work with real
data. Milind chose Dubuque because IBM was opening a
service-delivery center there and because the city’s mayor,
Roy D. Buol, had laid out a vision of turning Dubuque
into one of the most environmentally sustainable cities in
the United States.

One of the mayor’s goals there was to increase the use
of the public buses. Ridership had dropped off severely
over the years, which made it difficult to maintain ser-
vice levels. How could this be reversed? Milind and his
colleagues proposed developing a system for improving
public transit in a small city. IBM Research has a program,
First Of A Kind, where researchers work closely with a
partner to develop technology solutions that have never
been tried before. The projects are jointly funded by IBM
and the partner. Milind proposed Dubuque as a FOAK. He
was asked to also choose another city, a larger one in an
emerging market, and, ultimately, he picked Istanbul. So
they had two living laboratories that were as different as you
can imagine. Dubuque has 23,000 households and only
one source of public transit, buses. Istanbul is a sprawling
city with 14 million residents, millions of tourists annu-
ally, and a wide range of transit options—subway, light rail,
express buses, traditional buses, minibuses, and ferries.
Until now, city officials have based their knowledge
of transportation activities on a wide variety of infor-
mation, everything from ticket-sales data, to surveys of
transit passengers, to actual counts of people on a par-
ticular bus or subway car at a particular time in a particu-
lar place. The problem is that these types of information
only provide fragments of the bigger picture. A survey of
people on a particular bus route, for instance, only tells
you about the people who are riding the bus—not about
people who are moving in the same direction at the same
time via other means who might ride the bus under other
circumstances.

To change the game for city managers and planners,
the Insights in Motion technology draws on transit data,
geospatial information, census records, points-of-interest
information, and data from cell phones. By tracking the
movements of thousands of people from place to place
and correlating them with time and the speed of travel, the
system can identify the mode of transportation people
are using and figure out where they are traveling to and
from, whether it’s home, work, school, or shopping. Using
cognitive technologies, the system learns what their needs
and expectations are.
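The core of that mode inference can be sketched in a few lines of Python. The speed thresholds below are invented for illustration; a production system would cross-check each segment against transit routes and schedules:

    def infer_mode(avg_speed_kmh):
        # classify a trip segment by its average speed
        if avg_speed_kmh < 7:
            return "walking"
        if avg_speed_kmh < 25:
            return "bicycle or local bus"
        return "car, express bus, or rail"

    # (start time, average km/h) for segments of one anonymized trace
    segments = [("08:02", 4.8), ("08:20", 18.0), ("08:55", 62.0)]
    for start, speed in segments:
        print(start, "->", infer_mode(speed))
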
The team started in Dubuque. In addition to gathering
mobile-phone location data from telecom carriers, they also
recruited 1,000 volunteers who own smartphones. The high-
end handhelds are equipped with GPS and accelerometers,
which make it possible to track the volunteers with pin-
point accuracy as they travel around the city. The team developed
an application for the smartphones that would enable the
volunteers to use them to navigate the transit system.
Their task was challenging. They had to build a system
that could, in a sense, “understand” what a city is, what
people are, what people do, and how they move from
place to place. They needed a digital model of a city that
they could experiment with. So they built a framework of
knowledge about each individual’s activities, including
detecting meaningful locations, segmenting trips, cal-
culating the duration of stays, identifying the purpose
of each trip, plotting origins and destinations of each
trip, tracking the time of day, and specifying the mode of
transportation. They even estimated the carbon footprint
of each trip. Then they aggregated all of the information
about individuals to generate citywide patterns.
The team used the results to optimize the routes and
schedules for the city’s bus system. The objective was
to minimize the sum of operator costs, user costs, and
unsatisfied demand costs for the entire network. Costs
aren’t just expressed in dollars. For users, for instance,
they include factors like waiting, walking, and driving a
car. Dubuque is now trying the information out with two
new bus routes. “Having the data is crucial. You don’t have
to operate on tea leaves,” says Chandra Ravada, director
of the transportation department for Iowa’s East Central
Intergovernmental Association. “You know where people
are going. You can change things based on that. You’re
designing your system not on somebody’s opinion but
based on facts.”10 The goal is to reduce operating expenses
by 40 percent, meet 37 percent more demand, reduce aver-
age commuter time by 60 percent, and reduce per-traveler
combustion emissions by 40 percent.
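The objective the team describes can be written down directly. In the Python sketch below, the weights and cost terms are placeholders, not the project's actual calibration:

    def network_cost(routes, w_operator=1.0, w_user=1.0, w_unmet=5.0):
        # operator costs: vehicle-hours times hourly cost
        operator = sum(r["vehicle_hours"] * r["cost_per_hour"] for r in routes)
        # user costs: riders' waiting and walking time
        user = sum(r["riders"] * (r["wait_min"] + r["walk_min"]) for r in routes)
        # demand the network fails to serve at all
        unmet = sum(r["unserved_trips"] for r in routes)
        return w_operator * operator + w_user * user + w_unmet * unmet

    candidate_plan = [{"vehicle_hours": 120, "cost_per_hour": 95.0,
                       "riders": 1400, "wait_min": 6.0, "walk_min": 4.0,
                       "unserved_trips": 55}]
    print("cost of candidate plan:", network_cost(candidate_plan))

A route optimizer then searches over candidate plans for the one with the lowest total cost.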
Milind and his team took the lessons from Dubuque
and applied them in Istanbul. The task there was much
more complex. There were so many more people and
modes of transportation. Also, in Istanbul, they didn’t
track volunteers with smartphones. They had to rely on
less precise records from regular mobile phones. But the
technologies they had used in Dubuque also worked in
Istanbul. The city’s transit authority, Istanbul Transporta-
tion, is using the Insights in Motion tool to help design
feeder bus routes connecting to the city’s new subway
lines.

Having these two cities to experiment in was incredibly
helpful for Milind and his team. When researchers don’t
have a real-world situation to interact with, they can’t eas-
ily identify the constraints that products and services cre-
ated using their technologies will encounter when they’re
launched. The volunteers in Dubuque provided valuable
feedback on the shortcomings of the smartphone applica-
tion, which the team used to improve it. In Istanbul, Mil-
ind came to appreciate the importance of personal privacy
and securing data. Because of privacy concerns among
the city officials there, the IBMers worked on the data only
when they were in Istanbul. The experience made them
very thoughtful about designing security and privacy fea-
tures into all of the analysis systems they build.
At the highest level, Milind believes that Insights in
Motion has tremendous potential to alter the relation-
ships of cities and their citizens. “One thing I think about
is how we have become slaves of the infrastructure rather
than having the infrastructure work for us,” he says. “Cit-
ies should help people live their lives, not get in the way.”

Coda: An Alliance of Human and Machine

Ever since Watson won at Jeopardy!, people have been
asking the research scientists who designed the
machine if they’d like to try to pass the so-called Turing
test. That’s an exercise suggested by computing pioneer
Alan Turing in his 1950 paper “Computing Machinery and
Intelligence,” where he raised the question: “Can machines
think?” He suggested that to test whether a machine can
think, a human judge should have a written conversation
via computer screen and keyboard with another human
and a computer. If the judge couldn’t tell the human from
the machine based on their responses, the machine would
have passed the test.1 With this test, Turing set a standard
for measuring the capabilities of machines that has not
yet been met.
While the IBM researchers are intrigued by the Turing
test, they have no plans to prepare Watson to take it. A Tur-
ing test would merely show how good Watson is at imitat-
ing human beings and our quirks and social conventions.
Instead, they want to concentrate on further developing
Coda: An Alliance of Human and Machine

the machine so it can become an expert, trusted advisor


to humans, such as the oncologists at Memorial Sloan-
Kettering Cancer Center. They do not foresee a situation
where machines would play the same cognitive role in
the universe as humans do. Instead, they can be helpers.
Machine cognition and human cognition are complemen-
tary. We have different strengths and weaknesses. The
opportunity here is not to replicate human cognition but
to use computers to help us reason over human-created
data—our communications, our documents, our images,
and our designs.
Dr. Larry Norton at Memorial Sloan-Kettering agrees.
While he embraces the expanded role of cognitive com-
puting in health care, he believes that even as medical
decision making becomes ever more scientific, it will also
remain a creative activity. “I envision situations where
myself, the patient, the computer, my nurse, and my grad-
uate fellow are all in the examination room interacting
with one another,” Larry says. “Arriving at a good medi-
cal decision is an act of artistic improvisation—just as it
is with a great piece of music.” In his free time, he plays
vibraphone in jazz groups.
With every major advance in computing that IBM has
produced over its more than one-hundred-year history,
questions have been raised about the effect on human
beings. In a memo to IBM managers on April 25, 1960, CEO
Thomas Watson Jr. addressed head-on the issue of “think-
ing machines.” He wrote: “Computers will never rob man
of his initiative or replace the need for his creative think-
ing. By freeing man from the more menial or repetitive
forms of thinking, computers will actually increase the
opportunities for the full use of human reason. Only
human beings can think imaginatively and creatively in
the fullest sense of these words.”
Watson wrote those lines just a few years before IBM
launched System/360, a revolutionary family of main-
frame computers that transformed the computer indus-
try. System/360 included many firsts. The machines had
new kinds of processors, memory devices, and storage. It
also marked the first time that software was offered that
could run on a variety of computers, not just one, which
ultimately gave rise to the independent software indus-
try. But IBM’s executives and engineers had no idea at the
time that the technology breakthroughs in the System/360
would create a path that would ultimately lead to the
personal computer, the Internet, and mobile communi-
cations. And so it is today, at the dawn of the era of cogni-
tive systems. We know that big shifts are coming. Yet it’s
impossible to fully imagine the impact they will have on
computing, business, and society.
But this we do know: by working in concert, humans
and cognitive systems have the potential to dramatically
improve and accelerate outcomes that matter to us and
to make life on earth more sustainable. This alliance of
human and machine offers the promise of progress on a
massive scale. Creating these systems will require break-
throughs and reinvention at every level of technology,
from the way we build chips to what we think of as an
application. That’s a lot of work to do. So let’s get to it.

Notes

1. A New Era of Computing

1. Dr. Herb Chase, Columbia University School of Medicine,
IBM IBV report, “The Future of Connected Healthcare
Devices,” March, 2011.
2. Dr. Larry Norton, Memorial Sloan-Kettering Cancer Cen-
ter, interview, June 12, 2012.
3. Thomas Malone, Massachusetts Institute of Technology,
interview, May 3, 2013.
4. J. C. R. Licklider, “Man-Computer Symbiosis,” IRE Trans-
actions on Human Factors in Electronics 1 (March 1960):
4–11.
5. CenturyLink Business Inc., infographic, 2011, http://www
.centurylink.com/business/artifacts/pdf/resources/big
-data-defining-the-digital-deluge.pdf.
6. Ralph Gomory, former IBM Research director, interview,
March 19, 2012.
7. Michel McCoy, Lawrence Livermore National Laboratory,
interview, June 14, 2012.

2. Building Learning Systems

1. Stephen Baker, Final Jeopardy: The Story of Watson, the
Computer That Will Transform Our World (New York: Mari-
ner, 2012).
2. Dario Gil, program director, IBM Research, interviews,
October 10, 2011, and May 31, 2012.
3. Charles Lickel, former IBM Research manager, interview,
February 24, 2013.
4. John McCarthy and Edward Feigenbaum, “Arthur Samuel:
Pioneer in Machine Learning,” obituary, n.d., http://
infolab.stanford.edu/pub/voy/museum/samuel.html;
Gerald Tesauro, “Temporal Difference Learning and TD-
Gammon,” Communications of the ACM 38, no. 3 (March
1995): 58–68.
5. Murray Campbell, IBM Research scientist, interview, May
4, 2012.
6. See Marvin Minsky’s home page, http://web.media.mit
.edu/~minsky/.
7. Paul Horn, NYC assistant provost for research, interview,
November 28, 2012.
8. David Ferrucci, former IBM Research manager, interview,
January 27, 2010.
9. Eric Brown, IBM Research manager, interview, December
10, 2012.
10. Brown, interview.
11. IBM and WellPoint, “IBM Watson Hard at Work,” press
release, February 8, 2013.
12. Joyce Phillips, chief executive officer global wealth and
private banking, ANZ Banking Group, “How Innovation
Can Fuel Smarter, Faster Financial Advice,” A Smarter
Planet blog, May 21, 2013, http://asmarterplanet.com/
blog/2013/05/advice.html.
13. David Nahamoo, IBM Research, interview, January 2, 2013.

14. Dr. Mark Kris, presentation at IBM Investors Day, May 9,
2012.

3. Handling Big Data

1. Permanent Subcommittee on Investigations, Wall Street
and the Financial Crisis: Anatomy of a Financial Collapse,
April 13, 2011, http://www.hsgac.senate.gov//imo/media/
doc/Financial_Crisis/FinancialCrisisReport.pdf.
2. International Data Corporation, “The Digital Universe in
2020: Big Data, Bigger Digital Shadows, Biggest Growth
in the Far East,” sponsored by EMC, December 2012,
http://www.emc.com/leadership/digital-universe/index
.htm#2012; also see EMC, “New Digital Universe Study
Reveals Big Data Gap,” press release, December 11, 2012,
http://www.emc.com/about/news/press/2012/20121211-01
.htm.
3. International Data Corporation, “The Digital Universe in
2020”; EMC, “New Digital Universe Study.”
4. Jens Jelitto, IBM Research, interview, December 3, 2012.
5. Nagui Halim, IBM Research, interviews, May 7, 2012, April
15, 2010.
6. Ralph Gomory, former IBM Research director, interview,
April 19, 2012.
7. Mark Squillante, IBM Research, interview, September 24,
2012.
8. Jeff Jonas, IBM software group, interview March, 9, 2012.
9. Sam Adams, IBM Research, interview, September 14, 2012.
10. Dario Gil, IBM Research, interview, December 14, 2012.
11. Marco de Vos, managing director, ASTRON, interview, Oc-
tober 16, 2012.
12. Square Kilometre Array, “Amazing Facts,” http://www
.skatelescope.org/media-outreach/fun-stuff/facts-figures/.

13. Albert-Jan Boonstra, manager, ASTRON, interview, Octo-
ber 16, 2012.
14. Ton Engbersen, IBM Research, interview, October 17,
2012.

4. Augmenting Our Senses

1. International Data Corporation, “The Digital Universe in
2020: Big Data, Bigger Digital Shadows, Biggest Growth
in the Far East,” sponsored by EMC, December 2012,
http://www.emc.com/leadership/digital-universe/index
.htm#2012; also see EMC, “New Digital Universe Study
Reveals Big Data Gap,” press release, December 11, 2012,
http://www.emc.com/about/news/press/2012/20121211-01
.htm; YouTube, “Statistics,” http://www.youtube.com/t/
press_statistics.
2. John Smith, IBM Research, interview, January 7, 2013.
3. Dharmendra Modha et al., “Cognitive Computing,” Com-
munications of the ACM 54, no. 8 (August 2011): 62–71.
4. Mark Ritter, IBM Research, interview, August 15, 2012.
5. Dharmendra Modha, IBM Research, interview, November,
2008.
6. Dharmendra Modha, IBM Research, interview, May 4,
2012.
7. Modha, interview, May 4, 2012.
8. Modha, interview, May 4, 2012.
9. Dharmendra Modha, IBM Research, interview, March 6,
2013; Bill Risk, IBM Research, interview, June 15, 2012.
10. IBM, “IBM Unveils Cognitive Computing Chips,” press
release, August 18, 2012, http://www-03.ibm.com/press/
us/en/pressrelease/35251.wss.
11. Modha, interview, May 4, 2012.
12. Bill Risk, IBM Research, interview, June 10, 2013.

5. Designing Data-centric Computers

1. Michael Frissora, IBM Research, interview, August 21,
2012.
2. Bijan Davari, IBM Research, interview, May 21, 2012.
3. Hybrid Memory Cube Consortium, “About Hybrid Mem-
ory Cube,” 2013, http://www.hybridmemorycube.org/
technology.html.
4. Douglas Joseph, IBM Research, interview, September 4,
2012.
5. Murali Ramanathan, SUNY Buffalo, interview, August 15,
2012.
6. Adam Ronthal, IBM, interview, May 2, 2012.
7. Ronald Ambrosio, IBM Research, interview, July 5, 2012.
8. David Turek, IBM, interview, June 13, 2012.
9. Jim Kahle, IBM Research, interview, September 12, 2012.
10. Radu Sion, “To Cloud or Not to: An Exploration of the Eco-
nomics of Clouds and Cyber-security,” http://www.zurich
.ibm.com/~cca/csc2011/talks/sion-invited-csc2011.pdf.

6. Inventing a New Physics of Computing

1. Gordon Moore, “Cramming More Components Onto In-
tegrated Circuits,” Electronics 38, no. 8 (April 19, 1965).
2. Robert Dennard, IBM Research, interview, June 21, 2013.
3. Supratik Guha, IBM Research, interview, August 23, 2012.
4. Mark Ritter, IBM Research, interview, August 15, 2012.
5. Matthias Steffen, IBM Research, e-mail interview, Sep-
tember 5, 2012.
6. Matthias Steffen, IBM Research, interview, February 8,
2012.
7. L. Trossarelli, “The History of Nylon,” Club Alpino Ital-
iano, Centro Studi Materiali e Tecniche, 2010, http://www
.caimateriali.org/index.php?id=32.

8. Michel McCoy, director, Simulation and Computing, Law-
rence Livermore National Laboratory, interview, June 14,
2012.
9. Jed Pitera, IBM Research, interview, December 14, 2012.
10. McCoy, interview.
11. Andreas Heinrich, IBM Research, interview, August 2011.
12. Andreas Heinrich, IBM Research, interview, August 3,
2012.
13. Gerd Binnig, former IBM Research scientist, interview,
November 17, 2010.
14. Don Eigler, former IBM Research scientist, interview,
September 2009.
15. Andreas Heinrich, video interview, 2011.
16. Heinrich, interview, August 3, 2012.

7. Imagining the Cognitive City

1. Jonathan Mahler, Ladies and Gentlemen, the Bronx Is Burn-
ing: 1977, Baseball, Politics, and the Battle for the Soul of a
City (New York: Farrar, Straus and Giroux, 2005).
2. United Nations Department of Economic and Social Affairs,
“World Urbanization Prospects: The 2007 Revision,” ex-
ecutive summary, February 2008, http://www.un.org/esa/
population/publications/wup2007/2007WUP_ExecSum_
web.pdf.
3. Geoffrey West, TED Talk, Edinburgh, Scotland, 2011, video,
https://www.youtube.com/watch?v=XyCY6mjWOPc.
4. “Fighting Terrorism in New York City,” 60 Minutes, CBS
television, September 25, 2011.
5. Paul Maglio, IBM Research, interview, July 6, 2012.
6. Arizona State University, “Study Maps Greenhouse
Gas Emissions to Building, Street Level for U.S. Cit-
ies,” press release, October 9, 2012, https://asunews.asu
.edu/20121009_Hestia.

7. Guruduth Banavar, IBM, interview, October 5, 2012.
8. Colin Harrison, IBM, “Smarter Cities—NextGen,” Power-
Point presentation, presented at the Major Cities of Eu-
rope Conference, Vienna, Austria, June 5, 2012.
9. Milind Naphade, IBM Research, interview, September 27,
2012.
10. Chandra Ravada, Iowa East Central Intergovernmental
Association, interview, January 7, 2013.

Coda: An Alliance of Human and Machine

1. Alan Turing, “Computing Machinery and Intelligence,”
Mind 59 (October 1950): 433–60; see http://www.loebner
.net/Prizef/TuringArticle.html.

3 147 3

You might also like