An Executive's Guide to AI
What it is
- Supervised learning: An algorithm uses training data and feedback from humans to learn the relationship of given inputs to a given output (eg, how the inputs "time of year" and "interest rates" predict housing prices)
- Unsupervised learning: An algorithm explores input data without being given an explicit output variable (eg, explores customer demographic data to identify patterns)
- Reinforcement learning: An algorithm learns to perform a task simply by trying to maximize rewards it receives for its actions (eg, maximizes points it receives for increasing returns of an investment portfolio)

When to use it
- Supervised learning: You know how to classify the input data and the type of behavior you want to predict, but you need the algorithm to calculate it for you on new data
- Unsupervised learning: You do not know how to classify the data, and you want the algorithm to find patterns and classify the data for you
- Reinforcement learning: You don't have a lot of training data; you cannot clearly define the ideal end state; or the only way to learn about the environment is to interact with it
1 We've listed some of the most commonly used algorithms today—this list is not intended to be exhaustive. Additionally, a number of different models can often solve the same business problem. Conversely, the nature of an available data set often precludes using a model typically employed to solve a particular problem. For these reasons, the sample business use cases are meant only to be illustrative of the types of problems these models can solve.
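To make the supervised case concrete, here is a minimal sketch in Python. The "time of year" and "interest rate" inputs echo the example above, but the data points and the simple gradient-descent fit are invented purely for illustration; real projects would use a statistics or machine-learning library.

```python
# Minimal supervised-learning sketch: learn how "month" and
# "interest rate" relate to "price" from labeled examples.
# All numbers are invented for illustration.

data = [  # (month, interest rate in %, observed price in $1,000s)
    (1, 4.0, 200.0),
    (6, 3.5, 230.0),
    (9, 3.0, 250.0),
    (12, 4.5, 210.0),
]

w_month = w_rate = bias = 0.0
lr = 0.001  # learning rate

for _ in range(20000):
    for month, rate, price in data:
        predicted = w_month * month + w_rate * rate + bias
        error = predicted - price      # feedback: how far off are we?
        w_month -= lr * error * month  # nudge each weight to shrink
        w_rate -= lr * error * rate    # the error on this example
        bias -= lr * error

def predict(month, rate):
    """Apply the learned relationship to new, unseen inputs."""
    return w_month * month + w_rate * rate + bias
```

The loop is the "feedback" in the definition: each prediction error tells the algorithm which way to adjust its weights, and the learned model can then be applied to inputs it has never seen.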
Support vector machine
What it is: A technique that's typically used for classification but can be transformed to perform regression. It draws an optimal division between classes (as wide as possible). It also can be quickly generalized to solve nonlinear problems.
Business use cases:
- Predict how many patients a hospital will need to serve in a time period
- Predict how likely someone is to click on an online ad
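As a hedged sketch of the core idea (the data and training loop here are invented; production work would use a library such as scikit-learn), the following trains a linear SVM by sub-gradient descent on the hinge loss, which pushes the dividing line as far from both classes as it can:

```python
# Toy linear SVM: find a line separating two classes while keeping
# the margin around the line as wide as possible.

points = [((1.0, 2.0), -1), ((2.0, 1.5), -1),   # class -1
          ((6.0, 7.0), +1), ((7.0, 6.5), +1)]   # class +1

w = [0.0, 0.0]
b = 0.0
lr, reg = 0.01, 0.01  # learning rate; regularization favors wide margins

for _ in range(2000):
    for (x1, x2), label in points:
        # Shrinking w widens the margin...
        w[0] -= lr * reg * w[0]
        w[1] -= lr * reg * w[1]
        # ...while points on the wrong side of (or inside) the margin
        # pull the boundary toward separating them correctly.
        if label * (w[0] * x1 + w[1] * x2 + b) < 1:
            w[0] += lr * label * x1
            w[1] += lr * label * x2
            b += lr * label

def classify(x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b >= 0 else -1
```

Kernel functions extend this same machinery to curved boundaries, which is the quick generalization to nonlinear problems the description above refers to.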
Reinforcement-learning business use case: Optimize pricing in real time for an online auction of a product with limited supply
What it is
- Convolutional neural network (CNN): A multilayered neural network with a special architecture designed to extract increasingly complex features of the data at each layer to determine the output
- Recurrent neural network (RNN): A multilayered neural network that can store information in context nodes, allowing it to learn data sequences and output a number or another sequence

When to use it
- CNN: When you have an unstructured data set (eg, images) and you need to infer information from it
- RNN: When you are working with time-series data or sequences (eg, audio recordings or text)
Business use cases
CNN:
- Diagnose diseases from medical scans
- Detect a company logo in social media to better understand joint marketing opportunities (eg, pairing of brands in one product)
- Understand customer brand perception and usage through images
- Detect defective products on a production line through images
RNN:
- Generate analyst reports for securities traders
- Provide language translation
- Track visual changes to an area after a disaster to assess potential damage claims (in conjunction with CNNs)
- Assess the likelihood that a credit-card transaction is fraudulent
- Generate captions for images
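The "increasingly complex features" in the CNN description start with small filters slid across an image. This illustrative sketch (the 5x5 image and the hand-set vertical-edge filter are invented; real CNNs learn their filters from data) shows one such convolution:

```python
# Core CNN building block: slide a small filter over an image to
# extract a local feature. This 3x3 filter responds to vertical
# edges; a trained CNN stacks many learned filters, layer by layer.

image = [  # 5x5 grayscale image: dark left half, bright right half
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]

kernel = [  # hand-set vertical-edge detector
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

def convolve(img, ker):
    n, k = len(img), len(ker)
    out = []
    for i in range(n - k + 1):          # slide the filter over
        row = []                        # every valid position
        for j in range(n - k + 1):
            row.append(sum(ker[a][b] * img[i + a][j + b]
                           for a in range(k) for b in range(k)))
        out.append(row)
    return out

feature_map = convolve(image, kernel)   # strong values mark the edge
```

The output "feature map" lights up where the dark-to-bright edge sits; deeper layers combine such maps into progressively richer features.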
1805 – Legendre lays the groundwork for machine learning
French mathematician Adrien-Marie Legendre publishes the least square method for regression, which he used to determine, from astronomical observations, the orbits of bodies around the sun. Although this method was developed as a statistical framework, it would provide the basis for many of today's machine-learning models.

1958 – Rosenblatt develops the first self-learning algorithm
American psychologist Frank Rosenblatt creates the perceptron, an early neural network that adjusts its own weights from labeled examples.

1965 – Moore recognizes exponential growth in chip power
Intel cofounder Gordon Moore notices that the number of transistors per square inch on integrated circuits has doubled every year since their invention. His observation becomes Moore's law, which predicts the trend will continue.
1986 – Rumelhart, Hinton, and Williams publish a landmark paper on backpropagation
The paper popularizes this key technique for training artificial neural networks (ANNs), which was originally proposed by American scientist Paul Werbos in 1982. Backpropagation allows the ANN to optimize itself without human intervention (in this case, it found features in family-tree data that weren't obvious or provided to the algorithm in advance). Still, lack of computational power and the massive amounts of data needed to train these multilayered networks prevent ANNs leveraging backpropagation from being used widely.

1999 – Nvidia releases the GeForce 256 graphics card
Marketed as the world's first true graphics processing unit (GPU), the technology will later prove fundamental to deep learning by performing computations much faster than central processing units (CPUs).

Broadband adoption begins among home Internet users
Broadband allows users access to increasingly speedy Internet connections, up from the paltry 56 kbps available for downloading through dial-up in the late 1990s. Today, available broadband speeds can surpass 100 mbps (1 mbps = 1,000 kbps). Bandwidth-hungry applications like YouTube could not have become commercially viable without the advent of broadband.
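The backpropagation idea in the 1986 entry, applying the chain rule backward through the network so it can compute its own weight updates, can be sketched on a single hidden unit (the weights and target below are arbitrary illustrative values):

```python
# Backpropagation on one hidden unit: run the network forward,
# then apply the chain rule backward to get the gradient of the
# error with respect to each weight.
import math

def forward(x, w1, w2, target):
    h = math.tanh(w1 * x)           # hidden activation
    y = w2 * h                      # network output
    loss = 0.5 * (y - target) ** 2  # squared error
    return h, y, loss

def backward(x, w1, w2, target):
    h, y, _ = forward(x, w1, w2, target)
    dy = y - target                 # d loss / d output
    dw2 = dy * h                    # chain rule: output weight
    dh = dy * w2                    # push the error back through w2
    dw1 = dh * (1 - h * h) * x      # through tanh, to the input weight
    return dw1, dw2
```

A numerical check (nudging a weight slightly and measuring the change in loss) confirms the analytic gradients, which is exactly what lets a network optimize itself without human intervention.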
1989 – LeCun shows that a convolutional neural network (CNN) is well suited for shape-recognition tasks
French computer scientist Yann LeCun and team apply CNNs to the task of recognizing handwritten characters, with the initial goal of building automatic mail-sorting machines. Today, CNNs are the state-of-the-art model for image recognition and classification.

1992 – Upgraded SVMs provide early kernel methods
Bernhard Boser, Isabelle Guyon, and Vladimir Vapnik show how the "kernel trick" lets support vector machines draw nonlinear boundaries between classes.

2006 – Amazon brings cloud storage and computing to the masses
Amazon launches Amazon Web Services, offering cloud-based storage and computing power to users. Cloud computing would come to revolutionize and democratize data storage and computation, giving millions of users access to powerful IT systems—previously only available to big tech companies—at a low cost.
1997 – RNNs get a "memory," positioning them to advance speech to text
In 1991, German computer scientist Sepp Hochreiter showed that a special type of artificial neural network suffers from vanishing gradients, making long sequences hard to learn; in 1997, he and Jürgen Schmidhuber introduced the long short-term memory (LSTM) architecture to overcome the problem.

2004 – Web 2.0 hits its stride
Harvard student Mark Zuckerberg and team launch "Thefacebook," as it was originally dubbed. By the end of 2005, the number of data-generating Facebook users approaches six million.

2004 – Google introduces the MapReduce algorithm to cope with data explosion
With the World Wide Web taking off, Google seeks out novel ideas to deal with the resulting proliferation of data. Computer scientists Jeffrey Dean and Sanjay Ghemawat publish the MapReduce programming model, which spreads the processing of huge data sets across large clusters of machines.
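The MapReduce pattern mentioned here can be sketched in a few lines: a "map" step turns each record into key-value pairs, a shuffle groups the pairs by key, and a "reduce" step combines each group. The word-count example below is the classic illustration; real MapReduce distributes these steps across many machines, while here everything runs in one process.

```python
# Single-process sketch of the MapReduce pattern (word count).
from collections import defaultdict

documents = ["the web takes off", "the data explosion", "web data"]

def map_step(doc):                 # record -> (word, 1) pairs
    return [(word, 1) for word in doc.split()]

def reduce_step(word, counts):     # combine all counts for one key
    return word, sum(counts)

# Shuffle: group the mapped pairs by key.
groups = defaultdict(list)
for doc in documents:
    for word, count in map_step(doc):
        groups[word].append(count)

word_counts = dict(reduce_step(w, c) for w, c in groups.items())
```

Because map and reduce each act on independent pieces of data, both steps parallelize naturally, which is what made the model suitable for web-scale data.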
1998 – Brin and Page publish the PageRank algorithm
The algorithm, which ranks web pages higher the more other web pages link to them, forms the initial prototype of Google's search engine. This brainchild of Google founders Sergey Brin and Larry Page revolutionizes Internet searches, opening the door to the creation and consumption of more content and data on the World Wide Web. The algorithm would also go on to become one of the most important for businesses as they vie for attention on an increasingly sprawling Internet.

2005 – YouTube debuts
Within about 18 months, the site would serve up almost 100 million views per day.

2005 – Cost of one gigabyte of disk storage drops to $0.79, from $277 ten years earlier
And the price of DRAM, a type of random-access memory (RAM) commonly used in PCs, drops to $158 per gigabyte, from $31,633 in 1995.

2005 – Number of Internet users worldwide passes the one-billion mark
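The PageRank scheme described in the 1998 entry (pages ranked higher the more other pages link to them) can be sketched as a power iteration over a link graph. The three-page graph and the standard 0.85 damping factor below are illustrative:

```python
# Minimal PageRank sketch: each page spreads its score evenly over
# the pages it links to; recompute until the scores settle.

links = {  # page -> pages it links to
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}

damping = 0.85
rank = {page: 1.0 / len(links) for page in links}  # start uniform

for _ in range(50):  # power iteration
    new_rank = {page: (1 - damping) / len(links) for page in links}
    for page, outgoing in links.items():
        share = damping * rank[page] / len(outgoing)
        for target in outgoing:
            new_rank[target] += share
    rank = new_rank
```

Page C, which both A and B link to, ends up with the highest score, exactly the "more links, higher rank" behavior described above.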
2006 – Hinton popularizes use of deep-learning models
To speed the training of deep-learning models, Geoffrey Hinton develops a way to pretrain them with a deep-belief network (a class of neural network) before employing backpropagation. While his method would become obsolete once computational power increased to a level that allowed for efficient deep-learning-model training, Hinton's work popularized the use of deep learning worldwide—and many credit him with coining the phrase "deep learning."

2006 – Cutting and Cafarella introduce Hadoop to store and process massive amounts of data
Inspired by Google's MapReduce, computer scientists Doug Cutting and Mike Cafarella develop the Hadoop software to store and process enormous data sets. Yahoo uses it first, to deal with the explosion of data coming from indexing web pages and online data.

2007 – Introduction of the iPhone propels smartphone revolution—and amps up data generation
Apple cofounder and CEO Steve Jobs introduces the iPhone in January 2007. The total number of smartphones sold in 2007 reaches about 122 million. The era of around-the-clock consumption and creation of data and content by smartphone users begins.
2010 – Worldwide Internet protocol (IP) traffic reaches 20 exabytes (20 billion gigabytes) per month
IP traffic is aided by growing broadband and smartphone adoption; smartphone sales are now a multiple of the number sold in 2007.

2010 – Microsoft and Google introduce their own cloud services

2011 – IBM's Watson beats Jeopardy! champions
IBM’s question answering system, Watson,
defeats the two greatest Jeopardy! champions,
Brad Rutter and Ken Jennings, by a significant
margin. IBM Watson uses ten racks of IBM
Power 750 servers capable of 80 teraFLOPS
(that’s 80 trillion FLOPS—the state of the art in
the mid-1960s was around three million FLOPS).
2012 – Facebook's number of users passes one billion
The amount of data processed by the company's systems soars past 500 terabytes.

2012 – Google demonstrates the effectiveness of deep learning for image recognition
Google uses 16,000 processors to train a deep artificial neural network with one billion connections on ten million randomly selected YouTube video thumbnails over the course of three days. Without receiving any information about the images, the network starts recognizing pictures of cats, marking the beginning of significant advances in image recognition.

2012 – Deep-learning system wins renowned image-classification contest for the first time
Geoffrey Hinton's team wins ImageNet's image-classification competition by a large margin, with an error rate of 15.3 percent versus the second-best error rate of 26.2 percent, using a convolutional neural network (CNN). Hinton's team trained its CNN on 1.2 million images using two GPU cards.
2016 – Google debuts its tensor processing unit (TPU), a custom chip that runs machine-learning workloads up to 30 times faster than GPUs and CPUs
In 2017, Google announced an upgraded version of the TPU that was faster (180 teraFLOPS, and more when multiple TPUs are combined), could be used to train models in addition to running them, and would be offered to the paying public via the cloud. TPU availability could spawn even more (and more powerful and efficient) machine-learning-based business applications.

2017 – DeepMind's AlphaZero defeats champion programs after learning to play three different games in less than 24 hours
While creating AI software with full general intelligence remains decades off (if possible at all), Google's DeepMind takes another step closer to it with AlphaZero, which learns three computer games: Go, chess, and shogi. Unlike the original AlphaGo, which learned in part from games played by human experts, AlphaZero learns strictly by playing itself, and then goes on to defeat its predecessor AlphaGo Zero at Go (after eight hours of self-play) as well as some of the world's best chess- and shogi-playing computer programs (after four and two hours of self-play, respectively).