Bbil2-B2-Bbil2 b2 109 Sallam Os
Rita Sallam
© 2014 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. or its affiliates. This publication may not be reproduced or distributed in
any form without Gartner's prior written permission. If you are authorized to access this publication, your use of it is subject to the Usage Guidelines for Gartner Services posted on
gartner.com. The information contained in this publication has been obtained from sources believed to be reliable. Gartner disclaims all warranties as to the accuracy, completeness
or adequacy of such information and shall have no liability for errors, omissions or inadequacies in such information. This publication consists of the opinions of Gartner's research
organization and should not be construed as statements of fact. The opinions expressed herein are subject to change without notice. Although Gartner research may include a
discussion of related legal issues, Gartner does not provide legal advice or services and its research should not be construed or used as such. Gartner is a public company, and its
shareholders may include firms and funds that have financial interests in entities covered in Gartner research. Gartner's Board of Directors may include senior managers of these
firms or funds. Gartner research is produced independently by its research organization without input or influence from these firms, funds or their managers. For further information
on the independence and integrity of Gartner research, see "Guiding Principles on Independence and Objectivity."
"Not everything
that counts can be counted,
Don't Replace Common Sense and Experience With "Data Science"
Data Science: .... unified discipline that develops methodologies to utilize data for the purpose of monitoring, understanding, anticipating and controlling parts of the world.
[Figure: disciplines shown around Data Science: Signal Processing, Linguistics, Operations Research, Machine Learning, Statistics]
[Figure: from Data to Decision and Action, with human input at each stage]
Descriptive: Monitoring
Diagnostic: Understanding
Predictive: Anticipating
[Figure: tools and vendors by user type: R, Python, SAS, Matlab; Advanced Analytics Platforms; Data Discovery Vendors; Text Analytics; BI Platforms; Specialty Analytics; Packaged Analytic Applications; Embedded Analytics]
User-required data science skills
Per Million in OECD Nations (in Top 10 metropolitan areas)
Top notch: 2 (5)
Data scientists: 20 (200)
Power users: 2000 (5000)
Business analysts: 20000 (50K)
Information consumer: Everybody
Key Issues
[Figure: timeline, 1997 to 2014, with the following events]
• Computer beats top chess player (1997)
• Self-driving cars in the desert
• Self-driving cars in normal traffic
• Self-driving cars on normal streets
• IBM Watson beats Jeopardy experts
• Deep Learning
• Google/Facebook hire top ML experts
• Google acquires DeepMind ($400m)
Signal-to-noise ratio worsens
[Figure: signal versus noise; Performance plotted against Data volume; annotation "~1.20 x used time"]
[Figure: data types from Tabular to Non-tabular: Monitoring, Operational Data, Transactions, Log-Data, Enterprise "Dark Data", Contracts]
Turning on the Audio
[Figure: Calls Correlated to Churn (0% to 10%) plotted against Call Population (0% to 100%); curves: Perfect, Combined, Normal, Audio]
At 10% touch rate:
"Don't Know" Correct Wrong
Normal 55% 45%
Combined 73% 27%
Adapted from:
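One possible reading of the "Correct" share at a 10% touch rate is the fraction of truly churn-related calls among the top-scored 10% of calls. The slide does not state this definition, so the sketch below is an interpretation only. The function name `precision_at_touch_rate` and the toy scores and labels are hypothetical and are not taken from the slide.

```python
from typing import Sequence


def precision_at_touch_rate(
    scores: Sequence[float], labels: Sequence[int], touch_rate: float
) -> float:
    """Share of positives among the top `touch_rate` fraction of calls by score."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < touch_rate <= 1:
        raise ValueError("touch_rate must be in (0, 1]")
    # Number of calls that are actually "touched" (at least one).
    n_touched = max(1, round(len(scores) * touch_rate))
    # Rank calls from highest to lowest model score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    touched = ranked[:n_touched]
    return sum(label for _, label in touched) / n_touched


# Hypothetical toy data: 10 calls, 2 of them churn-related (label 1).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

# Touching the top 20% means the two highest-scored calls, one of which is churn-related.
print(precision_at_touch_rate(scores, labels, 0.2))
```

Under this reading, comparing two models at the same touch rate (as the slide does for "Normal" and "Combined") amounts to calling the function once per model with the same labels and each model's scores.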
Ensembles Are Cutting Edge
[Figure: ensemble of Model 1 through Model 6]
Advantages
• avoid overfitting of a single model
• robust regarding "hyper-parameters"
• fairly easy-to-use
• Today's gold standard for high precision
Cautions
• require faster computers
• less transparent models
95% of the time correct ...
Assume 20% of the audience has the answer:
[Figure: bar chart over answers A, B, C, D; values shown: 32%, 20%, 22%, 24%, 22%, 12%]
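The audience-poll example can be checked with a small simulation. This is a sketch under stated assumptions, not the slide's own calculation: the audience size (100), the number of trials, and the rule that uninformed members guess uniformly at random among four answers are all assumptions, and the function names are hypothetical. Under those assumptions the correct answer is expected to collect 20% + 80%/4 = 40% of the votes, against 20% for each wrong answer, which is why the plurality is usually right even though most individuals are guessing.

```python
import random
from collections import Counter


def audience_poll(n_people: int, share_informed: float, n_options: int,
                  rng: random.Random) -> Counter:
    """Simulate one poll; option 0 is the correct answer by convention."""
    n_informed = round(n_people * share_informed)
    votes = Counter({option: 0 for option in range(n_options)})
    votes[0] += n_informed  # informed members vote for the correct answer
    for _ in range(n_people - n_informed):
        votes[rng.randrange(n_options)] += 1  # everyone else guesses uniformly
    return votes


def plurality_is_correct(votes: Counter) -> bool:
    """True only if the correct option strictly beats every other option."""
    top = max(votes.values())
    winners = [option for option, count in votes.items() if count == top]
    return winners == [0]


def estimate_accuracy(n_trials: int = 2000, n_people: int = 100,
                      share_informed: float = 0.20, n_options: int = 4,
                      seed: int = 0) -> float:
    """Fraction of simulated polls in which the plurality answer is correct."""
    rng = random.Random(seed)
    hits = sum(
        plurality_is_correct(audience_poll(n_people, share_informed, n_options, rng))
        for _ in range(n_trials)
    )
    return hits / n_trials


accuracy = estimate_accuracy()
print(f"Plurality vote correct in {accuracy:.1%} of simulated polls")
```

An ensemble of models works on the same principle only to the extent that the models' errors are not strongly correlated; the simulation assumes fully independent guesses, which real models rarely achieve.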
From the Cutting Edge of Data Science:
Deep Neural Nets
The trick: feature learning
[Figure: network diagram with nodes labeled ao, bo, a', a", b', b", a*, b* and y]
Examples shown: Biggest Neural Net So Far; Merck's Cheminformatic; word2vec; Android Speech Recognition
Advantages
• complex features (a*, b*) are "learned"
• utilize unlabelled + labeled data
Cautions
• less feature engineering?
• Novel approach
• domain knowledge less important?
• Brittle hyper-parameters
Google Face Recognition
Objective: Access to the best of the best in data science
• Quality, time, cost trade-off
• Great for hard-to-automate scenarios
• Fixed costs versus variable costs
[Figure labels: Scattered, Market-place, humans, Customer, DS, IT]
AA/DS = advanced analytics/data science; LOB = line of business; ACE = Analytic Center of Excellence
Where Do I Find the Talent?
Talent segments: Power Users; Light-weight data scientists; Mid- to heavy-weight data scientists; Top-notch data scientists
Hiring
• Social Sciences, Natural Sciences, Electrical Engineers, Mathematicians, Industrial Engineers
• Statisticians, Machine Learning, Operations Researcher, Computational Linguistics
• Kaggle, TopCoder, CrowdAnalytix
Training
• YouTube, MOOC, Local Colleges, Vendors, "On-the-job"
• "On-the-job"
• Academia; Big Data Science Firms: Google, Facebook, Amazon, "Wall Street"
[Figure: skills and questions surrounding Data Logistics, IT Skills and Data Science]
• Gauge political friction
• Performance Criteria That Matter (ROI, accuracy, profitability versus market gain)
• Deployment
• "Analytics Leader"?
• Feature Engineering
• Recalibration With New Data?
• High-performance Computing
• Which Analytics to Choose?
• Project Execution/Monitoring
• Data Exploration
• Data Governance
Creativity, Communication
Choosing the Right Approach for the Right Data Science Solution
"An Eight-Question Decision Framework for Buying, Building and Outsourcing Data Science Solutions," Alexander Linden and Lisa Kart (G00258056)
BUILD: Advanced Analytics Platforms
BUY: Packaged Analytics Applications
OUTSOURCE: Data Science Service Provider
See Hype Cycle