Big Data Univ Bengkulu
FUNCTIONING:
• A top-down, bird's-eye view of an area identified by a client, visualized using Smart Steps
IBM and Orange Mobile Data: Urban Transportation
FUNCTIONING:
• SingTel uses Amobee’s technology combined with its own internal data to create targeted ad campaigns for its advertisers.
• Location is the single largest data point used to create these targeted offers for its external clients.
• Internally, however, the organization combines Amobee with its customer information to create a 360-degree view of its users in order to create even greater personalization.
Big Data Telkom Group Implementation
Big data is a pretty popular term. And even though its definition is simple enough, it hides numerous potential
advantages for our company.
[Figure: four line charts, 2015 to 2018, one per province group (Sumatera; Jawa; Bali, Nusa Tenggara and Kalimantan; Sulawesi, Maluku and Papua); plotted values not recoverable]
Year 2016 2017 2018
Indonesia 4.03 3.12 2.55
By area of residence, the percentages are as follows:
Year 2016 2017 2018
Urban 0.31 3.44 0.24
Rural 7.09 2.08 4.62
Source Data : BPS, Dukcapil, Telkom Analysis. 2018.
National Education Resilience Data
Year 2018
INDONESIA 8.26
Sumatera
KEP. RIAU 9.88
SUMATERA UTARA 9.45
ACEH 9.22
SUMATERA BARAT 9.14
RIAU 8.97
BENGKULU 8.76
JAMBI 8.41
SUMATERA SELATAN 8.3
KEP. BANGKA BELITUNG 8.1
LAMPUNG 8.09
Jawa
DKI JAKARTA 10.75
DI YOGYAKARTA 9.36
BANTEN 8.53
JAWA BARAT 8.29
JAWA TIMUR 7.49
JAWA TENGAH 7.45
Kalbanusra
KALIMANTAN TIMUR 9.32
KALIMANTAN UTARA 8.9
KALIMANTAN TENGAH 8.37
BALI 8.33
KALIMANTAN SELATAN 8.11
NUSA TENGGARA TIMUR 7.52
KALIMANTAN BARAT 7.31
NUSA TENGGARA BARAT 7.17
Sulmapua
MALUKU 9.71
SULAWESI UTARA 9.58
PAPUA BARAT 9.37
MALUKU UTARA 8.82
SULAWESI TENGGARA 8.74
SULAWESI TENGAH 8.6
SULAWESI SELATAN 8.27
GORONTALO 8.17
SULAWESI BARAT 7.83
PAPUA 5.97
Top 3 Provinces
Sumatera: Kep. Riau, Sumut, Aceh
Jawa: DKI, DIY, Banten
Kalbanusra: Kaltim, Kaltara, Kalteng
Sulmapua: Maluku, Sulut, Papua Barat
Bali, DKI Jakarta, and DI Yogyakarta are the provinces where the most residents have proper sanitation services.
Residents of Papua, Lampung, and Bengkulu still find proper sanitation hard to access. The public does not yet perceive maintaining environmental health as a necessity.
Bali, DKI Jakarta, and Kalimantan Utara are the provinces where the most residents have access to a proper drinking water source.
Province | Proper sanitation | Proper drinking water
BALI | 91 | 91
DKI JAKARTA | 91 | 90
DI YOGYAKARTA | 89 | 81
KEP. BANGKA BELITUNG | 86 | 67
KEP. RIAU | 85 | 84
SULAWESI SELATAN | 80 | 78
KALIMANTAN TIMUR | 79 | 81
SULAWESI UTARA | 75 | 76
SUMATERA UTARA | 75 | 72
JAWA TENGAH | 74 | 78
PAPUA BARAT | 74 | 77
NUSA TENGGARA BARAT | 74 | 74
KALIMANTAN UTARA | 72 | 88
RIAU | 71 | 80
BANTEN | 71 | 73
SULAWESI TENGGARA | 70 | 81
INDONESIA | 69 | 74
MALUKU | 69 | 76
JAWA TIMUR | 69 | 75
SUMATERA SELATAN | 69 | 65
ACEH | 67 | 66
MALUKU UTARA | 67 | 69
JAWA BARAT | 65 | 71
GORONTALO | 64 | 79
SULAWESI TENGAH | 64 | 71
JAMBI | 64 | 67
SULAWESI BARAT | 63 | 63
KALIMANTAN SELATAN | 63 | 63
SUMATERA BARAT | 57 | 70
KALIMANTAN BARAT | 54 | 73
KALIMANTAN TENGAH | 53 | 65
LAMPUNG | 52 | 57
NUSA TENGGARA TIMUR | 51 | 72
BENGKULU | 44 | 49
PAPUA | 34 | 58
• The ratio of doctors per 100,000 population, both nationally and by province, is still far from the 2019 target of 45 per 100,000 population. Nationally, Indonesia has 16.02 doctors per 100,000 population.
• Nationally, the ratio of nurses is 114.75 per 100,000 population. This is still far from the 2019 target of 180 per 100,000 population. However, eight provinces already meet the 2019 target.
• The ratio of midwives in Indonesia is 63.22 per 100,000 population. This is still far from the 2019 target of 120 per 100,000 population. Four provinces have met the 2019 target: Aceh, Bengkulu, Maluku Utara, and Jambi.
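To make the distance from the 2019 targets concrete, the sketch below computes how much of each target the national ratio has reached. It uses only the national figures quoted in the text; the function names are illustrative, not part of any source system.

```python
# National ratios and 2019 targets per 100,000 population, as quoted in the text.
RATIOS = {
    "doctors": (16.02, 45),
    "nurses": (114.75, 180),
    "midwives": (63.22, 120),
}


def attainment(actual, target):
    """Share of the target already reached, in percent."""
    return 100 * actual / target


def gap(actual, target):
    """Additional health workers per 100,000 population needed to hit the target."""
    return target - actual


for profession, (actual, target) in RATIOS.items():
    print(f"{profession}: {attainment(actual, target):.1f}% of target, "
          f"gap {gap(actual, target):.2f} per 100,000")
```

On these figures, doctors are furthest from target in relative terms and nurses are closest.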
Stunting is a chronic malnutrition problem caused by insufficient nutritional intake, which leaves a child's height below the standard for their age (short stature). According to the World Health Organization (WHO), the maximum tolerable stunting rate is 20%. This means that stunting among children under five in Indonesia is still above the tolerance limit set by the WHO.
According to the 2017 Nutritional Status Monitoring (Pantauan Status Gizi, PSG), the prevalence of stunting among children under five (Balita) in Nusa Tenggara Timur (NTT) reached 40.3%. This is the highest of any province and also above the national stunting prevalence of 29.6%. The NTT figure comprises 18% in the very short category and 22.3% in the short category. The province with the lowest prevalence of stunting among children under five is Bali, at only 19.1%, comprising 4.9% very short and 14.2% short.
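As a quick consistency check, the sketch below adds the two stunting categories for each province and compares the totals with the 20% WHO limit mentioned earlier in this section. The figures are those quoted in the text (decimal commas written as points); the names are illustrative.

```python
WHO_LIMIT = 20.0  # percent, the maximum tolerable stunting rate quoted in the text

# (very short %, short %) per province, PSG 2017 figures as quoted in the text.
CATEGORIES = {
    "Nusa Tenggara Timur": (18.0, 22.3),
    "Bali": (4.9, 14.2),
}
NATIONAL_PREVALENCE = 29.6  # percent


def total_stunting(very_short, short):
    """Total prevalence is the sum of the two height categories."""
    return round(very_short + short, 1)


def exceeds_who_limit(prevalence):
    return prevalence > WHO_LIMIT


for province, parts in CATEGORIES.items():
    total = total_stunting(*parts)
    status = "above" if exceeds_who_limit(total) else "within"
    print(f"{province}: {total}% ({status} the WHO limit)")
```

The category sums reproduce the reported totals, and only Bali falls within the limit; the national figure does not.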
National Health Resilience Data
Life expectancy (Angka Harapan Hidup, AHH) is the estimated average number of additional years a person can be expected to live.
Life Expectancy Trend: Indonesia vs Jawa Barat
Year 2010 2011 2012 2013 2014 2015 2016 2017 2018
Indonesia 69.8 70.0 70.2 70.4 70.6 70.8 70.9 71.1 71.2
Jawa Barat 71.3 71.6 71.8 72.1 72.2 72.4 72.4 72.5 72.7
Life expectancy is a tool for evaluating government performance in improving the welfare of the population in general, and the level of health in particular.
Low life expectancy in a region should be followed by health development programs and other social programs, including environmental health, adequate nutrition and calorie intake, and poverty eradication.
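The trend table can be summarized with two simple measures: the average yearly gain over the period and the year-by-year gap between the two series. The sketch below is one way to compute them from the values in the table (decimal commas written as points).

```python
YEARS = list(range(2010, 2019))
INDONESIA = [69.8, 70.0, 70.2, 70.4, 70.6, 70.8, 70.9, 71.1, 71.2]
JAWA_BARAT = [71.3, 71.6, 71.8, 72.1, 72.2, 72.4, 72.4, 72.5, 72.7]


def average_annual_change(series):
    """Total change divided by the number of year-to-year steps."""
    return (series[-1] - series[0]) / (len(series) - 1)


# Gap between Jawa Barat and the national figure for each year.
gaps = {
    year: round(region - national, 1)
    for year, national, region in zip(YEARS, INDONESIA, JAWA_BARAT)
}

print(f"Indonesia:  {average_annual_change(INDONESIA):.3f} years per year")
print(f"Jawa Barat: {average_annual_change(JAWA_BARAT):.3f} years per year")
print(gaps)
```

Both series rise by the same total amount over the period, so Jawa Barat stays ahead by a roughly constant margin.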
SOURCE DATA
• Social Media
• Online Media
• Others
Social media and online media are crawled into the Sonar Platform, which serves ad hoc analytics to data scientists and dashboards to business users.
Derived from the client's internal definition of NGID*, Telkomsel helps size up and visualize the areas populated by Telkomsel subscribers who are categorized as the NGID segment.
• Count of the population's home, office, and hangout places by category
• Greater Jakarta and Palembang as city filters
• 95% of the world's data today has been created in the last 3 years!
• 53x increase from 1999 to 2016, to 318,000 million instructions per second
• 50%
SOURCE: Wikipedia; V&C; Digital Agenda EU; Internet live stats; McKinsey & Company
Why now? Advanced Analytics
• Costs of data storage and processing
• Data availability
• Maths
Telcos
Call
center
SOURCE: Dave Evans (April 2011) “The Internet of Things: How the Next Evolution of the Internet Is Changing Everything”
Big Data Strategy
Understanding How Big Data and
Data Science Drive Data Monetization
Big Data Operating Model
• Data Source: IT & Data Management, Structure
• Sponsorship & Governance: the process to obtain executive sponsorship and senior leader commitment to the analytics vision
• Organization Structure & Talent Management
• Outcomes: Measurement, KPI, Innovation from Data
• Big Data & AI Platform: providing the big data platform, software, and licensing
• Cloud and Digital Infrastructure: providing hardware, cloud, and data center; ensuring infrastructure availability and reliability
Legend: activity inside the Big Data Unit; activity outside the Big Data Unit
Structured Data
• Any data or information located in a fixed field within a defined record or file, usually in a database or spreadsheet. It is usually organized in rows and columns.
• The most common examples include customer data, sales data, transactional records, financial data, number of website visits, etc.
• Structured data represents just 20% of all the data available. The remaining 80% is unstructured data.
Unstructured & Semi-Structured Data
• Unstructured data is any data or information that doesn’t fit neatly into traditional structured formats or databases.
• The most common examples include email conversations, website text, social media posts, video content, photos, audio recordings, etc.
• Everything that doesn’t fit into a database or spreadsheet.
• Semi-structured data is a cross between unstructured and structured data.
• For example, a tweet can be categorized by author, date, time, length, or event.
https://lawtomated.com/structured-data-vs-unstructured-data-what-are-they-and-why-care/
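The distinction can be illustrated with a small sketch. The records below are hypothetical: the sales rows stand in for structured data with fixed fields, and the tweet-like JSON document stands in for semi-structured data, where free text is wrapped in metadata such as author, date, and time.

```python
import json

# Structured: every record has the same fixed fields (hypothetical sales rows).
SALES_ROWS = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": 75.5},
]

# Semi-structured: free text plus metadata fields (hypothetical tweet-like record).
RAW_TWEET = json.dumps({
    "author": "user_a",
    "date": "2019-03-01",
    "time": "10:15",
    "text": "Big data is a pretty popular term.",
})


def total_sales(rows):
    """Structured data can be aggregated directly by field name."""
    return sum(row["amount"] for row in rows)


def categorize_tweet(raw):
    """Pull the structured attributes out of a semi-structured record."""
    tweet = json.loads(raw)
    return {
        "author": tweet["author"],
        "date": tweet["date"],
        "time": tweet["time"],
        "length": len(tweet["text"]),  # derived from the unstructured part
    }


print(total_sales(SALES_ROWS))
print(categorize_tweet(RAW_TWEET))
```

The text of the tweet itself remains unstructured; only the attributes around it can be queried like table columns.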
Defining Internal Data
• Refers to all the information your business currently has or has the potential to collect (customer database, transactional records, etc.).
• It can be structured or unstructured (customer call records, employee interviews).
• It is owned by your business, which means only your company controls access to the data.
• It is usually cheap or free to access, which often makes it a good starting point when you are considering your data options.
Defining External Data
• Refers to all the data or information that exists outside of your organization. It is owned by a third party.
• It can be structured or unstructured: social media data, Google Trends, government census data, economic data, weather data, etc.
• For a small company, it can be very useful.
• It can be free to access, but sometimes third-party data has to be bought to add to internal data.
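Adding external data to internal data usually means joining the two on a shared key. The sketch below assumes a hypothetical internal sales log and a hypothetical third-party weather feed, both keyed by date; none of these names or values come from the slides.

```python
# Internal data: owned by the business (hypothetical daily sales).
INTERNAL_SALES = [
    {"date": "2019-01-01", "units": 40},
    {"date": "2019-01-02", "units": 65},
    {"date": "2019-01-03", "units": 30},
]

# External data: owned by a third party (hypothetical weather feed keyed by date).
EXTERNAL_WEATHER = {
    "2019-01-01": "rain",
    "2019-01-02": "sunny",
}


def enrich(sales, weather):
    """Left join: keep every internal row, add the external attribute when available."""
    return [{**row, "weather": weather.get(row["date"])} for row in sales]


for row in enrich(INTERNAL_SALES, EXTERNAL_WEATHER):
    print(row)
```

A left join keeps the internal records as the source of truth; dates missing from the external feed simply carry an empty weather value.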
Data Management
Data Source
• Internal data: Bappenas
• External data: ministries and state institutions
• External data: regional governments
• External data: open data
Staging Area
• Raw data storage, where data from the various sources is stored as is, without any changes.
Holding Area
• The working area (the "kitchen") of the data engineers and data scientists, used before data is moved to a data mart or curated.
• The place to perform: data quality and data validity checks; data conversion (string, timestamp); joins with other tables (e.g. lookup to data reference); appending data sets (union); creation of temporary tables.
Curated Data
• Cleansed, standardised, organised data for data delivery:
1. Data standardisation
2. Metadata standardisation
3. Interoperability
4. Data reference
Data Summary
• Aggregation to daily, weekly, and monthly levels. Used by data engineers and data scientists for quick-win use cases such as the SDGs or national programs.
Supporting functions
• Data taxonomy, data bridge, data acquisition, tagging and cataloging, data translation, auto indexing, auto translation.
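The holding-area steps (validity check, type conversion, lookup join, union) and the daily aggregation of the data summary can be sketched in a few lines. Everything below is a hypothetical illustration: the row layout, the reference table, and the function names are assumptions, not the actual pipeline.

```python
from collections import defaultdict
from datetime import datetime

# Raw rows as they might sit in the staging area: all strings, stored as is.
STAGING_A = [
    {"ts": "2018-05-01 08:00:00", "region_code": "17", "value": "10"},
    {"ts": "2018-05-01 17:30:00", "region_code": "17", "value": "5"},
]
STAGING_B = [
    {"ts": "2018-05-02 09:15:00", "region_code": "51", "value": "7"},
    {"ts": "not-a-timestamp", "region_code": "51", "value": "3"},  # invalid row
]
# Hypothetical data reference used for the lookup join.
REGION_REFERENCE = {"17": "Bengkulu", "51": "Bali"}


def convert(row):
    """Convert string fields to typed values; return None when the row is invalid."""
    try:
        return {
            "ts": datetime.strptime(row["ts"], "%Y-%m-%d %H:%M:%S"),
            "region": REGION_REFERENCE[row["region_code"]],  # lookup join
            "value": int(row["value"]),
        }
    except (ValueError, KeyError):
        return None


def daily_summary(*datasets):
    """Union the data sets, drop invalid rows, and aggregate values per day and region."""
    totals = defaultdict(int)
    for dataset in datasets:  # append data sets (union)
        for row in dataset:
            typed = convert(row)
            if typed is not None:
                key = (typed["ts"].date().isoformat(), typed["region"])
                totals[key] += typed["value"]
    return dict(totals)


print(daily_summary(STAGING_A, STAGING_B))
```

The staging lists are never modified, which mirrors the rule that raw data is kept as is; all cleaning happens on the way to the summary.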
Organizations do not need a big data
strategy; they need a business
strategy that incorporates big
data.
• What were revenues and profits last year? | What will revenues and profits be next year? | Plant X and Y crops across N acres
• How much fertilizer did I use last planting season? | How much fertilizer will I need next planting season? | Pre-order X amount of fertilizer at 5% discount
• How much downtime did I have last month due to unplanned equipment maintenance? | When will my equipment need maintenance next month? | Service your harvester and tractor #2 in January
• How many workers did I use last month? | How many workers will I need next month and when will I need them? | Hire X number of workers for Y days
Source: “Scientific Method: Embrace the Art of Failure”, University of San Francisco School of Management Big Data MBA
Tools: Programming, Database, Statistical, Mathematical, Visualization, Business/Comm
• Computer science, software engineering, database administration: building data infrastructure and pipelines
• Machine learning, predictive analytics, prescriptive analytics: building models and recommendation engines
• Business, economics, Excel, Tableau: building business reports and insights for the business team
Analytics Overview
Value creation rises with difficulty to implement:
• Descriptive: What happened? Reports, mapping
• Predictive: What will happen? Machine learning
• Prescriptive: How do you make it happen? Optimization
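The three levels can be shown on one toy data set. The sketch below uses hypothetical monthly sales; a least-squares trend line stands in for machine learning, and a simple reorder rule stands in for optimization, so it illustrates the progression rather than any production method.

```python
from statistics import mean

MONTHLY_SALES = [100, 110, 120, 130]  # hypothetical units sold in months 1 to 4


def descriptive(sales):
    """What happened? Summarize the past."""
    return {"total": sum(sales), "average": mean(sales)}


def predictive(sales):
    """What will happen? Fit a least-squares line and extrapolate one month ahead."""
    n = len(sales)
    xs = range(1, n + 1)
    x_bar, y_bar = mean(xs), mean(sales)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, sales))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return intercept + slope * (n + 1)


def prescriptive(forecast, stock_on_hand):
    """How do you make it happen? Order enough units to cover the forecast."""
    return max(0, round(forecast) - stock_on_hand)


forecast = predictive(MONTHLY_SALES)
print(descriptive(MONTHLY_SALES))
print(forecast)
print(prescriptive(forecast, stock_on_hand=30))
```

Each step consumes the output of the previous one, which is why the later levels are harder to implement: they depend on the earlier ones being reliable.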
Organizational Structure: Centralized Approach
Chief Data Scientist
Pros
• Flexible resources require less initial investment
• Simple for data scientists to share ideas and best practices
Cons
• Prioritization of project requests can be difficult
• Difficult for data scientists to acquire specific domain knowledge for each business unit
Organizational Structure: Decentralized Approach
Business Unit Leaders, Data Scientists
Pros
• Data scientists gain a better understanding of their assigned business unit and can proactively bring new data-driven solutions to the business
• Business units are more likely to be involved
Cons
• Difficult for data scientists to share best practices, data sources, software, etc.
• Data scientists optimize locally rather than globally
Organizational Structure: Deployed Approach
Chief Data
Scientist
[Figure: numbered services positioned by Data Use (Transactional; Goal Oriented; Negotiation / Self-monetization) against Data Type and Technology. Labels: Real-time notifications; Selective card lock; Payment Card; Settle-up notifications; Open Banking PSD2 play using Tink platform; Receipts; Logistics Service: delivery tracking, managing customer return; Data Service: provide alternative financing for spending via other payment systems]
SCIENCE
INITIATIVES EXPERIMENT
Source: “Driving Business Strategies with Data Science Big Data MBA”, Schmarzo, 2016
2. Don’t Think Business Intelligence,
Think Data Science
Technologies: Business Intelligence uses OLAP, ETL, and data warehousing; Data Science uses cloud platforms, Python, R, and machine learning
Arif Rachman
Source: “Driving Business Strategies with Data Science Big Data MBA”, Schmarzo, 2016
3. Don’t Think Data Warehouse,
Think Data Lake
Source: “Driving Business Strategies with Data Science Big Data MBA”, Schmarzo, 2016
4. Don’t Think “What Happened”,
Think “What Will Happen”
“What Happened”
• How many widgets did I sell last month?
• What were sales by zip code for Christmas last year?
• How many of product X were returned last month?
• What were company revenues for the past quarter?
• How many employees did I hire last year?
“What Will Happen”
• How many widgets will I sell next month?
• What will be sales by zip code over this Christmas season?
• How many of product X will be returned next month?
• What are projected company revenues for next quarter?
• How many employees will I need to hire next year?
“What Should I Do”
• Order [5,000] units of Component Z to support widget sales for next month
• Hire [Y] new sales reps by these zip codes to handle projected Christmas sales
• Set aside [$125k] in financial reserve to cover Product X returns
• Sell the following product mix to achieve quarterly revenue and margin goals
• Increase hiring pipeline by 35% to achieve hiring goals
Source: “Driving Business Strategies with Data Science Big Data MBA”, Schmarzo, 2016
5. Don’t Think HIPPO,
Think Collaboration
Collaboration
Source: “Driving Business Strategies with Data Science Big Data MBA”, Schmarzo, 2016
THANK YOU
Case-1 :
Football in the age of analytics
• To this end, in 2014, Hopp had built a “footbonaut” on-site at the club’s training facilities.
• One of only three in the world, the footbonaut provided an automated indoor ball-skills and
strengthening environment, and collected data on various aspects of a player’s skills and strengths.
• These data could augment the video and in-person assessments TSG Hoffenheim’s scouts and analysts
relied on to make decisions about player acquisitions.
• The footbonaut data also gave team coaches, doctors, and trainers better insight into player health and
fitness, helping coaches track each player’s development and helping trainers monitor injured players’
rehabilitation.
player_id year score
A01 2012 7,77
A01 2013 7,837
A01 2014 7,845
A01 2015 7,97
A02 2012 7,06
A02 2013 7,127
A02 2014 7,145
A02 2015 7,25
A03 2012 7,1
A03 2013 7,157
A03 2014 7,145
A03 2015 7,28
T01 2012 7,65
T01 2013 7,75
T01 2014 7,77
T01 2015 8,28
T02 2012 8,05
T02 2013 8,12
T02 2014 8,18
T02 2015 8,25
T03 2012 6,67
T03 2013 6,74
T03 2014 6,78
T03 2015 7,31
T04 2012 7,79
T04 2013 7,83
T04 2014 7,89
T04 2015 8,4
T05 2012 7,6
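A table like this is typically loaded and summarized per player before any scouting decision. The sketch below parses a few rows copied from the table, converting the decimal commas, and computes each player's score change between the first and last recorded year. The `improvement` measure is an illustrative assumption, not a method described in the case.

```python
from collections import defaultdict

# A few rows copied from the table; the source writes decimals with a comma.
RAW = """\
player_id year score
A01 2012 7,77
A01 2013 7,837
A01 2014 7,845
A01 2015 7,97
T01 2012 7,65
T01 2013 7,75
T01 2014 7,77
T01 2015 8,28
"""


def parse(raw):
    """Return {player_id: {year: score}} with scores converted to floats."""
    scores = defaultdict(dict)
    for line in raw.strip().splitlines()[1:]:  # skip the header row
        player_id, year, score = line.split()
        scores[player_id][int(year)] = float(score.replace(",", "."))
    return dict(scores)


def improvement(by_year):
    """Score change between the earliest and latest recorded year."""
    first, last = min(by_year), max(by_year)
    return round(by_year[last] - by_year[first], 3)


scores = parse(RAW)
gains = {player: improvement(by_year) for player, by_year in scores.items()}
print(gains)
```

Sorting players by this gain is one simple way to flag who developed fastest over the period.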
Case-2 :
Analytics in Fashion Industry
Approach
Goal: Maximize expected revenue of new styles