
Big Data & AI

In The Era IR 4.0

Edwin Purwandesi
STIMIK ESQ
24 December 2021
Edwin Purwandesi
Head of Digital Infrastructure & Security, Telkom Indonesia
@edwinpurwandesi

“Credit to Muhammad Hari Diputera for the fancy version of these slides” ☺
Recent Activities
• ASEAN-Japan Information Security Forum
• National Artificial Intelligence (AI) Strategy Working Group
• Member of the Big Data Expert Team, National Resilience Council (Dewan Ketahanan Nasional)
• Member of the Satu Data Indonesia Task Force, Telkom
• Certified Design Sprint Master
• Certified Human-Computer Interaction
• Certified Usability Testing
• Coach for Telkom StartUp Program, Amoeba & Indigo
Trend of Data Analytics

www.comany.com
The Definition

What is Big Data?


Big Data is at the heart of nearly every digital transformation.
Organizations are exploring how large-volume data can be usefully
deployed to create and capture value for individuals, businesses,
communities, and governments (McKinsey Global Institute, 2011).

Big Data is defined by the four V's: Volume, Velocity, Variety, and Value.
Trend of Industry

Source: Studymalaysia.com
Big data analytics consists of 6Cs in IR4.0

Connection (sensor and networks)

Cloud (computing and data on demand)

Cyber (model & memory)

Content/context (meaning and correlation)

Community (sharing & collaboration)

Customization (personalization and value)

Source: Studymalaysia.com
Sources of Big Data

Big Data comes from three predominant streams:

Internal Data Streams: "Owned" channels such as organizational websites, press releases, branded blogs, and company or brand-sponsored pages on social networks (Twitter, Facebook, etc.).

Shared Data Streams: Events, publicity, and sponsorships in which your organization participates, as well as industry research.

External Data Streams: Organic social media conversations, news, syndicated and omnibus surveys, government data, and academic studies.
Data Integration

Data mining is the automated extraction of patterns representing knowledge implicitly stored or collected in large databases, data warehouses, online sources, other massive information repositories, or data streams.

Data mining tasks fall into two groups, descriptive (characterizing general properties of the data) and predictive (making inferences based on the data), and both are used to find patterns.
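
The difference between a descriptive and a predictive task can be sketched in a few lines of Python. This is only an illustration: the `records` data and the choice of a least-squares line are assumptions made for the example, not part of the original slides.

```python
from statistics import mean, stdev

# Hypothetical sample (invented for illustration): (x, y) observations.
records = [(1.0, 12.0), (2.0, 14.0), (3.0, 16.0), (4.0, 18.0), (5.0, 20.0)]
xs = [x for x, _ in records]
ys = [y for _, y in records]

# Descriptive task: characterize general properties of the data.
summary = {"mean_y": mean(ys), "stdev_y": stdev(ys), "min_y": min(ys), "max_y": max(ys)}

# Predictive task: fit y = a + b*x by ordinary least squares,
# then use the fitted line to infer a value that was never observed.
x_bar, y_bar = mean(xs), mean(ys)
b = sum((x - x_bar) * (y - y_bar) for x, y in records) / sum((x - x_bar) ** 2 for x in xs)
a = y_bar - b * x_bar


def predict(x):
    """Return the fitted line's estimate of y for a given x."""
    return a + b * x


print(summary)
print(predict(6.0))
```

The summary only restates what is already in the data, while `predict` extrapolates beyond it.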

Data Insight

Three elements must be present to convert data to insights:

➢ Critical thinking and statistical acumen
➢ Subject matter expertise
➢ Access to tools
History of AI

The Definition of Artificial Intelligence

Artificial Intelligence (AI)
"Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it."
Dartmouth Artificial Intelligence Conference 1956

Narrow AI
AI that is skilled at one specific task

Artificial General Intelligence (AGI)
AI that is considered human-level, and can perform a range of tasks

Superintelligent AI
An intellect that is much smarter than the best human brains in practically every field, including scientific creativity, general wisdom and social skills
The Definition of Machine Learning

Machine Learning (ML) is one subfield of AI. The core principle here is that machines take data and "learn" for themselves. It's currently the most promising tool in the AI kit for businesses.

Unlike hand-coding a software program with specific instructions to complete a task, ML allows a system to learn to recognize patterns on its own and make predictions.

ML systems can quickly apply knowledge and training from large data sets to excel at facial recognition, speech recognition, object recognition, translation, and many other tasks.
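
The contrast with hand-coded rules can be illustrated with a very small learner. The sketch below uses a nearest-centroid classifier, chosen here only because it fits in a few lines. The training points and the labels "small" and "large" are invented for the example.

```python
from math import dist  # available in Python 3.8+

# Hypothetical labelled examples: (feature vector, label).
training = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((8.0, 8.0), "large"),
    ((9.0, 7.5), "large"),
]


def fit_centroids(examples):
    """Learn one average point (centroid) per label from the data."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the label whose learned centroid is closest to the input."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(training)
print(predict(centroids, (2.0, 2.0)))
print(predict(centroids, (7.0, 9.0)))
```

No rule such as "x greater than 5 means large" is written by hand; the decision boundary follows from the examples, so new training data changes the behaviour without changing the code.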

The Definition of Deep Learning

Deep learning is a subset of ML. It uses some ML techniques to solve real-world problems by tapping into neural networks that simulate human decision-making.

Deep learning can be expensive, and requires massive datasets to train itself on. That's because there are a huge number of parameters that need to be understood by a learning algorithm, which can initially produce a lot of false positives.

For instance, a deep learning algorithm could be instructed to "learn" what a cat looks like. It would take a very massive data set of images for it to understand the very minor details that distinguish a cat from, say, a cheetah or a panther or a fox.
Why do we have to produce data?

Source: Aricent/frog design, primary research (2011)
Technology Framework

[Figure: layered architecture diagram. Recoverable labels, from the top layer to the bottom:]

Application / Presentation: App / Web Frontend, Tableau, PHP; Docker - OpenShift
Analytics / AI Enabler: Data Notebook / Data Science Workbench, Real Time Analytic, Operations; Keras, TensorFlow, TfonSpark, Caffe, MxNet; Python, Scala; Deep Learning AI, Spark; Kubernetes - Docker
Data Integration, Storage & Computing (data accessed; structured and semi-structured data): DWH / Cache; SQL (Impala, Spark); Compute (YARN, Spark, Spark Streaming); Storage (HDFS, MySQL); ETL (Python, Spark, Kylo)
Data Sources (data acquired):
• Structured Data: DB2, MSSQL, CSV
• Unstructured Data and IoT Devices: web crawling, HTTP API, social media streaming, application logs, location & movement, network monitoring, device sensors, video cameras, voice data, IoT devices
Security & Governance: Active Directory / Kerberos, Sentry, Navigator
Legend: Hadoop distribution components; supporting components; data flow
Annotations: to handle VOLUME of data; to handle VARIETY of data; to handle VELOCITY of data
Data Science Delivery Model

Design Thinking New Approach

Design Thinking Toolbox for Analytics

Case Study: Video Data Extraction

Video data extraction is the act or process of retrieving data out of (usually unstructured or poorly structured) video data sources for further data processing or data storage (data migration).

Source: Wikipedia
Data Sources

CCTV
1. Existing CCTV
2. New CCTV

Toll Gates Transaction
- Card id / OBU
- Plate number
- Time stamp (date & time)
- Vehicle type
- Gate number and location
- Price

People Movement Data
- Google Maps
- Social Media

More than 170Mio

Specific Solution
1. Detecting number of vehicles at each gate
2. Identifying speed of vehicle on the road
3. Traffic prediction
4. Gate recommendation
5. Data integration from multiple data sources gathered from some data center locations
6. People movement route
7. Behaviour of user

Video Analytics
People Movement Analytics
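
As a rough illustration of solutions 1 and 4, the sketch below counts vehicles per gate from toll transaction records and recommends the least busy gate. The record layout reuses the field names listed above, but the sample rows, the function names, and the "fewest vehicles wins" rule are assumptions for this example. The slides do not describe the actual method.

```python
from collections import Counter

# Hypothetical toll transactions (invented rows, fields taken from the slide).
transactions = [
    {"plate": "B1234XY", "gate": 1, "timestamp": "2021-12-24T08:00:05", "vehicle_type": "car"},
    {"plate": "B2222AA", "gate": 1, "timestamp": "2021-12-24T08:00:40", "vehicle_type": "car"},
    {"plate": "D9087ZZ", "gate": 2, "timestamp": "2021-12-24T08:01:02", "vehicle_type": "truck"},
    {"plate": "F5555CC", "gate": 3, "timestamp": "2021-12-24T08:01:30", "vehicle_type": "car"},
    {"plate": "B7777DD", "gate": 1, "timestamp": "2021-12-24T08:02:11", "vehicle_type": "bus"},
    {"plate": "D1111EE", "gate": 2, "timestamp": "2021-12-24T08:02:45", "vehicle_type": "car"},
]


def vehicles_per_gate(rows):
    """Solution 1 (simplified): count transactions recorded at each gate."""
    return Counter(row["gate"] for row in rows)


def recommend_gate(rows, open_gates):
    """Solution 4 (naive assumption): pick the open gate with the fewest vehicles.

    Ties are broken by the lower gate number.
    """
    counts = vehicles_per_gate(rows)
    return min(open_gates, key=lambda gate: (counts.get(gate, 0), gate))


print(vehicles_per_gate(transactions))
print(recommend_gate(transactions, open_gates=[1, 2, 3]))
```

A production system would count within a time window and combine these counts with the CCTV-based detections. This version only shows the shape of the computation.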

In the power of Deep Research

Source: Telkom DDS Journal
Integration Example OpenCV for Video

Source: BDA Video Analytics Architecture
Detection & Classification: MoG + SVM
• Mixture of Gaussians (MoG) models are best suited to systems with a static but complex, cluttered background
• SVM has the advantage of detailed, statistics-based computation
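
To make the MoG idea concrete, here is a minimal sketch of a per-pixel mixture-of-Gaussians background model for grayscale intensities, loosely following the classic adaptive-mixture scheme. It is a simplification under stated assumptions: the class name `PixelMoG`, all parameter values, and the use of a single learning rate `alpha` for weights, means, and variances are choices made for this example, and the SVM classification stage is not shown.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    weight: float
    mean: float
    var: float


class PixelMoG:
    """Simplified mixture-of-Gaussians background model for ONE pixel."""

    def __init__(self, k=3, alpha=0.05, init_var=36.0, min_var=4.0,
                 match_sigmas=2.5, bg_ratio=0.7):
        self.k = k                        # maximum number of Gaussians
        self.alpha = alpha                # learning rate (assumed value)
        self.init_var = init_var          # variance given to a new Gaussian
        self.min_var = min_var            # floor that keeps variances positive
        self.match_sigmas = match_sigmas  # match threshold in standard deviations
        self.bg_ratio = bg_ratio          # share of weight treated as background
        self.components = []

    def update(self, value):
        """Feed one intensity; return True if it is classified as foreground."""
        # Find the first Gaussian that explains the new value.
        matched = None
        for g in self.components:
            if (value - g.mean) ** 2 <= self.match_sigmas ** 2 * g.var:
                matched = g
                break
        # Decay every weight and reinforce the matched Gaussian.
        for g in self.components:
            g.weight = (1.0 - self.alpha) * g.weight + (self.alpha if g is matched else 0.0)
        if matched is not None:
            diff = value - matched.mean
            matched.mean += self.alpha * diff
            matched.var = max(self.min_var,
                              (1.0 - self.alpha) * matched.var + self.alpha * diff * diff)
        else:
            # No match: replace the weakest Gaussian (or add one) centred on the value.
            if len(self.components) >= self.k:
                self.components.remove(min(self.components, key=lambda g: g.weight))
            self.components.append(Gaussian(self.alpha, float(value), self.init_var))
        total = sum(g.weight for g in self.components)
        for g in self.components:
            g.weight /= total
        # Stable, heavily weighted Gaussians come first and form the background.
        self.components.sort(key=lambda g: g.weight / g.var ** 0.5, reverse=True)
        if matched is None:
            return True
        cumulative = 0.0
        for g in self.components:
            if g is matched:
                return False
            cumulative += g.weight
            if cumulative > self.bg_ratio:
                break
        return True


model = PixelMoG()
history = [model.update(v) for v in [100, 101, 99, 100, 100, 102] * 5]
print(model.update(200))  # a sudden bright value, e.g. a passing vehicle
print(model.update(100))  # the usual background intensity again
```

In a full pipeline one such model would run for every pixel, the foreground pixels would be grouped into blobs, and features of each blob would be passed to a trained SVM for vehicle classification.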

Case Study: NLP

Natural language processing (NLP) is an interdisciplinary domain encompassing linguistics, information retrieval, machine learning, probability and statistics. It is concerned with analyzing, understanding, and interpreting written text and spoken language as well as using natural languages for communicating with computers (Indurkhya and Damerau, 2010; Jurafsky and Martin, 2009; Manning and Schütze, 1999).

Source: Handbook of Statistics 33: Big Data Analytics
NLP Statistical Language Modeling

• Probability Distribution of Strings
• Maximum Likelihood Estimates
• Markov Assumption
• Unigram, Bigram, and Trigram Models
• Smoothing Parameter Values
• Evaluating Language Models
• Log Linear Models for Language Modeling
• Big Data for Building Language Models

Source: Handbook of Statistics 33: Big Data Analytics
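
Several of the topics listed above (maximum likelihood estimates, the Markov assumption, bigram models, smoothing, and evaluation) fit into one short sketch. The three-sentence corpus is invented for illustration, and add-one (Laplace) smoothing is used only because it is the simplest smoothing scheme. The slides do not say which scheme was used.

```python
import math
from collections import Counter

# Tiny invented corpus; a real model would be estimated from far more text.
corpus = [
    "big data drives ai",
    "big data needs storage",
    "ai needs big data",
]

BOS, EOS = "<s>", "</s>"  # sentence boundary markers


def tokenize(sentence):
    return [BOS] + sentence.split() + [EOS]


unigrams, bigrams = Counter(), Counter()
for sentence in corpus:
    tokens = tokenize(sentence)
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

vocab = set(unigrams)
V = len(vocab)


def p_mle(prev, word):
    """Maximum likelihood estimate under a first-order Markov assumption."""
    return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0


def p_laplace(prev, word):
    """Add-one smoothing: unseen bigrams get a small non-zero probability."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + V)


def perplexity(sentence):
    """Evaluate the smoothed model: lower perplexity means a better fit."""
    tokens = tokenize(sentence)
    log_prob = sum(math.log(p_laplace(a, b)) for a, b in zip(tokens, tokens[1:]))
    return math.exp(-log_prob / (len(tokens) - 1))


print(p_mle("big", "data"), p_mle("data", "ai"), p_laplace("data", "ai"))
print(perplexity("big data drives ai"), perplexity("storage drives big ai"))
```

The unsmoothed estimate assigns zero probability to any bigram missing from the corpus, which is why smoothing is needed before a model can be evaluated on new text, and why larger corpora ("Big Data for Building Language Models") help.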


NLP Turn Any Text Into Gold

NLP Product Portfolio

THANK YOU
Case Study: SIIS
SIIS Framework & Ecosystem Innovation

ALPRO

DEMAND

ANALYTICS

SUPPORT

TRACKING

Dashboard Captured

Case Study: Logistik Tani (Farm Logistics)

FUTURE FEATURE
Dashboard Captured

Logtan (Logistik Tani)

Drone Capture Layout: high resolution and countable size for each square
Spatial Data (Data Spasial): valid data, valid size for each square, and valid ID per square
Dashboard Captured
Blok Sedong, drone capture

Blok Lega, drone capture
Case Study: IoT for Poultry

Problems:
1. Real-time monitoring of the coops is difficult at present
2. Unreliable data, as chickens are weighed manually and on the basis of a small sample size

Impacts:
a. Productivity
b. Mortality Rates
c. Labour Cost
d. Visibility to management
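
One way to see why problem 2 matters is the standard error of the mean, which shrinks with the square root of the sample size. The sketch below compares a small manual sample with a larger set of readings. All weights are invented for illustration, and the larger set simply repeats the same values to keep the spread comparable.

```python
from math import sqrt
from statistics import stdev


def standard_error(weights_kg):
    """Standard error of the mean weight: sample stdev / sqrt(n)."""
    return stdev(weights_kg) / sqrt(len(weights_kg))


# Hypothetical chicken weights in kg.
manual_sample = [1.8, 2.1, 1.9, 2.2]   # a handful of birds weighed by hand
sensor_sample = manual_sample * 25     # 100 readings with a similar spread

print(standard_error(manual_sample))
print(standard_error(sensor_sample))
```

With 25 times as many readings, the uncertainty in the average weight falls by roughly a factor of five, which is the statistical argument for continuous weighing by sensor.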

Challenges
Manual recording of the weight of the chickens in Charoen’s poultry
farm is causing erroneous reports resulting in loss of revenue.

IoT Potential Use Cases

• Weight Sensor
• Temperature
• Humidity
• CO2 Sensor
• Wind velocity
• NH3
• Air pressure differences
Proof of Concept Smart Poultry

DESIGN
Connectivity: 3G/4G GSM-LTE
Sensors:
• Temperature and Humidity
• Weight Sensor

[Figure: design diagram. Labels: Farms, Sensors, Load Cell, Temperature & Humidity Sensor, Electronic Box, 3G/4G Connectivity, Telkomsel Base Station, IoT Platform, Dashboard.]

Challenges in implementing the PoC:
• IoT connectivity in rural areas
• Availability of sensors
• Validity of data
Dashboard Captured
