
IBM Security Systems

Big Data Analytics


Lecture Series

2013 IBM Corporation

2012 IBM Corporation


What is the aim of the course?

Focus is on systems and applications for cloud-based storage and processing of Big Data.
+Big Data - Definition
+Big Data - Analytics
+Big Data - Storage (HDFS)
+Big Data - Computing (Map/Reduce)
+Big Data - Database (HBase)
+Big Data - Graph DB (Titan)
+Big Data - Streaming (Storm)


Mantra

Learning is not restricted to listening; it also means actively asking relevant questions.


What are we going to understand?

What is Big Data?
How did we end up here?
To whom does it matter?
Where is the money?
Are we ready to handle it?
What are the concerns?
Tools and technologies
Is Big Data <=> Hadoop?

Simple to start

What is the maximum file size you have dealt with so far?
Movies, files, or streaming video that you have used?
What have you observed?

What is the maximum download speed you get?
Simple computation:
How much time does it take just to transfer the file?
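The "simple computation" above can be sketched in a few lines of Python. This is only an illustration: the file sizes and the 100 Mbit/s link speed are made-up inputs, not figures from the lecture, and the function name `transfer_time_seconds` is hypothetical. Protocol overhead and congestion are ignored, so real transfers will be slower.

```python
def transfer_time_seconds(file_size_bytes: float, link_speed_bits_per_sec: float) -> float:
    """Idealized time to move a file over a link (no overhead, no congestion)."""
    return file_size_bytes * 8 / link_speed_bits_per_sec

# Decimal units, as used by network and storage vendors.
GB = 10**9
TB = 10**12
MBIT_PER_SEC = 10**6

# Hypothetical inputs: a 4 GB movie and a 1 TB dataset over a 100 Mbit/s link.
movie = transfer_time_seconds(4 * GB, 100 * MBIT_PER_SEC)
dataset = transfer_time_seconds(1 * TB, 100 * MBIT_PER_SEC)

print(f"4 GB at 100 Mbit/s: {movie / 60:.1f} minutes")
print(f"1 TB at 100 Mbit/s: {dataset / 3600:.1f} hours")
```

Under these assumptions the movie takes a little over five minutes, while the terabyte takes close to a day, before any processing has even started.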


What is big data?

Every day, we create 2.5 quintillion bytes of data, so much that 90% of the data
in the world today has been created in the last two years alone. This data
comes from everywhere: sensors used to gather climate information, posts to
social media sites, digital pictures and videos, purchase transaction records,
and cell phone GPS signals, to name a few. This data is big data.


Huge amount of data

There are huge volumes of data in the world:
+From the beginning of recorded time until 2003, we created 5 billion gigabytes (5 exabytes) of data.
+In 2011, the same amount was created every two days.
+In 2013, the same amount of data is created every 10 minutes.


Big data spans three dimensions: Volume, Velocity and Variety


Volume: Enterprises are awash with ever-growing data of all types, easily
amassing terabytes, even petabytes, of information.
+Turn 12 terabytes of Tweets created each day into improved product sentiment analysis
+Convert 350 billion annual meter readings to better predict power consumption
Velocity: Sometimes 2 minutes is too late. For time-sensitive processes such as
catching fraud, big data must be used as it streams into your enterprise in order to
maximize its value.
+Scrutinize 5 million trade events created each day to identify potential fraud
+Analyze 500 million daily call detail records in real time to predict customer churn faster
+The latest I have heard is that even a 10-nanosecond delay is too much.
Variety: Big data is any type of data, structured and unstructured, such as
text, sensor data, audio, video, click streams, log files and more. New insights are
found when analyzing these data types together.
+Monitor 100s of live video feeds from surveillance cameras to target points of interest
+Exploit the 80% data growth in images, video and documents to improve customer satisfaction
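To get a feel for what the daily figures quoted above mean for a system that must keep up in real time, they can be converted into per-second rates. The sketch below assumes the load is spread evenly over 24 hours (real traffic is bursty, so peaks will be higher) and that "terabyte" means 10^12 bytes; the helper name `per_second` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def per_second(daily_total: float) -> float:
    """Average rate per second, assuming an evenly spread daily load."""
    return daily_total / SECONDS_PER_DAY

# Daily event counts taken from the slide.
daily_events = {
    "trade events": 5_000_000,
    "call detail records": 500_000_000,
}
for name, total in daily_events.items():
    print(f"{name}: about {per_second(total):,.0f} per second")

# 12 TB of tweets per day, assuming decimal terabytes (10**12 bytes).
tweet_bytes_per_day = 12 * 10**12
print(f"tweets: about {per_second(tweet_bytes_per_day) / 10**6:.0f} MB per second")
```

Even as averages, these work out to thousands of call records and well over a hundred megabytes of tweets every second, which is why the Velocity dimension calls for stream processing instead of overnight batch jobs.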

Finally.

Big Data is similar to small data, but bigger.

But because the data is bigger, it requires different approaches:
techniques, tools, architecture
with the aim of solving new problems,
or old problems in a better way.


To whom does it matter?

Research community
Business community: new tools, new capabilities, new infrastructure, new business
models, etc.
By sector:
Financial Services..

What do the revenues look like?


The Social Layer in an Instrumented Interconnected World


+30 billion RFID tags today (1.3B in 2005)
+12+ TBs of tweet data every day
+4.6 billion camera phones world wide
+? TBs of data every day
+100s of millions of GPS enabled devices sold annually
+2+ billion people on the Web by end 2011
+25+ TBs of log data every day
+76 million smart meters in 2009, 200M by 2014


What does Big Data trigger?

From Big Data and the Web: Algorithms for Data Intensive Scalable Computing, Ph.D Thesis, Gianmarco

Types of tools typically used in a Big Data scenario

Where is the processing hosted?
Distributed server/cloud
Where is the data stored?
Distributed storage (e.g., Amazon S3)
What is the programming model?
Distributed processing (Map/Reduce)
How is the data stored and indexed?
High-performance, schema-free database
What operations are performed on the data?
Analytic/semantic processing (e.g., RDF/OWL)
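To make the Map/Reduce programming model mentioned above concrete, here is a minimal single-process sketch of its three steps (map, shuffle, reduce) using word counting as the task. It runs in plain Python on one machine and does not use Hadoop or any other framework; the function names and the sample input are hypothetical, chosen only for illustration. In a real cluster the map and reduce calls would run in parallel on different nodes and the shuffle would move data across the network.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)

def word_count(lines: Iterable[str]) -> Dict[str, int]:
    mapped = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

# Hypothetical sample input.
sample = ["big data is big", "data is everywhere"]
counts = word_count(sample)
print(counts)
```

The point of the model is that `map_phase` looks at one line at a time and `reduce_phase` at one key at a time, so neither needs the whole dataset in memory and both can be spread over many machines.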

When is dealing with Big Data hard?

When the operations on the data are complex:
+Simple counting, for example, is not a complex problem.
+Modeling and reasoning with data of different kinds can get extremely complex.
Good news with Big Data:
+Often, because of the vast amount of data, modeling techniques can get simpler
(e.g., smart counting can replace complex model-based analytics),
+as long as we deal with the scale.
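As one possible illustration of "smart counting" standing in for a more elaborate model, the sketch below suggests the next word in a phrase purely by counting which word most often followed the current one in a corpus. The example task, the function names, and the three-sentence corpus are all assumptions made for this sketch; the slide does not name a specific technique. With only a few sentences the suggestions are trivial, but the same counting approach tends to improve as the corpus grows, which is the point being made above.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional

def build_bigram_counts(sentences: Iterable[str]) -> Dict[str, Counter]:
    """Count, for every word, which words follow it and how often."""
    following: Dict[str, Counter] = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def suggest_next(following: Dict[str, Counter], word: str) -> Optional[str]:
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical toy corpus; a real one would hold billions of sentences.
corpus = [
    "big data needs new tools",
    "big data needs storage",
    "big clusters need power",
]
model = build_bigram_counts(corpus)
print(suggest_next(model, "big"))
print(suggest_next(model, "data"))
```

No grammar or language model is involved here, only counts, so the hard part shifts from modeling to storing and counting at scale.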

Time for thinking

What do you do with the data?

Let's take an example:

From application developers to video streamers, organizations of all sizes face the
challenge of capturing, searching, analyzing, and leveraging as much as terabytes of
data per second, too much for the constraints of traditional system capabilities and
database management tools.

Why Big-Data?

Key enablers for the appearance and growth of Big-Data are:
+Increase in storage capabilities
+Increase in processing power
+Availability of data
