Relation Between Big Data and Business


Mohammed salih Assist. Prof. Dr.

alia' Hassan

1. Relation between Big data and business


Big data is a term for data that exceeds the processing
capabilities of spreadsheets, common statistical applications, and
conventional relational database management systems.
The data is so large and moves so fast that it does not fit the structure of
traditional database architectures.

Big data characteristics:


The three main characteristics of big data are what IBM called the 3Vs,
as follows:
1- Volume: big data comes in huge volumes, measured in terabytes and even
petabytes.
2- Velocity: big data is generated continuously and at a fast rate,
flowing rapidly through the system from various sources.
3- Variety: big data comes in various formats and types.

Figure: Big data characteristics


Big data problem
New types of data are generated from various sources such as smartphones,
tablets, and sensors. These huge amounts of data have introduced the
"big data problem": capturing, processing, and analyzing data at this
scale, a task at which traditional systems tend to fail.
Here are sample statistics on how much data is generated:
• Facebook: holds 40 PB of data and captures 100 TB per day.
• Yahoo: holds 60 PB of data.
• Twitter: captures 8 TB per day.
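To put these figures in perspective, the following minimal sketch converts a stored volume and a daily capture rate into comparable units. It assumes binary units (1 PB = 1024 TB) and a constant daily rate, neither of which is stated in the statistics above; the function name is hypothetical.

```python
TB_PER_PB = 1024  # assumption: binary units (1 PB = 1024 TB)


def days_to_accumulate(total_pb: float, rate_tb_per_day: float) -> float:
    """Days needed to accumulate total_pb petabytes at a constant daily rate."""
    return total_pb * TB_PER_PB / rate_tb_per_day


# Figures quoted above: 40 PB stored, 100 TB captured per day.
facebook_days = days_to_accumulate(40, 100)
print(f"40 PB at 100 TB/day takes about {facebook_days:.0f} days")
```

Under these assumptions, a store of that size corresponds to a little over a year of capture at the quoted rate, which illustrates how quickly such volumes outgrow traditional systems.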


Business environment
Over the last decades, the business environment has become dynamic and
fast-changing. A primary goal of companies around the world is turning
data into intelligence.
Business intelligence (BI) can be defined as the process of
collecting, organizing, analyzing, storing, and retrieving huge amounts
of data to produce high-level information, in the form of reports and
insights, that allows business management to make crucial decisions. BI
improves the performance and efficiency of business workflows so that the
right actions are taken at the right time according to the company's
strategy. BI provides easy interpretation of raw data, and it can increase
profit and decrease expenses.
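The following minimal sketch illustrates the idea of turning raw data into high-level information. The sales records, field names, and helper functions are all hypothetical; a real BI platform would draw on far larger data and dedicated tooling.

```python
from collections import defaultdict

# Hypothetical raw records, as they might arrive from operational systems.
raw_sales = [
    {"region": "North", "product": "A", "amount": 120.0},
    {"region": "North", "product": "B", "amount": 80.0},
    {"region": "South", "product": "A", "amount": 200.0},
    {"region": "South", "product": "A", "amount": 50.0},
]


def summarize_by(records, key):
    """Organize records by `key` and total the amounts for each group."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["amount"]
    return dict(totals)


def top_group(summary):
    """Return the group with the largest total."""
    return max(summary, key=summary.get)


# Analyze: a small report a manager could act on.
region_report = summarize_by(raw_sales, "region")
best_region = top_group(region_report)
print(region_report)
print("Highest revenue region:", best_region)
```

The raw rows say little on their own; the per-region totals and the top region are the kind of summarized information that supports a decision.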

Combination of business intelligence world with big data


Combining the business intelligence world with the big data world using
open source technologies is a critical solution for companies and
organizations of different sizes and operations. This combination
provides:
• A flexible platform for analyzing data of any size in various formats.
• Business insights that affect how the company or organization
functions.


2. Data Mining System with or without Data Warehouse System

A critical question in the design of a data mining (DM) system is how to
integrate it with a data warehouse (DW) system. The possible
integration schemes are as follows:
1. No coupling: the DM system works as a stand-alone system and does not
communicate with any DW system. It may fetch data from a
particular source (such as a file system), process the data, and then store
the results in another file. No coupling represents a poor design.
A DM system that does not use a DW system suffers from several
drawbacks:
• First, the DM system may spend a substantial amount of time finding,
collecting, and transforming data, whereas a DW system provides flexibility
and efficiency in storing, accessing, and processing data.
• Second, the DM system will need to use other tools to extract data,
making it difficult to integrate such a system into an information
processing environment.
2. Loose coupling: the DM system uses some facilities of a DW system.
Loose coupling is better than no coupling because the DM system can fetch
any portion of the data stored in the data warehouse by using query
processing, indexing, and other system facilities. However, it is difficult
for loose coupling to achieve high scalability and good performance with
large data sets.
3. Semi-tight coupling: a compromise between loose and tight coupling.
Besides linking the DM system to a DW system, efficient
implementations of a few essential data mining primitives are provided in
the DW system. This design enhances the performance of the DM system.
4. Tight coupling: the DM system is smoothly
integrated into the DW system, which provides a uniform information
processing environment. This approach is highly desirable because it
facilitates efficient implementations of data mining functions, high
system performance, and an integrated information processing
environment.
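The difference between the first three schemes can be sketched in a few lines. This is an illustration under stated assumptions, not a reference design: an in-memory SQLite database stands in for the DW system, the "mining" step is a toy total-spend threshold, and all table, column, and function names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical transaction data: (customer, amount).
ROWS = [("ann", 30.0), ("bob", 75.0), ("ann", 45.0), ("cat", 10.0)]


def mine_big_spenders(rows, threshold):
    """Toy 'mining' step: customers whose total spend exceeds threshold."""
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0.0) + amount
    return sorted(c for c, t in totals.items() if t > threshold)


def no_coupling(flat_file_text, threshold):
    """No coupling: parse a flat file ourselves, then mine in memory."""
    reader = csv.reader(io.StringIO(flat_file_text))
    rows = [(name, float(amount)) for name, amount in reader]
    return mine_big_spenders(rows, threshold)


def loose_coupling(conn, threshold, min_amount):
    """Loose coupling: the warehouse query engine selects the data portion,
    but the mining itself still runs outside the warehouse."""
    cur = conn.execute(
        "SELECT customer, amount FROM sales WHERE amount >= ?", (min_amount,)
    )
    return mine_big_spenders(cur.fetchall(), threshold)


def semi_tight_coupling(conn, threshold):
    """Semi-tight coupling (sketch): an aggregation primitive is executed
    inside the warehouse, so only the result leaves it."""
    cur = conn.execute(
        "SELECT customer FROM sales GROUP BY customer "
        "HAVING SUM(amount) > ? ORDER BY customer",
        (threshold,),
    )
    return [row[0] for row in cur.fetchall()]


# Stand-in data sources: a flat file and an in-memory 'warehouse'.
flat_file = "\n".join(f"{c},{a}" for c, a in ROWS)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", ROWS)

result_file = no_coupling(flat_file, 50.0)
result_loose = loose_coupling(conn, 50.0, 0.0)
result_semi = semi_tight_coupling(conn, 50.0)
print(result_file, result_loose, result_semi)
```

All three calls produce the same answer; what changes is where the work happens. Tight coupling would go one step further and run the mining functions themselves as part of the warehouse system, which this sketch does not attempt to show.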


3. OLAP and data mining complement each other


Today’s commercially available relational database systems
routinely include tools such as SQL query engines, data mining
components, and OLAP. Contrary to many assertions in the literature and
business press, performing queries on large tables or manipulating query
data via OLAP tools is not considered data mining, because no data
modeling occurs in these tools.
On the other hand, these three tools complement each other and
allow developers to pick the tool that is right for their application:
1- Queries allow ad hoc access to virtually any instance in a database.
2- Data mining tools can generate high-level models, discover hidden
patterns in data, and operate at a detail level instead of a summary level.
3- OLAP allows real-time access to pre-aggregated measures along
important business dimensions. OLAP is a data
summarization/aggregation tool that helps simplify data analysis.
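The three kinds of tools can be contrasted on one small data set. The sketch below is illustrative only: the transactions, dimension names, and functions are hypothetical, and simple pair counting stands in for a real data mining algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical detail-level transactions.
transactions = [
    {"id": 1, "region": "North", "quarter": "Q1",
     "items": {"bread", "milk"}, "amount": 12.0},
    {"id": 2, "region": "North", "quarter": "Q2",
     "items": {"bread", "milk", "eggs"}, "amount": 20.0},
    {"id": 3, "region": "South", "quarter": "Q1",
     "items": {"milk", "eggs"}, "amount": 9.0},
    {"id": 4, "region": "South", "quarter": "Q1",
     "items": {"bread", "milk"}, "amount": 11.0},
]


def query(rows, predicate):
    """Query: ad hoc access to individual instances."""
    return [row["id"] for row in rows if predicate(row)]


def build_cube(rows):
    """OLAP: pre-aggregate the amount measure along (region, quarter)."""
    cube = Counter()
    for row in rows:
        cube[(row["region"], row["quarter"])] += row["amount"]
    return cube


def roll_up_by_region(cube):
    """OLAP roll-up: summarize the cube along the region dimension."""
    totals = Counter()
    for (region, _quarter), amount in cube.items():
        totals[region] += amount
    return dict(totals)


def frequent_pairs(rows, min_support):
    """Data mining (toy): item pairs bought together at least min_support
    times, discovered from detail-level rows rather than summaries."""
    counts = Counter()
    for row in rows:
        for pair in combinations(sorted(row["items"]), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


south_large = query(
    transactions, lambda r: r["region"] == "South" and r["amount"] > 10
)
region_totals = roll_up_by_region(build_cube(transactions))
patterns = frequent_pairs(transactions, min_support=3)
print(south_large, region_totals, patterns)
```

The query returns specific rows, the roll-up returns a summary, and only the pair counting surfaces a relationship (bread and milk bought together) that was not asked for explicitly.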

By integrating OLAP with multiple data mining functions, on-line
analytical mining (OLAM) provides users with the flexibility to select
desired data mining functions and swap data mining tasks dynamically.
