
Facultat d’Economia i Empresa

Information Systems
3. BI, Analytics & Big Data
Business Intelligence

BI, ANALYTICS & BIG DATA


Databases

Database
• Collection of organized information that serves many applications
by centralizing data and controlling redundancy

Database management system (DBMS)


• Interfaces between applications and physical data files
• Separates logical and physical views of data
• Solves problems of traditional file environment
• Controls redundancy
• Eliminates inconsistency
• Uncouples programs and data
• Enables the organization to centrally manage data and data security

Based on Laudon, K.C. & Laudon, J.P. (2012) “Sistemas de Información Gerencial” 12Ed. Ed. Pearson Education. See © at the end
Databases

HUMAN RESOURCES DATABASE WITH MULTIPLE VIEWS

FIGURE 6-3 A single human resources database provides many different views of data, depending on the information
requirements of the user. Illustrated here are two possible views, one of interest to a benefits specialist and
one of interest to a member of the company’s payroll department.

Laudon, K.C. & Laudon, J.P. (2012) “Sistemas de Información Gerencial” 12Ed. Ed. Pearson Education. © Pearson Education, Inc
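
The idea of one database serving several views can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch: the table name, column names and sample rows are assumptions and are not taken from the figure.

```python
import sqlite3

# One physical table; all names and values below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT,
    ssn         TEXT,
    health_plan TEXT,
    gross_pay   REAL,
    net_pay     REAL
);
INSERT INTO employee VALUES (1, 'Ana Puig',   '000-00-0001', 'Plan A', 3000.0, 2400.0);
INSERT INTO employee VALUES (2, 'Marc Soler', '000-00-0002', 'Plan B', 2500.0, 2050.0);

-- Logical view of interest to a benefits specialist
CREATE VIEW benefits_view AS
    SELECT name, ssn, health_plan FROM employee;

-- Logical view of interest to the payroll department
CREATE VIEW payroll_view AS
    SELECT name, ssn, gross_pay, net_pay FROM employee;
""")

# Each user queries a view and never needs to know how the data is stored.
benefits_rows = conn.execute("SELECT * FROM benefits_view ORDER BY name").fetchall()
payroll_rows = conn.execute("SELECT * FROM payroll_view ORDER BY name").fetchall()
print(benefits_rows)
print(payroll_rows)
```

Both views read the same stored rows, so the data is kept once and a change to the table shows up in every view.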
Databases

• Relational DBMS
– Represents data as two-dimensional tables
– Each table contains data on an entity and its attributes

• Table: grid of columns and rows
– Rows (tuples): Records for different entities
– Fields (columns): Each represents an attribute of the entity
– Key field: Field used to uniquely identify each record
– Primary key: Field in a table used as its key field
– Foreign key: Primary key used in a second table as a look-up field to
identify records from the original table

Relational Database Tables

FIGURE 6-4 A relational database organizes data in the form of two-dimensional tables. Illustrated here are
tables for the entities SUPPLIER and PART showing how they represent each entity and its attributes.
Supplier Number is a primary key for the SUPPLIER table and a foreign key for the PART table.
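
A minimal sketch of the SUPPLIER and PART tables, using Python's built-in sqlite3 module. The column names other than the supplier number, and all sample values, are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when this is enabled
conn.executescript("""
CREATE TABLE supplier (
    supplier_number INTEGER PRIMARY KEY,   -- primary key of SUPPLIER
    supplier_name   TEXT NOT NULL
);
CREATE TABLE part (
    part_number     INTEGER PRIMARY KEY,   -- primary key of PART
    part_name       TEXT NOT NULL,
    supplier_number INTEGER NOT NULL
        REFERENCES supplier(supplier_number)   -- foreign key into SUPPLIER
);
INSERT INTO supplier VALUES (8259, 'Supplier A');
INSERT INTO part VALUES (137, 'Door latch', 8259);
""")

# The foreign key works as a look-up field: from a part we reach its supplier.
row = conn.execute("""
    SELECT part.part_name, supplier.supplier_name
    FROM part JOIN supplier USING (supplier_number)
    WHERE part.part_number = 137
""").fetchone()
print(row)

# A part that points to a supplier that does not exist is rejected.
try:
    conn.execute("INSERT INTO part VALUES (150, 'Door seal', 9999)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)
```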

Business Intelligence (BI)

Definition
Tools for consolidating, analyzing, and providing access to vast
amounts of data to help users make better business decisions.

Objective
Deliver the right information to the right individual in a timely
manner.

Tools
Does not refer to a specific product or system, but the set of processes, technologies
and applications used to support decision making.
Business intelligence includes ESS and DSS seen in chapter 1.
BI tool example: balanced scorecard.
https://www.youtube.com/watch?v=j7Csj5myIw0

Business Intelligence (BI)

[Figure: BI architecture. Data sources (ERP, CRM, web logs, legacy systems, ...) are loaded through ETL into the
Data Warehouse, which feeds several datamarts; results are delivered as reports, graphics, dashboards and maps.]
BI concepts

Storage
DWH (DataWareHouse):
• Stores current and historical data from many core operational
transaction systems
• Consolidates and standardizes information for use across enterprise,
but data cannot be altered
• Provides analysis and reporting tools

Datamart:
• Subset of a data warehouse
• Summarized or focused portion of data for use by a specific population
of users, for example, a business line or geography

ETL (Extract, Transform and Load): Process by which data is loaded into
the Data Warehouse. This process is typically done offline and includes
extraction from transactional systems, transformation and data cleansing.
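
The three ETL steps can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the source rows, the `sales_fact` table and the cleansing rule are all hypothetical, and an in-memory SQLite database stands in for the Data Warehouse.

```python
import sqlite3

# Hypothetical rows as they might come out of two transactional systems.
crm_rows = [
    {"customer": " Alice ", "country": "es", "amount": "120.50"},
    {"customer": "Bob", "country": "ES", "amount": "n/a"},  # dirty amount
]
erp_rows = [
    {"customer": "Carol", "country": "fr", "amount": "80"},
]

def extract():
    """Extract: collect raw rows from every source system."""
    return crm_rows + erp_rows

def transform(rows):
    """Transform: standardize formats and drop rows that fail cleansing."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # data cleansing: skip rows whose amount cannot be parsed
        clean.append((row["customer"].strip(), row["country"].upper(), amount))
    return clean

def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact (customer TEXT, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")  # stand-in for the Data Warehouse
load(warehouse, transform(extract()))
loaded = warehouse.execute(
    "SELECT customer, country, amount FROM sales_fact ORDER BY customer"
).fetchall()
print(loaded)
```

The row with the unparseable amount never reaches the warehouse, which is the kind of cleansing the transformation step is meant to do.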

BI concepts

Analysis

Data Mining: Process by which hidden information is extracted from a
dataset. Includes finding patterns and relationships and the prediction of
future behaviour.

OLAP (Online Analytical Processing): BI solution to analyze large
volumes of information using multi-dimensional structures (OLAP cubes).
Uses structured data stored in the Data Warehouse.
Data mining

[Figure: Data mining on the Data Warehouse (KDD: Knowledge Discovery in Databases; predictive analytics).
Techniques: clustering, statistical analysis, artificial intelligence, neural networks, cognitive computing,
machine learning. Findings: patterns, trends, relationships, dependencies, anomalies.]

OLAP: What is a cube? (1/2)
Multi-dimensional structure formed by:
Facts Table
One or more measures
One or more dimensions

Concepts:
A cube is a multidimensional data structure.
Metric is the business insight that has to be extracted from the cube.
A cube may have more than three dimensions.

Examples:
[Figure: two cubes with the product categories ADSL, TV and Landline; measure: Sales]
- TV sales in Girona during 2008
- All product sales during 2007 in all regions.
What is a CUBE? (2/2)
Facts table
Main table containing the numeric data to be analyzed.
Contains the measures and dimensions.

Measures:
Magnitudes stored in the facts table.
Limit the level of detail that can be obtained from the cube.

Dimensions:
Organize data hierarchically.
Represent information categories.
[Figure: cube with the product categories ADSL, TV and Landline]
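
A small sketch of how a facts table, a measure and dimensions fit together, in plain Python. The sales figures, the choice of three dimensions (product, region, year) and the helper names `aggregate` and `roll_up` are assumptions made for illustration; a real OLAP engine stores and indexes the cube very differently.

```python
from collections import defaultdict

# Facts table: each row holds three dimension values and one measure (sales).
# All figures are made up for illustration.
facts = [
    ("TV", "Girona", 2008, 120),
    ("TV", "Girona", 2008, 80),
    ("TV", "Barcelona", 2008, 300),
    ("ADSL", "Girona", 2007, 50),
    ("Landline", "Barcelona", 2007, 70),
    ("TV", "Girona", 2007, 90),
]
DIMENSIONS = ("product", "region", "year")

def aggregate(rows, **filters):
    """Sum the sales measure over the rows that match the given dimension values."""
    total = 0
    for row in rows:
        values = dict(zip(DIMENSIONS, row[:3]))
        if all(values[dim] == wanted for dim, wanted in filters.items()):
            total += row[3]
    return total

def roll_up(rows, dimension):
    """Total sales per value of one dimension, aggregating over the others."""
    index = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for row in rows:
        totals[row[index]] += row[3]
    return dict(totals)

# The two example queries from the slides.
tv_girona_2008 = aggregate(facts, product="TV", region="Girona", year=2008)
all_sales_2007 = aggregate(facts, year=2007)
sales_by_product = roll_up(facts, "product")
print(tv_girona_2008, all_sales_2007, sales_by_product)
```

Fixing a value for every dimension picks a single cell of the cube, while leaving a dimension unconstrained aggregates over it.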
BI history
2012 The lack of analytic information will continue to hinder business development: 35%
of companies will make wrong decisions because of the lack of accurate information.
33% of Business Intelligence data will be stored in mashup systems (web-based
technology that gathers information from different sources; example: Google Maps).

2010 20% of companies will use Business Intelligence systems in a Software as a
Service scheme (cloud computing). (1)

2009 Use of BI in social networks (such as Facebook) to predict group behaviour. (1)

1989 Howard Dresner redefines the term BI to describe concepts and methods to improve
decision making with the support of Information Systems. (2)

1958 Hans Peter Luhn uses the term Business Intelligence for the first time: the capacity to
learn from the present to focus our actions to achieve our objectives. (3)

References
(1) "Gartner Reveals BI Predictions for 2009 and Beyond", http://www.gartner.com/it/page.jsp?id=856714 [Online, 15/03/09]
(2) D.J. Power, "A Brief History of Decision Support Systems, v.4", dssresources.com/history/dsshistory.html [Online, 15/03/09]
(3) H.P. Luhn (1958), "A Business Intelligence System", www.research.ibm.com/journal/rd/024/ibmrd0204H.pdf [Online, 15/03/09]
BI characteristics (1/2)
Accessible
- Ability to use multiple data sources, both internal (the organization's information systems) and external
(market trends, social networks, …)
- Independent from the data source
- Structured in functional areas or reachable by the entire organization.
- Information is never lost or deleted; it always grows as the organization executes business.

Supports decision making
- BI systems are of the DSS or ESS type
- Predicts the future using historical data.
- Allows the information to be analyzed dynamically (navigation and drilling when analyzing the
data).

End-user oriented
- Focuses on ease of use, in order to facilitate business decisions
- Designed for users without IT knowledge (typically mid/senior management).
BI characteristics (2/2)
Deals with information, not with data
- Adds value to data in order to turn it into information.
- Base for decision making
Data → Information → Decision making
- Provides answers about the business, not the data.

Multidimensional
- It extracts information dynamically, by combining multiple criteria (dimensions)
- Supports the creation of multiple personalized reports.

“An organization can be rich in data but poor in information” (Madnick, 1993)
Normalization vs. De-normalization
Normalization
- Dimensions are separated into multiple tables.
- The information systems have to look for the data in separate tables (join).
- Example tables:
  Table 1: Employee Id, Name, Surname, Salary, Workplace Id (Foreign Key)
  Table 2: Job Id, Workplace description, Department Id (Foreign Key)
  Table 3: Department Id, Department Name, Company Id (Foreign Key)
  Table 4: Company Id, Company name

De-normalization
- Dimensions and hierarchy are consolidated in the same table.
- Faster access to information.
- Example table:
  Employee Id, Name, Surname, Salary, Job Description, Department name, Company name
  (Job Description, Department name and Company name form the workplace hierarchy)
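
The trade-off can be sketched with Python's built-in sqlite3 module. The table and column names follow the fields listed on this slide, while the exact names (`employee_denormalized`, for instance) and the sample row are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized design: one table per level of the hierarchy.
CREATE TABLE company    (company_id INTEGER PRIMARY KEY, company_name TEXT);
CREATE TABLE department (department_id INTEGER PRIMARY KEY, department_name TEXT,
                         company_id INTEGER REFERENCES company(company_id));
CREATE TABLE workplace  (job_id INTEGER PRIMARY KEY, workplace_description TEXT,
                         department_id INTEGER REFERENCES department(department_id));
CREATE TABLE employee   (employee_id INTEGER PRIMARY KEY, name TEXT, surname TEXT,
                         salary REAL,
                         workplace_id INTEGER REFERENCES workplace(job_id));

INSERT INTO company    VALUES (1, 'Example Corp');
INSERT INTO department VALUES (10, 'Finance', 1);
INSERT INTO workplace  VALUES (100, 'Accountant', 10);
INSERT INTO employee   VALUES (1000, 'Ana', 'Puig', 2800.0, 100);

-- De-normalized design: the whole hierarchy is copied into a single table.
CREATE TABLE employee_denormalized AS
    SELECT e.employee_id, e.name, e.surname, e.salary,
           w.workplace_description AS job_description,
           d.department_name, c.company_name
    FROM employee e
    JOIN workplace  w ON w.job_id = e.workplace_id
    JOIN department d ON d.department_id = w.department_id
    JOIN company    c ON c.company_id = d.company_id;
""")

# Normalized: three joins are needed to reach the company name.
normalized = conn.execute("""
    SELECT e.name, d.department_name, c.company_name
    FROM employee e
    JOIN workplace  w ON w.job_id = e.workplace_id
    JOIN department d ON d.department_id = w.department_id
    JOIN company    c ON c.company_id = d.company_id
""").fetchone()

# De-normalized: the same answer comes from one table, with no joins.
denormalized = conn.execute(
    "SELECT name, department_name, company_name FROM employee_denormalized"
).fetchone()
print(normalized, denormalized)
```

The de-normalized table answers the query without joins, at the cost of repeating the department and company names on every employee row.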
Business Intelligence Market Share

Source: http://about.g2crowd.com/blog/best-business-intelligence-software-fall-2014-g2-crowd/


Big Data

BI, ANALYTICS & BIG DATA


Data, data and more data

Exabytes: 10^18 bytes.


Even more data

Zettabytes: 10^21 bytes.


http://www.slideshare.net/McKinseyCompany/digital-globalization-the-new-era-of-global-flows/3-McKinsey_Company_2Used_crossborder_bandwidthCrossborder
Big Data & Analytics

Big data
• Massive volume of both structured and unstructured data so large that
it is difficult to process using traditional database and software
techniques (Laudon)
• Big data is high-volume, high-velocity and/or high-variety information
assets that demand cost-effective, innovative forms of information
processing that enable enhanced insight, decision making, and process
automation (Gartner)

Big data analytics


• Process of analyzing big data to gain useful insight to increase
revenues, get or retain customers, and improve operations.
• Big data analytics has the potential to hugely impact both businesses
and society.
Big Data

[Figure: Big Data sources: ERP, CRM, web logs, legacy systems, Data Warehouse, social networks,
webs, blogs, forums, IoT, ...]
The Vs of Big Data

- Volume
- Velocity
- Variety
- Veracity
- Value

https://www.experian.co.uk/blogs/latest-thinking/identity-and-fraud/the-evolution-of-big-data-the-6vs/
Traditional bottlenecks in data usage.

Historically, several bottlenecks have limited the ability to spread the use of data
and analytics in organizations:

- Insufficient data and/or data quality.

- The technology needed to store, transform and analyze information was expensive and
complex -> high Total Cost of Ownership.

- The talent to analyze and exploit information was rare, hence expensive.

- The organizational culture did not value data as a key asset; data analysis was not central to
decision making.

- Data isolation and the difficulty of aggregating different sources kept analytics-based decision making
at a tactical level, rather than at a strategic level.

Sam Ransbotham (23/11/2015): http://sloanreview.mit.edu/article/the-ethics-of-wielding-an-analytical-hammer/
How bottlenecks are being eliminated

- Businesses are based on Information Systems, which are mature and have
been integrated in the entire value chain. These Information Systems provide a
large amount of high quality data that can deliver a comprehensive view of
the organization.

- External data sources are being incorporated (such as connected devices (IoT)
or social networks), that can deliver context and high analytic potential.

- New technologies are being incorporated at a fast pace, delivering additional
information not available in the past.

- Organizational culture is shifting towards data-oriented decision making.

- Infrastructure and data management improvements help make relevant
information available and create enterprise-wide valuable reports, even in
very large and complex organizations.

Sam Ransbotham @ransbotham (23/11/2015): http://sloanreview.mit.edu/article/the-ethics-of-wielding-an-analytical-hammer/
Absorptive Capacity

The ability of a firm to recognize the value of new, external information,
assimilate it and apply it to commercial ends.

It is considered critical to the innovative capabilities of a firm.


Cohen & Levinthal, 1990 (http://www.jstor.org/stable/2393553)

How can such large amounts of data be
handled?

A single computer node is not sufficient to store and process such large
amounts of data. The storage and processing of big data has to be
distributed among clusters of machines.
The most popular way to handle and process data in a distributed manner is
MapReduce:
- Introduced by Google; development of its open-source implementation was led largely by Yahoo.
- The most popular implementation is open source: Hadoop.
- It allows the storage and computing of data to be distributed among different
computers / servers, making the processing of big data possible.
- Parallelizing tasks provides the ability to process huge amounts of data
quickly.
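
The MapReduce idea can be sketched on a single machine with a word count. This toy version does not use Hadoop or any of its APIs: the function names `map_phase`, `shuffle` and `reduce_phase` and the sample text are assumptions, and a thread pool stands in for the separate servers of a real cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def map_phase(chunk):
    """Map: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(mapped_chunks):
    """Shuffle: group the values emitted by all mappers by key."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for word, count in pairs:
            grouped[word].append(count)
    return grouped

def reduce_phase(word, counts):
    """Reduce: combine all values of one key into a single result."""
    return word, sum(counts)

# Each chunk plays the role of a block of data stored on a different node.
chunks = [
    "big data needs big clusters",
    "data is distributed",
    "clusters process data",
]

# The map tasks are independent, so they can run in parallel.
with ThreadPoolExecutor() as pool:
    mapped = list(pool.map(map_phase, chunks))

grouped = shuffle(mapped)
word_counts = dict(reduce_phase(word, counts) for word, counts in grouped.items())
print(word_counts)
```

Because each map task only sees its own chunk and each reduce task only sees one key, the work can be spread over as many machines as there are chunks and keys.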

How can such large amounts of data be
handled? (II)

FIGURE 6-12 (from the course reference book) A contemporary business intelligence infrastructure features
capabilities and tools to manage and analyze large quantities and different types of data from multiple sources.

Who manages it?
Data Science & Data Scientist

Jeff Hammerbacher (Facebook) & D.J. Patil (LinkedIn), 2008

1990s quants (Wall Street quantitative analysts): highly
intelligent, curious, mathematical individuals applying
methods from eclectic fields to new problems.
Arose out of necessity in early data-driven companies
such as Google, Yahoo, and Amazon.
There is a high demand for these professionals; they are
rare and expensive.
The combination of mathematical, statistical, and coding
skills is vital for big data analytics.
These skills can be acquired and developed across a
team and not just within a single individual.

https://www.ibm.com/big-data/us/en/
http://www.ibmbigdatahub.com/blog/going-beyond-data-science-toward-analytics-ecosystem-part-1
https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century
Examples

Big data and analytics: Watson on medical records, cancer research
- https://www.youtube.com/watch?v=UIExBJN1ZTk

SAP Smart meter analytics
- https://www.youtube.com/watch?v=Z2z9wDlocAQ

BBVA innovation center
- http://www.centrodeinnovacionbbva.com/noticias/ejemplos-reales-del-uso-de-big-data
