
BI&DWH

Introduction
BI Definition:
• The term Business Intelligence (BI) refers to technologies,
applications and practices for the collection, integration,
analysis, and presentation of business information.
• The purpose of Business Intelligence is to support better
business decision making.
• Essentially, Business Intelligence systems are data-driven
Decision Support Systems (DSS).
• The term Business Intelligence is sometimes used interchangeably
with briefing books, reporting and query tools, and executive
information systems.
Data Analysis and Statistics
• Data analysis is getting a lot of attention
nowadays. It plays an increasingly large
role at companies of all sizes, including small
startups. The practice of data analysis has
developed gradually over time, benefiting
greatly from advances in computing. Let’s
take a short journey together through the
history of data analysis.
Data Analysis and Statistics … Continued
• Data analysis is rooted in statistics, which has a long history. Statistics is
said to have begun in ancient Egypt, which took periodic censuses in
support of building the pyramids.
• Throughout history, statistics has played an important role for governments
all across the world, for the creation of censuses, which were used for
various governmental planning activities (including, of course, taxation).
• With the data collected, we can move on to the next step, which is the
analysis of that data.
• Data analysis is a process that begins with retrieving data from various
sources and then analyzing it with the goal of discovering beneficial
information. For example, the analysis of population growth by district can
help governments determine the number of hospitals that would be
needed in a given area.
Data Analysis and Computing
• The invention of computers and the subsequent advances in
computing technology dramatically enhanced what we can do
with data analysis.
• Before computers, it took over 7 years to process the data
collected in the 1880 US Census and to arrive at a final report.
To shorten this time, Herman Hollerith invented the “Tabulating
Machine”, which was used for the 1890 Census. This
machine was capable of systematically processing data recorded
on punch cards. Thanks to the Tabulating Machine, the 1890
census finished in only 18 months and on a much smaller
budget.
• Advances in Collection Mechanisms
Relational Databases
• After the von Neumann architecture was invented, data
came to be stored and processed by computers for data
analysis. The turning point was the spread of the
relational database (RDB) in the 1980s, which allowed users
to write SQL (originally called SEQUEL) to retrieve data
from a database. For users, the advantage of RDB and SQL
is the ability to analyze their data on demand. This made
retrieving data easy and helped to spread database use.
The combination of easier/cheaper data collection with
cheaper/faster data storage/retrieval technology has
pushed the boundaries of what we can do with data.
Data warehouse and Business Intelligence
• From around the late 1980s, the amount of data collected continued to increase
significantly, thanks to the ever-decreasing cost of hard disk drives. That’s when
William H. Inmon proposed the “data warehouse”, a system optimized for
reporting and data analysis. Unlike ordinary relational databases, which are
typically tuned for transactions, data warehouses are usually optimized for query
response time. Data is often stored with a timestamp, and operations such as
DELETEs and UPDATEs are used much less frequently. For example, if a business
wants to compare sales trends for each month, all sales transactions can be stored
with timestamps in a data warehouse and queried based on this timestamp.
The term “BI (Business Intelligence)” was proposed in 1989 by Howard Dresner,
who later became an analyst at Gartner. BI supports better business decision
making through searching, collecting and analyzing accumulated business data.
The birth of the concept was only natural, given the quality of technologies like
databases and data warehouses available to support it. Big companies in particular
embraced BI by analyzing customer data systematically when making business decisions.
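The monthly sales comparison described above can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for a real data warehouse, and the table name `sales`, its columns `sold_at` and `amount`, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical in-memory table standing in for a warehouse fact table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sold_at TEXT, amount REAL)")

# Made-up sample transactions, each stored with a timestamp.
rows = [
    ("2024-01-05 10:15:00", 120.0),
    ("2024-01-20 16:40:00", 80.0),
    ("2024-02-03 09:05:00", 200.0),
    ("2024-02-17 13:30:00", 50.0),
    ("2024-03-11 11:00:00", 75.0),
]
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# Group transactions by the year-month part of the timestamp
# and total the amounts, giving one row per month.
monthly_totals = conn.execute(
    """
    SELECT strftime('%Y-%m', sold_at) AS month, SUM(amount) AS total
    FROM sales
    GROUP BY month
    ORDER BY month
    """
).fetchall()

for month, total in monthly_totals:
    print(month, total)
```

Note that the rows are only inserted and read, never updated or deleted, which matches the append-mostly usage pattern of a data warehouse.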
Data Mining
• Data mining, which appeared around the 1990s, is the
computational process of discovering patterns in large
datasets. Analyzing data in ways that differ from the usual
methods can yield unexpected but beneficial results.
The development of data mining was made
possible by database and data warehouse
technologies, which enable companies to store more data
and still analyze it in a reasonable amount of time. A general
business trend emerged, where companies started to
“predict” customers’ potential needs based on analysis of
historical purchasing patterns.
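As a very small illustration of finding a pattern in purchase history, the sketch below counts which pairs of items are bought together most often. It is a simplified co-occurrence count rather than a full data mining algorithm, and the `baskets` data is made up for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase history: each set is one customer's basket.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]

# Count how often each pair of items appears in the same basket.
# Sorting the items gives every pair one canonical ordering.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# The most frequent pair hints at what a customer may want next.
top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```

A business could use such a count to suggest butter to a customer who has just bought bread. Real systems work on far larger datasets and use more careful measures than raw counts.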
Google Web Search
• The next big change was the internet. To meet the demand for
finding particular websites on the web, Larry Page and
Sergey Brin developed the Google search engine, which
processes and analyzes big data across distributed computers.
Remarkably, the Google engine returns the results you most
likely wanted to see within a few seconds. The key
points of this system are that it was “automated”, “scalable”
and “high performance”. Google's 2004 paper on MapReduce
greatly inspired engineers, pulling in an influx of talent to the
challenge of handling big data. In the late 2000s, many open
source software projects like Apache Hadoop and Apache
Cassandra were created to take on this challenge.
Big Data Analysis on the Cloud
• In the early 2010s, Amazon Redshift, a cloud-based
data warehouse, and Google BigQuery, which
processes a query across thousands of Google servers, were
released. Both brought a remarkable fall in cost and
lowered the hurdle to processing big data. Nowadays,
almost any company can get an infrastructure for big
data analysis within a reasonable budget. Even startups,
which traditionally did not have a budget to conduct
such analysis, are now able to repeat PDCA (plan-do-check-act)
cycles rapidly by using big data tools such as Amazon Redshift.
