Bi 1nov2017 One
Question 1a)
i. Manufacturing
ii. Retail
iii. Financial Services
iv. Transportation
v. Telecommunications [10 Marks]
SOLUTION
BI technology is used in manufacturing for order shipment and customer support, in retail for
user profiling to target grocery coupons during checkout, in financial services for claims
analysis and fraud detection, in transportation for fleet management, and in
telecommunications for identifying reasons for customer churn and network usage analysis.
Question 1b)
i. ETL: Extract-Transform-Load
Question 1c)
Identify the three factors (metrics) which determine the regression technique to be
used in making predictions. [5 Marks]
710/16/S07 BUSINESS INTELLIGENCE
Solutions
There are various kinds of regression techniques available to make predictions. The choice of
technique is mostly driven by three metrics: the number of independent variables, the type of
dependent variable, and the shape of the regression line.
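As a rough illustration of how these three metrics narrow the choice, the sketch below maps
them to a few common techniques. The function name, the category labels and the mapping
itself are assumptions made for illustration; they are not part of the marking scheme.

```python
def suggest_regression(n_independent: int, dependent_type: str, line_shape: str) -> str:
    """Suggest a regression technique from the three metrics (illustrative only).

    n_independent  -- number of independent variables (>= 1)
    dependent_type -- "continuous" or "binary" (hypothetical labels)
    line_shape     -- "straight" or "curved" (hypothetical labels)
    """
    if n_independent < 1:
        raise ValueError("need at least one independent variable")
    if dependent_type == "binary":
        # A yes/no outcome is commonly modelled with logistic regression.
        return "logistic regression"
    if dependent_type != "continuous":
        raise ValueError(f"unknown dependent type: {dependent_type}")
    if line_shape == "curved":
        return "polynomial regression"
    if line_shape != "straight":
        raise ValueError(f"unknown line shape: {line_shape}")
    if n_independent == 1:
        return "simple linear regression"
    return "multiple linear regression"
```

For example, one independent variable, a continuous dependent variable and a straight line
point to simple linear regression under this assumed mapping.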
Question 2a)
A data warehouse is a federated repository for all the data that an enterprise's various
business systems collect. The repository may be physical or logical.
Question 2b)
Question 2c)
i. Subject-Oriented
ii. Integrated
iii. Time-Variant
iv. Non-volatile
Subject-Oriented: A data warehouse can be used to analyze a particular subject area. For
example, "sales" can be a particular subject.
Integrated: A data warehouse integrates data from multiple data sources. For example,
source A and source B may have different ways of identifying a product, but in a data
warehouse, there will be only a single way of identifying a product.
Time-Variant: Historical data is kept in a data warehouse. For example, one can retrieve
data from 3 months, 6 months or 12 months ago, or even older data, from a data warehouse.
This contrasts with a transaction system, where often only the most recent data is kept. For
example, a transaction system may hold only the most recent address of a customer, whereas
a data warehouse can hold all addresses associated with a customer.
Non-volatile: Once data is in the data warehouse, it will not change. So, historical data in a
data warehouse should never be altered.
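The Integrated, Time-Variant and Non-volatile characteristics can be sketched in a few lines
of Python. This is a minimal illustration only: the source names, product codes, customer
data and the AddressHistory class are all hypothetical.

```python
from datetime import date

# Integrated: hypothetical source-specific product codes map to one warehouse key.
PRODUCT_KEY = {("A", "SKU-001"): "P1", ("B", "1001"): "P1"}


def warehouse_product_key(source: str, source_code: str) -> str:
    """Return the single warehouse identifier for a source-specific code."""
    return PRODUCT_KEY[(source, source_code)]


class AddressHistory:
    """Append-only address history (Time-Variant and Non-volatile)."""

    def __init__(self):
        self._rows = []  # each row: (customer_id, address, valid_from)

    def record(self, customer_id, address, valid_from):
        # Rows are only ever appended; existing rows are never updated or deleted.
        self._rows.append((customer_id, address, valid_from))

    def all_addresses(self, customer_id):
        """Every address ever held for the customer, oldest first."""
        ordered = sorted(self._rows, key=lambda row: row[2])
        return [address for cid, address, _ in ordered if cid == customer_id]

    def latest_address(self, customer_id):
        """What a transaction system would typically keep: the newest address only."""
        addresses = self.all_addresses(customer_id)
        return addresses[-1] if addresses else None


history = AddressHistory()
history.record("C1", "12 Old Road", date(2015, 1, 1))
history.record("C1", "7 New Street", date(2017, 6, 1))
```

Here history.all_addresses("C1") returns both addresses, while history.latest_address("C1")
returns only the newest one.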
Question 3a) Write notes on the following types of reports with the aid of examples
i. Management reporting
SOLUTION
Ad-hoc Analysis – Answer any question at any time by exploring multi-dimensional data
from many different perspectives by filtering, drilling down, slicing, and dicing.
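Filtering, slicing, dicing and drilling down can be illustrated on a tiny in-memory data set.
The sketch below uses plain Python rather than an OLAP tool; the sales rows, the dimension
names and the helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, product, sales)
SALES = [
    (2016, "Q1", "North", "Bread", 100),
    (2016, "Q2", "North", "Bread", 120),
    (2016, "Q1", "South", "Milk", 80),
    (2017, "Q1", "North", "Milk", 90),
    (2017, "Q1", "South", "Bread", 110),
]
DIMS = {"year": 0, "quarter": 1, "region": 2, "product": 3}


def dice(rows, **filters):
    """Keep rows matching every filter; a single filter is a slice."""
    return [
        row for row in rows
        if all(row[DIMS[dim]] in allowed for dim, allowed in filters.items())
    ]


def roll_up(rows, *dims):
    """Total the sales measure for each combination of the given dimensions."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[DIMS[dim]] for dim in dims)] += row[4]
    return dict(totals)


by_year = roll_up(SALES, "year")                      # summary level
slice_2016 = dice(SALES, year={2016})                 # slice: fix one dimension
drill_2016 = roll_up(slice_2016, "year", "quarter")   # drill down: year -> quarter
north_bread = dice(SALES, region={"North"}, product={"Bread"})  # dice: two dimensions
```

Drilling down simply adds a finer dimension (quarter) to the grouping, so the 2016 total is
split into its quarterly parts.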
Question 3b)
Identify and explain any four basic Characteristics of OLAP [12 Marks]
SOLUTION
To deliver efficient decision support, OLAP tools must have advanced data access features.
Such features include:
• Access to many different kinds of DBMSs, flat files, and internal and external data sources.
• Access to aggregated data warehouse data as well as to the detail data found in
operational databases.
• The ability to map end-user requests, expressed in either business or model terms, to the
appropriate data source and then to the proper data access language (usually SQL). The
query code must be optimized to match the data source, regardless of whether the source is
operational or data warehouse data.
• Support for very large databases. As already explained, the data warehouse can easily and
quickly grow to multiple gigabytes and even terabytes.
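The third feature above, mapping a request stated in business terms to a data source and then
to SQL, can be sketched as follows. This is only an assumed, minimal design: the TERMS and
DIMENSIONS dictionaries, the table name sales_summary and the build_query helper are
hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical mapping: business term -> (table, SQL aggregate expression)
TERMS = {"total sales": ("sales_summary", "SUM(amount)")}
# Dimensions an end user may group by (a whitelist, since they are placed in the SQL text).
DIMENSIONS = {"region", "product"}


def build_query(measure: str, group_by: str) -> str:
    """Translate a business-term request into a SQL statement."""
    if group_by not in DIMENSIONS:
        raise ValueError(f"unknown dimension: {group_by}")
    table, expression = TERMS[measure]
    return (
        f"SELECT {group_by}, {expression} FROM {table} "
        f"GROUP BY {group_by} ORDER BY {group_by}"
    )


# Stand-in for aggregated data warehouse data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_summary (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_summary VALUES (?, ?, ?)",
    [("North", "Bread", 100.0), ("South", "Milk", 50.0), ("North", "Milk", 25.0)],
)

sql = build_query("total sales", "region")
result = conn.execute(sql).fetchall()
```

The end user only supplies "total sales" and "region"; the choice of table and the SQL text
are derived from the mapping.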
Question 4a)
Define data mining [2 Marks]
Data mining is the computing process of discovering patterns in large data sets involving
methods at the intersection of artificial intelligence, machine learning, statistics, and
database systems. It is an interdisciplinary subfield of computer science. The overall goal of
the data mining process is to extract information from a data set and transform it into an
understandable structure for further use. Aside from the raw analysis step, it involves
database and data management aspects, data pre-processing, model and inference
considerations, interestingness metrics, complexity considerations, post-processing of
discovered structures, visualization, and online updating. Data mining is the analysis step
of the "knowledge discovery in databases" process, or KDD.
Question 4b)
Explain the relationship between Data Mining and Data Warehouse [10 Marks]
Key difference: Data mining is the analysis of data. It is the computer-assisted process of
digging through and analyzing enormous sets of data that have either been compiled by the
computer or been entered into the computer. Data warehousing is the process of compiling
information or data into a data warehouse. A data warehouse is a database used to store
data.
Data mining is the analysis of data. It is the computer-assisted process of digging through
and analyzing enormous sets of data that have either been compiled by the computer or been
entered into the computer. In data mining, the computer analyzes the data and extracts
meaning from it. It also looks for hidden patterns within the data and tries to predict
future behavior. Data mining is mainly used to find and show relationships among the data.
The purpose of data mining, also known as knowledge discovery, is to allow businesses to
view these behaviors, trends and/or relationships and to be able to factor them within their
decisions. This allows the businesses to make proactive, knowledge-driven decisions.
The term ‘data mining’ comes from the fact that the process of data mining, i.e. searching
for relationships between data, is similar to mining and searching for precious materials.
Data mining tools use artificial intelligence, machine learning, statistics, and database
systems to find correlations between the data. These tools can help answer business
questions that traditionally were too time consuming to resolve.
Data Mining includes various steps, including the raw analysis step, database and data
management aspects, data preprocessing, model and inference considerations,
interestingness metrics, complexity considerations, post-processing of discovered
structures, visualization, and online updating.
Data warehousing, in contrast, is a different activity, although data warehousing and data
mining are interrelated. Data warehousing is the process of compiling information or data
into a data warehouse. A data warehouse is a database used to store data. It is a central
repository in which data from various sources is stored. The data warehouse is then used for
reporting and data analysis. It can be used for creating trend reports for senior
management, such as annual and quarterly comparisons.
The purpose of a data warehouse is to give users flexible access to the data. Data
warehousing generally refers to the combination of many different databases across an
entire enterprise.
The main difference between data warehousing and data mining is that data warehousing is
the process of compiling and organizing data into one common database, whereas data
mining is the process of extracting meaningful patterns from that database. When the
warehouse is the data source, mining can only be done on data that has already been loaded
into it.
Question 4c)
ii. itemset
iv. confidence
SOLUTION
Market Basket Analysis is a modelling technique based upon the theory that if you buy a
certain group of items, you are more (or less) likely to buy another group of items. For
example, if you are in an English pub and you buy a pint of beer and don't buy a bar meal,
you are more likely to buy crisps (US: chips).
An itemset is the set of items a customer buys at the same time. An association between
itemsets is typically stated as a logic rule like IF {bread, peanut butter} THEN {jelly}. An
itemset can range from no items (the empty itemset is usually ignored) to all items in the
data set.
The support count is a count of how often the itemset appears in the transaction database.
The support is how often the itemset appears, stated as a probability. For example, if the
support count is 21 out of a possible 1,000 transactions, then the support is 21/1,000 or
0.021.
The confidence of a rule IF X THEN Y is the conditional probability that Y is purchased
given that X is purchased: the support count of X and Y together divided by the support
count of X.
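The support count, support and confidence described above can be sketched as follows. The
transactions are made-up sample data and the function names are illustrative; confidence is
computed here for a rule IF antecedent THEN consequent as the support count of both itemsets
together divided by the support count of the antecedent.

```python
# Hypothetical transaction database: each transaction is the set of items bought together.
TRANSACTIONS = [
    {"bread", "peanut butter", "jelly"},
    {"bread", "peanut butter"},
    {"bread", "milk"},
    {"milk", "jelly"},
    {"bread", "peanut butter", "jelly", "milk"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item in the itemset."""
    return sum(1 for transaction in transactions if itemset <= transaction)


def support(itemset, transactions):
    """Support count stated as a probability."""
    return support_count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Conditional probability of the consequent given the antecedent."""
    base = support_count(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    return support_count(antecedent | consequent, transactions) / base


rule_if = {"bread", "peanut butter"}
rule_then = {"jelly"}
rule_support = support(rule_if | rule_then, TRANSACTIONS)
rule_confidence = confidence(rule_if, rule_then, TRANSACTIONS)
```

In this sample, {bread, peanut butter} appears in 3 of the 5 transactions and jelly accompanies
it in 2 of those, so the rule has support 2/5 and confidence 2/3.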
Question 5a)
Briefly explain text mining [3 Marks]
Text mining, also referred to as text data mining, roughly equivalent to text analytics, is the
process of deriving high-quality information from text. High-quality information is typically
derived through the devising of patterns and trends through means such as statistical
pattern learning.
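A very small example of deriving a pattern from text is counting which terms recur across
documents. The sketch below is only a simple term-frequency count, not a full text mining
pipeline; the sample documents, the stop-word list and the function name are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "is", "and", "of", "to", "was"}


def term_frequencies(documents):
    """Count how often each non-stop-word term occurs across all documents."""
    counts = Counter()
    for document in documents:
        tokens = re.findall(r"[a-z']+", document.lower())
        counts.update(token for token in tokens if token not in STOP_WORDS)
    return counts


docs = [
    "The delivery was late and the support was slow.",
    "Late delivery again, support is slow to reply.",
]
freq = term_frequencies(docs)
top_terms = sorted(term for term, count in freq.items() if count == 2)
```

Terms such as "delivery", "late", "slow" and "support" occur in both sample complaints, which
hints at a recurring theme in the text.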
Question 5b)
With the aid of action, result and tool, explain the text mining process [12 Marks]
Whether you intend to use textual data for descriptive purposes, predictive purposes, or
both, the same processing steps take place, as shown in the following table:
Question 5c)
Most certainly. However, those that seek to make moral personal decisions have the will or
desire to seek what's right over the long term. This will be reflected in their ethics in decision
making (decisions made in the business context). There will also be the case where a
person's morals may come into conflict with the organization's ethics. Expect this to be the
greatest source of dilemmas in ethics and decision making in an organizational context.