Bi 1nov2017 One

710/16/S07 BUSINESS INTELLIGENCE

Question 1a)

Briefly explain the use of BI technology in the following areas

i. Manufacturing
ii. Retail
iii. Financial Services
iv. Transportation
v. Telecommunications [10 Marks]

SOLUTION

BI technology is used in manufacturing for order shipment and customer support, in retail for
user profiling to target grocery coupons during checkout, in financial services for claims
analysis and fraud detection, in transportation for fleet management, and in
telecommunications for identifying reasons for customer churn and for network usage analysis.

Question 1b)

Write the following in full as used in Business Intelligence

i. ETL: Extract, Transform, Load

ii. CEP: Complex Event Processing

iii. RDBMS: Relational Database Management System

iv. SQL: Structured Query Language

v. OLAP: Online Analytical Processing [5 Marks]

Question 1c)

Identify the three factors (metrics) which determine the regression technique to be
used in making predictions. [5 Marks]

SOLUTION

There are various kinds of regression techniques available to make predictions. The choice
of technique is mostly driven by three metrics: the number of independent variables, the type
of dependent variable, and the shape of the regression line.
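As an illustration of how the three metrics point to a technique: with one independent
variable, a continuous dependent variable and a straight regression line, simple linear
regression is the usual choice. The following is a minimal sketch in Python; the data values
and the fit_line helper are made up for illustration and are not part of the exam material.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit for one independent variable (straight line)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: one independent variable (xs), continuous dependent variable (ys)
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5 + intercept  # predicted value for x = 5
```

A different answer to any of the three metrics (for example several independent variables, a
yes/no dependent variable, or a curved line) would suggest a different technique.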

Question 2a)

Define a data warehouse [2 Marks]

A data warehouse is a federated repository for all the data that an enterprise's various
business systems collect. The repository may be physical or logical.

A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection
of data in support of management's decision making process.

Question 2b)

Outline the five major objectives of a data warehouse [10 Marks]

Question 2c)

Explain the following basic characteristics of a data warehouse

i. Subject-Oriented

ii. Integrated

iii. Time-Variant

iv. Non-volatile [8 Marks]

Subject-Oriented: A data warehouse can be used to analyze a particular subject area. For
example, "sales" can be a particular subject.

Integrated: A data warehouse integrates data from multiple data sources. For example,
source A and source B may have different ways of identifying a product, but in a data
warehouse, there will be only a single way of identifying a product.
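The integration step can be sketched in Python as follows. This is only an illustrative
sketch: the two source layouts, the product_key mapping table and the integrate function are
hypothetical and stand in for whatever the load process of a real warehouse would use.

```python
# Hypothetical rows from two source systems that identify the same product differently.
source_a = [{"sku": "A-001", "qty": 5}]
source_b = [{"product_code": "W1", "qty": 3}]

# Assumed mapping from (source system, source identifier) to one warehouse key.
product_key = {("A", "A-001"): 1, ("B", "W1"): 1}


def integrate(rows_a, rows_b):
    """Return warehouse rows that all use the single warehouse product key."""
    rows = []
    for r in rows_a:
        rows.append({"product_key": product_key[("A", r["sku"])],
                     "qty": r["qty"], "source": "A"})
    for r in rows_b:
        rows.append({"product_key": product_key[("B", r["product_code"])],
                     "qty": r["qty"], "source": "B"})
    return rows


warehouse_rows = integrate(source_a, source_b)
total_qty = sum(r["qty"] for r in warehouse_rows if r["product_key"] == 1)
```

After integration, both source rows carry the same product key, so quantities can be added
up across sources.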

Time-Variant: Historical data is kept in a data warehouse. For example, one can retrieve
data that is 3 months, 6 months, 12 months old, or even older, from a data warehouse. This
contrasts with a transaction system, where often only the most recent data is kept. For
example, a transaction system may hold only the most recent address of a customer, whereas a
data warehouse can hold all addresses associated with a customer.

Non-volatile: Once data is in the data warehouse, it will not change. So, historical data in a
data warehouse should never be altered.
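The time-variant and non-volatile properties can be illustrated together with an append-only
history. The sketch below is in Python and makes several assumptions: the address_history
list, the record_address and address_as_of helpers, and the sample addresses are all
hypothetical.

```python
from datetime import date

# Append-only history: rows are added, never updated or deleted (non-volatile).
address_history = []  # each row: (customer_id, address, valid_from)


def record_address(customer_id, address, valid_from):
    """Add a new address row; earlier rows for the customer stay untouched."""
    address_history.append((customer_id, address, valid_from))


def address_as_of(customer_id, as_of):
    """Return the address that was valid for the customer on the given date."""
    candidates = [(valid_from, address)
                  for cid, address, valid_from in address_history
                  if cid == customer_id and valid_from <= as_of]
    return max(candidates)[1] if candidates else None


record_address(1, "12 Old Road", date(2017, 1, 1))
record_address(1, "34 New Street", date(2017, 9, 1))

old_address = address_as_of(1, date(2017, 3, 1))       # time-variant lookup
current_address = address_as_of(1, date(2017, 11, 1))
```

A transaction system would typically overwrite the first address with the second; here both
remain available for analysis.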

Question 3a) Write notes on the following types of reports with the aid of examples

i. Management reporting

ii. Ad-hoc report [5 Marks]

SOLUTION

Management Reporting – Automate and streamline management reporting by selecting
reports, scorecards, dashboards, and charts into a report pack; add images, title pages, a
table of contents, explanatory text, and create a PDF. When opened, reports are
automatically refreshed with the latest data, including fields within report explanations.

Ad-hoc Analysis – Answer any question at any time by exploring multi-dimensional data
from many different perspectives by filtering, drilling down, slicing, and dicing.

Question 3b)

Identify and explain any four basic Characteristics of OLAP [12 Marks]

SOLUTION

• They use multidimensional data analysis techniques.

• They provide advanced database support.

• They provide easy-to-use end-user interfaces.

• They support the client/server architecture.

I. Multidimensional Data Analysis Techniques:

The most distinctive characteristic of modern OLAP tools is their capacity for
multidimensional analysis. In multidimensional analysis, data are processed and viewed as
part of a multidimensional structure. This type of data analysis is particularly attractive to
business decision makers because they tend to view business data as data that are related
to other business data.

II. Advanced Database Support:

To deliver efficient decision support, OLAP tools must have advanced data access features.
Such features include:

• Access to many different kinds of DBMSs, flat files, and internal and external data sources.

• Access to aggregated data warehouse data as well as to the detail data found in
operational databases.

• Advanced data navigation features such as drill-down and roll-up.

• Rapid and consistent query response times.

• The ability to map end-user requests, expressed in either business or model terms, to the
appropriate data source and then to the proper data access language (usually SQL). The
query code must be optimized to match the data source, regardless of whether the source is
operational or data warehouse data.

• Support for very large databases. A data warehouse can easily and quickly grow to multiple
gigabytes and even terabytes.
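The navigation features listed above (roll-up and drill-down), together with slicing, can be
sketched in plain Python. This is an illustration only, not how an OLAP server is
implemented; the sales rows and the aggregate helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, sales amount)
sales = [
    (2017, "Q1", "North", 100),
    (2017, "Q1", "South", 150),
    (2017, "Q2", "North", 120),
    (2017, "Q2", "South", 130),
]


def aggregate(rows, key):
    """Sum the sales amount for each value of the grouping key."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)


# Roll-up: summarise to the coarser level (year).
by_year = aggregate(sales, lambda r: r[0])

# Drill-down: move to the finer level (year and quarter).
by_quarter = aggregate(sales, lambda r: (r[0], r[1]))

# Slice: fix one dimension (region = North) and analyse the rest.
north_by_quarter = aggregate([r for r in sales if r[2] == "North"],
                             lambda r: (r[0], r[1]))
```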

III. Easy-to-Use End-User Interface:


Advanced OLAP features become more useful when access to them is kept simple. OLAP
tools have equipped their sophisticated data extraction and analysis tools with easy-to-use
graphical interfaces. Many of the interface features are “borrowed” from previous
generations of data analysis tools that are already familiar to end users. This familiarity
makes OLAP easily accepted and readily used.

IV. Client/Server Architecture:

Client/server architecture provides a framework within which new systems can be
designed, developed, and implemented. The client/server environment enables an OLAP
system to be divided into several components that define its architecture. Those
components can then be placed on the same computer, or they can be distributed among
several computers. Thus, OLAP is designed to meet ease-of-use requirements while
keeping the system flexible.

Question 3c) List any three types of OLAP Servers [3 Marks]

Types of OLAP Servers

• Relational OLAP (ROLAP)

• Multidimensional OLAP (MOLAP)

• Hybrid OLAP (HOLAP)

• Specialized SQL Servers

Question 4a)
Define data mining [2 Marks]

Data mining is the computing process of discovering patterns in large data sets involving
methods at the intersection of artificial intelligence, machine learning, statistics, and
database systems. It is an interdisciplinary subfield of computer science. The overall goal of
the data mining process is to extract information from a data set and transform it into an
understandable structure for further use. Aside from the raw analysis step, it involves
database and data management aspects, data pre-processing, model and inference
considerations, interestingness metrics, complexity considerations, post-processing of
discovered structures, visualization, and online updating. Data mining is the analysis step
of the "knowledge discovery in databases" process, or KDD.

Question 4b)

Explain the relationship between Data Mining and Data Warehouse [10 Marks]

Key difference: Data Mining is actually the analysis of data. It is the computer-assisted
process of digging through and analyzing enormous sets of data that have either been
compiled by the computer or have been inputted into the computer. Data warehousing is the
process of compiling information or data into a data warehouse. A data warehouse is a
database used to store data.

In data mining, the computer analyzes the data and extracts meaning from it. It also looks
for hidden patterns within the data and tries to predict future behavior. Data mining is
mainly used to find and show relationships among the data.

The purpose of data mining, also known as knowledge discovery, is to allow businesses to
view these behaviors, trends and/or relationships and to be able to factor them within their
decisions. This allows the businesses to make proactive, knowledge-driven decisions.

The term ‘data mining’ comes from the fact that the process of data mining, i.e. searching
for relationships between data, is similar to mining and searching for precious materials.

Data mining tools use artificial intelligence, machine learning, statistics, and database
systems to find correlations between the data. These tools can help answer business
questions that traditionally were too time consuming to resolve.

Data Mining includes various steps, including the raw analysis step, database and data
management aspects, data preprocessing, model and inference considerations,
interestingness metrics, complexity considerations, post-processing of discovered
structures, visualization, and online updating.

Data warehousing is a different activity, although the two are interrelated. It is the
process of compiling information or data into a data warehouse, a central repository in
which data from various sources is stored. The data warehouse is then used for reporting and
data analysis, for example trend reports for senior management such as annual and quarterly
comparisons.

The purpose of a data warehouse is to provide flexible access to the data to the user. Data
warehousing generally refers to the combination of many different databases across an
entire enterprise.

The main difference between data warehousing and data mining is that data warehousing is
the process of compiling and organizing data into one common database, whereas data
mining is the process of extracting meaningful patterns from that database. In a
warehouse-based workflow, data mining is therefore normally performed after the data has
been loaded into the warehouse.

Question 4c)

Explain the following Data Mining terms [8 Marks]

i. Market Basket Analysis

ii. itemset

iii. support count

iv. confidence

SOLUTION

Market Basket Analysis is a modelling technique based upon the theory that if you buy a
certain group of items, you are more (or less) likely to buy another group of items. For
example, if you are in an English pub and you buy a pint of beer and don't buy a bar meal,
you are more likely to buy crisps (US.

An itemset is the set of items a customer buys at the same time. Relationships between
itemsets are typically stated as a logic rule like IF {bread, peanut butter} THEN {jelly}.
An itemset can range from no items (the empty itemset, which is usually ignored) to all
items in the data set.

The support count is a count of how often the itemset appears in the transaction database.
The support is how often the itemset appears, stated as a probability. For example, if the
support count is 21 out of a possible 1,000 transactions, then the support is 21/1,000 or
0.021.

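The confidence of a rule IF A THEN B is the conditional probability that a transaction
containing A also contains B, that is, the support count of A and B together divided by the
support count of A.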
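The support count, support and confidence definitions can be worked through on a small
example. The Python sketch below uses made-up transactions, and the function names are
chosen for illustration only.

```python
# Hypothetical transaction database; each transaction is the set of items bought together.
transactions = [
    {"bread", "peanut butter", "jelly"},
    {"bread", "peanut butter"},
    {"bread", "milk"},
    {"milk", "jelly"},
    {"bread", "peanut butter", "jelly", "milk"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)


def support(itemset, transactions):
    """Support count expressed as a fraction of all transactions."""
    return support_count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule IF antecedent THEN consequent."""
    base = support_count(antecedent, transactions)
    if base == 0:
        return 0.0  # rule never applies in this data
    both = support_count(set(antecedent) | set(consequent), transactions)
    return both / base


count = support_count({"bread", "peanut butter"}, transactions)
rule_support = support({"bread", "peanut butter", "jelly"}, transactions)
rule_confidence = confidence({"bread", "peanut butter"}, {"jelly"}, transactions)
```

Here the itemset {bread, peanut butter} appears in three of the five transactions, and two
of those three also contain jelly, so the rule IF {bread, peanut butter} THEN {jelly} has a
confidence of 2/3.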

Question 5a)
Briefly explain text mining [3 Marks]

Text mining, also referred to as text data mining, roughly equivalent to text analytics, is the
process of deriving high-quality information from text. High-quality information is typically
derived through the devising of patterns and trends through means such as statistical
pattern learning.

Question 5b)

With the aid of the actions, results and tools involved, explain the text mining process [12 Marks]

The Text Mining Process

Whether you intend to use textual data for descriptive purposes, predictive purposes, or
both, the same processing steps take place, as shown in the following table:

| Action | Result | Tool |
|---|---|---|
| File preprocessing | Creates a single SAS data set from your document collection. The SAS data set is used as input for the Text Parsing node, and might contain the actual text or paths to the actual text. | Text Import node; %TMFILTER macro (a SAS macro for extracting text from documents and creating a predefined SAS data set with a text variable) |
| Text parsing | Decomposes textual data and generates a quantitative representation suitable for data mining purposes. | Text Parsing node |
| Transformation (dimension reduction) | Transforms the quantitative representation into a compact and informative format. | Text Filter node |
| Document analysis | Performs classification, prediction, or concept linking of the document collection. Creates clusters, topics, or rules from the data. | Text Cluster node; Text Topic node; Text Rule Builder node; SAS Enterprise Miner predictive modeling nodes |
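The parsing and transformation steps can be illustrated outside SAS with a few lines of
plain Python. This sketch does not use or reproduce the SAS nodes named in the table; the
sample documents, the stop-word list and the document-frequency threshold are assumptions
made for the example.

```python
import re
from collections import Counter

# Hypothetical document collection (the output of file preprocessing).
documents = [
    "The delivery was late and the package was damaged.",
    "Great service, fast delivery.",
    "Damaged package, poor service.",
]

STOP_WORDS = {"the", "was", "and", "a"}  # assumed stop-word list


def parse(text):
    """Text parsing: turn raw text into term counts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


term_counts = [parse(d) for d in documents]

# Transformation (dimension reduction): keep only terms found in at least two documents.
doc_freq = Counter(term for counts in term_counts for term in counts)
vocabulary = sorted(t for t, n in doc_freq.items() if n >= 2)

# Quantitative representation: one row per document, one column per kept term.
matrix = [[counts[t] for t in vocabulary] for counts in term_counts]
```

The resulting matrix is the kind of compact numeric input that a clustering or predictive
model in the document analysis step would consume.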

Question 5c)

Will an immoral manager make an ethical decision, or a moral manager make an
unethical decision? Give an opinion to support your view. [5 Marks]

Most certainly. However, those who seek to make moral personal decisions have the will or
desire to seek what is right over the long term. This will be reflected in their ethics in
decision making (decisions made in the business context). There will also be cases where a
person's morals come into conflict with the organization's ethics. Expect this to be the
greatest source of dilemmas in ethics and decision making in an organizational context.
