Assignment 2 Brief

Student Name/ID Number Vo Van Tuong Quang/GCS190882

Unit Number and Title 14: Business Intelligence

Academic Year 2020-2021

Unit Tutor

Assignment Title Assignment 2: Apply BI tools & techniques and their impact

Issue Date

Submission Date

IV Name & Date

Submission Format

Part I: Project submission. This should be a zip / rar folder of your project, including all necessary files to
run your project. There should be a link to your Tableau work on Tableau Public cloud.
Part II: The submission is in the form of a group written report. This should be written in a concise, formal
business style using single spacing and font size 12. You are required to make use of headings,
paragraphs and subsections as appropriate, and all work must be supported with research and
referenced using the Harvard referencing system. Please also provide a bibliography using the Harvard
referencing system.
Part III: The team needs to present its point of view on how business intelligence tools can contribute
to effective decision-making, as well as the legal issues involved in exploiting user data for business
intelligence. You may need to research specific examples of organizations that use BI tools to
enhance or improve their business and evaluate how they can use BI tools to extend their target
audience and become more competitive within the market.

Unit Learning Outcomes

LO3 Demonstrate the use of business intelligence tools and technologies

Assignment Brief

(Continued from previous scenario)

Your next task is to demonstrate to the board of directors how business intelligence can be applied
to the company's current business processes. To demonstrate BI, you need to prepare a
presentation about BI and related tools & techniques, and a demonstration on a real company dataset.
For the presentation, you need to:
- Explain the general concept of BI
- Introduce some tools / techniques for BI and their application in general

For the demonstration, you need to:

- Provide one or more datasets extracted from the company's business processes. Explain the dataset.
- Show how you pre-process data for later analysis, explaining each step and its purpose
- Design dashboards to show your analysis of the pre-processed data. Explain clearly the purpose of
the dashboards and charts. Suggestions should be made after the analysis
During the demonstration, you need to collect feedback and comments from users to review how well
your dashboard design meets the user or business requirements and what customization is needed in future.
The team needs to present its point of view on how business intelligence tools can contribute to
effective decision-making, as well as the legal issues involved in exploiting user data for business
intelligence. You may need to research specific examples of organizations that use BI tools to
enhance or improve their business and evaluate how they can use BI tools to extend their target
audience and become more competitive within the market.
To summarize, you need to submit a report in PDF that includes 4 parts: your presentation, the result of
the demonstration and review of user feedback, your point of view on BI contribution, and legal issues.

Learning Outcomes and Assessment Criteria

Criteria by grade (Pass, Merit, Distinction)

LO3 Demonstrate the use of business intelligence tools and technologies

Pass:
P3 Determine, with examples, what business intelligence is and the tools and techniques
associated with it.
P4 Design a business intelligence tool, application or interface that can perform a
specific task to support problem-solving or decision-making at an advanced level.
Merit:
M3 Customise the design to ensure that it is user friendly and has a functional interface.
Distinction:
D3 Provide a critical review of the design in terms of how it meets a specific user
or business requirement and identify what customisation has been integrated into the design.

LO4 Discuss the impact of business intelligence tools and technologies
for effective decision-making purposes and the legal/regulatory
context in which they are used

Pass:
P5 Discuss how business intelligence tools can contribute to effective decision-making.
P6 Explore the legal issues involved in the secure exploitation of business
intelligence tools
Merit:
M4 Conduct research to identify specific examples of organisations that have used business
intelligence tools to enhance or improve operations.
Distinction:
D4 Evaluate how organisations could use business intelligence to extend their target audience
and make them more competitive within the market, taking security legislation into consideration

1. Business Intelligence, tools and techniques (P3)
1.1. Definition of Business Intelligence
1.2. Analytics (descriptive, predictive, prescriptive)
1.2.1. Descriptive analytics
1.2.2. Predictive analytics
1.2.3. Prescriptive analytics
1.3. Some BI tools
1.3.1. MicroStrategy
1.3.2. Datapine
1.3.3. Tableau
2. Designing BI tools to solve problems and aid in decision making (P4)
2.1. Pre-processing steps
2.2. Python Matplotlib reports
2.3. Tableau reports, dashboards and stories (M3, D3)
3. How business intelligence supports decision making (P5)
4. Issues surrounding BI usage (P6)
5. BI case studies (M4, D4)
6. References

1. Business Intelligence, tools and techniques (P3)
1.1. Definition of Business Intelligence
Business Intelligence (BI) is a term that encompasses architectures, tools, databases, analytics, applications, and
methodologies. It means different things to different people, with much of the confusion stemming from
the acronyms and buzzwords associated with it. However, the common theme is that BI seeks to enable
interactive access to, manipulation of, and analysis of data. By analyzing historical and current data, specific
situations, and performance indicators, decision makers obtain valuable data-driven insights that help them
make more informed and appropriate decisions. In BI, data is processed into information, which serves
as input for better decisions that are then executed as actions. BI systems include a data warehouse;
business analytics; tools for manipulating, mining, and analyzing data; business process or performance
management; and a user interface.

1.2. Analytics (descriptive, predictive, prescriptive)

Authors define "analytics" in different ways, but one can view it as the process of creating actionable
decisions or recommendations based on data-driven insight. The Institute for Operations Research and
Management Science (INFORMS) holds that analytics represents the combination of computer technology,
management techniques, and statistics to solve problems. From this viewpoint, INFORMS proposes three
levels of analytics. Note that the levels are not necessarily
sequential steps and can overlap.

1.2.1. Descriptive analytics
Also called reporting analytics, this seeks to know what is currently happening in an organization and to
understand the underlying patterns and causes of those phenomena. This first requires consolidating
data sources and making relevant data available in a form that allows for proper reporting and analysis.
The infrastructure for this data is often built as part of a data warehouse, from which the
organization can produce reports, queries, alerts, and trends.
In one example, the management of Seattle Children's Hospital seeks to identify and eliminate inefficiencies in
its processes so that resources are better spent on caring for patients and saving lives, ensuring
quality and safety for them. Bearing this in mind, they spend much time analyzing data relevant to patient
Specifically, they implemented Tableau, a BI software package with easy-to-use analytics that make it intuitive for people to
create visualizations. Various stakeholders (managers, analysts, and of course doctors) use
descriptive analytics to solve problems faster and more efficiently.
Seattle Children's measures and analyzes patient wait times with visualizations to identify root causes and
contributing factors for patient waiting. They saw that early delays piled up during the day, which led
them to focus on on-time appointments for patient services as a way to reduce waiting time and
increase the availability of beds. They thus saved 3 million dollars from the supply chain.

1.2.2. Predictive analytics
As the name implies, this form of analytics seeks to determine what is likely to happen. It is based on
statistical techniques, as well as more recent techniques that fall under “data mining”. These techniques
aim to predict whether customers are likely to switch to a competitor, what they are likely to purchase next and
how much, which promotions and marketing they would respond to, and so on. Many
techniques are used in predictive analytics software, including classification
algorithms such as decision tree models and neural networks, as well as clustering algorithms and
association mining techniques.
Though not what would traditionally be considered a business, the Oakland Athletics baseball
team offers a well-known example: assistant general manager Paul DePodesta applied "sabermetrics" to optimize the selection of
players based on their on-base percentage (OBP).
OBP is a statistic that measures how often a batter reaches base for any reason other than a fielding error,
fielder's choice, dropped/uncaught third strike, fielder's obstruction, or catcher's interference. This means
that instead of relying on a scout's experience and intuition, which are highly subjective, DePodesta
selected players based exclusively on objective indicators of performance.
This took the Oakland Athletics from a heavy loss to the New York Yankees in 2001, after which they lost many key
players and were left with a weak team and poor financial prospects, to winning 20 consecutive games
and setting an American League record.
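
To make the OBP indicator concrete, the following is a minimal Python sketch of the formula as it is commonly defined in baseball statistics. The formula is not taken from the case study itself, and the function name, argument names, and sample figures are illustrative assumptions.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """Commonly used OBP formula: times on base divided by plate appearances counted for OBP."""
    # Times the batter reached base by a hit, a walk, or being hit by a pitch.
    times_on_base = hits + walks + hit_by_pitch
    # Opportunities counted in the denominator of the usual definition.
    opportunities = at_bats + walks + hit_by_pitch + sacrifice_flies
    if opportunities == 0:
        return 0.0  # avoid division by zero for a player with no appearances
    return times_on_base / opportunities


# Hypothetical season line for one player (made-up numbers).
obp = on_base_percentage(hits=150, walks=70, hit_by_pitch=5,
                         at_bats=500, sacrifice_flies=5)
print(round(obp, 3))
```

Ranking candidate players by a figure like this is one way an objective indicator can replace a purely intuitive assessment.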
1.2.3. Prescriptive analytics
The third category of analytics under the INFORMS definition, this seeks to recognize what is currently
happening and to forecast likelihoods in order to optimize decisions and performance. In other words, the aim is to
provide a decision or recommendation for a specific action, which may take the form of a yes/no decision,
a specific amount, or a set of steps in a production plan. These outputs can be presented to a decision maker or
used directly as input for automated decision rule systems.
An example is the Industrial and Commercial Bank of China (ICBC), which employed models to
reconfigure its branch network. It is the largest publicly traded bank in the world, with more than 16,000
branches, over 230 million individual customers, and 3.6 million corporate clients. At this scale, it had to
adapt to China's rapid economic growth, fast-paced urbanization, and rising personal wealth.
Major changes had to be implemented in over 300 cities, implying high variation in
customer behavior and finances.
ICBC had initially been using a scoring model to plan new branches with appropriate locations, services,
and target customers. The model took weighted input variables, including customer flow,
number of residences, and number of competitors in a geographic region. It was poor at
determining customer distribution in a given region and could not optimize the distribution of bank
branches in the network. IBM helped ICBC create a tool that took three kinds of input to optimize branch
reconfiguration: geographic data with 83 categories, demographic and economic data with 22 categories,
and branch transaction/performance data with over 60 million records every day.
From these, the tool generated an optimized distribution for each area and optimized the branch network reconfiguration.
The system also included a market potential calculation model (which measures customer value), a
branch network optimization model (which locates branches so that they cover the areas with the largest market
potential), and a branch site evaluation model (which determines the value of establishing a branch at a specific location).
Since 2006, the system has been improved iteratively. ICBC's deposits have increased by $21.2 billion
since the system was introduced, as the bank can now reach more customers with the right services.

1.3. Some BI tools

The following is an introduction to some popular BI tools. Bear in mind that these are a selection of
dedicated BI tools and are not the only possible ways to achieve BI in an organization. As will be
demonstrated in P4, Python is a powerful programming language that can also be used for data analysis.
1.3.1. MicroStrategy
The MicroStrategy platform supports interactive dashboards, scorecards, customized reports, ad-hoc
queries, alerts, and automated report distribution. These can be used on the web and on the desktop, and the platform even offers
Microsoft Office integration.

MicroStrategy uses a relational OLAP architecture that allows users to report on data anywhere in the entire
relational database, down to transactional-level detail. It has optimizations for all major relational database
and data warehouse vendors and can also access multidimensional databases and flat files.
Aside from its development suite and administrative tools, MicroStrategy also provides an SDK (software
development kit) that allows the applications to be customized and integrated with other software.
1.3.2. Datapine
Datapine is a business intelligence solution with simple drag-and-drop features, focusing on self-service and
visualization with unlimited dashboards and various charts to choose from, similar to Tableau. Its self-service
analytics allow users to visualize data without help from IT. Users can run ad-hoc reports, apply filters,
and access data, even from mobile devices. Advanced users with knowledge of Excel can perform
calculations, and those who can use SQL can also create pivot tables, all within datapine. Reports can be
shared as links, whether interactive or read-only, and can be configured to be sent on triggers or on a
schedule. In addition, datapine allows users to embed charts and dashboards directly into existing
solutions. Hidden filters can help users tailor information to readers' needs. Furthermore, datapine
connects data through data warehouse automation, combining external and internal data as well as
structured and unstructured data. Clients with existing data warehouses can run queries remotely
without having to transfer data out of the existing system. Lastly, datapine is subject to data security laws in Germany, under which
all clients retain ownership of their data at all times. Datapine has read-only access and a layer of
security to prevent SQL injection, and no backdoors are built into the system.

1.3.3. Tableau
Tableau is widely regarded as a market-leading choice for business intelligence, with powerful visualization capabilities as well as
support for various data set formats. It also has cloud support and real-time data analytics. There are many
products in the Tableau suite, which can be split into developer tools, which allow the creation of
visualizations (these include Tableau Desktop and Tableau Public), and sharing tools, which allow the sharing
of what was created with the developer tools (these include Tableau Online, Server, and Reader). The data can
be hosted locally or on the public cloud, depending on the users' needs and selected products.


2. Designing BI tools to solve problems and aid in decision making (P4)
2.1. Explaining the dataset and Pre-processing steps
As mentioned before, our dataset is from a Superstore, with columns that describe various categories,
including both quantitative and qualitative data. Specific data types are to be revealed later during analysis.
We will be using a Python 3 environment and some helpful data analytics packages to help us with the data
cleaning and analysis process.

Importing some libraries


Among the Python packages imported, NumPy is a library that supports large, multi-dimensional arrays and
matrices, along with high-level mathematical functions that operate on those
arrays. The pandas library supports data manipulation and analysis, offering
data structures and operations for manipulating numerical tables. Matplotlib is a plotting library that can create
static, animated, or interactive data visualizations. Finally, seaborn is a
data visualization library based on Matplotlib that provides a high-level interface for drawing attractive
and informative statistical graphics. The packages are aliased as np, pd, plt, and sns, respectively.
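
The import cell itself is not reproduced in this text. A minimal sketch consistent with the aliases named in this report (np, pd, plt, and sns) would look as follows; the exact set and order of imports in our notebook is an assumption.

```python
import numpy as np               # arrays, matrices, and mathematical functions
import pandas as pd              # data frames for tabular data
import matplotlib.pyplot as plt  # plotting interface of Matplotlib
import seaborn as sns            # statistical graphics built on Matplotlib

# Print the library versions, which helps when reproducing the analysis.
print(np.__version__, pd.__version__, sns.__version__)
```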

Next, we read our dataset, an Excel file, into a pandas data frame and assign it to the variable df (short for
dataframe). Afterwards, we call the head() method to return the first 5 rows, which is the default
when no argument is specified.
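
As a hedged illustration of this step, the sketch below uses a small hand-made data frame in place of the Excel file so that it runs on its own. The file name in the comment and the sample rows are hypothetical.

```python
import pandas as pd

# In the report the data frame is read from the Excel file, along the lines of:
#   df = pd.read_excel("superstore.xls")   # hypothetical file name
# A tiny stand-in data frame is used here so the sketch is self-contained.
df = pd.DataFrame({
    "Row ID": list(range(1, 9)),
    "Segment": ["Consumer", "Corporate", "Home Office", "Consumer",
                "Consumer", "Corporate", "Consumer", "Home Office"],
    "Sales": [261.96, 731.94, 14.62, 957.58, 22.37, 48.86, 7.28, 907.15],
})

first_rows = df.head()  # with no argument, head() returns the first 5 rows
print(first_rows)
```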

Then, we use the pandas shape attribute to retrieve the number of rows and columns in our dataset.
This yields 9994 rows and 21 columns.

To retrieve the column names of the table, we use the pandas columns
attribute. This returns an Index holding the column names of our dataset.

Then, we check the data type of each column to better understand our dataset. This uses the
dtypes attribute from pandas.
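
The three inspections above can be sketched as follows. The data frame is a small stand-in with made-up rows, so the numbers differ from the 9994 rows and 21 columns of the real dataset.

```python
import pandas as pd

# Stand-in data frame with a few of the Superstore-style columns (sample values).
df = pd.DataFrame({
    "Order Date": pd.to_datetime(["2019-11-08", "2019-06-12", "2018-10-11"]),
    "Segment": ["Consumer", "Corporate", "Consumer"],
    "Quantity": [2, 3, 5],
    "Sales": [261.96, 14.62, 957.58],
})

n_rows, n_cols = df.shape        # shape is an attribute: (rows, columns)
column_names = list(df.columns)  # columns is an Index of column labels
print(n_rows, n_cols)
print(column_names)
print(df.dtypes)                 # one data type per column
```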


Most of the quantitative columns here are integers or floats. The rest are classified as the
object type, with the exception of Order Date and Ship Date, which take the form of a datetime type.
We then use the isnull() method to check whether there are any missing data. isnull().sum() counts the
number of rows with missing data in each column.
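
A minimal sketch of the missing-value check, using a stand-in data frame with two deliberately missing postal codes (sample values, not the real records):

```python
import numpy as np
import pandas as pd

# Stand-in data: two rows have no postal code.
df = pd.DataFrame({
    "City": ["Burlington", "Seattle", "Burlington", "Austin"],
    "Postal Code": [np.nan, 98103.0, np.nan, 78701.0],
})

# isnull() marks each cell as True/False; sum() then counts the True values per column.
missing_per_column = df.isnull().sum()
print(missing_per_column)
```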


We can see that 11 rows are missing values in the Postal Code column.
In addition, the “Row ID” column is unnecessary, being only a serial number, so it will be dropped.

By applying the value_counts() method to the “Country/Region” column, we can also see that it has only one value, “United
States”, across all records. This means that we will not be
analyzing by country, and so this column will be dropped as well.

The drop() method receives a string argument, ‘Row ID’ or ‘Country/Region’, which
determines which column to remove. The “axis” parameter specifies whether to drop labels from the index
(axis = 0) or from the columns (axis = 1). head() shows the table again.
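
These two steps can be sketched as follows on a small stand-in data frame (sample values only):

```python
import pandas as pd

df = pd.DataFrame({
    "Row ID": [1, 2, 3],
    "Country/Region": ["United States", "United States", "United States"],
    "Sales": [261.96, 731.94, 14.62],
})

# A single entry in the result means the column holds one constant value.
country_counts = df["Country/Region"].value_counts()
print(country_counts)

# axis=1 tells drop() that the label names a column rather than a row.
df = df.drop("Row ID", axis=1)
df = df.drop("Country/Region", axis=1)
print(df.head())
```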


Next, we retrieve the unique values that occur in the Category column with unique(). We
obtain 3 broad categories, whose records are then counted with value_counts(). The results
show each value on the left and its number of occurrences on the right. We also count the
number of unique values in the Sub-Category column with nunique().

Lastly, we count the records for each sub-category, using the value_counts()
method on the ‘Sub-Category’ column.
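
A short sketch of these calls on stand-in data follows. The category names match the three categories of the dataset, but the rows and counts are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "Category": ["Furniture", "Furniture", "Office Supplies",
                 "Technology", "Furniture", "Technology"],
    "Sub-Category": ["Chairs", "Tables", "Binders",
                     "Phones", "Chairs", "Phones"],
})

categories = df["Category"].unique()                  # distinct category values
category_counts = df["Category"].value_counts()       # records per category
n_sub_categories = df["Sub-Category"].nunique()       # number of distinct sub-categories
sub_category_counts = df["Sub-Category"].value_counts()

print(categories)
print(category_counts)
print(n_sub_categories)
print(sub_category_counts)
```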

2.2. Python Matplotlib reports


Here are some reports we made using Python's Matplotlib library. The first is a bar chart of the number of records in each
customer segment. plt.figure() creates a new figure, with figsize being a keyword argument
that sets the dimensions (width, height) of the figure in inches. The figure is then assigned to
the variable fig. Next, we use add_subplot(111) to add an Axes to the figure as part of a subplot
arrangement, with the three digits standing for (nrows, ncols, index). The subplot takes the index position on a
grid with nrows rows and ncols columns. The index starts at 1 in the upper left corner and increases to the right.
The countplot() function from seaborn (alias sns) shows the count of observations in each categorical bin
using bars. The first parameter, 'Segment', names the column to be counted, and the "data"
parameter receives the dataframe df as the dataset for plotting. Then a for loop goes through the patches (bars) in
the newly added Axes and annotates each one with ax.annotate(). Finally, the figure is displayed.
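
The chart code is not reproduced in this text, so the following is a sketch of the steps described above rather than our exact cell. The data frame is a stand-in, the figure size and label placement are assumptions, and the column is passed as the keyword x because recent seaborn versions no longer accept it positionally.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in data: 4 Consumer, 2 Corporate, and 1 Home Office record.
df = pd.DataFrame({
    "Segment": ["Consumer"] * 4 + ["Corporate"] * 2 + ["Home Office"],
})

fig = plt.figure(figsize=(8, 5))   # width and height in inches
ax = fig.add_subplot(111)          # 1 row, 1 column, first (only) position
sns.countplot(x="Segment", data=df, ax=ax)

# Write the count above each bar.
for patch in ax.patches:
    height = patch.get_height()
    ax.annotate(f"{int(height)}",
                (patch.get_x() + patch.get_width() / 2, height),
                ha="center", va="bottom")

# In an interactive session, plt.show() would display the figure here.
```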


Similarly, we want to show yearly profit percentage gained in each Sub-Category. First, we’ll extract and
display the order years.

We’ll also make a new column called Profit %, which is based off of another new column called Cost.
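
The formulas for these columns are not shown in this text. The sketch below assumes that Cost is Sales minus Profit and that Profit % is Profit divided by Cost, times 100; both definitions, along with the sample rows, are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Order Date": pd.to_datetime(["2018-03-01", "2019-07-15", "2019-12-02"]),
    "Sales": [100.0, 250.0, 80.0],
    "Profit": [20.0, 50.0, -20.0],
})

# Extract the year from the order date.
df["Order Year"] = df["Order Date"].dt.year

# Assumed definitions: cost is what remains of sales after profit,
# and profit percentage is profit relative to that cost.
df["Cost"] = df["Sales"] - df["Profit"]
df["Profit %"] = df["Profit"] / df["Cost"] * 100

print(df[["Order Year", "Sales", "Profit", "Cost", "Profit %"]])
```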

Then, we create the bar chart. This time, we use the seaborn barplot() function. It receives the order year,
Profit %, and Sub-Category columns as inputs from the long-form data. The palette parameter sets the
colors to use for the different levels of the hue (Sub-Category) variable.
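
A sketch of such a call is given below. The mapping of Order Year to the x axis and Profit % to the y axis, the "Paired" palette, and the sample rows are assumptions; only the use of Sub-Category as the hue variable is stated in the text.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in long-form data: two records per year and sub-category.
df = pd.DataFrame({
    "Order Year": [2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019],
    "Sub-Category": ["Chairs", "Chairs", "Phones", "Phones"] * 2,
    "Profit %": [10.0, 14.0, 22.0, 26.0, 12.0, 16.0, 30.0, 34.0],
})

fig = plt.figure(figsize=(10, 5))
ax = fig.add_subplot(111)

# Each bar shows the mean Profit % for one sub-category in one year.
sns.barplot(x="Order Year", y="Profit %", hue="Sub-Category",
            data=df, palette="Paired", ax=ax)
```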


Lastly, we make a bar plot of year-wise total sales and percentage of profit gained.


First, we use groupby to group each of the columns (Sales and Profit %) by the order year. A plotting function from
the pandas library is then used to create the bar plot. Finally, plt.title sets the title of our plot.
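
The sketch below illustrates this step on stand-in data. The choice of aggregation (sum for Sales, mean for Profit %), the use of plot.bar(), and the title text are assumptions, since the original cell is not reproduced here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Order Year": [2018, 2018, 2019],
    "Sales": [100.0, 300.0, 150.0],
    "Profit %": [25.0, 15.0, 50.0],
})

# Group by year: total the sales and average the profit percentage (assumed aggregations).
yearly = df.groupby("Order Year")[["Sales", "Profit %"]].agg(
    {"Sales": "sum", "Profit %": "mean"}
)

ax = yearly.plot.bar(figsize=(8, 5))  # pandas draws one group of bars per year
plt.title("Year-wise total sales and profit %")
```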

2.3. Tableau reports, dashboards and stories (M3)

Now, a large part of BI is to have support for insightful data visualizations, such that the design is user-
friendly and even interactive. One such way to achieve that in a business is to implement Tableau
dashboards and stories. The following documentation demonstrates how we create our reports,
dashboards, and stories using the same Superstore dataset, using Tableau Desktop.


First, we open our workbook, go to the Connect pane, and under the Saved data sources
section click on our Sample - Superstore Excel file. Once that's done, we are connected to that
dataset. In this case, the Excel file contains 3 sheets: “Orders”, “People”, and “Return”.

Next, we create our reports, in preparation for the dashboard and stories. At its core, a Tableau report
has dimension and measure fields placed on the Rows and Columns shelves, as well as various properties
on the Marks card. Fields can be added from the Data pane either by dragging and dropping them
onto the cards and shelves or by double-clicking them in the Data pane.
Specifically, in our first report, which shows segment-wise sales, "Sales" is added to the Rows shelf,
"Segment" is added as a color mark, and "Sales" is added as a size mark. We also added
"Segment" and "Sales" as label marks.

For State-wise sales, we'll have the latitude and longitude for the Rows and Columns shelves. Then, we'll
have the Sales marked with Color, and the States with text.


With Sub-Category sales by year, we'll have Category and Sub-Category into the Columns shelf, and the
Sales onto the Rows shelf. We'll have the Order Year be a color mark.

And lastly, we have year-wise sales by segment. Segment is used for a color mark with the Area mode, with
the Order Year on Columns and Sales on Rows.

For our dashboard, we'll first drag our sheets into our desired layout. Then we'll be using automatic fixed
size for this to easily fit to the page. The dashboard is then ready, with the interactive selectors on the
upper right corner for their respective charts.

Finally, creating a Tableau story is as easy as dragging whichever sheet you want into a story point
and then typing in a caption to summarize that story point. You can add more story points by clicking
Blank for a fresh sheet, or Save as new if you want to customize from an existing story point. Here we
have 4 story points.


2.4. How the design fits the business requirements (D3)
As Tableau is

3. How business intelligence supports decision making (P5)

In the context of BI, decision support systems (DSS) are often built to support the solution to a problem or
to evaluate an opportunity. In a strict sense, BI systems monitor situations and identify
problems/opportunities using analytical methods. Reporting plays a major role in BI: the user identifies
whether a situation requires attention, and analytics can then be applied. Although models and data access
are included in BI, DSS have their own databases and are developed to solve specific problems.
This leads to the notion of "DSS applications".
Formally, DSS are approaches or methodologies for supporting decision making. They use interactive and
flexible computer-based information systems that are specialized for supporting the solution to specific
unstructured management problems. They use data, provide user-friendly interfaces, and can incorporate
the decision maker's insights. Furthermore, they can include models and are developed through iterative
processes that may even involve interacting with end users. There is no consensus on exactly what a DSS
is. That said, there are some key characteristics:
1. Support for decision makers, bringing human judgment and computer-based information together.
Often these are for semistructured and unstructured situations that cannot be solved easily by
computerized systems or common qualitative methods.
2. Support for all management levels.
3. Support for individuals and groups, with less structured problems needing more involvement from
different levels or departments.
4. Support for interdependent/sequential decisions that may happen more than once.
5. Support in all phases of the decision making process: intelligence, design, choice, and
implementation.
6. Support for various decision making styles and processes.
7. Support for flexible customization such that the decision makers can adapt the DSS to meet changing
conditions by adding, deleting, combining, changing, or rearranging elements.
8. User-friendliness, powerful graphics, and natural language interactive human-machine interfaces to
increase effectiveness.
9. Improvement of effectiveness of decision making (accuracy, timeliness, quality...) instead of just
reducing costs of making decisions.
10. Decision makers are to have complete control over the decision making process as a whole, with the
DSS only serving as support and not replacement.
11. End users are to be able to develop and modify simple modules or systems by themselves. Larger
systems can be built with data mining software, OLAP, data warehouses, and support from
information system specialists.
12. Models are generally used to analyze decisions. The modeling capability should allow for
experimentation with different strategies in different contexts.
13. Access to various data sources, formats, and types.
14. A DSS can be used as a stand-alone tool by an individual or distributed across several
nodes in the supply chain. It can be integrated with other applications and distributed using
networking and Web technologies.

4. Issues surrounding BI usage (P6)

The advent of analytics can raise various legal issues, some of which already exist for computer systems in general.
The ethics and legalities of liability for actions taken on advice provided by intelligent machines are being
taken into consideration. In addition to resolving disputes about the unexpected and potentially dangerous
results of some analytics, other complex matters may arise. For instance, who is liable if an enterprise goes
bankrupt as a consequence of following the advice of analytics software? Is the enterprise itself
responsible for not testing the system enough before entrusting it with critical issues? Or are auditing
and accounting firms to share the liability? And likewise, are the developers of those systems liable
too? Consider the following questions as legal concerns surrounding the use of BI systems.
- What is the value of an expert opinion in court when the expertise is encoded as a computer
- Who is to be held liable for the wrong results from advice provided by an intelligent machine? For
instance, who is responsible for the death of a patient when a doctor accepts an incorrect diagnosis
made by a computer?
- Who owns the knowledge in the knowledge base?


Privacy is another related concern. Collecting information about individual
people has long been a sensitive subject in the relationship between privacy and information
systems that require user data. While the definition of privacy itself can be interpreted quite broadly, past
court decisions show that privacy is often not an absolute right, and that it must be balanced against
the needs of society, such that the public's right to know is superior to the individual's own privacy. This
makes privacy regulations difficult to enforce. Is it fair and just for the government,
for example, to use mass data to surveil the populace and prevent crime, fraud, and tax evasion at the
cost of individuals' privacy? Similarly, private information about employees may aid decision
making in human resources, but is that fair to the employee? And likewise for customers, who may have
better customized product advertising or service recommendation, at the cost of being "monitored" by
The use of AI technologies in legal administration and law enforcement may also increase public concern
regarding personal privacy. One could say that more security by means of mass surveillance (data mining on
telephone calls, emails, taking photos of people and identifying them...) is at direct odds with citizens'
personal privacy.

6. References
Waskom, M., 2021. seaborn.countplot — seaborn 0.11.1 documentation. [online]
Available at: <> [Accessed 7 March 2021].
Waskom, M., 2021. seaborn: statistical data visualization — seaborn 0.11.1 documentation. [online] Available at: <> [Accessed 3 March 2021].
Hunter, J., 2021. matplotlib.figure.Figure — Matplotlib 3.3.4 documentation. [online]
Available at: <> [Accessed 5 March
Scavicchio, J., 2021. datapine Review. [online] Better Buys. Available at:
<> [Accessed 7 March 2021].
Simon, H.A., 1997. Administrative Behavior. 4th ed. (first published 1947). New York: Free Press.
Jeston, J. and Nelis, J., 2014. Business Process Management: Practical Guidelines to Successful
Implementations. Routledge.
Kirchmer, M., 2017. High Performance Through Business Process Management: Strategy Execution in a
Digital World. Springer.
von Rosing, M., Scheer, A. and von Scheel, H., 2014. The Complete Business Process Handbook: Body of
Knowledge from Process Modeling to BPM. Morgan Kaufmann.
Lloyd, J., 2011. Identifying Key Components of Business Intelligence Systems and Their Role in Managerial
Decision Making. University of Oregon. Available at:
[Accessed 26 March 2021]

