
Higher Nationals in Computing

Unit 14: Business Intelligence


ASSIGNMENT 2

Learner’s name: Ninh Xuân Bảo Hưng


ID: GCS200058
Class: GCS0905A
Subject code: 1641
Assessor name: Nguyen Xuan Sam

Assignment due: 5th March 2023


Assignment submitted: 5th March 2023
ASSIGNMENT 2 FRONT SHEET

Qualification BTEC Level 5 HND Diploma in Computing

Unit number and title Unit 14: Business Intelligence

Submission date 5th March 2023 Date Received 1st submission

Re-submission Date Date Received 2nd submission

Student Name Ninh Xuân Bảo Hưng Student ID GCS200058

Class GCS0905A Assessor name Nguyen Xuan Sam

Student declaration
I certify that the assignment submission is entirely my own work and I fully understand the consequences of plagiarism. I understand that
making a false declaration is a form of malpractice.

Student’s signature

Grading grid
P1 P2 M1 M2 D1 D2
❒ Summative Feedback: ❒ Resubmission Feedback:

Grade: Assessor Signature: Date:


Signature & Date:

Assessment Brief
Student Name/ID Number
Unit Number and Title 14: Business Intelligence
Academic Year 2018
Unit Tutor
Assignment Title Assignment 2: Apply BI tools & techniques and their impact
Issue Date
Submission Date
IV Name & Date

Submission Format
Part I: Project submission. This should be a zip / rar folder of your project, including all necessary files
to run your project. There should be a link to your Tableau work on Tableau Public cloud.
Part II: The submission is in the form of a group written report. This should be written in a concise,
formal business style using single spacing and font size 12. You are required to make use of headings,
paragraphs and subsections as appropriate, and all work must be supported with research and
referenced using the Harvard referencing system. Please also provide a bibliography using the Harvard
referencing system.
Part III: The team needs to present its point of view about how business intelligence tools can contribute
to effective decision-making as well as the legal issues involved in exploiting user data for business
intelligence. You may need to research specific examples of organizations that use BI tools to
enhance or improve their business and evaluate how they can use BI tools to extend their target
audience and make them more competitive within the market.

Unit Learning Outcomes


LO3 Demonstrate the use of business intelligence tools and technologies

Assignment Brief
(Continued from previous scenario)
Your next task is to demonstrate to the board of directors how business intelligence can be applied
to the company's current business processes. To demonstrate BI, you need to prepare a
presentation about BI and related tools & techniques and a demonstration on a real company dataset.
For the presentation, you need to:
- Explain the general concept of BI
- Introduce some tools / techniques for BI and their application in general
For the demonstration, you need to:
- Provide one or more datasets extracted from the company's business processes. Explain the dataset.
- Show how you pre-process data for later analysis, explaining each step and its purpose
- Design dashboards to show your analysis of the pre-processed data. Explain clearly the purpose of
the dashboards and charts. Suggestions should be made after the analysis
During the demonstration, you need to collect feedback and comments from users to review how well
your dashboard design meets user or business requirements and what customization is needed for future
use.

The team needs to present its point of view about how business intelligence tools can contribute to
effective decision-making as well as the legal issues involved in exploiting user data for business
intelligence. You may need to research specific examples of organizations that use BI tools to
enhance or improve their business and evaluate how they can use BI tools to extend their target
audience and make them more competitive within the market.

In summary, you need to submit a PDF report that includes 4 parts: your presentation, the result of the
demonstration, a review of user feedback, and your point of view on BI contribution and legal issues.

Learning Outcomes and Assessment Criteria

LO3 Demonstrate the use of business intelligence tools and technologies

Pass:
P3 Determine, with examples, what business intelligence is and the tools and techniques associated with it.
P4 Design a business intelligence tool, application or interface that can perform a specific task to support problem-solving or decision-making at an advanced level.
Merit:
M3 Customise the design to ensure that it is user friendly and has a functional interface.
Distinction:
D3 Provide a critical review of the design in terms of how it meets a specific user or business requirement and identify what customisation has been integrated into the design.

LO4 Discuss the impact of business intelligence tools and technologies for effective decision-making purposes and the legal/regulatory context in which they are used

Pass:
P5 Discuss how business intelligence tools can contribute to effective decision-making.
P6 Explore the legal issues involved in the secure exploitation of business intelligence tools
Merit:
M4 Conduct research to identify specific examples of organisations that have used business intelligence tools to enhance or improve operations.
Distinction:
D4 Evaluate how organisations could use business intelligence to extend their target audience and make them more competitive within the market, taking security legislation into consideration

Table of Contents
1 Introduction
1.1 Overview of problems
1.2 Motivations
1.3 Objectives
2 Related works and dataset
2.1 Related works
2.2 Dataset
2.3 Description dataset
2.4 Data cleaning
3 Proposed model
3.1 Correlation
3.2 Linear regression
3.3 Summary
4 Simulating scenarios and Results
4.1 Package installation
4.2 Correlation
4.3 Dashboard
4.4 Performance of scenarios
4.5 Summary
5 Conclusions and future works
5.1 Conclusions
5.2 Future works
References

Table of Figures
Figure 1 The raw dataset
Figure 2 Information of the raw dataset
Figure 3 Two-variables Relationships
Figure 4 Download Tableau
Figure 5 Correlation
Figure 6 Dashboard
Figure 7 How the mileage traveled affects the price of the car?
Figure 8 How does the transmission type affect the price of the car?
Figure 9 How does the tax affect the price of a car
Figure 10 the year of the car affect the price

Topic: Predict the price of Ford cars in the UK


1 Introduction
1.1 Overview of problems
The automotive market refers to the industry involved in the production, sale and marketing
of vehicles, including cars, trucks, buses, motorcycles, and other motor vehicles. The
automotive market is a huge global industry that generates substantial revenue and employs
millions of people worldwide.
The problem of predicting the price of Ford cars in the UK is an important one for various
stakeholders, such as Ford itself, dealerships, and potential buyers. Accurately predicting the
price of Ford cars can help these stakeholders make informed decisions about pricing,
marketing, and sales strategies.
To predict the price of Ford cars in the UK, it is necessary to analyze various factors,
including historical sales data, market trends, consumer preferences, and economic
conditions. The use of machine learning algorithms and data analysis tools can help generate
predictive models that can forecast future prices based on these factors.

1.2 Motivations
Vehicle price refers to the cost of buying a new or used car. The price of a vehicle can vary
widely depending on various factors, including the make and model of the vehicle, its age,
condition, features, and location. In general, newer cars with more advanced features and
technology tend to cost more than older, more basic models. Luxury cars and sports cars also
tend to be more expensive than entry-level cars. When buying a car, it's important to research
and compare prices to make sure you're getting a fair price. This may involve looking at
different dealers, checking prices for similar cars in the area, and negotiating with sellers to
get the best deal possible.
1.3 Objectives
In this work, I focus on several key questions:
- How does the number of miles travelled affect the price of the car?
- How does the transmission affect prices and taxes?
- How do materials affect prices and taxes?
- How does the year of the car affect the price?

2 Related works and dataset


2.1 Related works
This paper analyzes the relationship between car prices and the U.S. auto industry, looking at
factors such as consumer demand, production costs, and market competition. The authors also
investigate the impact of trade policy and technological innovation on automobiles (Morris and
Hellerstein, 2019).
This study, which looked at the relationship between gasoline prices and vehicle sales, found
that higher gasoline prices led to increased demand for fuel-efficient vehicles. The authors
also explore the impact of government policies, such as fuel economy standards, on the
automotive market (Li, Knittel and van Benthem, 2013).
This study investigates the psychological factors that influence car price negotiations between
consumers and dealers, exploring the role of emotions, power dynamics, and social norms in
the bargaining process (Bhattacharjee, Mogilner and Goldsmith, 2014).
This study looks at the impact of electric cars on the environment and the electricity market,
considering factors such as emissions reductions and changes in electricity prices. The authors
also discuss policy implications regarding the adoption of electric vehicles (Holland et al.,
2015).

2.2 Dataset
I took the data from Kaggle (ADITYA, 2019). The dataset is the "100,000 UK Used Car Data set". The
raw dataset contains more than 100000 entries and 9 columns. The cleaned dataset contains
information on price, transmission, mileage, fuel type, road tax, miles per gallon (mpg), and
engine size. Here is the head of the raw dataset (Figure 1).

Figure 1 The raw dataset



2.3 Description dataset


The dataset includes:
- model: the car model (Fiesta, Focus, Kuga, ...)
- year: the vehicle's year of production
- price: the price of the car
- transmission: the gearbox of the car (Manual, Automatic)
- mileage: miles travelled
- fuelType: the fuel type of the car (Petrol, Diesel, Electric, ...)
- tax: road tax
- mpg: miles per gallon
- engineSize: engine size in litres
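As a minimal sketch of how a file with these nine columns could be read and typed before analysis, the example below uses only the Python standard library. The three sample rows are made-up values for illustration, not records from the Kaggle file, and the helper name `load_cars` is hypothetical.

```python
import csv
import io

# Illustrative rows only (made-up values, not taken from the Kaggle file);
# the column names follow the dataset description above.
SAMPLE_CSV = """model,year,price,transmission,mileage,fuelType,tax,mpg,engineSize
Fiesta,2017,12000,Automatic,15944,Petrol,150,57.7,1.0
Focus,2018,14000,Manual,9083,Petrol,150,57.7,1.0
Kuga,2016,13500,Manual,30000,Diesel,30,64.2,2.0
"""

INT_COLUMNS = ("year", "price", "mileage", "tax")
FLOAT_COLUMNS = ("mpg", "engineSize")


def load_cars(text_stream):
    """Read car records from a CSV stream and convert the numeric columns."""
    rows = []
    for record in csv.DictReader(text_stream):
        # Strip stray whitespace around text values such as the model name.
        row = {key: value.strip() for key, value in record.items()}
        for name in INT_COLUMNS:
            row[name] = int(row[name])
        for name in FLOAT_COLUMNS:
            row[name] = float(row[name])
        rows.append(row)
    return rows


cars = load_cars(io.StringIO(SAMPLE_CSV))
print(len(cars), "rows; columns:", list(cars[0]))
```

To read a downloaded file instead of the in-memory sample, the same function could be called with an open file handle, for example `load_cars(open("ford.csv", newline=""))`, assuming the file uses the column names listed above.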

2.4 Data cleaning


Data cleaning is the process of fixing or removing incorrect, corrupted, incorrectly formatted,
duplicate, or incomplete data within a dataset. When combining multiple data sources, there
are many opportunities for data to be duplicated or mislabeled. If data is incorrect, outcomes
and algorithms are unreliable, even though they may look correct. There is no one absolute
way to prescribe the exact steps in the data cleaning process because the processes will vary
from dataset to dataset. But it is crucial to establish a template for your data cleaning process
so you know you are doing it the right way every time.

Figure 2 Information of the raw dataset
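The cleaning steps described above (removing incomplete, duplicate, and incorrect records) can be sketched as follows. This is an illustration under assumed rules, not the exact procedure used in this report: the year range, the positive-price check, the function name `clean_cars`, and the sample rows are all assumptions.

```python
def clean_cars(rows, min_year=1990, max_year=2023):
    """Drop incomplete, implausible, and duplicate records (assumed rules)."""
    seen = set()
    cleaned = []
    for row in rows:
        # 1. Incomplete: any missing or empty field.
        if any(value is None or value == "" for value in row.values()):
            continue
        # 2. Implausible: year outside the assumed range, or non-positive price.
        if not (min_year <= row["year"] <= max_year) or row["price"] <= 0:
            continue
        # 3. Duplicate: identical values in every column.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Made-up rows that exercise each rule.
raw = [
    {"model": "Fiesta", "year": 2017, "price": 12000, "mileage": 15944},
    {"model": "Fiesta", "year": 2017, "price": 12000, "mileage": 15944},  # duplicate
    {"model": "Focus", "year": 2060, "price": 6495, "mileage": 54000},    # bad year
    {"model": "Kuga", "year": 2016, "price": None, "mileage": 30000},     # missing price
    {"model": "Focus", "year": 2018, "price": 14000, "mileage": 9083},
]
cleaned = clean_cars(raw)
print(len(raw), "->", len(cleaned))
```

Checking for missing values first means the later numeric comparisons never see an empty field.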



3 Proposed model
3.1 Correlation
A correlation coefficient, often written as r, measures the direction and strength of the
linear relationship between two variables. The closer r is to +1 or -1, the stronger the
linear relationship between the two variables; a value near 0 indicates little or no linear relationship.

\[ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} \]

• r = Pearson coefficient
• n = number of paired observations
• ∑xy = sum of the products of the paired scores
• ∑x = sum of the x scores
• ∑y = sum of the y scores
• ∑x² = sum of the squared x scores
• ∑y² = sum of the squared y scores

The formula above is called the Pearson product-moment correlation. To see
whether two variables are correlated, we can look at the pattern of the scatter plot, as in
Figure 3 below:

Figure 3 Two-variables Relationships
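The sums defined above (n, ∑xy, ∑x, ∑y, ∑x², ∑y²) are enough to compute r directly. The sketch below shows one way to do this in plain Python; the function name `pearson_r` and the mileage and price values are made up for illustration and are not taken from the dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation computed from the raw sums:
    r = (n*Sxy - Sx*Sy) / sqrt((n*Sx2 - Sx^2) * (n*Sy2 - Sy^2))
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return numerator / denominator


mileage = [5000, 20000, 40000, 60000, 90000]  # made-up values
price = [18000, 15000, 12500, 9000, 6000]     # made-up values
print(round(pearson_r(mileage, price), 3))
```

A value close to -1 for these sample lists reflects that price falls almost linearly as mileage rises in the made-up data.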



3.2 Linear regression


Simple linear regression is used to estimate the relationship between two quantitative
variables. You can use simple linear regression when you want to know:
- How strong the relationship is between two variables (e.g., the relationship between rainfall
and soil erosion).
- The value of the dependent variable at a certain value of the independent variable (e.g., the
amount of soil erosion at a certain level of rainfall).

\[ y = B_0 + B_1 x + e \]

• y is the predicted value of the dependent variable for any given value of the
independent variable (x).
• B0 is the intercept, the predicted value of y when x is 0.
• B1 is the regression coefficient: how much we expect y to change as x increases.
• x is the independent variable (the variable we expect is influencing y).
• e is the error of the estimate, or how much variation there is in our estimate of the
regression coefficient.
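As a minimal sketch of how B0 and B1 can be estimated, the example below applies the standard least-squares formulas (B1 is the covariance of x and y divided by the variance of x, and B0 follows from the means). The function name `fit_line` and the sample values are assumptions for illustration, not results from this report.

```python
def fit_line(xs, ys):
    """Ordinary least-squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x must vary to fit a line")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx              # slope: change in y per unit change in x
    b0 = mean_y - b1 * mean_x   # intercept: predicted y when x is 0
    return b0, b1


mileage = [5000, 20000, 40000, 60000, 90000]  # made-up values
price = [18000, 15000, 12500, 9000, 6000]     # made-up values
b0, b1 = fit_line(mileage, price)
predicted = b0 + b1 * 50000  # predicted price at 50,000 miles
print(round(b0), round(b1, 4), round(predicted))
```

A negative slope here corresponds to a downward trendline of the kind drawn in the Tableau scatter plots.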

3.3 Summary
In this section, I presented the theory and formulas for correlation and simple linear
regression, which underpin the analysis in the rest of this work.

4 Simulating scenarios and Results


4.1 Package installation
To draw the charts, we first need to download the Tableau software.
Download link: https://www.tableau.com/products/desktop/download/

Figure 4 Download Tableau



4.2 Correlation
I first explore the correlations in the dataset and visualize them as a heat map:

Figure 5 Correlation
From the heat map, I select the pairs that are most strongly correlated, with
correlation scores above 0.4 or below -0.5:
- price and age: -0.64
- price and mileage: -0.53
- price and tax: 0.41
- price and engineSize: 0.41
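The selection rule used here (keep a pair when its score is above 0.4 or below -0.5) can be sketched in code. The example below is only an illustration: the column values are made up, so the scores it prints are not the ones reported above, and the function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    numerator = n * sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum(x * x for x in xs) - sum_x ** 2)
        * (n * sum(y * y for y in ys) - sum_y ** 2)
    )
    return numerator / denominator


def strong_price_correlations(columns, target="price", upper=0.4, lower=-0.5):
    """Return the columns whose correlation with `target` passes a threshold."""
    result = {}
    for name, values in columns.items():
        if name == target:
            continue
        r = pearson_r(values, columns[target])
        if r > upper or r < lower:
            result[name] = round(r, 2)
    return result


# Made-up sample columns (not the real dataset).
columns = {
    "price": [18000, 15000, 12500, 9000, 6000],
    "age": [1, 2, 4, 6, 9],
    "mileage": [5000, 20000, 40000, 60000, 90000],
    "tax": [150, 145, 125, 30, 20],
    "mpg": [57.7, 65.7, 54.3, 60.1, 58.9],
}
strong = strong_price_correlations(columns)
print(strong)
```

In this made-up sample, mpg is only weakly correlated with price and is therefore left out of the result.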

4.3 Dashboard

Figure 6 Dashboard

4.4 Performance of scenarios


a) How the mileage traveled affects the price of the car?

Figure 7 How the mileage traveled affects the price of the car?

The Tableau chart in Figure 7 shows how mileage relates to the price of the car. The black
trendline slopes downward, which means that the farther a car has travelled, the lower its price.
According to the data, a car that has run from 0 to 10,000 miles can cost up to 50,000 USD,
while a car that has run more than 100,000 miles costs only 5,000 USD to 15,000 USD.

b) How does the transmission type affect the price of the car?

Figure 8 How does the transmission type affect the price of the car?

The Tableau chart in Figure 8 shows the relationship between transmission type and the
price of the vehicle. In the chart we see:
- Automatic transmission cars (blue circle) have the highest average price.
- Semi-Auto transmission cars (red circle) have the second-highest average price.
- Manual transmission cars (yellow circle) have the lowest average price.
Therefore, cars with an Automatic transmission are on average more expensive than cars with a
Semi-Auto transmission or a Manual transmission.
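The comparison above is a group average: the mean price for each transmission type, ranked from highest to lowest. A minimal sketch of that calculation is shown below. The rows are made-up values chosen to mirror the ordering described above, not figures from the dataset, and the function name `mean_price_by` is hypothetical.

```python
from collections import defaultdict


def mean_price_by(rows, key):
    """Average price per group, sorted from most to least expensive."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [price sum, row count]
    for row in rows:
        bucket = totals[row[key]]
        bucket[0] += row["price"]
        bucket[1] += 1
    means = {group: total / count for group, (total, count) in totals.items()}
    return dict(sorted(means.items(), key=lambda item: item[1], reverse=True))


# Made-up rows (not the real dataset).
rows = [
    {"transmission": "Manual", "price": 12000},
    {"transmission": "Automatic", "price": 16000},
    {"transmission": "Semi-Auto", "price": 15000},
    {"transmission": "Manual", "price": 10000},
    {"transmission": "Automatic", "price": 14000},
    {"transmission": "Semi-Auto", "price": 13000},
    {"transmission": "Manual", "price": 11000},
]
averages = mean_price_by(rows, "transmission")
for transmission, average in averages.items():
    print(transmission, round(average))
```

The same helper could group by any other categorical column, such as fuelType, by changing the `key` argument.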

c) How does the tax affect the price of a car?

Figure 9 How does the tax affect the price of a car


The Tableau chart in Figure 9 shows how road tax relates to the price of the car. The black
trendline slopes upward, which means that the higher the tax on a car, the higher its
price. According to the data, cars taxed from 0 to 40 USD are priced from 5,000 USD to
20,000 USD, while cars taxed from 120 to 160 USD can reach a price of 50,000 USD. The
remaining tax values cover only a small number of cars, so they do not affect the trend
much.

d) How does the Year of the car affect the price?

Figure 10 the year of the car affect the price

The Tableau chart in Figure 10 shows how the year of manufacture relates to the price of
the car. The black trendline slopes upward, which means that the closer the production year
is to the present, the higher the price. From 2011 to 2020, the price increases gradually
with each later year. The higher prices in 2009 and 2010 may be because the cars from those
years were more expensive models than those from the other years.

4.5 Summary
In this chapter, I answered the questions set out in the objectives in chapter 1 and supported
each answer with a chart. I also learned how to draw charts in Tableau.

5 Conclusions and future works


5.1 Conclusions
Year, mileage, tax, mpg, and engineSize all affect the price of the car. When buying a car, we
need to consider all of these factors to find the most reasonable price for our budget.

5.2 Future works


If I had more time and more budget, I think I would be able to predict car prices based on
more features and develop better UI/UX.

References
kaggle.com (2022). 100,000 UK Used Car Data set [Online]. Available:
https://www.kaggle.com/datasets/adityadesai13/used-car-dataset-ford-and-mercedes?select=ford.csv [Accessed].
Morris, E. K. and Hellerstein, R. (2019). Car prices and the automotive industry in the United
States. Journal of Economic Perspectives, 33, 3-24.
Li, S., Knittel, C. R. and van Benthem, A. A. (2013). The effect of gasoline prices on the ability
to save new car fuel. Review of Economics and Statistics, 95, 1078-1095.
Bhattacharjee, A., Mogilner, C. and Goldsmith, K. (2014). Understand car price negotiations in
dealerships. Journal of Consumer Research, 41, 1148-1163.
Holland, S. P., Mansur, E. T., Muller, N. Z. and Yates, A. J. (2015). Electric cars, emissions and
electricity prices. Review of Economic Studies, 82, 126-160.
