CaseStudy Report (2)
Huskie Motor Corporation
INF20016 CASE STUDY ANALYSIS
BRONSON JOHNSON (102094694)
CUONG NGUYEN (102840305)
OSKAR MAIER (102582319)
VAN HOANG (102578923)
Executive summary
This report analyses the performance of Huskie Motor Corporation (HMC) using the sales and
operations data the company has collected. HMC is an emerging superstar in the automotive
manufacturing industry that has built an outstanding brand name and a global presence, particularly
in North America and especially in the USA. The report breaks down HMC's performance by brand,
model and sales channel. It then turns the analysis into insights, including a recommendation to
leave the Canadian market that is based on a four-quarter forecast of sales volume and contribution
margin. The report also covers other critical issues, such as consent and privacy in data collection,
and the errors that had to be cleansed from the dataset before analysis. Ultimately, the report aims
to provide actionable intelligence that HMC can use to evolve its business.
Contents
Executive summary
Introduction
Business Overview
Dataset problems
Current situations and Forecast
Overall performance Analytics
How is HMC performing globally?
How are various HMC brands performing?
How are the various sales channels performing?
What are the most and least profitable models?
Financial analytics
Current contribution margin per model
Average variable cost per model, and how has that changed over time
Which model has the most variability in variable costs
What is the current CM per channel?
Operations Analytics
Best performing and low performing models
Days each model spent on lot prior to sale
Forecasting analytics
Recommendations/insights from forecast
Other Critical Issues
Conclusion
References
Appendix
Table 1: List of errors on the dataset of HMC operations
Steps taken in the data cleaning process
Introduction
The main objective of this report is to apply Tableau to the data of a large corporation, Huskie
Motor Corporation (HMC), in order to gain greater insight from the data and to make overall
performance and the profitability of the brands HMC manages easier to understand. Incorporating
analysis tools such as Tableau allows processes to be streamlined, which in turn supports decisions
about the direction of HMC and its profitability for shareholders.
The report highlights problems within HMC's existing database that hinder the understanding of the
data when attempting to analyse figures. Through a range of graphics, the cleaned data is then
transformed into a digestible amount of useful information that drives home the key ideas and
highlights factors that may need to be improved. The report concludes with a proposed approach for
the future.
Business Overview
Huskie Motor Corporation manufactures motor vehicles and manages three brands, each with several
models across multiple segments. This process creates a large amount of data that can be recorded
and stored to support future planning and decision making. Using forecasts and data analysis, HMC
can plan production schedules more accurately and monitor its market share in each country in which
it operates. HMC has created a very precise method of collecting and recording data based on the
Vehicle Identification Number (VIN). A VIN is allocated to every vehicle produced and acts as the key
item in collecting data.
It is important to note that the company is not new but a “spin-off” of an older company, Blue
Diamond Automotive. HMC has a relatively new structure and has accumulated a great deal of
2019-2020 sales data from the processes mentioned above. With data of this kind, dealers can put
their own in-house datasets to use on a big data platform in order to optimize their sales, revenue
streams, marketing, margins and operating expenses (izmocars 2021). However, the data collected
had issues with accuracy, or veracity, which often go unnoticed because of the size of a dataset and
can reduce its quality (Boyd & Crawford 2012, p. 669; Sheng, Amankwah-Amoah & Wang 2019,
p. 321). This report cleanses the dataset and performs big data analytics to provide a read on current
performance and, through a forecasting analysis, a view into the future.
Dataset problems
Numerous errors were found in the HMC dataset. The most serious problems are listed below,
together with how each was mitigated in the data cleaning process.
1. There are 3 duplicate values (6 rows) in the VIN # field (as in figure 3.1), which indicates a data
entry error, since this field should be unique for each car. To resolve this, we located the duplicates
in Tableau by scrolling down the column and deleted both rows of each duplicate pair.
Figure 3.1: Example of a duplicate value in the dataset.
2. The variable cost value for all cars is wrongly calculated (as in figure 3.2). This error also affects
other values in the dataset, such as contribution margin, net revenue and after-tax. Therefore, we
created new columns to recalculate these values and deleted the old columns.
Figure 3.2: Filtering the ‘Total Variable Cost’ column for values that match the calculated result; none
were found.
3. Some car segments, such as mid-size luxury, sport coupe and micro, are not in the initial definition.
To resolve this, we merged full-size, mid-size and entry-level luxury into one car segment (luxury). We
merged sport coupe with sport utility, because the sport utility coupe has emerged as a type of coupe
with features from sport utility vehicles (Elliot, 2020). We removed the micro cars, since this type of car
is smaller than a subcompact car (Fullard, 2015) and so could not be merged with any other segment.
A more in-depth list of errors in the dataset can be found in Table 1 in the appendix.
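The report performed these steps in Tableau and Excel. As a rough illustration only, the duplicate-VIN removal and the segment merges described above could be sketched in Python as follows. The field names (`vin`, `segment`) and the sample rows are assumptions made for the example and are not taken from the HMC dataset.

```python
from collections import Counter

# Hypothetical rows; field names and values are illustrative only.
rows = [
    {"vin": "VIN001", "segment": "Full-size Luxury"},
    {"vin": "VIN002", "segment": "Sport Coupe"},
    {"vin": "VIN002", "segment": "Compact"},        # duplicate VIN
    {"vin": "VIN003", "segment": "Micro"},
    {"vin": "VIN004", "segment": "Mid-size Luxury"},
    {"vin": "VIN005", "segment": "Compact"},
]

# Segment merges described in the text; None means "remove the row".
SEGMENT_MAP = {
    "Full-size Luxury": "Luxury",
    "Mid-size Luxury": "Luxury",
    "Entry-level Luxury": "Luxury",
    "Sport Coupe": "Sport Utility",
    "Micro": None,
}

def clean(rows):
    """Drop every row whose VIN is not unique, then normalise segments."""
    vin_counts = Counter(r["vin"] for r in rows)
    cleaned = []
    for r in rows:
        if vin_counts[r["vin"]] > 1:
            continue  # both rows of a duplicate pair are removed
        segment = SEGMENT_MAP.get(r["segment"], r["segment"])
        if segment is None:
            continue  # segment cannot be merged, so the car is removed
        cleaned.append({**r, "segment": segment})
    return cleaned

cleaned = clean(rows)
for r in cleaned:
    print(r["vin"], r["segment"])
```

Counting VINs before filtering makes it possible to drop every row of a duplicate group, which mirrors the choice made in the report, rather than keeping the first occurrence.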
Looking at the tariff rates across countries (excluding Mexico as an outlier), countries with a higher
tariff rate tend to perform worse, and vice versa. For example, Argentina’s tariff rate is one of the
highest (12.58%), but its contribution to profit is the lowest at only 1.9%. Therefore, it would be wise
for HMC to shift its business from high-tariff countries to lower-tariff ones (particularly in North
America and Europe, as in figure 4.1.2).
How are various HMC brands performing?
HMC offers customers a choice of three brands: Apechete, Jackson and Tatra. Analysis of brand
performance indicates that the most profitable brands are Apechete and Tatra, each contributing
around 39% of the profit gained by the company. Jackson is the lowest performing brand, bringing in
only 22% of HMC’s global profit. Further analysis revealed an outstanding statistic for the Tatra
brand: the data shows that 36% of the 39% of profit generated was made in North America (see
figure 4.1.3).
The recommendation on brand sales should focus on this anomaly in Tatra sales. The brand sold a
total of 127 cars in Europe and South America combined, compared to 1006 in North America. The
recommendation is to drop Tatra sales in Europe and South America, which would free up capital of
$2,345,166 (see figure 4.1.4) to push into the more profitable brands or into the North American
Tatra infrastructure.
Figure 4.1.3: Percentage of total brand contribution after-tax broken down by region
Figure 4.1.4: Variance in percent of total contribution after-tax, sales volume and variable
cost per brand for each region
statistics may seem disconnected, but there is a direct correlation between all channels of sale. An
example of an insight drawn from the listed statistics is: “of the customers who choose the fleet
option, they typically choose to lease a car through an Employee/Partner program.”
The statistics discussed above are measured in the number of cars sold, leased or rented, but when
making recommendations it is important to look at profitability rather than sales volume. Retail,
rental and leasing are the most profitable dimensions across the three sales channels. The fleet
option, even though it sells the most cars (1010 in total), contributes only 26% of the company's
total profit. The recommendation is to shift marketing spend away from fleet and government
customers to the other customer groups, as fleet and government customers have the highest
variable cost and bring in the least profit (figure 4.1.5).
As a quick and easy way to add $640,986.55 to revenue, HMC can remove the Jespie and Mortimer
models from sale. However, Jespie is the peak performer by number of cars sold globally. Further
investigation shows that the loss to the company stems from its very high total variable cost of
$4,377,573, which is nearly $800,000 more than that of the Advantage model. Ultimately, if HMC
cannot reduce the costs of labour, materials, overhead, freight and warranty, the best course of
action would be to scrap it.
Figure 4.1.7 Total percent of contribution for each model in the profit.
Figure 4.1.8: Performance analysis by profit, sales volume, marketing cost and variable cost.
Financial analytics
are models with the highest sales volume. The high CM may be due to the different variable costs
incurred by each model when sold, in addition to the wide range of prices across models and
variants.
Average variable cost per model, and how has that changed over time
From 2019 to 2020, there was very little change in variable cost per model. Each model within every
brand incurred a slight change in variable cost, which may be due to changes in the selling process
and in the prices of materials in different markets. However, within the Apechete brand (figure
4.2.2), the Crux model shows a large decrease of 48% in its variable cost. This may be due to a fall in
the sales volume of the Crux in the later year. For all other models, the average variable cost has
been largely consistent, with no large changes.
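The year-on-year comparison above can be expressed as a small calculation. The sketch below is illustrative only: the records are invented (chosen so that one model shows a 48% fall, like the Crux in the report), and the tuple layout `(model, year, variable_cost)` is an assumption rather than the structure of the HMC data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (model, year, variable_cost) records, not HMC figures.
sales = [
    ("Crux", 2019, 20000.0),
    ("Crux", 2019, 22000.0),
    ("Crux", 2020, 10920.0),
    ("Advantage", 2019, 30000.0),
    ("Advantage", 2020, 30600.0),
]

def average_cost_change(records, base_year, new_year):
    """Percentage change in average variable cost per model between two years."""
    costs = defaultdict(list)
    for model, year, cost in records:
        costs[(model, year)].append(cost)
    changes = {}
    for model in sorted({m for m, _, _ in records}):
        base = costs.get((model, base_year))
        new = costs.get((model, new_year))
        if base and new:  # skip models not sold in both years
            changes[model] = (mean(new) - mean(base)) / mean(base) * 100
    return changes

changes = average_cost_change(sales, 2019, 2020)
for model, pct in changes.items():
    print(f"{model}: {pct:+.1f}%")
```

Averaging per car, rather than summing, keeps the comparison independent of how many cars were sold in each year.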
Which model has the most variability in variable costs
Figure 4.2.3 displays a box plot of the real variable cost per model. Some models clearly show a
higher variance, which may be due to the wide range of options offered per model. For most models
there is a long upper end, meaning that a large number of vehicles were sold with a lower variable
cost and comparatively few with a higher variable cost.
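A box plot summarises spread through the quartiles, so the same comparison can be made numerically with the interquartile range (IQR) and the standard deviation. The sketch below uses invented cost lists and placeholder model names; it is one possible way to rank models by variability, not the method used in the report.

```python
from statistics import pstdev, quantiles

# Hypothetical variable costs per model; names and values are placeholders.
costs_by_model = {
    "Model A": [100, 110, 120, 130, 140, 150, 160, 170],
    "Model B": [100, 105, 110, 115, 120, 200, 300, 400],
}

def spread(values):
    """Return the interquartile range and population standard deviation."""
    q1, _median, q3 = quantiles(values, n=4)
    return {"iqr": q3 - q1, "stdev": pstdev(values)}

# Rank models from most to least variable by IQR.
ranked = sorted(
    costs_by_model,
    key=lambda model: spread(costs_by_model[model])["iqr"],
    reverse=True,
)
for model in ranked:
    s = spread(costs_by_model[model])
    print(f"{model}: IQR={s['iqr']:.2f}, stdev={s['stdev']:.2f}")
```

The IQR is less sensitive than the standard deviation to a handful of very expensive cars, which matters when a model has a long upper tail.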
Figure 4.2.4: Contribution margin for different sales channels.
Operations Analytics
The lowest selling model was the Rebel from the Jackson brand, with a sales volume of 65, followed
by the Robin from Apechete and the Mortimer from Tatra. This finding warrants further investigation
into why these models are performing so poorly. Even from this analysis alone, however, HMC could
either stock fewer of these models to make room for more popular ones or remove them from its
inventory completely.
Figure 4.3.2: 4 lowest selling models of HMC.
Figure 4.3.3: The average number of days spent on the lot for each model.
Forecasting analytics
As figure 4.4.1 shows, sales volume is likely to rise from 252 cars in quarter 4 (Q4) of 2020 to 380
cars in Q1 of 2021. It is then projected to decline by over 40% to 225 cars in Q2, recover to 294 cars
in Q3, and fall slightly by the end of 2021.
Figure 4.4.1: The sales volume forecast for 2021.
The contribution margin is projected to follow the same trend as sales volume: it is expected to
surge from nearly $4 million to approximately $6 million in Q1 of 2021, then decline to $3.7 million
in Q2. It would then rebound to $4.4 million in Q3 before dropping slightly by the end of 2021 (as
shown in figure 4.4.2).
Figure 4.4.2: The contribution margin forecast for HMC for the next 4 quarters.
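The quarter-on-quarter movements quoted for the sales forecast can be checked with a short calculation. The sketch below uses only the car counts stated in the text (252, 380, 225 and 294); the Q4 2021 value is not given in the report, so it is left out. The function name is an assumption made for the example.

```python
# Forecast sales volumes quoted in the text (cars per quarter).
forecast = [
    ("Q4 2020", 252),
    ("Q1 2021", 380),
    ("Q2 2021", 225),
    ("Q3 2021", 294),
]

def pct_changes(series):
    """Percentage change of each quarter relative to the previous one."""
    changes = []
    for (_, previous), (label, current) in zip(series, series[1:]):
        changes.append((label, round((current - previous) / previous * 100, 1)))
    return changes

changes = pct_changes(forecast)
for label, pct in changes:
    print(f"{label}: {pct:+.1f}%")
# Q1 2021: +50.8%
# Q2 2021: -40.8%
# Q3 2021: +30.7%
```

The Q2 2021 result agrees with the "over 40%" decline stated above. The same function could be applied to the contribution margin figures, which the text gives only as rounded amounts.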
margin becoming negative for all quarters of 2021. The sales volume is also expected to fall
to zero from the second quarter of 2021 (because a negative sales volume is impossible), as
seen in Figure 5.1 below.
Figure 5.1. The forecast for sales volume and contribution margin for sales in Canada.
For these reasons, it is advised that the company leave the Canadian market. If it still wants to keep
its business in Canada, it should discontinue the marketing campaign for segment and miscellaneous
incentives, since without this campaign both the sales volume and the contribution margin are
forecast to stay positive (as shown in figure 5.2).
Figure 5.2: The sales volume and contribution margin forecast in Canada (without the marketing
campaign cited above).
2. HMC's vast array of models provides consumers with many options. However, not all models
perform equally, and to further optimize the business the inventory needs to be adjusted.
Figure 5.3 shows the models of each brand and how well they are currently performing. Both the
Apechete and Jackson brands have a Crux model, with the latter performing much better. This
suggests that it would be more viable to stock up on the Jackson Crux rather than the Apechete
Crux, in order to reduce the cost of holding cars that do not perform well.
Figure 5.3: The performance of different models across 3 brands (Apechete, Jackson and Tatra) in
gross sales.
Jackson’s Crux is also the more viable model to stock because of its popularity and sales numbers: it
is currently the best performing Jackson model, as shown in figure 5.4. More stock is therefore
needed to keep up with demand, which underlines the importance of freeing up inventory space for
more popular cars.
Figure 5.4: The gross sales of different models for the Jackson brand.
Other important issues relate to data security, especially for transactional and consumer personal
data. The company needs to protect consumer personal data, as the law covers businesses that hold
consumer data (Privacy and Data Protection Act 2014). This means HMC will have to store the data
securely so that it is not leaked to outside parties, and put defences such as firewalls in place to
prevent cyberattacks that aim to steal consumer data.
However, because the dataset is collected from within the company’s own production, sales and
operations, most of these issues do not arise for it directly. They still need to be taken into
consideration to ensure the safety of the company and of its consumers.
Another critical issue that needs to be addressed is data inconsistency. Some standard practices
that could be used to mitigate this problem are:
• For data missing at random, if only a small number of rows have missing values, these rows
can be dropped from the dataset. However, this approach is not feasible if the number of
rows with missing values is too large, since it will affect the accuracy of predictions
(Gudivada et al. 2017, p.34).
• Duplicate data can be dropped from the database. Other methods exist that keep a record
of these data, such as inferring missing values or finding false records, but they are more
complex and time-consuming than eliminating all duplicated rows (Lup Low et al. 2001,
p.586).
• For inconsistent data, our approach is to create a set of rules based on the definitions in
other tables and then remove data from the original column if it is not consistent with
these definitions, since such constraints play an important role in maintaining high data
quality (Volkovs et al. 2014, p.244).
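The first and third practices above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed implementation: the 5% threshold, the allowed segment list, the field names and the sample rows are all hypothetical.

```python
# Hypothetical rule set and threshold; neither comes from the HMC dataset.
ALLOWED_SEGMENTS = {"Compact", "Full-size", "Luxury", "Sport Utility"}
MAX_MISSING_SHARE = 0.05

def drop_incomplete(rows, required, max_share=MAX_MISSING_SHARE):
    """Drop rows with missing required fields, but only if few rows are affected."""
    complete = [
        r for r in rows
        if all(r.get(field) not in (None, "") for field in required)
    ]
    dropped = len(rows) - len(complete)
    if rows and dropped / len(rows) > max_share:
        raise ValueError(
            f"{dropped} of {len(rows)} rows are incomplete; dropping them "
            "could distort later predictions"
        )
    return complete

def split_by_rule(rows, field, allowed):
    """Separate rows that satisfy a domain rule from rows that violate it."""
    valid = [r for r in rows if r[field] in allowed]
    invalid = [r for r in rows if r[field] not in allowed]
    return valid, invalid

# 38 clean rows, one row with a missing segment, one row breaking the rule.
rows = [{"vin": f"VIN{i:03d}", "segment": "Compact"} for i in range(38)]
rows.append({"vin": "VIN038", "segment": None})
rows.append({"vin": "VIN039", "segment": "Micro"})

complete = drop_incomplete(rows, required=["vin", "segment"])
valid, invalid = split_by_rule(complete, "segment", ALLOWED_SEGMENTS)
print(len(complete), len(valid), [r["vin"] for r in invalid])
```

Raising an error when too many rows are incomplete forces a deliberate decision instead of silently discarding a large share of the data.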
Conclusion
In conclusion, an in-depth analysis of HMC’s raw data has uncovered knowledge that is useful in
driving profits and market share forward. The company should focus its business on countries with
lower tariff rates and reduce its sales in high-tariff markets, as well as in Canada. Within the sales
channels, the variance in sales between the Advantage and Jespie models is an important topic. This
is also reflected in the financial analytics, where the top sellers are largely responsible for the
company's contribution margin across all sales channels and models. As highlighted earlier, key
issues remain in data privacy and security and in the high level of data inconsistency, and the
company should mitigate these problems to enhance its data governance and its future
competitiveness.
References
• Boyd, D & Crawford, K 2012, "Critical questions for Big Data", Information, Communication & Society, vol. 15, no. 5, pp. 662-679.
• Elliot, H 2020, Half Sports-Car, Half Off-Roader: The Era of the SUV Coupe Has Begun, Bloomberg, viewed 12 October 2021, <https://www.bloomberg.com/news/articles/2020-05-13/mercedes-porsche-tout-suv-coupe-as-car-for-the-covid-19-era>.
• Fullard, M 2015, Complete guide to understanding car segments, Gulf News, viewed 12 October 2021, <https://gulfnews.com/lifestyle/complete-guide-to-understanding-car-segments-1.1595406>.
• Gudivada, V, Apon, A & Ding, J 2017, "Data Quality Considerations for Big Data and Machine Learning: Going Beyond Data Cleaning and Transformations", International Journal on Advances in Software, vol. 10, no. 1, pp. 1-20.
• izmocars 2021, Understanding Big Data Analytics for Auto Dealerships, izmocars, viewed 26 October 2021, <https://www.izmocars.com/article/understanding-big-data-analytics-for-auto-dealerships-1370-en-us.htm>.
• Lup Low, W, Li Lee, M & Wang Ling, T 2001, "A knowledge-based approach for duplicate elimination in data cleaning", Information Systems, vol. 26, pp. 585-606.
• Privacy and Data Protection Act 2014, the Office of Parliamentary Counsel, Canberra, p. 1, <https://www.legislation.vic.gov.au/in-force/acts/privacy-and-data-protection-act-2014/027>.
• Volkovs, M, Chiang, F, Szlichta, J & Miller, RJ 2014, "Continuous data cleaning", 2014 IEEE 30th International Conference on Data Engineering, pp. 244-255.
Appendix
8. Error: 664 cars have a wrong Total Fixed Cost value.
   Action: Create a new column based on the formula (Depreciation + Engineering + Tooling), then delete the old column.
   Reason: The number of rows with this error is too large to remove, and the data needed to calculate the right value are already available.

9. Error: The Total Variable Cost of all (2672) cars does not match the formula (Labour + Material + Overhead + Freight + Warranty).
   Action: Create a new column based on the formula, then correct all columns related to it (Contribution Margin, Net Revenue and After-tax) with the new values, and delete all old columns.
   Reason: The data for calculating the right Total Variable Cost are available, and all of these columns are affected by this error.

10. Error: 1706 cars have a package cost in the sales data that differs from the definition (in the ‘Packages and Cost’ table).
    Action: After joining ‘Packages and Cost’ with the sales data, delete the old package cost column and use the cost from the ‘Packages and Cost’ table as the new package cost.
    Reason: The package cost is already available in the ‘Packages and Cost’ table, and the data in the old column do not follow the definition in this table.

11. Error: 2 cars are in the Avatar model (which is not the actual model that these cars are sold with).
    Action: Change the model of these cars from Avatar to Advantage.
    Reason: Both cars are from the Tatra brand (like Advantage), and they share other similarities with Advantage: full-size segment, pick-up truck bodystyle and AWD drive configuration.

12. Error: 2 cars are in the Flower model (which is not in the company’s model list).
    Action: Remove both cars from the dataset.
    Reason: They are in two different brands (Jackson and Tatra), which makes it difficult to determine which model they belong to, since each model belongs to only one brand.

13. Error: 1 car is in the Cx7 series (which is not defined in the beginning).
    Action: Change the Cx7 series to Cx2.
    Reason: This car has numerous similarities with the Cx2 series: it is in the Chare model, is full-size luxury and has an SUV bodystyle, so we could assume that an error was made when entering the original Cx2 series for the car.

14. Error: In the Crux model there are 2 cars with the C1 and S2 series (while originally it only has 2 series, Cr1 and Cr2).
    Action: Remove both cars from the dataset (performed in the Excel file; the result is shown in figure 2).
    Reason: The car with the C1 series has several differences from other cars in the Cr1 series: its segment is full-size luxury, while other cars are either compact or full-size, and it has an SUV bodystyle, compared to the pick-up truck and sedan bodystyles of the other cars. Similarly, the S2 series car is also full-size luxury with an SUV bodystyle, while other cars in the Crux model do not have these characteristics.

15. Error: In the South America region, 3 cars are labelled as sold in Canada and 2 cars are labelled as sold in the USA.
    Action: Change the region column for these cars to North America (in the Excel file; the result is shown in figure 3).
    Reason: Canada and the USA are both in the North America region (which exists in the database), so it is feasible to modify the rows’ data rather than delete all of them.

16. Error: There are full-size luxury, mid-size luxury and entry-level luxury segments (which do not exist in the original segments of the company and are present in 63, 62 and 35 cars respectively).
    Action: Change the segment for all of these cars to ‘Luxury’.
    Reason: All of these segments have ‘Luxury’ in their name, which means that they could be considered part of the same luxury segment.

17. Error: 1543 cars have a package cost that differs from the cost defined for the packages used for them.
    Action: Join the ‘Packages and Cost’ table to the sales data, delete the old ‘Package Costs’ column and replace it with the value from the ‘Costs’ column of the ‘Packages and Cost’ table.
    Reason: The cost of each package is already available in the ‘Packages and Cost’ table and each package type has only one cost, so it is possible to replace the old package cost values with new values.

18. Error: 194 cars are in the Micro segment, which is not in the list of segments of cars sold by HMC.
    Action: Delete all of these cars from the dataset.
    Reason: As explained by Fullard (2015), micro cars are smaller than subcompact cars, so they could be considered a different segment from any other segment sold by HMC and thus could not be merged.

19. Error: 114 cars are in the Sport Coupe segment (which is not in the initial list of segments).
    Action: Change the segment of all of those cars to ‘Sport Utility’.
    Reason: The Sport Coupe or Sport Utility Coupe is emerging as a new type of sport utility vehicle (Elliot 2020).

21. Error: 664 cars have a wrongly calculated tariff.
    Action: Recalculate the tariff (multiply the rate from the ‘Tariff rate’ table by the gross sales) and then delete the old tariff column.
    Reason: The tariff rates already exist in the dataset, so the tariff could be calculated using the definition (Gross Sales * Tariff Rate).
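The recalculations in rows 8, 9 and 21 of the table use formulas that are stated explicitly, so they can be illustrated with a short sketch. The dictionary keys and the sample car below are hypothetical; only the three formulas and the 12.58% Argentine tariff rate come from the report.

```python
# Hypothetical column names; the formulas are the ones stated in the table.
def recompute(row, tariff_rates):
    """Return a copy of the row with the derived columns recalculated."""
    total_fixed = row["depreciation"] + row["engineering"] + row["tooling"]
    total_variable = (
        row["labour"] + row["material"] + row["overhead"]
        + row["freight"] + row["warranty"]
    )
    # Tariff = Gross Sales * Tariff Rate, with the rate looked up by country.
    tariff = row["gross_sales"] * tariff_rates[row["country"]]
    return {
        **row,
        "total_fixed_cost": total_fixed,
        "total_variable_cost": total_variable,
        "tariff": tariff,
    }

tariff_rates = {"Argentina": 0.1258}  # rate quoted in the report

# Invented example car; the amounts are not taken from the dataset.
car = {
    "country": "Argentina",
    "gross_sales": 40000.0,
    "depreciation": 1000.0, "engineering": 500.0, "tooling": 250.0,
    "labour": 5000.0, "material": 9000.0, "overhead": 2000.0,
    "freight": 800.0, "warranty": 400.0,
}

result = recompute(car, tariff_rates)
print(result["total_fixed_cost"], result["total_variable_cost"],
      round(result["tariff"], 2))
```

Building a new dictionary rather than overwriting the inputs mirrors the approach in the table of creating new columns first and deleting the old ones afterwards.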
23
Figure 1: The workflow used for data cleaning.
Figure 2: The result after removing the cars with C1 and S2 series from the Crux model.
Figure 3: The result after removing cars sold in Canada and USA from the South America region and
putting them in North America.
The following figures show how the dataset is cleaned in Tableau Prep Builder.