
Car residual values use case


CONFIDENTIAL AND PROPRIETARY


Any use of this material without specific permission of McKinsey & Company
is strictly prohibited
Who we are

Romain Zilahi, Partner
Romain Zilahi leads the consumer credit and leasing activities in Europe within the FIG practice. He supports these players, as well as traditional banks, fintechs and insurers, on their strategic and operational challenges.

Mathilde Lavacquery, Senior Data Scientist
Mathilde has been a Senior Data Scientist in the Paris office since 2019. She has worked on Advanced Analytics projects across several industries (energy trading, banking, insurance…).

Colomban Basset, Senior Data Scientist
Colomban has been a Senior Data Scientist in the Brussels office since 2020. He focuses on risk modelling projects for financial institutions.

McKinsey & Company 2


This course aims to cover the classical steps of an Operation Analytics project while presenting different use cases

Key project steps:
▪ Ideation
▪ Modelling approach
▪ Data ingestion/pre-processing
▪ Problem solving
▪ Scale-up

Industry project types:
▪ Plant management
▪ Banking
▪ Fleet management
▪ Manufacturing
▪ Supply chain
▪ Chemistry
▪ Energy (tbc Sophie)

Focus of the session: highlighted on the slide in the project steps x project types matrix (highlighting not reproduced here)



The exercise will be sequenced into four steps involving breakout time in teams

Step  Activity                  Duration                             Room
1     Case introduction         15 min                               Plenary room
2     Breakout time per team    40 min                               6 breakout rooms
3     Presentations             3 x 5 min (one group per question)   Plenary room
4     Debrief and case sharing  35 min                               Plenary room
​Case presentation

​Breakout exercise

Case wrap-up



Context and objectives

Client introduction

▪ Our client is a player in the car industry offering car leasing contracts to its customers
▪ A key element in defining a leasing contract is the residual value (RV) of the leased vehicle, which is the value at which the car could be sold by our client at the end of the leasing contract
▪ So far, our client has been using RVs provided by external market quoters to define the terms of its leasing contracts
▪ They want to bring the risk back in-house and develop a proprietary tool to determine and manage residual values

Objectives of the project

▪ Design an end-to-end RV tool
 – Technical knowledge of RV predictions & customization of the predictions
 – Centralize and visualize all predicted and historical RVs
▪ Manage the risk
 – Understand the algorithm and the key factors influencing those RVs
 – Technical independence from external market providers
▪ Decision tool allowing dynamic pricing of leasing contracts
 – Leverage predictions to make business decisions within the tool
 – Possibility to apply stressed scenarios to RVs



RV tool management

▪ Left part: filter on the car to analyse, at the vehicle version level
▪ Leasing contracts can vary from 6 months to 10 years, and vehicles at the end of a leasing contract have between 5,000 and 150,000 km of mileage
▪ One business rule: the predicted RV should decrease along both the age and the mileage axes
▪ Possibility to click on a specific value to see historical values
▪ Possibility to download the grids as a flat Excel table to perform further analysis
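
The business rule on the grid (RV decreasing along both age and mileage) can be checked mechanically. The sketch below is only an illustration: the function name, the grid layout (rows as age buckets, columns as mileage buckets) and the sample values are assumptions, and ties between neighbouring cells are treated as acceptable.

```python
def is_decreasing_grid(grid):
    """Return True if values never increase along rows (mileage) or columns (age)."""
    n_rows, n_cols = len(grid), len(grid[0])
    for i in range(n_rows):
        for j in range(n_cols):
            # Next mileage bucket must not be worth more than the current one
            if j + 1 < n_cols and grid[i][j + 1] > grid[i][j]:
                return False
            # Next age bucket must not be worth more than the current one
            if i + 1 < n_rows and grid[i + 1][j] > grid[i][j]:
                return False
    return True


# Hypothetical RV grids in % of list price (illustrative values only)
valid_grid = [[60, 57, 53], [52, 49, 45], [44, 41, 37]]
invalid_grid = [[60, 57, 53], [52, 58, 45], [44, 41, 37]]  # 58 breaks the rule

print(is_decreasing_grid(valid_grid), is_decreasing_grid(invalid_grid))
```

A check like this could run before a grid is displayed or exported, so that violations are flagged to the RV team rather than silently shown.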



Questions

1 What data would you use to predict the RV grids? Please build the data collection request and cite the main variables

2 Describe the main preprocessing steps:
 – Cite the main data quality checks that can be performed
 – What are the main data cleaning steps to perform in general?
 – BONUS: build a synthetic variable

3 What screens would you design to help the RV team make their pricing decisions?

Tool kit

▪ Leasing contracts can vary from 6 months to 10 years, and vehicles at the end of a leasing contract have between 5,000 and 150,000 km of mileage
▪ The RV of a car is computed at the vehicle version x fuel type level
▪ To make their pricing decisions, the team considers:
 – RV predictions from different sources (external quoters and own predictions)
 – Historical prices from their brand and the competitors' brands
 – Different macro-economic scenarios



Points allocation in this module will be based on …

Each team is expected to provide a PPT presentation at the end of the breakout, with one slide per question. The format will not be assessed.
We advise you to be mindful of your time and divide the questions.

Points will be attributed per question, according to the elements provided in the PPT.

Main evaluation criteria in this module:

1. What data would you use to predict the RV grids? Please build the data collection request. (4 points)
 – Build a data collection request with key data characteristics
 – Identify the main data quality checks for each data type

2. Deep dive into the main preprocessing steps: (4 points, +1 bonus point)
 – Cite the main data quality checks that can be performed
 – What are the main data cleaning steps to perform in general?
 – BONUS: build a synthetic variable

3. What screens would you design to help the RV team make their pricing decisions? (2 points)
 – Make a list of 2-3 screens that you would find interesting, and describe their purpose in two lines


1. Data collection request

Each item lists, where stated on the slide: granularity, historicity, real-time availability and source.

Intrinsic value
▪ Transaction data on used car resale prices from internet quoters
 – Vehicle characteristics: Model, Trimline, Model year, Production year, Color, Drive type, Powertrain, Engine liters & fuel & induction type
 – Registration date
 – Sale country
 – Operation months
 – Total mileage
 – Transaction price
 Granularity: monthly, at a vehicle version level. Historicity: 10 years. Real-time availability: 1 month. Source: third-party provider
▪ Original list price for a vehicle
 Granularity: monthly. Historicity: 10 years. Real-time availability: 1 month. Source: client internal data

Market and seasonality data
▪ Historical sales prices of similar vehicles from competitors
 Granularity: monthly. Historicity: 5-8 years. Real-time availability: 1 month. Source: client internal data
▪ Market fuel mix (volumes diesel/petrol)
 Granularity: monthly. Real-time availability: 1 month
▪ Electric vehicle adoption rate
 Granularity: yearly. Real-time availability: 1 year

Macro-economic data
▪ GDP growth
▪ Unemployment rate
▪ Average imports of crude oil
▪ Consumer spending (household expenses)
▪ Country's price index
▪ Passenger car registrations
▪ Contribution of renewables to total primary energy supply
 Granularity: monthly, quarterly or yearly, according to the public indexes' availability. Historicity: according to the public indexes' availability. Real-time availability: previous period available. Source: OECD



2. Data quality checks and data cleaning steps

Data quality checks

Intrinsic value
• Assess that sufficient historicity is available. To forecast up to 4 years ahead, at least 8 years of data are needed (4 years of training and 4 years of test)
• Assess that price patterns make sense:
  – For the same vehicle version (considering the same age and the same mileage), verify that prices remain relatively constant through time. When large variations are noticed, assess the business reasons behind them (e.g. new tax, new regulation, introduction of electric cars, etc.)
  – For the same vehicle version, verify that prices are decreasing along both axes (age and mileage)

Market and seasonality data
• Assess the completeness and historicity of the data available per brand/model type

Macro-economic data
• Assess data completeness (e.g. four points a year for quarterly data)
• Briefly assess data consistency (e.g. confirm that GDP drops in 2008-2009 and in 2020)
• Assess data timeliness (i.e. how long it takes to get the data ready)
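
As an illustration of the completeness check on quarterly macro-economic data ("four points a year"), here is a minimal sketch. The function name, the `(year, quarter, value)` tuple layout and the sample values are assumptions for the example; the values are made up and are not real GDP figures.

```python
from collections import Counter


def incomplete_years(quarterly_points, expected_per_year=4):
    """Return {year: number of distinct quarters found} for incomplete years."""
    # De-duplicate (year, quarter) pairs so repeated rows are not counted twice
    seen = {(year, quarter) for year, quarter, _value in quarterly_points}
    counts = Counter(year for year, _quarter in seen)
    # Scan every year in the covered range so fully missing years are reported too
    years = range(min(counts), max(counts) + 1)
    return {y: counts.get(y, 0) for y in years if counts.get(y, 0) < expected_per_year}


# Illustrative quarterly series (made-up values); 2020 Q2 is missing
gdp_growth = [
    (2019, 1, 0.4), (2019, 2, 0.3), (2019, 3, 0.2), (2019, 4, 0.1),
    (2020, 1, -5.0), (2020, 3, 10.0), (2020, 4, -1.0),
]

missing = incomplete_years(gdp_growth)
print(missing)
```

The same idea extends to monthly series by setting `expected_per_year=12` and passing month numbers instead of quarters.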



2. BONUS: Build moving average on historical data

• A moving average variable can help tree-based models capture seasonal or historical trends
• Be careful not to feed the model any observation that is still in the future at observation time (= contract start date)
• Fill in the moving average using the data available at observation time (= contract start date), knowing that:
  – The moving average must cover the past 6 months before target time (= contract end date), target time excluded
  – There is a one-month delay to get the latest UC prices from the market
• In this example, we are trying to predict the price of a car 3 months after its contract start date

Car ID      Contract duration  Start year  Start month  End year  End month  UC price (at end date)  Price_6m_MA
Version XX  3                  2020        5            2020      8          106
Version XX  3                  2020        6            2020      9          107
Version XX  3                  2020        7            2020      10         108
Version XX  3                  2020        8            2020      11         110
Version XX  3                  2020        9            2020      12         111
Version XX  3                  2020        10           2021      1          112
Version XX  3                  2020        11           2021      2          113
Version XX  3                  2020        12           2021      3          114                     ?

In 12/2020, we want to predict the price the car will have in 03/2021. Fill in the moving average price over the latest 6 months for the last row (marked "?").



2. BONUS: Auto-regressive feature engineering

Car ID      Contract duration  Start year  Start month  End year  End month  Price (at end date)  Price_6m_MA
Version XX  3                  2020        5            2020      8          106
Version XX  3                  2020        6            2020      9          107
Version XX  3                  2020        7            2020      10         108
Version XX  3                  2020        8            2020      11         110
Version XX  3                  2020        9            2020      12         111
Version XX  3                  2020        10           2021      1          112
Version XX  3                  2020        11           2021      2          113
Version XX  3                  2020        12           2021      3          114                  107.5

In 12/2020 (contract start date), we want to predict the price the car will have in 03/2021 (contract end date).

Legend of the table rows:
• There is a one-month lag in data collection: this data can't be used
• This is 'future information', since contracts starting at these dates will close in the future: this data can't be used
• This is known information we could use to build auto-regressive features: moving averages can be computed

The model will learn from past insights: auto-regressive synthetic variables are very useful.



3. Examples of other screens available

1 Visualize historical used car prices
• See the evolution over time of used car prices

2 Comparison of client RV with external quoters (see next pages)
• Compare outputs from the algorithm with external quoters' predictions on similar models

3 Decision table for RV setting
• Present model predictions and allow business teams to override the RVs based on some key indicators

4 Scenario planning tool (see next pages)
• Estimate the aggregated financial impact of macro-economic scenarios on the RV

5 Comparison of RV grids
• Estimate the RV difference across the full RV grid for similar models



Prediction methodology

Training and model selection

1 Split the data into a training set and a test set. The idea is to maximize the data in the training set while ensuring we can test for all maturities (up to 4 years).
[Diagram: timeline up to today, split into a training period followed by a test period covering the 1-, 2-, 3- and 4-year horizons]

2 Iterate on model fine-tuning:
 - feature selection, notably on the synthetic auto-regressive variables
 - model scope (brands/countries)
 - hyperparameter tuning
Keep the model leading to the lowest MAE (Mean Absolute Error).

Prediction in 'production mode'

3 Use the full data available and the selected model to predict the residual values 'in production mode'.
[Diagram: training set covering all data up to today, with predictions 1, 2, 3 and 4 years ahead]

4 Ensure the predictions make sense from a business standpoint. Extrapolate the grid using a parametric function trained on historical RV values and smooth the grid to ensure it is decreasing along both axes.
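
Steps 2 and 4 can be sketched as follows. This is a simplified illustration, not the project's implementation. The model names, predictions and grid values are hypothetical. For step 4, the slide describes a parametric extrapolation followed by smoothing; the sketch swaps in a much simpler running-minimum cap, which only guarantees that the grid does not increase along either axis. Since MAE is an error measure, the candidate with the lowest test-set MAE is selected.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between observed and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def select_best_model(actual, predictions_by_model):
    """Return the name of the candidate with the lowest MAE on the test set."""
    return min(
        predictions_by_model,
        key=lambda name: mean_absolute_error(actual, predictions_by_model[name]),
    )


def enforce_decreasing(grid):
    """Return a copy where each cell is capped by its upper and left neighbours."""
    out = [row[:] for row in grid]
    for i in range(len(out)):
        for j in range(len(out[i])):
            if i > 0:  # cap by the cell with the same mileage and lower age
                out[i][j] = min(out[i][j], out[i - 1][j])
            if j > 0:  # cap by the cell with the same age and lower mileage
                out[i][j] = min(out[i][j], out[i][j - 1])
    return out


# Hypothetical test-set RVs and candidate predictions (illustrative values)
actual_rv = [50, 45, 40]
candidates = {"model_a": [52, 44, 41], "model_b": [50, 46, 40]}
best = select_best_model(actual_rv, candidates)

# Hypothetical raw grid: rows = age buckets, columns = mileage buckets
raw_grid = [[60, 57, 53], [52, 58, 45], [44, 41, 46]]
smoothed = enforce_decreasing(raw_grid)
print(best, smoothed)
```

The running-minimum cap only ever lowers values, so it is conservative by construction. A parametric fit, as described on the slide, would instead spread the correction across neighbouring cells.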



Appendix



Auto-regressive feature engineering

1 – Conceptual view

The logic has to be applied considering only vehicles being sold with the same age and the same mileage. Illustrative view: prediction with a 3-month horizon.

[Diagram: timeline of contract start dates, showing the contract start date and the prediction horizon]

Legend of the timeline:
• There is a one-month lag in data collection: this data can't be used
• This is 'future information', since contracts starting at these dates will close in the future: this data can't be used
• This is known information we could use to build auto-regressive features: moving averages can be computed

2 – Training data creation

Raw data available:

Start year  Start month  End year  End month  Price
2006        5            2006      8          106
2006        6            2006      9          107
2006        7            2006      10         108
2006        8            2006      11         110
2006        9            2006      12         111
2006        10           2007      1          112
2006        11           2007      2          113
2006        12           2007      3          ?

Training data creation:

Start year  Start month  End year  End month  Price  Price 3MA
2006        5            2006      8          106
2006        6            2006      9          107
2006        7            2006      10         108
2006        8            2006      11         110
2006        9            2006      12         111
2006        10           2007      1          112
2006        11           2007      2          113
2006        12           2007      3          114    107

The model will learn from past insights: auto-regressive synthetic variables are very useful.
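
The 3-month moving average in the training table can be reproduced with a short sketch. The function names and the dictionary layout are assumptions for illustration. The one-month collection lag is read here as: a price is usable only if its contract end month is strictly earlier than the observation month minus the lag. Under that assumed reading, an observation in 12/2006 can use the prices for end months 8, 9 and 10 of 2006.

```python
def month_index(year, month):
    """Map a (year, month) pair to a single increasing integer."""
    return year * 12 + (month - 1)


def lagged_moving_average(prices_by_end_month, observation, window=3, lag_months=1):
    """Average of the `window` most recent prices usable at `observation`.

    prices_by_end_month maps (year, month) of contract end to the observed price.
    Assumption: a price is usable only if its end month is strictly earlier than
    the observation month minus `lag_months`.
    """
    cutoff = month_index(*observation) - lag_months
    usable = sorted(
        (month_index(year, month), price)
        for (year, month), price in prices_by_end_month.items()
        if month_index(year, month) < cutoff
    )
    recent = [price for _, price in usable[-window:]]
    # With fewer than `window` usable points, average what is available
    return sum(recent) / len(recent) if recent else None


# Prices from the table above, keyed by contract end (year, month)
prices = {
    (2006, 8): 106, (2006, 9): 107, (2006, 10): 108, (2006, 11): 110,
    (2006, 12): 111, (2007, 1): 112, (2007, 2): 113,
}

price_3ma = lagged_moving_average(prices, observation=(2006, 12))
print(price_3ma)  # 107.0, from (106 + 107 + 108) / 3
```

Filtering on the end month rather than the start month is what keeps future information out: contracts that started in 9/2006 or later have not yet closed in 12/2006, so their prices never enter the average.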



Evolution of Used Car historical prices



Comparison of client RV with external quoters and competitors



Scenario planning
