Car Residual Values Use Case
Romain Zilahi, Partner
Romain Zilahi leads the consumer credit and leasing activities in Europe within the FIG practice. He supports these players, as well as traditional banks, fintechs and insurers, on their strategic and operational challenges.

Mathilde Lavacquery, Senior Data Scientist
Mathilde has been a Senior Data Scientist in the Paris office since 2019. She has worked on Advanced Analytics projects in several industries (energy trading, banking, insurance…).

Colomban Basset, Senior Data Scientist
Colomban has been a Senior Data Scientist in the Brussels office since 2020. He focuses on risk modeling projects for financial institutions.
▪ Banking
▪ Fleet management
▪ Manufacturing
▪ Supply chain
▪ Chemistry
▪ Energy (tbc Sophie)
Breakout exercise
Case wrapup
Questions
1 What data would you use to predict the RV grids? Please build the data collection request and cite the main variables.
2 Describe the main preprocessing steps:
– Cite the main data quality checks that can be performed
– What are the main data cleaning steps to perform in general?
– BONUS: build a synthetic variable
3 What screens would you design to help the RV team make their pricing decisions?

Context
▪ Leasing contracts can vary from 6 months to 10 years, and vehicles at the end of a leasing contract vary from 5,000 to 150,000 km of mileage
▪ The RV of a car is computed at the vehicle version x fuel type level
▪ To make their pricing decisions, the team considers:
– RV predictions from different sources (external quoters and own predictions)
– Historical prices from their brand and the competitors' brands
Each team is expected to provide a PPT presentation at the end of the breakout, with one slide per question. The format will not be assessed.
We advise you to be mindful of your time and to divide the questions among team members.
Points will be attributed per question, according to the elements provided in the PPT:
3. What screens would you design to help the RV team make their pricing decisions? (2 points)
Make a list of 2-3 screens that you would find interesting, and describe their purpose in two lines.
McKinsey & Company
Case presentation
Intrinsic value data
▪ Transaction data on used-car resale prices from internet quoters, at a vehicle version level (Monthly; 10 years; 1 month; Third-tier provider):
– Vehicle characteristics: Model, Trimline, Model year, Production year, Color, Drive type, Powertrain, Engine liters & fuel & induction type
– Registration date
– Sale country
– Operation months
– Total mileage
– Transaction price
▪ Original list price for a vehicle (Monthly; 10 years; 1 month; Client internal data)

Market and seasonality data
▪ Similar vehicles' historical sales prices from competitors (Monthly; 5-8 years; 1 month; Client internal data)
▪ Market fuel mix (diesel/petrol volumes) (Monthly; 1 month)
▪ Electric vehicle adoption rate (Yearly; 1 year)
Market and seasonality data
• Assess the completeness and historical depth of the data available per brand/model type
• A moving average variable can help tree-based models capture seasonal or historical trends
• Be careful not to feed the model any observation that would still be in the future at observation time (= contract start date)
• Fill the moving average using only the data available at observation time (= contract start date), knowing that:
– The moving average must cover the 6 months before target time (= contract end date), target time excluded
– There is a one-month delay to get the latest UC prices from the market
• In this example, we are trying to predict the price of a car 3 months after its contract start date
Car ID | Contract duration | Contract start - Year | Contract start - Month | Contract end - Year | Contract end - Month | UC Price (at end date) | Price_6m_MA
Version XX | 3 | 2020 | 5 | 2020 | 8 | 106 |
Version XX | 3 | 2020 | 6 | 2020 | 9 | 107 |
Version XX | 3 | 2020 | 7 | 2020 | 10 | 108 |
Version XX | 3 | 2020 | 8 | 2020 | 11 | 110 |
Version XX | 3 | 2020 | 9 | 2020 | 12 | 111 |
Version XX | 3 | 2020 | 10 | 2021 | 1 | 112 |
Version XX | 3 | 2020 | 11 | 2021 | 2 | 113 |
Version XX | 3 | 2020 | 12 | 2021 | 3 | 114 | ?
Fill the moving average price over the latest 6 months for this row.
In 12/2020, we want to predict the price the car will have in 03/2021.
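One possible way to fill the last row can be sketched as follows. This is a minimal illustration and not the case's official answer. It assumes that the one-month delay means prices up to 11/2020 are known in 12/2020, and that the average is taken over whichever months of the 6-month window are already known. The names `price_6m_ma`, `month_index` and `uc_prices` are hypothetical.

```python
from statistics import mean

# Used-car prices from the table, keyed by (contract end year, contract end month).
uc_prices = {
    (2020, 8): 106, (2020, 9): 107, (2020, 10): 108, (2020, 11): 110,
    (2020, 12): 111, (2021, 1): 112, (2021, 2): 113, (2021, 3): 114,
}

def month_index(year, month):
    """Map a (year, month) pair to a single integer so months can be compared."""
    return year * 12 + (month - 1)

def price_6m_ma(prices, obs, target, window=6, delay=1):
    """Average the prices that fall in the `window` months before `target`
    (target excluded) AND were already known at `obs`, given a publication
    delay of `delay` months (assumption: known up to obs - delay)."""
    target_idx = month_index(*target)
    last_known = month_index(*obs) - delay
    usable = [
        price
        for (year, month), price in prices.items()
        if target_idx - window <= month_index(year, month) < target_idx
        and month_index(year, month) <= last_known
    ]
    return mean(usable) if usable else None

# Observation time 12/2020, target time 03/2021.
ma = price_6m_ma(uc_prices, obs=(2020, 12), target=(2021, 3))
print(round(ma, 2))
```

Under these assumptions only the 09/2020, 10/2020 and 11/2020 prices (107, 108, 110) are both inside the window and already known, so the prices for 12/2020 to 02/2021 never leak into the feature.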
5 Comparison of RV grids
• Estimate the RV difference across the full RV
grid for similar models
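As a rough sketch of such a comparison, the snippet below computes the cell-by-cell difference between two RV grids. It assumes a grid is keyed by (contract duration in months, mileage in km), which is one plausible layout given the contract ranges in the case. All values are hypothetical illustration data, and `grid_difference` is not a function from the case material.

```python
def grid_difference(grid_a, grid_b):
    """Cell-by-cell difference (a - b) over the cells present in both grids."""
    common_cells = sorted(grid_a.keys() & grid_b.keys())
    return {cell: grid_a[cell] - grid_b[cell] for cell in common_cells}

# Hypothetical grids: (duration_months, mileage_km) -> RV as % of list price.
model_a = {(24, 40000): 62, (36, 60000): 52, (48, 80000): 43}
model_b = {(24, 40000): 60, (36, 60000): 49, (48, 80000): 41, (60, 100000): 33}

diff = grid_difference(model_a, model_b)
mean_gap = sum(diff.values()) / len(diff)

for cell, gap in diff.items():
    print(cell, gap)
```

Restricting the comparison to shared cells avoids treating a missing cell as a zero residual value.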
1 Split the data into a training set and a test set. The idea is to maximize the data in the training set while ensuring we can test for all maturities (up to 4 years).
3 Use the full data available and the selected model to predict the residual values 'in production mode'.
[Figure: training and test periods up to today, shown for maturities of 1, 2, 3 and 4 years]
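One possible reading of this split can be sketched as follows. For each maturity, the most recently ended contracts are held out as the test set and every earlier ended contract goes to training. The 6-month hold-out length, the `(start_date, maturity_years)` layout and the function name `split_by_maturity` are assumptions made for illustration. The sketch does not enforce a stricter rule such as excluding training contracts that end after a test contract starts.

```python
from datetime import date

def month_index(d):
    """Map a date to a single integer month counter."""
    return d.year * 12 + (d.month - 1)

def split_by_maturity(contracts, today, test_months=6):
    """Split (start_date, maturity_years) contracts into train and test.

    A contract is usable only once it has ended (its resale price is observed).
    Contracts that ended within the last `test_months` go to the test set,
    all older ended contracts go to the training set.
    """
    train, test = [], []
    for start, maturity_years in contracts:
        end_idx = month_index(start) + 12 * maturity_years
        months_since_end = month_index(today) - end_idx
        if months_since_end < 0:
            continue  # contract still running: no target to learn from or test on
        if months_since_end < test_months:
            test.append((start, maturity_years))
        else:
            train.append((start, maturity_years))
    return train, test

# Hypothetical portfolio: one contract per maturity starting each October.
today = date(2024, 1, 1)
contracts = [(date(year, 10, 1), m) for m in (1, 2, 3, 4) for year in range(2015, 2024)]

train, test = split_by_maturity(contracts, today)
test_maturities = sorted({m for _, m in test})
print(len(train), len(test), test_maturities)
```

Because the hold-out is defined on the contract end date, every maturity up to 4 years appears in the test set while most of the history stays available for training.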