Fundamentals of Model Calibration: Theory & Practice
Workshop Leaders
Douglas Taylor, MBA
Associate Director, Ironwood Pharmaceuticals Inc, Cambridge, MA
USA
Ankur Pandya, PhD MPH
Graduate Student, Harvard University, Boston, MA USA
David Thompson, PhD
Executive Vice President & Senior Scientist, OptumInsight, Boston,
MA USA
Confidential property of Optum. Do not distribute or reproduce without express permission from Optum. 2
Acknowledgements
We would like to thank our colleagues who have contributed much to this
research over the last several years
Kristen Gilmore
Rowan Iskandar
Denise Kruzikas
Kevin Leahy
Vivek Pawar
Milton Weinstein
Workshop Objectives
Discuss rationale for model calibration: in what circumstances is calibration needed?
Provide overview of model calibration process: selection of inputs,
specifying the objective function, implementing the search process, and
evaluating the calibration results
Describe advanced topics in model calibration, including incorporation
of calibrated inputs into uncertainty analyses
Illustrate concepts through real-world examples
Concept of Model Calibration
Calibration traditionally conceptualized as an important (but not necessary) step in model validation:
If reliable benchmark data exist, then predictive validity can be
assessed & model calibrated if found to be inaccurate
Otherwise, model cannot be impugned for not being calibrated
Calibration task involves systematic adjustment of model parameter
estimates so that model outputs more accurately reflect external
benchmarks
Calibration requires modeler to assess how model outputs can govern
model inputs, rather than the other way around
Data Sources
[Figure: CHD risk vs. cholesterol level, with separate curves for the US and France]
When is calibration needed?
Model validity threatened by temporal variation (e.g., if input data are old or secular changes have occurred since their collection)
[Figure: CHD risk vs. cholesterol level, US 1980 vs. US 2010 curves]
[Figure: CHD risk vs. cholesterol level, US average vs. US women curves]
Model Calibration Process
[Diagram: Estimate model parameters → Run model]
Thank You.
Contact Info:
David Thompson, PhD
david.thompson@optum.com
781-518-4034
Fundamentals of Model Calibration: Theory & Practice
Identifying Inputs to Calibrate
Theoretically, any input could be calibrated
But inputs should be related to the problem to justify using calibration:
Adapted from one setting to another
Estimated from heterogeneous populations
Affected by temporal changes in epidemiology or practice patterns
Identifying Calibration Targets
Targets should be based on setting-specific (or otherwise appropriate) data
Model should predict these types of events (age-specific, composite outcomes, etc.)
Goodness of Fit
Assess how well model outputs match observed data
Three potential approaches:
Acceptable windows
Minimizing deviations
Likelihood functions
Acceptable Windows
Compare model-predicted outcomes to established ranges for each endpoint
Suitable when there are multiple endpoints of interest
Easy to implement
Limitation: Does not capture the degree of closeness
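In code, the acceptable-windows check is just a per-endpoint range test; a minimal sketch (the endpoint names and bounds below are hypothetical, for illustration only):

```python
def within_windows(outputs, windows):
    """Return True if every model output falls inside its acceptable window."""
    return all(lo <= outputs[name] <= hi for name, (lo, hi) in windows.items())

# hypothetical endpoints with (lower bound, upper bound) per endpoint
windows = {"incidence": (7.0, 9.5), "mortality": (2.0, 3.0)}

print(within_windows({"incidence": 8.2, "mortality": 2.4}, windows))  # True
print(within_windows({"incidence": 9.9, "mortality": 2.4}, windows))  # False
```

Note the limitation from the slide: the check is pass/fail only, so an output just inside a window scores the same as one dead-center.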
Acceptable Windows: Example
[Figure: model outputs plotted against the upper and lower bounds for each endpoint]
Minimizing Deviation
Summary measure of relative distance of model-produced results from benchmarks
Captures magnitude of goodness of fit
Easy to implement
Weights all endpoints equally, unless weighting scheme introduced
Percentage Deviation

Weighted Mean Percentage Deviation = Σ_{i=1}^{n} w_i · |pred_i − obs_i| / obs_i

Where:
n = number of endpoints
pred_i = model-based estimate of the ith endpoint
obs_i = data-based target value of the ith endpoint
w_i = weight assigned to the ith endpoint
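The formula above translates directly to code; a minimal sketch (the prediction and target values below are hypothetical):

```python
def weighted_mean_pct_deviation(pred, obs, weights=None):
    """Weighted mean percentage deviation of model predictions from targets,
    with equal weights unless a weighting scheme is introduced."""
    n = len(pred)
    w = weights if weights is not None else [1 / n] * n
    return sum(wi * abs(p - o) / o for wi, p, o in zip(w, pred, obs))

# hypothetical two-endpoint comparison: model predictions vs. observed targets
print(weighted_mean_pct_deviation([8.5, 105.0], [8.21, 100.0]))
```

A perfect fit returns 0; larger values mean worse fit, so the search minimizes this quantity.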
Minimizing Deviation: Example
[Figure: model outputs shown against the target value]
Likelihood Functions
How likely the model-produced results are in light of the observed outcomes
Incorporates precision of endpoint data
Harder to implement:
Need data on sample sizes
Have to know (or assume) distributions
Likelihood Functions: Example
Assume incidence has binomial distribution

Pr(K = k) = (n choose k) · p^k · (1 − p)^(n − k)

Where:
k = # of events observed in model
n = sample size of outcome data
p = # of events observed in outcome data / n
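The binomial likelihood above is straightforward to compute; a sketch using the slides' target of 23 events in 2,800 person-years (exact binomial values may differ slightly from the rounded L values shown on the slides):

```python
from math import comb

def binomial_likelihood(k_model, n, k_obs):
    """Likelihood of a model-predicted event count k_model under a binomial
    distribution with p estimated from the observed outcome data."""
    p = k_obs / n
    return comb(n, k_model) * p ** k_model * (1 - p) ** (n - k_model)

# candidate parameter sets predicting 28 vs. 14 events against a target of 23/2,800
print(binomial_likelihood(28, 2800, 23))
print(binomial_likelihood(14, 2800, 23))
```

The likelihood is maximized when the model reproduces the observed count exactly, and falls off as the prediction moves away from the target.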
Likelihood Function: Example
n = person-years; k = events; (k / n) × 1000 = incidence (y-axis)
Target: k = 23 events, n = 2,800 person-years → incidence = 8.21 per 1,000
Likelihood Function: Example
[Figure: age-specific incidence per 1,000 person-years (45-54, 55-64, 65-74, 75-84 yrs) for the ARIC target and two candidate parameter sets]
One target: k = 23, n = 2,800; Parameter Set 1 predicts k = 28 (L = 0.047), Parameter Set 2 predicts k = 14 (L = 0.013)
A second target: k = 287, n = 49,000; predictions of k = 240 (L = 0.00045) and k = 368 (L = 0.00000064)
Combining Likelihoods
Multiply likelihoods (if independent)
Sum log-likelihoods
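With independent endpoints, the combination is usually done on the log scale to avoid numerical underflow when many small likelihoods are multiplied; a minimal sketch (the per-endpoint likelihood values are illustrative):

```python
from math import log, exp

def total_log_likelihood(likelihoods):
    """Sum of log-likelihoods across independent endpoints; equal to the log
    of their product, but numerically stable for many small values."""
    return sum(log(L) for L in likelihoods)

# illustrative per-endpoint likelihoods, as in the slides' examples
ll = total_log_likelihood([0.047, 0.013, 0.00045])
print(ll)
```

Because log is monotonic, maximizing the summed log-likelihood picks the same parameter set as maximizing the product of likelihoods.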
Summary of Goodness-of-Fit Options
[Table comparing the acceptable windows, deviations, and likelihood-based approaches]
Parameter Search Methods
How to adjust inputs during calibration?
Manual adjustment
Random searches
Directed search algorithms
Fundamentals of Model Calibration: Theory &
Practice
Advanced Topics
Excel Demonstration
Results of 100 calibrations of simple model
Advanced Topics
Probabilistic and deterministic sensitivity analysis
for calibrated disease models
Incorporating uncertainty of calibration endpoints in
calibrated oncology models
Identification of and correction for bias introduced from calibrating longitudinal models to cross-sectional data
Probabilistic and deterministic
sensitivity analyses for calibrated
disease models
Why CSA Was Needed
[Figure: cost-effectiveness plane, incremental cost ($0 to $400) vs. incremental QALYs (0.000 to 0.016)]
Why CSA Was Needed
Sources of uncertainty
Algorithm
Analyst in a manual calibration
Starting seed/search space in a random calibration
Starting simplex in Nelder-Mead calibration
Objective function
Is really quite subjective
Choices include:
Calibration targets
Weighting scheme
Stopping point
CSA Methods
Evaluated algorithm uncertainty by choosing 5
different starting Nelder-Mead simplexes
Evaluated objective function uncertainty by choosing
5 different objective functions
Combined simplexes and weights for a total of 25
different calibrations
Deterministic sensitivity analysis was performed by
examining cost-effectiveness results for each
calibration while holding all other parameters constant
Probabilistic sensitivity analysis was performed by
bootstrapping (with equal probability) the 25
calibrations within a 2nd order Monte Carlo simulation
for other model parameters
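The probabilistic step above, drawing one of the 25 calibrations with equal probability at each 2nd-order iteration, can be sketched as follows (the calibration labels are placeholders for the full calibrated parameter sets):

```python
import random

def psa_draws(calibrations, n_iter=10000, seed=1):
    """Bootstrap one calibrated parameter set per 2nd-order Monte Carlo
    iteration, each with equal probability; in a full PSA the remaining model
    parameters would be sampled from their own distributions alongside it."""
    rng = random.Random(seed)
    return [rng.choice(calibrations) for _ in range(n_iter)]

# placeholder labels standing in for the 25 calibrations (5 simplexes x 5 weights)
draws = psa_draws([f"calibration_{i}" for i in range(25)])
```

Treating the calibrations as equally probable is the design choice described on this slide; alternative schemes could weight them by goodness of fit.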
CSA Deterministic Results
[Table: ICER* by simplex and weighting scheme (Weights 1 to 5)]
Median ICER: $12,600
Mean ICER: $14,000
Range: $1,500 to $39,000
*ICER: Incremental Cost-Effectiveness Ratio (Cost per QALY gained) for vaccination vs. no vaccination
PSA for a Single Calibration
[Figure: PSA scatter on the cost-effectiveness plane, incremental cost ($0 to $400) vs. incremental QALYs (0.000 to 0.016)]
CSA Probabilistic SA Results
[Figure: PSA scatter of incremental cost vs. incremental QALYs, with the $50K/QALY threshold line]
Median: $12,600; Mean: $14,000; 95% CI: ($2,700; $29,100)
Vaccination of age cohorts is compared with no vaccination among the same age cohorts.
Each square represents a calibration and each color represents the PSA around those calibrations.
Representing uncertainty in
calibration targets
Objective
Demonstrate methods for incorporating uncertainties
in calibration targets into sensitivity analyses (PSA)
using an oncology example
Model
[Diagram: three health states: Non-Progressed, Progressed, Dead]
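A minimal sketch of a three-state cohort model with these states; the transition probabilities below are hypothetical, since in practice they are the calibrated inputs:

```python
# hypothetical transition matrix for illustration only
# states: 0 = Non-Progressed, 1 = Progressed, 2 = Dead (absorbing)
P = [
    [0.90, 0.07, 0.03],
    [0.00, 0.85, 0.15],
    [0.00, 0.00, 1.00],
]

def run_markov(P, n_cycles, start=(1.0, 0.0, 0.0)):
    """Return the cohort's state-occupancy trace over n_cycles."""
    trace = [list(start)]
    for _ in range(n_cycles):
        prev = trace[-1]
        trace.append([sum(prev[i] * P[i][j] for i in range(3)) for j in range(3)])
    return trace

trace = run_markov(P, 10)
```

Calibration would adjust the entries of P until the model's OS and PFS outputs match the trial-based targets on the following slides.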
Flow: (1) Generate 200 survival curves from trial data, reflecting sampling error; (2) Calibrate model to generate 200 parameter sets (alternative calibration methods may also be used to generate the 200 parameter sets); (3) Bootstrap the 200 parameter sets within PSA
Sample Kaplan-Meier Data

Timepoint        0    4    9   14   19   24
OS   At Risk   100   88   65   47   23    9
     Censored    0    7    9   12   14    7
PFS  At Risk   100   80   48   27   12    3
     Censored    0    6    8    7    7    3
Uncertainty estimates

Estimated standard error: SE(q̂_i) = sqrt( q_i · p_i / n′_i ), where p_i = 1 − q_i
n′_i = n_i − w_i / 2, where w_i is the number of units censored in the interval
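The interval standard error above can be computed directly from the at-risk and censored counts in the Kaplan-Meier table; a minimal sketch (assuming the drop in the at-risk count equals events plus censoring):

```python
from math import sqrt

def interval_q_and_se(at_risk, at_risk_next, censored):
    """Interval event probability q_i and its standard error, using the
    effective sample size n'_i = n_i - w_i / 2 from the formula above."""
    events = at_risk - at_risk_next - censored
    n_eff = at_risk - censored / 2
    q = events / n_eff
    return q, sqrt(q * (1 - q) / n_eff)

# OS interval from timepoint 0 to 4 in the table: 100 at risk, 88 remain, 7 censored
q, se = interval_q_and_se(100, 88, 7)
```

These per-interval SEs are what drive the sampling of the 200 survival curves used as calibration targets.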
Uncertainty in survival curves and
calibration
[Figure: the 200 generated OS curves alongside the corresponding calibrated OS curves]
CEAC comparison
Problem
Pap smear screening practices changed
over time
Calibration targets reflect current and past
screening patterns
Older women (>65 years): Less screening
when they were young
Younger women: Exposed to higher
screening rates at same ages
Annual screening coverage by age
[Figure: % screened (0% to 70%) by age (10 to 100), with separate curves for women aged <65 and 65+]
[Figure: model outputs compared with SEER targets for a single-stage model with single-stage calibration]
Implication
Effects of temporal changes are important
when calibrating longitudinal models to
cross-sectional data
Conclusions
Time is always a limiting factor: with more time, a better solution can almost always be found
Calibration can affect the interpretation of cost-
effectiveness results
In order to characterize the uncertainty in a
calibrated model:
Results should be reported as a range from different
calibrations
Calibration should be included in probabilistic sensitivity
analyses
Uncertainty in calibration targets should be considered
Adjustments may need to be made to account for
temporal shifts in data
Using a combination of calibration methods is likely
the most efficient way to arrive at good calibrations
DISCUSSION