Research Synthesis Methods - 2023 - Zhang - Four alternative methodologies for simulated treatment comparison How could


Received: 7 March 2023 Revised: 23 August 2023 Accepted: 30 October 2023

DOI: 10.1002/jrsm.1681

RESEARCH ARTICLE

Four alternative methodologies for simulated treatment comparison: How could the use of simulation be re-invigorated?

Landan Zhang 1 | Sylwia Bujkiewicz 2 | Dan Jackson 1

1 Statistical Innovation, AstraZeneca, Cambridge, UK
2 Biostatistics Research Group, Department of Population Health Sciences, University of Leicester, Leicester, UK

Correspondence
Dan Jackson, Statistical Innovation Group, AstraZeneca, Cambridge, UK.
Email: daniel.jackson1@astrazeneca.com

Abstract
Simulated treatment comparison (STC) is an established method for performing population adjustment for the indirect comparison of two treatments, where individual patient data (IPD) are available for one trial but only aggregate level information is available for the other. The most commonly used method is what we call ‘standard STC’. Here we fit an outcome model using data from the trial with IPD, and then substitute mean covariate values from the trial where only aggregate level data are available, to predict what the first of these trial's outcomes would have been if its population had been the same as the second. However, this type of STC methodology does not involve simulation and can result in bias when the link function used in the outcome model is non-linear. An alternative approach is to use the fitted outcome model to simulate patient profiles in the trial for which IPD are available, but in the other trial's population. This stochastic alternative presents additional challenges. We examine the history of STC and propose two new simulation-based methods that resolve many of the difficulties associated with the current stochastic approach. A virtue of the simulation-based STC methods is that the marginal estimands are then clearly targeted. We illustrate all methods using a numerical example and explore their use in a simulation study.

KEYWORDS
indirect treatment comparisons, population adjustment, simulation based methods

Highlights

What is already known


• Simulated treatment comparison (STC) is an established method for performing population-adjusted indirect treatment comparisons.
• It is typically used where the results from two trials are compared, one of which only provides aggregate-level data.
• The most commonly used approach, which we call ‘standard STC’, does not in fact involve simulation.
• The other existing approach is to simulate patient profiles.
• Both of these approaches have statistical problems.

Res Syn Meth. 2023;1–15. wileyonlinelibrary.com/journal/jrsm © 2023 John Wiley & Sons Ltd. 1
17592887, 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/jrsm.1681 by University Of Leicester Library, Wiley Online Library on [11/01/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
2 ZHANG ET AL.

What is new
• We propose two new simulation-based methods for performing STC.
• These new approaches resolve most of the difficulties with the current simulation-based approach.

Potential impact for research synthesis methods readers


• Much of the existing literature on STC is non-specific.
• By fully describing the steps involved in this type of analysis, we aim to make the corresponding statistical theory more tangible.
• In addition to presenting two new statistical methods, we examine the history of STC and provide a statistically rigorous account of this methodology.
• This article is intended to provide an accessible introduction for readers who may be unfamiliar with STC, and develop this statistical methodology further.

1 | INTRODUCTION

With an increasing number of new treatments, it becomes even more important for clinicians, patients and regulators to compare the available treatments using comparable estimates and outcomes. In medical statistics, an indirect treatment comparison is frequently used to compare treatment effects from different studies when there is no evidence from head-to-head trials. Standard methods for indirect treatment comparisons1–5 either assume no important differences in the trial populations, incorporate these differences as between-study heterogeneity or try to explain these differences using meta-regression.5 When making indirect comparisons, adjusting for population differences between trials is often desirable to make this comparison more equitable.

A variety of statistical methods for population-adjusted indirect comparisons have been developed for the case where we compare active treatments from two different trials. We will consider the common scenario where one of these trials is the ‘company's trial’, with individual patient data (IPD) available, and the other is the ‘competitor's trial’ with only aggregate level data available.6–10 This aggregate level data are typically provided in published papers and reports. We distinguish between anchored and unanchored analyses by considering if a common comparator treatment is present in both trials. For the anchored case, a common comparator, such as a placebo, exists in each trial, but this is not the case in the unanchored scenario.9

In the anchored case, we only adjust for population differences of effect modifiers (covariates that have impact on treatment efficacy) but do not adjust for the solely prognostic covariates. This is because the indirect comparison is performed by comparing estimated treatment effects, and any imbalances between the prognostic covariates have already been accounted for by within-trial randomisation, as mentioned by Phillippo et al.9 However, in the unanchored case, it is necessary to adjust for both effect modifiers and prognostic variables,9 because the indirect treatment comparison is then performed by comparing average outcomes. In this work, we will consider the anchored case where patients in the company's trial randomly receive treatment A or treatment B, and patients in the competitor's trial randomly receive treatment A or treatment C. Hence treatment A is the common comparator. For brevity, we will also refer to the company's trial as the ‘AB trial’ and the competitor's trial as the ‘AC trial’. The aim is to adjust the AB trial patient outcomes to match the AC trial population, and so estimate the relative effect of treatment B versus A in the AC trial's population. The indirect treatment comparison of treatments B and C can then be performed using standard methods for indirect treatment comparisons.1 This adjusted indirect comparison is performed in the AC trial population because we only have access to aggregate level information for the AC trial, and so we cannot make individual level adjustments for this trial.

Simulated treatment comparison (STC) is an established approach for performing this type of statistical analysis. The history of STC starts with Caro and Ishak,7 who simulate data in an additional arm within a trial without any indirect treatment comparison involving another trial. Ishak et al.11 build upon this idea, proposing to fit a model in the company's trial where IPD are available (referring to this as the ‘index trial’) and using this model to predict what patient outcomes would have been if the trial had been performed in the population of competitor's trial. By using these predicted outcomes in the indirect treatment comparison, a population-adjusted analysis can then be performed in the population of
competitor's trial. Two approaches for making the required predictions are discussed in section 4.2 of Ishak et al.11: we either ‘plug in’ the mean values of the competitor's trial or simulate patient outcomes. Both approaches have statistical problems. Ishak et al. discuss bias with the first, when the link function used in the model is not linear. For the second, Phillippo et al.9 raise concerns about ‘correctly quantifying the uncertainty in the resulting indirect treatment comparison’, and there is also the additional difficulty in simulating correlated patient covariates (as discussed in section 4.2 of Ishak et al., an issue is that the correlation structure of the covariates is typically unknown for the aggregate level data study because this is not usually reported, so covariate correlations will need to be ‘borrowed’ from the IPD study). In addition, there is the problem that the simulation-based method may be subject to an unacceptable amount of Monte Carlo error and may not be reproducible, because the results could be very sensitive to the random seed used. The former, more straightforward ‘plug-in the means’ approach is used in the worked example in NICE DSU Technical Support Document 18: Methods for population-adjusted indirect comparisons in submissions to NICE9 and, in our experience, has become the default ‘standard STC’ because it is so straightforward to implement. However, it is essential to recognise that the two possibilities of plugging in the mean values from the competitor's trial and simulating from the model fitted using the data from the IPD trial have always co-existed in the literature.

Other widely used approaches to population-adjusted indirect comparison include matching-adjusted indirect comparisons (MAIC)6,8–10 and multilevel network meta-regression (ML-NMR).12–14 MAIC is a propensity score reweighting method that calculates weights for the IPD in the company's trial to match the moments of covariates from the competitor's trial. ML-NMR is an extension of standard network meta-analysis (NMA).2,5,15,16 In the ML-NMR model, we specify the individual level model and obtain aggregate level models by integrating over distributions of covariates. Each method has its advantages and disadvantages. For example, MAIC and STC have only been developed to compare two treatments and can be used for both anchored and unanchored cases. MAIC assumes that the distributions of the covariates overlap between trials and can perform poorly when population distributions are significantly different.13 Regression-based methods such as STC do not require overlapping covariates because we can use the regression equation to extrapolate. However, this extrapolation may be sensitive to the model specification and may raise concerns in practice. As mentioned above, STC methods that do not use simulation may lead to bias when the link function used in the outcome model is non-linear.11 Furthermore, it is unclear whether conventional STC targets a marginal or a conditional estimand.17 These are the main reasons we wish to re-invigorate the use of simulation in STC: it resolves these two concerns with the conventional approach. If Monte Carlo simulation is not used, STC may appear to be a misnomer unless one interprets simulation in the context of ‘virtual’ outcomes. However, this is not the usual meaning of the term simulation in statistics. We can apply ML-NMR to any connected network of evidence and obtain estimates in any target population, but methodological development is required for ML-NMR to be applied to time-to-event data.

In this article, we will carefully re-examine the two existing STC methods already in the literature and discussed by both Ishak et al.11 and Phillippo et al.9 We will propose two new ways to use simulation in STC, where we can quantify the uncertainty of treatment effect estimates using principled statistical methods. Our two new methods also greatly reduce concerns about reproducibility and the magnitude of the Monte Carlo error. The rest of this article is structured as follows. Section 2 describes Bucher's method for performing indirect treatment comparisons without population adjustment and its limitations. In Section 3, we describe four different methods for STC. These methods include the standard STC method, with no simulation, and the existing ‘single imputation method’. We also develop the new ‘multiple imputation’ and ‘infinite population’ methods to resolve most of the difficulties with the current simulation-based approach. Section 4 applies all four STC methods to the numerical example in NICE DSU Technical Support Document.9 A simulation study is conducted in Section 5, followed by a discussion in Section 6.

2 | BUCHER'S METHOD

An unadjusted indirect treatment comparison of two randomised controlled trials (Bucher's method1) is conducted without population adjustment. This method is a special case of network meta-analysis, but it has the potential to produce biased results if the distributions of effect modifiers are not balanced across studies. To estimate the relative effect of treatment C versus treatment B, we use the formula $\hat{\Delta}_{BC} = \hat{\Delta}_{AC(AC)} - \hat{\Delta}_{AB(AB)}$. Here, $\hat{\Delta}_{AC(AC)}$ is the estimated relative effect of treatment C versus treatment A in the AC trial population, and $\hat{\Delta}_{AB(AB)}$ is the estimated relative effect of treatment B versus treatment A in the AB trial population. To emphasise the populations in which the indirect treatment comparison is made, we use the notations $(AB)$ and $(AC)$. This emphasis becomes important in the sections that follow where population adjustment is performed.
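
As a minimal sketch of the unadjusted calculation (not code from the article), the following Python snippet forms the C versus B estimate by subtraction and, assuming the two trials are independent, adds the two variances to obtain its standard error. The names `Estimate`, `bucher_indirect` and `se_from_ci`, and all numerical values, are hypothetical. The helper `se_from_ci` additionally assumes a symmetric 95% Wald interval reported on the analysis scale (for example, the log odds ratio).

```python
import math
from dataclasses import dataclass


@dataclass
class Estimate:
    """Relative treatment effect on the analysis scale (e.g., log odds ratio)."""
    effect: float
    se: float


def se_from_ci(lower: float, upper: float, z: float = 1.959964) -> float:
    """Recover a standard error from a symmetric 95% Wald interval (assumption)."""
    return (upper - lower) / (2.0 * z)


def bucher_indirect(d_ac: Estimate, d_ab: Estimate) -> Estimate:
    """Unadjusted indirect comparison of C versus B through common comparator A."""
    effect = d_ac.effect - d_ab.effect              # (C vs A) minus (B vs A)
    se = math.sqrt(d_ac.se ** 2 + d_ab.se ** 2)     # independent trials: variances add
    return Estimate(effect, se)


# Hypothetical inputs: B vs A estimated from the IPD trial, C vs A taken from a publication.
d_ab = Estimate(effect=-0.50, se=0.20)
d_ac = Estimate(effect=-0.80, se=se_from_ci(-1.20, -0.40))

d_bc = bucher_indirect(d_ac, d_ab)
print(f"C vs B: {d_bc.effect:.3f} (SE {d_bc.se:.3f})")
```

The same two-line calculation is reused by the population-adjusted methods; only the B versus A input changes, because it is then predicted for the AC trial population.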

The point estimate of $\hat{\Delta}_{AB(AB)}$ and its variance $\sigma^2_{AB(AB)}$ can be obtained by fitting a regression model to the IPD of the AB trial on treatment group assignment only, where the choice of regression model depends on the type of outcome data. The standard error $\sigma_{BC}$ of the estimate $\hat{\Delta}_{BC}$ is calculated as

$$\sigma_{BC} = \sqrt{\sigma^2_{AC(AC)} + \sigma^2_{AB(AB)}}, \tag{1}$$

where $\sigma_{AC(AC)}$ is the standard error of $\hat{\Delta}_{AC(AC)}$ in the AC trial, and we assume this is available from the published report in the AC trial or can be calculated from other published quantities, such as a confidence interval or p value.

An unadjusted analysis is based on the implicit assumption that the marginal relative effects are similar across different populations. Random effects models for network meta-analysis include between-study heterogeneity. However, estimating between-study heterogeneity based on only two studies is not feasible. In our study, we assume that the effect modifiers account for any between-study heterogeneity. Consequently, when variations exist in the distribution of effect modifiers across trials, the relative effect may no longer remain constant. As a result, using the unadjusted method may lead to biased estimates of the relative effect. To address this, we need to consider the influence of effect modifiers carefully in our analysis to ensure more equitable results. In the next section we present four versions of STC that can be used for this purpose.

3 | FOUR DIFFERENT METHODS FOR STC

An essential step of any STC is to fit an outcome model to the company's trial with IPD available. As explained in the introduction, we consider the anchored case so that the model should include all effect modifiers. Typically, this model will be a generalised linear model of the form

$$g(\theta_t(x_{AB})) = \beta_0 + \beta_1^T x_{AB} + \left(\gamma_t + \beta_2^T x_{AB}\right) I(t = B), \tag{2}$$

where $\theta_t$ is the mean outcome on treatment $t$, for $t \in \{A, B\}$, and $g$ is an appropriate link function. $x_{AB}$ denotes the effect modifier, the term $\beta_1^T$ describes the main effect of the effect modifier and $\beta_2^T$ describes the interaction of the effect modifier with treatment group. The term $I(t = B)$ is an indicator that the treatment $t$ is the active treatment B, so that $\gamma_B$ is the treatment effect of treatment B, relative to treatment A, for a patient with $x_{AB} = 0$. Model (2) is essentially eq. (8) of Phillippo et al.,9 which is also fitted to the ‘AB trial’, where we have not included any prognostic variables. We use the subscript AB in $x_{AB}$ to emphasise the fact that this model is applied to the AB trial data, whereas Phillippo et al.9 place this emphasis on the model parameter, but this is a cosmetic difference.

Phillippo et al.9 state that adding other prognostic covariates might increase precision and improve model fit (see their supplementary materials). Hence, instead of the model (2), we could fit the slightly more general model

$$g(\theta_t(x_{AB})) = \beta_0 + \beta_p^T x_{AB,p} + \beta_1^T x_{AB} + \left(\gamma_t + \beta_2^T x_{AB}\right) I(t = B), \tag{3}$$

where the term $\beta_p^T x_{AB,p}$ is the effect of the additional prognostic variables in the model.

3.1 | The standard ‘no simulation’ method (STC)

In the standard ‘no simulation’ method, we predict what the AB study's estimated treatment effect would have been in the AC trial's population, by ‘plugging in’ the mean covariate values in the AC trial's population into the model fitted to the AB trial data. Furthermore we may determine how the predicted treatment effect depends on the properties of the population of the competitor trial by substituting other representative values of $x_{AC}$. When we use the fitted model (2) or (3), the predicted treatment effect will be $\hat{\gamma}_B + \hat{\beta}_2^T x_{AC}$. Hence the required prediction, and its variance, would need to be calculated from this linear combination. As explained in the supplementary materials of Phillippo et al.,9 the necessary calculations are simplified by re-parameterising models (2) or (3) so that the covariates in the company's trial $x_{AB}$ are centred at the corresponding competitor's trial means. Defining $z_{AB}$ to be these centred values, that is, $z_{AB} = x_{AB} - x_{AC}$, instead of fitting model (2), we equivalently fit the model

$$g(\theta_t(z_{AB})) = \beta_0 + \beta_1^T z_{AB} + \left(\gamma_t + \beta_2^T z_{AB}\right) I(t = B), \tag{4}$$

to the company's trial's data. Note that there is a clash of notation between Equations (2) and (4), for example, the intercepts $\beta_0$ take different values when using the transformed data $z_{AB}$. We then substitute the covariate means of the aggregate data from the AC trial into the outcome model and also use the regression model parameter
estimates to predict the outcomes of the competitor's trial. This is straightforward when using the transformed covariates because, by definition, the transformed covariate means of the population of the competitor's trial are $z_{AC} = 0$. This results in the prediction

$$g\left(\hat{\theta}_t(z_{AC} = 0)\right) = \hat{\beta}_0 + \hat{\gamma}_t I(t = B). \tag{5}$$

Equation (5) is a straightforward formula for prediction, where $\hat{\gamma}_B$ is the predicted effect of treatment B, relative to A, in the AC trial population. The advantage of using the transformed $z_{AB}$ now becomes clear because the predicted treatment effect $\hat{\gamma}_B$ is directly an estimated model coefficient that can simply be extracted from the model fit, reported by statistical software. We now define $\hat{\Delta}_{AB(AC)} = \hat{\gamma}_B$ to emphasise that this is the estimated treatment effect of treatment B, relative to treatment A, but predicted for the AC trial population.

Finally, we estimate the required effect of treatment C, relative to treatment B, in the AC trial population by performing a standard indirect treatment comparison1 as

$$\hat{\Delta}_{BC(AC)} = \hat{\Delta}_{AC(AC)} - \hat{\Delta}_{AB(AC)}, \tag{6}$$

where $\hat{\Delta}_{AC(AC)}$ is the published estimate of the relative treatment effect of C, relative to A, from the competitor's trial. Note that this requires that the link function $g$ relates to the outcome measure used in the indirect comparison. For example, if the outcome is a mean difference, the identity function would be used. However, a binary outcome is most commonly modelled on the log odds scale, in which case $g$ is a logit function. The standard error is calculated as

$$\sigma_{BC(AC)} = \sqrt{\sigma^2_{AC(AC)} + \sigma^2_{AB(AC)}}, \tag{7}$$

where $\sigma^2_{AB(AC)}$ (the variance of $\hat{\gamma}_B = \hat{\Delta}_{AB(AC)}$) is extracted from the model fitted to the company's trial and $\sigma^2_{AC(AC)}$ is either available from the aggregate level data published for the competitor's trial or can be calculated from other published quantities.

3.2 | The single imputation method (STC-SI)

The standard STC method described in Section 3.1 does not involve any simulation. Instead, we plug the covariate means of the competitor's trial into the outcome model fitted to the company's trial to make the required predictions of what results the company's trial would have produced if it had been performed in its competitor's population. It is simple to implement, but it may be biased when the link function $g$ is non-linear.11 Ishak et al.11 propose a second approach for STC using simulation. In this section, we describe this alternative approach.

We follow Ishak et al. in simulating a trial of the same size as the competitor's trial. They use the terms ‘index trial’ and ‘comparator group’ when referring to the trials for which we do, and do not, have IPD available and say, ‘It is important to maintain the sample size of the comparator group when generating virtual event times to ensure the variance estimate is a reasonable approximation’. Hence we simulate a trial where the covariate distributions and the sample size are the same as the competitor's trial but where the company's treatment is the active treatment. We therefore simulate what the company's trial's results would have been if these important features had been the same as its competitor. However, we suggest that it is open to debate as to whether or not it is most appropriate to simulate a trial of the same size as the company's or the competitor's trial, and we return to this in the discussion.

A difficulty is that we do not know all aspects of the population of the competitor's trial. We therefore do not immediately have the information required to perform the simulation. Typically we will only have the means and standard deviations for continuous covariates and sample proportions for binary covariates reported in the publication or trial report for the competitor's trial. We therefore simulate from the competitor's trial population as accurately as we can, given the information available to us. We simulate continuous covariates with their reported means and standard deviations, and binary covariates with their reported proportions, from the competitor's trial but using correlations and distributional assumptions from a different source of information. If the competitor's trial does not report standard deviations, these will also be obtained from another source of information. Typically this information will come from the IPD from the company's trial and we will use this throughout. We describe the steps of this approach as follows:

1. Simulate patient covariates for a trial that is the same size as the competitor's trial with

a. Means and standard deviations reported by the competitor's trial for continuous covariates.
b. Proportions reported by the competitor's trial for binary covariates.
c. Appropriate properties for other types of covariates, for example categorical or count data, could also be
simulated using reported information from the competitor's trial.
d. Correlations and any distributional assumptions (e.g., for continuous covariates) from a different information source where this information is available. Typically this information will come from the IPD in the company's trial.

The main difficulty is simulating correlated covariates of different types, for example, continuous and binary. This is discussed by Ishak et al.,11 who say that ‘individual characteristics should be generated from a multivariate normal distribution’, but they also note that ‘complications arise for categorical predictors since a normal distribution will generate a continuous value rather than 0/1 value’. For computational convenience we will simulate correlated covariates of different types using the ‘SimCorrMix’ package18 in R.19 More specifically, the ‘corrvar2’ function from this package was used to simulate correlated variables (which may be continuous, binary or count) using the properties reported in the aggregate level competitor's trial information and the required correlation matrix computed from the company's trial. By using method = ‘polynomial’ and specifying both the required mean and standard deviation of continuous covariates, we have found that ‘corrvar2’ provides simulated continuous random variables that appear nicely normally distributed and with the correct correlations with other variables. However, a limitation of our use of the ‘corrvar2’ function is that we have to use a normal distribution in situations where continuous covariates are not truly normally distributed even after a transformation, for example where a uniform distribution is required. In the next section, we will discuss this in the context of our example, where the covariate age is known to follow a uniform distribution. In situations where all covariates are continuous, it is simpler to simulate directly from a multivariate normal distribution, possibly using transformed covariates to aid normality. When simulating covariates in steps (a–d) above, suitable aggregate-level information from the competitor trial is required. The methods described by Wier et al.20 may be useful in situations where some values are not reported.

2. Generate the treatment group $t$ for each of the set of simulated patient covariates produced in step 1. Here we allocate simulated patients to the two treatment groups (A and B) in the company's trial. If the treatment allocation ratios are not 1:1 in either trial, either allocation ratio could be justified here, and we return to this issue in the discussion. We now let the notation $(x_{AC'}, t)$ denote a patient's simulated covariates from step 1 and treatment group from this step; the subscript $(AC')$ emphasises that the estimation is performed in the simulated AC trial population.

3. Fit an outcome model to the data from the AB trial. An appropriate parametric model should be used, from which data can easily be simulated, for example model (2) or (3). In the context of a time-to-event outcome, this model could be a parametric survival model that includes the main effects and interactions of effect modifiers with the treatment group, and also the main effects of any baseline covariates that are solely prognostic. We suggest that it is better to include the main effects of prognostic variables as in model (3), provided there are not so many such variables as to render this unfeasible.

4. Compute the estimated linear predictors using the fitted model from step 3 and the simulated data from steps 1 and 2. For example, if we fit model (3) to the data from the AB trial in step 3, we compute the linear predictor $g\left(\hat{\theta}_t(x_{AC'})\right)$ and also obtain $\hat{\theta}_t(x_{AC'})$. For another example, $\hat{\theta}_t(x_{AC'})$ is the probability of the Bernoulli distribution if the outcome is binary:

$$g\left(\hat{\theta}_t(x_{AC'})\right) = \hat{\beta}_0 + \hat{\beta}_1^T x_{AC'} + \left(\hat{\gamma}_t + \hat{\beta}_2^T x_{AC'}\right) I(t = B). \tag{8}$$

5. Complete the simulation procedure by simulating the outcome data $y(x_{AC'}, t)$. We simulate this outcome data using the $(x_{AC'}, t)$ simulated in steps 1 and 2, the estimated linear predictors in step 4 (and any additional parameters estimated in the model fitted at step 3 and required for simulation; for example, the residual variance if this is a linear model or the baseline hazard if this is a Cox model).

6. Having simulated the outcome data in the AB trial, if it had instead been conducted in the AC trial population with the same size as the AC trial, we can now produce the population-adjusted analysis. We now use the notation $\hat{\Delta}_{AB(AC')}$ to denote the estimated treatment effect of treatment B relative to treatment A, using the simulated data from steps 1–5 for the AC trial population. We compute $\hat{\Delta}_{AB(AC')}$ and its variance $\sigma^2_{AB(AC')}$ by fitting the required type of regression model on the treatment group variable using the simulated data (e.g., logistic regression for binary outcomes).

7. Finally, we perform the indirect comparison in the same way as for the previous standard STC method. We compute the population-adjusted treatment effect of treatment C relative to treatment B, as
$$\hat{\Delta}_{BC(AC)} = \hat{\Delta}_{AC(AC)} - \hat{\Delta}_{AB(AC')}, \qquad (9)$$

with standard error

$$\sigma_{BC(AC)} = \sqrt{\sigma^2_{AC(AC)} + \sigma^2_{AB(AC')}}, \qquad (10)$$

where, as in the standard STC method in Section 3.1, $\hat{\Delta}_{AC(AC)}$ is the published estimate of the relative treatment effect of C, relative to A and $\sigma^2_{AC(AC)}$ is either available or calculable from the aggregate level data published for the competitor's trial. Equations (9) and (10) are similar to (6) and (7) but in the former pair of equations $AC'$ is used in the subscript to emphasise that the estimation is based upon simulated data in the AC trial population. Since we only simulate the $AC'$ data once, and we conceptualise the predicted results that would have been obtained if the AB trial had been conducted in the AC trial population as missing data, we refer to this approach as the ‘single imputation method’.

3.3 | The multiple imputation method (STC-MI)

The single imputation method in Section 3.2 resolves concerns about the bias when the link function g is not linear but at the cost of introducing new issues. One issue is that we only simulate the data once in the single imputation method, which is not robust. This is because if a different random seed is used, then substantively different results may be obtained. Another related issue is that the results may only be reproducible if the random seed is judiciously specified prior to analysis. Finally, the single imputation model fails to account for parameter uncertainty because this is not allowed in step 4, where the estimated linear predictor is used as if it were the truth.

Repeating the single imputation method multiple times, giving rise to what we call the ‘multiple imputation method’, is a more advanced (but also more complicated and computationally intensive) approach. We propose using a resampling approach to produce robust results and accommodate parameter uncertainty. Here we resample patients in the AB trial within treatment groups with replacement to produce a new representative sample. By creating $N_{draw}$ resamples in this way and applying the single imputation method to each of these in the way described in the previous section, we produce adjusted estimates $\hat{\Delta}^{k}_{AB(AC')}$ for $k = 1, \ldots, N_{draw}$. These $N_{draw}$ estimates are then averaged to provide a final $\hat{\Delta}_{AB(AC')}$ that can be used in Equation (9) in exactly the same way as in the single imputation approach, to produce the adjusted estimate. By using a relatively large $N_{draw}$, for example, $N_{draw} = 50$ or $N_{draw} = 100$, or an even larger value, we obtain results that are robust to the random seed used. If the random seed (or seed for each resample) is specified prior to analysis, the analysis can be made reproducible. By resampling the AB trial multiple times and obtaining different estimated regression parameters for each simulated dataset in step 4 in Section 3.2, we accommodate parameter uncertainty in the outcome model fitted to the AB trial.

We call this the ‘multiple imputation method’ because, as in the single imputation method, we conceptualise the predicted results, that would have been obtained if the AB trial had been conducted in the AC trial population, as missing data. However, now we simulate multiple realisations of this missing data, analyse them as if they were observed and average the resulting estimates, as in Rubin's rules for multiple imputation.

However, a difficulty is that we must obtain a valid $\sigma^2_{AB(AC')}$ to use in Equation (10), which is now the variance of the average population-adjusted estimate of treatment B relative to A across the $N_{draw}$ replications. Rubin's rules provide an estimated variance formula that allows for within and between imputation variation of estimators,21 but this is for the case where a more conventional type of statistical analysis is performed, that is, a single statistical model is adopted, and some data used for model fitting are missing. Here a much more unusual statistical procedure is used, whereupon fitting a statistical model at step 4 in Section 3.2, we then use this to make predictions for an entirely new set of patients at step 5 and fit a substantively different statistical model at step 6 to provide our estimator. In particular, step 6, where an additional statistical model is used, does not align with the assumptions of Rubin's rules. Hence, rather than attempting to use or modify the usual Rubin's rules variance formula,21 non-parametric bootstrapping was used. Here patients are resampled, with replacement and within each treatment group, to produce a bootstrap sample. The multiple imputation method is then implemented for each bootstrap sample. Finally, $\sigma^2_{AB(AC')}$ is obtained as the sample variance of the estimates from all bootstrap replications. One consequence is then that the resamples made for the bootstrap replications are then resamples of resamples of the original data (because the bootstrap samples are themselves resamples). This can have implications for the stability of the standard error, as we will illustrate in our simulation study in Section 5.

This method is very similar to a double bootstrap approach which has been previously used to, for example, estimate the correlation with an interval used as an input in the multivariate meta-analysis of mixed outcomes.22,23
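The resampling structure of STC-MI, including the nested bootstrap used for its variance, can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: all function names are hypothetical, and `toy_single_imputation` is only a stand-in (an unadjusted log odds ratio with a 0.5 continuity correction) for steps 1 to 6 of Section 3.2, which a real analysis would substitute.

```python
import math
import random
from statistics import mean, variance


def resample_within_arms(patients, rng):
    """Resample patients with replacement, separately within each treatment arm."""
    arms = {}
    for patient in patients:
        arms.setdefault(patient["trt"], []).append(patient)
    resampled = []
    for trt in sorted(arms):
        resampled.extend(rng.choices(arms[trt], k=len(arms[trt])))
    return resampled


def stc_mi_estimate(patients, single_imputation, n_draw, rng):
    """Average the single imputation estimate over n_draw resamples of the AB trial."""
    draws = [single_imputation(resample_within_arms(patients, rng), rng)
             for _ in range(n_draw)]
    return mean(draws)


def stc_mi_variance(patients, single_imputation, n_draw, n_boot, rng):
    """Sample variance of the STC-MI estimate across n_boot bootstrap samples.

    Each bootstrap sample is itself resampled n_draw times inside
    stc_mi_estimate, which gives the 'resamples of resamples' structure.
    """
    boot = [stc_mi_estimate(resample_within_arms(patients, rng),
                            single_imputation, n_draw, rng)
            for _ in range(n_boot)]
    return variance(boot)


def toy_single_imputation(patients, rng):
    """Hypothetical stand-in for steps 1-6 of Section 3.2 (an assumption).

    Returns the unadjusted log odds ratio of B versus A with a 0.5
    continuity correction. rng is unused here, but it would drive the
    covariate and outcome simulation in a real single imputation step.
    """
    counts = {"A": [0, 0], "B": [0, 0]}  # [non-events, events] per arm
    for patient in patients:
        counts[patient["trt"]][patient["y"]] += 1
    (a0, a1), (b0, b1) = counts["A"], counts["B"]
    return math.log((b1 + 0.5) * (a0 + 0.5) / ((b0 + 0.5) * (a1 + 0.5)))


# Illustrative AB trial: 100 patients per arm with made-up event probabilities.
rng = random.Random(2023)  # seed fixed before the analysis for reproducibility
ab_trial = [{"trt": trt, "y": int(rng.random() < p)}
            for trt, p in (("A", 0.6), ("B", 0.3)) for _ in range(100)]

estimate = stc_mi_estimate(ab_trial, toy_single_imputation, n_draw=50, rng=rng)
var_ab = stc_mi_variance(ab_trial, toy_single_imputation,
                         n_draw=50, n_boot=30, rng=rng)
print(round(estimate, 3), round(math.sqrt(var_ab), 3))
```

Passing a single seeded generator through every stochastic step is one way to make the whole analysis reproducible from one seed; the values of `n_draw` and `n_boot` mirror those used later in the numerical example.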

3.4 | The infinite population method (STC-IP)

Finally, we consider a third STC method that involves simulation. This approach is conceptually different to the first two methods. This is because it translates what Equation (9) implies for the estimated effect of treatment B, relative to treatment A, in the AC trial population, where this population is of infinite size. Populations are usually conceptualised as being of infinite size in statistics, and this elegantly resolves questions about the size of the counterfactual AB study in the AC population that should be simulated (see also Section 3.2, where we discuss Ishak's recommendation in this regard). This is because we no longer simulate realisations of the AB trial as if its key characteristics had been that of the AC trial, where previously this had included both its covariate distributions and its size. Instead we now directly translate Equation (9) to the AC trial population that is regarded as infinite.

To apply this method, we follow the single imputation method in steps 1–6, but in the first step, we simulate a very large sample of, say 100,000, to approximate the infinite AC trial population. An even larger sample is even better if it is computationally feasible, and we use 1:1 randomisation in step 2. This gives us a point estimate of $\hat{\Delta}_{AB(AC')}$ for our adjusted analysis in step 6.

We use non-parametric bootstrapping to compute $\sigma^2_{AB(AC')}$ in Equation (10) and so propagate the uncertainty in the model fitted to the IPD from the AB trial. Here we create sufficient bootstrap samples by resampling patients of the AB trial with replacement in each treatment group. For each bootstrap sample, we obtain the point estimate $\hat{\Delta}_{AB(AC')}$ by applying the single imputation method to the resampled AB dataset and the large sample of $AC'$ simulated earlier. Then $\sigma^2_{AB(AC')}$ is calculated using the sample variance of all point estimates from bootstrap replications.

4 | APPLICATION TO A NUMERICAL EXAMPLE

In this section, we apply all four STC methods described above to the numerical example from NICE DSU Technical Support Document.9 This is a simulated example but has been used to illustrate methodological work in the past.24

4.1 | Data generation

In this example, 500 patients in the company's trial (the AB trial) receive treatment A (a common comparator such as a placebo) or treatment B with equal proportions of 0.5. Similarly, 300 patients in the competitor's trial randomly receive treatments A and C with equal proportions of 0.5. The covariates considered are age and sex. The age of the patients from the company's trial is uniformly distributed between 45 and 75, and the age of the patients from the competitor's trial is uniformly distributed between 45 and 55. In the company's trial, 64% of patients are female, while in the competitor's trial, 80% of patients are female. The IPD are available from the company's trial, and only aggregate data are available from the competitor's trial. The outcome variable of interest is binary and is simulated from the logistic model

$$\operatorname{logit}(p_{it}) = 0.85 + 0.12\,\text{male}_{it} + 0.05(\text{age}_{it} - 40) + \bigl(\beta_t - 0.08(\text{age}_{it} - 40)\bigr) I(t \neq A), \qquad (11)$$

where $p_{it}$ is the probability of an event for individual patient $i$ receiving treatment $t$, for $t \in \{A, B, C\}$, $\beta_B = -2.1$ if the individual patient receives treatment B, $\beta_C = -2.5$ if the individual patient receives treatment C, and $I(t \neq A)$ is an indicator function for $t \neq A$. The true relative effect $\Delta_{BC}$, that is, the log OR for treatment C versus B, is conditional (on covariates age and sex) and calculated by $\beta_C - \beta_B = -2.5 - (-2.1) = -0.4$. Thus treatment C has better performance than treatment B if the outcome is taken as an adverse event. It is worth noting that from Equation (11), the conditional relative effect of treatment C versus B is the same for all ages because it does not depend on age. However, the calculation of conditional relative effects of B versus A and C versus A will both depend on the individual's age.

The covariate age is an effect modifier, and the patients enrolled in the competitor's trial are younger on average than the patients in the company's. Hence we require population adjustment in the indirect comparison to make this more equitable. The patients of the AC trial were simulated at the IPD level but aggregated to the number of events in each group, so that only the number of events in each treatment group, the proportion of male patients and the mean age (and its standard deviation) are available for analysis.9

4.2 | Marginal and conditional estimands

There are two main ways to define the true value of the relative effect of treatment C versus B in AC trial population, $\Delta_{BC(AC)}$: the marginal relative effect and the conditional relative effect. The topic of the most appropriate estimator to use has been debated particularly by Remiro-Azocar et al.17 and Phillippo et al.25 The conditional relative effect implied by model (11) is calculated at the individual

level because the treatment effect $\beta_t$ is the coefficient of logistic regression, which is conditional on the effects of covariates. The marginal relative effect is the average treatment effect at the population level in the AC trial population. The conditional and marginal effects differ when the effect size is not collapsible,26,27 as is the case here.

Which type of treatment effect standard STC targets is unclear, because the calculation of its relative effect $\hat{\Delta}_{BC(AC)}$ involves plugging in the covariate means into a multivariable regression fitted to the AB trial (and so the resulting estimated relative effect is conditional) and comparing this estimated relative treatment effect to an unadjusted, and so marginal, estimate from the AC trial. The STC with simulation methods (STC-SI, STC-MI and STC-IP) however clearly target the marginal relative effect because we simulate the AC trial population and estimate marginal treatment effects from this.

This article uses the conditional relative effect $\Delta_{BC} = \beta_C - \beta_B = -0.4$ (which is the difference of log odds ratios) as the true effect in this numerical example and the simulation study that follows in Section 5. This conditional relative effect is based on the model parameters from (11). One way to calculate the marginal relative effect is by using the Monte Carlo approach to compute an otherwise complicated integral. More specifically, we simulate a very large sample of AC trial covariates, randomly assign patients to receive treatment B or C, calculate the probability of an event using Equation (11), simulate the outcomes and calculate the unadjusted log OR comparing treatments B and C. We simulated 1,000,000 patients' covariates and outcomes in this way and obtained a marginal relative effect of −0.3796. The marginal relative effect differs only slightly from the conditional relative effect. Hence we will take −0.4 as the true effect, in both this section and the simulation study in Section 5.

4.3 | Statistical methods

In this section, we describe how we apply all four STC methods described more generally in Section 3 to our numerical example. We will compare their results to an unadjusted indirect comparison (Bucher's method1).

4.3.1 | The unadjusted method

To perform the unadjusted analysis with a binary outcome, we obtain the point estimate $\hat{\Delta}_{AB(AB)}$ and its variance $\sigma^2_{AB(AB)}$ by fitting a logistic regression model to the IPD of AB trial on treatment group only, and so calculating an unadjusted treatment effect. The standard error $\sigma_{BC}$ of the estimate $\hat{\Delta}_{BC}$ is calculated as in Equation (1) assuming that $\sigma^2_{AC(AC)}$ is available from the published report of AC trial.

4.3.2 | The no simulation method (STC)

For the no simulation method, Equation (2) was used to produce the adjusted estimate where the outcome is binary, g is the logit function, $\theta_t$ is the mean outcome and $X_{AB}$ is the covariate age, which is the effect modifier. This version includes no prognostic variables, but we apply it because it is the standard and the primary analysis in the supplementary materials of NICE DSU Technical Support Document.9 The relative effect $\hat{\Delta}_{AB(AC)}$ is the estimated model coefficient that can simply be extracted from the model fitted as Equation (4). Then we perform the indirect comparison as described in Equation (6) and calculate the standard error as in Equation (7), where $\hat{\sigma}^2_{AB(AC)}$ can be extracted directly from the model fitted to the company's trial.

4.3.3 | The single imputation method (STC-SI)

For the single imputation method, we follow the steps in Section 3.2. In the first step, we simulate the patient covariates with mean and standard deviation estimated from the AC trial, but the correlation between age and sex estimated from the AB trial. This is because the correlation between covariates in the aggregate level AC trial information is not reported. We allocate half the patients with treatment A and half with treatment B to maintain the 1:1 randomisation as discussed in step 2. Then we follow step 3 to fit the outcome model of Equation (3) to the IPD of AB trial, where $x_{AB,p}$ is an indicator for sex and $x_{AB}$ is age. In step 4, we estimate the linear predictor $\hat{\theta}_t(x_{AC'})$ for the simulated patient covariates using the fitted model. Here the linear predictor $\hat{\theta}_t(x_{AC'})$ is the mean of the Bernoulli distribution because we have the binary outcome. In step 5, the individual outcomes are then simulated from a Bernoulli distribution (or Binomial distribution) with the estimated means. In step 6, the relative effect $\hat{\Delta}_{AB(AC')}$ of treatment B versus A in simulated $AC'$ trial is estimated by fitting the logistic regression on treatment. Finally, the indirect comparison is performed using Equation (9) in step 7, with the standard error calculated as in Equation (10).

In this example, age is uniformly distributed, but a normal distribution is used to simulate the covariates in

the AC trial. This is a limitation of using the SimCorrMix R package. In practice, the IPD from the AC trial are not observed, so we would not know that this is necessarily an issue, although we might suspect it is due to the corresponding covariate distribution in the AB trial, or even from common-sensical reasoning. Although we are content to simulate from a normal distribution with standard deviation implied by the uniform distribution, and so a normal distribution that resembles the uniform distribution as closely as possible in terms of its first two moments, this does involve an element of pragmatism. We advocate that analysts use the most sophisticated simulation procedures available to them, and we regard the SimCorrMix package as impressive but not a panacea.

4.3.4 | The multiple imputation method (STC-MI)

We apply the STC-SI method multiple times for the multiple imputation method to allow for parameter uncertainty and make the results more robust. Here we simulate $N_{draw} = 50$ resamples of AB trial with replacement and within the treatment group to maintain the ratio in randomisation of each trial (see Section 3.3). For each resample, we apply the STC-SI method and obtain the point estimate of $\hat{\Delta}^{k}_{AB(AC')}$ for $k = 1, \ldots, N_{draw}$ (see Section 4.3.3), and then take the average value of the point estimate of each resample to produce the final estimate of $\hat{\Delta}_{AB(AC')}$. As explained in Section 3.3, we use bootstrapping to calculate the corresponding standard error. Here we simulate $N_{bs} = 30$ bootstrap replications by resampling the patients of the AB trial with replacement and within the treatment group. Specified seeds were set for any stochastic method (simulation, resampling, bootstrapping) to ensure reproducibility.

4.3.5 | The infinite population method (STC-IP)

When applying the infinite population method, we repeat steps 1–6 of this method discussed in Section 4.3.3 to calculate the point estimate $\hat{\Delta}_{AB(AC')}$, except that we simulate a substantial sample (100,000) of patients to approximate the AC trial population. Since we do not maintain the AC trial's sample size when simulating patients, the variance cannot be simply extracted from the model fitted to the simulated $AC'$ trial. Instead, we use the non-parametric bootstrapping to propagate the uncertainty of parameters in model (2) or (3) and compute $\sigma^2_{AB(AC')}$ for this purpose, see Section 3.4. We create $N_{bs} = 30$ bootstrap samples by resampling patients of the AB trial with replacement in each treatment group. The random seed specification was also set prior to analysis to ensure reproducible results. However, in this method, a very large sample is simulated, and the only other stochastic method is bootstrapping to get standard errors, so the seed setting is not essential.

4.4 | Results

The resulting five different estimates of $\Delta_{BC}$ (using the four different methods for STC and also the unadjusted indirect treatment comparison using Bucher's method), and the corresponding 95% confidence intervals, are summarised in Table 1. These estimates and confidence intervals are visualised in Figure 1. These results show that all STC methods performed better than the unadjusted comparison in the sense that they reversed its incorrect directional estimated effect. However, all STC methods result in similar estimates and confidence intervals, and all five confidence intervals contain −0.4. It is therefore difficult to determine which method is best on the basis of this single example. In order to compare the performance of these five methods more meticulously, a simulation study was performed, and this is described in the next section.

TABLE 1 Summary of $\hat{\Delta}_{BC}$ for unadjusted analysis and $\hat{\Delta}_{BC(AC)}$ for all four STC methods with 95% confidence intervals.

Method                        Estimates   95% CI
True                          −0.400
Unadjusted (Section 4.3.1)     0.325      (−0.481, 1.130)
STC (Section 4.3.2)           −0.247      (−1.180, 0.686)
STC-SI (Section 4.3.3)        −0.165      (−1.032, 0.701)
STC-MI (Section 4.3.4)        −0.186      (−1.143, 0.770)
STC-IP (Section 4.3.5)        −0.257      (−1.183, 0.669)

Note: The results of the unadjusted method and standard STC are already given in supplementary materials of Phillippo et al.9

FIGURE 1 Comparison of $\hat{\Delta}_{BC}$ for unadjusted analysis and $\hat{\Delta}_{BC(AC)}$ for all four STC methods.
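The simulation-based estimates and confidence intervals reported for this example are obtained by combining two relative effects as in Equations (9) and (10). A minimal Python sketch of that final step is given below. The function name is hypothetical, the inputs are made-up illustrative values rather than figures from Table 1, and a Wald-type 95% interval on the log odds ratio scale is assumed.

```python
import math


def anchored_indirect_comparison(d_ac, se_ac, d_ab, se_ab, z=1.959964):
    """Combine two relative effects on the log odds ratio scale."""
    d_bc = d_ac - d_ab                        # Equation (9)
    se_bc = math.sqrt(se_ac**2 + se_ab**2)    # Equation (10)
    return d_bc, se_bc, (d_bc - z * se_bc, d_bc + z * se_bc)


# Hypothetical inputs for illustration only (not taken from Table 1).
d_bc, se_bc, (lower, upper) = anchored_indirect_comparison(
    d_ac=-1.4, se_ac=0.3, d_ab=-1.0, se_ab=0.4)
print(f"{d_bc:.3f} ({lower:.3f}, {upper:.3f})")  # prints: -0.400 (-1.380, 0.580)
```

Because the two trials are independent, the variances add, so the adjusted comparison can never be more precise than the less precise of its two inputs.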

TABLE 2 Bias and Monte Carlo standard error of $\hat{\Delta}_{BC}$ for unadjusted analysis and $\hat{\Delta}_{BC(AC)}$ for all four STC methods in the simulation study.

                                  Unadjusted      STC             STC-SI          STC-MI          STC-IP
Bias and Monte Carlo SE           0.696 (0.013)   0.003 (0.015)   0.029 (0.018)   0.030 (0.016)   0.005 (0.015)
Estimated coverage probability    0.592           0.941           0.883           0.947           0.942
Average length of 95% CI          1.625           1.876           1.739           1.959           1.908

Note: Also reporting the estimated coverage probability and the average length of 95% confidence intervals.
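The quantities in Table 2 can be computed from the per-replication estimates and standard errors along the following lines. This Python sketch is illustrative only: the function name is hypothetical, the replications are toy normal draws rather than STC output, Wald-type 95% intervals are assumed, and the Monte Carlo standard error of the bias is taken to be the usual sample standard deviation of the estimates divided by the square root of the number of replications.

```python
import math
import random
from statistics import mean, stdev


def performance_measures(estimates, std_errors, truth, z=1.959964):
    """Bias, its Monte Carlo SE, coverage and mean length of Wald 95% CIs."""
    n_sim = len(estimates)
    covered = sum(abs(est - truth) <= z * se
                  for est, se in zip(estimates, std_errors))
    return {
        "bias": mean(estimates) - truth,
        "mcse_bias": stdev(estimates) / math.sqrt(n_sim),
        "coverage": covered / n_sim,
        "mean_ci_length": mean([2 * z * se for se in std_errors]),
    }


# Toy replications: unbiased estimates with a correctly specified SE of 0.5.
rng = random.Random(1)
truth = -0.4
std_errors = [0.5] * 1000
estimates = [rng.gauss(truth, 0.5) for _ in range(1000)]
results = performance_measures(estimates, std_errors, truth)
print({name: round(value, 3) for name, value in results.items()})
```

With a correctly specified standard error, as in this toy input, the estimated coverage should sit near the nominal 0.95; coverage well below that signals bias, standard errors that are too small, or both.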

5 | SIMULATION STUDY

In the previous section, we found that all four methods for STC perform very similarly for the simulated dataset used in NICE DSU Technical Support Document.9 In this simulation study, this scenario is replicated 1000 times so that their performance can be more accurately assessed. The same codes used by Phillippo et al.9 were used to produce the replications, so that each replication has exactly the same properties as the numerical example in the previous section. As in Phillippo et al. and so in the example in Section 4, the simulated data from the competitor AC trial was reduced to the aggregate level, so that only the number of events in each treatment group, the proportion of patients who are male and the mean age (and its standard deviation) are available for analysis. The five methods used in Section 4 were applied to each replication, so that their properties across all 1000 realisations of the simulated datasets could be assessed. The main quantities of interest are the bias of estimates of $\Delta_{BC(AC)}$ (difference between average $\hat{\Delta}_{BC(AC)}$ and −0.4), the coverage probability of the corresponding 95% confidence intervals and the average confidence interval length.

The main results of the simulation study are summarised in Table 2 where biases are shown with Monte Carlo standard errors in parentheses. The main conclusion is that all four STC methods are effective in eliminating the bias from an unadjusted analysis. The STC-MI and STC-SI biases are very similar as expected, because these two methods are very closely related, where the former method is simply the repeated use of the latter method using resampled datasets. The Monte Carlo standard error of STC-MI is, however, slightly smaller than the STC-SI Monte Carlo standard error, which is likely because of its greater stability due to being based on multiple resamples. One consequence of this slightly smaller Monte Carlo standard error is that the lower boundary of the corresponding 95% confidence interval for the bias is only slightly less than zero, and so the ramification of the greater precision of STC-MI is that a little bias from it may be harder to rule out. The standard STC method has performed well, suggesting that any issues relating to bias due to a non-linear link function g are unimportant. STC-IP also performs well, but our simulation study provides proof of concept that all four methods for STC are effective in removing bias from an unadjusted analysis.

The bias resulting from the unadjusted analysis has undesirable consequences for the coverage probability of its confidence interval, which is substantively below the nominal value of 0.95. The coverage probability for STC-SI is also too low. This is because only one simulated AB trial, in the AC trial population, is produced where the uncertainty in the model fitted to IPD from the company's trial is not taken into account. One way to improve this coverage probability is to propagate the uncertainty in this model in the same way as in STC-IP, but the resulting hybrid approach would then be STC-IP in a finite population of the same size as the competitor's trial. The other three STC methods, including the conventional approach, have coverage probabilities close to the nominal value.

The poor coverage probability of the unadjusted approach is also due to its shorter average confidence interval length (Table 2). All other methods perform population adjustment, and there is some loss of precision when performing this type of adjustment, so narrower unadjusted confidence intervals were expected. The poor coverage probability of STC-SI appears to be entirely due to its shorter average confidence interval length than the other STC methods, again as expected because this method does not consider parameter uncertainty. All the other three STC methods perform well, and the standard STC method results in the shortest average confidence interval length. Hence this very simple and standard approach is adequate in this scenario, as might be expected because the marginal and conditional treatment effects are so similar, suggesting that the non-linearity of the link function g is not a serious issue.

In Figure 2, we show boxplots describing the distributions of the widths of the confidence interval for all five methods across all 1000 simulations, which reveals additional insights. The STC-MI method occasionally produces very wide 95% confidence intervals, for example, intervals that are wider than 3. Although all methods occasionally produce wider confidence intervals, STC-MI

appears to be especially prone to this. On closer inspection of the simulated datasets, this was found to be a consequence of the resamples of resamples required in the bootstrapping: as explained in Section 3.3, we resample to obtain the point estimate, and the bootstrapping procedure involves taking further resamples. In some resamples of resamples, the event rate in one of the study arms was occasionally very low, resulting in quite extreme bootstrap replications and hence a large estimated standard error. Hence we can explain this phenomenon, and this might reasonably be regarded as a disadvantage of this method.

FIGURE 2 Boxplot of widths of 95% confidence intervals of estimates using the unadjusted method and STC methods in the simulation study.

To summarise, all methods have performed as one might expect. In particular, the unadjusted analysis has performed poorly. The STC-SI method performed well in terms of removing bias but less well in terms of its coverage probability. All the other three STC methods performed well, including the standard approach. However, these results may not generalise to other settings, for example, in a situation where the marginal and conditional relative effects are more different. If this is the case then we expect to find more apparent differences in results between the standard STC method and the simulation-based STC methods. A more extensive simulation study investigating all STC methods, and also other methods for population adjustment, in different scenario settings would be helpful as a complete guide for applying population adjustment methods.

6 | DISCUSSION

As presented in this article, there are at least four different ways to conduct STC, all with their strengths and drawbacks. The commonly used ‘plug-in means’ approach is convenient for calculating estimates without simulation, despite the fact that the STC method's concept originated with simulations. In this article, we hope to put the ‘S’ for simulation back into the STC method for two main reasons. The first is to avoid the bias that can be caused by a non-linear link function; the second is that simulation-based STC methods clearly target the marginal estimands. There are still, however, a number of limitations and imperfections in the methods in their current form and a lack of consensus in their use.

In practice, one of the most challenging aspects of performing STC, and any other statistical method for population adjustment, is choosing appropriate covariates to adjust for. In our simulated numerical example, we know that age is an effect modifier, whereas sex is prognostic, but this type of information will not be known with certainty in real applications. Clinical input from experts, who understand the application area, will often be crucial to determine which covariates should be adjusted for. Statistical criteria may also be used for covariate selection, for example, from descriptive statistics and statistical modelling using individual patient data from the company's trial. Published aggregate-level information, such as the subgroup analysis results, may also be valuable. In practice, covariate selection will likely depend on both clinical input and statistical results. Despite this, uncertainty concerning which covariates to include will be almost inevitable. Sensitivity analyses that explore the implications of using different covariates will therefore often be necessary in practice. Further complications may arise due to covariate information not being available or due to the desire to adjust for more variables than the available data allows.

Another challenge for using STC is that there are different ways of applying it in practice. The size of the trial to simulate is an open question. The STC-SI method follows the suggestion from Ishak et al.11 to maintain the size of the AC trial when simulating the individual profiles. However, STC-IP uses an infinite population where we simulate a substantial sample of patients and propagate model uncertainty using bootstrapping. When we simulate the treatment group, 1:1 randomisation is the most straightforward and efficient choice, but if the trials have other randomisation ratios, then this could be used instead to make the simulated trial more realistic.

We have used the ‘SimCorrMix’ package18 to simulate correlated variables in R. This package's advantage is that it can generate correlated continuous, binary or ordinal variables. This package generates variables using a normal distribution and then transforms variables to the desired distributions. However the age covariate in our

numerical example and the simulation study is uniformly also deal with both anchored and unanchored pairwise
distributed. In practice, we do not know the true distribu- comparison, but it relies on the strong assumption that
tion of covariates in the AC trial. Simulation techniques that overcome this limitation should be considered, and we are currently exploring alternatives. One issue is that the function ‘corrvar2’ requires a random seed as an input (with default = 1234). This is undesirable because we usually set the seed only once, at the beginning of a stochastic statistical analysis, to ensure reproducibility. We resolved this issue by setting a different seed in each call of the ‘corrvar2’ function.

There are two main ways to calculate the true value of the relative effect of treatment C versus B in the AC trial population, $\Delta_{BC(AC)}$: the marginal relative effect and the conditional relative effect. This topic has been debated particularly between Remiro-Azócar et al.17 and Phillippo et al.,25 and attracts ongoing attention and discussion from other researchers.28–30 It is clear that MAIC methods target marginal relative effects. See Remiro-Azócar et al.,17 and the response of Phillippo et al.,25 for discussion related to the type of conditional treatment effects that ML-NMR targets. STC methods that involve simulation target marginal treatment effects, but it is not clear what type of effect the standard ‘no simulation’ method targets. By putting the simulation back into STC, one consequence of our work is to help resolve this particular concern.

A standard network meta-analysis can incorporate, but not account for, the differences in the trial populations as between-study heterogeneity; however, this approach usually requires substantially more than two studies. Alternatively, these discrepancies can also be accounted for by using meta-regression,5 where we can use estimated study means as covariates, but this also requires a larger number of studies. Our investigations show that the standard STC method performs well for our example, which was taken from the NICE DSU Technical Support Document.9 This is because all STC methods perform similarly in our simulation study. However, this need not be the case more generally. Furthermore, in many applications, it is only by applying all four methods that we are able to confirm that they give similar results. We expect the simulation-based STC to perform better than the standard STC method under certain circumstances; for example, when the marginal and the conditional estimands differ more. Future work should explore all four methods in different situations.

Different statistical methodologies have their advantages and disadvantages, for example, MAIC can [...] the between-study covariate distributions should overlap. ML-NMR can compare treatments from multiple studies as long as they are connected in the network, but this method cannot currently be used for time-to-event data. We are not advocating STC intrinsically, and acknowledge that there are different ways to perform this type of analysis, but we hope that this article will help to make the methods more accessible to analysts.

To summarise, we have reviewed the standard STC and the STC-SI methods and have proposed two new simulation-based methods (STC-MI and STC-IP). A familiar numerical example has been used to illustrate the application of the STC methods, for which R code is provided in the supplementary materials to assist the applied analyst. Results have been compared with the unadjusted method. All four STC methods have better performance than the unadjusted method in our explorations because there are important differences in the populations between our trials. The simulation study provides additional information about the properties of the STC methods and provides proof of concept that STC is a fully viable approach. We hope to see STC receive greater attention in both applied and methodological statistical work, and that our article will facilitate this.

AUTHOR CONTRIBUTIONS
Landan Zhang: Conceptualization; investigation; methodology; visualization; writing – original draft; writing – review and editing; formal analysis. Sylwia Bujkiewicz: Writing – review and editing; methodology; supervision. Dan Jackson: Conceptualization; investigation; writing – original draft; writing – review and editing; supervision; methodology; validation.

ACKNOWLEDGEMENTS
The authors thank Owain Saunders for useful discussion that motivated the multiple imputation method.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
All data are simulated and are provided in the supplementary materials.

ORCID
Landan Zhang https://orcid.org/0009-0003-1947-2032
Dan Jackson https://orcid.org/0000-0002-4963-8123
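
As a supplementary illustration of the Discussion's distinction between marginal and conditional relative effects, the following minimal sketch contrasts two calculations under a logistic outcome model: plugging in the mean covariate value, and averaging predicted risks over the whole covariate distribution before forming the contrast. It is written in Python rather than the R of the supplementary materials, and it is not the authors' code. The coefficients, the normal covariate distribution and all function names are hypothetical, chosen only for illustration, and a deterministic grid of quantiles stands in for stochastic simulation of patient profiles.

```python
import math
from statistics import NormalDist


def expit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


# Hypothetical outcome-model coefficients (NOT taken from the article):
# intercept, covariate effect, and conditional log odds ratio of treatment.
B0, B_X, B_TRT = -1.0, 1.5, 1.0
# Hypothetical covariate distribution in the target population.
X_MEAN, X_SD = 0.5, 1.0


def plug_in_log_or():
    """Predict both arms at the mean covariate value only, then contrast."""
    p_ctrl = expit(B0 + B_X * X_MEAN)
    p_trt = expit(B0 + B_X * X_MEAN + B_TRT)
    return logit(p_trt) - logit(p_ctrl)


def marginal_log_or(n_points=10_000):
    """Average predicted risks over the covariate distribution, then contrast.

    A fixed grid of mid-point quantiles replaces random draws, so the
    result is reproducible without setting a seed.
    """
    dist = NormalDist(X_MEAN, X_SD)
    xs = [dist.inv_cdf((i + 0.5) / n_points) for i in range(n_points)]
    p_ctrl = sum(expit(B0 + B_X * x) for x in xs) / n_points
    p_trt = sum(expit(B0 + B_X * x + B_TRT) for x in xs) / n_points
    return logit(p_trt) - logit(p_ctrl)


plug_in = plug_in_log_or()
marginal = marginal_log_or()
print(f"plug-in log OR:  {plug_in:.3f}")
print(f"marginal log OR: {marginal:.3f}")
```

With no treatment-covariate interaction in this toy model, the plug-in contrast simply returns the conditional coefficient, whereas the averaged contrast is closer to zero. This gap between the two scales is the kind of difference the Discussion has in mind when the link function is non-linear, and it vanishes when the covariate has no effect on the outcome.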

REFERENCES

1. Bucher HC, Guyatt GH, Griffith LE, Walter SD. The results of direct and indirect treatment comparisons in meta-analysis of randomized controlled trials. J Clin Epidemiol. 1997;50(6):683-691.
2. Lumley T. Network meta-analysis for indirect treatment comparisons. Stat Med. 2002;21(16):2313-2324.
3. White IR, Barrett JK, Jackson D, Higgins JP. Consistency and inconsistency in network meta-analysis: model estimation using multivariate meta-regression. Res Synth Methods. 2012;3(2):111-125.
4. Jackson D, Veroniki AA, Law M, Tricco AC, Baker R. Paule-Mandel estimators for network meta-analysis with random inconsistency effects. Res Synth Methods. 2017;8(4):416-434.
5. Dias S, Ades AE, Welton NJ, Jansen JP, Sutton AJ. Network Meta-Analysis for Decision-Making. John Wiley & Sons; 2018.
6. Signorovitch JE, Wu EQ, Andrew PY, et al. Comparative effectiveness without head-to-head trials. Pharmacoeconomics. 2010;28(10):935-945.
7. Caro JJ, Ishak KJ. No head-to-head trial? Simulate the missing arms. Pharmacoeconomics. 2010;28(10):957-967.
8. Signorovitch JE, Sikirica V, Erder MH, et al. Matching-adjusted indirect comparisons: a new tool for timely comparative effectiveness research. Value Health. 2012;15(6):940-947.
9. Phillippo DM, Ades A, Dias S, Palmer S, Abrams KR, Welton NJ. NICE DSU technical support document 18: methods for population-adjusted indirect comparisons in submissions to NICE. Report by the Decision Support Unit; 2016.
10. Phillippo DM, Ades AE, Dias S, Palmer S, Abrams KR, Welton NJ. Methods for population-adjusted indirect comparisons in health technology appraisal. Med Decis Making. 2018;38(2):200-211.
11. Ishak KJ, Proskorovsky I, Benedict A. Simulation and matching-based approaches for indirect comparison of treatments. Pharmacoeconomics. 2015;33(6):537-549.
12. Phillippo DM, Dias S, Ades A, et al. Multilevel network meta-regression for population-adjusted treatment comparisons. J R Stat Soc A Stat Soc. 2020;183(3):1189-1210.
13. Phillippo DM, Dias S, Ades A, Welton NJ. Assessing the performance of population adjustment methods for anchored indirect comparisons: a simulation study. Stat Med. 2020;39(30):4885-4911.
14. Phillippo DM, Dias S, Ades A, et al. Validating the assumptions of population adjustment: application of multilevel network meta-regression to a network of treatments for plaque psoriasis. Med Decis Making. 2022;43:0272989X221117162.
15. Lu G, Ades A. Combination of direct and indirect evidence in mixed treatment comparisons. Stat Med. 2004;23(20):3105-3124.
16. Hoaglin DC, Hawkins N, Jansen JP, et al. Conducting indirect-treatment-comparison and network-meta-analysis studies: report of the ISPOR task force on indirect treatment comparisons good research practices: part 2. Value Health. 2011;14(4):429-437.
17. Remiro-Azócar A, Heath A, Baio G. Conflating marginal and conditional treatment effects: comments on “Assessing the performance of population adjustment methods for anchored indirect comparisons: a simulation study”. Stat Med. 2021;40(11):2753-2758.
18. Fialkowski A, Tiwari H. SimCorrMix: simulation of correlated data with multiple variable types including continuous and count mixture distributions. R J. 2019;11(1):250.
19. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; 2022.
20. Weir CJ, Butcher I, Assi V, et al. Dealing with missing standard deviation and mean values in meta-analysis of continuous outcomes: a systematic review. BMC Med Res Methodol. 2018;18(1):1-14.
21. Rubin DB. Multiple Imputation for Nonresponse in Surveys. John Wiley & Sons; 2004:81.
22. Bujkiewicz S, Thompson JR, Sutton AJ, et al. Multivariate meta-analysis of mixed outcomes: a Bayesian approach. Stat Med. 2013;32(22):3926-3943.
23. Papanikos T, Thompson JR, Abrams KR, Bujkiewicz S. Use of copula to model within-study association in bivariate meta-analysis of binomial data at the aggregate level: a Bayesian approach and application to surrogate endpoint evaluation. Stat Med. 2022;41(25):4961-4981.
24. Jackson D, Rhodes K, Ouwens M. Alternative weighting schemes when performing matching-adjusted indirect comparisons. Res Synth Methods. 2021;12(3):333-346.
25. Phillippo D, Dias S, Ades AE, Welton NJ. Target estimands for efficient decision making: response to comments on “Assessing the performance of population adjustment methods for anchored indirect comparisons: a simulation study”. Stat Med. 2021;40:2759-2763.
26. Gail MH, Wieand S, Piantadosi S. Biased estimates of treatment effect in randomized experiments with nonlinear regressions and omitted covariates. Biometrika. 1984;71(3):431-444.
27. Agresti A. Categorical Data Analysis. John Wiley & Sons; 2012:792.
28. Van Lancker K, Vo TT, Akacha M. Estimands in health technology assessment: a causal inference perspective. Stat Med. 2022;41(28):5577-5585.
29. Spieker AJ. Comments on the debate between marginal and conditional estimands. Stat Med. 2022;41(28):5589-5591.
30. Russek-Cohen E. Discussion of “Target estimands for population-adjusted indirect comparisons” by Antonio Remiro-Azocar. Stat Med. 2022;41(28):5573-5576.

AUTHOR BIOGRAPHIES

Landan Zhang is a Postdoctoral Research Fellow in the Statistical Innovation Group at AstraZeneca who specialises in methods for population adjusted indirect treatment comparisons.

Sylwia Bujkiewicz is Professor of Biostatistics in the Biostatistics Research Group at the University of Leicester; her primary research interests are in the area of Bayesian methods for evidence synthesis, with a main focus on multi-parameter evidence synthesis for combining data from diverse sources.

Dan Jackson is Statistical Science Director in the Statistical Innovation Group at AstraZeneca; his primary

research interests are methods for meta-analysis, survival analysis and population adjustment.

How to cite this article: Zhang L, Bujkiewicz S, Jackson D. Four alternative methodologies for simulated treatment comparison: How could the use of simulation be re-invigorated? Res Syn Meth. 2023;1-15. doi:10.1002/jrsm.1681

SUPPORTING INFORMATION
Additional supporting information can be found online in the Supporting Information section at the end of this article.
