Evaluation of Regression Procedures For Methods Comparison Studies

CLIN.CHEM.
39/3,
424-432 (1993)
Evaluation of Regression Procedures for Methods Comparison Studies

Kristian Lmnet
Using simulation, I evaluated five regression procedures need not be constant. The Deming procedure (3-10) real-
that are used for analyzing methods comparison data: istically assumes that themeasurements by both methods
ordinary least-squares regression analysis, weighted are subject to random errors, and a weighted modification
least-squares regression analysis, the Deming method, a oftheDemingmethod(11)takesintoaccountnonconstant
weighted modification of the Deming method, and a rank analytical standard deviations for both methods. For ex-
procedure. I recorded the following performance mea- ample, it can be assumed that both methods are subjectto
sures: plus or minus bias of the slope estimate, the root proportional measurement errors (constant coefficients of
mean squared error of the slope estimate, and correct- variation), which is a very common situation in clinical
ness of hypothesis testing. I evaluated the unweighted chemistry. Finally, a regression procedure based on the
regression procedures by using a simulated comparison rank principle, which also takes random errors for both
of two electrolyte methods; only the Deming method gave methods into account, may be considered (12-14). In this
unbiased slope estimates. Using the jackknife method, a paper, I evaluate the regression procedures mentioned
computer program that estimates the standard error, I above, using simulations of typical situations in clinical
showed that hypothesis testing was correct for the Dem- chemistry. I focus on the shortcomings of ordinary least-
ing method. I used all regression procedures on data from squares regression analysis, and how these can be over-
a simulated comparison of two measurement methods comeby using an alternative procedure.
with proportional analytical errors dispersed over one
decade. Bias of the slope estimates was not a problem for Comparison of Two ClInical Chemistry Methods by
these cases. The weighted least-squares regression anal- Regression AnalysIs
ysis and the weighted Deming method were most efficient Because every method in clinical chemistry is subject
(lowest root mean squared error); the other procedures to some random measurement error, we must distin-
required 1.6 to 2.2 times as many observations to attain guish between the measured value (xe)and the true or
the same precision for the slope estimate. Hypothesis target value (Xe). The latter is the average of all the
testing was correct by the weighted Deming method with values that we would obtain if we repeated the measure-
the jackknife principle for standard error computation; the ment of a given sample an indefinite number of times. A
other methods rejected the null hypothesis 1.4 to 4.4 particular measured value is likely to deviate from the
times too frequently. In conclusion, it is preferable to use target value by some small “random” amount (#{128} or 5).
an alternative to ordinary least-squares regression anal- Given two clinical chemistry methods, we have for the
ysis for methods comparison studies. ith sample
IndexIngTerms:statistics intermethod companson
xi =X + a
It is often necessary in clinical chemistry to look for a

systematic difference between the measurements of two yi = Yi + s
methods. To do this, a set of samples is analyzed by both
methods and, on the basis of a regression analysis, any
The standard deviation of the dispersion of measured
systematic difference is evaluated. Ordinary least-squares
values about the target value is the analytical standard
regression analysis, which is readily available in pocket
deviation (Figure 1). The analytical standard deviation
calculators and statistical packages (1), is commonly used.
may be constant, i.e., independent of the target value,
This procedure presumes that the measurementsfor oneof
but usually it increases with increasing target values.
the methods are without random error, i.e., the analytical
For many compounds,the analytical standard deviation
standard deviation is zero, and that the analytical stan-
is approximately proportional to the concentration
dard deviation for the other method is constant through-
level, corresponding to a constant coefficient of variation
out the measurement range. Both assumptions are rarely
(Figure 2a) (15). In somecases,e.g., hormone assays,the
fullilled. Therefore, it is reasonable to consider alterna-
analytical standard deviation becomes constant in the
tives to ordinary least-squares regression analysis for
low range, resulting in an increased coefficient of vari-
evaluation of methods comparisondata. In weighted least-
ation in this area (Figure 2b) (16). One must consider
squares regression analysis (2), the analytical standard
the relation between analytical standard deviation and
deviation for the method that is subject to random errors
concentration value when one choosesthe best regres-
sion analysis for a method comparison study (see below).
Department of Clinical Biochemistry KB 76.4.2, Rigshospitalet From a set of paired measured values {(x; yj)} the
University Hospital, Tagensvej 20, DK-2200 Copenhagen N, Den-
mark. relation between the target values {(X; Y)} is to be
Received March 4, 1992; accepted September 21, 1992. elucidated. The null hypothesis of no difference (H0)
424 CLINICALCHEMISTRY,Vol.39,No.3,1993
at a given level, and that the standard deviation is
constant throughout the measurement range (Figure 3)
(1). The computation of an ordinary least-squares re-
gression analysis is based on minimization of the
squared deviations from the line in the vertical direc-
tion. It can be shown that this procedure is optimal
The analyticalstandarddeviationsSDm,and SD,, are schematically shown under the stated conditions.
Weighted least-squares regression analysis (Appen-
dix) (2) allows for a nonconstant standard deviation for
they method, but it is still presumed that the x method
is without random measurement error. Weights are
introduced that are inversely proportional to the
squared analytical standard deviation of y measure-
ments at a given concentration. For example, the pro-
x cedure can be applied to the case with a proportional
a analytical standard deviation for y (constant coefficient
of variation). The sum of weighted squared deviations
from the line in the vertical direction is minimized. In
the situation with proportional analytical standard de-
viation, the weights reduce the influence of observations
at high concentrations. This is reasonable because the
analytical standard deviation, and thus the uncertainty,
increases with the concentration. The result is a more
b
x precise estimation of the regression line.
FIg.2. Examples ofrelations between analyte concentration (X), The Deming method (Appendix) (3-10) allows mea-
analytical standard deviation (SD,j, and coefficient of variation surement errors for both methods, requiring that the
(CV) ratio between the analytical standard deviations is
known. The Deming method is primarily used when
H0:Y = a0 + pX (a0; 13)= (0;1) analytical standard deviations are constant. The sum of
squared deviations from the line at an angle determined
is tested against the alternative hypotheses (Hai and by the ratio between the standard deviations is mini-
mized (Figure 4). Various procedures for estimation of
standard errors of slope and intercept have been sug-
Ha1 A linear relationship exists between X and Y. gested (5, 10). if a computerized method such as the
jackknife method is used, gaussian error distributions
= + 3X4 (a0; 13) (0;1) need not be assumed; i.e., the procedure becomes non-
He2: Y is a nonlinear function of X or totally indepen- parametric (11).
A weighted modification of the Deming method takes
dent of X.
into account nonconstant measurement errors for both
The hypotheses are evaluated by some type of regres- methods (Appendix) (11). It still has to be presumed that
sion analysis, supported by graphical plots. If a linear the ratio between the analytical standard deviations is
relationship seems plausible from graphical plots or constant. This is true for the common situation with
statistical tests of linearity, we must decide between H0 proportional analytical standard deviations (constant
and Hai. By determining whether the slope deviates coefficientsof variation) for both methods [and strictly, a
significantly from 1 (proportional systematic error) and regression line passing through (0; 0)1. The sum of
whether a4,, is different from 0 (constant systematic
error), we decide between thesetwo hypotheses. y
Each regression method requires certain assumptions
concerning data distribution and analytical errors. if
these assumptions are violated, the statistical analysis
may be incorrect or nonoptimal. The regression method
that requires the simplest computations, ordinary least-
squares regression analysis, relies on the most rigid
assumptions, whereas the more complicated methods
operate with assumptions that are more likely to be
fulfilled in reality (Appendix).
Ordinary least-squares regression analysis requires
x
that one of the methods (corresponding to x) is without Fig. 3. The model assumed In ordinary least-squares regression
random measurement error, that the measurement er- analysis: no error in x and constant gaussianerrordistilbutlon for y
rors of the other method (y) have a gaussian distribution In this example, the regression line is coinciding with the line of identity
CLINICALCHEMISTRY,Vol.39, No.3, 1993 425

Y method. Using duplicate measurements, one can esti-
mate the analytical standard deviations and compute
their ratio. This ratio is then used for computing the
slopeby the Deming method. In the simulation process,
pseudorandom gaussian numbers are generated accord-
ing to the specified model and regression equations are
calculated. By repeating the procedure many times
(=nrun), e.g., 2000-5000 times, one obtains various
performance measures for each regression method.
These are listed in Table 1. The average slope (b)
x indicates whether the slope estimate is biased or not.
Fig. 4. ThemodelassumedintheDemingregression procedure
The root mean squared error (RMSE)
The regression line is estimated by minimizationof squared distances to the
line atanangle determined by the ratioof the analytical standard deviations for
the two methods
RMSE = \/(b - 1)2/nrun = yBIAs2 + SE2
weighted squared deviations at an angle to the line is isan estimateofthetotal error of the slopeand includes
minimized. both the random error, i.e., the standard deviation of the
Finally, a nonparametric rank method developed by dispersion of b around and the systematic error or
,
Passing and Bablok is evaluated (Appendix) (12-14). bias. The real standard error of the slope [SE(E - 1)] is
The slope of the regression line is calculated as the the standard error observed in the simulation study, i.e.,
median of all possible slopes. This estimation principle the standard deviation for the distribution of b in the
makes the method robust towards outliers, which is the simulation runs. The average estimated standard error
main advantage of the method. Measurement errors for of the slope, on the other hand, refers to the standard
both chemistry methods are partially taken into ac- error derived by statistical analysis. In each simulation
count. Passing and Bablok claim that this method can run, an estimated standard error is computed so as to
be used when errors are proportional. carry out a test of the null hypothesis. A prerequisite for
a correct test is that the estimated standard error, on
Performance of the RegressIon Methods
average, agrees with the real one. In the rank method,
To compare the regression methods, I used each re- hypothesis testing is carried out in another way, and no
gression analysis in two simulated methods comparison average estimated standard error is given. Lastly, I
studies; one compared two methods for determination of evaluated the performance of hypothesis testing by
an electrolyte, and the second,two methods for determi- comparing the observed and expected number of rejec-
nation of a metabolite. tions of the null hypothesis. Under the null hypothesis,
we expect 50 rejections out of 1000 trials, given a
ElectrolyteStudy nominal type I error of 0.05. Any deviation from this
Serum concentrations of electrolytes are tightly reg- number may be expressed as a factor. For example, if
ulated, which means that the range of concentrations is 100 rejections are recorded, the factor is
very small in healthy individuals, and deviations asso-
ciated with disease are small to moderate. For example, 1= 100/50 = 2
the ratio between the upper and lower reference limits
(range ratio) for serum sodium is 146 mmolIL:136 The ideal hypothesis test factor is, of course, 1. Figure 5
mmol/L = 1.07 (17). In a study that included samples shows the average regression lines obtained by the
from diseased subjects, Cornbleet and Gochman (5) three procedures for this example. The scatter points of
found a range ratio of 1.15 for serum sodium. For my one simulation run are also shown.
model I supposea gaussian distribution of target values Ordinary least-squares regression analysis and the
with a mean of 135.5 mmol/L and a standard deviation
of 3.8 mmol/L. if the minimum and maximum values
Table 1. Performance CharacterIstics of Three
are 2.5 standard deviations below and above the mean,
RegressIon Procedures for a Simulated Comparison of
respectively, the range ratio is 1.15 (145 mmolIL:126
Two Electrolyte Methods
mmol/L). The analytical coefficients of variation are
L.aat-Squarss Dsmlng Rank
supposedto be 0.01 and 0.0 15 for methods 1 (x) and 2 (y), rsg. anal. mg anal. reg anal.
respectively. For this case, it makes little difference Averageslope (b) 1.001 l.O3l’
whether we supposeconstant or proportional analytical Root mean squared error 0.088 0.069 0.072
errors; I selected constant analytical errors with the Real SE(b) 0.064 0.069 0.065
above-mentioned coefficients of variation at the mean Average estimated SE(b) 0.058 0.070
value. Consequently, I did not need to use the weighted Hypothesis test factor (f) 4.1 1.0 3.8
least-squares regression analysis or the weighted Dem- The true slope p Is equal to 1. SamplesIze: 50 duplIcate measurementsby
ing method. To test the null hypothesis, that the target each method. Based on 2000-5000 simulation runs.
values for the two methods are identical, I selected a Ordinary least-squares regression analysis.
bjgnffinuy different from 1 (P <0.001).
sample size of 50 duplicate measurements by each
426 CLINICALCHEMISTRY,Vol.39, No.3, 1993

y
y mmol/L
mmol/L 0
0
0
30
00 0
130 140 x 0
mmol/L
Fig.5. Thesimulatedelectrolyte case
Fifty scatter points (means of duplicate measurementsby each method) are
shown for one simulation run. The regression lines are averages of 5000
simulation runs. (-) the Deming procedure; (--) ordinary least-squares
regression analysis: (.- -) the rank procedure
b = 0.94 and
rank method yieldbiased slope estimates,
1.03, respectively, whereas the Deming method is with-
out bias. The negative bias of the ordinary least-squares
method is simply the squared ratio of the coefficientsof 10 20 30
x
variation of x errors and target value distributions mmol/L
(0.012 0.5/0.0282 = 0.06; the factor 0.5 enters because
. Fig.6. Thesimulatedmetabolitecase
we have duplicate measurements). This example and Fifty scatterpointsare shownfor one simulation run. The solid line Is the
diagonal, which corresponds to the nearly coincidingaverage regression lines
other similar simulations show that the rank method of each of the five methods
doesnot provide unbiased slope estimates if the analyt-
ical standard deviations for methods 1 and 2 are differ-
ent. This conclusion is also apparent from the study by tions. The simulations show that ordinary and weighted
Bablock and Passing (13). For both ordinary least- least-squares regression analyses yield slope estimates
squares regression analysis and the rank procedure, I with a negligible bias (b = 0.996 and 0.993) (Table 2).
have observed that a biased slope estimate is accompa- The analytical coefficient of variation of method 1 is
nied by a biased intercept. small when compared with the coefficient of variation of
The root mean squared error is smallest for the the target value distribution. However, ordinary least-
unbiased Deming method. The jackknife principle of squares regression analysis is not appropriate because
estimating the standard error results in correct hypoth- the hypothesis testing is inaccurate; an actual type I
esis testing [hypothesis test factor (f) = 1.01. In con- error exceedsthe nominal one by a factor of 1= 4.4. This
trast, the null hypothesis is rejected too frequently by is causedby an underestimation of the standard error of
the other two procedures, f = 3.8 and 4.1, corresponding the slope by -40%, as evident from the simulation
to rejection frequencies of -20%. This is caused partly study. The weighted procedure performs better, with an
by bias and partly by incorrect estimation of the stan- error factor of f = 1.7. In addition, the slope, as calcu-
dard error. lated by the weighted procedure, has a smaller root
mean squared error than it would if it were calculated
MetaboliteStudy by the ordinary least-squares regression analysis.
Metabolites include commonly measured substances The Deming and the weighted Deming methods give
such as glucose, urea, and urate. The serum concentra- unbiased slope estimates, as expected, and the rank
tions of these substances vary more than those of elec- method has only a negligible bias (b = 1.002). The
trolytes, resulting in larger range ratios. I compared two weighted Deming method performs hypothesis testing
methods for glucose determination and supposed that correctly, whereas the unweighted procedure and the
the glucose concentrations extended from 2.5 to 25 rank method have moderately increased type I errors (f
mmol/L, corresponding to a range ratio of 10.1 assumed = 1.4 and 1.6, respectively). The average estimated
that a majority of the samples came from healthy standard error of the unweighted Deming procedure is
subjects or patients with moderately elevated glucose correct, but the distribution of the test statistic does not
concentrations; thus three-quarters of the observations conform exactly to the t-distribution, affecting the hy-
were located on the lower half and one-quarter on the pothesis testing. The weighted Deming method is more
upper half of the interval (Figure 6). The measurement efficient than the unweighted modification, as indicated
errors were proportional to the concentrations, with by the root mean squared error, and the rank method
coefficients of variation of CV1 = 0.05 (x) and CV2 = has an intermediate efficiency.
0.075 (y). The sample comprised 50 duplicate observa- For comparison, Table 2 shows the number of obser-
CLINICALCHEMISTRY.Vol.39, No.3, 1993 427

Table 2. SImulated Comparison of Two Glucose Methods
OrdInary Weighted
least-squares lsast.squar.s Dsmlng Weighted DomIng Rank
rag. anal. rag.anal. rag. anal. rag. anaL rag. anaL
Averageslope(b) 0.996 0.993a 1.001 1.000 1.002a
Rootmeansquared error 0.028 0.019 0.028 0.018 0.023
RealSE(b) 0.028 0.018 0.028 0.018 0.022
AverageestimatedSE(b) 0.018 0.016 0.028 0.018 -
Hypothesis test factor(f) 4.4 1.7 1.4 1.0 1.6

Samplesizes yielding thesame 210 100 220 100 160
precisionof slopeestimate,n
p = 1. CV, = 0.05, CV2= 0.075, 50 duplIcate measurements,and 2000-5000 simulation runs.
Significantly differentfrom I (P <0.001).
vations required by each of the methods to obtain slope the relation is x = y, + d, where d is some constant. The
estimates that have the same precision. Weighted least- target value of the outlier is shifted by one to eight
squares regression analysis and the weighted Deming analytical standard deviations at the given value from
method each require a sample size of 100, with the other the identity line. The slope is estimated in each simu-
methods demanding up to 2.2 times as many observa- lation run, and the root mean squared error is recorded
tions. for 1000-5000 repetitions. For the weighted Deming
Performance of Regression in the Presence
Methods method, I calculated the root mean squared error both
with and without applicationof a rejection rule for
of OutlIers and Nongaussian Error Distributions
outliers. For each pair of observations (x yj), I calcu-
In the previous section, I considered ideal situations, lated the standardized deviation from the estimated
i.e., cases with gaussian error distributions and no regression line, and if this deviation exceeded three
outliers. Because skew error distributions and outliers standard deviations of the distribution of deviations, I
may occur in real cases, itis relevant to evaluate the rejected the observation (11). If a rejection rule is not
regression methods under these conditions. I compared used, the root mean squared error for the slope estimate
the weighted Deming procedure, which is based on the of the weighted Deming method increases steadily as
least-squares computation principle, with the rank outlier deviation increases, whereas the root mean
method. The least-squares computation procedure is squared error of the rank method is almost unaffected
sensitive towards outliers, in contrast to the rank prin- (Figure 7).When the rejection rule is applied, however,
ciple, where a median value serves as a slope estimate. the root mean squared error of the weighted Deming
I investigated the impact of outliers by using the method increases only moderately up to a maximum
metabolite model and introducing an outlier randomly and then declines towards the starting value. For all
in 1 of 50 observations. Here, an outlier is considered as outlier positions, the root mean squared error of the
an observation for which the target value deviates from weighted Deming procedure is smaller than that of the
the value expected under the given hypothesis. Under rank method. No rejection rule for outliers is specified
the null hypothesis, we expect x, = y, but for the outlier for the rank method (12). Actually, identification of
outliers is important, because they may indicate the
RMSE presence of special matrix effects or interferents.
Skew measurement error distributions may be mod-
0.0300- eled by log-gaussian distributions (Figure 8). The coef-
0.0250 PROBABILITY
DENSITY
0.0200-
0-;’’’ .
STANDARD
DEVIATIONS
MEASURED
FIg. 7. Therootmeansquarederror(RMSE) oftheslopeestimate as VALUES
a function ofouther deviationexpressedin standarddeviations
the weighted Deming procedure; 0, the weIghted Demingprocedurewith
#{149}, Fig.8.The shape oftheskew measurementerrordistributions
outlier relection; & the rank procedure The coefficient of skewness is 1.3
428 CLINICAL CHEMISTRY, Vol.39,No.3,1993

ficient of skewness is 1.3 for this example. Table 3 gives testing was not as good as that achieved by using the
average slopes, root mean squared errors, standard jackknife method to derive standard errors (f = 1.5 for
errors, and hypothesis test factors for the metabolite the electrolyte case and 3.8 for the metabolite case,
case with skew error distributions. Bias was not impor- compared with f = 1.0 and 1.4, respectively, for the
tant for either method. The root mean squared error of jackknife method).
the weighted Deming method is smaller than that of the Given proportional measurement errors for both
rank method, and f = 1 only for the weighted Deming methods, the weighted Deming method is more efficient
method. than the usual unweighted approach. The larger the
range ratio, the greater the advantage. For a range ratio
Discussion of 10, the weighted approach requires less than half the
sample size of the unweighted one to provide a slope
Ordinary least-squares regression analysis is the sim-
estimate with the same precision. Achieving high pre-
plest regression method and by far the most widely used
cision for the slope estimate is as important as eliminat-
one for methods comparison studies. However, modern ing bias. In a particular methods comparison study, the
computer technology has made more complicated meth-
slope estimate is likely to be within plus or minus two
ods generally accessible because computations can be standard errors from the true, unknown slope value.
automated. Therefore, it appears reasonable to evaluate Thus, if an unbiased slope estimate has a low precision,
whether more complicated regression methods offer ad- perhaps caused by the use of nonoptimal regression
vantages over ordinary least-squares regression analy- technique, the result may still deviate considerably
sis. Earlier, we noted that apart from the electrolyte from the true value.
case, ordinary least-squares regression analysis gives a For both modifications of the Demrng method, it is
slope estimate with only a modest bias, even though presumed that the ratio (A) between the analytical
both analytical methods are subject to random measure- standard deviations is constant. This condition is sat-
ment errors. A more serious problem is that hypothesis isfied if both standard deviations are constant, i.e.,
testing is unreliable when errors are proportional. For independent of the concentration, or if both methods
the metabolite case studied here, the standard error of have a constant coefficient of variation and a propor-
the slope was on average underestimated by 40%, and tional relationship exists between the methods (a(, =
thus the null hypothesis was rejected too frequently and 0). Although in reality A is often not constant, it is
confidence intervals were underestimated. Also, the frequently approximated. If the ratio is not quite con-
efficiency of ordinary least-squares regression analysis stant, a small bias may arise. Studies have shown that
may be low for large range ratios when there are the Deming model is robust towards misapecification of
proportional errors. A (18). The bias is, at most, of the same order of
Weighted least-squares regression analysis is efficient magnitude as that of ordinary or weighted least-
for the metabolite case, and the bias is small. Hypothe- squares regression analysis.
The rank method of Passing and Bablok (12-14) gives
sis testing is better than for ordinary least-squares
an unbiased slope estimate under the null hypothesis
regression analysis, but the type I error is still some-
only if the analytical standard deviations for the two
what high because the random measurement errors of x methods to be compared are of equal size. Hypothesis
are neglected. Simulations show that if x is without testing is not correct for the rank method, and because
error, hypothesis testing becomescorrect. the rank method is, in principle, an unweighted method,
The unweighted Deming method yields an unbiased it is not optimal for the case with proportional errors.
slope estimate in both the metabolite and electrolyte The advantage of the method is its resistance towards
method comparisons and, with use of the jackknife outliers, but the method needs further development to
principle, hypothesis testing becomes correct for the eliminate the inadequacies revealed by the simulations
electrolyte case with constant analytical standard devi- performed here.
ations. Cornbleet and Gochman (5) suggest estimating In a time when there is an increasing focus on
the standard error of the slope by using the same accuracy in analytical methods, the choice of an optimal
formula as for ordinary least-squares regression analy- regression technique for methods comparison studies is
sis. Using this simple principle, I found that hypothesis important. For example, an inaccuracy of <3% has been
recommended for cholesterol measurement methods in
clinical laboratories (19). If an inappropriateor nonop-
Table 3. Performance of the WeIghted Demlng and the timal regression procedure is used to analyze the data
Rank Regression Procedures on Skew Error for a cholesterol methods comparison study, a bias of
Distributions perhaps 1% or an unnecessarily large standard error for
Weighted Doming Rank
rsg. anal. reg. anal.
the slope estimate may soon become critical.
Averageslope(b) 1.000 O.997a AppendIx
Rootmeansquarederror 0.017 0.022 ComputationPrinciplesfor the RegressionMethods
RealSE(b) 0.017 0.022
Ordinary least-squares regression analysis (1). On the
AverageestimatedSE(b) 0.017 -
basis of k paired observations (x yj), the regression line
Hypothesis
testfactor(f) 1.0 2.2
is estimated from sums of squared deviations and cross-
#{149}
Significantlydifferent from (P <0.001).
products:
CLINICALCHEMISTRY,Vol.39, No.3, 1993 429

u= .(x-; q= u,L, w(x1-j)2;q= w(y-)2
p= Pw
Slope and location parameters are determined:

b=p/u; a=3
bpJu; a=5
‘=a+b(x-)=a0+bx1; (ao=a-b)
The line is estimated:
The caret of 2, denotes that we obtain an estimate of the
“true” value for a given x.
The standard deviation for the distribution about the
line is as follows: The proportionality factor (c) for the standard deviation
SDy x = (y - ‘j2/(k -2)

about the line is computed:
c= (y - 1)2w/(k - 2)
The standard errors of a and b are as follows:
The standard errors of a and b are determined:
SE(a) = SDy
SE(a) = cI\/; SE(b) = c/\/Z
SE(b) = SDy .
The null hypothesis of identity is tested by two indepen-
dent t-tests:
The null hypothesis of identity is tested by two indepen-
dent t-tests:
t = (a - Z)/SE(a); t = (b - 1)/SE(b)
t = (a - )/SE(a); t = (b - 1)/SE(b)
The Deming regression analysis (also called: errors-in-
variables regression analysis, a structural or functional
Weighted least-squares regression analysis (2). In relationship model) (3-11). In addition to the sums of
principle, weights (we) are introduced in computations of
squared deviations and cross-products, the ratio be-
sums of squared deviations and cross-products. The
tween the squared analytical standard deviations for
weights are inversely proportional to the squared ana- the methods is needed:
lytical standard deviation of y at a given value. It is
usual to assume that the analytical standard deviation
A = SDX/SD,
is a function [h()J of the “true” concentration (= x), i.e.,
Assuming constant standard deviations and given du-
SD, ch(x)
plicate measurements (x1, and x2; and y21), the
analytical standard deviations are estimated as
where c is a proportionality factor. Then
SDX = (1/2k) (xj - x2j)2
w = 1/ [h(x)}2
SD, = (1/2k) (yj, - Y21)2

If it is plausible from experimental data to assume that
the coefficient of variation is constant, we have SD, = c
x, (supposing that a0 = 0) and w, = 1I;2. For some The regression line is estimated as follows:
analytes (as shown in Figure 2b), the analytical stan-
dard deviation becomes constant in the low region, b = [(Ag - u) + \/(u - Aq)2 + 4Ap2]/2AP
below a limit L. We then have that SD, = cL for x, <
L and SD,,, = cx for x, > L. Correspondingly, w, = ilL2
for x L and w = 1/;2 for x#{149}
> L. a=3
The regression line is estimated as follows:
Weighted averages are computed: ‘1=a+b(k-)=a0+bk; (a0=a-b)
= Rather complicated methods for computation of stan-

dard errors are given in the theoretical statistical liter-
= ature (10). I used the computerized jackknife method
(11), which has the advantage of being distribution-free.
Weighted squared deviations and cross-products are Weighted Deming regression analysis. In the weighted
summed: Deming regression analysis, weights (we) are introduced
430 CLINICALCHEMISTRYVol.39, No. 3, 1993

in computations of sums of squared deviations and The intercept is determined:
cross-products,and the slope is calculated as described
in the previous section (11). The weights are inversely a= Yw
proportional to the squared analytical standard devia-
tions at a given value. Supposing that the analytical
The usual Deming method is the special case where all
standard deviations are functions of the true concentra-
weights are equal to 1. Standard errors are obtained by
tion, we have
the jackknife method.
The weighted Deming regression analysis is evalu-
SD5 = fxhx(X); = fh(Y)
ated in reference 11 for the null hypothesis situation.
To use a weighted modification of the Deming method, Supplementary evaluations with f3 = 0.8 and 1.2 (slopes
are usually within this range in methods comparison
we must assume that the ratio SD2/SD2 is a constant
studies) have confirmed that the procedure also is valid
(= A). Given a proportional relationship between ana-
under alternative hypotheses.
lytical standard deviations and concentrations, perhaps
truncated at some lower limit L, and a0 = 0, this
A plot of standardized distances from the observed
points to the line against the average value [(.t, +
condition holds true. Keeping in mind the symmetry of
2)/2] is useful for evaluation of linearity and outliers.
this model, we may then express the weights as
A practical rejection rule for outliers is three standard
deviations of the distribution of standardized distances.
w, = 1/[(X + Y)/2]2
A BASIC program that performs the weighted Deming
regression analysis is available from the author.
Because X and Y are unknown true values, we have to
Rarth regression analysis (12-14). In the rank regres-
use estimates, which can be obtained by an iterative
sion analysis, in principle, slopes between all possible
procedure, described in detail in reference 11:
sets of two points are calculated and ranked. The me-
dian expresses a slopeestimate. Measurement errors for
th, = 1/[(X, + both methods are taken into accourit by a symmetry
argument, which results in calculation of a shifted
Actually, it is possible that the most precise estimate median that is the final slope estimate. Confidence
may be the inverse of a weighted average of the esti- intervals and hypothesis testing are established by
mated true values, i.e., 1/[C + A2,)/(1 + A)]2. In the using traditional rank techniques.
simulations performed here, only the simple average
has been used.
The regression line is estimated as follows: References
1. Snedecor GW, Cochran WG, eds.Statistical methods, 6th ed.
Weighted averages are calculated: Ames, IA: Iowa State University Press,1967.
2. HaM A. Statistical theory with engineering applications. New
= th/ th; = thy8/ th York: Wiley, 1952.
3. Wakkers PJM, Heliendoom HBA, Op De Weegh GJ, Heerspink
W. Applications of statistics in clinical chemistry. A critical
Weighted sums of squares and cross-products are com- evaluation of regression lines. Cliii Chim Acta 1975;64:173-84.
puted: 4. Lloyd PH. A scheme for the evaluation of diagnostic kits. Ann
Clin Biochem 1978;15:136-45.
5. Cornbleet PJ, Gochman N. Incorrect least-squares regression
u,,,= th(x-)2;
coefficients in method-comparison analysis. Cliii Chem 1979;25:
432-8.
Pw= 6. Feldmann U, Schneider B, Klinkers H, Haeckel R. A multi-
variate approach for the biometric comparison of analytical
methods in clinical chemistry. J Clin Chem Clin Biochem 1981;
The proportionality factors f and f,,are estimated from 19:121-37.
duplicate measurements: 7. Parvin CA. A direct comparison of two slope-estimation tech-
niques used in method-comparison studies. Clin Chem 1984;30:
751-4.
= (1/2k) (x1 - x2)2/[( + 5)/2]2 8. Strike PW, Michaeloudis A, Green AJ. Standardizing clinical
laboratory data for the development of transferable computer-
= (1/2k) (yi Y2i)2’[( + based diagnostic programs. Clin Chem 1986;32:22-9.
9. Mandel J. The statistical analysis of experimental data New
York: Wiley, 1964.
10. Kendall MG, Stuart A. The advanced theory of statistics, Vol.
and as the means of the duplicate measurements.
i, ,
2. London: Charles Griffin, 1973:393-7.
We obtain the following expression: 11. Linnet K. Estimation of the linear relationship between the
measurements of two methods with proportional errors. Stat Med
1990;9:1463-73.
A _/.2//.2 12. Passing H, Bablok W. A new biometrical procedure for testing
the equality of measurements from two different analytical meth-
The slope estimate is calculated: ods. J Clin Chem Clin Biochem 198321:709-20.
13. Passing H, Bablok W. Comparison of several regression pro-
cedures for method comparison studies and determination of sam-
b = [(Aq - u) + \/[(u - Aq)2 + 4ApjI2Ap ple sizes. J Cliii Chem Clin Biochem 1984;22:431-45.
14. Bablok W, Passing H, Bender R, Schneider B. A general
CLINICAL CHEMISTRY, Vol. 39, No. 3, 1993 431

regression procedure formethod transformation. J Clin ChemClin 17. Tietz NW, ed. Texthook of clinical chemistry. Philadelphia:
Biochem 198826:783-90. Saunders, 1986.
15. Ross JW, Fraser MD. The effect of analyte and analyte 18. Beech DG. Some notes on the precision of the gradient of an
concentration upon precision estimates in clinical chemistry. Am J estimated straight line. Appl Stat 1961;1O:14-31.
Clin Pathol 1976;66:193-205. 19. Current status of blood cholesterol measurements in clinical
16. Sadler WA, Smith MH, Legge HM. A method for direct laboratories in the United States: a report from the Laboratory
estimation of imprecision profiles, with reference to iunmunoassay Standardization Panel of the National Cholesterol Education Pro-
data. Clin Chem 1988;34:1058-61. gram. Clin Chem 1988;34:193-201.
432 CLINICALCHEMISTRY,Vol.39, No.3,1993

Evaluation of Regression Procedures For Methods Comparison Studies

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Evaluation of Regression Procedures For Methods Comparison Studies

Uploaded by

Copyright:

Available Formats

CLIN.CHEM.

Evaluation of Regression Procedures for Methods Comparison Studies

It is often necessary in clinical chemistry to look for a

CLINICALCHEMISTRY,Vol.39, No.3, 1993 425

426 CLINICALCHEMISTRY,Vol.39, No.3, 1993

CLINICALCHEMISTRY.Vol.39, No.3, 1993 427

Hypothesis test factor(f) 4.4 1.7 1.4 1.0 1.6

428 CLINICAL CHEMISTRY, Vol.39,No.3,1993

CLINICALCHEMISTRY,Vol.39, No.3, 1993 429

Slope and location parameters are determined:

SDy x = (y - ‘j2/(k -2)

SD, = (1/2k) (yj, - Y21)2

= Rather complicated methods for computation of stan-

430 CLINICALCHEMISTRYVol.39, No. 3, 1993

CLINICAL CHEMISTRY, Vol. 39, No. 3, 1993 431

432 CLINICALCHEMISTRY,Vol.39, No.3,1993

You might also like