Bootstrapping The General Linear Hypothesis Test: Pedro Delicado

Bootstrapping the general linear
hypothesis test
Pedro Delicado
Departamento de Estadstica y Econometra, Universidad Carlos III de Madrid, 28903
Getafe (Madrid), Espa
na.
Manuel del Ro1

Departamento de Estadstica e Investigacion Operativa, Universidad Complutense de
Madrid, 28040 Madrid, Espa
na.
Abstract: We discuss the use of bootstrap methodology in hypothesis testing, focusing

on the classical F-test for linear hypotheses in the linear model. A modification of
the F-statistics which allows for resampling under the null hypothesis is proposed.
This approach is specifically considered in the one-way analysis of variance model. A
simulation study illustrating the behaviour of our proposal is presented.
Key words: Bootstrap, F-test, General Linear Hypothesis, Hypothesis Testing, Linear
Model, One-way Model, Resampling.
Introduction.
The use of bootstrap methodology in hypothesis testing essentially based on approximating the critical values by bootstrap distributions has received less attention
that its application in other problems like, for instance, the construction of confidence
regions. Here are some recent references. Beran (1988) studies the asymptotic error
in level of bootstrap tests and the improvement given by prepivoting the test statistic.
In Hinkley (1988) some general ideas about bootstrap testing are briefly discussed.
Romano (1988) uses the bootstrap to aproximate critical values of nonparametric tests
based on measures defined on the empirical distribution. In Boos and Brownie (1989)
1
Correspondence to: Manuel del Ro, Departamento de Estadstica e Investigacion Operativa,
Facultad de Matematicas, Universidad Complutense de Madrid, 28040 Madrid, Espa

na.
and Zhang and Boos (1992) the use of bootstrap methods for testing homogeneity of
variances and covariances matrices, respectively, is considered; good (asymptotic and
simulated) results in level and power are obtained. The bootstrap approximation, resampling U- and V-statistics, is also used in Boos, Janssen and Veraverbeke (1989)
for testing homogeneity of scale in the two-sample problem. Hall and Wilson (1991)
and Hall (1992) insist on two guidelines that we analyze in Section 2: usage of pivotal functions and resampling reflecting the null hypothesis. In Mammen (1992) the
convergence of bootstrapped F-test in linear models is proved.
This paper is concerned with the use of bootstrap idea in hypothesis testing. Section 3 is dedicated to present our work for testing a general linear hypothesis in a linear
model. Following the second guideline cited above, we propose a natural modification
of the classical F-statistics that permits resampling under the null hypothesis, though
the data fail to comply with it. In Section 4, this proposal and the basic idea of resampling taking into account the model in consideration is analyzed with some detail
in the framework of the one-way analysis of variance. In Section 5, the corresponding
behaviour is illustrated by reporting the results of a simulation study.
Two general guidelines.
Naturally, the duality between hypothesis testing and confidence regions is maintained
under bootstrap methodology. That leads to consider the first guideline cited: use of
(asymptotically) pivotal quantities, that could be extended to the use of methods with
good behaviour in the problem of confidence interval construction. In general, acting
in this way will imply an improvement in the test level.
However, there is an important conceptual difference when both problems are considered under the bootstrap approach. In hypothesis testing accurate estimates of the
critical values are needed, even if the data had their origin in the alternative hypothesis. This demand could invalidate some direct approximations to the distribution of
interest. Consider, as in Hall and Wilson (1991), a one-dimensional parameter situation and the simple testing problem H0 : = 0 against H1 : 6= 0 . Given that
T = T (X1 , . . . , Xn ) is a good estimator of the unknown , a reasonable test could be
based on the difference T 0 . If our goal is in relation to its level or p-value, our
interest lies in the distribution of T 0 under H0 . A direct and nave application
2
of the bootstrap method would lead to estimate this distribution using the statistic
T 0 , where T is the value of T in the resampling, that is, under the empirical
distribution of (X1 , . . . , Xn ). In Hall (1992, Sec.3.12) it is shown the bad behaviour of
the power function of the corresponding test. The trouble here is that T 0 does not
approximate the null hypothesis when the sample comes from a parameter far away
from 0 .
Hall and Wilson (1991) propose to estimate the T 0 distribution by means of the
statistic T T . Note that this corresponds to resampling the quantity T0 = T
instead of T 0 . In the next section we extend this idea to testing a general linear
hypothesis in a linear model.
Bootstrap hypothesis testing in the linear model.
It will be convenient to consider the coordinate-free version of the linear model. Let
Y = + e, where Y denotes the n-dimensional vector of observations, is the vector
of means belonging to the subspace V IRn with dimension p < n, and e is the vector
of independent errors ei F such that E(e) = 0. We want to test H0 : V0 versus
6 V0 , being V0 V a subspace with dimension p0 < p. The usual F-statistic can be
written as
T = T (Y ) =
kPV |V0 Y k2 /(p p0 )

,
kPV Y k2 /(n p)
(3.1)
where V , V |V0 , and PW denote, respectively, the subspace orthogonal to V, the

orthogonal complement of V0 in V , and the orthogonal projection onto W .
Here is an schematic outline of the bootstrap procedure for this version of the
linear model. We start from the model (, F ); after observing the response vector Y ,
we adjust (
, Fn ), being
= PV Y and Fn the empirical distribution of the residual
vector e = Y
= PV Y , that we suppose centered at 0. The bootstrap estimation of
an arbitrary function R = R(Y, F ) is the (conditional) distribution of R = R(Y , Fn )
where Y =
+ e , and the vector e has independent components ei Fn .
The bootstrap methodology proposes to consider as the critical region of nominal
-level T > t , where t is the 1 quantile of the bootstrap distribution of T under
samples coming from H0 , i.e., P (T (Y ) t | Y H0 ) = 1 .
3
Hence, we need to know or, in the practice, to approximate by Monte Carlo

trials this null bootstrap distribution. For this, we propose to consider
kPV |V0 (Y )k2 /(p p0 )
kPV |V0 ek2 /(p p0 )
T0 = T0 (Y, ) =
=
.
kPV (Y )k2 /(n p)
kPV ek2 /(n p)
(3.2)
Comments:
i) Since it is based on Y , T0 does not relies on the hypothesis under which the
data are obtained; in other words, it is invariant against the hypothesis generating the
data. Besides, T0 agrees with T under H0 .
ii) A direct reasoning leading to T0 is the following. Since we attempt to approximate the null distribution of T , let us take an arbitrary 0 V and transform Y to
Y0 = Y + 0 . The vector Y0 verifies H0 and, we are naturally conducted to
kPV |V0 Y0 k2 /(p p0 )
kPV |V0 (Y )k2 /(p p0 )
T (Y0 ) =
=
= T0 (Y, ).
kPV Y0 k2 /(n p)
kPV (Y )k2 /(n p)
Note that the above expression is independent of the initial vector 0 V .
Therefore, the bootstrap distribution of T0 (Y, ) is the null bootstrap distribution
of T ; consequently, the critical value t should be taken verifying
P (T0 = T0 (Y , ) t | Y ) = 1 .
(3.3)
We see that the modification of the F-statistics arises in a natural and direct way.
In some sense, this presentation completes the one given in Mammen (1992). Moreover,
his results assure the convergence of our proposal.
Bootstrapping the one-way ANOVA model.
We now consider the previous proposal in the particular case of testing equality of
means in the one-way model.
4.1
Raising the problem.
We assume a set of ni observations, Yij , j = 1, . . . , ni , coming from a population

Pi with mean i and variance 2 , i = 1, . . . , p.
4
We want to test H0 :
1 =
= p . This problem fits into the framework of last section by defining Y =

(Y11 , . . . , Y1n1 , . . . , Ypnp )0 , n =
Pp
i=1
ni , = (1 , . . . , p )0 ; V0 and V are now the
spaces spanned, respectively, by the vector 1n = (1, .n). ., 1)0 and by the p vectors in
0
IRn : (1n1 , 0)0 , (0, 1n2 , 0)0 , . . . , (0, 1np )0 . It is well know that
P
ni (Y i Y )2 /(p 1)
T (Y ) = P iP
.
2
i
j (Yij Y i ) /(n p)
(4.1)
The statistic T0 in (3.2) is now

P
2
ni (X i X)2 /(p 1)
i ni (ei e) /(p 1)
T0 (Y, ) = P P
=
,
P
P
2
2
i
j (eij ei ) /(n p)
i
j (Xij X i ) /(n p)
i
where Xij = Yij i = eij , X i =
4.2
P
j
Xij /ni , X =
P P
i
(4.2)
Xij /n.
Resampling schemes.
To obtain a good approximation through the bootstrap approach in a general arbitrary

situation, the resampling distribution must reflect the model distribution in an adequate way. In our concrete situation this means that the bootstrap distribution of T0
must be obtained mimicking the considered model. The one-way model will allow
us to illustrate this comment by considering two slightly different sets of assumptions
on the underlying populations.
Resampling R1 : Identical populations.
This is the standard case where the distributions associated to the populations differ
only in the means, that is: eij Fi = F, i = 1, . . . , p. Since errors are interchangeable
through the populations, we should take Fn as the empirical distribution of the n
residuals eij = Yij Y i , i = 1, . . . , p, j = 1, . . . , ni . Recalling the schematic outline
presented after (3.1), we have
P
T0
2
ni (X i X )2 /(p 1)
i ni (ei e ) /(p 1)
=P P
,
=
P
P
2
i
j (eij ei ) /(n p)
i
j (Xij X i ) /(n p)
i
(4.3)
i = eij + Y i Y i = eij Fn .
where Xij = Yij
In practice, the simulation of the T0 distribution will be done generating B independent samples from the empirical distribution of the n residuals eij , that is, considering
5
B sets with n elements chosen with replacement from R = {

eij , i = 1, . . . , p, j =
1, . . . , ni }.
Resampling R2 : Different populations.
Assume now the distributions associated with every population are different not
only in the means but in other aspects; that is, eij are independent and eij Fi , j =
1, . . . , ni , where Fi are different with mean 0 and variance 2 , i = 1, . . . , p. To reflect
the new joint distribution of the vector of errors, we will take now Fn as the joint
distribution of the p empirical independent distributions, Fini , corresponding to the
sets Ri = {
eij , j = 1, . . . , ni }, i = 1, . . . , p. The statistic T0 still has the expresion
(4.3), with Xij = eij Fin .
i
In practice, T0 will be simulated from B sets of n elements, ni of them will be

selected with replacement from Ri , i = 1, . . . , p. Note that resampling from Ri is
equivalent to resampling from Oi = {Yij , j = 1, . . . , ni }, then obtaining Oi = {Yij , j =
1, . . . , ni }, and, finally, defining Xij = Yij Y i , j = 1, . . . , ni .
4.3
Normalizing the residuals.
The idea of reflecting the model hypothesis in the bootstrap scheme, leads us to consider the resampling in a population of residuals where the empirical variances in each
subpopulation are equal; without loss of generality we take this common value equal
to 1. This leads us to replace (4.3) by
P
TN
2
i ni (eN i eN ) /(p 1)
=P P
,
2
i
j (eN ij eN i ) /(n p)
(4.4)
where eN ij FN n , and FN n is the empirical (joint) distribution of the standardized

i ,
i2 =
residuals: eN ij = (Yij Y i )/
j (Yij
Y i )2 /ni .
Note that the statistic TN can be also obtained in the following way. Assume
initially heteroscedasticity and consider, as alternative to (4.2),
P
2
i ni (W i W ) /(p 1)
T0N (Y, , 1 , . . . , p ) = P P
,
2
i
j (Wij W i ) /(n p)
where Wij = (Yij i )/i , and i2 is the populational variance of Pi . If there exists
homoscedasticity, T0N becomes T0 and its distribution is the same as the distribution
6

of T under H0 . It follows inmediately that the bootstrap distribution of T0N
coincides
with the distribution of TN . Heuristically, we should expect that this statistic will
improve the approximation to the null distribution.
The simulation of TN will be done similarly as in Subsection 4.2., but starting with
the normalized residuals eN ij . Both schemes above cited are still valid using now the
normalized residuals. In the next section, we will denote them resampling R3 and R4 .
Remark: The analysis and comparison of different residual resampling schemes
adapted to the assumptions are frequent in the bootstrap literature on linear models.
The first suggestion is already in Efron (1979); in fact, our scheme R1 adapts his
suggestion to the one-way model. Close to the resamplings presented above are those
considered in Boos and Brownie (1989) and Boos et al. (1989). A wide discussion on
residual resampling in linear regression can be found in Wu (1986) (see also Liu (1988)).
Likewise, residual resamplings for bootstrap generalized linear regression models are
considered in Moulton and Zeger (1991).
A simulation study.
In this section we illustrate the behaviour of our proposal by means of a simulation

study . The results, obtained using only 199 bootstrap resamples, generally indicate a
satisfactory performance.
Through our simulation study, we have maintained three populations, i.e., p = 3.
We considered combinations of two nominal levels, = .05, .1, two total sample sizes,
n = 30, 60, and three groups of error distributions associated to the populations,
Gk , k = 1, 2, 3. The composition of these groups is the following. In G1 the errors
have normal distributions; in G2 they have shifted exponential distributions; and, in
G3 the errors have different distributions: N (0, 1), shifted exponential and normalized
t3 .
The computations were performed using Fortran routines running on a DECstation
5000/200 under Ultrix-32. The routines GGUBS, GGNML and GGAMR of the IMSL
library are used to generate the pseudo-random variables and to make the resamplings.
Error distributions
Resamplings
G1 : N (0, 1)
G2 : E(1)
R1
R3
R1
.053
.055
R3
G3 : N (0, 1), E(1), t3

R1
R2
R4
.049
.032#
.042
.096
Population sizes
(10, 10, 10)
(5, 10, 15)
(20, 20, 20)
(10, 20, 30)
= .05
.051 .048
= .1
.099
.099
.095 .096
.098
.080
= .05
.038
.042
.044 .041
.060
.043
.052
= .1
.091
.090
.109 .111
.115
.090
.110
= .05
.043
.047
.046 .047
.050
.043
.050
= .1
.096
.093
.097 .097
.112
.094
.105
= .05
.046
.043
.057 .056
.063
.043
.052
= .1
.101
.097
.110 .107
.106
.090
.097
Table 1: Bootstrap levels for 28 combinations of error distributions, resamplings, and

population sizes.
5.1
Results under the null hypothesis.
Without loss of generality, we took 1 = 2 = 3 = 0. The specific distributions for

the above groups were: N (0, 1), shifted exponential with = 1, and t3 normalized to
have variance unity.
We conducted 1000 trials under each of the 28 combinations of 4 particular population sizes and 7 types of resampling listed in Table 1. In each trial, B = 199 resamples
were drawn from the empirical distribution associated to the resampling. By means of
(3.3), these resamples provided the approximate ti values and, in each trial, H0 was
rejected if T (Y ) > ti , = .05, .1, i = 1, . . . , 1000. (Specifically, ti was the 9th largest
of the 199 values of T0 when = .05, and the 19th largest when = .1 .)
The entries of Table 1 are the bootstrap levels, , that is, the proportion of trials
rejecting H0 . The symbol # denotes cases with levels differing significantly of .
We would like to point out some empirical facts. Most of the bootstrap levels are
very close to the nominal levels, specially when the appropriate resamplings are used,
and even when the inadequate resamplig R1 is used in G3 . The performances under
normal or exponential distributed errors are very similar, indicating that asymmetric
errors are not troublesome. For the group G3 (different populations), the results with
resampling R2 are slightly better than with R1 and both are dominated by the results
using R4 .
5.2
Results under the alternative hypothesis.
We have also verified the good approximation to the true critical values and the high
empirical power provided by the bootstrap test when the data come from populations
with unequal means. Our first illustration will use some of the combinations considered
in Section 5.1, being now 1 = 1, 2 = 0, 3 = 1.
(Insert Figure 1 about here)
Figure 1 depicts box-plots (with whiskers ending at the extreme values) of 1000
bootstrap estimated critical values, ti , obtained from (3.3). The errors belong to
group G1 with N (0, 1) and G2 with shifted exponential, = 1. (The true values are
F2,27,.05 = 3.35 in G1 and were obtained by simulation using 5000 trials in G2 .) Note
the concentration around the true critical values, and the similar results given by R1
and R3 .
We also consider the general alternative HA : 1 = , 2 = 0, 3 = , (0, 1],
for the above 28 combinations. We obtained similar results for = .05 and = .1 and
we will report the one corresponding to = .05.
Table 2 gives the proportion of trials rejecting the equality of means, that is, the
power, under the above alternatives = .1, .4, .7, 1 for the combinations. In parenthesis
are estimates of the power obtained using the correct critical values. From now on the
true values will be estimated by simulation using 5000 trials. It is worthwhile noting
the high values always attained for = 1, and for = .7 when n = 60.
For the general alternative, we have the following comments. In groups G1 and
G2 , the approximation to true power provided by resampling R1 and R3 is good for
both sizes. In G3 and equal population sizes ((10, 10, 10), (20, 20, 20)), resamplings R1
and R4 gave analogous and excellent approximations to the true power; resampling R2
behaved worse, specially when n = 30.
Error dist.
G1 : N (0, 1)
Resamplings
R1
R3
(S)
G2 : E(1)
R1
R3
G3 : N (0, 1), E(1), t3

(S)
R1
R2
R4
(S)
Pop. sizes
(10, 10, 10)
.1
.059 .059 (.071)
.055 .057
(.072)
.064 .047
.057
(.068)
.4
.316 .314 (.309)
.398 .384
(.390)
.347 .279
.318
(.346)
.7
.734 .724 (.763)
.776 .771
(.786)
.787 .713
.764
(.806)
.976 .979 (.976)
.945 .946
(.957)
.957 .917
.947
(.961)
.1
.061 .061 (.060)
.056 .049
(.058)
.086 .067
.076
(.068)
.4
.258 .261 (.269)
.292 .295
(.318)
.338 .269
.318
(.285)
.7
.667 .662 (.674)
.725 .726
(.730)
.690 .595
.658
(.656)
.931 .928 (.941)
.926 .927
(.931)
.930 .867
.908
(.905)
.1
.072 .075 (.077)
.105 .100
(.091)
.076 .061
.063
(.084)
.4
.604 .602 (.594)
.625 .619
(.614)
.616 .589
.607
(.616)
.7
.975 .975 (.979)
.970 .969
(.970)
.972 .954
.970
(.966)
1.00 1.00 (1.00)
1.00 1.00
(.999)
.999 .996
.998
(.997)
.1
.083 .081 (.073)
.084 .082
(.062)
.104 .066
.084
(.083)
.4
.517 .515 (.503)
.556 .552
(.563)
.544 .493
.534
(.511)
.7
.947 .948 (.950)
.946 .945
(.936)
.930 .910
.921
(.929)
1.00 1.00 (1.00)
.998 .998
(.996)
.991 .985
.988
(.992)
(5, 10, 15)
(20, 20, 20)
(10, 20, 30)
Table 2: Bootstrap and true power in HA : 1 = ; 2 = 0, 3 = , = .1, .4, .7, 1.0,

for the 28 combinations of Table 1. Estimates of power for the tests with correct .05
level are in parenthesis.
10
The situation G3 and different population sizes is illustrated in Figure 2 that shows,
as function of [0, 1], = 0(.1)1, the true (simulated) power for the test with = .05
level and the power curves for resampling R2 and R4 . R1 was not included for the sake
of presentation clarity; its graphic was always above the R4 curve, its approximation
was in general worse that the one corresponding to R4 and practically in agree with this
one for ' 1. With respect to Figure 2 we note that the true curve is nearly always
between R2 and R4 curves. With n = 30, the true power is better adjusted by R2 for
alternatives close to H0 (approximately .4) and by R4 for the opposite situation.
With n = 60, extremly good approximations are obtained and the differences between
R4 and R4 vanish.
5.3
Results under heteroscedasticity.
Although our development does not deal with heteroscedastic situations, we planned
to check how our proposal would perform under unequal variances. With this aim,
we considered the groups G1 and G3 . In G1 the normal distributions had standard
deviations i = i, i = 1, 2, 3; in G3 we included standard normal, shifted exponential
with = 2, and t3 ( = 3) distributions. The total sizes (n = 30, 60) were assigned to
the populations in three ways: balanced, higher sizes to higher variances, and viceversa.
For each assignation, resamplings R1 and R2 were considered. Both groups and sizes
provided similar results.
As a summary of empirical conclusions about the level approximation we have: the
resampling R1 was very unappropriated with unbalanced sizes, the resampling R1 was
beaten by R2 in all the situations; therefore, if we suspect of possible heteroscedasticity,
the resampling starting from different populations will provide more accurate levels.
The best results occur when there is correspondence between sizes and variability in
populations.
With respect to the power, the above behaviour is maintained and we had good
approximations to the true (simulated) power even with total size n = 30. This is
illustrated in Figure 3. It shows, for G1 and n = 30, the power curves for R1 and R2
(and two population sizes: ((5, 10, 15), (15, 10, 5)) ) as function of [0, 1], = 0(.1)1.
Note that = 0 corresponds to the level approximation. The low true power is a
11
consequence of the high ratios between variances and means.

5.4
Bootstrapping the F-statistic directly
Finally, we will show the inaccuracy of the nave bootstrap directly based on the Fstatistic of (4.1), that is, when the resampling uses
P
TD
ni (Y i Y )2 /(p 1)
=P P
,
2
i
j (Yij Y i ) /(n p)
i
where Yij = eij + Yi , and eij follow one of the empirical distributions presented in
Sections 4.2 and 4.3. We will refer this procedure as direct bootstrap.
We began conducting 1000 trials for the situation and combinations considered in
Section 5.1. The direct bootstrap levels, D

, turned out to be much lower than the
nominal levels in fact, only 5 of 56 cases considered were different of zero with
a maximum value of .004. We also studied the distribution of the direct critical
values, tD
i , i = 1, . . . , 1000, = .05, .1, corresponding to 4 combinations: groups
G1 and G2 with resampling R1 , and G3 with R1 and R3 .
The population sizes
were ni = 10. This study was done with data from H0 and from the alternative
HA : 1 = 1, 2 = 0, 3 = 1.
The distributions of the direct critical values were very similar in the four combinations. Figure 4 shows box-plots summarizing the distributions under H0 and HA ,
= .05, n = 30, in the combination G1 and R1 . For comparison, we have also included
box-plots for the distributions of the critical values, ti , derived from our proposal.
Two points clearly stand out. Whether the data came from H0 or from HA , the
distribution of critical values tD
i is useless to approximate the true critical value
(F2,28,.05 = 3.35). On the contrary, both from H0 and HA , the distributions of proposed
critical values ti are extremely similar and very concentrated around the true critical
value.
12
Acknowledgement
We thank the Co-Editor and two referees for their helpful comments and suggestions.
References
Beran, R., Prepivoting tests statistics: A bootstrap view of asymptotic refinements,
J. Amer. Statist. Assoc., 83 (1988) 687697.
Boos, D. D. and Brownie, C., Bootstrap methods for testing homogeneity of variances,
Technometrics,31 (1989) 6982.
Boos, D. D., Janssen, P. and Veraverbeke, N., Resampling from centered data in the
two-sample problem, J. Statist. Plann. Inference, 21 (1989) 327345.
Efron, B., Bootstrap methods: another look at the Jacknife, Ann. Statist., 7 (1979)
126.
Hall, P., The Bootstrap and Edgeworth Expansion (Springer-Verlag, New York, 1992)
Hall, P. and Wilson, S.R., Two guidelines for bootstrap hypothesis testing, Biometrics,
47 (1991) 757762.
Hinkley, D. V., Bootstrap methods, J. Roy. Statist. Soc. Ser. B,50 (1988) 321337.
Liu, R. Y., Bootstrap procedures under some non-i.i.d. models, Ann. Statist., 16
(1988) 16961708.
Mammen, E., When does bootstrap work? (Lecture Notes in Statistics, Vol. 77,
Springer-Verlag, New York, 1992)
Moulton, L. H. and Zeger, S.L., Bootstraping generalized linear models, Comp. Stat.
Data Anal., 11 (1991) 5363.
Romano, J. H., A bootstrap revival of some nonparametric distance tests, J. Amer.
Statist. Assoc., 83 (1988) 698708.
Wu, C. F. J., Jacknife, bootstrap and other resampling methods in regression analysis,
Ann. Statist., 14 (1986) 16211295.
Zhang, J. and Boos, D. D., Bootstrap critical values for testing homogeneity of covariance matrices, J. Amer. Statist. Assoc., 87 (1992) 425429.
13
G1 -R1 G1 -R3 G1 -R1 G1 -R3 G2 -R1 G2 -R3 G2 -R1 G2 -R3

(10, 10, 10) (5, 10, 15)
(10, 10, 10) (5, 10, 15)
Fig. 1: Box-plots of bootstrap critical values (N=1000) with data from

HA : 1 = 1, 2 = 0, 3 = 1.
Mean of bootstrap critical values. True critical value.
14
Fig. 2: Power graphics for different populations: G3 , = .05.

HA : 1 = , 2 = 0, 3 = ; = 0(.1)1.
true power, - - - power using R2 , --- power using R4 .
15
Fig. 3: Bootstrap and true power with heteroscedastic data, = .05.

Error distributions: N (0, i = i), i = 1, 2, 3.
HA : 1 = , 2 = 0, 3 = ; = 0(.1)1.
true power, --- power using R1 , - - - power using R2 .
16
Data from H0
Modified statistic
Direct statistic
Data from HA
Modified statistic
Direct statistic
Fig. 4: Box-plots of bootstrap critical values (N=1000) using

modified and direct F-statistics.
Mean of bootstrap critical values. True critical value
17

Bootstrapping The General Linear Hypothesis Test: Pedro Delicado

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Bootstrapping The General Linear Hypothesis Test: Pedro Delicado

Uploaded by

Copyright:

Available Formats

Bootstrapping the general linear

Manuel del Ro1

Abstract: We discuss the use of bootstrap methodology in hypothesis testing, focusing

Correspondence to: Manuel del Ro, Departamento de Estadstica e Investigacion Operativa,

Facultad de Matematicas, Universidad Complutense de Madrid, 28040 Madrid, Espa

Two general guidelines.

Bootstrap hypothesis testing in the linear model.

kPV |V0 Y k2 /(p p0 )

where V , V |V0 , and PW denote, respectively, the subspace orthogonal to V, the

Hence, we need to know or, in the practice, to approximate by Monte Carlo

Bootstrapping the one-way ANOVA model.

Raising the problem.

We assume a set of ni observations, Yij , j = 1, . . . , ni , coming from a population

= p . This problem fits into the framework of last section by defining Y =

ni , = (1 , . . . , p )0 ; V0 and V are now the

The statistic T0 in (3.2) is now

where Xij = Yij i = eij , X i =

To obtain a good approximation through the bootstrap approach in a general arbitrary

B sets with n elements chosen with replacement from R = {

In practice, T0 will be simulated from B sets of n elements, ni of them will be

Normalizing the residuals.

where eN ij FN n , and FN n is the empirical (joint) distribution of the standardized

In this section we illustrate the behaviour of our proposal by means of a simulation

G3 : N (0, 1), E(1), t3

Table 1: Bootstrap levels for 28 combinations of error distributions, resamplings, and

Results under the null hypothesis.

Without loss of generality, we took 1 = 2 = 3 = 0. The specific distributions for

Results under the alternative hypothesis.

G3 : N (0, 1), E(1), t3

(10, 10, 10)

.059 .059 (.071)

.316 .314 (.309)

.734 .724 (.763)

.976 .979 (.976)

.061 .061 (.060)

.258 .261 (.269)

.667 .662 (.674)

.931 .928 (.941)

.072 .075 (.077)

.604 .602 (.594)

.975 .975 (.979)

1.00 1.00 (1.00)

.083 .081 (.073)

.517 .515 (.503)

.947 .948 (.950)

1.00 1.00 (1.00)

(5, 10, 15)

(20, 20, 20)

(10, 20, 30)

Table 2: Bootstrap and true power in HA : 1 = ; 2 = 0, 3 = , = .1, .4, .7, 1.0,

Results under heteroscedasticity.

consequence of the high ratios between variances and means.

Bootstrapping the F-statistic directly

Section 5.1. The direct bootstrap levels, D

The population sizes

G1 -R1 G1 -R3 G1 -R1 G1 -R3 G2 -R1 G2 -R3 G2 -R1 G2 -R3

(10, 10, 10) (5, 10, 15)

Fig. 1: Box-plots of bootstrap critical values (N=1000) with data from

Fig. 2: Power graphics for different populations: G3 , = .05.

Fig. 3: Bootstrap and true power with heteroscedastic data, = .05.

Fig. 4: Box-plots of bootstrap critical values (N=1000) using

You might also like