Chap5 - Multivariate Regression and Linear Model
5. Multivariate Regression
individual    Y     X1     X2    ···    Xk
1             y1    x11    x12   ···    x1k
2             y2    x21    x22   ···    x2k
⋮             ⋮     ⋮      ⋮           ⋮
n             yn    xn1    xn2   ···    xnk
Assume that the model is linear in β1, . . . , βk:

$$Y = \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2).$$

Then, in matrix form,

$$y = X\beta + \varepsilon,$$

where y is n × 1, X is n × k, β is k × 1 and ε is n × 1, and the LS estimator is

$$\hat{\beta} = (X'X)^{-1}X'y.$$
Remarks
1. Y is continuous and the Xi's are assumed to be non-random; when the Xi's are also random, we treat it as a conditional model given the values of the Xi's. Indeed, if (Y, X1, . . . , Xk) is (k + 1)-variate normal, Property 7(d) in Section 2.1 shows that the conditional mean E(Y |X1, . . . , Xk) is indeed a linear function of the given values of X1, . . . , Xk.
3. When all Xi 's are dummy variables for discrete variables, it is called (one-way,
multi-way) ANOVA.
4. When some are continuous and some are dummy, it is called ANCOVA.
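As a quick numerical illustration of the estimator above (a sketch with simulated data; all variable names are invented, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3

# Simulated design matrix; in the model the X's are treated as non-random
X = rng.normal(size=(n, k))
beta = np.array([1.0, -2.0, 0.5])

# Y = β1 X1 + ... + βk Xk + ε with ε ~ N(0, σ²), σ = 0.3
y = X @ beta + rng.normal(scale=0.3, size=n)

# LS estimator β̂ = (X'X)⁻¹X'y; solving the normal equations avoids
# forming the inverse explicitly
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

With n = 50 and small noise, β̂ should land close to the true β.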
HKU STAT7005 Multivariate Methods
The p responses of the n observations satisfy

$$\begin{pmatrix} y_{11} & \cdots & y_{1p} \\ \vdots & & \vdots \\ y_{n1} & \cdots & y_{np} \end{pmatrix} = \begin{pmatrix} x_{11} & \cdots & x_{1k} \\ \vdots & & \vdots \\ x_{n1} & \cdots & x_{nk} \end{pmatrix} \begin{pmatrix} \beta_{11} & \cdots & \beta_{1p} \\ \vdots & & \vdots \\ \beta_{k1} & \cdots & \beta_{kp} \end{pmatrix} + \begin{pmatrix} \varepsilon_{11} & \cdots & \varepsilon_{1p} \\ \vdots & & \vdots \\ \varepsilon_{n1} & \cdots & \varepsilon_{np} \end{pmatrix},$$

or, written row by row,

$$\begin{pmatrix} y_1' \\ \vdots \\ y_n' \end{pmatrix} = \begin{pmatrix} x_{11} & \cdots & x_{1k} \\ \vdots & & \vdots \\ x_{n1} & \cdots & x_{nk} \end{pmatrix} \begin{pmatrix} \beta_1' \\ \vdots \\ \beta_k' \end{pmatrix} + \begin{pmatrix} \varepsilon_1' \\ \vdots \\ \varepsilon_n' \end{pmatrix},$$

that is,

$$\underset{n\times p}{Y} = \underset{n\times k}{X}\;\underset{k\times p}{B} + \underset{n\times p}{U},$$

where $\varepsilon_j' = (\varepsilon_{j1}\ \varepsilon_{j2}\ \cdots\ \varepsilon_{jp})$ denotes the vector of errors of observation $j$ for $j = 1, \dots, n$, and $\varepsilon_1, \dots, \varepsilon_n \overset{\mathrm{iid}}{\sim} N_p(0, \Sigma)$.
Remarks
1. The Yi's are continuous and the Xi's are assumed to be non-random; when the Xi's are also random, we treat it as a conditional problem given the values of the Xi's.
3. When all Xi 's are dummy variables for discrete variables, it is called (one-way,
multi-way) MANOVA.
4. When some are continuous and some are dummy, it is called MANCOVA.
Notation
Consider the response matrix $Y = (y_{ij})$, where $y_{ij}$ is the $j$th response from the $i$th observational unit ($i = 1, \dots, n$; $j = 1, \dots, p$),

$$\underset{n\times p}{Y} = \Big(\underset{n\times 1}{y^{(1)}}, \dots, \underset{n\times 1}{y^{(p)}}\Big) = \Big(\underset{p\times 1}{y_1}, \dots, \underset{p\times 1}{y_n}\Big)',$$

and a $k \times p$ matrix $B = (\beta_{\ell j})$, where $\beta_{\ell j}$ is the $\ell$th regression coefficient for the $j$th response ($\ell = 1, \dots, k$; $j = 1, \dots, p$). According to this partition of the matrix $Y$, $y^{(j)}$ contains the observations of the $j$th response ($j = 1, \dots, p$), while $y_i'$ contains the $i$th observation of all $p$ responses ($i = 1, \dots, n$). Simply speaking, $y^{(j)}$ and $y_i$ correspond to partitioning the matrix $Y$ into columns and rows respectively. The general multivariate linear model can be written as

$$\underset{n\times p}{Y} = \underset{n\times k}{X}\;\underset{k\times p}{B} + \underset{n\times p}{U}, \qquad (5.2.1)$$

where $U = (\varepsilon^{(1)}, \dots, \varepsilon^{(p)}) = (\varepsilon_1, \dots, \varepsilon_n)'$ and $\varepsilon_1, \dots, \varepsilon_n \overset{\mathrm{iid}}{\sim} N_p(0, \Sigma)$. Based on the definition of $U$,

$$\mathrm{Vec}(U) = \begin{pmatrix} \varepsilon^{(1)} \\ \varepsilon^{(2)} \\ \vdots \\ \varepsilon^{(p)} \end{pmatrix}.$$

Clearly,

$$E(\mathrm{Vec}(U)) = \underset{np\times 1}{0}.$$
The model combines p univariate linear models into a single multivariate linear model. If the n × k matrix X has full rank and we apply the Least Squares (LS) principle separately to the jth column of Y and the jth column of B, noting that the rows are independent, we obtain the LS estimators, which are unbiased, for the columns of B. Putting these columns together, we have

$$\hat{B} = (X'X)^{-1}X'Y \qquad (5.3.1)$$

and the estimator of $\Sigma$,

$$S = \frac{1}{n-k}\hat{U}'\hat{U}, \qquad \hat{U} = Y - X\hat{B}.$$
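A minimal numpy sketch of (5.3.1) and of S (simulated data; names invented). Because B̂ is just the univariate LS estimator applied column by column, it can be computed in one solve:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, p = 60, 2, 3

X = rng.normal(size=(n, k))
B = rng.normal(size=(k, p))
# Rows of U are iid N_p(0, Σ); Σ = I here for simplicity
Y = X @ B + rng.normal(size=(n, p))

# B̂ = (X'X)⁻¹X'Y — all p columns fitted at once
B_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# S = Û'Û / (n − k) with Û = Y − XB̂
U_hat = Y - X @ B_hat
S = U_hat.T @ U_hat / (n - k)
```

Column j of B̂ coincides with the univariate LS fit of column j of Y on X, which is the point of the derivation above.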
The log-likelihood is

$$\ell(B, \Sigma) = -\frac{np}{2}\log(2\pi) - \frac{n}{2}\log|\Sigma| - \frac{1}{2}\sum_{i=1}^{n}(y_i - \mu_i)'\Sigma^{-1}(y_i - \mu_i)$$
$$= -\frac{np}{2}\log(2\pi) - \frac{n}{2}\log|\Sigma| - \frac{1}{2}\mathrm{tr}[(Y - XB)\Sigma^{-1}(Y - XB)']$$
$$= -\frac{np}{2}\log(2\pi) - \frac{n}{2}\log|\Sigma| - \frac{1}{2}\mathrm{tr}[\Sigma^{-1}(Y - XB)'(Y - XB)].$$

By the method of completing squares, we have

$$\ell(B, \Sigma) = -\frac{np}{2}\log(2\pi) - \frac{n}{2}\log|\Sigma| - \frac{1}{2}\mathrm{tr}[\Sigma^{-1}(Y - X\hat{B})'(Y - X\hat{B})] - \frac{1}{2}\mathrm{tr}[X(\hat{B} - B)\Sigma^{-1}(\hat{B} - B)'X']$$
$$= -\frac{np}{2}\log(2\pi) - \frac{n}{2}\log|\Sigma| - \frac{1}{2}\mathrm{tr}[\Sigma^{-1}(Y - X\hat{B})'(Y - X\hat{B})] - \frac{1}{2}\sum_{i=1}^{n} a_i'\Sigma^{-1}a_i,$$

where $a_i'$ is the $i$th row of $X(\hat{B} - B)$. Since $a_i'\Sigma^{-1}a_i \ge 0$ for any $B$, we have

$$\ell(B, \Sigma) \le \ell(\hat{B}, \Sigma) = -\frac{np}{2}\log(2\pi) - \frac{n}{2}\log|\Sigma| - \frac{n}{2}\mathrm{tr}[\Sigma^{-1}\hat{U}'\hat{U}/n],$$

and, maximizing over $\Sigma$ as well,

$$\ell(B, \Sigma) \le \ell(\hat{B}, \Sigma) \le \ell\!\left(\hat{B}, \frac{\hat{U}'\hat{U}}{n}\right).$$

Hence $\hat{B}$ and $\hat{\Sigma} = \hat{U}'\hat{U}/n$ are the MLEs.
Furthermore,

$$E(\hat{B}) = B \quad\text{and}\quad \mathrm{Cov}(\hat{\beta}^{(i)}, \hat{\beta}^{(j)}) = \sigma_{ij}(X'X)^{-1}.$$
Proof: Since $y^{(i)} = X\beta^{(i)} + \varepsilon^{(i)}$ and $\hat{\beta}^{(i)}$ can be obtained by the regression of the response $Y_i$ on $X_1, \dots, X_k$, we have

$$\hat{\beta}^{(i)} = (X'X)^{-1}X'y^{(i)} = (X'X)^{-1}X'(X\beta^{(i)} + \varepsilon^{(i)}) = \beta^{(i)} + (X'X)^{-1}X'\varepsilon^{(i)},$$

so that $E(\hat{\beta}^{(i)}) = \beta^{(i)}$ for $i = 1, \dots, p$. Hence,

$$E(\mathrm{Vec}(\hat{B})) = E\begin{pmatrix} \hat{\beta}^{(1)} \\ \hat{\beta}^{(2)} \\ \vdots \\ \hat{\beta}^{(p)} \end{pmatrix} = \begin{pmatrix} \beta^{(1)} \\ \beta^{(2)} \\ \vdots \\ \beta^{(p)} \end{pmatrix} = \mathrm{Vec}(B).$$
Moreover,

$$\mathrm{Cov}(\hat{\beta}^{(i)}, \hat{\beta}^{(j)}) = (X'X)^{-1}X'\,\mathrm{Cov}(\varepsilon^{(i)}, \varepsilon^{(j)})\,X(X'X)^{-1} = (X'X)^{-1}X'(\sigma_{ij}I)X(X'X)^{-1} = \sigma_{ij}(X'X)^{-1} \quad\text{for } i, j = 1, \dots, p.$$
Define the residual sum of squares and cross-products matrix

$$E = \hat{U}'\hat{U} = (Y - X\hat{B})'(Y - X\hat{B}) = Y'(I - X(X'X)^{-1}X')Y \sim W_p(n - k, \Sigma), \qquad (5.4.1)$$

so that

$$S = \frac{E}{n-k} = \frac{1}{n-k}\big(Y'Y - Y'X(X'X)^{-1}X'Y\big). \qquad (5.4.2)$$

Proof: Use Property 6(a) of the Wishart distribution in Section 2.3, with $A = I - X(X'X)^{-1}X'$ and $M = E(Y) = XB$.
It follows that the estimated covariance matrix of $\mathrm{Vec}(\hat{B})$ is

$$\widehat{\mathrm{Var}}(\mathrm{Vec}(\hat{B})) = S \otimes (X'X)^{-1}.$$
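In practice the Kronecker form is how one reads off standard errors for the individual coefficients; a sketch continuing the same kind of simulated setup (all names invented):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, p = 80, 3, 2
X = rng.normal(size=(n, k))
Y = X @ rng.normal(size=(k, p)) + rng.normal(size=(n, p))

B_hat = np.linalg.solve(X.T @ X, X.T @ Y)
U_hat = Y - X @ B_hat
S = U_hat.T @ U_hat / (n - k)
XtX_inv = np.linalg.inv(X.T @ X)

# Estimated Var(Vec(B̂)) = S ⊗ (X'X)⁻¹, a kp × kp matrix
var_vec = np.kron(S, XtX_inv)

# The standard error of β̂_{ℓj} is the square root of the matching
# diagonal entry; the reshape recovers the k × p layout of B̂
se = np.sqrt(np.diag(var_vec)).reshape(p, k).T
```

Block (i, j) of the Kronecker product is exactly σ̂ij (X'X)⁻¹, matching the covariance result proved above.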
where the rank condition for the matrix L ensures no redundant hypotheses. Then we can obtain the following results. Therefore, the hypothesis (5.5.1) can be tested on the basis of the eigenvalues of $E^{-1}H$, where $E$ is given by (5.4.1) and $H$ by (5.5.2) (see Section 4.2 for the four commonly used statistics, in which $\Lambda$ and $R$ are used for the LRT and the UIT respectively).

More generally, consider the hypothesis

$$H_0: LBM = 0. \qquad (5.5.3)$$
(a)
$$H = M'\hat{B}'L'\{L(X'X)^{-1}L'\}^{-1}L\hat{B}M \sim W_m(c, M'\Sigma M) \text{ under } H_0, \qquad (5.5.4)$$

(b)
$$E = (\hat{U}M)'(\hat{U}M) \sim W_m(n - k, M'\Sigma M), \qquad (5.5.5)$$
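The matrices H in (5.5.4) and E in (5.5.5) can be formed directly from B̂ and Û, after which Wilks' Λ = |E|/|E + H| (one of the four statistics of Section 4.2) follows. A sketch with simulated data satisfying H0 (all names invented):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, p = 100, 4, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
B_true = np.zeros((k, p))
B_true[0] = [1.0, 2.0]           # only the intercept row is nonzero, so H0 holds
Y = X @ B_true + rng.normal(size=(n, p))

B_hat = np.linalg.solve(X.T @ X, X.T @ Y)
U_hat = Y - X @ B_hat

L = np.eye(k)[1:]                # selects the non-intercept rows of B
M = np.eye(p)                    # keeps both responses

# H = M'B̂'L'{L(X'X)⁻¹L'}⁻¹LB̂M  and  E = (ÛM)'(ÛM)
LBM = L @ B_hat @ M
inner = L @ np.linalg.inv(X.T @ X) @ L.T
H = M.T @ B_hat.T @ L.T @ np.linalg.solve(inner, LBM)
E = (U_hat @ M).T @ (U_hat @ M)

wilks = np.linalg.det(E) / np.linalg.det(E + H)
```

Since H is positive semi-definite and E positive definite, Λ always lies in (0, 1], with small values evidence against H0.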
Consider prediction of the response $y_0$ at a new design point $x_0$, where

$$E(y_0) = B'x_0.$$

Its natural estimate is

$$\widehat{E}(y_0) = \hat{B}'x_0,$$

with

$$\mathrm{Var}(\widehat{E}(y_0)) = \mathrm{Var}(\hat{B}'x_0) = \mathrm{Var}\begin{pmatrix} x_0'\hat{\beta}^{(1)} \\ x_0'\hat{\beta}^{(2)} \\ \vdots \\ x_0'\hat{\beta}^{(p)} \end{pmatrix},$$

where $\hat{B} = (\hat{\beta}^{(1)}, \dots, \hat{\beta}^{(p)})$. Consider the $(i, j)$th element of the above covariance matrix.
The $(1 - \alpha)100\%$ Scheffé's simultaneous C.I.s for $E(y_{0i}) = x_0'\beta^{(i)}$ are

$$x_0'\hat{\beta}^{(i)} \pm T_\alpha(p, n - k)\sqrt{s_{ii}\,x_0'(X'X)^{-1}x_0}.$$

For a new observation $y_0$,

$$T^2 = \left(\frac{y_0 - \hat{B}'x_0}{\sqrt{1 + x_0'(X'X)^{-1}x_0}}\right)' S^{-1} \left(\frac{y_0 - \hat{B}'x_0}{\sqrt{1 + x_0'(X'X)^{-1}x_0}}\right) \sim T^2(p, n - k).$$

Consequently, the $(1 - \alpha)100\%$ confidence region for $y_0$ is a $p$-dimensional ellipsoid,

$$\left\{ y_0 : \left(\frac{\hat{B}'x_0 - y_0}{\sqrt{1 + x_0'(X'X)^{-1}x_0}}\right)' S^{-1} \left(\frac{\hat{B}'x_0 - y_0}{\sqrt{1 + x_0'(X'X)^{-1}x_0}}\right) \le T_\alpha^2(p, n - k) \right\}.$$
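Membership in the ellipsoid is a single quadratic-form comparison; a sketch with simulated data, using the standard relation $T_\alpha^2(p, m) = \frac{pm}{m - p + 1}F_\alpha(p, m - p + 1)$ with $m = n - k$ (scipy assumed available; all names invented):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, k, p = 60, 2, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Y = X @ np.array([[1.0, 0.5], [2.0, -1.0]]) + rng.normal(size=(n, p))

B_hat = np.linalg.solve(X.T @ X, X.T @ Y)
U_hat = Y - X @ B_hat
S = U_hat.T @ U_hat / (n - k)

x0 = np.array([1.0, 0.3])                        # new design point (with intercept)
h = 1.0 + x0 @ np.linalg.solve(X.T @ X, x0)      # 1 + x0'(X'X)⁻¹x0

def in_region(y0, alpha=0.05):
    """True iff y0 lies inside the (1 − α) confidence ellipsoid."""
    d = y0 - B_hat.T @ x0
    t2 = d @ np.linalg.solve(S, d) / h
    m = n - k
    # T²_α(p, m) critical value via the F distribution
    crit = p * m / (m - p + 1) * stats.f.ppf(1 - alpha, p, m - p + 1)
    return t2 <= crit
```

The fitted point B̂'x0 is the ellipsoid's centre (quadratic form zero), so it always lies inside; points far from the centre fall outside.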
5.7 Examples
Example 5.1
Paper qualities were measured for 41 specimens on their density (X), strength in the machine direction (Y1) and strength in the cross direction (Y2). The data set can be found in the Moodle, named paper_quality.dat.

(a) With this data set, fit the multivariate regression for the two types of strength on the density.

(b) Test whether the density X is significant for the two types of strength.

(c) Does the independent variable X have the same effect on the two types of strength?
Solution
(a)
Consider the following regression model:

Y1 = β01 + β11 X + ε1
Y2 = β02 + β12 X + ε2

The fitted equations are obtained from the software output, where the numbers in parentheses are the standard errors of the corresponding parameter estimates.

(b) We test the hypothesis H0: β11 = β12 = 0. From the results of the four multivariate statistics, their p-values are less than 5%. Therefore, we conclude that the independent variable X is significant at the 5% level.
(c) We test H0: β11 = β12. From the results of the four multivariate statistics, their p-values are greater than 5%. Therefore, we conclude that the independent variable X has the same effect on the two types of strength.
Example 5.2

The data file BANK.DAT contains observations on 100 bank employees on each of six variables: LCURRENT, LSTART, EDUC, EXPER, AGE and SENIOR.

(a) Fit the multivariate regression for LCURRENT and LSTART on EDUC, EXPER, AGE and SENIOR.

(b) Test whether all four independent variables are not useful. If not, test individually for each independent variable.

(c) Test whether each of the four independent variables has the same effect on LCURRENT and LSTART. If not, test individually for each independent variable.

(d) Plot the residuals against the predicted values of each dependent variable. Any observations?
Solution
(a) The fitted equations, with standard errors in parentheses, are

LCURRENT^ = 8.6988 + 0.0832 EDUC + 0.0161 EXPER − 0.0151 AGE + 0.00212 SENIOR
           (0.3012)  (0.0097)      (0.0045)       (0.0038)     (0.0030)

LSTART^ = 8.2848 + 0.0814 EDUC + 0.0160 EXPER − 0.0105 AGE − 0.0035 SENIOR
         (0.2679)  (0.0087)      (0.0040)      (0.0034)     (0.0027)
(b) To test the joint effect of all variables, the linear hypothesis is H0: LBM = 0, where

$$L = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} \alpha_0 & \beta_0 \\ \alpha_1 & \beta_1 \\ \alpha_2 & \beta_2 \\ \alpha_3 & \beta_3 \\ \alpha_4 & \beta_4 \end{pmatrix}, \qquad M = I_2.$$

All multivariate tests reject the null hypothesis. Therefore, the hypothesis that all four independent variables are not useful is rejected.

The multivariate tests for the effect of each independent variable (write down L and M) are given in the order of EDUC, EXPER, AGE and SENIOR below:
(c) For the hypothesis that all independent variables have the same effect on LCURRENT and LSTART,

$$L = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \qquad\text{and}\qquad M = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

It is rejected at the 5% level because the p-values of all four multivariate tests are less than 5%.
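The "same effect" hypotheses all reuse the H/E machinery of Section 5.5 with the contrast M = (1, −1)'; a sketch with simulated data standing in for BANK.DAT (which is not reproduced here; all names invented):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])  # intercept + 4 predictors

# Simulated coefficients loosely mimicking the signs of the fitted equations above
B = np.array([[8.7,   8.3],
              [0.08,  0.08],
              [0.016, 0.016],
              [-0.015, -0.010],
              [0.002, -0.003]])
Y = X @ B + 0.3 * rng.normal(size=(n, 2))

B_hat = np.linalg.solve(X.T @ X, X.T @ Y)
U_hat = Y - X @ B_hat

# H0: each predictor has the same effect on both responses, i.e. LBM = 0
L = np.eye(5)[1:]                       # drop the intercept row
M = np.array([[1.0], [-1.0]])           # contrast between the two responses

LBM = L @ B_hat @ M
inner = L @ np.linalg.inv(X.T @ X) @ L.T
H = M.T @ B_hat.T @ L.T @ np.linalg.solve(inner, LBM)
E = (U_hat @ M).T @ (U_hat @ M)
wilks = float(np.linalg.det(E) / np.linalg.det(E + H))
```

With the 2 × 1 contrast M, both H and E collapse to scalars, so all four multivariate statistics become equivalent one-dimensional tests.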
For the hypothesis that AGE has the same effect on LCURRENT and LSTART,

$$L = \begin{pmatrix} 0 & 0 & 0 & 1 & 0 \end{pmatrix} \qquad\text{and}\qquad M = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

It is rejected at the 5% level because all p-values of the multivariate tests are less than 5%.

For the hypothesis that SENIOR has the same effect on LCURRENT and LSTART,

$$L = \begin{pmatrix} 0 & 0 & 0 & 0 & 1 \end{pmatrix} \qquad\text{and}\qquad M = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

It is rejected at the 5% level because all p-values of the multivariate tests are less than 5%.

For the hypothesis that EDUC and EXPER have the same effect on LCURRENT and LSTART,

$$L = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{pmatrix} \qquad\text{and}\qquad M = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

For the hypothesis that EDUC, EXPER and AGE have the same effect on LCURRENT and LSTART,

$$L = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix} \qquad\text{and}\qquad M = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

It is rejected at the 5% level because all p-values of the multivariate tests are less than 5%.
(d) The plots (ε̂1 vs predicted LCURRENT) and (ε̂2 vs predicted LSTART) are given below. No specific pattern is observed in either plot, which suggests the residuals are close to random.

(e) To check the univariate normality of the residuals ε̂1 and ε̂2, the Q-Q plots for the residuals from fitting LCURRENT and LSTART are given below. Since the dots in both diagrams lie approximately along a straight line, the residuals are close to a univariate normal distribution.
However, from the chi-square plot, the dots are not close to the reference line. Therefore, the multivariate normality assumption may not be valid.
(f)
rep.meas = lcurrent:
educ exper age senior lsmean SE df lower.CL upper.CL
12 4 28 77 9.50 0.0543 95 9.39 9.61
rep.meas = lstart:
educ exper age senior lsmean SE df lower.CL upper.CL
12 4 28 77 8.76 0.0483 95 8.67 8.86
Confidence level used: 0.95
> sci95
[,1] [,2]
lcurrent 9.364883 9.636202
lstart 8.643349 8.884700