Professional Documents
Culture Documents
Pakistani
Pakistani
I. INTRODUCTION
estimators of regression coefficients with pooled data have smaller variances as (5') E(Ut U)8 = 40'2 for all t= s
compared to the estimators with yearly or quarterly data.
The paper is organized as follows. In Section 2 we present the model and = 0 for all t =f s.
derive an Ordinary Least Squares estimator and the corresponding variance- With no loss of generality, we assume that the quarterly observations are
covariance matrix of the vector of regression coefficients with alternative data sets. available over a whole number of years, say m. In addition, n yearly observations are
Relative efficiency of the alternative estimators is compared in Section 3. Section 4 also assumed to be available. Asis more likely, it is assumed that the yearly observa-
presents the conclusions of the paper with some thoughts on future research. tions appear before the quarterly ones. We consider the following three options
availableto estimate the relationship.
2. THE MODELAND ITS ESTIMATIONWITH
ALTERNATIVE DATA SETS Option I
Let the relationship to be estimated be described by the following linear Use yearly data over n + m years. With this option the regression model can
be written as:
regression equation:
Y t 1 =b1 + b2 x 2 t 1. + . . . . . + bk
' X
k tl
.+U .
tl (1)
B
['"
= (X' X + x' * X *)-1 (X' Y + X' * Y *)
Thus we can write Equation (1) for yearly observations as:
(2) E(B[ ) =B
Yt = 4b1 + b2 X2 t + +bk Xkt + Ut
2
V(B[) =4 0 (x' X + X~X *r1 (3)
where Ut satisfies the following obvious properties.
'It is assumedthat the matrix [X' X~] has full column rank k which is less than n + m.
(4') Ut is randomly distributed with E(Ut) =0 for all t and
718 Eatzaz Ahmad Combining Yearly and Quarterly Data 719
Option n E(BnI ) =B
Use only quarterly data over 4m quarterly observations. In this case the
V(Bm) = a2 (~X' X + x'*x *rl (5)
regressionmodel can be written as:
A-I) - vI I. Thus all v ;;. 0 implies that A(B-l - A-I) is positive semidefinite.
E(Bn) = B Since B is positive definite and A - B is positive semidefini te, sum of the two, that is
-:4is positive defmite and so is A-I. Multiplying this positive definite matrix (A-I)
V(Bn) = a2- (x~x *rl (4) by the positive semidefmite matrix A(B-1 - A-I) gives a positive semidefinite
matrix B-1 - A-I. This completes the proof.
Option III Let us now apply this theorem to compare variances of various estimators of
Combine.4m quarterly and n yearly observations. Sincevariance of Ut is four B. We will consider the variance of a linear combination of the elements of B r
times as high as the variance of Uti' we have the problem of heteroscedasticity. (r = I. II, /II), namely e' B r where, e is a k x 1 column vector of known constants.
With pre-adjustment for heteroscedasticity, the model becomes: .
n+m 4 n+m
L
4
L
t=n +1
Lex.
i=1 Itl
. - x.It ) (x
hti
- X
ht
) = L L
Zjti Zhti
2 The matrix x * is assumed to have full column rank k < 4m. t=n+l i=1
720 Eatzaz Ahmad Combining Yearly and Quarterly Data 721
4 theorem implies that the matrix (x'* x *r1 - (% X' X + x;" x *r1 is positive
where, Ziti = Xiti - x it' Zhti = x hti - x ht' x it = ~ x. ./4 semidefinite.
i= 1 ltl
4 Therefore we can write:
and Xht = .~
1=1
Xht./4
1 0'2 c'(x~x*r1c ~a2 c'(%X'X+x;"x*r1c
The above discussion impliesthat the whole matrix (%X'X + x~ *) - (%X'X This implies, according to Equations (4) and (5), that
+%"x'~*) =x~* - %x'~* can be written as Z~*where,
var (c' B II ) .~var (c' B III ).
0 x -x
2 ,n+1,1 2,n+1 xk,n+1,1 - xk,n+1
This result does not require any explanation. Addition of observations to a
0 x -x
2 ,n + 1,2 2 ,n +1 Xk,k+1,2 - x k,n +1 given data set improv~s efficiency of the estimators.
Yti =b x ti + Uti
Var (c' Bm) Versus Var (c' BI) This implies that:
Now consider the matrix (%X' X +X:.X*) - x'*x *= %X' X which is obvious-
var (bIII ) < var (bI ) unless X ti = x t for t = n+1, . . ., n+m
ly positive semidefinite. Since, in addition, x'* x * is positive defmite, the above
Ii t = 1, , 4
1
722 Eatzaz Ahmod
4
var(bII/) < var(bI/) unless 1=1
.~ x("1 = 0 for t = 1, , n.
S. CONCLUDINGREMARKS
Aggregation of quarterly data into yearly data for any part of the sample
period results in loss of efficiency. Using quarterly data alone when additional Comments on
yearly data are available also"resultsin loss of efficiency. The best use of the limited "Combining Yearly and Quarterly Data
data is to pool yearly observations with the quarterly observations with appropriate in Regression Analysis"
adjustment for heteroscedasticity. The result can be generalizedto include seasonal
effects in the regression equation. It can be shown that pooling yearly data to a given In this paper the author has proposed an alternative approach to deal with the
set of quarterly data also improves efficiency of seasonal effects although the yearly issue of non-availability of comparable data in regression analysis. In particular, the
data alone are uselessto estimate seasonal effects.3 paper deals with the problem where both quarterly as well as annual observations on
The research can be extended to develop tests for autocorrelation of first or some variables are availablebut the number of each type of observation i.e. quarterly
fourth order (these are the most likely orders of autocorrelation at quarterly level). limited as a result of which it is difficult to estimate the desired relationship, by
Our preliminary research suggests that it is quite complicated to determine the order either using quarterly or annual data with acceptable degrees of freedom. Various
of autocorrelation with pooled data. Once the order of autocorrelation is deter- approaches have been suggested in the literature to deal with such problems. A
mined, one can use various procedures to improve the asymptotic efficiency of the common weakness of most of these approaches is that they introduce measurement
estimates. errors. This paper suggests that instead of estimating the desired relationship by
either using quarterly or annual data the researcher should simply pool the quarterly
REFERENCES
and the annual data with appropriate adjustment for heteroscedasicity. It is claimed
Chow, G. C., and A. I. Lin (1971). "Best Linear Unbiased Interpolation, Distribution that the coefficients obtained using pooled data will have a lower variance as
and Extrapolation of Time Seriesby Related Series". Review of Economics and compared to the estimators with yearly or quarterly data.
Statistics. Vol. CIII, No.4. pp. 372-375. To show the relative efficiency of the estimates obtained using pooled data the
Friedman, M. (1962). "The Interpolation of Time Seriesby Related Series".Journal author has made use of one of the standard theorems available in the econometric
of American Statistical Association. Vol. 57, pp. 729-757. literature. There is thus little room to caste doubt on his claim. The paper, there-
Hsiao, C. (1979). "linear Regression Using Both Temporally Aggregated and Tem- fore, can be regarded as a theoretical contribution in the area.
porally DisaggregatedData". Journal of Econometrics. Vol. 10, No.2, pp. 243- A general problem faced by many researchers in at least developing countries
252.' is not that both quarterly as well as annual data are available for the same variables
Maddala,G. S. (1979). Econometrics. Tokyo; McGraw-Hill,Inc. but that for some variables quarterly data are available while for others the annual
Palm, F. C., and T. E. Nijman (1982). "linear Regression Using Both Temporally data are available. In these circumstances the techniques suggested by the author
Aggregated and Temporally Disaggregated Data", Journal of Econometrics. is hardly of any use. I wonder if we can somehow modify the suggestedtechnique
Vol. 19, pp. 333-343. to handle the above-mentioned problem which is more frequently faced by the
researchers.
Nadeem A. Burney
Pakistan Institute of
Development Economics,
Islamabad
3 'This result has been proved in the paper originally presented in the general meeting.