This Content Downloaded From 143.107.252.171 On Thu, 08 Jul 2021 04:15:01 UTC
Taylor & Francis, Ltd. and American Statistical Association are collaborating with JSTOR to
digitize, preserve and extend access to Journal of the American Statistical Association
ROBERT H. SHUMWAY*
In the area of applied time series analysis a commonly occurring problem involves
the detection and estimation of signals (regression functions) imbedded within a
collection of independent identically distributed noise processes. A general linear
model which represents each observed time series as the sum of a wide sense sta-
tionary error process and a vector of regression functions operated on by a matrix of
time invariant observables includes as special cases many signal models of interest.
In this article a possible unified approach to estimation and tests of hypotheses for
this linear model is presented. Asymptotic regression estimates and analysis of
variance (power) tables are presented in the frequency domain and simple deriva-
tions for the probability distributions of the sums of squares are given. The result-
ing analysis of power partitions the spectral power in each frequency band into
components which can be attributed directly to each of the regression functions. As
an example, a sample of ten time series is analyzed which contains a mean value
function and an effect function in the presence of error.
1. INTRODUCTION
be used in either the finite or infinite dimensional case by solving the
appropriate matrix or integral equation for the eigenvalues and eigenfunctions
of the covariance function.
For stationary data, however, there is another possibility afforded by
expanding both sides of (1.3) with respect to a complete set of basis vectors
given by the complex exponentials $e^{i\omega t}$. The noise process, then, is
expanded in a set of independent complex normal variables, and the matrix
convolution in (1.3) reduces to a simple linear operation in the $\omega$ or
frequency domain. Since the model in the transformed space spanned by the basis
vectors $e^{i\omega t}$ is a linear model, an extended form of the Gauss-Markov
theorem yields best linear unbiased estimates for the p regression functions.3
As a practical tool for the finite dimensional sampled data case, with the
time series observed at T points, I shall employ the finite Fourier transform
(see [3]) as an approximate orthogonalizing transformation. It is well known
that random variables generated by taking the Fourier transform of an observed
stationary process become uncorrelated as the sample length T increases, with
asymptotic variances equal to the spectral ordinates $f(\cdot)$, since the
squared moduli of the transformed variables are values of the Schuster
periodogram.
This partition of the total variation into the power or variance contributed by
the separated cyclical or frequency dependent components is natural because
the periodic nature of many signal-generating phenomena implies that there
will be certain frequencies where the power of the signal or effect function
greatly exceeds the error power when the effect function is present. This cor-
responds to letting the frequency bands play a blocking role with the power or
variance of each of the p regression functions evaluated as a within block power.
Since adjacent frequencies are independent, the original multivariate regression
problem in time has been transformed into an approximate univariate regression
problem in frequency.
An algorithm for fast finite Fourier transformation can be taken from a con-
venient reference such as Jenkins and Watts [14]. A number of examples of its
use in convolutions and spectral estimation can be found in Cooley, et al. [3].
The structure of the variables obtained on transformation is approximately
compatible with the structure necessary for certain statistical testing
procedures involving complex random variables given in Giri [6], Goodman [7]
and Khatri [15]. Results from these papers are used in our practical procedure.
In this article a possible unified approach to problems of estimation and
tests of hypotheses for regression models of the form (1.3) is proposed and il-
lustrated. Basically, the approach is an exact copy of the usual analysis of
variance approach on a frequency dependent basis where an approximate like-
lihood ratio test yields the power or variance due to a particular subset of the
original p regression functions. In this way one may analyze the power due to
generalized effect functions such as those appearing in (1.2) in the same way
that one evaluates the variances due to treatment effects in a completely
randomized design for a classical analysis of variance. A frequency dependent
goodness of fit criterion analogous to $R^2$ in the usual case is given. Simple
derivations for the asymptotic probability distributions of the power
components are presented so that the prospective user of the method can easily
modify the theory to fit his particular model. Finally, as an example of the
practical aspects of the estimation and testing procedure, data are generated
which conform to the model given by (1.2). The mean value and generalized
effect functions are estimated and compared with the true input signatures.
Tests of hypotheses are performed to verify experimentally the approximate
distribution of the F statistics under each of the null hypotheses.
3 See [22].
For the general linear model given by (1.3) assume that the N time series
are observed at T points where T> N. The finite Fourier transform (see [3]) of
the normal real zero mean stationary error series in (1.3) is of the form
$\omega_n = 2\pi n/T$. (2.2)
The squared modulus of the complex random variable in (2.1) is the usual
Schuster periodogram, and discussions of its use in spectral estimation can be
found in references [2, 3, 14, 17, 19]. The historical use of the periodogram as a
test statistic has primarily involved the case where the series are regarded as
independent and identically distributed over time so that the harmonic com-
ponents of the periodogram can be regarded as independent chi-square variates
with two degrees of freedom each.4 Even when observations are correlated over
time it can be shown that, when T is even, the first T/2 random variables in
the sequence defined by (2.1) will be asymptotically ($T \to \infty$)
uncorrelated with asymptotic variance equal to the theoretical power spectrum
$f(\omega_n)$ at the frequency point $\omega_n$, where $f(\cdot)$ is the power
spectral density defined in (1.4). Proofs of this property may be found in
Grenander and Rosenblatt [9], Hannan [10], Rosenblatt [21] and Jenkins and
Watts [14]. An additional property of the complex random variables
$\tilde e_j(n)$ is that their real and imaginary parts will be asymptotically
uncorrelated zero mean normal random variables with common asymptotic variance
$f(\omega_n)/2$ except at $\omega_n = 0, \pi$, where the variance is
$f(\omega_n)$. A short proof shows that the accuracy of the approximations is
of the order $(\log T)/T$.
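The properties described above can be checked numerically. The following is a
minimal sketch, assuming the finite Fourier transform is scaled by $T^{-1/2}$
(the displayed definition is not reproduced here, so this scaling is an
assumption) and using white noise, whose spectrum under that convention is flat
and equal to the variance. The function name `finite_fourier_transform` and the
simulation sizes are illustrative choices, not taken from the article.

```python
import numpy as np

def finite_fourier_transform(x):
    """Finite Fourier transform along the last axis, scaled by T**-0.5 (assumed)."""
    T = x.shape[-1]
    return np.fft.fft(x, axis=-1) / np.sqrt(T)

rng = np.random.default_rng(1)
T, replicates, sigma = 64, 2000, 2.0

# Independent zero mean normal series; the spectrum is flat at sigma**2.
e = rng.normal(0.0, sigma, size=(replicates, T))
E = finite_fourier_transform(e)
periodogram = np.abs(E) ** 2          # squared moduli (Schuster periodogram)

# Interior frequencies exclude the endpoints omega = 0 and omega = pi.
interior = slice(1, T // 2)
mean_power = periodogram[:, interior].mean()   # should be near sigma**2
var_real = E.real[:, interior].var()           # should be near sigma**2 / 2
var_imag = E.imag[:, interior].var()           # should be near sigma**2 / 2

# Sample correlation between real parts at two different frequencies.
corr = np.corrcoef(E.real[:, 3], E.real[:, 7])[0, 1]

print(mean_power, var_real, var_imag, corr)
```

With this scaling the periodogram ordinates also sum to the sum of squares of
the series, so the transform simply redistributes the total power over
frequency.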
Suppose that for a sample of time series observed at t = 0, 1, ..., T-1 we
assume that the model given in (1.3) obtains. Then, if the finite transforms of
$e_j(t)$, $\beta_j(t)$, $Y_j(t)$ and $X_{jk}(t)$ are denoted by
$\tilde e_j(n)$, $\tilde\beta_j(n)$, $\tilde Y_j(n)$ and $\tilde X_{jk}(n)$,
respectively, the basic model in (1.3) can be written in the frequency domain
as (the tilde denotes the finite Fourier transform)

$$\tilde Y(n) = \tilde X(n)\,\tilde\beta(n) + \tilde e(n),$$

where $\tilde Y(n)$ and $\tilde e(n)$ are $N \times 1$ complex vectors with
$\tilde X(n)$ an $N \times p$ complex matrix and $\tilde\beta(n)$ a
$p \times 1$ vector of complex regression coefficients. Now, from the
discussion after (2.1) and (2.2) it follows that the asymptotic distribution of
the random vector $\tilde e(n)$ is a special case of the complex multivariate
normal distribution of Goodman [7]. Therefore, an approximate version of the
likelihood of the (T/2)+1 vectors $\tilde Y(0), \tilde Y(1), \ldots,
\tilde Y((T/2)-1), \tilde Y(T/2)$ can be written using the fact that for
$\omega_n \neq 0, \pi$, $\tilde e(n)$ follows asymptotically an N-variate
complex multivariate normal distribution with zero mean and covariance matrix
$f(\omega_n) I_N$, where $I_N$ denotes the $N \times N$ identity matrix. At
$\omega_n = 0, \pi$, $\tilde e(0)$ and $\tilde e(T/2)$ are asymptotically real
multivariate zero mean normals with covariance matrix $f(\omega_n) I_N$, and
$\tilde e(m)$ and $\tilde e(n)$ are asymptotically independent for $m \neq n$.
Now, by using the results of Goodman [7] and Khatri [15], it is immediately
obvious that the approximate maximum likelihood estimates for
$\tilde\beta(n)$ and $f(\omega_n)$ are
which is a real quadratic form in the $2N \times 2N$ matrix A(n). Since the
rank of the idempotent matrix A(n) is 2(N-p), it follows that
$N\hat f(\omega_n)$ is asymptotically distributed as $f(\omega_n)/2$ times a
central chi-square variate with 2(N-p) degrees of freedom except at
$\omega_n = 0, \pi$. At the two endpoints the vectors involved in the Hermitian
form are purely real so that $N\hat f(\omega_n)$ at these points is distributed
asymptotically as $f(\omega_n)$ times a central chi-square with (N-p) degrees
of freedom. Hence, an asymptotically unbiased estimate of the error spectrum
$f(\omega_n)$ is $N\hat f(\omega_n)/(N-p)$.
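The estimation step at a single frequency can be sketched as follows. The
sketch assumes the estimates take the usual complex least-squares form, that
is, a coefficient estimate $(X^*X)^{-1}X^*Y$ and a maximum likelihood error
power equal to the residual Hermitian form divided by N; this is consistent
with the chi-square statement above but is an assumption rather than a
transcription of the displayed equations. The function name
`frequency_regression` and the simulated inputs are hypothetical.

```python
import numpy as np

def frequency_regression(X, Y):
    """Complex least squares at one frequency.

    X : (N, p) complex matrix of transformed regressors.
    Y : (N,) complex vector of transformed observations.
    Returns the coefficient estimate, the error power divided by N, and the
    bias-corrected error power divided by N - p.
    """
    N, p = X.shape
    XhX = X.conj().T @ X                              # p x p Hermitian matrix
    beta_hat = np.linalg.solve(XhX, X.conj().T @ Y)   # (X*X)^{-1} X*Y
    resid = Y - X @ beta_hat
    error_power = float(np.real(np.vdot(resid, resid)))  # residual Hermitian form
    return beta_hat, error_power / N, error_power / (N - p)

# Illustrative data: N = 10 series, p = 2 regression functions.
rng = np.random.default_rng(0)
N, p, f_true = 10, 2, 3.0
X = rng.normal(size=(N, p)) + 1j * rng.normal(size=(N, p))
beta_true = np.array([1.0 - 0.5j, 0.25 + 2.0j])

# Complex normal errors: real and imaginary parts each have variance f_true / 2.
noise = np.sqrt(f_true / 2) * (rng.normal(size=N) + 1j * rng.normal(size=N))
Y = X @ beta_true + noise

beta_hat, f_ml, f_unbiased = frequency_regression(X, Y)
print(beta_hat, f_ml, f_unbiased)
```

Repeating the calculation over the frequencies in a band and averaging the
bias-corrected values gives a smoother estimate of the error spectrum.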
We note that the distributions of Hermitian forms in complex normal vectors
involving either a sample or theoretical covariance matrix have been derived
in Giri [6]. The results for Hermitian forms in complex normal vectors involv-
ing idempotent matrices are simple extensions which use the original Goodman
structure. I include the derivation in (2.9) as an example of the simple pro-
cedure to be followed for extending the distribution theory of real quadratic
forms in the analysis of variance (see [8]) to the complex case. The next section
presents an analysis of power for the two-partition linear hypothesis which
allows selective testing for the presence of the different regression functions in
(1.3).
combined into the usual analysis of variance table.5 I shall term such a
partition on a frequency dependent basis an analysis of power with the results
summarized in Table 1 where $\tilde X(n) = (\tilde X_1(n), \tilde X_2(n))$ is
the partitioned $N \times p$ matrix composed of the $N \times p_1$ and
$N \times p_2$ matrices $\tilde X_1(n)$ and $\tilde X_2(n)$.
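The partition of power at one frequency can be illustrated with projection
matrices. This is a sketch under assumptions: the power due to the first block
is taken as the Hermitian form in the projector onto the columns of
$\tilde X_1(n)$, the power due to the second block is the additional power
after fitting the first, and the degrees of freedom $2p_1$ and $2p_2$ are the
conventional counts that add up, with the error count 2(N-p), to the total of
2N. The function names are hypothetical.

```python
import numpy as np

def projector(X):
    """Hermitian projector onto the column space of the complex matrix X."""
    return X @ np.linalg.solve(X.conj().T @ X, X.conj().T)

def analysis_of_power(X1, X2, Y):
    """Split Y*Y into components; returns {source: (power, degrees of freedom)}."""
    N = Y.shape[0]
    p1, p2 = X1.shape[1], X2.shape[1]
    P1 = projector(X1)                      # fit using the first block only
    P = projector(np.hstack([X1, X2]))      # fit using both blocks

    def power(M):
        # Hermitian form Y* M Y, which is real for Hermitian M.
        return float(np.real(np.vdot(Y, M @ Y)))

    return {
        "first block": (power(P1), 2 * p1),
        "second block after first": (power(P - P1), 2 * p2),
        "error": (power(np.eye(N) - P), 2 * (N - p1 - p2)),
        "total": (float(np.real(np.vdot(Y, Y))), 2 * N),
    }

# Illustrative inputs: a constant column and one random complex column.
rng = np.random.default_rng(3)
N = 10
X1 = np.ones((N, 1), dtype=complex)
X2 = rng.normal(size=(N, 1)) + 1j * rng.normal(size=(N, 1))
Y = rng.normal(size=N) + 1j * rng.normal(size=N)

table = analysis_of_power(X1, X2, Y)
for source, (pw, df) in table.items():
    print(f"{source:26s} power={pw:8.3f} df={df}")
```

Because the three projectors are mutually orthogonal, the component powers add
up to the total power exactly, which is what allows them to be laid out as an
analysis of variance table.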
At the frequency points $\omega_n = 0, \pi$ the degrees of freedom are halved.
We note that the asymptotic distributions of the Hermitian forms are derived
again by expressing the regression model in terms of the real and imaginary
parts of the matrices appearing in (3.2). Then, with the Hermitian forms
equated to real quadratic forms as in (2.9), the Fisher-Cochran theorem (see
[20, p. 129]) applies, yielding the chi-square distributions implied by the
degrees of freedom column in Table 1. If a test of the hypothesis
$\tilde\beta_1(n) = 0$ is of interest, I use the ratio of the average power due
to $\tilde\beta_1(n)$ divided by the mean error power estimated by
has a central F-distribution with $2p_2$ and $2(N-p_2)$ degrees of freedom when
$\tilde\beta_2(n) = 0$ and a noncentral F-distribution otherwise. If a test for
goodness of fit over and above the mean is desired it is clear that
Total    $\tilde Y^*(n)\tilde Y(n)$    2N
5 See [8].
4. SIMULATED EXAMPLE
1. Specify the functions $X_{jk}(t)$ and $\beta_m(t)$ in the regression model
given by (1.3) or the partitioned version (3.1).
2. Construct the theoretical entries in the matrix $H(\omega)$ under the full
hypothesis and under the reduced hypothesis by calculating the appropriate
version of (4.3).
3. Calculate an approximation, say $H^a_{jk}(\omega_n)$, to (4.3) over the
interval $0 \le \omega < 2\pi$ using the points $\omega_n = 2\pi n/L$,
n = 0, 1, ..., L-1 with $L \le T$ where T is the length of the data series.
4. An approximation to the time function when the data begins at zero then is
given as
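$$\hat\beta_j(t) = \sum_{k=1}^{N} \sum_{u=0}^{T-1} h_{jk}(t-u)\, Y_k(u) = \sum_{k=1}^{N} \sum_{n=0}^{T-1} e^{i\omega_n t}\, \tilde h_{jk}(n)\, \tilde Y_k(n). \qquad (4.8)$$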
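The equality in (4.8) between filtering in time and multiplying transforms is
the convolution theorem for the finite Fourier transform. A minimal numerical
check is sketched below, assuming circular (wrap-around) indexing of the filter
and using NumPy's transform pair, in which the forward transform is unscaled
and the inverse carries the factor 1/T; the article's own scaling in (4.6) may
differ. The filter weights and function names are illustrative.

```python
import numpy as np

def circular_convolve_direct(h, y):
    """Sum over u of h(t - u) y(u), with the index t - u taken modulo T."""
    T = len(y)
    return np.array([sum(h[(t - u) % T] * y[u] for u in range(T))
                     for t in range(T)])

def circular_convolve_fft(h, y):
    """Same quantity computed as the inverse transform of a product."""
    return np.fft.ifft(np.fft.fft(h) * np.fft.fft(y)).real

rng = np.random.default_rng(2)
T = 64
h = np.zeros(T)
h[:5] = [0.1, 0.4, 0.3, 0.15, 0.05]   # short illustrative filter, zero padded
y = rng.normal(size=T)                # illustrative data series

direct = circular_convolve_direct(h, y)
via_fft = circular_convolve_fft(h, y)
print(np.max(np.abs(direct - via_fft)))
```

The transform route costs on the order of T log T operations per series rather
than T squared, which is the practical reason for estimating the regression
functions in the frequency domain.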
Hence, I construct the estimators for the regression functions by calculating the
frequency functions
As a possible regression model for a set of time series I will use (1.2), which
assumes that each time series is composed of a fixed mean value function
$\mu(t)$ plus a generalized effect function $a(t)$ which appears at time
$t - T_j$ and then at time $t + T_j$ on the jth time series with $T_j$ known.
It is realized in practice by imagining that the N time series are suspended on
a string. A signal $a(t)$ is generated which progresses up the array, is
reflected off the top, and then moves back down. The mean trend function
$\mu(t)$ might be a signal which hits the string broadside whereas the effect
function is another signal whose angle of incidence and velocity are specified
by the time delays $T_1, T_2, \ldots, T_N$. The model is similar to an analysis
of variance model in the sense that I am trying to resolve the total power into
two directional components in much the same way as a classical analysis of
variance isolates row effects and column effects.
Ten time series (N = 10) each containing 1024 (T = 1024) points were generated
according to the model specified by (1.2). In order to provide a check on the
probability distributions involved in the analysis of power the errors
$e_j(t)$ were generated by drawing normal independent variables with zero means
and standard deviation two and smoothing with the coefficients .1, .5, .2, -.1,
.2 and .1. This leads to a correlated error process with a standard deviation
of 1.2. The theoretical power spectrum of this series is shown in Figure 1. A
fixed mean value function $\mu(t)$ to be added to each trace was generated by
taking three-point running averages of zero mean independent normal random
variables. The standard deviation of the resulting process was 1.0. (A portion
of the 1024-point fixed function $\mu(t)$ is shown in Figure 5 and its
theoretical spectrum is given in Figure 1.) An arbitrary sampling rate of 20
points per second is assumed which yields a ten cycle per second folding
frequency corresponding to the angular point $\pi$. The fixed effect function
$a(t)$ is generated as a relatively short (128-point) exponentially decaying
sine wave oscillating at three cycles per second. (The true effect function of
maximum amplitude two is shown in Figure 4.)
Figure 2 shows a 240-point portion of one of the ten time series simulated
according to the model implied by (1.2). The arrows specify the two entry
points for $a(t)$ and are respectively 23 points ($T_{10} = 23$) to the left
and right of the center point. The other time delays $T_1, T_2, \ldots, T_9$
are 0, 12, 14, 19, 28, 29, 31, 38 and 50 points, respectively.
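The simulation described above can be sketched as follows. The noise weights,
standard deviations, series length, sampling rate, effect length, frequency and
delays are those quoted in the text. Several details are assumptions made only
for illustration: the decay rate of the sine wave, the input standard deviation
of $\sqrt{3}$ for the running average (chosen so that the output standard
deviation is 1.0), the placement of the center point at T/2, and all variable
and function names.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, fs = 10, 1024, 20.0     # series, points per series, points per second

def moving_average_noise(rng, n, coef, sd):
    """Smooth independent N(0, sd**2) variables with the weights in coef."""
    white = rng.normal(0.0, sd, size=n + len(coef) - 1)
    return np.convolve(white, coef, mode="valid")   # exactly n points

# Errors: white noise with standard deviation two, smoothed by the quoted weights.
coef = np.array([0.1, 0.5, 0.2, -0.1, 0.2, 0.1])
error_sd_theory = 2.0 * np.sqrt(np.sum(coef ** 2))  # 2 * sqrt(0.36) = 1.2
errors = np.stack([moving_average_noise(rng, T, coef, 2.0) for _ in range(N)])

# Mean value function: three-point running average (input sd is an assumption).
mu = moving_average_noise(rng, T, np.full(3, 1.0 / 3.0), np.sqrt(3.0))

# Effect function: 128-point decaying sine at 3 cycles per second, peak value 2.
t_sec = np.arange(128) / fs
decay = 1.0                                          # per second, assumed
a = np.exp(-decay * t_sec) * np.sin(2.0 * np.pi * 3.0 * t_sec)
a *= 2.0 / np.max(np.abs(a))

# Delays T_1, ..., T_9 and T_10 in points, as listed in the text.
delays = [0, 12, 14, 19, 28, 29, 31, 38, 50, 23]
center = T // 2                                      # assumed center point

def place(signal, start, length):
    """Embed signal in a zero series of the given length, starting at start."""
    out = np.zeros(length)
    out[start:start + len(signal)] = signal
    return out

# Each series: mean function plus the effect entering T_j before and after center.
Y = np.stack([mu + place(a, center - d, T) + place(a, center + d, T) + errors[j]
              for j, d in enumerate(delays)])
print(Y.shape, error_sd_theory, errors.std(), mu.std())
```

The theoretical error standard deviation follows directly from the weights,
since their squares sum to 0.36, which reproduces the value of 1.2 quoted in
the text.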
Equation (1.2) may be written as a special case of the general model (3.2)
by noting that with $p_1 = p_2 = 1$, $p = 2$, $\beta^{(1)}(t) = \mu(t)$,
$\beta^{(2)}(t) = a(t)$ and
A (w)
A(c, = cos2wTk,
- k1 COS2 o,
for k = 1, ..., N. Under the reduced model when $a(t) = 0$ it is easily seen
that the optimum filters are all equal to 1/N. In order to construct the
approximation to the optimum time invariant filters corresponding to the
frequency response functions given in (4.14) and (4.15) consider a 128
(L = 128) point version computed at the points $\omega_n = 2\pi n/128$,
n = 0, 1, ..., 127. I take $H^a_{jk}(0) = 0$ for the singular point introduced
by A(0) = 0 so that the approximate filters do not pass the zero frequency.
Then Step 4 yields an approximation to the time function which is corrected to
a two-sided time scale by (4.5). Two representative filter functions
$h_{1,10}(t)$ and $h_{2,10}(t)$ are shown in Figure 3. The filter for
estimating the mean value function contains the expected peak at t = 0, while
the effect function filters contain peaks at the time delays experienced by the
function $a(t)$ as it propagates up and down the array. The negative small
peaks match the effect function time delays $T_1, T_2, \ldots, T_9$ not present
on $Y_{10}(t)$.
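The filter construction can be sketched numerically. The sketch assumes that
the transformed regressor row for series k is $(1, 2\cos\omega T_k)$, which
follows from an effect entering at $t - T_k$ and $t + T_k$, and that the
optimum frequency responses are the least-squares weights $(X^*X)^{-1}X^*$;
both are assumptions for illustration and are not a transcription of (4.14)
and (4.15). The two-sided correction is done here with `np.fft.fftshift`, and
all variable names are hypothetical.

```python
import numpy as np

delays = np.array([0, 12, 14, 19, 28, 29, 31, 38, 50, 23])  # T_1..T_9, T_10
N, L = delays.size, 128
omega = 2.0 * np.pi * np.arange(L) / L        # omega_n = 2 pi n / L

# Assumed design at each frequency: column of ones (mean function) and
# 2 cos(omega T_k) (effect arriving at t - T_k and t + T_k). Shape (L, N, 2).
X = np.stack([np.ones((L, N)), 2.0 * np.cos(np.outer(omega, delays))], axis=-1)

# Frequency responses H(omega_n), shape (L, 2, N). At omega = 0 both columns
# are proportional, so the normal equations are singular; the response is set
# to zero there so that the filters do not pass the zero frequency.
H = np.zeros((L, 2, N))
for n in range(1, L):
    Xn = X[n]
    H[n] = np.linalg.solve(Xn.T @ Xn, Xn.T)

# Inverse transform to time, then shift so that lag zero sits in the middle.
h_complex = np.fft.ifft(H, axis=0)
h_two_sided = np.fft.fftshift(h_complex.real, axes=0)
lags = np.arange(-L // 2, L // 2)

# Reduced model (effect absent): every frequency response equals 1/N, which
# transforms to a single weight of 1/N at lag zero.
h_reduced = np.fft.ifft(np.full((L, N), 1.0 / N), axis=0).real

print(h_two_sided.shape, lags[0], lags[-1])
```

Row 0 of `h_two_sided[:, :, k]` is the filter applied to series k when
estimating the mean function and row 1 is the filter for the effect function;
plotting either against `lags` gives a two-sided picture of the kind shown in
Figure 3.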
In order to estimate $\mu(t)$ and $a(t)$ under the full hypothesis I obtain the
finite transforms of $Y_k(t)$ and $h_{jk}(t)$ using (4.6). Then (4.9) and
(4.10) yield the estimated functions $\hat\mu(t)$ and $\hat a(t)$. The true and
estimated effect functions are compared in Figure 4. Even in this relatively
noisy case a reasonable reproduction of the primary signal is obtained. The
true and estimated mean trend functions are compared in Figure 5 and a similar
result can be noted.
In order to perform the analysis of variance in Table 1, I reconstruct the
[Figure 3. Representative filter functions $h_{1,10}(t)$ and $h_{2,10}(t)$.]
The power spectrum is computed using the procedure in Step 7 with the total
power calculated as in Step 8. Now, the power due to the mean $\mu(t)$ is just
the total predicted power under the assumption that $a(t) = 0$. The optimum
filter functions under this assumption are all equal to 1/N so that each
predicted trace is just
[Figure: total power plotted against frequency in cycles per second (cps), 0 to 10 cps.]
remained well below their significance values. Table 3 gives the theoretical and
observed percentiles of the F-statistics corresponding to the test for the signifi-
cance of the effect function. These sample and theoretical F-ratios agree very
well except at the .75 and .90 levels. Neither column, however, fails the Kolmo-
gorov-Smirnov test. Thus, it seems that the estimation and hypothesis testing
procedure gives results which are consistent with the asymptotic theory.
[Figure: F-statistics plotted against frequency (cps), 0 to 10 cps, with the $F_{.99}(30, \infty)$ level marked.]
The computer time involved in the preceding example was not excessive; a
complete run involving the simulation of the data, estimation of the regression
functions, and analyses of power for the three models requires about 15 minutes
on an IBM 360 (Model 50) with approximately 130,000 bytes of storage avail-
able for each program. For this particular program no auxiliary storage is used
and the number of time series which can be analysed is unlimited. Core storage
limits the length of each series to about 2048 points with the filter length usually
less than 512 points.6
5. DISCUSSION
6 Copies of the program may be obtained by writing to Department of Statistics, The George Washington
University, Washington, D. C.
REFERENCES
[1] Anderson, T. W., Multivariate Analysis, New York: John Wiley & Sons, Inc., 1959,
Chap. 8.
[2] Bendat, J. S. and Piersol, A., Measurement and Analysis of Random Data, New York:
John Wiley & Sons, Inc., 1966.
[3] Cooley, J. W., Lewis, P. A. and Welch, P. D., "The Fast Fourier Transform and Its
Applications," IBM Research Paper, RC-1743, 1967.
[4] Dixon, W. J. and Massey, F. J., Introduction to Statistical Analysis, New York:
McGraw-Hill Book Co., 1957.
[5] Fisher, R. A., "Tests of Significance in Harmonic Analysis," Proceedings of the Royal
Statistical Society, A, 125 (1929), 54.
[6] Giri, N., "On the Complex Analogues of T2 and R2 Tests," The Annals of Mathemati-
cal Statistics, 36 (April 1965), 664-70.
[7] Goodman, N. R., "Statistical Analysis Based on a Certain Multivariate Complex
Gaussian Distribution," The Annals of Mathematical Statistics, 34 (March 1963),
152-76.
[8] Graybill, F. A., An Introduction to Linear Statistical Models, Vol. 1, New York:
McGraw-Hill Book Co., 1961.
[9] Grenander, U. and Rosenblatt, M., Statistical Analysis of Stationary Time Series,
New York: John Wiley & Sons, Inc., 1957.
[10] Hannan, E. J., Time Series Analysis, London: Methuen, 1960.
[11] Hannan, E. J., "Regression for Time Series," in M. Rosenblatt, ed., Time Series Analysis,
New York: John Wiley & Sons, Inc., 1962, 17-37.
[12] Hartley, H. O., "Tests of Significance in Harmonic Analysis," Biometrika, 36 (June
1949), 194-201.
[13] Helstrom, C. W., Statistical Theory of Signal Detection, London: Pergamon Press,
1960.
[14] Jenkins, G. M. and Watts, D., Spectral Analysis and Its Applications, San Fran-
cisco: Holden-Day, 1968.
[15] Khatri, C. G., "Classical Statistical Analysis Based on a Certain Multivariate Com-
plex Gaussian Distribution," The Annals of Mathematical Statistics, 36 (February
1965), 98-114.
[16] Parzen, E., Stochastic Processes, San Francisco: Holden-Day, 1962.
[17] Parzen, E., Time Series Analysis Papers, San Francisco: Holden-Day, 1967.
[18] Parzen, E., "Time Series Analysis for Models of Signal Plus White Noise," in B. Harris,
ed., Spectral Analysis of Time Series, New York: John Wiley & Sons, Inc., 1967.
[19] Parzen, E., "On Empirical Multiple Time Series Analysis," in L. LeCam, ed., Proceed-
ings of the Fifth Berkeley Symposium, Vol. 1, Berkeley: University of California
Press, 1967, 305-40.
[20] Rao, C. R., Linear Statistical Inference and Its Applications, New York: John Wiley
& Sons, Inc., 1968.
[21] Rosenblatt, M., Random Processes, Oxford, Eng.: Oxford University Press, 1962.
[22] Shumway, R. H. and Dean, W. C., "Best Linear Unbiased Estimation for Multi-
variate Stationary Processes," Technometrics, 10 (August 1968), 523-34.
[23] Shumway, R. H. and Husted, H., "Frequency Dependent Estimation and Detection
for Seismic Arrays," Technical Report No. 242, Seismic Data Laboratory, Geotech,
A Teledyne Co., Alexandria, Va., January 1970.
[24] Wahba, Grace, "On the Distribution of Some Statistics Useful in the Analysis of
Jointly Stationary Time Series," The Annals of Mathematical Statistics, 39 (Decem-
ber 1968), 1849-62.