6
Short-Time Fourier
Transform
S. Hamid Nawab
Boston University
Thomas F. Quatieri
Lincoln Laboratory
6.0 INTRODUCTION
In this chapter, we examine the short-time Fourier transform (STFT), which has
played a significant role in digital signal processing, including speech, music, and
sonar applications. It has been found that in such applications it is advantageous to
combine traditional time-domain and frequency-domain concepts into a single frame-
work. For example, the physical phenomenon of Doppler shift in signals from moving
sources is generally characterized as a change in center frequency over time. If this
center frequency is defined in terms of the Fourier transform, we encounter the
problem that there is only one Fourier transform for the entire signal. It is thus
impossible to characterize a change in the center frequency over time. In contrast, the
short-time Fourier transform consists of a separate Fourier transform for each instant
in time. In particular, we associate with each instant the Fourier transform of the signal
in the neighborhood of that instant. To illustrate this, we consider the Fourier trans-
form plots in temporal order in Fig. 6.1. These plots were obtained from successive
2-s intervals of an acoustic recording of a moving helicopter. Each plot contains a gap
whenever the Fourier transform magnitude exceeds a threshold. By following such
aps from plot to plot, we can track the frequencies of highest energy. In particular,
S. Hamid Nawab is with the Electrical, Computer, and Systems Engineering Department, Boston
University, Boston, MA 02215. Thomas F. Quatien is with Lincola Laboratory, Massachusetts Institute
of Technology, Lexington, MA 02173. This work was sponsored by the Department of the Air Force. The
views expressed are those of the authors and do not reflect the official poliey or position of the U.S.
government.
289Frequency (Hs)
230 Short-Time Fourier Transform Chap. 6
Figure 6.1 Consecutive Fourier
‘wansform magnitudes of an acoustic
recording of a moving helicopter. Each
plot contains a gap whenever the Fourier
transform magnitude exceeds @
threshold. By following such gaps from
plot to pot, we ean tack the frequencies
of highest energy. The frequency tracks
ofthe two highestenergy harmonics are
Time i) indicated by dashed lines.
we can clearly see the Doppler shift in the various harmonics of the underlying
periodic signal. As we will see later in this chapter, the STFT is also useful for
characterizing other time-dependent frequency changes in applications such as sonar,
‘music, and speech processing. In fact, it has even been suggested that the human ear
performs this type of time-frequency analysis on speech,
‘The rigorous development of the STFT originally took place in the context of
analog signals through the works of Fano [1] and Schroeder and Atal [2]. This work
‘was motivated by previous experimental work for measuring time-dependent spectra
with analog devices such as the sound spectrograph [3]. The time-dependent spectrum
atthe output of such a device is generally displayed in a form known as a spectrogram,
an example of which is illustrated in Fig. 6.2 for a segment of speech. In this,
two-dimensional display, the horizontal axis is time and the vertical axis denotes
frequency. The gray level indicates the spectral magnitude, with the darkest regions
‘corresponding to the highest energy. The real-time constraint for such analysis means
that the transform at any time should depend only on the past values of the signal. Fano
thus defined and developed formal properties of a transform that at any instant
weighted the past values of the signal with a decaying exponential and took the
squared magnitude of the Fourier transform of the result. Schroeder and Atal extended
the concept by using arbitrary weighting functions instead of the exponential. The