
Lecture 27

Wavelets and multiresolution analysis (cont’d)

We continue the discussion from the previous lecture.

Let us now revisit the example studied above, but in another manner. Instead of focussing on the expansion coefficients ajk and bjk, we now consider the values of the function f1(x) on each interval, namely 1/12 and 7/12. Recall that each of these values is obtained by multiplying the appropriate coefficient a1k by √2, the value of the basis function φ1k(x) on its support. But let's rewrite this expansion in terms of the original Haar scaling function φ(x), first in terms of the a1k and then in terms of the coefficients A1k introduced earlier:

f1(x) = a10 φ10(x) + a11 φ11(x)
      = a10 √2 φ(2x) + a11 √2 φ(2x − 1)
      = A10 φ(2x) + A11 φ(2x − 1).    (1)

Since the Haar scaling function φ(2x) = 1 on the interval [0, 1/2), and is zero everywhere else, it
follows that f1 (x) = A10 on [0, 1/2). Likewise, it follows that f1 (x) = A11 on [1/2, 1). We’ll see later
that this holds in general: The constant Ajk corresponds to the value of the projection fj (x) on the
subinterval [k/2j , (k + 1)/2j ). In the special case of the Haar scaling function, it can also be shown
(Exercise) that Ajk corresponds to the mean value of the function f (x) over this subinterval.
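This mean-value property is easy to check numerically. The sketch below is our own construction (the function f, the level j and the index k are arbitrary illustrative choices, not from the lecture): we approximate ajk = ⟨f, φjk⟩ by a Riemann sum and compare Ajk = 2^{j/2} ajk with the mean of f over the subinterval.

```python
import numpy as np

# Numerical check of the mean-value property for the Haar basis (a sketch).
f = lambda x: np.sin(2 * np.pi * x) + x**2   # arbitrary test function

j, k = 3, 5                                  # subinterval [5/8, 6/8)
x = np.linspace(k / 2**j, (k + 1) / 2**j, 200001)
dx = x[1] - x[0]

# a_{jk} = <f, phi_{jk}>, where phi_{jk} = 2^{j/2} on the subinterval, 0 elsewhere
a_jk = 2**(j / 2) * np.sum(f(x)) * dx
A_jk = 2**(j / 2) * a_jk                     # A_{jk} = 2^{j/2} a_{jk}

mean_f = np.mean(f(x))                       # mean of f over the subinterval
assert abs(A_jk - mean_f) < 1e-3             # A_{jk} is (numerically) the mean
```

The agreement is not a coincidence: Ajk = 2^j ∫ f over an interval of width 2^{−j}, which is exactly the mean value.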

Analysis and synthesis algorithms at general resolutions

Previously, we showed how the two expansions of a function f1 ∈ V1 are related to each other. We
can generalize this result to the decomposition

Vj = Vj−1 ⊕ Wj−1 (2)

for any j ∈ Z.

Let fj ∈ Vj , which may be the projection of a function f ∈ L2 (R) on Vj . It can be written in two
ways,
fj = Σ_{k∈Z} ajk φjk    (expansion in Vj basis),    (3)

and

fj = Σ_{k∈Z} aj−1,k φj−1,k + Σ_{k∈Z} bj−1,k ψj−1,k    (expansion in Vj−1 ⊕ Wj−1 basis)
   = fj−1 + wj−1.    (4)

We shall proceed in the same way as we did in the previous lecture, first by expressing all functions
in the above expansions in terms of the Haar scaling function φ(t) and wavelet function ψ(t):
fj(t) = Σ_{k∈Z} ajk 2^{j/2} φ(2^j t − k)
      = Σ_{k∈Z} Ajk φ(2^j t − k),    (5)

and

fj(t) = Σ_{k∈Z} aj−1,k 2^{(j−1)/2} φ(2^{j−1} t − k) + Σ_{k∈Z} bj−1,k 2^{(j−1)/2} ψ(2^{j−1} t − k)
      = Σ_{k∈Z} Aj−1,k φ(2^{j−1} t − k) + Σ_{k∈Z} Bj−1,k ψ(2^{j−1} t − k),    (6)

where, as before, we have introduced the modified coefficients,

Ajk = 2^{j/2} ajk,   Bjk = 2^{j/2} bjk.    (7)

We now return to the following equations, derived in the previous lecture,

φ(2t − 2k) = (1/2) φ(t − k) + (1/2) ψ(t − k)
φ(2t − 2k − 1) = (1/2) φ(t − k) − (1/2) ψ(t − k),    (8)

which, you will recall, represents two cases, φ(t − even) and φ(t − odd). In order to apply these equations to the expansions for fj(t), we'll have to make one change, i.e., to replace t with 2^{j−1} t in (8). Then split the sum in (5) into sums over even and odd indices, substitute the modified results for (8) into this split equation, collect terms and compare with (6). The final result is given in the following algorithms (Exercise):

Analysis or deconstruction algorithm for Vj = Vj−1 ⊕ Wj−1 (Haar wavelet basis)

Aj−1,k = (1/2) [Aj,2k + Aj,2k+1]
Bj−1,k = (1/2) [Aj,2k − Aj,2k+1],    (9)

or, in terms of the ajk and bjk ,

aj−1,k = (1/√2) [aj,2k + aj,2k+1]
bj−1,k = (1/√2) [aj,2k − aj,2k+1].    (10)
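In code, one analysis step of Eq. (10) is just a pairwise sum and difference scaled by 1/√2. The following sketch (the function name and the sample coefficients are our own, not from the lecture) maps the {aj,k} at level j to the {aj−1,k} and {bj−1,k}:

```python
import numpy as np

def haar_analysis_step(a):
    """One step of Eq. (10): a_{j,k} -> (a_{j-1,k}, b_{j-1,k})."""
    a = np.asarray(a, dtype=float)
    avg = (a[0::2] + a[1::2]) / np.sqrt(2)   # a_{j-1,k}: scaled pairwise sums
    dif = (a[0::2] - a[1::2]) / np.sqrt(2)   # b_{j-1,k}: scaled pairwise differences
    return avg, dif

# Illustrative level-2 coefficients (our own numbers):
a1, b1 = haar_analysis_step([1.0, 3.0, 2.0, 2.0])
# a1 = [2*sqrt(2), 2*sqrt(2)],  b1 = [-sqrt(2), 0]
```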

These results can easily be rewritten in order to express the higher resolution coefficients Ajk in terms of the lower resolution coefficients Aj−1,k and Bj−1,k, etc. The result is

Synthesis or construction algorithm for Vj = Vj−1 ⊕ Wj−1 (Haar wavelet basis)

Aj,2k = Aj−1,k + Bj−1,k
Aj,2k+1 = Aj−1,k − Bj−1,k,    (11)

or, in terms of the ajk and bjk ,

aj,2k = (1/√2) [aj−1,k + bj−1,k]
aj,2k+1 = (1/√2) [aj−1,k − bj−1,k].    (12)
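Since Eq. (12) is the exact inverse of Eq. (10), a synthesis step applied to the output of an analysis step returns the input coefficients. A quick round-trip sketch (function names and test data are our own):

```python
import numpy as np

def analysis_step(a):
    """Eq. (10): pairwise sums and differences, scaled by 1/sqrt(2)."""
    a = np.asarray(a, dtype=float)
    return (a[0::2] + a[1::2]) / np.sqrt(2), (a[0::2] - a[1::2]) / np.sqrt(2)

def synthesis_step(a_low, b_low):
    """Eq. (12): rebuild the even- and odd-indexed a_{j,k}."""
    a_low = np.asarray(a_low, dtype=float)
    b_low = np.asarray(b_low, dtype=float)
    a = np.empty(2 * len(a_low))
    a[0::2] = (a_low + b_low) / np.sqrt(2)   # a_{j,2k}
    a[1::2] = (a_low - b_low) / np.sqrt(2)   # a_{j,2k+1}
    return a

rng = np.random.default_rng(0)
a = rng.normal(size=8)                       # arbitrary level-3 coefficients
roundtrip = synthesis_step(*analysis_step(a))
assert np.allclose(roundtrip, a)             # perfect reconstruction
```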

Note that we have written “Haar wavelet basis” in the above subtitles, in order to emphasize that
the above results apply to this particular wavelet system. Analysis/synthesis algorithms will exist for
other wavelet bases, but the forms of the equations, i.e., the coefficients, the number of coefficients on
the RHS, will be different.

Practical application of analysis/synthesis algorithms

We have arrived at one of the most important topics of this course regarding wavelet methods – the
use of the analysis/synthesis algorithms to (1) analyze signals and possibly (2) enhance (i.e., denoise,

deblur) them. The following discussion is quite simplified, but it does provide a general picture of
wavelet-based methods of signal/image analysis and processing.

In what follows, we assume that we are presented with some kind of digital data, possibly obtained
from the equal-time sampling of a continuous signal f (t). Because of the dyadic (i.e., powers of two)
nature of the wavelet spaces, we’ll also assume that the number of data points is N = 2J for some
J > 0.
So, as before, we assume that this digital sampling has produced the data points,

f[0], f[1], · · · , f[2^J − 1].    (13)

Without loss of generality, we assume that these data points define the following piecewise-constant
approximation to f (t) on [0, 1].
fJ(t) = Σ_{k=0}^{2^J − 1} f[k] φ(2^J t − k),    (14)

where, once again, φ(t) denotes the Haar scaling function. Note our choice of the scaled functions φ(2^J t − k), since

φ(2^J t − k) = { 1, t ∈ [k/2^J, (k + 1)/2^J),
              { 0, otherwise.    (15)

By construction fJ ∈ VJ, the space of piecewise-constant functions on intervals of width 2^{−J}. In the previous section, we denoted the expansion of such functions as

fJ(t) = Σ_{k∈Z} AJk φ(2^J t − k).    (16)

In fact, at any level 0 ≤ j ≤ J, the projection of our original continuous function f on the space Vj will have the form

fj(t) = Σ_{k∈Z} Ajk φ(2^j t − k).    (17)

The main point is as follows: From the given VJ resolution of our digital data, we may use the analysis/deconstruction algorithm to progressively construct lower-resolution projections fj ∈ Vj, along with the accompanying detail functions wj ∈ Wj, until we stop at V0:

fJ(t) = fJ−1(t) + wJ−1(t)
      = fJ−2(t) + wJ−2(t) + wJ−1(t)
      ⋮
      = f0(t) + w0(t) + w1(t) + · · · + wJ−1(t).    (18)

This may be done by performing the analysis algorithm on the Ajk /Bjk coefficients, starting with
the input data,
AJk = f[k],   k = 0, 1, · · · , 2^J − 1,    (19)

and producing Ajk /Bjk coefficients for j = J − 1, then j = J − 2, and ending at j = 0. At each step,
the coefficients Ajk define the projection fj ∈ Vj via Eq. (17). Since these coefficients multiply the
basis functions φjk which span Vj , we shall refer to them (as well as the associated ajk ) as scaling
coefficients.
The Bjk coefficients define the detail function wj ∈ Wj that must be added to fj to produce the approximation fj+1 ∈ Vj+1. Since these coefficients multiply the basis functions ψjk which span Wj, we shall refer to them (as well as the associated bjk) as wavelet coefficients.

This is essentially what was done in the figures presented in Lecture 23, where we started with
the f5 projection of a function f (t) and computed lower resolution projections fj and associated detail
functions wj . But we wish to go further than that – we would like to work with the wavelet coefficients
bjk that define the detail functions wj . So we can proceed in two ways:

1. Starting with the values AJk = f[k], 0 ≤ k ≤ 2^J − 1, perform the analysis algorithm in the Ajk/Bjk scheme to produce the coefficients Bjk, 0 ≤ k ≤ 2^j − 1, for j = J − 1, J − 2, · · · , 0. Then recover the wavelet coefficients bjk via the connection

   bjk = 2^{−j/2} Bjk.    (20)

2. Start with the values aJk = 2^{−J/2} AJk = 2^{−J/2} f[k], and perform the analysis algorithm in the ajk/bjk scheme to produce directly the wavelet coefficients bjk, 0 ≤ k ≤ 2^j − 1, for j = J − 1, J − 2, · · · , 0.

It’s really not a big deal which approach you apply – the first involves the factor 1/2 and the second 1/√2. The important point is that you produce the wavelet coefficients bjk. It is on these coefficients that wavelet-based signal/image processing schemes can then be applied.

At each step j, the analysis algorithm takes the 2^j scaling coefficients ajk and produces 2^{j−1} scaling coefficients aj−1,k and 2^{j−1} wavelet coefficients bj−1,k, for a total of 2^j coefficients. The total number of coefficients remains unchanged. You then repeat this procedure for the next smaller value of j, until you must stop at j = 1, where you have produced a00 and b00. This analysis or deconstruction procedure is illustrated schematically in the figure on the next page.
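The full cascade can be sketched in a few lines of code (our own notation; `haar_decompose` is a hypothetical helper, not a routine from the lecture). It repeatedly applies Eq. (10), storing the wavelet coefficients of each level:

```python
import numpy as np

def haar_decompose(aJ):
    """Apply Eq. (10) repeatedly: {a_{J,k}} -> a_{0,0} plus all b_{j,k}."""
    a = np.asarray(aJ, dtype=float)
    detail = {}                                   # detail[j] holds the {b_{j,k}}
    while len(a) > 1:
        j = int(np.log2(len(a))) - 1              # level of the output coefficients
        detail[j] = (a[0::2] - a[1::2]) / np.sqrt(2)
        a = (a[0::2] + a[1::2]) / np.sqrt(2)
    return a[0], detail

a00, tree = haar_decompose([1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0])
# Total coefficient count is conserved: 1 (a00) + 4 + 2 + 1 (details) = 8.
assert 1 + sum(len(b) for b in tree.values()) == 8
```

For this step-function input all level-2 details vanish, and only the coarse difference b00 is nonzero, exactly as the theory predicts.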

{aJk }

{aJ−1,k } {bJ−1,k }

{aJ−2,k } {bJ−2,k }

{a1,k }

{a0,0 } {b0,0 }

Schematic to illustrate the analysis or deconstruction procedure, starting with the scaling coefficients aJk, 0 ≤ k ≤ 2^J − 1, to produce the wavelet coefficients bjk.

The result of the analysis procedure is to produce a00 along with the wavelet coefficients bjk, j = 0, · · · , J − 1, k = 0, · · · , 2^j − 1, which may be arranged as follows:

a00
b00
b10 b11
b20 b21 b22 b23
⋮
bJ−1,0  bJ−1,1  · · ·  bJ−1,2^{J−1}−1

Note that the final row is comprised of the wavelet coefficients {bJ−1,k}: you cannot construct the level-J coefficients bJk from the fJ resolution.
Once again, at the risk of over-repetition: This tree of coefficients represents the following decomposition of a function fJ ∈ VJ:

VJ = V0 ⊕ W0 ⊕ W1 ⊕ · · · ⊕ WJ−1.    (21)

The element a00 represents the function f0 ∈ V0 since, as you will recall,

a00 = ⟨fJ, φ⟩ = ∫_0^1 fJ(t) dt,    (22)

the mean value of fJ on [0, 1].

So what can we do with these coefficients?

As mentioned earlier, they may be useful in “analyzing” a signal/image, for example, its frequency
components. In general, wavelet functions are localized, so the analysis of signals can be performed
locally, unlike the situation with Fourier transforms (unless we break up the data into blocks).
Secondly, various enhancement procedures may be performed in the wavelet domain, e.g., de-
noising, deblurring, edge enhancement. Once again, the localized nature of wavelet basis functions
can be advantageous here.
One of the fundamental properties of wavelet expansions that permits analysis and/or enhancement is the decay of (the magnitude of) the wavelet coefficients bjk as the resolution level j increases. The resolution level is also referred to as the frequency, since the wavelet functions ψjk necessarily oscillate more rapidly with increasing j – this follows from the scaling relation ψjk = 2^{j/2} ψ(2^j t − k).
That being said, some wavelet coefficients will decay more slowly than others, because of particular
features in the signal/image being represented. To illustrate this, let us return to the example presented
in Lecture 22: the function f (x) on [0, 1] defined as follows,

f(x) = { 8(x − 0.6)^2 + 1,  0 ≤ x < 0.6,
       { 8(x − 0.6)^2 + 3,  0.6 ≤ x < 1.    (23)

A plot of f (x) along with its approximation f5 ∈ V5 is shown in the next figure. Obviously, f (x) is
discontinuous at x = 0.6.
The wavelet coefficient tree constructed from f5 – originally presented in Lecture 23 – is presented
below the graph. The decay of the wavelet coefficients with increasing frequency is quite evident
from this display. But another noteworthy feature of this coefficient tree is how the coefficients
b49 = −0.20, b34 = −0.13 stand out from the other coefficients in the levels j = 4 and j = 3,

respectively. Recalling that for the Haar wavelet basis, the location of the box enclosing the wavelet
coefficient bjk actually corresponds to the support of the associated wavelet function ψjk , we see that
these coefficients are associated with wavelets that overlap with the discontinuity at 0.6. The support
of ψ49 is [9/16, 10/16) = [0.5625, 0.625), and contains the discontinuity.

Graph of f (x) defined in text, along with its approximation f5 (x).

a00 = 2.55
b00 = −0.40
b10 = 0.49 b11 = −0.49
b20 = 0.24 b21 = 0.11 b22 = −0.41 b23 = −0.13
0.09 0.07 0.05 0.03 −0.13 −0.01 −0.04 −0.06
0.04 0.03 0.03 0.02 0.02 0.02 0.01 0.01 0.00 -0.20 -0.00 -0.01 -0.00 -0.01 -0.02 -0.02

Wavelet coefficient tree associated with V5 approximation of function f (x).

These two coefficients, along with b22 = −0.41, are said to comprise a significant (wavelet) tree. Significant trees are produced by discontinuities or cusps, i.e., irregularities, of the function f(t). In the two-dimensional case, i.e., images, they are associated primarily with edges. In the case of images, it is generally desired that an enhancement scheme preserve edges, i.e., that it not alter them significantly, for example, by blurring. This is because the edges of an image contain a great deal of information in the image, being associated with boundaries of objects, etc. The human visual system is also quite sensitive to degradations in edges. In some cases, observers are known to prefer noisy images over denoised images in which significant edges (e.g., the outline of a face) have been distorted.

Wavelet-based compression

As mentioned earlier, signals and images generally possess a great deal of redundancy because of
correlation within the data. For example, if a signal is continuous, then its value f(t + ∆t), for ∆t sufficiently small, will be close to f(t). As such, we might not have to store both signal values: storing f(t) and ∆f(t) = f(t + ∆t) − f(t) will suffice to reconstruct f(t + ∆t), possibly reducing the
storage requirements, especially if we can do this for a number of consecutive signal values. This is one
example of data compression. In our discussion of the discrete Fourier transform, we saw that a good
number of DFT coefficients are small in magnitude, essentially insignificant. If they are discarded,
there is little error incurred in the modified signal.
The same idea holds with wavelet expansions. There can be many coefficients that are insignificant
and which can be discarded without affecting the quality of the signal appreciably. Of course, the
more compression desired, the more coefficients we must discard, resulting in an increased error, or
decreased fidelity. For example, in the wavelet coefficient table above, we could decide to discard all
coefficients that are zero to two decimals.
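A sketch of this idea in code (the coefficient vector below is an illustrative synthetic stand-in for a typical wavelet table, not data from the lecture): zero out every coefficient smaller in magnitude than a cutoff and measure the relative error introduced.

```python
import numpy as np

# Illustrative "wavelet coefficients": one large value plus a decaying tail.
rng = np.random.default_rng(1)
coeffs = np.concatenate([[2.5], 0.5 ** np.arange(1, 64) * rng.normal(size=63)])

threshold = 1e-3
compressed = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)

n_kept = int(np.count_nonzero(compressed))
rel_err = np.linalg.norm(coeffs - compressed) / np.linalg.norm(coeffs)
print(f"kept {n_kept} of 64 coefficients; relative L2 error {rel_err:.1e}")
```

Because the discarded coefficients are all tiny, the relative error stays well below one percent even though most of the coefficients are dropped.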

Wavelet-based denoising

Denoising in the wavelet domain is based upon the same general principle as Fourier-based methods
discussed earlier. Very loosely put, in the case of additive noise, the noise will appear over the entire
wavelet coefficient tree – not just at high frequencies but at all frequencies, as will be shown in the
experiment presented below. But since the wavelet coefficients of a signal or image decay in magnitude
with increasing frequency, the noise will be more dominant at higher frequencies, until it essentially
makes up most, if not all, of the signal there. A simple strategy, therefore, is to eliminate these
higher frequency coefficients. The thresholding of wavelet coefficients using a single threshold T > 0
can be performed, but it is not optimal. More sophisticated schemes employ different thresholds Tj
at the different resolution levels, acknowledging the general decay of the magnitudes of the wavelet
coefficients bjk with j.
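As a sketch, level-dependent hard thresholding might look as follows. The tree below reuses the first three levels of the coefficient tree shown earlier for the function with a jump at x = 0.6; the thresholds Tj themselves are illustrative assumptions, not values from the lecture.

```python
import numpy as np

def threshold_tree(detail, T):
    """Hard-threshold each level j of the tree with its own cutoff T[j]."""
    return {j: np.where(np.abs(b) > T[j], b, 0.0) for j, b in detail.items()}

# Levels 0-2 of the wavelet tree from the earlier example:
detail = {0: np.array([-0.40]),
          1: np.array([0.49, -0.49]),
          2: np.array([0.24, 0.11, -0.41, -0.13])}
T = {0: 0.05, 1: 0.10, 2: 0.20}          # hypothetical per-level thresholds T_j

cleaned = threshold_tree(detail, T)
# At level 2, only the two largest coefficients (0.24 and -0.41) survive.
assert np.count_nonzero(cleaned[2]) == 2
```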

Example: Here we show some results of a particular experiment in which the wavelet coefficient tree of a pure noise signal was computed. The coefficients N[n], 1 ≤ n ≤ 1024 (= 2^10), of the pure noise signal were generated randomly from a Gaussian distribution with zero mean and standard deviation σ = 0.5. A plot of the signal N[n], considered as a sampling of a function supported on [0, 1], is shown below.


Discrete noisy signal N[n], 1 ≤ n ≤ 1024, considered as a function over [0, 1]

In the table below the figure are presented the scaling coefficient a00 and wavelet coefficients bjk for 0 ≤ j ≤ 4. There appears to be no general decay of the magnitudes of the bjk coefficients as one proceeds downward in the table, i.e., with increasing j. Indeed, the magnitudes of the b9,k coefficients, 0 ≤ k ≤ 511, the last row in the wavelet table for this signal (not shown), have the same range of values as the wavelet coefficients bjk shown in the table.
Since the wavelet basis is orthonormal, the (Euclidean) norms of the signal and its coefficient vector are equal, i.e.,

a00^2 + Σ_{j=0}^{9} Σ_{k=0}^{2^j − 1} bjk^2 = Σ_{n=1}^{1024} N[n]^2.    (24)

Indeed, the “energy” of the signal, i.e., its squared norm, seems to be rather equally distributed over all frequency bands. We shall return to this example in the next section.

a00 = −0.017
b00 = 0.0038
b10 = 0.027 b11 = −0.006
b20 = 0.011 b21 = −0.013 b22 = −0.036 b23 = 0.004
−0.001 0.015 0.006 −0.014 0.000 −0.013 0.015 −0.011
0.022 0.021 -0.005 -0.000 -0.008 0.021 0.008 -0.015 0.010 -0.020 -0.004 0.008 -0.023 0.001 0.003 0.015

Wavelet coefficient tree of noisy signal to wavelet level W4 .
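The energy identity (24) is easy to verify numerically. The sketch below regenerates a comparable noise signal (our own random seed) and treats the samples N[n] themselves as the level-10 scaling coefficients, the convention under which Eq. (24) holds, so that the analysis cascade is an orthogonal transform:

```python
import numpy as np

rng = np.random.default_rng(42)
N = rng.normal(0.0, 0.5, size=1024)      # zero-mean Gaussian noise, sigma = 0.5

a = N.copy()                             # treat samples as level-10 coefficients
sum_b2 = 0.0
while len(a) > 1:                        # analysis cascade, Eq. (10)
    b = (a[0::2] - a[1::2]) / np.sqrt(2)
    a = (a[0::2] + a[1::2]) / np.sqrt(2)
    sum_b2 += np.sum(b**2)

# Eq. (24): a00^2 plus the sum of all b_{jk}^2 equals the energy of the signal.
assert np.isclose(a[0]**2 + sum_b2, np.sum(N**2))
```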

Reconstruction/synthesis from the wavelet expansion

After having constructed the wavelet coefficient tree, and possibly having modified it for the purpose of denoising or some other enhancement, the question remains, “How do we reconstruct our possibly modified function f(t)?” The answer is, “via the synthesis/construction algorithm.” You simply work backwards. If you go back to the equations, you’ll see that the synthesis algorithm produces only ajk coefficients, but it needs aj−1,k and bj−1,k coefficients as input. Very quickly, the procedure goes as follows:

1. Step j = 1: Start with a00 and b00 , representing V0 and W0 , and construct a10 and a11 . You’ve
now constructed the V1 representation.

2. Step j = 2: Take the a1k coefficients and, along with b10 and b11 , compute the coefficients a2,k ,
0 ≤ k ≤ 3. This is the V2 representation.

3. Step j: Take the aj−1,k coefficients computed in the previous step and, along with the bj−1,k, compute the coefficients ajk, 0 ≤ k ≤ 2^j − 1. This is the Vj representation.

4. Step J: This is the final step, in which the coefficients aJk are computed. From these coefficients you compute AJk = 2^{J/2} aJk, the discrete values of the function fJ.
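The four steps above can be sketched as a single loop (our own notation; `haar_reconstruct` is a hypothetical helper). Each pass applies Eq. (12), doubling the number of scaling coefficients:

```python
import numpy as np

def haar_reconstruct(a00, detail):
    """Run the synthesis cascade: a00 plus the {b_{j,k}} -> the a_{J,k}."""
    a = np.array([a00])
    for j in sorted(detail):                    # j = 0, 1, ..., J-1
        b = np.asarray(detail[j], dtype=float)
        nxt = np.empty(2 * len(a))
        nxt[0::2] = (a + b) / np.sqrt(2)        # even-index coefficients, Eq. (12)
        nxt[1::2] = (a - b) / np.sqrt(2)        # odd-index coefficients
        a = nxt
    return a

# With all detail coefficients zero, f_J is constant and a_{2,k} = a00 / 2:
a2 = haar_reconstruct(1.0, {0: [0.0], 1: [0.0, 0.0]})
assert np.allclose(a2, 0.5)
```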

Example: The first four coefficients in the table presented earlier, to five digit precision, are

a00 = 2.54666, b00 = −0.39999, b10 = 0.49497, b11 = −0.49499. (25)

This corresponds to the V2 projection of f (t), i.e., J = 2, since

V2 = V0 ⊕ W0 ⊕ W1 . (26)

The above coefficients may be used to compute this projection f2 (t), as defined by the coefficients a2k ,
0 ≤ k ≤ 3. Using the synthesis algorithm, we must first compute a1k (Step j = 1):

a10 = (1/√2) [a00 + b00] = 1.51792,
a11 = (1/√2) [a00 − b00] = 2.08359.    (27)

We now use these to compute a2k (Step j = 2):

a20 = (1/√2) [a10 + b10] = 1.42333,
a21 = (1/√2) [a10 − b10] = 0.72333,
a22 = (1/√2) [a11 + b11] = 1.12331,
a23 = (1/√2) [a11 − b11] = 1.82333.    (29)

The four function values f2[k] (J = 2) assumed by f2 on [k/4, (k + 1)/4) are

A20 = a20 · 2^{J/2} = a20 · 2 = 2.84666
A21 = a21 · 2^{J/2} = a21 · 2 = 1.44666
A22 = a22 · 2^{J/2} = a22 · 2 = 2.24662
A23 = a23 · 2^{J/2} = a23 · 2 = 3.64666.    (30)
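The chain of computations (25) → (30) can be checked in a few lines; any discrepancies in the last digit come from the five-digit rounding of the input coefficients.

```python
import numpy as np

# Coefficients of Eq. (25):
a00, b00 = 2.54666, -0.39999
b10, b11 = 0.49497, -0.49499

# Step j = 1, Eq. (27):
a10 = (a00 + b00) / np.sqrt(2)
a11 = (a00 - b00) / np.sqrt(2)

# Step j = 2, Eq. (29):
a2 = np.array([a10 + b10, a10 - b10, a11 + b11, a11 - b11]) / np.sqrt(2)

# Function values, Eq. (30): A_{2,k} = 2^{J/2} a_{2,k} with J = 2.
A2 = 2 * a2
print(np.round(A2, 4))   # close to [2.8467, 1.4467, 2.2466, 3.6467]
```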

These results are in agreement with the plot of the projection f2 ∈ V2 of this function presented in
last week’s lecture. The plot is presented below.
f2 ∈ V2

Note that this is as far as we can go! We were given four initial data values in Eq. (25). From this data set, we can construct only four function values. We cannot construct f3, the projection of f onto V3, since this would require 8 initial data values. We would need the additional four wavelet coefficients b20, b21, b22 and b23 in order to construct, with the help of the a2k coefficients, the coefficients a3k defining f3 ∈ V3.

Example: We return to the wavelet decomposition of the noise signal introduced in the previous section. In the plots below we show reconstructions of the projections Nj ∈ Vj of the noise signal for 1 ≤ j ≤ 6. The projection N5 employs the wavelet coefficients presented in the table in the previous subsection.
Note that for low j, i.e., j = 1 and 2, the projections/approximations Nj have very low magni-
tude. This is due in part to the zero-mean nature of the noise. The coefficients employed in these
approximations, i.e., a00 , b00 , b10 and b11 , have low magnitudes (they are usually much higher for
“nice” signals) since they are obtained by taking the scalar product of the zero-mean noise signal with
the Haar scaling and lower-frequency wavelet functions, all of which are piecewise constant. This in-
dicates that these low-j approximations can effectively “denoise” a signal. Of course, they would also
remove the high-frequency content of the original signal which would probably not be beneficial! This
is particularly serious in the Haar wavelet case, given that the wavelet functions are discontinuous.
As more coefficients are included, however, i.e., as j increases, the amplitudes, as well as the
irregularity, of the approximations Nj increase. This must be the case, since the signal N [n] must be
perfectly reconstructed at V10 . In the next set of figures, we show the approximation N9 ∈ V9 along
with the reconstructed signal N10 ∈ V10 .
In the next figure, we show the projections/approximations Nj ∈ Vj obtained by using the so-
called “Daubechies-4” wavelet basis which will be discussed very shortly in this course. The Daubechies
wavelets are continuous functions and thereby yield approximations which are continuous.

[Plots omitted: panels N1 ∈ V1 through N6 ∈ V6.]

Projections Nj ∈ Vj, 1 ≤ j ≤ 6, of noise signal N[n], 1 ≤ n ≤ 1024, zero-mean Gaussian, σ = 0.5. Haar wavelet basis.

[Plots omitted: panels N9 ∈ V9 and N = N10 ∈ V10.]

Projection N9 ∈ V9 along with reconstructed noise signal N[n], 1 ≤ n ≤ 1024, zero-mean Gaussian, σ = 0.5. Haar wavelet basis.

[Plots omitted: panels N1 ∈ V1 through N6 ∈ V6.]

Projections Nj ∈ Vj, 1 ≤ j ≤ 6, of noise signal N[n], 1 ≤ n ≤ 1024, zero-mean Gaussian, σ = 0.5. Daubechies-4 wavelet basis.

Lecture 28

Multiresolution analysis: A general treatment

We now turn to a more general mathematical treatment of multiresolution analysis (MRA) and
wavelets. You have seen most of the ideas presented below in the context of the Haar system, a
particular example of an MRA. Indeed, the Haar system was a perfect starting point since the main
ideas of MRA could be presented in a rather easy way. A more general mathematical treatment will
show that many wavelet systems are possible.

Multiresolution analysis

Let Vj , j = · · · , −2, −1, 0, 1, 2, · · · , be an infinite sequence of closed subspaces of the function space
L2 (R). This collection {Vj } is called a multiresolution analysis with scaling function φ if the
following conditions hold:

1. nesting: Vj ⊂ Vj+1 for all j ∈ Z.

You’ve already seen this property in the context of the Haar system, where Vj is the set of all L2(R) functions that are piecewise-constant on intervals [k/2^j, (k + 1)/2^j). For other MRA’s, the definition of the Vj will be different.

2. density: the union ∪_{j∈Z} Vj is dense in L2(R), i.e., its closure is all of L2(R).

This essentially states, in proper set-theoretic language, that “lim_{j→∞} Vj = L2(R).”

3. separation: ∩_{j∈Z} Vj = {0}.

The only element common to all subspaces Vj is the zero function. Think about it in the context of the Haar system: Here, the space Vj consists of all piecewise-constant functions over intervals of length 2^{−j}. As j → −∞, these intervals get larger and larger. The only L2(R) function that is piecewise-constant over the entire real line R is the zero function.

4. scaling: f (x) ∈ Vj ⇐⇒ f (2x) ∈ Vj+1 .

This provides a connection between consecutive subspaces in terms of scaling: When
you contract a function f ∈ Vj in the x-direction toward the y-axis by a factor of 2,
you have produced a new function that lies in the next higher refinement space Vj+1 .
A mathematical statement equivalent to the above is (Exercise):

f(x) ∈ Vj ⇐⇒ f(2^{−j} x) ∈ V0.

Subspaces Vj satisfying 1-4 are known as approximation spaces.

5. orthonormal basis: The function φ ∈ V0, and the set of functions {φ(x − k), k ∈ Z} is an orthonormal basis of V0 (with respect to the inner product in L2(R)).

Different MRAs will have different scaling functions, φ, and different systems of subspaces Vj. In fact, as we shall see below, we need only specify a scaling function φ and the rest will follow. Once again, we remind ourselves of the Haar MRA, which is determined by the scaling function

φHaar(x) = { 1, 0 ≤ x < 1,
           { 0, otherwise.    (31)

The space V0 spanned by integer translates {φHaar(x − k), k ∈ Z} is naturally the space of L2(R) functions that are piecewise-constant over the intervals [k, k + 1), k ∈ Z. The fact that the space Vj is composed of functions that are piecewise-constant over the intervals [k/2^j, (k + 1)/2^j) naturally follows.

In what follows, we consider a general multiresolution analysis (MRA).

Theorem: Let {Vj } be an MRA with scaling function φ. Then for any j ∈ Z, the set of functions,

{φjk = 2^{j/2} φ(2^j x − k), k ∈ Z},    (32)

forms an orthonormal basis of the subspace Vj .

Proof: We must show that any function f ∈ Vj is expressible as a linear combination of functions φjk defined above. Let f ∈ Vj. From the scaling property (4) for MRA, it follows that g(x) = f(2^{−j} x) ∈ V0.
But from property (5) for MRA, the functions φ(x − k) form an orthonormal basis of V0 , implying

that g(x) admits an expansion of the form

g(x) = Σ_{k∈Z} a0k φ(x − k),   a0k = ⟨g(x), φ(x − k)⟩.    (33)

We rewrite this result as follows,

f(2^{−j} x) = Σ_{k∈Z} a0k φ(x − k).    (34)

Now let x = 2^j t, t ∈ R, to give

f(t) = Σ_{k∈Z} a0k φ(2^j t − k) = Σ_{k∈Z} a0k 2^{−j/2} φjk.    (35)

Thus, f is a linear combination of the functions φjk .

It remains to show that the φjk form an orthonormal set: For k, l ∈ Z,

⟨φjk, φjl⟩ = ∫_R 2^{j/2} φ(2^j x − k) 2^{j/2} φ(2^j x − l) dx
           = 2^j ∫_R φ(2^j x − k) φ(2^j x − l) dx.    (36)
R

Now let t = 2^j x, so that dt = 2^j dx, etc. Then

⟨φjk, φjl⟩ = ∫_R φ(t − k) φ(t − l) dt
           = δkl,    (37)

since the {φ(x − k)} form an orthonormal basis in V0 . The orthonormality of the {φjk } has thus been
established.

Note: Yes, the above result is quite simple. Nevertheless, it is a very powerful one, and will be used
to establish other important results. It characterizes the fundamental property of scaling for MRA’s.

The Scaling Equation

Since the scaling function φ ∈ V0 ⊂ V1 and, from the previous theorem, the set {φ1k(x) = √2 φ(2x − k)} forms an orthonormal basis of V1, it follows that φ admits an expansion in this basis:

φ(x) = Σ_{k∈Z} hk φ1k(x),    (38)

where
hk = ⟨φ, φ1k⟩.    (39)

This equation is usually expressed in the following form,

φ(x) = Σ_{k∈Z} hk √2 φ(2x − k),    (40)

known as the scaling equation or two-scale relation for the scaling function φ of the MRA. Since the expansion is in the L2-sense, the coefficients h = {hk} form a square-summable sequence in ℓ2. In fact, it is easy to see, from the orthonormality of the φ1k functions, that

⟨φ, φ⟩ = 1 = Σ_{k∈Z} |hk|^2.    (41)

Recall that for the Haar system,

h0 = h1 = 1/√2,   hk = 0 otherwise,    (42)

which leads to the familiar scaling equation,

φ(x) = φ(2x) + φ(2x − 1). (43)

The scaling equation is a fundamental equation for MRA. In essence, it may be viewed as a definition of the scaling function φ via the expansion coefficients hk. Not just any set of coefficients {hk} will do – recall the requirement that the integer translates of φ(x) be orthogonal to each other. There are some other constraints that we shall discuss later.

Just to give you an idea of another MRA, here are the coefficients that define the so-called “Daubechies-4” wavelets,

h0 = (1 + √3)/(4√2),   h1 = (3 + √3)/(4√2),   h2 = (3 − √3)/(4√2),   h3 = (1 − √3)/(4√2),   hk = 0 otherwise.    (44)

They are named after Ingrid Daubechies, a mathematician at Princeton University who is well-known
for her many fundamental contributions to wavelet theory, including the construction of wavelets
with prescribed regularity (i.e., continuous, k-times differentiable), including the Daubechies-4 MRA.
(There is a family of “Daubechies-n” wavelets, where n signifies the number of non-zero expansion coefficients hk.)
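Two quick numerical checks on the coefficients (44): they satisfy Σ |hk|^2 = 1, as required by Eq. (41), and also Σ hk = √2, a standard normalization (it follows from integrating the scaling equation; stated here as background, not derived in the lecture).

```python
import numpy as np

s3, s2 = np.sqrt(3.0), np.sqrt(2.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * s2)   # Eq. (44)

assert np.isclose(np.sum(h**2), 1.0)    # Eq. (41): sum of |h_k|^2 equals 1
assert np.isclose(np.sum(h), s2)        # normalization: sum of h_k equals sqrt(2)
```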

In the figure below is plotted the Daubechies-4 scaling function φ(x) – the scaling function that
satisfies Eq. (40) for the four non-zero hk coefficients in (44). At first sight, it must look rather
strange, with its jaggedness. That being said, it is, in some ways, an enormous improvement over the
Haar scaling function since it is a continuous function!

Scaling function φ(x) for the Daubechies-4 MRA, with coefficients hk given in (44).

What may appear to be even more remarkable is that integer translates of this function, i.e., φ(x − k), are orthogonal to each other. We plot the graphs of φ(x) and its translate φ(x − 1) on the same set of axes below – the reader is invited to reason geometrically about how the inner product of these two functions can be zero.

Scaling function φ(x) for the Daubechies-4 MRA and its translate φ(x − 1).
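That orthogonality can be traced back to a discrete condition on the coefficients hk: substituting the scaling equation into ⟨φ(x), φ(x − m)⟩ leads to Σ_k hk h_{k−2m} = δ_{0m} (a standard identity, stated here as background rather than derived in the lecture). For the four Daubechies-4 coefficients the only nontrivial shift is m = 1:

```python
import numpy as np

s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))  # Eq. (44)

# Double-shift orthogonality: sum_k h_k h_{k-2m} = 0 for m != 0.
# With only four taps, m = 1 reduces to h_0*h_2 + h_1*h_3.
m1 = h[0] * h[2] + h[1] * h[3]
assert np.isclose(m1, 0.0)     # consistent with <phi(x), phi(x-1)> = 0
```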

The scaling equation and “self-similarity”

The scaling equation (40) shows that the graph of φ(x) may be expressed as a linear combination of contracted and translated copies of itself (in the x direction) which are also scaled by the coefficients hk√2. This may be viewed as a kind of “self-similarity” property of the scaling function φ(x).
Self-similarity is a well-known concept in the context of fractal sets, which you may have seen in popular expositions of mathematics. For example, the “Sierpinski gasket,” shown below, is an example of a fractal set. (Its topological dimension is 1, since it is a union of continuous curves, but its Hausdorff dimension is log 3/ log 2 ≈ 1.585 because of its scaling property. In other words, it is “thicker” than a curve but “not as thick” as the plane.) This set is self-similar in that parts of it are contracted copies of the entire set. Moreover, it may be viewed as a union of contracted copies of itself. For example, it may be expressed as a union of three copies that are contracted by factors of 2 in the x and y directions and positioned so that appropriate vertices touch.

The “Sierpinski gasket” fractal set.

The scaling equation (40) is a functional version of self-similarity – you take a function such as
φ(x), make contracted copies of its graph in the x-direction, vary the y-values of each of these graphs,
then translate them, and finally add them up. If the result is φ(x), then φ(x) is said to be self-similar.

Wavelets with finite support

You will also note that the support of the Daubechies-4 scaling function appears to be finite, which indeed it is. (In other words, there is a finite interval [a, b] such that φ(x) = 0 for all x ∉ [a, b].) This is, of course, also the case for the Haar scaling function. Wavelets with finite support are very useful
in the analysis and processing of signals and images. The localization of these functions increases with
frequency, i.e., their supports become smaller, permitting finer detection of important features. You
have already seen an example of such detection in the case of Haar wavelets, where the magnitudes
of the coefficients in a given band exhibited local maxima in the vicinity of a discontinuity. We shall
return to this phenomenon later.

An important result:

Theorem: If the support of the scaling function φ(x) is finite, then only a finite number of the
coefficients hk can be nonzero.

Proof: Suppose that φ(x) = 0 outside the interval [−a, a], where a > 0 is finite. To argue by
contradiction, suppose also that infinitely many of the hk are nonzero, so that there is an increasing
sequence of integers k1 < k2 < · · · with hki ≠ 0. Now let φ(p) ≠ 0 for some p ∈ [−a, a]. Then from
the scaling equation,

    φ(x) = Σ_{k∈Z} hk √2 φ(2x − k),    (45)

it follows that there are nonzero contributions to the right-hand side at the points xi ∈ R defined
by 2xi − ki = p, i = 1, 2, · · ·, implying that the values φ(xi) are nonzero. But a rearrangement yields
xi = (p + ki)/2, implying that xi → ∞ as i → ∞. This contradicts the assumption that φ(x) is zero
outside the interval [−a, a].

Lecture 29

Multiresolution analysis: A general treatment (cont’d)

Wavelet spaces

Previously, we discussed the scaling equation for the scaling function φ(x). We now discuss an analo-
gous equation for the associated wavelet function ψ(x).

Once again, we start with the facts that φ ∈ V0 and V0 ⊂ V1. As we did last week, we define
W0 ⊂ V1 to be the subspace orthogonal to V0, i.e., the orthogonal complement of V0 in V1:

    W0 = V0⊥ = {x ∈ V1 | ⟨x, y⟩ = 0 for all y ∈ V0}.    (46)

It can be shown that W0 is a closed linear subspace: if {xn} is a Cauchy sequence in W0, then it
has a limit x ∈ W0.

Recall that the integer translates φ(x − k) of the scaling function formed an orthonormal basis of V0.
We now look for a function ψ ∈ W0 such that its integer translates ψ(x − k) form an orthonormal
basis of W0.
Since ψ ∈ W0 ⊂ V1, it will admit an expansion in the φ1k basis of V1. We shall first write this
expansion as follows,

    ψ(x) = Σ_{k∈Z} gk φ1k(x),    (47)

and then write it in two-scale form,

    ψ(x) = Σ_{k∈Z} gk √2 φ(2x − k).    (48)

It is helpful to recall the associated expansions for the scaling function φ(x), i.e.,

    φ(x) = Σ_{k∈Z} hk φ1k(x)    (49)

and

    φ(x) = Σ_{k∈Z} hk √2 φ(2x − k).    (50)

Note the similarity in form of these two sets of equations.

Returning to the wavelet function, since ψ ∈ W0, we must have that

    ⟨ψ, φ⟩ = 0.    (51)

The above inner product is computed easily, using the expansions in (47) and (49) in the V1 basis:

    ⟨ψ, φ⟩ = ⟨ Σ_{k∈Z} gk φ1k , Σ_{l∈Z} hl φ1l ⟩
           = Σ_{k∈Z} Σ_{l∈Z} gk h̄l ⟨φ1k , φ1l⟩
           = Σ_{k∈Z} gk h̄k ,    (52)

where the final line follows from the orthonormality of the φ1k basis functions.

From the orthogonality constraint in Eq. (51), the g and h sequences must obey the following condition,

    Σ_{k∈Z} gk h̄k = · · · + g−1 h̄−1 + g0 h̄0 + g1 h̄1 + g2 h̄2 + · · · = 0.    (53)

Here is a very “cheap trick” that works: Set

    gk = (−1)^k h̄_{1−k},  k ∈ Z,    (54)

so that, for example,

    g−1 = −h̄2 ,  g0 = h̄1 ,  g1 = −h̄0 ,  g2 = h̄−1 ,  etc.    (55)

Substitution into (53) yields

    · · · − h̄2 h̄−1 + h̄1 h̄0 − h̄0 h̄1 + h̄−1 h̄2 + · · · = 0.    (56)

In other words, this clever trick produces a cancellation of the terms in pairs: the k-th and
(1 − k)-th terms of the sum are equal in magnitude and opposite in sign. To review, given the
scaling function with expansion

    φ(x) = Σ_{k∈Z} hk √2 φ(2x − k),    (57)

we have produced a function ψ(x) that is orthogonal to φ(x), with expansion

    ψ(x) = Σ_{k∈Z} (−1)^k h̄_{1−k} √2 φ(2x − k).    (58)

The function ψ(x) is known as the mother wavelet function or, simply, the wavelet function of the
MRA defined by the hk coefficients.

For the remainder of the course, we shall be working with real-valued expansions, in which case
it is not necessary to use the complex-conjugate notation. As such, we shall simply write ψ(x) as
follows,

    ψ(x) = Σ_{k∈Z} (−1)^k h_{1−k} √2 φ(2x − k).    (59)

Recall that for the Haar MRA case, the hk scaling coefficients are

    h0 = 1/√2 ,  h1 = 1/√2 ,  hk = 0 otherwise,    (60)

implying that the gk coefficients for the wavelet function are

    g0 = (−1)^0 h1 = 1/√2 ,  g1 = (−1)^1 h0 = −1/√2 ,  gk = 0 otherwise.    (61)

This leads to the following equation for the Haar mother wavelet,

    ψ(x) = φ(2x) − φ(2x − 1),    (62)

which we have already encountered.
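The construction of the gk from the hk is easy to mechanize. A small Python sketch for the Haar case (the dictionary bookkeeping is just one convenient way to handle two-sided sequences):

```python
import numpy as np

# Haar scaling coefficients, stored as {k: h_k} (zero off this support).
h = {0: 1 / np.sqrt(2), 1: 1 / np.sqrt(2)}

# The "cheap trick" of Eq. (59): g_k = (-1)^k h_{1-k}.
# Since h_k is nonzero only for k in {0, 1}, the same is true of g_k.
g = {k: (-1) ** k * h[1 - k] for k in (0, 1)}
print(g)   # g_0 = 1/sqrt(2), g_1 = -1/sqrt(2), as in Eq. (61)

# Orthogonality condition (53): sum_k g_k h_k = 0.
dot = sum(gk * h.get(k, 0.0) for k, gk in g.items())
print(abs(dot) < 1e-12)   # True
```

With these gk, Eq. (48) gives ψ(x) = √2 [g0 φ(2x) + g1 φ(2x − 1)] = φ(2x) − φ(2x − 1), which is Eq. (62).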

We now return to the Daubechies-4 multiresolution analysis system defined by the hk coefficients
in Eq. (44). The associated wavelet function ψ(x) is plotted in the figure below. Because the scaling
function φ(x) and its dilations/translations φ(2x − k) are continuous, the function ψ(x), a linear
combination of the latter, is continuous.
Mother wavelet function ψ(x) for the Daubechies-4 MRA, with coefficients hk given in (44).

A note on the “cheap trick” given above: We must acknowledge that the method produced only a
particular solution to Eq. (53), namely Eq. (54). Eq. (53) is a single equation in infinitely many
unknowns gk, so a unique (nonzero) solution cannot exist. For example, we may multiply all of the
gk in Eq. (54) by a nonzero constant C to produce other solutions of (53), leading to constant
multiples of the wavelet function ψ(x).
In fact, it can be shown (and will be, a little later) that the following is also a solution,

    gk = (−1)^k h̄_{2p+1−k},  k ∈ Z,    (63)

for any integer p, representing even-integer translations of the sequence gk. These solutions correspond
to integer translations of the wavelet function ψ which, as we will discover, are also orthogonal to
φ(x).
Are there any other nontrivial (i.e., nonzero) solutions? There may be, but they would be specific
to the particular set of hk coefficients in question. The “cheap trick” produces a solution that is
applicable to all MRAs, yielding the desired wavelet function ψ(x).
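For a concrete check of Eq. (63), the sketch below builds the shifted sequences gk = (−1)^k h_{2p+1−k} for several values of p and verifies that each satisfies the orthogonality condition (53); the Daubechies-4 values of hk are assumed to be the standard textbook ones.

```python
import numpy as np

# Standard Daubechies-4 coefficients (assumed values of the hk in Eq. (44)).
s3 = np.sqrt(3.0)
h = dict(enumerate(np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))))

def g_shifted(p):
    """g_k = (-1)^k h_{2p+1-k} (Eq. (63)); nonzero only for 2p-2 <= k <= 2p+1."""
    return {k: (-1) ** k * h[2 * p + 1 - k] for k in range(2 * p - 2, 2 * p + 2)}

# Every even-integer shift of the "cheap trick" solution satisfies Eq. (53).
for p in (-2, -1, 0, 1, 3):
    g = g_shifted(p)
    dot = sum(gk * h.get(k, 0.0) for k, gk in g.items())
    print(p, abs(dot) < 1e-12)   # True for every p
```

The cancellation is again in pairs: the terms for k and 2p + 1 − k carry opposite signs, exactly as in the p = 0 case.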

We now continue with our discussion of multiresolution analysis.

Theorem: The set of functions {ψ0k = ψ(x−k)}, where ψ is defined in Eq. (58), forms an orthonormal
basis of W0 .

Proof: We must divide the proof into several steps.

1. We first prove that the set {ψ0k = ψ(x − k), k ∈ Z} is orthonormal. For any k, l ∈ Z, we’ll
compute the inner product,

    ⟨ψ0k , ψ0l⟩ = ⟨ψ(x − k), ψ(x − l)⟩.    (64)

It will be convenient to expand these functions in terms of the orthogonal functions φ(2x − m),
m ∈ Z. From Eq. (48), we have

    ψ(x − k) = Σ_{m∈Z} gm √2 φ(2x − 2k − m).    (65)

Substitution into Eq. (64) yields

    ⟨ψ(x − k), ψ(x − l)⟩ = ⟨ Σ_{m∈Z} gm √2 φ(2x − 2k − m), Σ_{n∈Z} gn √2 φ(2x − 2l − n) ⟩
                         = Σ_{m∈Z} gm g_{m+2k−2l}    (2k + m = 2l + n ⇒ n = m + 2k − 2l)
                         = Σ_{m∈Z} (−1)^m h_{1−m} (−1)^m h_{1−m−2k+2l}
                         = Σ_{m∈Z} h_{1−m} h_{1−m−2k+2l}.    (66)

We shall now show that the above expression is zero for k ≠ l and 1 for k = l. To do this, we
need to look at the corresponding orthogonality relations for the translated scaling functions,

    ⟨φ(x − k), φ(x − l)⟩ = ⟨ Σ_{m∈Z} hm √2 φ(2x − 2k − m), Σ_{n∈Z} hn √2 φ(2x − 2l − n) ⟩
                         = Σ_{m∈Z} hm h_{m+2k−2l}    (2k + m = 2l + n ⇒ n = m + 2k − 2l)
                         = δkl.    (67)

Replacement of m with 1 − m and substitution into (66) yields the desired result,

    ⟨ψ(x − k), ψ(x − l)⟩ = δkl.    (68)
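The computation in (66)-(68) can be mirrored numerically: for a finite filter, ⟨ψ(x − k), ψ(x − l)⟩ reduces to the autocorrelation of the gk sequence at the even lag 2(k − l). A sketch for the (assumed standard) Daubechies-4 coefficients, with array position i standing for k = i − 2 (only relative lags matter):

```python
import numpy as np

# Standard Daubechies-4 coefficients (assumed values of the hk in Eq. (44)).
s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))

# g_k = (-1)^k h_{1-k} is nonzero for k = -2, ..., 1; store it so that
# array position i corresponds to k = i - 2.
g = np.array([(-1) ** k * h[1 - k] for k in range(-2, 2)])

def corr(c, shift):
    """sum_m c_m c_{m+shift} for a finite, zero-padded sequence."""
    return sum(c[m] * c[m + shift] for m in range(len(c)) if 0 <= m + shift < len(c))

# Eq. (66)-(68): <psi(x-k), psi(x-l)> = sum_m g_m g_{m+2(k-l)} = delta_{kl}.
for d in range(-2, 3):                    # d = k - l
    print(d, corr(g, 2 * d))              # 1 at d = 0, ~0 otherwise
```

The nonzero-lag sums vanish for the same reason as in (67): the hk autocorrelation at even lags is δ, and the alternating signs in gk preserve that property.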

2. We now prove that the set {ψ0k(x) = ψ(x − k), k ∈ Z} is orthogonal to V0. It is sufficient to
show that

    ⟨φ(x − k), ψ(x − l)⟩ = 0.    (69)

From the scaling results,

    ⟨φ(x − k), ψ(x − l)⟩ = ⟨ Σ_{m∈Z} hm √2 φ(2x − 2k − m), Σ_{n∈Z} gn √2 φ(2x − 2l − n) ⟩
                         = Σ_{m∈Z} hm g_{m+2k−2l}    (2k + m = 2l + n ⇒ n = m + 2k − 2l)
                         = Σ_{m∈Z} (−1)^{m+2k−2l} hm h_{1−m−2k+2l}
                         = Σ_{m∈Z} (−1)^m hm h_{1−m−2k+2l}.    (70)

We shall need one more result,

    Σ_{m∈Z} (−1)^m hm h_{1−m+2p} = 0,  for any p ∈ Z.    (71)

We have already shown that the above is true for p = 0: the series is a “cancelling sum,” with the
terms pairing off about h0 h1 and h1 h0. Recall that this was the basis of the definition of the gk in
terms of the hk. For a shift of 2p in one of the arguments, it is again a cancelling sum, now about
m = p and m = p + 1:

    Σ_{m∈Z} (−1)^m hm h_{1−m+2p} = · · · + (−1)^p hp h_{p+1} + (−1)^{p+1} h_{p+1} hp + · · · = 0.    (72)

Since −2k + 2l is an even number, (70) is of the form (71), and it follows that

    ⟨φ(x − k), ψ(x − l)⟩ = 0.    (73)
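The cancelling sum (71) is also easy to verify numerically for a finite filter. A sketch, again assuming the standard Daubechies-4 values for the hk:

```python
import numpy as np

# Standard Daubechies-4 coefficients (assumed values of the hk in Eq. (44)).
s3 = np.sqrt(3.0)
h = dict(enumerate(np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))))

# Eq. (71): sum_m (-1)^m h_m h_{1-m+2p} = 0 for every integer p.
for p in range(-3, 4):
    s = sum((-1) ** m * hm * h.get(1 - m + 2 * p, 0.0) for m, hm in h.items())
    print(p, abs(s) < 1e-12)   # True for every p
```

The cancellation is exact even in floating point, because the terms for m and 1 + 2p − m are the same two products with opposite signs.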

3. We now prove that any element y ∈ W0 admits an expansion in the functions ψ(x − k).

The proof is performed in a somewhat roundabout way. We’ll show that the space V1 is spanned
by integer translates of φ together with corresponding translates of ψ. Recall that the functions
φ1k = 2^{1/2} φ(2x − k) span V1. We must show that for each j,

    φ(2x − j) = Σ_k [ ak φ(x − k) + bk ψ(x − k) ],    (74)

for an appropriate set of constants ak and bk. (These constants will also depend on j, but we’ll
omit this index for simplicity of notation.)

From the orthogonality of the φ(x − k), it follows that

    ak = ⟨φ(2x − j), φ(x − k)⟩
       = ⟨ φ(2x − j), Σ_l hl √2 φ(2x − 2k − l) ⟩
       = (1/√2) h_{j−2k}    (j = 2k + l ⇒ l = j − 2k).    (75)

Likewise, we find that

    bk = ⟨φ(2x − j), ψ(x − k)⟩
       = ⟨ φ(2x − j), Σ_l gl √2 φ(2x − 2k − l) ⟩
       = (1/√2) g_{j−2k}
       = (1/√2) (−1)^j h_{1−j+2k}.    (76)

If the equality in (74) holds, in the L2 sense, then the following result must hold:

    ⟨φ(2x − j), φ(2x − j)⟩ = 1/2 = Σ_k ( |ak|^2 + |bk|^2 ).    (77)

From the previous equations,

    Σ_k ( |ak|^2 + |bk|^2 ) = (1/2) Σ_k [ |h_{j−2k}|^2 + |h_{1−j+2k}|^2 ].    (78)

Regarding the right-hand side, if j is odd, then 1 − j is even; if j is even, then 1 − j is odd. As k
runs over Z, the indices j − 2k and 1 − j + 2k therefore together run over all the integers, each
exactly once. As a result,

    (1/2) Σ_k [ |h_{j−2k}|^2 + |h_{1−j+2k}|^2 ] = (1/2) Σ_l |hl|^2.    (79)

But from the scaling equation (40), we have

    ⟨φ(x), φ(x)⟩ = Σ_k |hk|^2 = 1.    (80)

Therefore Eq. (77) is verified and the expansion in (74) holds.

In summary, any element u ∈ V1 admits a unique expansion in terms of the functions φ(2x − j)
which, in turn, admit unique expansions in terms of the φ(x − k) and ψ(x − k) functions. Since
the φ(x − k) span V0 , it follows that the ψ(x − k) span W0 .
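The norm identity (77) can be checked numerically from (75) and (76): the signs (−1)^j disappear upon squaring, so Σ_k (|ak|^2 + |bk|^2) reduces to (1/2) Σ |hl|^2 = 1/2. A sketch with the (assumed standard) Daubechies-4 coefficients:

```python
import numpy as np

# Standard Daubechies-4 coefficients (assumed values of the hk in Eq. (44)).
s3 = np.sqrt(3.0)
h = dict(enumerate(np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))))
root2 = np.sqrt(2.0)

for j in range(-2, 3):
    # a_k = h_{j-2k}/sqrt(2) and b_k = (-1)^j h_{1-j+2k}/sqrt(2), Eqs. (75)-(76);
    # the sign (-1)^j drops out of |b_k|^2.
    total = sum(
        (h.get(j - 2 * k, 0.0) / root2) ** 2 + (h.get(1 - j + 2 * k, 0.0) / root2) ** 2
        for k in range(-10, 11)
    )
    print(j, total)   # ≈ 0.5 for every j, matching Eq. (77)
```

The range of k is finite here because only four of the hk are nonzero; for a longer filter one would simply widen it.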

The above result is now easily extended to the space Wj by replacing x with 2^j x in the above
equations and employing the orthogonality of the φ(2^j x − l) functions. We then have the following
important result:

Theorem: For any j ∈ Z, the set of functions ψjk(x) = 2^{j/2} ψ(2^j x − k) forms an orthonormal
basis of the subspace Wj.

Summary of results:

1. The scaling function φ(x) satisfies the relation

    φ(x) = Σ_{k∈Z} hk √2 φ(2x − k).    (81)

2. By assumption, the set of functions φ0k(x) = φ(x − k), i.e., the set of all integer translates of
φ(x), forms an orthonormal basis of V0. From this result, and the scaling property of MRAs, it
follows that the set of functions φjk = 2^{j/2} φ(2^j x − k) forms an orthonormal basis of Vj.

3. The space W0 ⊂ V1, W0 = V0⊥, is spanned by the function

    ψ(x) = Σ_{k∈Z} gk √2 φ(2x − k),    (82)

where gk = (−1)^k h_{1−k}.

4. In general, Vj+1 = Vj ⊕ Wj, and the set of functions ψjk = 2^{j/2} ψ(2^j x − k) forms an
orthonormal basis of Wj.
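The summary can be illustrated end-to-end in the Haar case, where everything is computable by hand: on a discrete grid, the pair {φ(x), ψ(x)} and the pair {√2 φ(2x), √2 φ(2x − 1)} are two orthonormal bases of the same two-dimensional space, realizing V1 = V0 ⊕ W0. A short NumPy sketch:

```python
import numpy as np

N = 8                                  # samples on [0, 1); dx = 1/N
dx = 1.0 / N
x = np.arange(N) * dx

phi00 = np.ones(N)                     # phi(x)
psi00 = np.where(x < 0.5, 1.0, -1.0)   # psi(x) = phi(2x) - phi(2x - 1)
phi10 = np.sqrt(2) * (x < 0.5)         # 2^{1/2} phi(2x)
phi11 = np.sqrt(2) * (x >= 0.5)        # 2^{1/2} phi(2x - 1)

def ip(f, g):
    """L2 inner product on [0,1), as a Riemann sum (exact here, since all
    four functions are constant on the grid cells)."""
    return np.sum(f * g) * dx

# Both pairs are orthonormal:
print(ip(phi10, phi10), ip(phi10, phi11), ip(phi00, psi00))

# Change of basis between V1 and V0 (+) W0 (item 4 with j = 0):
print(np.allclose(phi10, (phi00 + psi00) / np.sqrt(2)))   # True
print(np.allclose(phi11, (phi00 - psi00) / np.sqrt(2)))   # True
```

The two change-of-basis lines are exactly the Haar scaling and wavelet equations ((81) and (82)) read backwards.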
