[Figure: two panels showing discrete-time signals plotted against the sample index n = 0, 1, ..., 7; panel (b) shows the input signal x(n), whose sample values (1, 1, 1, 1, 0.5, 0.5, 0.5, 0.5) are also tabulated.]
given a particular input signal. The Adaline takes the input and the desired output, and adjusts itself so that it can perform the desired transformation. Furthermore, if the signal characteristics change, the Adaline can adapt automatically.
We shall now expand these ideas, and begin our investigation of the Adaline.
The term Adaline is an acronym; however, its meaning has changed some-
what over the years. Initially called the ADAptive Linear NEuron, it became
the ADAptive LINear Element, when neural networks fell out of favor in the
late 1960s. It is almost identical in structure to the general PE described in
Chapter 1. Figure 2.8 shows the Adaline structure. There are two basic mod-
ifications required to make the general PE structure into an Adaline. The first
modification is the addition of a connection with a weight, $w_0$, which we refer
to as the bias term. This term is a weight on a connection that has its input
value always equal to 1. The inclusion of such a term is largely a matter of
experience. We show it here for completeness, but it will not appear in the
discussion of the next sections. We shall resurrect the idea of a bias term in
Chapter 3, on the backpropagation network.
The second modification is the addition of a bipolar condition on the output.
The dashed box in Figure 2.8 encloses a part of the Adaline called the adaptive
linear combiner (ALC). If the output of the ALC is positive, the Adaline output
is +1. If the ALC output is negative, the Adaline output is -1. Because much
of the interesting processing takes place in the ALC portion of the Adaline,
we shall concentrate on the ALC. Later, we shall add back the binary output
condition.
The processing done by the ALC is that of the typical processing element
described in the previous chapter.
Figure 2.8 The complete Adaline consists of the adaptive linear combiner,
in the dashed box, and a bipolar output function. The
adaptive linear combiner resembles the general PE described
in Chapter 1.
The ALC performs a sum-of-products calculation using the input and weight vectors, and applies an output function to get a single output value. Using the notation in Figure 2.8,

$$y = \sum_{j=0}^{n} w_j x_j = \mathbf{w}^t \mathbf{x}$$

where $x_0 = 1$ is the input on the bias connection.
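As a concrete illustration, here is a minimal sketch of this computation in Python with NumPy; the function names are ours, not the book's. It computes the ALC sum-of-products output and then applies the bipolar output condition described above.

```python
import numpy as np

def alc_output(w: np.ndarray, x: np.ndarray) -> float:
    """Adaptive linear combiner: the sum-of-products y = w^t x."""
    return float(w @ x)

def adaline_output(w: np.ndarray, x: np.ndarray) -> int:
    """Complete Adaline: +1 for positive ALC output, -1 for negative.

    The text does not specify the output when the ALC output is exactly
    zero; mapping it to -1 here is an arbitrary choice.
    """
    return 1 if alc_output(w, x) > 0 else -1

# Example with a bias connection: x_0 = 1, followed by two signal inputs.
w = np.array([0.5, -1.0, 2.0])   # w_0 is the bias weight
x = np.array([1.0, 0.8, 0.3])
print(alc_output(w, x), adaline_output(w, x))   # 0.3, +1
```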
Calculation of $\mathbf{w}^*$. To begin, let's state the problem a little differently: Given examples, $(\mathbf{x}_1, d_1), (\mathbf{x}_2, d_2), \ldots, (\mathbf{x}_L, d_L)$, of some processing function that associates input vectors, $\mathbf{x}_k$, with (or maps to) the desired output values, $d_k$, what is the best weight vector, $\mathbf{w}^*$, for an ALC that performs this mapping?
To answer this question, we must first define what it is that constitutes the
best weight vector. Clearly, once the best weight vector is found, we would
like the application of each input vector to result in the precise, corresponding
output value. Thus, we want to eliminate, or at least to minimize, the difference
between the desired output and the actual output for each input vector. The
approach we select here is to minimize the mean squared error for the set of
input vectors.
If the actual output value is $y_k$ for the $k$th input vector, $\mathbf{x}_k$, then the corresponding error term is $\varepsilon_k = d_k - y_k$. The mean squared error, or expectation value of the error, is defined by

$$\langle \varepsilon_k^2 \rangle = \frac{1}{L} \sum_{k=1}^{L} \varepsilon_k^2 \tag{2.2}$$

where $L$ is the number of input vectors in the training set. Substituting $y_k = \mathbf{w}^t \mathbf{x}_k$ and expanding the square gives

$$\langle \varepsilon_k^2 \rangle = \langle (d_k - \mathbf{w}^t \mathbf{x}_k)^2 \rangle \tag{2.3}$$

$$= \langle d_k^2 \rangle + \mathbf{w}^t \langle \mathbf{x}_k \mathbf{x}_k^t \rangle \mathbf{w} - 2 \langle d_k \mathbf{x}_k^t \rangle \mathbf{w} \tag{2.4}$$
In going from Eq. (2.3) to Eq. (2.4), we have made the assumption that the
training set is statistically stationary, meaning that any expectation values vary
slowly with respect to time. This assumption allows us to factor out the weight
vectors from the expectation value terms in Eq. (2.4).
Exercise 2.1: Give the details of the derivation that leads from Eq. (2.3) to Eq. (2.4), along with the justification for each step. Why are the factors $d_k$ and $\mathbf{x}_k^t$ left together in the last term in Eq. (2.4), rather than shown as the product of the two separate expectation values?
Defining the input correlation matrix, $\mathbf{R} \equiv \langle \mathbf{x}_k \mathbf{x}_k^t \rangle$, and the vector $\mathbf{p} \equiv \langle d_k \mathbf{x}_k \rangle$, we can write the mean squared error as

$$\xi = \langle \varepsilon_k^2 \rangle = \langle d_k^2 \rangle + \mathbf{w}^t \mathbf{R} \mathbf{w} - 2 \mathbf{p}^t \mathbf{w} \tag{2.5}$$

Widrow and Stearns use the notation $E[\cdot]$ for the expectation value; also, the term exemplars will sometimes be seen as a synonym for training set.
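To make these expectation values concrete, here is a short sketch in Python with NumPy; the training data and weights are made up for illustration and do not come from the text. It computes the mean squared error directly from the definition in Eq. (2.2), and again from the quadratic form of Eq. (2.5), and checks that the two agree.

```python
import numpy as np

# A toy training set: L input vectors x_k (one per row) with desired
# outputs d_k. These numbers are illustrative, not from the text.
X = np.array([[1.0, 0.5],
              [0.5, 1.0],
              [1.0, 1.0],
              [0.0, 0.5]])
d = np.array([1.0, -1.0, 1.0, -1.0])
w = np.array([0.3, -0.2])            # an arbitrary weight vector

# Eq. (2.2): mean squared error computed directly from the errors.
errors = d - X @ w
xi_direct = np.mean(errors ** 2)

# Eq. (2.5): the same quantity from <d_k^2>, R = <x_k x_k^t>, p = <d_k x_k>.
R = (X.T @ X) / len(X)
p = (d @ X) / len(X)
xi_quadratic = np.mean(d ** 2) + w @ R @ w - 2 * p @ w

assert np.isclose(xi_direct, xi_quadratic)
print(xi_direct)
```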
Differentiating Eq. (2.5) with respect to the weight vector gives the gradient of the mean squared error,

$$\nabla \xi = 2 \mathbf{R} \mathbf{w} - 2 \mathbf{p} \tag{2.6}$$

To find the optimum weight vector, $\mathbf{w}^*$, we set this gradient to zero:

$$2 \mathbf{R} \mathbf{w}^* - 2 \mathbf{p} = \mathbf{0}$$

$$\mathbf{R} \mathbf{w}^* = \mathbf{p} \tag{2.7}$$

$$\mathbf{w}^* = \mathbf{R}^{-1} \mathbf{p} \tag{2.8}$$

Substituting $\mathbf{w}^*$ back into Eq. (2.5) gives the minimum mean squared error,

$$\xi_{\min} = \langle d_k^2 \rangle - \mathbf{p}^t \mathbf{w}^* \tag{2.9}$$
All that we have done by this procedure is to show that we can find a point where the slope of the function, $\xi(\mathbf{w})$, is zero. In general, that point may be a minimum or a maximum point. In the example that follows, we show a simple case where the ALC has only two weights. In that situation, the graph of $\xi$ is a paraboloid. Furthermore, it must be concave upward, since all combinations of weights must result in a nonnegative value for the mean squared error, $\xi$. This result is general and is obtained regardless of the dimension of the weight vector. In the case of dimensions higher than two, the paraboloid is known as a hyperparaboloid.
Suppose we have an ALC with two inputs, with the input correlation matrix and the mean squared desired output given by

$$\mathbf{R} = \begin{pmatrix} 3 & 1 \\ 1 & 4 \end{pmatrix}, \qquad \langle d_k^2 \rangle = 10$$

Rather than inverting $\mathbf{R}$, we use Eq. (2.7) to find the optimum weight vector; that is, we solve the linear system

$$\begin{pmatrix} 3 & 1 \\ 1 & 4 \end{pmatrix} \mathbf{w}^* = \mathbf{p}$$

for $\mathbf{w}^*$.
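In code, one can find $\mathbf{w}^*$ by solving the linear system of Eq. (2.7) rather than by inverting $\mathbf{R}$. The sketch below is illustrative only: it reuses $\mathbf{R}$ and $\langle d_k^2 \rangle$ from the example, but assumes a hypothetical value for $\mathbf{p}$, which is not reproduced above.

```python
import numpy as np

# Input correlation matrix R from the example above.
R = np.array([[3.0, 1.0],
              [1.0, 4.0]])
# Hypothetical cross-correlation vector p = <d_k x_k>; the example's
# actual value is not shown here, so this number is illustrative.
p = np.array([4.0, 5.0])
d_sq = 10.0                       # <d_k^2>, as given in the example

# Solve R w* = p (Eq. 2.7) instead of forming R^{-1} explicitly.
w_star = np.linalg.solve(R, p)

# Minimum mean squared error, xi_min = <d_k^2> - p^t w* (Eq. 2.9).
xi_min = d_sq - p @ w_star
print(w_star, xi_min)
```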
Figure 2.9 For an ALC with only two weights, the error surface is a
paraboloid. The weights that minimize the error occur at the
bottom of the paraboloidal surface.
In the next section, we shall examine a method for finding the optimum
weight vector by an iterative procedure. This procedure allows us to avoid the
often-difficult calculations necessary to determine the weights manually.
Begin by assigning arbitrary values to the weights. From that point on the
weight surface, determine the direction of the steepest slope in the downward
direction. Change the weights slightly so that the new weight vector lies farther
down the surface. Repeat the process until the minimum has been reached. This
procedure is illustrated in Figure 2.10. Implicit in this method is the assumption
that we know what the weight surface looks like in advance. We do not know,
but we will see shortly how to get around this problem.
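If we did know the weight surface, the procedure just described would amount to repeatedly stepping against the gradient of Eq. (2.6). Here is a minimal sketch of such an idealized descent, again using the hypothetical $\mathbf{R}$ and $\mathbf{p}$ of the earlier sketch:

```python
import numpy as np

R = np.array([[3.0, 1.0], [1.0, 4.0]])
p = np.array([4.0, 5.0])     # hypothetical, as in the sketch above
mu = 0.1                     # step-size parameter

w = np.array([5.0, -3.0])    # arbitrary starting weights
for _ in range(100):
    grad = 2 * R @ w - 2 * p   # Eq. (2.6): gradient of xi at w
    w = w - mu * grad          # step down the error surface

print(w)                       # approaches w* = R^{-1} p
```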
Typically, the weight vector does not initially move directly toward the
minimum point. The cross-section of the paraboloidal weight surface is usually
elliptical, so the negative gradient may not point directly at the minimum point,
at least initially. The situation is illustrated more clearly in the contour plot of
the weight surface in Figure 2.11.
This iterative weight-update procedure, known as the LMS algorithm, can be summarized as follows:

1. Apply an input vector, $\mathbf{x}_k$, to the Adaline inputs.

2. Determine the squared error, $\varepsilon_k^2(t)$, using the current value of the weight vector:

$$\varepsilon_k^2(t) = (d_k - \mathbf{w}^t(t) \mathbf{x}_k)^2 \tag{2.13}$$

3. Calculate an approximation to $\nabla \xi(t)$, by using $\varepsilon_k^2(t)$ as an approximation to $\langle \varepsilon_k^2 \rangle$:

$$\nabla \varepsilon_k^2(t) = -2 \varepsilon_k(t) \mathbf{x}_k \tag{2.14}$$

4. Update the weight vector according to

$$\mathbf{w}(t+1) = \mathbf{w}(t) + 2 \mu \varepsilon_k(t) \mathbf{x}_k \tag{2.15}$$

where $\mu$ is a parameter that determines the stability and speed of convergence.
5. Repeat steps 1 through 4 with the next input vector, until the error has been
reduced to an acceptable value.
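Putting steps 1 through 5 together, here is a compact sketch of the whole loop in Python with NumPy; the training data, the value of $\mu$, and the function name are illustrative choices rather than anything specified in the text:

```python
import numpy as np

def lms_train(X, d, mu=0.05, epochs=50):
    """Train ALC weights with the LMS rule of Eqs. (2.13)-(2.15).

    X holds one input vector x_k per row; d holds the desired outputs d_k.
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_k, d_k in zip(X, d):
            eps = d_k - w @ x_k          # error under current weights (2.13)
            w = w + 2 * mu * eps * x_k   # weight update (2.15)
    return w

# Example: learn weights that reproduce y = 2*x1 - x2.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
d = 2 * X[:, 0] - X[:, 1]
print(lms_train(X, d))   # approaches [2, -1]
```

Because each step uses a single example's error as a stand-in for the expectation value, the individual gradient estimates are noisy, which is one reason the weight vector tends to wander rather than move straight down the error surface.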
Figure 2.13 This figure shows a standard diagram of the ALC with multiple
inputs and a bias term. Weights are indicated as variable
resistors to emphasize the adaptive nature of the device.
Calculation of the error, $\varepsilon_k$, is shown explicitly as the addition
of a negative of the output signal to the desired output value.