
Instructor’s Manual

An Introduction to Fuzzy Sets

Analysis and Design

Witold Pedrycz and Fernando Gomide

1998
INTRODUCTION

We have used the book in many courses for more than eight years. The material has evolved
steadily, reflecting the main tendencies in the area, yet the core has been retained without any
radical changes.

The material included in this Manual consists of answers to most of the problems and samples of
final examinations prepared for this course. The problems in the book are of a very diversified
character. Some of them are plain computing exercises. As such, their primary intent is to provide
the students with the necessary drills, allow them to gain an appreciation of the numeric fabric of
the problem, and let them analyze the results obtained. On the other hand, a number of selected
problems exhibit a strong design flavor; as such they are open-ended tasks with many
possible paths to follow. These are especially challenging and rewarding. Obviously, they do not
have a single simple solution. In all these cases we have sketched a path to consider without
getting into details and without biasing the reader in any particular direction. The choice of
problems is up to the professor offering the course. We strongly encourage (as we have found very
useful) a balanced mix of problems, namely those that stretch across the overall spectrum of
the available material and exhibit different levels of difficulty.

Additional teaching material will soon be available on the Web. Instructors are kindly requested
to contact the authors to get the book homepage address. The homepage is intended to provide
a repository of working code, transparencies, demo programs, application notes, and much more.

If you have any comments on the book, any typos you have found, or any suggestions on how it
can be improved, we would be grateful to hear from you. Please, send a message to any of the
addresses below or write to us in care of MIT Press.

We are sincerely grateful to MIT Press for distributing the Instructor’s Manual.

Witold Pedrycz and Fernando Gomide


pedrycz@ee.ualberta.ca
gomide@dca.fee.unicamp.br
CHAPTER 1

Basic Notions and Concepts of Fuzzy Sets

1. Clearly, E is the empty set, that is, E = ∅. Thus E(x) = 0 for all x in the universe X.

2. Many solutions are possible, depending on the individual perception and the context one is
considering. For instance, we might have:

[Figure: two possible membership functions A20(t) for temperature "about 20 °C", panels (a) and (b), each with unit height at 20 °C and support within [18 °C, 22 °C].]

3. The idea here is to stimulate discussion about how opposing concepts may coexist, becoming a
source of imprecision in actual situations. We could derive fuzzy sets to express the concepts
close to and far from by taking an extreme point as a reference and the line segment (viewed as
an interval of the real line) as the underlying universe of discourse, as shown below. They could
be viewed as two points in the "unit cube" or "square". It would be natural to think of the middle
point as being equally compatible with both concepts, with degree 0.5. Geometrically, the
intersection of close to and far from is a subnormal fuzzy set, again a point in the "unit cube",
with height 0.5. The reader may guess that it would be difficult to imagine
a similar geometrical picture when thinking in terms of ordinary sets! But note the underlying
continuous nature of the universe in the discussion above. Obviously, in this case it is difficult to
talk about a "unit cube". That picture makes sense only in finite, discrete universes.
[Figure: over the segment [a, b], close to decreases from 1 at a to 0 at b, far from increases from 0 at a to 1 at b, and their intersection close to and far from attains its maximum 0.5 at the midpoint m.]
4. (a) See the figure below. (b) The type is triangular. (c) A suitable linguistic label is something like around 6.

[Figure: a triangular membership function A(x) with modal value 6 and support [5, 7].]

5. (a) See the figure below. Hgt(A) = 0.5, so no, A is not normal; it is subnormal.

[Figure: A(x) is subnormal with height 0.5 over the support [5, 11] and modal value 8; Norm_A(x) is its normalization to height 1, and Con_Norm_A(x) is the concentration of the normalized set.]
(b) See the figure above for concentration with p = 2. Proceed similarly for the rest. We can only
say that Con_A ⊂ A and that Dil_A ⊃ A.
6. It is easy to verify that S(A,B) = Card(A∩B)/Card(A). Therefore, in general, S(A,B) ≠ S(B,A);
clearly they are equal only if Card(A) = Card(B). Note that

S(A,B) = (1/Card(A)) [ Σ_{x: A(x)≥B(x)} ( A(x) − [A(x) − B(x)] ) + Σ_{x: A(x)<B(x)} A(x) ]
       = (1/Card(A)) [ Σ_{x: A(x)≥B(x)} B(x) + Σ_{x: A(x)<B(x)} A(x) ]
       = (1/Card(A)) Σ_{x∈X} min[A(x), B(x)]
       = Card(A∩B)/Card(A).

7. A0.1 = {1,2,3,4}; A0.5 = {1,2,3}; A0.8 = {1,2}; A1 = {1}. Clearly, we have


A = 0.1A0.1 ∪ 0.5 A0.5 ∪ 0.8 A0.8 ∪ 1.0 A1 .
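A quick computational check of this decomposition; the discrete fuzzy set A = 1/1 + 0.8/2 + 0.5/3 + 0.1/4 is the one from the problem, and the sketch below is minimal Python:

```python
# Fuzzy set from the problem: A = 1/1 + 0.8/2 + 0.5/3 + 0.1/4.
A = {1: 1.0, 2: 0.8, 3: 0.5, 4: 0.1}

def alpha_cut(A, alpha):
    """Return the (crisp) alpha-cut of a discrete fuzzy set."""
    return {x for x, mu in A.items() if mu >= alpha}

# Only the distinct membership grades matter as alpha levels.
levels = sorted(set(A.values()))

def reconstruct(A):
    """Representation theorem: A(x) = max over alpha of min(alpha, A_alpha(x))."""
    return {x: max(min(a, 1.0 if x in alpha_cut(A, a) else 0.0) for a in levels)
            for x in A}

print(alpha_cut(A, 0.5))    # {1, 2, 3}
print(reconstruct(A) == A)  # True
```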

8. By definition we have B = 0.2/2 + 0.1/5.

[Figure: the mapping f: X → Y carrying the fuzzy set A defined over X = {1, 2, 3, 4} into its image B on Y; the membership grades of B are read off on the Y axis.]

9. Solve the differential equation!

10. Difficult to depict! We can only think of it abstractly. See the comments in the last paragraph of
Section 1.9.2, page 28.
CHAPTER 2

Fuzzy Set Operations

1. See the figure below for an example. It would be interesting to solve this problem using a
computer program with graphical output. This would help the reader gain skills and insights
into the representation of fuzzy sets in a programming language, and into the operations on them.

[Figure: fuzzy sets A and B over X = [1, 5], together with A∪B (pointwise maximum) and A∩B (pointwise minimum).]
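The computer exercise suggested above can be sketched as follows; the two triangular fuzzy sets are assumed shapes chosen only for illustration:

```python
# Union (pointwise max) and intersection (pointwise min) of two fuzzy sets
# sampled on X = [1, 5]. The shapes of A and B are illustrative assumptions.
def tri(a, m, b):
    """Triangular membership function T(x; a, m, b)."""
    def mu(x):
        if a < x <= m:
            return (x - a) / (m - a)
        if m < x < b:
            return (b - x) / (b - m)
        return 1.0 if x == m else 0.0
    return mu

A = tri(1.0, 2.0, 4.0)
B = tri(2.0, 4.0, 5.0)

xs = [1 + k / 10 for k in range(41)]              # sampled universe X = [1, 5]
union        = [max(A(x), B(x)) for x in xs]
intersection = [min(A(x), B(x)) for x in xs]

print(max(union))         # 1.0  (both A and B are normal)
print(max(intersection))  # 0.5  (height of A ∩ B, attained at x = 3)
```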

2. Recall that the drastic product and the drastic sum are t11 and s11, respectively:

x t11 y = x if y = 1; y if x = 1; 0 otherwise
x s11 y = x if y = 0; y if x = 0; 1 otherwise

Clearly, x t11 (1−x) = 0 and x s11 (1−x) = 1. Thus A ∩ Ā = ∅ and A ∪ Ā = X, where Ā denotes the complement of A.
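A small numeric check of these excluded-middle laws under the drastic connectives (the tested grades are arbitrary):

```python
# Under the drastic t- and s-norms, A ∩ Ā = ∅ and A ∪ Ā = X hold pointwise,
# which fails under min/max for fuzzy sets.
def t_drastic(x, y):
    if y == 1.0:
        return x
    if x == 1.0:
        return y
    return 0.0

def s_drastic(x, y):
    if y == 0.0:
        return x
    if x == 0.0:
        return y
    return 1.0

grades = [0.0, 0.25, 0.5, 0.9, 1.0]
meets = [t_drastic(mu, 1.0 - mu) for mu in grades]   # A ∩ Ā, pointwise
joins = [s_drastic(mu, 1.0 - mu) for mu in grades]   # A ∪ Ā, pointwise

print(meets)  # [0.0, 0.0, 0.0, 0.0, 0.0]
print(joins)  # [1.0, 1.0, 1.0, 1.0, 1.0]
```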

3. (a) Note that

(1 − max[x, y]) / (1 + λ·max[x, y]) = min[ (1 − x)/(1 + λx), (1 − y)/(1 + λy) ] = min(N(x), N(y)),

where N(u) = (1 − u)/(1 + λu) denotes the λ-complement, which is decreasing in u. For (b) proceed
analogously.
4. First note that if ρ(x,y,…,z) is continuous and increasing, then S_sum(x,y,…,z) is also continuous
and monotonically increasing. Commutativity requires that ρ(x,y,…,z) be symmetric under interchange
of its arguments. It remains to show that S_sum is self-dual and satisfies the boundary conditions.
That is:

(a) 1 − S_sum(1−x, 1−y,…,1−z) = 1 − ρ(1−x, 1−y,…,1−z)/[ρ(x,y,…,z) + ρ(1−x, 1−y,…,1−z)]
  = ρ(x,y,…,z)/[ρ(x,y,…,z) + ρ(1−x, 1−y,…,1−z)] = S_sum(x, y, …, z).

(b) S_sum(0,…,0) = ρ(0,…,0)/[ρ(0,…,0) + ρ(1,…,1)] = 0

S_sum(1,…,1) = ρ(1,…,1)/[ρ(1,…,1) + ρ(0,…,0)] = 1.

Symmetric sums are discussed in detail by W. Silvert, "Symmetric summation: a class of
operations on fuzzy sets," IEEE Trans. on Systems, Man, and Cybernetics, 9, 657-659, 1979, and by
D. Dubois and H. Prade, "A review of fuzzy set aggregation connectives," Information Sciences, 36,
85-121, 1985.

5. Let a = min(x1, x2, .., xn) and b = max(x1, x2, .., xn). A is an aggregation operator and therefore it
is monotonic increasing. Assuming A idempotent, we have:
a = A(a, a,….,a) ≤ A(x1, x2, .., xn) ≤ A(b, b, …, b) = b
Conversely, if we assume min(x1, x2, .., xn) ≤ A(x1, x2, .., xn) ≤ max(x1, x2, .., xn), then
a = min(a, a,….,a) ≤ A(a, a,….,a) ≤ max(a, a,….,a) = a.
Thus, all aggregation operations lying between min and max are idempotent; conversely, the
functions A that satisfy the inequalities above are the only aggregation operations that are
idempotent. Examples include the averaging operations.

6. The k-th largest element of {x1, x2, …., xn}.

7. Hamming: d(A,B) = 0.1 + 0.2 + 0.4 + 0.8 + 0.1 + 0.8 + 0.4 + 0.2 = 3
Euclidean: d(A,B) = √1.70 = 1.3038
Tchebyschev: d(A,B) = 0.8
Poss(A,B) = max [0, 0, 0.1, 0.2, 0.4, 0.2, 0, 0, 0, 0] = 0.4
Nec(A,B) = min [1, 1, 0.9, 0.8, 0.5, 0.2, 0.6, 0.8, 1, 1] = 0.2
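The three distances can be cross-checked from the absolute differences |A(x) − B(x)| appearing in the Hamming sum (the underlying fuzzy sets A and B come from the book's problem statement and are not reproduced here):

```python
# Absolute differences |A(x) - B(x)|, as listed in the Hamming sum above.
diffs = [0.1, 0.2, 0.4, 0.8, 0.1, 0.8, 0.4, 0.2]

hamming = sum(diffs)                              # sum of differences
euclidean = sum(d * d for d in diffs) ** 0.5      # sqrt of sum of squares
tchebyschev = max(diffs)                          # largest difference

print(round(hamming, 4))     # 3.0
print(round(euclidean, 4))   # 1.3038
print(tchebyschev)           # 0.8
```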
9. C(A,B) = ½[Poss(A,B) + 1 − Poss(Ā,B)]
C(Ā,B) = ½[Poss(Ā,B) + 1 − Poss(A,B)]
Adding the two expressions, the possibility terms cancel and we get C(A,B) + C(Ā,B) = 1.

10. A ⊆ B ⇒ Π(A) ≤ Π(B).

11. Comp(X, A)(u) = 1 for all u ∈ [0, 1]. See the illustration below.

[Figure: the compatibility Comp(X, A) of A with the whole universe X is identically equal to 1 over [0, 1].]
CHAPTER 3

Information-Based Characterization of Fuzzy Sets

1. Assume decoding based on modal values. Thus we have:

F(x) = [ Σ_{i=1}^{n} A_i(x) a_i ] / [ Σ_{i=1}^{n} A_i(x) ].

We may formulate the optimization problem as min_a Q(a), where a = [a1, …, an] and

Q = Σ_{k=1}^{N} [ ( Σ_{i=1}^{n} A_i(x_k) a_i ) / ( Σ_{i=1}^{n} A_i(x_k) ) − x_k ]².

Computing the derivative of Q with respect to a_i we get the i-th component of the gradient vector:

∂Q/∂a_i = 2 Σ_{k=1}^{N} [ ( Σ_{j=1}^{n} A_j(x_k) a_j ) / ( Σ_{j=1}^{n} A_j(x_k) ) − x_k ] · A_i(x_k) / ( Σ_{j=1}^{n} A_j(x_k) )

and we may proceed using any appropriate gradient-based optimization procedure to determine
the a_i's. Alternatively, if we use membership functions, e.g., as in the CoG method, we have,
assuming a discrete universe:

x̂ = [ Σ_{m=1}^{M} A(x_m) x_m ] / [ Σ_{m=1}^{M} A(x_m) ],

where M is the number of discrete intervals, A = ∪_i (A_i ∧ λ_i), and λ_i = Poss(A_i, X). Assuming
further a parameterized membership function, e.g., A_i(x; a, b, c), we note that

A(x) = max_i { min[ A_i(x; a, b, c), max_u ( min( A_i(u; a, b, c), X(u) ) ) ] }.

Therefore, the expression of Q and its gradients with respect to the parameters of the membership
functions becomes more complex. This case is much less transparent than the previous one.
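A hedged sketch of the gradient procedure for the modal values; the input fuzzy sets, the data points, and the initial guess below are all assumptions made only to exercise the loop:

```python
# Plain gradient descent on the modal values a_i, following the gradient
# formula above. Triangular input sets and data are illustrative assumptions.
def tri(a, m, b):
    def mu(x):
        if a < x <= m:
            return (x - a) / (m - a)
        if m < x < b:
            return (b - x) / (b - m)
        return 1.0 if x == m else 0.0
    return mu

A = [tri(-1.0, 0.0, 1.0), tri(0.0, 1.0, 2.0), tri(1.0, 2.0, 3.0)]
data = [0.2, 0.5, 1.0, 1.3, 1.8]     # training points x_k (assumed)
a = [0.5, 0.8, 2.4]                  # modal values, deliberately off

def den(x):
    return sum(Ai(x) for Ai in A)

def F(x, a):
    return sum(Ai(x) * ai for Ai, ai in zip(A, a)) / den(x)

def Q(a):
    return sum((F(x, a) - x) ** 2 for x in data)

eta = 0.05
q0 = Q(a)
for _ in range(200):
    grad = [2 * sum((F(x, a) - x) * Ai(x) / den(x) for x in data) for Ai in A]
    a = [ai - eta * g for ai, g in zip(a, grad)]

print(Q(a) < q0)   # True: the reconstruction error decreases
```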

2. In case (a), we can compute the upper and lower bounds of the decoding, as suggested in
Figure 3.9(a). This is illustrated below for Ai; the idea is the same for the remaining Aj's. The
overall bounds may be found by intersecting the individual bounds.

[Figure: encoding of X over [−5, 5] yields Poss(Ai, X), Nec(Ai, X) and Poss(Aj, X), Nec(Aj, X); decoding with respect to Ai produces the lower bound X′i and the upper bound X′′i.]
For the case (b) note that, as shown below, we may get Poss(Ak,X) = 1 and Nec(Ak,X) = 0 for all
k.

[Figure: a set X positioned between Ai and Aj over [−5, 5], for which Poss(Ak, X) = 1 and Nec(Ak, X) = 0; decoding then returns an X′ equal to the whole space.]

Case (c) is similar to case (b).

3. Choose appropriate functions to characterize entropy and energy (e.g., piecewise linear and
linear, respectively; see pages 62 and 64) and use equations (3.3) and (3.4). This problem is an
interesting computational exercise for those interested in image analysis.

4. First, note that H(p1,…,pn) = −Σ_{i=1}^{n} w_i p_i log2 p_i with w_i = −p_i/log2 p_i reduces to
H = Σ_{i=1}^{n} p_i². Recalling that Σ_{i=1}^{n} p_i = 1, we find the extremum of H(·) as follows
(p = [p1,…,pn]):

L(p, λ) = Σ_{i=1}^{n} p_i² + λ[1 − Σ_{i=1}^{n} p_i].

Computing the derivative of L(·,·) with respect to the p_i's we get

∂L/∂p_i = 2p_i − λ = 0, and thus p_i = λ/2, i = 1,…,n; the constraint then gives p_i = 1/n.

In the general case we have, at the extremum, w_i(log2 e + log2 p_i) = w_j(log2 e + log2 p_j).

5. One may formulate the following optimization problem for this purpose:

min_p Q(p) = Σ_{i=1}^{n} [ b_i* − ( ∫ A_i(x; p) x dx ) / ( ∫ A_i(x; p) dx ) ]².

Note that the parameters p should be constrained to lie within the underlying universes of
discourse.


CHAPTER 4

Fuzzy Relations and Their Calculus

1. We may find the similarities between patterns by computing a composition of the given
matrix with its transpose. Assuming max-min composition we get the relation S in matrix
form:

S = [ 0.6 0.1 0.6 0.3 0.3
      0.1 0.8 0.1 0.1 0.4
      0.6 0.1 0.9 0.1 0.3
      0.3 0.1 0.1 1.0 0.3
      0.3 0.4 0.3 0.3 0.4 ]

Clearly S is not reflexive, is symmetric, and is not max-min transitive.

2. We perform the max-min composition. For any t-norm, we may operate on the
relations in matrix form just as in standard matrix multiplication, but replacing the
algebraic product with the chosen t-norm and the summation with the max operation.

R = [ 0.5 1.0 0.7 0.9
      0.4 1.0 0.2 0.1
      0.6 0.9 1.0 0.4 ]

G = [ 0.9 0.3 0.1 0.7 0.6 1.0
      0.1 0.1 0.9 1.0 1.0 0.4
      0.0 0.3 0.6 0.9 1.0 0.0
      1.0 0.0 0.0 0.0 1.0 1.0 ]

R ∘ G = [ 0.9 0.3 0.9 1.0 1.0 0.9
          0.4 0.3 0.9 1.0 1.0 0.4
          0.6 0.3 0.9 0.9 1.0 0.6 ]

3. sup_z [G(x,z) t W(z,y)] = sup_z {1 − [1 − G(x,z)] s [1 − W(z,y)]} = 1 − inf_z {[1 − G(x,z)] s [1 − W(z,y)]}.

Similarly,

inf_z [G(x,z) s W(z,y)] = inf_z {1 − [1 − G(x,z)] t [1 − W(z,y)]} = 1 − sup_z {[1 − G(x,z)] t [1 − W(z,y)]}.

4. We will solve for the max-min composition; one proceeds similarly for other t- and s-norms.
The individual relations are

R1(x, y) = [ 0.5 0.7 0.3 0.1
             1.0 1.0 0.3 0.1
             1.0 1.0 0.3 0.1
             0.5 0.7 0.3 0.1
             1.0 1.0 1.0 1.0 ]

R2(x, y) = [ 1.0 1.0 1.0 0.0
             1.0 0.3 0.1 0.0
             1.0 0.3 0.1 0.0
             1.0 1.0 0.1 0.0
             1.0 1.0 1.0 1.0 ]

R3(x, y) = [ 0.5 0.7 0.3 0.1
             1.0 1.0 0.3 0.1
             1.0 1.0 0.3 0.1
             0.5 0.7 0.3 0.1
             1.0 1.0 1.0 1.0 ]

Thus R = ∩ Rk, k = 1, 2, 3, that is,

R(x, y) = [ 0.5 0.7 0.3 0.0
            1.0 0.3 0.1 0.0
            1.0 0.3 0.1 0.0
            0.5 0.7 0.1 0.0
            1.0 1.0 1.0 1.0 ]

5. The image is a two-dimensional fuzzy relation defined over two finite spaces (universes). The
lighter the pixel (defined by its x-y coordinates), the higher the corresponding membership value of
this relation. Similarly, lower membership values correspond to darker pixels of the image.

(a) By applying the operations of contrast intensification and dilation, we affect the image by
either narrowing the range of gray-scale values (intensification) or widening this range (dilation).
By performing these operations in an iterative fashion, we make this effect more profound and
visible.

(b) This is a design problem which may involve some optimization mechanisms. In a simple
scenario, we can envision the input fuzzy sets to be specified as sets (intervals); see the figure below.
They play the role of "filters" compressing the original image. The broader the filter, the higher the
compression rate, but the lower the quality of the reproduced image (fuzzy relation). In general, if we
use the max-min composition, we end up with a reconstructed fuzzy relation that subsumes
(includes) the original image.

[Figure: sampling strips (sets) acting as interval filters over the image.]

6. This problem requires the use of the fuzzy set-fuzzy relation composition with x specified as
an empty set or the entire space.

7. The discussed system is governed by the expression D = [(A × B) × (C • G)] • R. It involves
two unknown fuzzy relations (R and G). Because of that, we have to confine ourselves to
numeric optimization techniques such as gradient-based methods. The ensuing problem reads as:

min ‖D − [(A × B) × (C • G)] • R‖

with respect to R and G for A, B, and C given, where ‖·‖ stands for a distance function
(Euclidean, Hamming, etc.).

8. The problem induces the following expression: g_ij = ∨_{k=1..n, l=1..m} [A_i(x_k) t R(x_k, y_l)],
where G = [g_ij]. The reconstruction problem requires a determination of R for A, B, and G
provided. For the given t-norm we obtain the solution of the form

R(x_k, y_l) = ∧_{i=1..c, j=1..p} [A_i(x_k) t B_j(y_l) φ g_ij].

9. The relationship country-currency establishes a Boolean relation: each country (officially) uses
its own currency. The remaining relations are clearly fuzzy.
CHAPTER 5

Fuzzy Numbers

1. The objective is to express the Cartesian two-dimensional coordinates (x, y) of the fuzzy point
whose original location is defined in polar coordinates; refer to the figure below.

A fuzzy point defined in the polar coordinates

We complete this transformation with the use of the extension principle. We obtain two expressions:

- for the x-coordinate

X(x) = sup_{r,φ: x = r·cos φ} [R(r) ∧ Φ(φ)] = sup_φ [R(x/cos φ) ∧ Φ(φ)]

- for the y-coordinate

Y(y) = sup_{r,φ: y = r·sin φ} [R(r) ∧ Φ(φ)] = sup_φ [R(y/sin φ) ∧ Φ(φ)]

(∧ = min).
Let us consider the x-coordinate (the calculations for the second coordinate proceed in an
analogous way). In light of the existing membership function of the angle variable, we can
confine ourselves to a certain limited range of the argument, say [0, π/2]. We derive

X(x) = sup_{φ∈[0,π/2]} [R(x/cos φ) ∧ Φ(φ)]
The equivalent optimization problem implies a nonlinear equation to be solved with respect to
the angle variable for the value of x being fixed

R(x/cos φ) = Φ(φ)    (1)

that is

exp[−(x/cos φ − 1)²] = exp[−(φ − π/3)²]

We look for the solutions to the above nonlinear equation that fall under the already defined
range of the values of the angle. Considering the arguments of exponential functions we get

x/cos φ − 1 = ±(φ − π/3)

The solutions can be produced numerically. In the case of multiple solutions φ1, …, φc, we
pick the one for which Φ(φ1), Φ(φ2), …, Φ(φc) attains its maximum. As the solutions depend on
the value of the second variable (x), regarded here as a parameter, we construct the membership
function of X by solving (1) for selected values of x and plotting the resulting relationship.

2. The discussed cases embrace the following situations:


- fuzzy velocity, exact time. The resulting distance is represented as a fuzzy number
with a triangular membership function, as illustrated below:

D = T(d; 10·2, 20·2, 40·2) = T(d; 20, 40, 80)

[Figure: the distance represented as the triangular fuzzy number T(d; 20, 40, 80).]

- fuzzy velocity and fuzzy time. Here the problem is more complicated as the resulting fuzzy
number is no longer characterized by a triangular membership function. We can use the extension
principle or use a linear approximation of the result (be aware of the resulting approximation
error associated with this approximation; it could be quite high considering significant spreads of
the fuzzy numbers involved in the problem). The linear approximation of distance yields the
results

D = T(v; 10, 20, 40) ⊗ T(t; 1.5, 2, 2.5)

with the triangular membership function equal to

D = T(d; 10*1.5, 20*2, 40*2.5) = T (d; 15, 40, 100)

The alternative way is to complete detailed calculations on these fuzzy numbers. We include the
calculations concerning the increasing sections of the membership functions. Using the same
symbols as the ones on p.138 we get

- if ac < z < mn, then z = ac + ωc(m − a) + ωa(n − c) + ω²(m − a)(n − c)

where the values of the parameters are summarized in the tables


a m b
10 20 40

c n d
1.5 2.0 2.5

The above relationship is just a quadratic equation that needs to be solved with respect to ω:

5ω² + 20ω + 15 − z = 0

The solution (namely, the membership function over z) is given in the form

ω = −2 + √(100 + 20z) / 10

(the second root is irrelevant in the framework of this problem). The membership function is
defined over z in the range [15, 40]. Note that it is a nonlinear function of z. It coincides with the
linear approximation at the boundaries, that is, at z = 15 and z = 40.
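The closed-form membership function ω(z) can be checked numerically at the boundaries of [15, 40] and against the quadratic it solves:

```python
# Membership function omega(z) on the increasing section, from the quadratic
# 5w^2 + 20w + 15 - z = 0 derived above.
def omega(z):
    return -2.0 + (100.0 + 20.0 * z) ** 0.5 / 10.0

print(omega(15.0))  # 0.0  (agrees with the linear approximation at z = 15)
print(omega(40.0))  # 1.0  (agrees with the linear approximation at z = 40)

z = 25.0
w = omega(z)
print(abs(5 * w * w + 20 * w + 15 - z) < 1e-9)  # True: the quadratic holds
```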

3. Owing to the triangular form of the membership functions of the fuzzy numbers, the mean is
computed as follows:

0.5·[T(x; 1, 3, 5) + T(x; 2, 4, 6)] = 0.5·T(x; 3, 7, 11) = T(x; 1.5, 3.5, 5.5)

In general, for n arguments each equal to T(x; 1, 3, 5), we derive

(1/n) Σ_{i=1}^{n} T(x; 1, 3, 5) = (1/n) T(x; n, 3n, 5n) = T(x; 1, 3, 5)

Note that the result is just the original fuzzy set used in the entire aggregation process. This
stands in quite evident contrast with the phenomenon of averaging encountered in statistics.
4. The addition of two fuzzy numbers yields the following membership function of the result:

B(b) = sup_{a,x∈R: b = a+x} [A(a) ∧ X(x)]

There is no unique solution to this problem. One may, however, rewrite the above expression in
an equivalent format. To do so, let us introduce a fuzzy relation R defined as

R(x, b) = A(b − x)

Then

B(b) = sup_{x∈R} [R(x, b) ∧ X(x)]

The above is nothing but a fuzzy relational equation with the sup-min composition,

B = X ∘ R    (2)

to be solved with respect to X for R and B provided. The maximal solution to (2) comes in the form

X(x) = inf_{b∈R} [R(x, b) φ B(b)], where R(x, b) φ B(b) = 1 if R(x, b) ≤ B(b), and B(b) otherwise.

5. The solution to the system of two equations (and two unknown fuzzy sets X and Y) is handled
in a similar way as outlined in the previous problem, namely by solving the corresponding system
of fuzzy relational equations. One should elaborate on a way in which this type of transformation
takes place. Consider the first equation in the system of the equations. The membership function
of B1 results directly from the use of the extension principle
B1(b) = sup_{x,y,a1,a2∈R: a1x + a2y = b} [X(x) ∧ Y(y) ∧ A11(a1) ∧ A12(a2)]

We transform this expression through a number of steps:


- first, define two fuzzy relations:

A~(a1, a2) = A11(a1) ∧ A12(a2)
X~(x, y) = X(x) ∧ Y(y)

- second, use the extension principle, which leads to the expression

B1(b) = sup_{x,y,a1,a2∈R: a1x + a2y = b} [X~(x, y) ∧ A~(a1, a2)]

- third, absorb the constraint into the membership function of A~ (substituting a2 = (b − a1x)/y):

B1(b) = sup_{x,y,a1∈R} [X~(x, y) ∧ A~(a1, (b − a1x)/y)]

Introduce another four-argument fuzzy relation

R(a1, x, y, b) = A~(a1, (b − a1x)/y)

Also use the notation

A~~(b, x, y) = sup_{a1} A~(a1, (b − a1x)/y)

Finally, we arrive at the fuzzy relational equation of the form

B1(b) = sup_{x,y} [X~(x, y) ∧ A~~(b, x, y)]

involving the unknown X~. For A~~ and B1 given, the maximal solution X~1 stemming from
this equation reads as in the previous problem.

Here the subscript underlines that the solution comes from the first equation. The solution to the
second equation is obtained in the same way; denote it by X~2. As these two are the maximal
solutions, we determine their intersection

X~ = X~1 ∩ X~2

Note that X~ is a fuzzy relation defined in X × Y. To obtain the individual fuzzy sets we project
X~ onto the corresponding universes, that is:
- projection on X returns the fuzzy set with the membership function sup_{y∈R} X~(x, y);
- projection on Y gives rise to the expression sup_{x∈R} X~(x, y).

The discussed method easily generalizes to a system of equations with “n” variables (fuzzy sets).

6. The problem requires the use of the extension principle and the solutions are obtained very
much in the same way as discussed in detail in the first problem.

Problems 7 and 8 refer back to problem 1.

9. The process of subtracting A from a numeric value is greatly simplified by the
triangular membership function of A. We have
A = (0.8, 1.0, 1.2) and B = {10}. Note also that -A = (-1.2, -1.0, -0.8).
First we obtain
B1 = B - A = B + (-A) = (10, 10, 10) + (-1.2, -1, -0.8) = (8.8, 9.0, 9.2). The
support of B1, supp(B1), equals 9.2 - 8.8 = 0.4.

The second iteration yields


B2 = B1 - A = B1 +(-A) = (8.8, 9.0, 9.2) + (-1.2, -1, -0.8)= (7.6, 8.0, 8.4), supp(B2) = 0.8
Performing successive iterations we obtain
B3 = B2 - A = B2 + (-A) = (7.6, 8.0, 8.4) + (-1.2, -1, -0.8) = (6.4, 7.0, 7.6), supp(B3) = 1.2
B4 = B3 - A = B3 +(-A) = (6.4, 7.0, 7.6) + (-1.2, -1, -0.8)= (5.2, 6.0, 6.8), supp(B4) = 1.6
B5 = B4 - A = B4 +(-A) = (5.2, 6.0, 6.8) + (-1.2, -1, -0.8)= (4.0, 5.0, 6.0), supp(B5) = 2.0
...
Evidently, the results become fuzzier; namely, the supports of the successive fuzzy sets become
larger. This is an inherent phenomenon of fuzzy arithmetic. It also suggests using
relatively short chains of computing (not too many iterations) in order to avoid an excessively
high (unacceptable) accumulation of fuzziness.
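The iteration is easy to reproduce; triangular numbers are handled as triples (a, m, b), and subtraction follows interval arithmetic:

```python
# Iterated subtraction B_{k+1} = B_k - A of triangular fuzzy numbers.
def sub(B, A):
    """B - A for triangular triples (a, m, b): bounds subtract crosswise."""
    return (B[0] - A[2], B[1] - A[1], B[2] - A[0])

A = (0.8, 1.0, 1.2)
B = (10.0, 10.0, 10.0)

supports = []
for _ in range(5):
    B = sub(B, A)
    supports.append(round(B[2] - B[0], 10))

print(tuple(round(v, 10) for v in B))  # (4.0, 5.0, 6.0)
print(supports)                        # [0.4, 0.8, 1.2, 1.6, 2.0]
```

Each step widens the support by supp(A) = 0.4, exactly as observed in the solution.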

10. The use of the extension principle leads to the expression

Y(y) = sup_{x∈R: 4x(1−x) = y} [A(x)]

As both the mapping and the membership function are symmetric, the two roots of the
equation 4x(1 − x) = y with respect to x produce the same membership value, A(x1) = A(x2). The
resulting nonlinear membership function has its support between 0.99 and 1.00; see the figure below.

[Figure: the quadratic map f(x) = 4x(1 − x) and the image f(A); first iteration of the triangular fuzzy number through the quadratic map.]
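A brute-force extension-principle sketch for this map; the triangular number A = T(x; 0.45, 0.5, 0.55) is an assumption chosen so that the image support is [0.99, 1.00], as stated above:

```python
# Extension principle on a grid: Y(y) = sup of A(x) over x with 4x(1-x) = y.
# The triangular number A is an assumed input consistent with the stated
# image support [0.99, 1.00].
def A(x):
    a, m, b = 0.45, 0.5, 0.55
    if a < x <= m:
        return (x - a) / (m - a)
    if m < x < b:
        return (b - x) / (b - m)
    return 0.0

def Y(y, steps=20001):
    """Approximate sup over grid points x whose image 4x(1-x) is near y."""
    best = 0.0
    for k in range(steps):
        x = k / (steps - 1)
        if abs(4.0 * x * (1.0 - x) - y) < 1e-3:   # tolerance on the constraint
            best = max(best, A(x))
    return best

print(Y(1.0))  # 1.0  (the modal value x = 0.5 maps onto y = 1)
print(Y(0.5))  # 0.0  (no point of supp(A) maps near y = 0.5)
```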


CHAPTER 6

Fuzzy Sets and Probability

1. In this example the fuzzy set reduces to an ordinary set. The location of A and the probability
density function (pdf) p(x) are illustrated in the figure below.

[Figure: A is the interval [a, b]; p(x) is uniform over [0, c], with 0 < a < c < b; a distribution of A and the pdf p(x).]

The calculations of the probability of A, Prob(A), its expected value E(A), and its variance Var(A)
follow the formulas:

Prob(A) = ∫_0^b A(x) p(x) dx = ∫_a^c p(x) dx = (c − a)/c

E(A) = ∫_0^b x A(x) p(x) dx = (1/c) ∫_a^c x dx = (c² − a²)/(2c)

Var(A) = ∫_0^b (x − E(A))² A(x) p(x) dx = (1/c) ∫_a^c ( x − (c² − a²)/(2c) )² dx
       = (c³ − a³)/(3c) − (c² − a²)²/(2c²) + (c² − a²)²(c − a)/(4c³)
The computations for the second case envisioned in the problem are completed in the same way.
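The closed-form expressions can be cross-checked by brute-force integration; the values a = 0.2, c = 0.6 below are assumptions made only for the check:

```python
# Closed-form Prob, E, Var for A = [a, b], p uniform on [0, c], a < c < b,
# versus midpoint-rule integration over [a, c] with p(x) = 1/c.
a, c = 0.2, 0.6   # assumed values for the check

prob = (c - a) / c
mean = (c**2 - a**2) / (2 * c)
var = ((c**3 - a**3) / (3 * c)
       - (c**2 - a**2) ** 2 / (2 * c**2)
       + (c**2 - a**2) ** 2 * (c - a) / (4 * c**3))

n = 100000
dx = (c - a) / n
xs = [a + (k + 0.5) * dx for k in range(n)]
p_num = sum(dx / c for _ in xs)
e_num = sum(x * dx / c for x in xs)
v_num = sum((x - mean) ** 2 * dx / c for x in xs)

print(abs(prob - p_num) < 1e-6,
      abs(mean - e_num) < 1e-6,
      abs(var - v_num) < 1e-6)   # True True True
```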

2. We encounter a problem of a sensor affected by random noise. This general scenario is
commonly encountered in control, communications, etc. The main difference here lies in the
presence of two distinct forms of information granules, namely fuzziness and randomness. To
start any computations, it becomes indispensable to transform the probabilistic facet of
uncertainty into its fuzzy-set counterpart. As we encounter a continuous pdf, we first discretize it
and convert it into a membership function (see formula 6.10). Second, we complete the addition of
the two fuzzy quantities, say C = A ⊕ B. The overall process of transformation is portrayed in
the figure below.

[Figure: the pdf p(z) undergoes a probability-possibility transformation into B; the sensor reading A and the noise B are then added, giving the fuzzy output C = A ⊕ B of the sensor affected by an additive noise.]

Alternatively, we transform the fuzzy reading of the sensor into a probability function. Owing
to the additive nature of the noise, the two random variables are afterwards added, giving rise to
a probability function D. Finally, the probability of the fuzzy event C is computed in the usual
form

Prob(C) = ∫ C(x) p_D(x) dx

where pD denotes the pdf of D.

3. This is a design problem that requires some numeric optimization. The choice of the form of
the four linguistic labels (fuzzy sets) is very much open. We confine ourselves to triangular or
trapezoidal membership functions. This choice greatly simplifies the way in which these fuzzy sets
can be distributed along the universe of discourse. The modifications (e.g., expansion or
contraction of the membership functions) are performed by equalization of the respective fuzzy
sets. This means that we request that each fuzzy set embrace the same fraction of the
experimental data. More specifically, we are concerned with the σ-count of each fuzzy set, as
different elements of the data set may belong to it to different degrees of membership. Let us
concentrate on the fuzzy sets shown below.
[Figure: four fuzzy sets A1, …, A4 distributed over [min, max] with switching points a1, …, a4; equalization of fuzzy sets.]

Put c = 1/8. The equalization process can be sketched as follows:

- scan the current argument x starting from min and determine a1 for which the cumulative sum
of membership grades

Σ_{k: x_k < a1} A1(x_k)

becomes equal to c. Repeat the same process for the successive fuzzy sets, starting from a1
(for A2) and moving toward higher values of the argument.

In general, the higher the concentration of data, the narrower the fuzzy set constructed over
that particular region. Once the fuzzy sets have been determined, we can easily determine
their expected values and variances.
CHAPTER 7

Linguistic Variables

1. The syntactic trees are as follows:

[Figure: three syntactic trees deriving the required terms from the start symbol S, through productions such as A → A and B, B → not C, and E → very E, ending in the terminals low, high, and very high.]

To generate the terms needed, just add the production A → A or B.

2. This is straightforward. The productions are like those on pages 167 and 168, except for the
notation.

3. As indicated, define very big as Big²(x) and not very big as 1 − Big²(x), and intersect with Big(x),
e.g., via the min (∧) operator, to get the meaning of slightly as required. Note that the resulting
fuzzy set is not normal, which is why normalization is necessary. To perform the normalization, we
must find the maximum of Big(x) ∧ [1 − Big²(x)]. Therefore, we have:

Big_and_not_very_big(x) = ∫_0^∞ { [1 + (x/a)⁻²]⁻¹ ∧ ( 1 − [1 + (x/a)⁻²]⁻² ) } / x.

The maximum is attained at x* = 0.63 and the corresponding value is Slightly(x*) = 0.6. Thus,

Slightly(x) = (1/0.6) ∫_0^∞ { [1 + (x/a)⁻²]⁻¹ ∧ ( 1 − [1 + (x/a)⁻²]⁻² ) } / x.

4. Looking at Figure 7.11, page 179, we note that the effect of sort of is to reduce the membership
grades of the points in the center of the fuzzy set and to increase those in the periphery. For the
fuzzy set Low, we may approximate sort of low as follows:
Sort_of_Low(x) = Norm[(1 − Con_Low(x)²) ∩ Dil_Low(x)]. The term (1 − Con²) reduces the
membership degrees of points close to zero, whereas the term Dil increases the
membership degrees of points far from zero. Thus, for the example low, we might
interpret sort of low as the normalization of more or less low but not very very small, where Dil is
viewed as more or less and but as and.

6. As suggested, we may assume that usually x is about_a = most x is about_a and proceed as
stated on page 177, e.g., taking most as in Figure 7.10. Therefore we have:

r = Card[About_a]/Card(X) = (1/n) Σ_{i=1}^{n} About_a(x_i), and Truth(usually x is about_a) = Most(r).

7. The square of usually is the product of usually with itself in fuzzy arithmetic. Thus, as
discussed in Chapter 5, we define the product of two fuzzy quantities via the extension
principle. If, for short, we let U(y) denote the fuzzy set associated with usually, then we have:

U²(x) = sup_{x=y²} [U(y) ∧ U(y)] = sup_{x=y²} [U(y)], where x, y ∈ [0, 1].

The figure below illustrates the computation.

[Figure: usually² lies to the left of usually over [0, 1], since each y is pulled down to y².]

8. r = Card(Heavy)/Card(X) = 1.6/3 = 0.533… Hence, Truth(most trucks are heavy) = Most(0.533) = 0.07.
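The relative σ-count behind r = 1.6/3 can be reproduced as follows; the individual membership grades are assumed here (any grades summing to 1.6 give the same r), and Most itself comes from Figure 7.10 of the book:

```python
# Relative sigma-count: r = (sum of membership grades in Heavy) / n.
heavy = [0.9, 0.4, 0.3]          # assumed grades of the three trucks, sum 1.6
r = sum(heavy) / len(heavy)
print(round(r, 3))               # 0.533
```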
CHAPTER 8

Fuzzy Logic

1. From the formulation of the problem we get false(v) = τ(1 − A(x))(v). Assume the model of
linguistic truth to be linear, namely τ(v) = v, defined on the unit interval. Then the following
interpretation holds:

(X is A) is false = B

with B(x) = 1 − A(x) for any x in X.

2. The discussed linguistic modifiers yield the expressions

(X is A) is very true = B; B(x) = A2(x)


(X is A) is very false = C; C(x) = (1-A(x))2

3. The linguistic truth of the expression τ(A and B) is given in the form

τ_{A and B}(v) = sup_{v = w t z} [min(τ_A(w), τ_B(z))]

Because of the discrete format of the universe of discourse, the calculations are carried out for
their successive entries. Put, as an example, v=0. The calculations of τ(A and B) for this
particular value of the argument imply a series of pairs of values of w and z:
w=0, z=0; w=0, z =0.1; w=0, z=0.2; ..., w=0, z=1;
w=0.1, z=0; w=0.2, z=0; ...., w =1, z =0.
By taking the minimum of the corresponding membership values (t-norm: minimum) we obtain
the membership of the truth value equal to 1.
When proceeding with other interactive t-norms, such as the product, the results are distributed
non-uniformly across the universe of discourse. Say, if w = 0.7 and z = 0.8, then
v = w·z = 0.56.

4. We follow the basic formula supporting calculations of linguistic truth:

τ_i(v) = sup_{x: v = A_i(x)} A(x)

Because A(x) ∈ {0, 1} the resulting linguistic truth assumes only two values (either 0 or 1). The
set format of A simplifies all computations. Refer to the figure below. Take x ∉ [a, b] or
equivalently consider v lying outside the region [a, b]. In this particular range, the truth value is
equal to 0. Subsequently for x ∈ [a, b] (that is an equivalent representation of v ∈ [a, b]) the truth
value is equal to 1.

[Figure: the set A = [a, b] contained in the support [d, e] of the fuzzy set Ai; computations of linguistic truth for A and Ai.]

5. The form of the input implies a fuzzy truth value of the form

τ_i(v) = 1 if v = 0.5; 0 otherwise

Assuming that the rule is true, meaning that

τ_Ri(v) = v

we obtain the fuzzy truth of the conclusion in the form of a linear function; see the figure below.

[Figure: the fuzzy truth value τ_Bi of the conclusion, a linear function over [0, 1].]

Finally, the inverse truth qualification leads to the fuzzy set B:

B(z) = 0.5 if B_i(z) < 0.5; B_i(z) otherwise

We compare this style of inference with two other commonly encountered models:
(a) associative memories using Hebbian learning. The fuzzy relation R is taken as the Cartesian
product of Ai and Bi. Then the max-min composition of the fuzzy singleton A and R produces the
outcome of the reasoning mechanism.
(b) the fuzzy relational-equation approach computes R as an implication (more specifically, the
Gödelian implication) of Ai and Bi. The max-min composition is utilized afterwards.

As an example, let us use two fuzzy sets Ai and Bi

Ai = [1.0 0.7 0.5 0.2]


Bi = [0.3 0.8 1.0 0.6]

Using the first method, the fuzzy relation R is equal to

R = [ 0.3 0.8 1.0 0.6
      0.3 0.7 0.7 0.6
      0.3 0.5 0.5 0.5
      0.2 0.2 0.2 0.2 ]

and the conclusion arises as a fuzzy set

B = [0.3 0.5 0.5 0.5]

The second approach leads to the fuzzy relation

R = [ 0.3 0.8 1.0 0.6
      0.3 1.0 1.0 0.6
      0.3 1.0 1.0 1.0
      1.0 1.0 1.0 1.0 ]

(the (3,4) entry equals 1.0 because Ai(x3) = 0.5 ≤ Bi(y4) = 0.6 under the Godelian implication) and, subsequently, the result of the form [0.3 1.0 1.0 1.0].
The fuzzy logic approach leads to the expression

B = [0.5 0.8 1.0 0.6]

This simple example reveals that these results can be arranged in the form of two inclusions (elementwise ≤):

[0.3 0.5 0.5 0.5] ≤ [0.5 0.8 1.0 0.6]

and

[0.3 0.5 0.5 0.5] ≤ [0.3 1.0 1.0 1.0]
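Both relations and their compositions can be reproduced with a short Python sketch (the location of the singleton A at the third element is an assumption, chosen to be consistent with the rows used above):

```python
def cartesian_min(a, b):
    # Hebbian-style relation: R(i, j) = min(Ai(i), Bi(j))
    return [[min(ai, bj) for bj in b] for ai in a]

def godel(a, b):
    # Godelian implication: R(i, j) = 1 if Ai(i) <= Bi(j), else Bi(j)
    return [[1.0 if ai <= bj else bj for bj in b] for ai in a]

def maxmin(x, r):
    # max-min composition of the fuzzy set x with the relation r
    return [max(min(xi, row[j]) for xi, row in zip(x, r))
            for j in range(len(r[0]))]

Ai = [1.0, 0.7, 0.5, 0.2]
Bi = [0.3, 0.8, 1.0, 0.6]
A = [0.0, 0.0, 1.0, 0.0]   # assumed singleton at the third element

B_hebb = maxmin(A, cartesian_min(Ai, Bi))   # [0.3, 0.5, 0.5, 0.5]
B_godel = maxmin(A, godel(Ai, Bi))          # [0.3, 1.0, 1.0, 1.0]
```

The two inclusions above can then be checked elementwise against the fuzzy-logic result [0.5 0.8 1.0 0.6].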
CHAPTER 9

Fuzzy Measures and Fuzzy Integrals

1. The λ-measure is computed following formula (9.5) provided on p. 208 of the book. We
solve the associated polynomial equation with respect to λ. The roots are:
6.56478, -3.14579-2.94849i, -3.14579+2.94849i, -0.945946, and 0.
The corresponding value of the normalization factor used in the fuzzy measure is taken as
-0.945946 (the root falling in (-1, 0), as required here because the densities sum to more than 1).
In the sequel we determine the relevance of information produced by any pair of
sensors. Overall, we have
 5 5!
  = = 10
 2 2!3!

pairs of the sensors. The results of successive computations of the fuzzy measure of the pairs of
the sensors are summarized in the tabular format
sensor no. fuzzy measure
1, 2 0.767568
1, 3 0.645946
1, 4 0.443243
1, 5 0.564865
2, 3 0.885811
2, 4 0.801351
2, 5 0.852027
3, 4 0.693919
3, 5 0.765878
4, 5 0.622297
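The pair values can be reproduced in a few lines of Python. A sketch (the individual sensor densities are not listed here, but the values g1 = 0.20, g2 = 0.70, g3 = 0.55, g4 = 0.30, g5 = 0.45 are consistent with every table entry, so we treat them as an assumption reconstructed from the results): find λ as the nonzero root of 1 + λ = Π(1 + λgi) by bisection, then build the measure of a set one element at a time.

```python
g = {1: 0.20, 2: 0.70, 3: 0.55, 4: 0.30, 5: 0.45}   # assumed sensor densities

def poly(lam):
    # f(lam) = prod(1 + lam*g_i) - (1 + lam); its nonzero root in (-1, 0) is sought
    p = 1.0
    for gi in g.values():
        p *= 1.0 + lam * gi
    return p - (1.0 + lam)

lo, hi = -0.99, -0.5        # poly(lo) > 0 > poly(hi), so the root is bracketed
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if poly(mid) > 0:
        lo = mid
    else:
        hi = mid
lam = 0.5 * (lo + hi)       # close to -0.945946, the root quoted above

def measure(sensors):
    # lambda-measure built one element at a time:
    # g(A with x added) = g(A) + g(x) + lam * g(A) * g(x)
    m = 0.0
    for s in sensors:
        m = m + g[s] + lam * m * g[s]
    return m
```

For instance, measure([1, 2]) returns about 0.767568 and measure([3, 4]) about 0.693919, matching the table.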

Next we repeat computations for each triple of the sensors. Here we get

 5 5!
 = = 10
3
  3!2!
possible arrangements of the sensors. Again, the results are collected in the following table
(additionally, note that the combination of the sensors producing the highest value of the fuzzy
measure is 2, 3, 5)

sensors fuzzy measure


1, 2, 3 0.918225
1, 2, 4 0.849744
1, 2, 5 0.890833
2, 3, 4 0.934432
2, 3, 5 0.958743
3, 4, 1 0.762637
3, 4, 5 0.848534
4, 1, 5 0.704565
4, 2, 5 0.934432
5, 1, 3 0.820982

2. The level of failure identified by all sensors is computed in the form of the fuzzy integral. We
arrange the sensors in decreasing order of the degrees of failure indicated by the respective
sensors. The table below summarizes the results (the last row denotes the fuzzy measure produced
for the growing collection of the sensors):

h             0.90   0.45       0.42       0.37       0.11
g             0.20   0.45       0.70       0.55       0.30
g ({sensors}) 0.20   0.564865   0.890833   0.977356   1.00

The integral, see p. 211, formula (9.13), is the maximum of the terms min(h, g({sensors})); it is attained at min(0.45, 0.564865) = 0.45

If sensor 1 and 3 are faulty, then the maximal level of failure reported by the remaining sensors is
obtained by carrying out fuzzy integration over the following entries
h 0.45 0.42 0.11
g 0.45 0.70 0.30
g ({sensors}) 0.45 0.852027 0.910236

The result is equal to 0.45. This points out that the level of identified fault (even though some
sensors were inactive) has not been changed.
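A minimal Python sketch of this computation (λ = -0.945946 from problem 1; the densities listed in the g row of the table above are taken in the sorted order of h):

```python
def sugeno_integral(h, dens, lam):
    # sort the degrees h in decreasing order, carry the matching densities along,
    # and take the maximum of min(h_i, g({first i sensors}))
    pairs = sorted(zip(h, dens), reverse=True)
    g_cum, best = 0.0, 0.0
    for hi, di in pairs:
        g_cum = g_cum + di + lam * g_cum * di   # lambda-measure of the growing set
        best = max(best, min(hi, g_cum))
    return best

lam = -0.945946
h = [0.90, 0.45, 0.42, 0.37, 0.11]      # degrees of failure, already sorted
dens = [0.20, 0.45, 0.70, 0.55, 0.30]   # matching entries of the g row above
print(sugeno_integral(h, dens, lam))    # 0.45
```

Running it on the reduced sensor set h = [0.45, 0.42, 0.11] with densities [0.45, 0.70, 0.30] again returns 0.45, confirming the observation in the text.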

3. Here we sketch the solution by focusing on the underlying idea. The relevance of the classifier
with respect to a given class is meant to be its accuracy; the closer this number is to 1, the better.
We construct a fuzzy measure over the family of the classifiers; subsequently we compute the
fuzzy integral over the corresponding columns to obtain final class membership values. See
figure below.
[Figure: Computations of the fuzzy measure and fuzzy integrals for the individual classes in the classification problem; the class-i membership values reported by classifier1, classifier2, ..., classifier4 are combined through the fuzzy measure and the fuzzy integral into the class membership in class "i", which then drives the class assignment]
4. The pieces of the image are tagged (subjectively) by some values of the fuzzy measure. This
assignment is subjective: the higher the relevance of the piece, the higher the value of the fuzzy
measure, see the figure below. For instance, if we anticipate that a certain piece is essential for
understanding (or envisioning) the entire image, then we attach to it a high value of the fuzzy
measure. On the other hand, if some other fragment of the image does not seem to be essential,
then the associated value of the fuzzy measure is made low. One of the possibilities of such an
allocation of the fuzzy measure is shown below. As a matter of fact, this reflects our intuitive
insight as to the entire image, especially once it becomes reconstructed from these pieces. For the
assigned values of the fuzzy measure, the roots of the polynomial equation are given as:
-7.07236, -3.53154-3.15859i, -3.53154+3.15859i, -0.959798, 0, and thus the parameter used for
further computations of the fuzzy measure is equal to -0.959798.

[Figure: Elements of an image under analysis; five pieces tagged with the fuzzy measure values 0.75, 0.70, 0.30, 0.25, and 0.20]

5. First we compute the value of the λ-parameter of the fuzzy measure for the values of gi
equal to 0.05, 0.40, 0.10, 0.60, 0.40, 0.55 (note that there is an error in the dimension of the
vector; let us make the number of sources equal to 5 by dropping the last value). Solving the
resulting polynomial equation, the solution of interest is -0.773061. Let us now compute the
values of the fuzzy integral for the respective fuzzy sets; the results are included in the tables below
h 0.50 0.30 0.20 0.00 0.00
g 0.05 0.10 0.40 0.60 0.40
g ({elements}) 0.05 0.146135 0.500946 0.868589 1.00

h              0.70   0.50       0.40       0.20       0.10
g              0.40   0.05       0.10       0.60       0.40
g ({elements}) 0.40   0.434539   0.500946   0.868589   1.00

h              0.70   0.40       0.30       0.20       0.00
g              0.40   0.10       0.60       0.05       0.40
g ({elements}) 0.40   0.469078   0.851502   0.868589   1.00

Finally, let us compare the results of these computations with the experimental results

fuzzy integral (computed)    fuzzy integral (target value)
0.20                         0.87
0.43                         0.52
0.40                         0.13

The differences are significant. As a transformation, we can think of some linguistic modifiers
applied to the fuzzy set (h) under integration. Another option would be to consider a complement
of h.
CHAPTER 10

Rule-Based Computations

1. For each x∈X, let B(x) = [µ ∧ A(x)] + (1- µ). If µ ≥ A(x) then B(x) = A(x) + (1- µ). Thus B(x)
≥ A(x). If µ ≤ A(x) then B(x) = 1 ≥ A(x). Therefore B(x) ≥ A(x) for each x∈X which means that
S(A) ≥ S(B).

Next, we can say that if A1 = A2 = A and µ1 ≥ µ2, then S(B1) ≥ S(B2), because if B1(x) = [µ1∧A(x)] + (1- µ1) and
B2(x) = [µ2∧ A(x)] + (1- µ2) there are three possibilities:

a) A(x) ≤ µ2 ≤ µ1 . Then B1(x) = A(x) + (1- µ1) and B2(x) = A(x) + (1- µ2). Since µ2 ≤ µ1 we have
(1-µ1) ≤ (1-µ2) and hence B2 (x) ≥ B1(x).

b) µ2 ≤ A(x) ≤ µ1. Then B1(x) = A(x) + (1- µ1) and B2(x) = µ2 + (1- µ2) = 1 ≥ B1(x).

c) µ1 ≤ µ2 ≤ A(x). Then B1(x) = µ1 + (1- µ1) =1 and B2(x) = µ2 + (1- µ2) = 1 ≥ B1(x).

We can conclude that B2 (x) ≥ B1(x) ∀ x ∈ X. Therefore S(B1) ≥ S(B2).

2. Consider, as an example, the point (a,c,e). Thus D(a,c,e) = {1,1,0} and D1(a,c,e) = 1,
D2(a,c,e) = 1, D3(a,c,e) = 0. Therefore we find Pa(a,c,e) as follows

Pa(a, c, e) = max_{i=1,2,3} [Q(i/3) ∧ Di(a, c, e)] = max[0 ∧ 1, 1/2 ∧ 1, 1 ∧ 0] = 1/2

Performing similarly for every (x1, x2, x3) ∈ X1×X2×X3 we get the results summarized in
Table I below. Note that, to define P(x1, x2, x3, y), we must specify f. For instance, if we choose
the Lukasiewicz implication, we have P(x1, x2, x3, y) = min[1, 1 - Pa(x1, x2, x3) + B(y)] and the end
result is as collected in Table II. Note that Pc(y) = B(y).
Table I
x1 x2 x3 A(x1) A(x2) A(x3) d1 d2 d3 Pa
a c e 1 1 0 1 1 0 1/2
a c f 1 1 1 1 1 1 1
a d e 1 0 0 1 0 0 0
a d f 1 0 1 1 1 0 1/2
b c e 0 1 0 1 0 0 0
b c f 0 1 1 1 1 0 1/2
b d e 0 0 0 0 0 0 0
b d f 0 0 1 1 0 0 0

Table II
x1 a a a a a a a a b b b b b b b b
x2 c c c c d d d d c c c c d d d d
x3 e e f f e e f f e e f f e e f f
y g h g h g h g h g h g h g h g h
P 1 1/2 1 0 1 1 1 1/2 1 1 1 1/2 1 1 1 1

3. To prove (a) and (b), two rows of R are of interest, namely the rows that satisfy
min[A(xi), C(zk)] = 1 and min[A(xp), ¬C(zr)] = 1. Let us assume that the first is the (i,k)th row and that
the second is the (p,r)th row. Since ¬C(z) = 1 - C(z), it is clear that (i,k) and (p,r) are not equal. We know
that for each j, rijk = B(yj) and that r'pjr = ¬B(yj) (rijk and r'pjr are the (ijk) and (pjr) elements of RC
and of R¬C, respectively), but what about r'ijk and rpjr? In R¬C we have ¬C(zk) = 1 - C(zk) = 0
because zk was chosen such that C(zk) = 1. Thus min[A(xi), ¬C(zk)] = 0. Since ¬B(yj) ≥ 0 for any j,
due to the Godelian implication r'ijk = 1. Hence the corresponding element of R = RC ∧ R¬C is
min[rijk, r'ijk] = rijk = B(yj). Analogously, the (p,r)th row of R reduces to r'pjr = ¬B(yj). The only
interesting row in R is the one for which A(x) = 1. Therefore, the maximum of all the minima of
A(x) and C(z) [resp. ¬C(z)] and R is equal to B(y) [resp. ¬B(y)].
To prove (c), suppose that for some s,t,u: A(xs) = 1 and C(zu) = B(yt) = 0.5. Then it is easy to check that
rstu = r'stu = 1 and hence the corresponding element of R equals 1. In addition, because both
min[A(xs), C(zu)] and min[A(xs), ¬C(zu)] are equal to 0.5, and hence less than 1, the row (s,u) is
different from the rows (i,k) and (p,r) mentioned above; thus it is not relevant for the behavior of R
with respect to the input A(x) ∧ C(z) or A(x) ∧ ¬C(z). But with the input A(x) ∧ Z(z), where Z denotes
the "unknown" fuzzy set with Z(z) = 1 for all z, it follows that min[A(xs), Z(zu)] = 1 and hence
min[A(xs), Z(zu), rstu] = 1, which means that B(yt) = max_{s,u} min[A(xs), Z(zu), rs,t,u] = 1. But this
cannot be the case because we have assumed that B(yt) = 0.5.

4. Just note that (a) if A(x) < C(z), then min[A(x), C(z)] = A(x) ≤ B(y), but it may be the case that
C(z) > B(y); and (b) if A(x) > C(z), then min[A(x), C(z)] = C(z) ≤ B(y), but it may be the case that
B(y) < A(x). See the figure below.

[Figure: the two configurations of membership values; (a) A(x) ≤ B(y) ≤ C(z) and (b) C(z) ≤ B(y) ≤ A(x)]

5. Inferring B'(y) using the composition rule reads as

B'(y) = sup_x [A'(x) t R(x, y)]

But R(x, y) = 1 if B(y) ≥ A(x) and R(x, y) = 0 otherwise. Therefore, if we let
AR(y) = {x ∈ X | B(y) ≥ A(x)}, we get

B'(y) = sup_{u ∈ AR(y)} A'(u)

6. Since the rules are conjunctively aggregated we have

R(x, y) = R1(x, y) ∧ R2(x, y) = [A1(x) ∨ B1(y)] ∧ [A2(x) ∨ B2(y)]

Using the compositional rule of inference with t = ∧, and assuming A normal, we get

B(y) = sup_x {[A1(x) ∨ B1(y)] ∧ [A2(x) ∨ B2(y)] ∧ A(x)}
     = sup_x {([A1(x) ∧ A2(x)] ∨ [A1(x) ∧ B2(y)] ∨ [B1(y) ∧ A2(x)] ∨ [B1(y) ∧ B2(y)]) ∧ A(x)}
     = sup_x {[A1(x) ∧ A2(x) ∧ A(x)] ∨ [A1(x) ∧ B2(y) ∧ A(x)] ∨ [B1(y) ∧ A2(x) ∧ A(x)] ∨ [B1(y) ∧ B2(y) ∧ A(x)]}
     = max{ sup_x [A1(x) ∧ A2(x) ∧ A(x)], sup_x [A1(x) ∧ B2(y) ∧ A(x)], sup_x [B1(y) ∧ A2(x) ∧ A(x)], sup_x [B1(y) ∧ B2(y) ∧ A(x)] }

Thus we get

B(y) = Poss(A1 ∧ A2, A) ∨ [Poss(A1, A) ∧ B2(y)] ∨ [Poss(A2, A) ∧ B1(y)] ∨ [B1(y) ∧ B2(y)].
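The closed-form expression can be verified numerically against the direct sup computation on small discrete universes (the membership values below are arbitrary illustrations; A is normal, as required):

```python
def poss(p, q):
    # possibility of p with respect to q: sup of the pointwise minimum
    return max(min(pi, qi) for pi, qi in zip(p, q))

A1 = [0.2, 0.9, 0.5, 0.1]
A2 = [0.7, 0.4, 1.0, 0.3]
A  = [1.0, 0.6, 0.8, 0.2]   # normal fuzzy set: its maximum is 1
B1 = [0.3, 0.8, 0.6]
B2 = [0.9, 0.2, 0.7]

# direct compositional rule of inference: sup over x, for every y
direct = [max(min(max(A1[x], B1[y]), max(A2[x], B2[y]), A[x])
              for x in range(len(A)))
          for y in range(len(B1))]

# the closed-form expression derived above
closed = [max(poss([min(a1, a2) for a1, a2 in zip(A1, A2)], A),
              min(poss(A1, A), B2[y]),
              min(poss(A2, A), B1[y]),
              min(B1[y], B2[y]))
          for y in range(len(B1))]

print(direct == closed)   # True
```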

7. It is instructive to detail the derivation of the necessary conditions for the case shown on page
256 of the textbook because this problem is analogous. Thus, let us first consider the example in
which we have the fact X is D and the rules
If X is A1, then Y is B1
If Y is G2, then W is S2
If X is A3, then W is S3
These four pieces of knowledge are conjunctively combined into

H(x, y, z) = (¬A1(x) ∨ B1(y)) ∧ (¬G2(y) ∨ S2(z)) ∧ (¬A3(x) ∨ S3(z)) ∧ D(x).

Projecting H(x, y, z) on Z we get W is F, where

F(z) = [Poss(¬A1 ∩ ¬A3, D)] ∨ [Poss(¬A1, D) ∧ S3(z)] ∨ [Poss(¬A1 ∩ ¬A3, D) ∧ S2(z)] ∨
[Poss(¬A1, D) ∧ S2(z) ∧ S3(z)] ∨ [Poss(¬A3, D) ∧ Poss(¬G2, B1)] ∨ [Poss(¬G2, B1) ∧ S3(z)] ∨
[Poss(¬A3, D) ∧ S2(z)] ∨ [S2(z) ∧ S3(z)].

To obtain a degree of inconsistency of α, it must be the case that F(z) ≤ 1 - α, ∀z. Since S2 and S3 are
assumed to be normal fuzzy sets, we must have

Poss(¬A1, D) ≤ 1 - α
Poss(¬A3, D) ≤ 1 - α

and these conditions guarantee that Poss(A1, A3) ≥ α. In addition, we must have
S2(z) ∧ S3(z) ≤ 1 - α, ∀z, that is, Poss(S2, S3) ≤ 1 - α. It must also be the case that Poss(¬G2, B1) ≤ 1 - α
or, equivalently, 1 - Nec(G2, B1) ≤ 1 - α, which gives another necessary condition for the existence of
potential conflicts relating the values of the linking variable, namely Nec(G2, B1) ≥ α. Therefore,
there are three necessary conditions to obtain any inconsistency in this case: for some α ∈ (0,1]
1- Poss(S2, S3) ≤ 1 - α
2- Poss(A1, A3) ≥ α
3- Nec(G2, B1) ≥ α
Note the pattern of the conditions obtained. The first condition relates the consequents of the last
two rules given, whereas the second relates the values of the same variable appearing in the
antecedents, and the last relates the linking variable between a rule consequent and a rule antecedent.
Therefore, we could proceed analogously with the rules given in the problem, but it is clear that,
based on the result and the pattern above, we get
1- Poss(C2, S2) ≤ 1 - α
2- Poss(A2, A3) ≥ α
3- Poss(B4, B5) ≥ α
4- Nec(B2, B1) ≥ α

8. The minimal cover for the new rule introduced is the subset that includes the two rules linked
by variable V6. The corresponding CT-matrix T’ for the subset contains only the second (R2) and
the third (R3) columns of the matrix at the top of page 257. Thus, the test vector for the new rule
is VT = [0 -C6 0 0 0 0 -D2 0 0 F2] and applying the matrix composition we get VT ∗ T', with

T' = [ -A2   0
       -C2   0
        0    0
        0    0
        0    0
        J1  -J2
        0   -D1
        0   -E1
        0    0
        0   -F1 ]

equal to

[ 0 + Poss(C6, C2) + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0
  0 + 0 + 0 + 0 + 0 + 0 + Poss(D2, D1) + 0 + 0 + Poss(F2, F1) ]
For the fuzzy sets given, we have Poss(C6, C2) = 0.5 ≥ α, Poss(D2, D1) = 1 ≥ α and
Poss(F2, F1) = 0.2 ≤ 1 - α. Next we find, by inspection, that V6 is the linking variable and, for
example, assuming J1 = {1/1, 0.8/2, 0.7/3} and J2 = {0/1, 0.7/2, 1/3}, we get Nec(J1, J2) = 0.7 ≥ α.
Thus we have an indicator that a potential conflict exists with the insertion of the new rule.
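The possibility and necessity values quoted here can be recomputed with a short sketch (using the standard definitions Poss(A, B) = sup min(A, B) and Nec(A, B) = inf max(A, 1 - B)):

```python
def poss(a, b):
    # Poss(A, B) = sup_x min(A(x), B(x))
    return max(min(ai, bi) for ai, bi in zip(a, b))

def nec(a, b):
    # Nec(A, B) = inf_x max(A(x), 1 - B(x))
    return min(max(ai, 1.0 - bi) for ai, bi in zip(a, b))

J1 = [1.0, 0.8, 0.7]   # J1 = {1/1, 0.8/2, 0.7/3}
J2 = [0.0, 0.7, 1.0]   # J2 = {0/1, 0.7/2, 1/3}
print(nec(J1, J2))     # 0.7
```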
CHAPTER 11

Fuzzy Neurocomputation

1. This is a small simulation project. The network can be realized and experimented with using
MATLAB, MATHEMATICA, or any other simulation package. The main task is to write
down the expressions governing the behavior of the network, based on its structure and
including in this description the specific forms of the triangular norms and conorms
(product and probabilistic sum).

2. The realization of the eigen fuzzy set can be accomplished in the structure visualized below.

[Figure: A neural structure used for optimizing an eigen fuzzy set; a single layer of OR neurons, where the j-th neuron receives the inputs a1, ..., an through the connections r1j, r2j, ..., rnj]

Note that, in contrast to the majority of neural networks where connections are trained, here we
look for the inputs and outputs of the network that need to be equal. The optimized (minimized)
performance index is defined as

Q = Σ_{j=1}^{n} ( S_{i=1}^{n} [a_i t r_ij] - a_j )²

where S and t denote the s-norm and t-norm of the OR neuron, and its minimization

min_a Q

can follow a standard gradient-based scheme. As an empty set (a = ∅) forms a trivial solution, it
is prudent to start the iterative computing from an initial point that is far enough from it. Say,
setting a = 1, or making its entries close to 1, could be a good starting point for all iterations.
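As an alternative to the gradient scheme, the same fixed point can be illustrated by simple iteration of the max-min composition starting from a = 1 (a hedged sketch with an arbitrary relation R; on a finite universe the monotone iteration terminates):

```python
def compose(a, R):
    # max-min composition: the j-th output is max_i min(a_i, r_ij)
    return [max(min(a[i], R[i][j]) for i in range(len(a)))
            for j in range(len(R[0]))]

R = [[0.5, 0.8, 0.3],
     [0.7, 0.6, 0.9],
     [0.2, 0.4, 0.6]]

a = [1.0, 1.0, 1.0]      # start from the largest fuzzy set
while True:
    nxt = compose(a, R)
    if nxt == a:         # eigen fuzzy set reached: a composed with R equals a
        break
    a = nxt
print(a)                 # [0.7, 0.7, 0.7]
```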

3. (i) The network supporting computing possibility and necessity measures, page 51, is
comprised of two neurons, see the figure below: the computations of possibility are carried out by the
OR neuron, while the AND neuron computes the corresponding value of the necessity measure.
Note that the connections of the AND neuron are just the complements of the fuzzy set to be
determined.

[Figure: Computing possibility and necessity measures; the input Ak feeds an OR neuron with connections B, producing the possibility, and an AND neuron whose connections are the complement of B, producing the necessity]

The learning scenarios consist of input-output tuples

(Ak, λk, µk)

k = 1, 2, ..., K, where K is the size of the training set. The connections (fuzzy set B) are to be
determined. Obviously, a certain performance index guiding the learning should express a
distance between the target possibility and necessity values and those produced by the network.
A standard mean squared error is a common choice.
(ii) This estimation problem calls for the neural network of the structure given in the figure
below. Note that now A is a vector of the connections of the neurons whereas Bk are the inputs.
As the necessity measure is asymmetrical, the AND neuron has all inputs complemented
(negated).

A
OR

Possibility
Bk

Necessity

AND
A

Optimization of the connections of the network - second learning scenario

4. The reliability of some modules connected in series is described as

R = Π_{i=1}^{c} R_i

where Ri are the reliabilities of the individual components, i=1, 2, ..., c. Note that this expression
can be realized by the AND neuron implemented using the standard product (t-norm). All the
connections of this neuron are set to 0.
The reliability of the collection of "c" modules put in parallel is described in the form

R = 1 - Π_{i=1}^{c} (1 - R_i)

Note that for any t-norm t and its dual s-norm s we have

a s b = 1 - (1 - a) t (1 - b)

This expression can be modeled using an OR neuron whose s-norm is the probabilistic sum
a s b = a + b - ab; in other words, a s b = 1 - (1 - a)(1 - b).

The connections of the neuron are all set to 1. Bearing these two basic structures in mind, we can
easily model the systems of interest using AND and OR neurons with the required number of
inputs, see figures below. In all these cases, the connections of the OR neurons are set to 1. The
connections of the AND neuron are equal to 0. The inputs of the neurons are the reliabilities of
the corresponding modules.

[Figure: Reliability structures and their neural network realization; series connections map onto AND neurons and parallel connections onto OR neurons, composed according to the topology of each system]
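The two basic structures can be sketched as fuzzy neurons (product t-norm for the AND neuron, probabilistic sum for the OR neuron, with the connections fixed at 0 and 1 as described):

```python
def and_neuron_product(inputs):
    # AND neuron, product t-norm, all connections set to 0:
    # reduces to the series reliability R = prod(R_i)
    r = 1.0
    for x in inputs:
        r *= x
    return r

def or_neuron_probsum(inputs):
    # OR neuron, probabilistic-sum s-norm, all connections set to 1:
    # reduces to the parallel reliability R = 1 - prod(1 - R_i)
    r = 0.0
    for x in inputs:
        r = r + x - r * x
    return r

series = and_neuron_product([0.9, 0.8])    # about 0.72
parallel = or_neuron_probsum([0.9, 0.8])   # about 0.98
```

Nesting these two functions according to the topology of a given system reproduces its overall reliability.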

5. The fuzzy controller can be represented in the form of a network with a single hidden layer
consisting of AND neurons and the output layer built with the use of OR neurons. The AND
neurons are used to construct the conditions of the rules. The OR neurons combine the conditions
that exhibit the same conclusion.
There are two sources of available knowledge that can be used in the design of the network:
• qualitative rules: they constitute a backbone of the neural network; here the connections of
this part of the network are either equal to 0 (AND neurons) or 1 (OR neurons)
• experimental data: these are used toward further refinement of the connections of the
network; in other words, experimental data help calibrate the entire network
The proposed structure of the network is included in the figure below. Note that we have added a
number of the neurons not used directly to represent the initial rules. The intent is to assure
enough structural and parametric flexibility when it comes to a detailed learning algorithm. The
initial connections of these extra neurons are set close to 1 and 0 (AND and OR neurons,
respectively). Such a configuration makes these neurons quite inactive (redundant) at least at the
beginning of the training procedure. The role of these neurons could become more profound over
the course of learning (meaning that they are indeed required). Similarly, in order to retain the
qualitative rules originally downloaded onto the network, we may consider two values of
the learning rates:
- the first, substantially higher, should apply to the connections of all extra neurons; in this
way we make them more susceptible to changes
- the second, far lower than the previous one, is applied to the neurons implementing the
original rules; this helps us retain their character by exposing them to quite limited
modifications/updates.
[Figure: A neural representation of the fuzzy controller; the inputs E1, ..., E4 and DE1, DE2 feed a hidden layer of AND neurons whose outputs are combined by OR neurons producing U1, U2, U3, with additional AND neurons included to accommodate eventual conditions of the rules]


CHAPTER 12

Fuzzy Evolutionary Computation

1. The representation is straightforward: a chromosome consists of two integer numbers between
1 and 9; the first determines x whereas the second determines y. The chromosome can be depicted as below.

4 7

For well-behaved, bump-like fitness functions, as the one exemplified by the table, mutation alone
will do the job and crossover is not at all necessary. But, because in general crossover reduces the
dimensionality of the search space, it helps to find the best answer sooner (on average) than with
mutation alone. Note, however, that for fitness functions that are not well behaved we need both
to improve generations, and we may need additional strategies to exploit the search space more
efficiently (e.g., the diversity-principle strategy, in which it can be as good to be different as it is
to be fit).
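A minimal sketch of a mutation-only search over this chromosome (the fitness surface is hypothetical, a single bump peaking at (4, 7), since the actual table is in the textbook; mutation is caricatured here as a deterministic scan of the ±1 neighbors):

```python
def fitness(x, y):
    # hypothetical unimodal ("bump-like") fitness with its peak at (4, 7)
    return -((x - 4) ** 2 + (y - 7) ** 2)

best = (1, 1)
improved = True
while improved:
    improved = False
    for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]:   # +/-1 mutations
        x, y = best[0] + dx, best[1] + dy
        if 1 <= x <= 9 and 1 <= y <= 9 and fitness(x, y) > fitness(*best):
            best, improved = (x, y), True
print(best)   # (4, 7)
```

On a unimodal surface this climb cannot get stuck, which is exactly why crossover is dispensable in that case.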

2. Based on the problem characteristics, binary coding could improve the behavior of the basic
algorithm. However, we would need four (4) bits to represent each integer number; for instance,
the chromosome above would look like

0 1 0 0 0 1 1 1

Note that we must modify the basic algorithm for the GA to perform well with this representation,
because there are bit combinations which do not encode numbers between 1 and 9. In addition,
similarly as in problem 1, mutation alone would be enough provided that: we start with a single
chromosome; no chromosome is permitted to appear more than once in each generation; a
maximum number of chromosomes survives from one generation to the next; each survivor is a
candidate for survival into the next generation along with any new chromosomes produced; the
chromosome with the highest fitness survives to the next generation; and the remaining survivors
are selected at random.
3. This structure of rules is a form of the TS (Takagi-Sugeno) model. Here is a coding scheme
for this class of rules: each membership function is encoded through its center, left base, and
right base, and each rule consequent through its parameters w0, w1, w2, for instance

center   left base   right base   |   w0       w1       w2
001011   111001      101001       |   001011   111001   101001
MFCi (membership function chromosome) | RPCi (rule consequent parameters chromosome)

A rule base for this kind of representation would look like the concatenation

MFC1 MFC2 ... MFCm RPC1 ... RPCmn

There may be many appropriate fitness functions. One is the following:

Fitness(t) = a(tmax - t) + b   if the pole is balanced
Fitness(t) = p                 if t = tmax
Fitness(t) = ct                if the pendulum fell over

where a, b, c and p are constants, t is the time, and tmax is the expiration time. The condition used to
decide when the pendulum falls may vary. We may consider |θ| ≥ θmax for this purpose.
4. Using the conventions stated on page 319 of the textbook, we have the following coding (you
may devise a different coding depending on which connection you assume as a positive or negative
connection, but note that the 'no connection' entries should obviously be the same):

S = [ 00 00 01 10 00 01
      00 00 10 01 00 00
      00 00 00 00 01 00
      00 00 00 00 01 10
      00 00 00 00 00 10
      00 00 00 00 00 00 ]

If we use more than two neuron types, we must reconsider the coding of the T vector. For
instance, if we have four types of neurons we need at least two bits to encode them. Thus, in
place of just a binary digit we would have a two digits binary number to describe the type of each
neuron of the network. However, the connections would be encoded similarly as above.
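A decoding sketch (the 2-bit convention is assumed here as 00 = no connection, 01 = positive, 10 = negative; the textbook's page 319 fixes the actual convention):

```python
# assumed 2-bit codes; the convention on p. 319 of the textbook is authoritative
CODE = {"00": 0, "01": +1, "10": -1}

def decode(rows):
    # turn the string-coded connectivity matrix into signed connection weights
    return [[CODE[c] for c in row.split()] for row in rows]

S = decode(["00 00 01 10 00 01",
            "00 00 10 01 00 00",
            "00 00 00 00 01 00",
            "00 00 00 00 01 10",
            "00 00 00 00 00 10",
            "00 00 00 00 00 00"])
```

With four neuron types, only the separate T vector would grow to two bits per neuron; this connectivity decoding stays the same.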

5. There is not much difference between the stratified scheme shown in figure 12.17 and one
which uses evolution strategies, if you think of evolution strategies in place of genetic
algorithms. Of course, the detailed algorithms would differ, as shown in the text.

6. It is obvious that the individual genes of a fuzzy rule can be arranged in any order. For
instance, the rule given as an example in the problem statement could be encoded either as

1 2 2 3 3 1   or   2 3 3 1 2 3

because both have the same meaning. This facilitates the derivation of rule bases to solve many
problems, e.g., in autonomous navigation, process control, and decision making. A good point
about this representation is that it enables the formation of good building blocks according to the
schema theorem (see Holland (1975) and Koza (1993) to review the building-block and schema-theorem
concepts). Nevertheless, note that any combination of genes containing at least one
input and one output variable represents a meaningful fuzzy rule. However, note also that when
no output variable is present in the chromosome, the rule becomes meaningless. We may avoid
this drawback by, e.g., inserting a randomly chosen term into the chromosome instead of
disregarding such a rule. In addition, if no input variable is present, the rule has no antecedent and
may become active permanently regardless of the input state. To avoid such a situation, we can
proceed similarly as above by adding a new randomly chosen input variable into the chromosome.
CHAPTER 13

Fuzzy Modeling

1. In the discrete case, the probabilities as well as conditional probabilities are represented in the
form of vectors and matrices. The operations used therein concern standard matrix multiplication
(in other words, Σ-product composition). When it comes to encoding, we distinguish two
essential cases:
- discrete (pointwise) input. Here the probability vector assumes the form [0 0 ... 0 1 0 ... 0],
where the nonzero entry refers to the current input associated with the i-th position of the
probability vector
- nonpointwise input. In this case there could be several entries of the probability vector that are
nonzero. Say, we may have [0 0 0 1/p 1/p ... 1/p 0 0 0]. Note that we have to adhere to the
principles of probability, meaning that all entries must sum up to 1 (this is not the case in fuzzy
models).
The decoding mechanisms are those well known in probability calculus. Given a probability
distribution function (pdf), we convert it into a single numeric entity by considering the
• mean (average)
• median
• mode, etc.
These options are well documented and discussed in depth in any comprehensive textbook on
probabilistic analysis.

2. The essence of this problem is to start from the original string of nonterminal symbols, say
aaabbb, and apply production rules from the set of rules coming with the grammar G to reach
the terminal symbol(s). First, there could be more than a single derivation of the same string.
Second, as each derivation comes with its own confidence factor determined through a t-norm
over the production rules used, we take a maximum over all possible derivations, namely

conf(string) = max_{i ∈ I} der_i

where der_i is the confidence of the i-th derivation of the string and I denotes the collection of
possible derivations.

[Figure: Computing a level of confidence associated with the derivation of the string; the original string abccddeezzqa is reduced to the terminal symbol σ through the production rules p1, ..., pc of the grammar]
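A sketch of the confidence computation (derivations are represented simply as lists of the confidence factors of the production rules they use; min is taken as the t-norm, and the numeric values are hypothetical):

```python
def derivation_conf(rule_confidences, tnorm=min):
    # confidence of one derivation: t-norm over its production-rule confidences
    conf = 1.0
    for c in rule_confidences:
        conf = tnorm(conf, c)
    return conf

def string_conf(derivations):
    # maximum over all derivations of the same string
    return max(derivation_conf(d) for d in derivations)

# two hypothetical derivations of the same string
print(string_conf([[0.9, 0.6, 0.8], [0.7, 0.7]]))   # 0.7
```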

3. The fuzzy sets of condition need to be distributed in such a way that they embrace the groups
of experimental data. The local models are constructed around these focal points. Note that the
type of the global model we obtain is piecewise-linear. When it comes to polynomial
approximation, we get a single (global) model. Through experimentation, we can observe that
even though the polynomial model could be highly nonlinear, we may encounter some “rippling”
effect, especially for higher-order polynomials.

4. The rule-based model of the form

- if x is Ai then y = ai(x - mi) + gi

orients toward some selected (seed) points of the data, namely the modal value of Ai, equal to mi
(which produces the highest activation level), at which the output equals gi.
Once we decide upon the form of combination of the local models, we can easily derive the
parameters of the local models using any gradient-based algorithm.
Alluding to the previous exercise, we can pick such points around which we get a high level
of concentration of experimental data. As a matter of fact, localizing such focal elements of the
model is easy in the two-dimensional case. In the multivariable case, we need to resort to some
clustering algorithms.

5. The approximation capabilities increase once we increase the number of local models by
selecting a significant number of seed points (in an extreme situation we may think of almost
any point in the data set to be used as a local model). Under such circumstances, the
generalization abilities could be almost nonexistent. If so, we have to strike a sound balance
between the number of local models and the overall generalization abilities. The optimal mixture
of approximation - generalization capabilities can be achieved through some experimentation.

6. The derivation of the gradient-based learning scheme is straightforward.

(a) the initial values of the parameters of the model can be established by having a closer look at
the meaning of the entries of the model’s equations. The Gaussian membership functions are
located around mi and have a spread equal to σi. These two values can be estimated from the
distribution of the data. As the linear local models are described by a free-form linear function
(the linear models in the previous problem are more confined, as they pass through a given
point), there are no detailed guidelines as to the initial values of these parameters, and they could
be determined through some optimization.
(b) There are two distinct families of parameters. The first class concerns the receptive (Gaussian) fields.
The other deals with the linear functions. As they cope with different aspects of the overall model,
they exhibit different levels of plasticity:
• a lower level of plasticity comes with the receptive fields. They tend to establish a more general
look at the data. Subsequently, they tend to be more "stable" and less plastic and adjustable
(low values of the learning rate)
• higher level of plasticity is encountered at the linear part of the fuzzy model. These
parameters can be adjusted more vigorously (higher values of the learning rate)
These two levels of plasticity translate into the two different values of the learning rates applied
within the learning process.
(c) The regularization component helps make the linguistic terms distinct; this retains their
semantics. It appears in an additive form in the overall performance index, which could be viewed
as a weighted sum of the two contributing factors. To stress the semantics of the linguistic terms,
we require that they do not overlap too much. In the simplest way, the level of overlap can be
quantified via the use of a possibility measure, say Poss(Ai, Ai+1), where these two fuzzy sets are just
linguistic terms to be optimized. The regularization factor can embrace the possibility values
computed for all the fuzzy sets.
CHAPTER 14

Methodology

4. Computationally speaking, the fuzzy controller of this problem has the same structure as a
rule-based system depicted in figure 10.11, page 244 of the textbook. The details are as shown
below.

[Figure: The rules (A1, B1, C1) and (A2, B2, C2) fire to degrees 2/3 and 1/3, respectively; the clipped consequents are aggregated into an output fuzzy set over the discrete universe U = {1, ..., 10} with membership 2/3 at u = 3, 4, 5 and 1/3 at u = 2, 6, 7, 8]
From the figure above we see that, assuming the universe U discrete to simplify calculations,
if we use the mean-of-maxima the control is u = (3 + 4 + 5)/3 = 4, and if we use the
center-of-gravity we get

u = [2(1/3) + 3(2/3) + 4(2/3) + 5(2/3) + 6(1/3) + 7(1/3) + 8(1/3)] / [1/3 + 2/3 + 2/3 + 2/3 + 1/3 + 1/3 + 1/3] = 4.7

Note: the numerical data was taken from Fuzzy logic controllers by H. Berenji in An Introduction
to Fuzzy Logic Applications in Intelligent Systems, R. Yager & L. Zadeh (eds), Kluwer, 1992.
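Both defuzzifiers can be checked with a short Python sketch (the aggregated membership values 1/3 and 2/3 over U = {2, ..., 8} are read off the computation above):

```python
def mean_of_maxima(universe, mu):
    # average of the points where the membership reaches its maximum
    peak = max(mu)
    maxima = [u for u, m in zip(universe, mu) if m == peak]
    return sum(maxima) / len(maxima)

def center_of_gravity(universe, mu):
    # membership-weighted average of the universe
    return sum(u * m for u, m in zip(universe, mu)) / sum(mu)

U = [2, 3, 4, 5, 6, 7, 8]
mu = [1/3, 2/3, 2/3, 2/3, 1/3, 1/3, 1/3]   # aggregated output of the two rules

print(mean_of_maxima(U, mu))      # 4.0
print(center_of_gravity(U, mu))   # about 4.7
```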

6. See the figure depicted below.

[Figure: Contours of the objective function in the (x1, x2) plane together with the constraint line x1 + x2 = 1 (axis marks at 0.5 and 1.0); the optimal solution is marked on the constraint line]

7. The optimal solution can be obtained via e.g. the simplex method. See Luenberger (1973) for
the details of the simplex algorithm. It can also be trivially solved by any linear optimization
routine. The optimal solution is x1 = 1/5, x2 = 0, x3 = 8/5, the value of the objective function being
27/5. You may use MATLAB Optimization Toolbox.

8. We could proceed as follows, using the original Bellman and Zadeh approach. Thus, we may
define F(x) = (f(x) - I)/(S - I), but note that while I = 0, S is unbounded. But since we know that x
should be around 1, with around being defined by the fuzzy set A(0.5, 1, 2), we may (subjectively)
assume the value of f(x) at x = 2 to define S = 4. Thus F(x) = x/4. The figure below illustrates the solution.

[Figure: The membership function A (triangular, peaking at 1.0) and F plotted over x; their intersection marks the optimal solution x∗ between 1.0 and 2.0]

9. We must proceed as discussed in section 14.3.3.2, that is, assuming that

3~ = (2, 3, 4), 4~ = (2.5, 4, 5.5), 18~ = (16, 18, 19), 2~ = (1, 2, 3), 1~ = (0.5, 1, 2), 7~ = (6, 7, 9), t1~ = (2.5, 3, 3.5)

and t2~ = (0.5, 1, 1.5), we obtain the following auxiliary problems (see page 393 of the textbook):

maximize 5x1 + 6x2


subject to 3x1+ 4x2 ≤ 18 + 3(1 - α)
2x1+ x2 ≤ 7 + (1 - α)
x1 , x2 ≥ 0, α ∈ (0,1].

maximize 5x1 + 6x2


subject to 4x1+ 5.5x2 ≤ 16 + 2.5(1 - α)
3x1+ 2 x2 ≤ 6 + 0.5 (1 - α)
x1 , x2 ≥ 0, α ∈ (0,1].

Solving the two standard linear optimization problems we get:

Problem 1: x1 = 2 + 0.2(1 - α); x2 = 3 + 0.6(1 - α); objective function = 28 + 4.6(1 - α)


Problem 2: x1 = 0.11 - 0.26(1 - α); x2 = 2.82 + 0.64(1 - α); objective function = 17.53 + 2.56(1 -
α).
The graphical solutions are easily obtained after we guess appropriate values of α. See figure
14.16. Note that we can also easily plot the optimal solutions as a function of α.
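For any fixed α, each auxiliary problem is an ordinary LP. A small pure-Python sketch (vertex enumeration, adequate for two variables) solves Problem 1 at α = 1; the result x1 = 2, x2 = 3 with objective 28 agrees with the parametric solution above:

```python
from itertools import combinations

def solve_lp_2d(c, A, b):
    # maximize c.x subject to A x <= b and x >= 0, in two variables:
    # enumerate pairwise intersections of the constraint boundaries
    # (including the axes x1 = 0 and x2 = 0) and keep the best feasible vertex
    rows = A + [[1.0, 0.0], [0.0, 1.0]]
    rhs = b + [0.0, 0.0]
    best_val, best_x = None, None
    for i, j in combinations(range(len(rows)), 2):
        (a11, a12), (a21, a22) = rows[i], rows[j]
        det = a11 * a22 - a12 * a21
        if abs(det) < 1e-12:
            continue                      # parallel boundaries
        x1 = (rhs[i] * a22 - a12 * rhs[j]) / det
        x2 = (a11 * rhs[j] - rhs[i] * a21) / det
        if x1 < -1e-9 or x2 < -1e-9:
            continue                      # violates nonnegativity
        if all(r[0] * x1 + r[1] * x2 <= bb + 1e-9 for r, bb in zip(A, b)):
            val = c[0] * x1 + c[1] * x2
            if best_val is None or val > best_val:
                best_val, best_x = val, (x1, x2)
    return best_val, best_x

# Problem 1 at alpha = 1: maximize 5x1 + 6x2 s.t. 3x1 + 4x2 <= 18, 2x1 + x2 <= 7
val, x = solve_lp_2d([5.0, 6.0], [[3.0, 4.0], [2.0, 1.0]], [18.0, 7.0])
print(x, val)   # (2.0, 3.0) 28.0
```

Re-running it with the right-hand sides 18 + 3(1 - α) and 7 + (1 - α) for a few values of α traces out the parametric solution.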

10. Clearly the answer is no! Just assume any nonlinear membership functions to model the
optimization problem data, e.g., Gaussian fuzzy coefficients, S-shaped functions for the tolerances in
the inequality constraints and objective functions, etc.
