IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 5, NO. 4, NOVEMBER 1997

Using Fuzzy Partitions to Create Fuzzy Systems from Input–Output Data and Set the Initial Weights in a Fuzzy Neural Network

Yinghua Lin, George A. Cunningham III, and Stephen V. Coggeshall

Abstract—We create a set of fuzzy rules to model a system from input–output data by dividing the input space into a set of subspaces using fuzzy partitions. We create a fuzzy rule for each subspace as the input space is being divided. These rules are combined to produce a fuzzy rule-based model from the input–output data. If more accuracy is required, we use the fuzzy rule-based model to determine the structure and set the initial weights in a fuzzy neural network. This network typically trains in a few hundred iterations. Our method is simple, easy, and reliable, and it has worked well when modeling large “real world” systems.

Index Terms—Fuzzy control, fuzzy systems, neural networks.

Manuscript received July 8, 1994; revised October 21, 1996.
Y. Lin is with the Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545 USA.
G. A. Cunningham III is with the Department of Electrical Engineering, New Mexico Institute of Mining and Technology, Socorro, NM 87801 USA.
S. V. Coggeshall is with the Applied Theoretical Physics Division, Los Alamos National Laboratory, Los Alamos, NM 87545 USA.
Publisher Item Identifier S 1063-6706(97)07505-X.

I. INTRODUCTION

OUR objective is to model a system from input–output data. We are particularly interested in methods to rapidly develop models for large nonlinear systems with hundreds of possible inputs. The ubiquitous backpropagation neural net suffers from well-known problems of slow convergence and local minima as the number of inputs is increased [32]. Crisp methods such as binary decision trees [7], [23], [29] are fast, but the results have not been accurate enough to be useful in our work. Radial basis functions [6], [20], [31] are excellent for problems with a small number of inputs, but the computational complexity is overwhelming for a large number of inputs. Other fuzzy approaches to this problem (such as [25] and [28]) also become computationally intractable as the number of inputs grows.

We use a set of input–output training data to dynamically partition the input space into a set of subspaces or bins. As we divide the space, we create a fuzzy rule for each bin. These fuzzy rules are combined to create a fuzzy rule-based model for the input–output data. Our partition is created like a binary tree. We start with the entire input space and cut it into two subspaces. The subspaces are recursively divided until some restrictions are met. The partition is used to determine a set of fuzzy rules that model the relationship between the input–output data. The partition was motivated by a binary decision tree, but it differs from a decision tree because the partition is not used to make decisions in the testing stage.

For more accuracy, the fuzzy rule-based system obtained from the partition can be used to determine the structure and set the initial weights of a fuzzy neural network. The fuzzy neural network is trained to obtain better performance than is possible from the initial fuzzy rule-based system.

Fuzzy neural networks are well known and widely used [2], [9]–[13], [28], and some work has been published [1], [8], [21], [30] on obtaining the proper network structure and initial weights to reduce training time. In our experience, the cited work does not scale well to problems with a large number of inputs. Another method for deriving a set of fuzzy rules from a partition of the input data and then using these rules to determine the structure and set the initial weights of a fuzzy neural network was introduced by Sun in [27]. However, both the partition tree and the fuzzy neural network introduced in our paper are different from those in [27]. As discussed below, the method outlined in this paper is applicable to problems where [27] is not suitable.

II. PARTITIONING THE INPUT SPACE

Suppose we have a set of input–output data from a system with $n$ inputs, $x_1, x_2, \ldots, x_n$, and one output $y$. We want to model this data with a function $y = f(x_1, \ldots, x_n)$ over a compact set. We divide the input space into a set of subspaces or bins and create a fuzzy rule for each subspace. The fuzzy rules from all the subspaces yield a fuzzy rule-based system to model the original input–output data. The fuzzy rule for bin $k$ is

IF $x_1$ is $A_1^k$ and $x_2$ is $A_2^k$ and $\cdots$ and $x_n$ is $A_n^k$ THEN $y$ is $b^k$ $\qquad$ (1)

where $A_1^k, A_2^k, \ldots, A_n^k$ are input fuzzy membership functions for rule $k$ and $b^k$ is the output label of rule $k$. The input fuzzy membership functions are determined by the boundaries for each bin as the input space is divided. The output label $b^k$ is determined by a weighted average of the output data corresponding to the data in bin $k$ and those in bins adjacent to bin $k$.

We begin by viewing the entire compact set defined by the range of the inputs as one bin. This bin is the root of our partition. We divide the initial bin into two smaller bins.

Fig. 1. Six possible cuts for a 2-D input space.

Then, we divide these smaller bins recursively until: 1) the difference between the two generated bins can be ignored or 2) the number of training data points within a bin reaches a minimum number. As we divide, we create a fuzzy rule for each bin in the system. Thus, we dynamically create a fuzzy rule-based system to represent the input–output data as we do the fuzzy partition.

The minimum and maximum values of each input variable $x_i$ define an interval $[\alpha_i, \beta_i]$. Our goal is to divide or cut the interval at the place that yields the maximum difference between the two fuzzy rule output labels $b^1$ and $b^2$. Our experimental work has shown that 35%, 50%, and 65% of the length of the interval are good positions for trial cuts in many cases. We cut each interval at these trial locations, calculate the difference between the two fuzzy rule output labels for each cut, and choose the cut that yields the maximum difference. Of course, the trial locations of 35%, 50%, and 65% of the interval are not the best locations for all applications. To find the best or even a better cut location is still an open problem. Information gain [22] and the Gini index of diversity [4] are two common tests used in classical decision trees to test a division point, but they do not address choosing the best place to divide. In [27], Sun uses the summation of a density measure and a typicality measure as the objective function for dividing the space. The main problem with using this summation as the objective function is that both measures are measures for unsupervised clustering. For classification problems, a good cluster may represent different classes, and different clusters may contain one class.

Since we do three cuts per dimension, a problem with $n$ dimensions requires $3n$ different cuts. We do all of the cuts and then choose the best one. This idea is illustrated in Fig. 1 for a two-dimensional (2-D) space with six cuts (three per input dimension), with each cut dividing the original space into two bins. For each set of two bins, we compute the output labels $b^1$ and $b^2$ and choose the one cut that gives the maximum difference $|b^1 - b^2|$ in output labels. Note that we never divide the original space into four bins by cutting both input dimensions at once.
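To make the cut-selection step concrete, here is a minimal sketch (ours, not the authors' code): it scores the $3n$ trial cuts on one bin and returns the best. For brevity it approximates each output label by the crisp mean of the outputs in each half; the paper instead uses the fuzzy-weighted average of (2), introduced in Section II-A below.

```python
import numpy as np

TRIAL_FRACTIONS = (0.35, 0.50, 0.65)  # trial cut positions from the paper

def best_cut(X, y, lo, hi):
    """Score the 3n trial cuts on one bin and return the best one.

    X: (N, n) inputs inside the bin; y: (N,) outputs;
    lo, hi: (n,) arrays with the bin boundaries.
    Returns (|b1 - b2|, input index, cut position) or None.
    Labels are crisp bin means here; the paper uses eq. (2).
    """
    best = None
    for i in range(X.shape[1]):
        for frac in TRIAL_FRACTIONS:
            cut = lo[i] + frac * (hi[i] - lo[i])
            left, right = y[X[:, i] <= cut], y[X[:, i] > cut]
            if len(left) == 0 or len(right) == 0:
                continue  # a cut that empties one side is useless
            diff = abs(left.mean() - right.mean())
            if best is None or diff > best[0]:
                best = (diff, i, cut)
    return best
```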

A. The Input Fuzzy Membership Functions for Each Bin

We need to define the input fuzzy membership functions used in (1) for each rule or bin. We use trapezoidal shapes for the input fuzzy membership functions, and we define these membership functions by the boundaries of the bin. We label the width of the top of the trapezoid by $u$ and the width of the bottom of the trapezoid by $v$, as illustrated in Fig. 2(b).

Suppose $[\alpha_i, \beta_i]$ is the original input interval for input $x_i$. We define a bin by an interval $[\alpha_i^k, \beta_i^k]$. The width of the bin is given by $d = \beta_i^k - \alpha_i^k$. We choose $u$ and $v$ so that $u \le d \le v$. If $u = v = d$, then the membership function is rectangular, and we have a crisp set. In this case we have an ordinary binary partition, as the partition is the combination of all bins. If $u < d < v$, then we have a fuzzy set and we say that we have a fuzzy partition. We define the middle of the bin by $m = (\alpha_i^k + \beta_i^k)/2$. We center the trapezoid over the middle of the bin, as illustrated in Fig. 2(b). If the edge of the bin coincides with the end of the original input interval, that is, if $\alpha_i^k = \alpha_i$ or $\beta_i^k = \beta_i$, we modify the membership function and make it rectangular at the edge of the original interval, as shown in Fig. 2(a) and (c), respectively. If the bin is equal to the original input interval, that is, $[\alpha_i^k, \beta_i^k] = [\alpha_i, \beta_i]$, then the fuzzy membership function for that input equals one in the entire interval. From our experience, we initially choose $u$ and $v$ as fixed fractions of the bin width $d$. As with the trial locations to cut, the best choice of $u$ and $v$ for any given problem is an open research question.

Fig. 2. The shapes of the fuzzy membership functions for the inputs.
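The trapezoid construction above can be sketched as follows, under stated assumptions: $u$ and $v$ are taken as fixed fractions of the bin width $d$ (the paper leaves the best choice open, so the defaults here are purely illustrative), and the shape is flattened to a rectangular edge where the bin touches the original interval. All names are ours.

```python
def trapezoid_membership(x, bin_lo, bin_hi, alpha, beta,
                         u_frac=0.8, v_frac=1.2):
    """Membership of x in the bin [bin_lo, bin_hi] of the original
    input interval [alpha, beta]. Top width u = u_frac * d, bottom
    width v = v_frac * d, with d the bin width (so u <= d <= v)."""
    d = bin_hi - bin_lo
    m = 0.5 * (bin_lo + bin_hi)              # middle of the bin
    top_lo, top_hi = m - u_frac * d / 2, m + u_frac * d / 2
    bot_lo, bot_hi = m - v_frac * d / 2, m + v_frac * d / 2
    # Rectangular at an edge that coincides with the input interval,
    # as in Fig. 2(a) and (c).
    if bin_lo <= alpha:
        top_lo = bot_lo = alpha
    if bin_hi >= beta:
        top_hi = bot_hi = beta
    if top_lo <= x <= top_hi:
        return 1.0
    if bot_lo < x < top_lo:                  # rising edge
        return (x - bot_lo) / (top_lo - bot_lo)
    if top_hi < x < bot_hi:                  # falling edge
        return (bot_hi - x) / (bot_hi - top_hi)
    return 0.0
```

Setting `u_frac = v_frac = 1` recovers the crisp rectangular partition described above.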

When we have the fuzzy membership functions for the input variables, we define the output label of the fuzzy rule corresponding to bin $k$, $b^k$, as

$$b^k = \frac{\sum_{j=1}^{N} \mu^k(\mathbf{x}_j)\, y_j}{\sum_{j=1}^{N} \mu^k(\mathbf{x}_j)} \qquad (2)$$

In this equation, $(\mathbf{x}_j, y_j)$, $j = 1, \ldots, N$, are the training data points, $\mu^k(\mathbf{x}_j)$ is the degree of the input fuzzy membership function of the $j$th training vector for the $k$th bin, and $\mu_i^k(x_{ij})$ is the input fuzzy membership function of $x_i$ for the $j$th training vector and $k$th bin and is decided as in Fig. 2.

For each cut, (2) gives the output labels for the two resulting bins. We compute the difference of these values for each of our trial cuts. The details of this idea are shown through a simple example in the Appendix.
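Equation (2) translates directly into a few lines. One assumption is worth flagging: we combine the per-input memberships of a training vector by product to get its degree in a bin; this copy of the paper does not spell the operator out. `memberships` stands in for the per-input trapezoids of Section II-A.

```python
import numpy as np

def output_label(X, y, memberships):
    """Output label b^k of one bin per eq. (2): the fuzzy-weighted
    average of the training outputs.

    memberships: one callable per input dimension, mapping a value to
    its membership in this bin (e.g. trapezoids from Sec. II-A).
    """
    mu = np.array([
        np.prod([m(xi) for m, xi in zip(memberships, row)])
        for row in X
    ])  # degree of each training vector in the bin (product: our assumption)
    return float(np.sum(mu * y) / np.sum(mu))
```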

B. The Criteria to Stop Partitioning

We divide a bin $\Omega$ into two smaller bins $\Omega^1$ and $\Omega^2$ with output labels $b^1$ and $b^2$. If $b^1 \approx b^2$, then the two fuzzy rules for $\Omega^1$ and $\Omega^2$ can be combined into one rule. This means that it was not necessary to divide $\Omega$. We choose a value $\mathrm{MIN}_{\mathrm{dif}}$ for the minimum allowable difference between $b^1$ and $b^2$. If $|b^1 - b^2| < \mathrm{MIN}_{\mathrm{dif}}$, then we stop dividing $\Omega$. The value for $\mathrm{MIN}_{\mathrm{dif}}$ can be defined as a percentage of the standard deviation of the output training data sequence. In [27], Sun uses the mean-square error of the grids as the criterion for stopping the partition process. The main weak point of that criterion is that the number of grids increases exponentially with respect to the number of dimensions. Another way to evaluate a partition is to check the final mean-square error of the system [26], but this approach is not efficient. As we discussed before, we also set a minimum number $\mathrm{MIN}_n$ of training data points in each bin. When the number of training data points in a bin is less than or equal to $\mathrm{MIN}_n$, we stop dividing that bin. We emphasize that our methods use the data to partition the space and, hence, generate the fuzzy rules to model the system. At each step we are dealing with only one cut and the resulting two bins. This makes our method very fast and able to handle problems with a large number of inputs.
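Putting the pieces together, the recursive partition with both stopping tests might look like the skeleton below. It reuses the `best_cut` sketch from earlier (so labels are again the crisp simplification, not (2)), and `MIN_DIF` and `MIN_N` are the two thresholds just described; this is an illustrative outline, not the authors' implementation.

```python
def partition(X, y, lo, hi, bins, MIN_DIF=0.6, MIN_N=10):
    """Recursively split the bin [lo, hi], collecting the leaf bins.

    Stops when the bin holds at most MIN_N points or the best cut's
    label difference falls below MIN_DIF.
    """
    if len(y) <= MIN_N:
        bins.append((lo, hi))
        return
    found = best_cut(X, y, lo, hi)      # (|b1 - b2|, input, position)
    if found is None or found[0] < MIN_DIF:
        bins.append((lo, hi))           # labels too close: keep one rule
        return
    _, i, cut = found
    left = X[:, i] <= cut
    hi_left, lo_right = hi.copy(), lo.copy()
    hi_left[i] = cut                    # left child bin
    lo_right[i] = cut                   # right child bin
    partition(X[left], y[left], lo, hi_left, bins, MIN_DIF, MIN_N)
    partition(X[~left], y[~left], lo_right, hi, bins, MIN_DIF, MIN_N)
```

Called as `partition(X, y, X.min(axis=0), X.max(axis=0), bins=[])`, this collects the leaf bins from which the fuzzy rules are built.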
C. Defuzzification

For the test phase, we use centroid defuzzification

$$y = \frac{\sum_{k=1}^{K} b^k \mu^k(\mathbf{x})}{\sum_{k=1}^{K} \mu^k(\mathbf{x})} \qquad (3)$$

where $K$ is the number of subspaces of the input space, which is the number of fuzzy rules of the fuzzy system, and $\mu^k(\mathbf{x})$ is the output fuzzy membership function of rule $k$ for the given test data. Here, $\mu_i^k(x_i)$ is the fuzzy membership function of the $i$th input and $k$th bin for the testing vector, and $b^k$ is computed using (2).
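A sketch of the centroid defuzzification (3), again assuming product combination of the per-input memberships; `rules` pairs each bin's membership functions with its label $b^k$ from (2).

```python
import numpy as np

def defuzzify(x, rules):
    """Centroid defuzzification of eq. (3) for one test vector x.

    rules: list of (memberships, b) pairs, one per bin; memberships is
    a list of per-input callables and b the label from eq. (2).
    """
    num = den = 0.0
    for memberships, b in rules:
        mu = np.prod([m(xi) for m, xi in zip(memberships, x)])
        num += b * mu                     # weighted by firing strength
        den += mu
    return num / den
```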
III. A SIMPLE EXAMPLE

We use a nonlinear system with two inputs $x_1$ and $x_2$ and a single output $y$, defined by

$$y = x_2 \sin(x_1) + x_1 \cos(x_2), \qquad 0 \le x_1, x_2 \le \pi \qquad (4)$$

for a tutorial example. Equation (4) is graphed in Fig. 3. We take 441 points from $[0, \pi] \times [0, \pi]$ and use (4) to obtain 441 input–output data. We set the fuzzy decision tree parameters $\mathrm{MIN}_{\mathrm{dif}} = 0.6$ and $\mathrm{MIN}_n = 10$, and the parameters for the input fuzzy membership functions $u$ and $v$ in fixed proportion to $d$, where $d$ is the width of a bin and $u$ and $v$ are the two parallel sides of our trapezoids. We make our trial cuts at 35%, 50%, and 65% of the interval. With these parameters, the fuzzy decision tree divides the input space into the nine bins shown in Fig. 4. At the same time, we obtain a nine-rule fuzzy system, as shown in Fig. 5. This approximates the original surface with a mean-square error of 0.16, as shown in Fig. 6. We can set $\mathrm{MIN}_{\mathrm{dif}}$ to smaller values and obtain a better mean-square error with an increase in the number of fuzzy rules. When we change $\mathrm{MIN}_{\mathrm{dif}}$ to 0.35, we obtain a fuzzy system with 16 rules, and the mean-square error reduces to 0.056.
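The training data for this example are easy to regenerate. The uniform 21 × 21 grid below is our guess at how the 441 points were drawn from $[0, \pi] \times [0, \pi]$ (441 = 21²); the paper only states the count.

```python
import numpy as np

# Hypothetical reconstruction: a 21 x 21 grid gives 441 points (21^2).
g = np.linspace(0.0, np.pi, 21)
g1, g2 = np.meshgrid(g, g)
X = np.column_stack([g1.ravel(), g2.ravel()])               # (441, 2)
y = X[:, 1] * np.sin(X[:, 0]) + X[:, 0] * np.cos(X[:, 1])   # eq. (4)
```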

Fig. 3. A graph of the nonlinear equation $y = x_2 \sin(x_1) + x_1 \cos(x_2)$, $0 \le x_1, x_2 \le \pi$.

Fig. 4. The nine subspaces of the input space derived from the fuzzy decision tree with $\mathrm{MIN}_{\mathrm{dif}} = 0.6$ and $\mathrm{MIN}_n = 10$.

IV. A FUZZY NEURAL NETWORK

A. The Architecture of the Fuzzy Neural Network

This fuzzy neural network (shown in Fig. 7) is introduced in [17]. It is similar to a four-layer (input, fuzzification, inference, and defuzzification) radial basis function network. The structure of the fuzzy neural network is decided by the number of inputs $n$ (the number of neurons in the input layer) and the number of fuzzy rules $K$ (the number of neurons in the inference layer) applied to the system. There are $nK$ neurons in the fuzzification layer. The first $n$ neurons (one per input variable) in the fuzzification layer incorporate the first rule, the second $n$ neurons incorporate the second rule, and so on. There is one neuron in the last layer. Our fuzzy neural network can be represented by the following three equations. The output of the fuzzification layer is

$$\mu_i^k(x_i) = \exp\left(-\left|\frac{x_i - a_i^k}{\sigma_i^k}\right|^{c_i^k}\right) \qquad (5)$$

where $a_i^k$ and $\sigma_i^k$ are weights and the $c_i^k$ are node parameters, as shown in the figure. The output of the inference layer is

$$\mu^k(\mathbf{x}) = \prod_{i=1}^{n} \mu_i^k(x_i) \qquad (6)$$

where the $\mu_i^k$ are from (5). The output of the fuzzy neural network is

$$y = \sum_{k=1}^{K} b^k \mu^k(\mathbf{x}). \qquad (7)$$

It has been proved that (7) can represent any continuous function over a compact set as closely as we desire [19].

Both (3) and (7) are fuzzy neural networks and both are universal. Equation (3) with trapezoids as input fuzzy membership functions is computationally simpler than (7) with generalized Gaussians. Equation (3) is normalized and (7) is not. From our experience, a normalized fuzzy neural network is better than one which is not normalized when we model a system without training. Equation (7) with generalized Gaussians as input fuzzy membership functions is differentiable, so it can be trained with backpropagation. A neural network which is not normalized trains much faster than one which is normalized. For all of these reasons, we use (3) in the partition stage and change to (7) when we use backpropagation to train the network.

B. Setting the Initial Structure of the Fuzzy Neural Network

The structure of the fuzzy neural network is decided by the number of inputs and the number of fuzzy rules. The significant inputs of a system can be decided using the partition tree discussed in this paper. If an input is cut several times during the partition, then it is important. On the other hand, if an input is never cut, then it is not important. However, for a system with hundreds of inputs the partition is very slow; the local minimum of the partition may not identify a set of important inputs. So for systems with hundreds of inputs we use a fuzzy curve and fuzzy surface technique [16] to identify the significant inputs. Here we suppose that the number of inputs is known. The number of rules is the number $K$ of subspaces in our partition. From (5)–(7), we need to set $b^k$, $a_i^k$, $\sigma_i^k$, and $c_i^k$ to initiate the fuzzy neural network.
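Under our reconstruction of (5)–(7), these parameters are exactly the arrays below, and the forward pass of the network is a few lines; the names $a$, $\sigma$ ($sigma$), $c$, and $b$ follow the notation used above and are our own labels.

```python
import numpy as np

def fnn_forward(x, a, sigma, c, b):
    """Forward pass per our reading of (5)-(7).

    x: (n,) input; a, sigma, c: (K, n) Gaussian centers, widths, and
    exponents, one row per rule; b: (K,) output labels.
    """
    mu_ik = np.exp(-np.abs((x - a) / sigma) ** c)  # (5) fuzzification
    mu_k = np.prod(mu_ik, axis=1)                  # (6) inference
    return float(np.sum(b * mu_k))                 # (7) unnormalized sum
```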
The initial weights are set in the following way. The output labels $b^k$ are the same as those computed by (2) during the partition. The initial $c_i^k$'s are set to two. The bin boundaries are used to set the initial $a_i^k$'s and $\sigma_i^k$'s. If a bin is in the middle of an input space, the center of the Gaussian membership function corresponds to the bin center, as illustrated in Fig. 8(b). If a bin is at the boundary of the input space, the Gaussian center of that input is placed at the boundary of the input, as shown in Fig. 8(a) and (c), respectively. The Gaussian values at the boundaries of bins, but not at the edge of the inputs, are set to 0.5. Once we have the initial fuzzy neural network, we train the weights and the parameters of the network by backpropagation.
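The initialization just described can be sketched as follows. We choose $\sigma$ so that, with $c = 2$, the Gaussian equals 0.5 at the bin boundary ($\sigma = h/\sqrt{\ln 2}$ for half-width $h$), and we clamp the center to the edge of the input interval for boundary bins as in Fig. 8(a) and (c). The exact formula the authors used is not given in this copy, so treat this as one consistent reading; the labels $b^k$ are carried over from (2) unchanged.

```python
import numpy as np

def init_rule(bin_lo, bin_hi, alpha, beta):
    """Initial a, sigma, c for one bin's rule (our reconstruction).

    bin_lo, bin_hi, alpha, beta: (n,) arrays of bin and input-interval
    boundaries. Centers sit on the bin middle, clamped to the interval
    edge for boundary bins; sigma makes the membership 0.5 at the bin
    boundary when c = 2.
    """
    a = 0.5 * (bin_lo + bin_hi)
    a = np.where(bin_lo <= alpha, alpha, a)    # left-edge bins, Fig. 8(a)
    a = np.where(bin_hi >= beta, beta, a)      # right-edge bins, Fig. 8(c)
    half = np.maximum(bin_hi - a, a - bin_lo)  # distance to far boundary
    sigma = half / np.sqrt(np.log(2.0))        # membership 0.5 at boundary
    c = np.full_like(a, 2.0)                   # initial exponents
    return a, sigma, c
```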

Fig. 5. The nine fuzzy rules representing the nonlinear function $y = x_2 \sin(x_1) + x_1 \cos(x_2)$, $0 \le x_1, x_2 \le \pi$.

Fig. 6. The output of the nine-rule fuzzy system approximating the nonlinear system $y = x_2 \sin(x_1) + x_1 \cos(x_2)$, $0 \le x_1, x_2 \le \pi$. The mean-square error is 0.16.

V. THE EXAMPLE CONTINUED

We continue with the nonlinear system defined by (4). We use the nine-rule fuzzy system with a mean-square error of 0.16 to initialize the fuzzy neural network. Since the nine-rule fuzzy system is normalized and the fuzzy neural network is not normalized, the initial mean-square error for the neural network is quite large. However, since the structure of the network reflects the data in the problem, it converges very rapidly using backpropagation. After 100 iterations, the mean-square error is 0.016. Fig. 9 shows the output from the nine-rule fuzzy neural network. The good result of Fig. 9 is mainly due to the training of the membership functions.
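As a rough stand-in for this training stage, the loop below tunes all four parameter groups by gradient descent on the mean-square error, using finite-difference gradients instead of hand-derived backpropagation to keep the sketch short. It assumes the `fnn_forward` sketch from Section IV and is not the authors' training code.

```python
import numpy as np

def mse(params, X, y):
    a, sigma, c, b = params
    preds = np.array([fnn_forward(x, a, sigma, c, b) for x in X])
    return float(np.mean((preds - y) ** 2))

def train(params, X, y, lr=0.05, iters=100, eps=1e-4):
    """Tune (a, sigma, c, b) of the unnormalized network (7) by
    gradient descent on the mean-square error. Finite differences
    stand in for the paper's backpropagation."""
    params = [p.astype(float).copy() for p in params]
    for _ in range(iters):
        grads = []
        for p in params:
            g = np.zeros_like(p)
            for idx, old in np.ndenumerate(p):
                p[idx] = old + eps
                up = mse(params, X, y)
                p[idx] = old - eps
                down = mse(params, X, y)
                p[idx] = old                 # restore before moving on
                g[idx] = (up - down) / (2 * eps)
            grads.append(g)
        for p, g in zip(params, grads):
            p -= lr * g
    return params
```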

VI. COMPARISONS
We test our method on some examples in the paper by Sugeno and Yasukawa [25]. The first example is a nonlinear equation defined as $y = (1 + x_1^{-2} + x_2^{-1.5})^2$. In this instance, two random inputs $x_3$ and $x_4$ were inserted to test input identification. When all four inputs are used in our fuzzy partition, only $x_1$ and $x_2$ are divided. The fuzzy partition yields a six-rule fuzzy system with a mean-square error of 0.351. We then set the initial weights of our fuzzy neural network with the fuzzy system from the fuzzy partition. After 400 training iterations, the fuzzy neural network yields a mean-square error of 0.005. Sugeno and Yasukawa [25] produced a two-input six-rule model with a performance measure of 0.010.

The second example is the Box and Jenkins gas furnace data taken originally from [3]. The process is a gas furnace with a single input (gas flow rate) $u(t)$ and a single output (CO2 concentration) $y(t)$. As in [25], we consider lagged values of the input and the output as input candidates. We obtain a 12-rule fuzzy system with a mean-square error of 0.581. Three of the input candidates are cut during the partition. After 800 training iterations, the resulting fuzzy neural network yields a mean-square error of 0.157. Sugeno and Yasukawa [25] developed a six-rule model using three of these variables as inputs, with a mean-square error of 0.190.

The third example is data on daily stock prices for stock A from [25]. It has ten input variables and 100 data points. The performance measure is not noted in [25]. We use 80 data points as training data and 20 data points as testing data. At training time, the partition algorithm divides the input space into 14 bins. The 14-rule fuzzy system from the partition is used to initialize our fuzzy neural network. The network training converges after 400 iterations. The results from both training and testing are shown in Fig. 10.

Fig. 7. The architecture of the fuzzy neural network.

VII. AN APPLICATION

We used our method to predict the performance in a financial industry credit problem [15]. There are about 130 inputs that may affect the output. We use fuzzy curves [16] to determine the 11 most important inputs. We use 10 000 records at training time and about 60 000 nonoverlapping records at testing time. The fuzzy decision tree divides the 11-dimensional input space into 35 bins. The resulting fuzzy neural network converges after 500 training iterations. The test results show that our method gives the best result of all the new nonlinear techniques applied, including radial basis functions [6], [20], [31], standard clustering algorithms [14], [24], and classical decision trees [5]. Verification by independent experts shows that the results obtained with our method are a significant improvement over the current industry standard.

VIII. CONCLUSION

We use a binary fuzzy partition to divide the input space into bins. We create one fuzzy rule for each bin, and we build the fuzzy rule-based model of the system as the space is divided. This fuzzy system can be used to initialize a fuzzy neural network. Because the initial weights of the fuzzy neural network are set by the original fuzzy rule-based system, which is built from the data, the neural network converges very rapidly. These techniques have proven to be useful tools for modeling large “real world” nonlinear systems.

Fig. 8. The initial Gaussian shapes of the fuzzy neural network, assuming that the original input interval is [0, 4].

Fig. 9. The output of the nine-rule fuzzy neural network approximating the nonlinear system $y = x_2 \sin(x_1) + x_1 \cos(x_2)$, $0 \le x_1, x_2 \le \pi$. The mean-square error is 0.016 after 100 iterations.

Fig. 11. Two fuzzy membership functions of $x_1$ for the simple example.

Fig. 12. Differences in the output labels for the six cuts in the simple example.

APPENDIX

Suppose we have a problem with two inputs $x_1$ and $x_2$ and one output $y$. Since this is a 2-D input space, we will have six cuts (three in each dimension), as illustrated in Fig. 1. Suppose we have five input–output data points. From the data points, we find the input intervals $[\alpha_1, \beta_1]$ and $[\alpha_2, \beta_2]$. We first cut $x_1$ at 35% of its interval. This divides the space into two bins $\Omega^1$ and $\Omega^2$. We compute the output labels $b^1$ and $b^2$ and the difference $|b^1 - b^2|$. This yields the two fuzzy membership functions for $x_1$ shown in Fig. 11 and one fuzzy membership function for $x_2$ which equals one in the entire interval of $x_2$, because $x_2$ is not cut until the fourth trial. For the cut at 35%, $b^1$ and $b^2$ are obtained as shown in the equation at the top of the page. In the same way, we obtain $b^1$ and $b^2$ for the other five cuts. Fig. 12 shows the positions of the six cuts and the differences of the output labels of the six cuts. We take the cut that yields the maximum difference in the output labels to divide the space. We then proceed recursively, treating each bin as if it were the entire space.

Fig. 10. Fuzzy rule-based system performance on stock price data: (——) actual price and (· · ·) model output.
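The five data points of this appendix are not recoverable from this copy, so the snippet below reruns the same experiment on made-up points: it scores all six trial cuts of a 2-D bin and reports the winner, in the spirit of Figs. 11 and 12. It reuses the `best_cut` sketch from Section II.

```python
import numpy as np

# Hypothetical stand-ins for the appendix's five input-output points.
X = np.array([[0.5, 1.0], [1.5, 2.5], [2.0, 0.5], [3.0, 2.0], [3.5, 3.0]])
y = np.array([0.2, 1.1, 0.9, 2.3, 2.8])

lo, hi = X.min(axis=0), X.max(axis=0)   # [alpha_i, beta_i] per input
diff, i, cut = best_cut(X, y, lo, hi)   # sketch from Section II
print(f"best cut: x{i + 1} at {cut:.2f} (label difference {diff:.2f})")
```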
REFERENCES

[1] M. G. Bello, “Enhanced training algorithms and integrated training/architecture selection for multilayer perceptron networks,” IEEE Trans. Neural Networks, vol. 3, pp. 864–875, Nov. 1992.
[2] H. R. Berenji and P. Khedkar, “Learning and tuning fuzzy logic controllers through reinforcement,” IEEE Trans. Neural Networks, vol. 3, no. 5, pp. 724–740, Oct. 1992.
[3] G. E. P. Box and G. M. Jenkins, Time Series Analysis, Forecasting and Control. San Francisco, CA: Holden Day, 1970.
[4] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees. Belmont, CA: Wadsworth Int., 1984.
[5] W. Buntine, “Tree classification software,” in 3rd Nat. Technol. Transfer Conf. Expo., Baltimore, MD, Dec. 1992, pp. 1–10.
[6] S. Chen and B. Mulgrew, “Overcoming co-channel interference using an adaptive radial basis function equalizer,” Signal Processing, vol. 28, pp. 91–107, 1992.
[7] P. A. Chou, “Optimal partitioning for classification and regression trees,” IEEE Trans. Pattern Anal. Machine Intell., vol. 13, pp. 340–354, Apr. 1991.
[8] G. P. Drago and S. Ridella, “Statistically controlled activation weight initialization (SCAWI),” IEEE Trans. Neural Networks, vol. 3, pp. 627–628, July 1992.
[9] Y. Hayashi, J. Buckley, and E. Czogala, “Fuzzy neural network with fuzzy signals and weights,” Int. J. Intell. Syst., vol. 8, pp. 527–537, 1993.
[10] S. Horikawa, T. Furuhashi, and Y. Uchikawa, “On fuzzy modeling using fuzzy neural networks with the back-propagation algorithm,” IEEE Trans. Neural Networks, vol. 3, no. 5, pp. 801–814, Sept. 1992.
[11] H. Ishibuchi, R. Fujioka, and H. Tanaka, “Neural networks that learn from fuzzy if-then rules,” IEEE Trans. Fuzzy Syst., vol. 1, pp. 85–97, May 1993.
[12] H. Ishibuchi, H. Tanaka, and H. Okada, “Fuzzy neural networks with fuzzy weights and fuzzy biases,” in Proc. Int. Conf. Neural Networks, San Francisco, CA, Mar. 1993, pp. 1650–1655.
[13] J. R. Jang, “Self-learning fuzzy controllers based on temporal back propagation,” IEEE Trans. Neural Networks, vol. 3, no. 5, pp. 714–723, Sept. 1992.
[14] T. Kohonen, “The self-organizing map,” Proc. IEEE, vol. 78, pp. 1464–1480, Sept. 1990.
[15] Y. Lin, “Modeling with fuzzy systems,” Los Alamos Nat. Lab., Los Alamos, NM, Tech. Rep. LA-UR 94-1530, 1994.
[16] Y. Lin, G. Cunningham, and S. Coggeshall, “Input variable identification—Fuzzy curves and fuzzy surfaces,” Fuzzy Sets Syst., to be published.
[17] Y. Lin and G. Cunningham, “Building a fuzzy system from input-output data,” J. Intell. Fuzzy Syst., vol. 2, no. 3, pp. 243–250, 1994.
[18] Y. Lin, G. Cunningham, S. Coggeshall, and R. Jones, “Nonlinear system input variable identification: Two stage fuzzy curves and surfaces,” IEEE Trans. Syst., Man, Cybern., to be published.
[19] Y. Lin and G. Cunningham, “A new approach to fuzzy-neural system modeling,” IEEE Trans. Fuzzy Syst., vol. 3, pp. 190–198, May 1995.
[20] M. T. Musavi, W. Ahmed, K. H. Chan, K. B. Faris, and D. M. Hummels, “On the training of radial basis function classifiers,” Neural Networks, vol. 5, pp. 595–603, 1992.
[21] H. Narazaki and A. L. Ralescu, “An improved synthesis method for multilayered neural networks using qualitative knowledge,” IEEE Trans. Fuzzy Syst., vol. 1, pp. 125–137, May 1993.
[22] J. R. Quinlan, “Induction of decision trees,” Mach. Learning, vol. 1, no. 1, pp. 81–106, 1986.
[23] J. R. Quinlan and R. L. Rivest, “Inferring decision trees using the minimum description length principle,” Inform. Computat., vol. 80, pp. 227–248, 1989.
[24] R. Patil, “Modeling using neural networks and decision trees,” Los Alamos Nat. Lab., Los Alamos, NM, Tech. Rep. LA-UR 94-1529, May 1994.
[25] M. Sugeno and T. Yasukawa, “A fuzzy-logic-based approach to qualitative modeling,” IEEE Trans. Fuzzy Syst., vol. 1, pp. 7–31, Feb. 1993.
[26] ——, “Structure identification of fuzzy model,” Fuzzy Sets Syst., vol. 28, pp. 15–33, 1988.
[27] C. T. Sun, “Rule-base structure identification in an adaptive-network-based fuzzy inference system,” IEEE Trans. Fuzzy Syst., vol. 2, pp. 64–73, Feb. 1994.
[28] H. Takagi, N. Suzuki, T. Koda, and Y. Kojima, “Neural networks designed on approximate reasoning architecture and their applications,” IEEE Trans. Neural Networks, vol. 3, pp. 752–760, Sept. 1992.
[29] Q. R. Wang and C. Y. Suen, “Analysis and design of a decision tree based on entropy reduction and its application to large character set recognition,” IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-6, pp. 406–417, July 1984.
[30] L. F. A. Wessels and E. Barnard, “Avoiding false local minima by proper initialization of connections,” IEEE Trans. Neural Networks, vol. 3, pp. 899–905, Nov. 1992.
[31] L. Xu, A. Krzyżak, and A. Yuille, “On radial basis function nets and kernel regression: Statistical consistency, convergence rates, and receptive field size,” Neural Networks, vol. 7, no. 4, pp. 609–628, 1994.
[32] J. M. Zurada, Introduction to Artificial Neural Systems. New York: West, 1992.

Yinghua Lin received the B.S. degree in physics from Zhejiang Normal University, Jinhua, China, in 1982, the M.S. degree in computer science from Ball State University, Muncie, IN, in 1991, and the Ph.D. degree in computer science from New Mexico Institute of Mining and Technology, Socorro, NM, in 1994.
From 1982 to 1989, he was a Faculty Member in the Department of Physics at Zhejiang Normal University, Jinhua, China. From January 1994 to December 1994 he was a Graduate Research Assistant in Los Alamos National Laboratory, Los Alamos, NM, working on banking projects and other financial projects. From December 1994 to August 1995, he was a Post-Doctoral Researcher in Los Alamos National Laboratory continuously working on banking projects. Since August 1995 he has worked for the Center for Adaptive Systems Applications (CASA), Los Alamos, NM, where he has built models for credit performance, marketing, and economic time series.

George A. Cunningham III received the B.S. degree in engineering from Case Institute of Technology, Cleveland, OH, in 1965, and the M.A. (mathematics) and Ph.D. (electrical engineering) degrees from the University of Washington, Seattle, in 1972 and 1986, respectively.
He has held academic positions in the Computer Science Department, University of North Florida, Jacksonville, and the Engineering Department, Harvey Mudd College, Claremont, CA. He has also worked as an Engineer for the Boeing Company, Seattle, WA, and the City of Seattle, WA. He has also been an independent Certified Public Accountant (CPA). He is currently an Associate Professor and Department Chair of Electrical Engineering at the New Mexico Institute of Mining and Technology, Socorro, NM. His research interests are in system modeling and control, with a particular interest in undergraduate engineering education.

Stephen V. Coggeshall received the B.S. (math, physics) and B.A. (music) degrees from Southern Illinois University, Carbondale, IL, in 1980, and the M.A. (music) and Ph.D. (nuclear engineering) degrees from the University of Illinois, Urbana-Champaign, in 1984.
From 1984 to 1990 he was a Staff Member at Los Alamos National Laboratory (LANL), Los Alamos, NM, working in the field of laser fusion. After a sabbatical in Germany, working on Lie Groups applied to differential equations, he returned to Los Alamos in 1991 and began working on adaptive computing and neural networks. In the lab he led a variety of adaptive computing projects related to finance. In July 1995 he co-founded the Center for Adaptive Systems Applications (CASA), where he has been leading teams developing consumer analytics-based models for a variety of applications, including fraud, credit performance, and marketing.
