A Learning Algorithm for Piecewise Linear Regression
Abstract
A new learning algorithm for solving piecewise linear regression problems
is proposed. It is able to train a proper multilayer feedforward neural
network so as to reconstruct a target function assuming a different linear
behavior on each set of a polyhedral partition of the input domain.
The proposed method combines local estimation, clustering in weight
space, classification, and regression in order to achieve the desired result.
A simulation on a benchmark problem shows the good properties of this
new learning algorithm.
1 Introduction
Real-world problems to be solved by artificial neural networks are normally
subdivided into two groups according to the range of values assumed by the
output. If the output is Boolean or nominal, we speak of classification problems;
otherwise, when the output is coded by a continuous variable, we are facing a
regression problem. In most cases, the techniques employed to train a
connectionist model depend on the kind of problem at hand.
However, some applications lie on the borderline between classification and
regression; these occur when the input space can be subdivided into disjoint
regions Xi characterized by different behaviors of the function f to be
reconstructed. The target of the learning problem is consequently twofold: by
analyzing a set of samples of f , possibly affected by noise, it has to generate
both the collection of regions Xi and the behavior of the unknown function f
in each of them.
If the region Xi corresponding to each sample in the training set were known,
we could add the index i of the region as an output, thus obtaining a
classification problem whose target is to find the effective form of each Xi .
On the other hand, if the actual partition Xi were known, we could solve
several regression problems to find the behavior of the function f within each
Xi . Because of this mixed nature, classical techniques for neural network
training cannot be directly applied; specific methods are necessary to deal
with problems of this kind.
Perhaps the simplest situation one can think of is piecewise linear regression:
in this case the regions Xi are polyhedra and the behavior of the function
f in each Xi can be modeled by a linear expression. Several authors have
treated this kind of problem [2, 3, 4, 8], providing algorithms for reaching the
desired result. Unfortunately, most of them are difficult to extend beyond two
dimensions [2], whereas others consider only local approximations [3, 4], thus
missing the effective extension of the regions Xi .
In this contribution a new training algorithm for neural networks solving
piecewise linear regression problems is proposed. It combines clustering and
supervised learning to obtain the correct values for the weights of a proper
multilayer feedforward architecture.
[Figure 1: The multilayer feedforward architecture. An input layer x1 , . . . , xn feeds s linear units in the hidden layer, with weights wij and biases wi0 ; their outputs z1 , . . . , zs pass through a gate layer controlled by the constraint matrices A1 , . . . , As before reaching the output unit.]
Each unit in the gate layer has output equal to its input zi if all the
constraints (1) are satisfied for j = 1, . . . , li , and equal to 0 in the opposite
case. All the other units perform a weighted sum of their inputs; the weights
of the output neuron, which has no bias, are always set to 1.
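A minimal sketch of this forward pass follows, assuming the constraints (1) take the linear-inequality form Ai x ≤ bi (the exact form is not shown in this excerpt):

```python
import numpy as np

def network_output(x, W, A, B):
    """Output of the gated network in Fig. 1 for one input x (sketch).

    W    : list of s weight vectors, each laid out as (w_i1, ..., w_in, w_i0)
    A, B : per-region constraint data; region X_i is taken to be
           {x : A[i] @ x <= B[i]}, an assumed reading of constraints (1).
    """
    out = 0.0
    for w, Ai, bi in zip(W, A, B):
        z = w[:-1] @ x + w[-1]        # hidden linear unit z_i
        if np.all(Ai @ x <= bi):      # gate unit: all constraints satisfied
            out += z                  # output neuron: unit weights, no bias
    return out
```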
The quality of the estimate improves when the size c of the sets Ck increases;
on the other hand, larger sets are more likely to mix samples coming from
different regions Xi . A tradeoff must therefore be attained in selecting a
reasonable value for c.
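A plausible sketch of this local estimation step, assuming each set Ck collects a sample together with its c − 1 nearest neighbours (the construction of Ck is not detailed in this excerpt):

```python
import numpy as np

def local_estimates(X, y, c):
    """For every training sample, fit an affine model on a set C_k assumed
    to contain the sample and its c - 1 nearest neighbours, producing one
    local weight vector v_k per sample."""
    V = np.empty((len(X), X.shape[1] + 1))
    for k in range(len(X)):
        idx = np.argsort(np.linalg.norm(X - X[k], axis=1))[:c]  # set C_k
        D = np.hstack([X[idx], np.ones((c, 1))])                # affine design
        V[k] = np.linalg.lstsq(D, y[idx], rcond=None)[0]        # v_k
    return V
```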
Denote by vk the weight vector of the linear unit produced through the
linear regression on the samples in Ck . If the generation of the samples in the
training set is not affected by noise, most of the vk coincide with the desired
weight vectors wi . Only mixed sets Ck yield spurious vectors vk , which can
be considered as outliers. Nevertheless, even in the presence of noise, a
clustering algorithm (Step 2) can be used to determine the sets Vi of vectors
vk associated with the same wi . A proper version of the K-means algorithm
[6] can be adopted for this purpose if the number s of regions is fixed
beforehand; otherwise, adaptive techniques, such as the Growing Neural Gas
[7], can be employed to determine the value of s at the same time.
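A minimal sketch of this clustering step, with plain K-means standing in for the modified version of [6]:

```python
from sklearn.cluster import KMeans

def cluster_weights(V, s):
    """Group the local weight vectors v_k into s clusters; the label of v_k
    identifies the set V_i it belongs to.  Plain K-means is only a stand-in
    for the modified version used in the paper."""
    return KMeans(n_clusters=s, n_init=10).fit_predict(V)
```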
The sets Vi generated by the clustering process induce a classification on
the input patterns xk belonging to the training set S. As a matter of fact, if
vk ∈ Vi for a given i, the set Ck is fitted by the linear neuron with weight
vector wi , and consequently xk is located in the region Xi . The effective
extension of this region can be determined by solving a linear multicategory
classification problem (Step 3), whose training set S′ is built by adding to
each input pattern xk , as output, the index ik of the set Vik to which the
corresponding vector vk belongs.
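A sketch of this classification step, using multinomial logistic regression as a convenient linear stand-in; the multicategory technique actually adopted is discussed below:

```python
from sklearn.linear_model import LogisticRegression

def fit_region_classifier(X, labels):
    """Learn a linear multicategory classifier mapping each input pattern
    x_k to the index i_k of its cluster.  Logistic regression is only an
    illustrative linear separator, not the paper's method."""
    return LogisticRegression(max_iter=1000).fit(X, labels)
```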
To avoid the presence of multiply classified points or of unclassified pat-
terns in the input space, proper techniques [1] based on linear and quadratic
programming can be employed. In this way the s matrices Ai for the gate layer
are generated; they can include redundant rows that are not necessary in the
determination of the polyhedral regions Xi . These rows can be removed by
applying standard linear programming techniques.
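A minimal sketch of such a redundancy test, under the assumption that each region is written as A x ≤ b (the paper's exact procedure may differ): row j is redundant when maximising aj · x subject to the remaining rows cannot exceed bj.

```python
import numpy as np
from scipy.optimize import linprog

def drop_redundant_rows(A, b, tol=1e-9):
    """Remove rows of A x <= b (NumPy arrays) that never bind."""
    keep = list(range(len(b)))
    for j in range(len(b)):
        rows = [i for i in keep if i != j]
        res = linprog(-A[j], A_ub=A[rows], b_ub=b[rows],
                      bounds=[(None, None)] * A.shape[1])
        if res.status == 0 and -res.fun <= b[j] + tol:
            keep.remove(j)        # constraint j is implied by the rest
    return A[keep], b[keep]
```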
[Figure 2: two panels, a) and b), plotting y against x for x ∈ [−4, 4].]
Finally, the weight vectors wi for the neural network in Fig. 1 can be directly
obtained by solving s linear regression problems (Step 4) having as training
sets the samples (x, y) ∈ S with x ∈ Xi , where X1 , . . . , Xs are the regions
built by the classification process.
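A sketch of this final step, fitting one affine model per region on the samples the classifier assigns to it:

```python
import numpy as np

def fit_region_models(X, y, regions, s):
    """One least-squares problem per region: fit an affine model on the
    samples (x, y) whose input was assigned to region X_i."""
    W = []
    for i in range(s):
        m = regions == i
        D = np.hstack([X[m], np.ones((m.sum(), 1))])
        W.append(np.linalg.lstsq(D, y[m], rcond=None)[0])  # w_i
    return W
```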
4 Simulation results
The proposed algorithm for piecewise linear regression has been tested on a
one-dimensional benchmark problem in order to analyze the quality of the
resulting neural network. The unknown function to be reconstructed is the
following:
f(x) = \begin{cases} -x & \text{if } -4 \le x \le 0 \\ x & \text{if } 0 < x < 2 \\ 2 + 3x & \text{if } 2 \le x \le 4 \end{cases} \qquad (2)
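For reference, a sketch of how such a training set can be generated; the sample size and noise level here are assumptions, not the paper's actual settings:

```python
import numpy as np

def f(x):
    """The benchmark target (2)."""
    return np.where(x <= 0, -x, np.where(x < 2, x, 2 + 3 * x))

rng = np.random.default_rng(0)
x = rng.uniform(-4.0, 4.0, 200)            # sample size: an assumption
y = f(x) + rng.normal(0.0, 0.1, x.shape)   # noise level: an assumption
```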
As Fig. 2 shows, the resulting network provides a good approximation to the
unknown function (2). Errors can only be detected at the boundaries between
two adjacent regions Xi ; they are mainly due to the effect of mixed sets Ck
on the classification process.
References
[1] E. J. Bredensteiner and K. P. Bennett, Multicategory classification
by support vector machines. Computational Optimization and Applications,
12 (1999) 53–79.
[2] V. Cherkassky and H. Lari-Najafi, Constrained topological mapping
for nonparametric regression analysis. Neural Networks, 4 (1991) 27–40.
[3] C.-H. Choi and J. Y. Choi, Constructive neural networks with piecewise
interpolation capabilities for function approximation. IEEE Transactions
on Neural Networks, 5 (1994) 936–944.
[4] J. Y. Choi and J. A. Farrell, Nonlinear adaptive control using networks
of piecewise linear approximators. IEEE Transactions on Neural Networks,
11 (2000) 390–401.