
FINANCIAL TIME SERIES FORECASTING BY COMBINING ANFIS
WITH VARIOUS AGGREGATION OPERATORS

A project report submitted in partial fulfillment of the requirements for
M.Tech.

by

Amit Samdarshi (2010IPG-009)

ABV INDIAN INSTITUTE OF INFORMATION TECHNOLOGY AND MANAGEMENT
GWALIOR-474015

2015
CANDIDATE’S DECLARATION

I hereby certify that I have properly checked and verified all the items as prescribed in
the checklist and ensure that my thesis/report is in proper format as specified in the
guideline for thesis preparation. I also declare that the work contained in this report is
my own. I understand that plagiarism is defined as any one or combination of the
following:

• To steal and pass off (the ideas or words of another) as one's own

• To use (another's production) without crediting the source

• To commit literary theft

• To present as new and original an idea or product derived from an existing source.

I understand that plagiarism involves an intentional act by the plagiarist of using someone
else's work completely or partially and claiming authorship of the work/ideas. I have
given due credit to the original authors/sources for all the words, ideas, diagrams, graph-
ics, computer programmes, experiments, results, and websites that are not my original
contribution. I have used quotation marks to identify verbatim sentences and given credit
to the original authors/sources.
I affirm that no portion of my work is plagiarized, and the experiments and results
reported in the report/dissertation/thesis are not manipulated. In the event of a com-
plaint of plagiarism and the manipulation of the experiments and results, I shall be fully
responsible and answerable. My faculty supervisor(s) will not be responsible for the
same.

Signature:

Name:
Roll No. :
Date :

ACKNOWLEDGEMENTS

I am highly indebted to Prof. Anupam Shukla and Dr. Joydip Dhar for giving me the
autonomy of functioning and experimenting with ideas. I would like to take this
opportunity to express my profound gratitude to them, not only for their academic
guidance but also for their interest in my project and constant support, coupled with
confidence-boosting and motivating sessions, which proved very fruitful and were
instrumental in infusing self-assurance and trust within me.

Finally, I am grateful to all my friends, whose constant encouragement served to renew
my spirit, refocus my attention and energy, and helped me in carrying out this work.

Date:

Amit Samdarshi

Contents

LIST OF TABLES vi

1 INTRODUCTION AND MOTIVATION 2


1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Basic Terminologies and Tools . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.1 Aggregation Operators . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.2 Subtractive clustering . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.3 Fuzzy C-means clustering (FCM) . . . . . . . . . . . . . . . . . . 3
1.3 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

2 LITERATURE REVIEW AND GAP ANALYSIS 5


2.1 Various Works in the field of Stock Prediction . . . . . . . . . . . . . . . 5
2.2 OWA Weights and Various OWA Operators . . . . . . . . . . . . . . . . 6
2.2.1 Calculation of Weight Vector . . . . . . . . . . . . . . . . . . . . 6
2.2.2 Ordered Weighted Averaging (OWA) . . . . . . . . . . . . . . . . 7
2.2.3 Induced Ordered Weighted Averaging (IOWA) . . . . . . . . . . . 8
2.2.4 Generalized Ordered Weighted Averaging (GOWA) . . . . . . . . 8
2.2.5 Quasi Ordered Weighted Averaging (QOWA) . . . . . . . . . . . 9
2.3 Gap Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

3 OBJECTIVE AND METHODOLOGY 11


3.1 Objective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.2 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

3.2.1 Data Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.2.2 Fuzzy C Means Clustering . . . . . . . . . . . . . . . . . . . . . . 12
3.2.3 Subtractive Clustering . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2.4 Adaptive Neuro-Fuzzy Inference System (ANFIS) Framework . . 14
3.3 Data and Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

4 RESULTS AND DISCUSSION 21


4.1 Discussion from Phase I . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.2 Discussion from Phase II . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.3 Intermediate Results and Discussion from Phase III . . . . . . . . . . . . 23
4.4 Intermediate Results and Discussion from Phase IV . . . . . . . . . . . . 26

5 CONCLUSION 36
5.1 CONCLUSION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.2 FUTURE SCOPE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

List of Figures

3.1 Fuzzy Membership Curve for the Year 2006 using Subtractive Clustering 15
3.2 Fuzzy Membership Curve for the Year 2007 using Subtractive Clustering 15
3.3 Fuzzy Membership Curve for the Year 2008 using Subtractive Clustering 15
3.4 Fuzzy Membership Curve for the Year 2009 using Subtractive Clustering 16
3.5 Fuzzy Membership Curve for the Year 2010 using Subtractive Clustering 16
3.6 Fuzzy Membership Curve for the Year 2011 using Subtractive Clustering 16
3.7 Fuzzy Membership Curve for the Year 2012 using Subtractive Clustering 17
3.8 Fuzzy Membership Curve for the Year 2013 using Subtractive Clustering 17
3.9 Fuzzy Membership Curve for the Year 2014 using Subtractive Clustering 17
3.10 ANFIS Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.11 Flow Diagram of Proposed Methodology . . . . . . . . . . . . . . . . . . 20

4.1 ANFIS Implemented on MATLAB . . . . . . . . . . . . . . . . . . . . . 25


4.2 Comparison of RMSE by all Operators for Year 2012 . . . . . . . . . . . 25
4.3 Comparison of RMSE by OWA and Induced-OWA operators for all years 26
4.4 Impact of Range of Influence on RMSE (OWA/2012) . . . . . . . . . . . 28
4.5 RMSE for different Orness values for OWA operator in 2014 . . . . . . . 30
4.6 RMSE for different Orness values for IOWA operator in 2014 . . . . . . . 30
4.7 RMSE for different Orness values for GOWQA operator in 2014 . . . . . 31
4.8 RMSE for different Orness values for GOWHA operator in 2014 . . . . . 31
4.9 RMSE for different Orness values for GOWCA operator in 2014 . . . . . 32
4.10 RMSE for different Orness values for different operators in 2014 . . . . . 33

4.11 Graphical Representation of Best RMSE’s for different Years for IOWA
and OWA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.12 Graphical Representation of Corresponding RMSE’s for different Years
for IOWA and OWA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.13 IOWA Testing of Stock Market Indices for Recession Year 2009 using
Fuzzy C Means Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.14 IOWA Testing of Stock Market Indices for Recession Year 2009 using
Subtractive Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

List of Tables

2.1 Weight Vector for different Orness α values . . . . . . . . . . . . . . . . . 7

4.1 Percentage RMS Difference Corresponding to different OWA Operators . 23


4.2 Table of RMSE for different Orness values for different operators in 2014 29
4.3 Best RMSE’s for different Years for IOWA and OWA, ROI in Parenthesis 29
4.4 Corresponding RMSE’s for different Years for IOWA and OWA, ROI in
Parenthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

List of Abbreviations

ANFIS . . . . . . . Adaptive Neuro-Fuzzy Inference System

FCM . . . . . . . . . Fuzzy C-Means Clustering

IOWA . . . . . . . . Induced Ordered Weighted Averaging

GOWA . . . . . . . Generalized Ordered Weighted Averaging

GPOWA . . . . . Generalized Probabilistic Ordered Weighted Averaging

GOWHA . . . . . Generalized Ordered Weighted Harmonic Averaging

GOWGA . . . . . Generalized Ordered Weighted Geometric Averaging

GOWQA . . . . . Generalized Ordered Weighted Quadratic Averaging

GOWCA . . . . . Generalized Ordered Weighted Cubic Averaging

POWA . . . . . . . Probabilistic Ordered Weighted Averaging

OWA . . . . . . . . . Ordered Weighted Averaging

QOWA . . . . . . . Quasi Ordered Weighted Averaging

Chapter 1

INTRODUCTION AND
MOTIVATION

1.1 Introduction
These days, economies have become very market-driven, and correct prediction of the
mood of the market has become the key to excel in business, investments, policy making
and many other areas. Investors have always shown great interest in stock market
investment because of its high returns. However, stock market investment is always
risky due to its uncertain and unpredictable nature. Investors, finance managers and
researchers have been trying to find new ways to predict the stock market more accu-
rately, and in recent times a tremendous amount of research has been published on
predicting the uncertain behaviour of the stock market. In recent years, artificial
intelligence (AI) tools such as genetic algorithms, fuzzy logic and neural networks have
been successfully used in the prediction of the stock market. AI techniques can easily
solve complex problems that classical techniques handle poorly [9], so AI methods have
become very popular, and many researchers have employed them for the prediction of
the stock market [1, 2, 9, 10, 11, 12, 14, 17]. In most modern research works in the field
of stock prediction, aggregation has been a step of large importance, but experimentation
on improving predictions by implementing the various aggregation operators developed
in recent years has been lacking. Hence, our major emphasis will be on the implementation
of different aggregation operators in the stock prediction problem.

1.2 Basic Terminologies and Tools

1.2.1 Aggregation Operators

An Aggregation operator is used to reduce computational complexity of high dimen-


sional data. It takes n dimensional complex data and results in a simpler form of data.
Hence makes the computational process easier. The most commonly known aggregation
operator is Order weighted averaging, OWA operator. The OWA first introduced by
Yager [7], has been widely used by researchers. In recent years, enormous related stud-
ies have been developed. There are many variants of OWA such as IOWA [4], POWA
[3], GOWA [5], GPOWA [5], QOWA [8] etc.
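As an illustrative example (our own numbers): with weight vector W = [0.5, 0.3, 0.2]
and arguments (4, 7, 1), the arguments are first reordered in decreasing order to
(7, 4, 1), so the OWA value is 0.5·7 + 0.3·4 + 0.2·1 = 4.9.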

1.2.2 Subtractive clustering

Subtractive clustering algorithm is used when it’s hard to have a fine idea the number
of clusters we should have for a dataset. Subtractive clustering, is fast and one-pass
method for finding out the number of clusters along with cluster centers for a given
dataset. The cluster obtained by subclust or genfis2 function is used to initialize the it-
erative optimization clustering method and also model identification method (like anfis).
subclust function gives the clusters using the subtractive clustering. genfis2 builds upon
subclust function for providing a faster and single-pass method for taking input-output
data for training and generating a Sugeno FIS (fuzzy inference system) that can model
the data behavior.
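As a minimal usage sketch (assuming MATLAB's Fuzzy Logic Toolbox with the classic
subclust/genfis2/evalfis signatures; the random data and the 0.5 radius are illustrative
placeholders):

    % Minimal sketch: subtractive clustering and a Sugeno FIS built from it.
    data = rand(100, 2);               % each row is one data point (illustrative)
    radii = 0.5;                       % range of influence
    centers = subclust(data, radii);   % one-pass estimate of cluster centers

    Xin  = data(:, 1);                 % input column(s)
    Xout = data(:, 2);                 % output column
    fis  = genfis2(Xin, Xout, radii);  % Sugeno FIS with one rule per cluster
    yhat = evalfis(Xin, fis);          % evaluate the generated FIS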

1.2.3 Fuzzy C-means clustering (FCM)

Clustering is used to produce a concise representation of a large data set by
dividing the data into small clusters, so that members of the same cluster are quite
similar and members of different clusters are quite different from each other. FCM
was proposed by Dunn [18]. FCM is a technique in which each element of the data
belongs to one or more clusters with a degree of membership. FCM uses an objective
function which is to be minimized. The objective function is given as follows:

J(U, c_1, \ldots, c_c) = \sum_{i=1}^{c} J_i = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2}   (1.1)

where U = [u_{ij}] is the membership matrix whose elements lie between 0 and 1,
d_{ij} = \| c_i - x_j \| is the Euclidean distance between the j-th data point and the
i-th cluster center, c_i is the center of the i-th fuzzy cluster, and m (greater than 1)
is a weighting exponent.

1.3 Motivation
Stock market prediction is an area where innovation translates directly into profits.
A lot of work has been done in this area, but much remains to be done before predictions
become reasonably precise. Stock market prediction was initially considered purely a
topic of research for mathematicians. This norm was soon broken with the introduction
of artificial intelligence, intuition and soft computing to this area of research.

In the subsequent sections we'll see that there are many variants of OWA that can
be used for aggregation. Not all of them have been tested in stock prediction
research works. Hence, we would like to explore and implement all the aggregation op-
erators mentioned above and find out which variant of OWA helps us to reach the most
precise results.

Chapter 2

LITERATURE REVIEW AND
GAP ANALYSIS

2.1 Various Works in the field of Stock Prediction


In recent years, artificial intelligence (AI) tools such as genetic algorithms, fuzzy logic
and neural networks have been successfully used in the prediction of the stock market.
AI techniques can easily solve complex problems that classical techniques handle poorly,
so AI methods have become very popular, and many researchers have employed them for
the prediction of the stock market [1, 2, 9, 10, 11, 12, 14, 17].
In most modern research works in the field of stock prediction, aggregation has been a
step of large importance, but experimentation on improving predictions by implementing
the various aggregation operators developed in recent years has been lacking. Hence, our
major emphasis will be on the implementation of different aggregation operators in the
stock prediction problem.
In past years, to improve predictions in stock markets, various fuzzy time-series
models have used various weighting techniques to predict the recurrence of stock index
patterns. For example, Yu [17] proposed a weighted fuzzy time-series model, and Cheng
et al. proposed the trend-weighting method [14] and the weighted transitional matrix
method [15]. To further improve forecasting, Chen and Chung [12, 13] gave a new fuzzy
time-series model with genetic algorithms to determine the ideal length of linguistic
intervals. Subsequently, Huarng and Yu [10] came up with a fuzzy time-series model
that employed an ANN to extract the nonlinear fuzzy relationships. To improve accuracy
further, more than one variable can be considered when building models: Cheng and
Wei [16] proposed a volatility model based on a multi-stock index for Taiwan Stock
Exchange (TAIEX) forecasting, whereas Cheng et al. [1] presented work on TAIEX
that used Ordered Weighted Averaging (OWA), Subtractive Clustering (SC) and the
Adaptive Neuro-Fuzzy Inference System (ANFIS). They came up with predictions that
were considerable from a TAIEX perspective. We have taken the work done by Cheng
et al. [1] as our primary reference and used the BSE30 dataset from 2006 to 2014 for
testing.

2.2 OWA Weights and Various OWA Operators

2.2.1 Calculation of Weight Vector

In 1988, Yager came up with two characteristic measures of the weight vector W of an
OWA operator, which later proved to be very important. The first is the orness of the
aggregation, denoted α:

α(W) = \frac{1}{n-1} \sum_{i=1}^{n} (n - i)\, w_i   (2.1)

The other characteristic measure is the dispersion of the weight vector, defined as:

Disp(W) = - \sum_{i=1}^{n} w_i \ln(w_i)   (2.2)

Orness has a special property. Say α is the orness of W = [w_1, w_2, \ldots, w_n]; then the
orness of the reversed vector W' = [w_n, w_{n-1}, \ldots, w_1] is

α(W') = 1 - α   (2.3)
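For instance (worked from Table 2.1 below), reversing the α = 0.9 vector gives
W' = [0.0263, 0.1470, 0.8263], whose orness is (2·0.0263 + 0.1470)/2 ≈ 0.1 = 1 − 0.9.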

Orness    W1       W2       W3
α = 0.5   0.3333   0.3333   0.3333
α = 0.6   0.4384   0.3232   0.2384
α = 0.7   0.5540   0.2920   0.1540
α = 0.8   0.6819   0.2358   0.0823
α = 0.9   0.8263   0.1470   0.0263

Table 2.1: Weight Vector for different Orness α values

In 2001, Fuller and Majlender [19] transformed Yager's OWA weight problem into a
polynomial form using Lagrange multipliers. This polynomial form is used for the
calculation of the weights:

\ln w_j = \frac{j-1}{n-1} \ln w_n + \frac{n-j}{n-1} \ln w_1   (2.4)

w_j = \sqrt[n-1]{w_1^{\,n-j}\, w_n^{\,j-1}}   (2.5)

w_1 \left[ (n-1)α + 1 - n w_1 \right]^n = \left[ (n-1)α \right]^{n-1} \left[ ((n-1)α - n) w_1 + 1 \right]   (2.6)

If w_1 = w_2 = \ldots = w_n = 1/n (which corresponds to α = 0.5), then disp(W) = \ln n; otherwise,

w_n = \frac{((n-1)α - n) w_1 + 1}{(n-1)α + 1 - n w_1}   (2.7)
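For concreteness, the weights can be computed with a small MATLAB helper of our own
(an assumption, not part of the reference): Eq. (2.6) is solved numerically for w_1, the
bracketed root search being valid for the orness values used in this work (0.5 < α < 1),
and orness below 0.5 is handled through the duality of Eq. (2.3).

    function w = owaWeights(n, alpha)
    % Maximal-entropy OWA weights for orness alpha, via Eqs. (2.5)-(2.7).
    % Sketch of our own; alpha is assumed to lie strictly between 0 and 1.
        if abs(alpha - 0.5) < 1e-9
            w = ones(n, 1) / n;                    % alpha = 0.5: equal weights
            return
        elseif alpha < 0.5
            w = flipud(owaWeights(n, 1 - alpha));  % duality of Eq. (2.3)
            return
        end
        % Solve Eq. (2.6) for w1 in (0, 1).
        g = @(w1) w1 .* ((n-1)*alpha + 1 - n*w1).^n ...
                - ((n-1)*alpha)^(n-1) .* (((n-1)*alpha - n)*w1 + 1);
        w1 = fzero(g, [eps, 1 - eps]);
        % Eq. (2.7): last weight, then Eq. (2.5): the weights in between.
        wn = (((n-1)*alpha - n)*w1 + 1) / ((n-1)*alpha + 1 - n*w1);
        j  = (1:n)';
        w  = (w1.^(n-j) .* wn.^(j-1)).^(1/(n-1));
    end

For example, owaWeights(3, 0.8) reproduces the α = 0.8 row of Table 2.1 up to rounding.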

2.2.2 Ordered Weighted Averaging (OWA)

The OWA operator, first introduced by Yager [7], has been widely used by researchers.
Fuller and Majlender [19] determined the optimal weighting vector by solving a con-
strained optimization problem using Lagrange multipliers, and later [20] employed the
Kuhn-Tucker second-order sufficiency conditions to derive OWA weights of minimal
variability. An OWA operator is defined as follows: an OWA operator of dimension n
is a mapping f: R^n → R with an associated weighting vector W = [w_1, w_2, \ldots, w_n]^T
satisfying w_i ∈ [0, 1] for i ∈ I = \{1, 2, 3, \ldots, n\} and \sum_{i=1}^{n} w_i = 1, such that

f(a_1, a_2, \ldots, a_n) = \sum_{i=1}^{n} w_i b_i   (2.8)

where b_i is the i-th largest of the arguments a_1, \ldots, a_n.

2.2.3 Induced Ordered Weighted Averaging (IOWA)

The definition of IOWA is presented here as given by Yager and Filev in [4]; in particular,
their convention for ties is used. Given a weighting vector w and an inducing variable z,
the Induced Ordered Weighted Averaging (IOWA) function is

IOWA_w(\langle x_1, z_1 \rangle, \ldots, \langle x_n, z_n \rangle) = \sum_{i=1}^{n} w_i x_{(i)}   (2.9)

where the (·) notation denotes the inputs (x_i, z_i) reordered such that z_{(1)} \ge z_{(2)} \ge \ldots \ge z_{(n)}.
The input pairs (x_i, z_i) may be two independent features of the same input, or can be
related by some function, i.e. z_i = f_i(x_i).
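A compact sketch of both operators (the helper names and sample values below are our
own):

    function y = owa(w, a)
    % Eq. (2.8): weights applied to the arguments sorted in decreasing order.
        b = sort(a(:), 'descend');
        y = sum(w(:) .* b);
    end

    function y = iowa(w, a, z)
    % Eq. (2.9): weights follow the order induced by z (e.g. recency),
    % not the order of the argument values themselves.
        [~, idx] = sort(z(:), 'descend');
        y = sum(w(:) .* a(idx));
    end

With w = [0.5; 0.3; 0.2] and a = [4; 7; 1], owa(w, a) gives 4.9, while
iowa(w, a, [3; 2; 1]) (most recent day first) gives 0.5·4 + 0.3·7 + 0.2·1 = 4.3.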

2.2.4 Generalized Ordered Weighted Averaging (GOWA)

A GOWA operator [5] of dimension n is a mapping GOWA: R^n → R that has an
associated weighting vector W of dimension n such that the sum of the weights is 1 and
w_j ∈ [0, 1]; then:

GOWA_w(a_1, a_2, \ldots, a_n) = \left( \sum_{i=1}^{n} w_i b_i^{λ} \right)^{1/λ}   (2.10)

where b_j is the j-th largest of the a_i, and λ is a parameter such that λ ∈ (-∞, ∞).

Generalized Geometric Aggregation (GGA)

When λ → 0, the GOWA operator becomes the generalized ordered weighted geometric
averaging (GOWGA) operator:

GOWGA_w(a_1, a_2, \ldots, a_n) = \prod_{i=1}^{n} b_i^{w_i}   (2.11)

Note that if w_j = 1/n for all j, we get the Generalized Geometric Aggregation (GGA).

Generalized Harmonic Aggregation (GHA)

When λ = -1, the GOWA operator becomes the generalized ordered weighted harmonic
averaging (GOWHA) operator:

GOWHA_w(a_1, a_2, \ldots, a_n) = \frac{1}{\sum_{i=1}^{n} w_i / b_i}   (2.12)

Note that if w_j = 1/n for all j, we get the Generalized Harmonic Aggregation (GHA).

Generalized Quadratic Aggregation (GQA)

When λ = 2, the GOWA operator becomes the generalized ordered weighted quadratic
averaging (GOWQA) operator:

GOWQA_w(a_1, a_2, \ldots, a_n) = \left( \sum_{i=1}^{n} w_i b_i^{2} \right)^{1/2}   (2.13)

Note that if w_j = 1/n for all j, we get the Generalized Quadratic Aggregation (GQA).
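Since only λ changes between these special cases, a single parameterized sketch (our own
helper, assuming positive arguments) covers the whole family:

    function y = gowa(w, a, lambda)
    % Eq. (2.10); lambda = -1, 2, 3 give GOWHA, GOWQA, GOWCA, and the
    % limit lambda -> 0 gives GOWGA. Arguments are assumed positive.
        b = sort(a(:), 'descend');
        if abs(lambda) < 1e-9
            y = prod(b .^ w(:));                       % geometric limit, Eq. (2.11)
        else
            y = sum(w(:) .* b.^lambda) ^ (1/lambda);   % Eqs. (2.10), (2.12), (2.13)
        end
    end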

2.2.5 Quasi Ordered Weighted Averaging (QOWA)

A QOWA operator [8] of dimension n is a mapping QOWA: R^n → R that has an
associated weighting vector W of dimension n such that the sum of the weights is 1 and
w_j ∈ [0, 1]; then:

QOWA_w(a_1, a_2, \ldots, a_n) = f^{-1}\!\left( \sum_{i=1}^{n} w_i f(b_i) \right)   (2.14)

where b_i is the i-th largest element in the collection of aggregated objects a_1, \ldots, a_n,
and f is a strictly monotone generator function.

Linear QOWA

If f(x) = x, then we recover the OWA operator:

QOWA_w(a_1, a_2, \ldots, a_n) = OWA_w(a_1, a_2, \ldots, a_n)   (2.15)

Reciprocal QOWA

If f(x) = 1/x, then we get the ordered weighted harmonic averaging (OWHA) operator:

QOWA_w(a_1, a_2, \ldots, a_n) = \frac{1}{\sum_{i=1}^{n} w_i / b_i} = OWHA_w(a_1, a_2, \ldots, a_n)   (2.16)

Polynomial QOWA

If f(x) = x^λ, then we get the GOWA operator:

QOWA_w(a_1, a_2, \ldots, a_n) = \left( \sum_{i=1}^{n} w_i b_i^{λ} \right)^{1/λ} = GOWA_w(a_1, a_2, \ldots, a_n)   (2.17)

Exponential QOWA

If f(x) = e^x, then the equation becomes:

QOWA_w(a_1, a_2, \ldots, a_n) = \ln\!\left( \sum_{i=1}^{n} w_i e^{b_i} \right)   (2.18)
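Likewise, one sketch with the generator passed as a function handle covers every QOWA
special case (the helper and sample values are our own):

    function y = qowa(w, a, f, finv)
    % Eq. (2.14): apply the generator f, average with OWA weights, invert.
        b = sort(a(:), 'descend');
        y = finv(sum(w(:) .* f(b)));
    end

    % Each special case is a single call, e.g. with w = [0.5; 0.3; 0.2]
    % and a = [4; 7; 1]:
    %   qowa(w, a, @(x) x,    @(x) x)     % linear      -> OWA    (Eq. 2.15)
    %   qowa(w, a, @(x) 1./x, @(x) 1./x)  % reciprocal  -> OWHA   (Eq. 2.16)
    %   qowa(w, a, @(x) x.^2, @sqrt)      % polynomial  -> GOWQA  (Eq. 2.17)
    %   qowa(w, a, @exp,      @log)       % exponential QOWA      (Eq. 2.18)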

2.3 Gap Analysis


Throughout these improvements in stock prediction research, the aggregation technique
has been the most neglected part, as the simplest OWA operator has been used every-
where. As we saw in the previous section, many variants of the OWA operator exist,
but they are not as widely used as the simplest OWA operator. It may well be that
some variant of OWA would lead to more accurate predictions. Hence, there is a lack
of experimentation in the stock market prediction area as far as aggregation operators
are concerned.

Chapter 3

OBJECTIVE AND
METHODOLOGY

3.1 Objective
Our objective is to identify the aggregation operator that gives the most accurate
predictions of a stock market index using the Adaptive Neuro-Fuzzy Inference System
and Subtractive Clustering.

3.2 Methodology

3.2.1 Data Aggregation

In the last chapter, we discussed data aggregation and different aggregation operators
at length. Averaging is very important for capturing a comprehensive impression of a
set of data in a single value. The most popular aggregation operator is the OWA
operator. We have used many variants of OWA in our project, namely: OWA, IOWA,
POWA, GOWA, GOWGA, GOWHA, GOWQA, QOWA, etc.

3.2.2 Fuzzy C Means Clustering

Fuzzy C Means clustering is an important step in fuzzy time-series problems. It divides
the data into small clusters, so that members of the same cluster are quite similar
and members of different clusters are quite different from each other. FCM was
proposed by Dunn [18]. FCM is a technique in which each data element belongs to
one or more clusters with a degree of membership. FCM uses an objective function
which is to be minimized.
Step 1: Initially we have to choose appropriate data points to act as centroids for the
very first iteration. We chose 3 centroids, to cluster the data into Low, Medium and High.

Step 2: We sorted the data, and out of N data entries we assigned C_low = the (N/6)-th
data point, C_medium = the (3N/6)-th data point and C_high = the (5N/6)-th data point.
These are chosen because, on dividing the sorted data into three equal parts, the medians
of the parts fall at positions N/6, 3N/6 and 5N/6.

Step 3: We calculate a distance matrix from the distance of each data point to each
centroid, i.e. d_{ij} = \| x_i - c_j \|, where x_i is the i-th data point and c_j the j-th centroid.

Step 4: Once we are done with the creation of the distance matrix, we can calculate the
fuzzy membership matrix:

\mu_j(x_i) = \frac{(1/d_{ij})^{1/(m-1)}}{\sum_{k=1}^{p} (1/d_{ik})^{1/(m-1)}}   (3.1)
We also make sure that the following condition holds:

\sum_{j=1}^{p} \mu_j(x_i) = 1   (3.2)

Step 5: Based on the acquired membership matrix, we find new centroids:

C_j = \frac{\sum_i [\mu_j(x_i)]^m x_i}{\sum_i [\mu_j(x_i)]^m}   (3.3)

Step 6: We keep repeating Step 3 to Step 5 until the change in the centroids saturates.
A compact sketch of this loop is given below.
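A minimal MATLAB sketch of Steps 1-6 (the synthetic 1-D data, m = 2 and the tolerance
are our own assumptions; the elementwise expansion requires a reasonably recent MATLAB):

    % Minimal FCM sketch for 1-D data following Steps 1-6 above.
    x = sort(rand(300, 1));                        % Step 1: sorted data (illustrative)
    N = numel(x);  p = 3;  m = 2;  tol = 1e-6;
    C = reshape(x(round([N/6, 3*N/6, 5*N/6])), 1, []);  % Step 2: low/medium/high seeds
    while true
        D = abs(x - C) + eps;                      % Step 3: distance matrix d_ij
        U = (1 ./ D) .^ (1/(m-1));
        U = U ./ sum(U, 2);                        % Step 4: memberships, Eq. (3.2)
        Cnew = sum(U.^m .* x, 1) ./ sum(U.^m, 1);  % Step 5: new centroids, Eq. (3.3)
        if max(abs(Cnew - C)) < tol, break, end    % Step 6: repeat until saturation
        C = Cnew;
    end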

3.2.3 Subtractive Clustering

The subtractive clustering algorithm is used when it is hard to estimate in advance how
many clusters a dataset should have. Subtractive clustering is a fast, one-pass method
for finding both the number of clusters and the cluster centers for a given dataset.
The clusters obtained by the subclust or genfis2 function are used to initialize iterative
optimization clustering methods and model identification methods (like anfis). The
subclust function computes the clusters using subtractive clustering; genfis2 builds upon
subclust to provide a fast, single-pass method that takes input-output training data and
generates a Sugeno FIS (fuzzy inference system) that can model the data behavior.
There are four main parameters of subtractive clustering in genfis2. These are as follows:

• Range of Influence: the fraction of the width of the dataset to be considered as the
neighborhood of a data point when estimating that point as a potential candidate
for a cluster center.

• Squash Factor: the value by which the radius of influence is multiplied to determine
the neighborhood of a chosen cluster center within which the presence of any other
cluster center is discouraged.

• Accept Ratio: the fraction of the first cluster center's potential above which another
data point is accepted as a cluster center.

• Reject Ratio: the condition for rejecting a potential data point as a cluster
center [21]: the fraction of the first cluster center's potential below which a data
point is rejected as a cluster center.

Steps of Subtractive Clustering can be summed up as follows:

• Step 1: Let us consider n points x_1, x_2, \ldots, x_n in an M-dimensional data space [21].
Each data point is considered a potential cluster center, so we calculate the potential
of the i-th data point x_i as:

P_i = \sum_{j=1}^{n} e^{-α \| x_i - x_j \|^2}   (3.4)

where α = 4 / r_a^2 and r_a is a positive constant, the radius of the neighborhood
to be considered; it depends on the range of influence. A data point that has many
neighbors will have a high potential value.

• Step 2: The data point with the highest potential becomes the first cluster center
[21]. Let x_1^* be the first cluster center, with potential value P_1^*; the potential
of each point is then revised as:

P_i \leftarrow P_i - P_1^* \, e^{-β \| x_i - x_1^* \|^2}   (3.5)

where β = 4 / r_b^2 (with r_b = squash factor × r_a).

• Step 3: Similarly, after k cluster centers have been obtained, the potential of each
data point is revised as:

P_i \leftarrow P_i - P_k^* \, e^{-β \| x_i - x_k^* \|^2}   (3.6)

where β = 4 / r_b^2 and x_k^* is the location of the k-th cluster center, which has
potential P_k^*.

• Step 4: The process of creating new clusters continues until all the remaining
potentials fall below a certain fraction of the first cluster center's potential. Hence,
we get the optimum number of clusters for the chosen range of influence. Note that
the range of influence is the most decisive factor, as it directly impacts the number
of clusters: a lower range of influence results in more clusters, and a higher range
of influence results in fewer clusters. A compact sketch of these steps is given below.
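A simplified sketch of Steps 1-4 (our own; it assumes the Statistics Toolbox for pdist2,
takes r_b = 1.5 r_a, and collapses the accept/reject ratios into a single illustrative
threshold):

    % Simplified potential-based subtractive clustering (Steps 1-4).
    X = rand(200, 2);                      % n points in M-dimensional space
    ra = 0.5;  rb = 1.5 * ra;              % rb = squash factor * ra (assumed 1.5)
    alpha = 4 / ra^2;  beta = 4 / rb^2;
    P  = sum(exp(-alpha * pdist2(X, X).^2), 2);   % Step 1: potentials, Eq. (3.4)
    P1 = max(P);                           % potential of the first center
    centers = [];
    while true
        [Pk, k] = max(P);
        if Pk < 0.15 * P1, break, end      % Step 4: reject-ratio-style stop
        centers = [centers; X(k, :)];      % accept the next cluster center
        P = P - Pk * exp(-beta * sum((X - X(k, :)).^2, 2));  % Steps 2-3
    end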

3.2.4 Adaptive Neuro-Fuzzy Inference System (ANFIS) Framework

Our ANFIS comprises five layers:

Figure 3.1: Fuzzy Membership Curve for the Year 2006 using Subtractive Clustering

Figure 3.2: Fuzzy Membership Curve for the Year 2007 using Subtractive Clustering

Figure 3.3: Fuzzy Membership Curve for the Year 2008 using Subtractive Clustering

Layer 1: Every node i in this layer is a square node with a node function:

O_{1,i} = \mu_{A_i}(x), \quad i = 1, 2   (3.7)

Figure 3.4: Fuzzy Membership Curve for the Year 2009 using Subtractive Clustering

Figure 3.5: Fuzzy Membership Curve for the Year 2010 using Subtractive Clustering

Figure 3.6: Fuzzy Membership Curve for the Year 2011 using Subtractive Clustering

O_{1,i} = \mu_{B_{i-2}}(y), \quad i = 3, 4   (3.8)

where x, y are the inputs to node i and A_i (respectively B_{i-2}) is a linguistic label for the input. In

Figure 3.7: Fuzzy Membership Curve for the Year 2012 using Subtractive Clustering

Figure 3.8: Fuzzy Membership Curve for the Year 2013 using Subtractive Clustering

Figure 3.9: Fuzzy Membership Curve for the Year 2014 using Subtractive Clustering

other words, O_{1,i} is the membership grade of A_i (respectively B_{i-2}). Normally the membership functions

Figure 3.10: ANFIS Architecture

for \mu_{A_i}(x) and \mu_{B_i}(y) are chosen to be the generalized bell function:

\mu_{A_i}(x) = \frac{1}{1 + \left| \frac{x - c_i}{a_i} \right|^{2 b_i}}   (3.9)

where a_i, b_i, c_i are the parameters of the membership function; these are also known as
the premise parameters.

Layer 2: Every node in this layer is a circular node labeled Π, whose output is the
product of all the incoming signals:

O_{2,i} = w_i = \mu_{A_i}(x) \, \mu_{B_i}(y), \quad i = 1, 2   (3.10)

Each node output represents the firing strength of a rule.

Layer 3: Every node in this layer is a circular node labeled N. The i-th node calculates
the ratio of the i-th rule's firing strength to the sum of the firing strengths of all rules:

O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2   (3.11)

Layer 4: Every node i in this layer is a square node with a node function:

O_{4,i} = \bar{w}_i f_i, \quad i = 1, 2   (3.12)

where \bar{w}_i is the normalized firing strength from layer 3 and f_i is the consequent of
the i-th rule (in a first-order Sugeno model, f_i = p_i x + q_i y + r_i).

Layer 5: This is a single circular node labeled Σ, which computes the overall output as
the summation of all incoming signals:

O_{5,i} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} = f_{out}   (3.13)
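Putting the pieces together, a sketch of the full pipeline for one year of aggregated index
values y (a column vector) might look as follows; the two-lag input structure, the 0.5
radius, the 100 epochs and the classic-toolbox function signatures are our own assumptions
mirroring the description above:

    % End-to-end sketch: lagged inputs -> subtractive-clustering FIS -> ANFIS.
    % y: one year of aggregated index values (column vector, assumed given).
    Z = [y(1:end-2), y(2:end-1), y(3:end)];     % two inputs, one output target
    trn = Z(1:end-100, :);                      % n-100 days for training
    tst = Z(end-99:end, :);                     % last 100 days for testing

    fis0 = genfis2(trn(:, 1:2), trn(:, 3), 0.5);  % initial FIS, range of influence 0.5
    fis  = anfis(trn, fis0, 100);                 % 100 epochs of hybrid learning
    yhat = evalfis(tst(:, 1:2), fis);             % out-of-sample one-step forecasts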

3.3 Data and Technology


• 7-year data of the Bombay Stock Exchange BSE30 index.

• Tools of MATLAB Software:

· Fuzzy Logic Toolbox

· Statistics Toolbox

· Neural Network Toolbox

Figure 3.11: Flow Diagram of Proposed Methodology

Chapter 4

RESULTS AND DISCUSSION

4.1 Discussion from Phase I


We have confined our field of study to stock market prediction by the use of an aggregation
operator, Fuzzy C-Means Clustering (FCM) and the Adaptive Neuro-Fuzzy Inference
System (ANFIS). FCM helps in proper interval determination, while ANFIS combines
ANNs and fuzzy time series to provide excellent prediction. The aggregation operator,
Ordered Weighted Averaging (OWA), helps in decreasing complexity substantially. We
have read literature that applies OWA, FCM and ANFIS in some way or other, but very
little has been done to test different variants of OWA. Hence we have read about a number
of variants of OWA and also some other aggregation operators. We are still not sure
about the applicability of probabilistic OWA to stock market prediction problems but
will surely try to find this out in subsequent phases. We have also collected the sources
that provide raw data for BSE30.

4.2 Discussion from Phase II


We accomplished some very crucial things in this phase. We started with data collection,
for which we had to go through many websites that provide financial data of stock market
indices. We relied on Yahoo Finance, which provided us the raw data from 2006 to 2012.
Hence we are taking 7 years of BSE30 data under consideration.

Next, we started the work of data preprocessing. As the data was collected from
recognized sources, we did not have to worry about missing-value problems. The
important part of this step was the selection of the most important stock index value.
For this we had many contenders, such as opening value, peak value and closing value.
Out of these, the most important was the closing value, as it captures the complete
sentiment of the market. Hence we kept the closing data and removed all the redundant
data.

We then began the work of data aggregation, as it reduces the complexity of the overall
operation manyfold. As aggregation is the prime area of experimentation in our project,
we had to implement all the aggregation operators we have talked about. We started
with the basic implementation of OWA, which gave us the platform to test more operators
with similar characteristics but entirely different behavior. Then we implemented the
IOWA operator, where we sorted the data in the order of its significance. Then we moved
to POWA; here we realized that the applicability of POWA to our problem is very limited,
as we cannot be sure about the probability distribution of the data, and any vague or
unscientific assumption could produce absurd results. Hence we moved to GOWA, which
is itself a source of many other operators: the experimentation we did with the value of
λ resulted in GOWGA, GOWHA, GOWQA and GOWCA. At last we implemented the
QOWA operator, whose behavior depends entirely on the value of f(x); it can take us
back to OWA, OWHA and GOWA on choosing a suitable f(x). Implementation of the
exponential QOWA was the last experiment we performed as far as aggregation is
concerned.

Finally, we started the work of data clustering. As we are using ANFIS, FCM was the
best clustering algorithm among all. We implemented FCM from scratch: we first sorted
the data and selected centroids for the very first iteration. From the selected centroids
we built the distance matrix, which helped us to generate the fuzzy membership matrix.
These operations were performed

Aggregator Percentage RMS Difference
Ordered Weighted Averaging 1.158
Induced Ordered Weighted Averaging 1.393
Probabilistic Ordered Weighted Averaging N/A
Generalized Ordered Weighted Geometric Averaging 203.556
Generalized Ordered Weighted Harmonic Averaging 33.146
Generalized Ordered Weighted Quadratic Averaging 1.157
Generalized Ordered Weighted Cubic Averaging 1.159
Reciprocal - Quasi Ordered Weighted Averaging 33.146
Linear - Quasi Ordered Weighted Averaging 1.158
Exponential - Quasi Ordered Weighted Averaging N/A

Table 4.1: Percentage RMS Difference Corresponding to different OWA Operators

iteratively until the centroid values saturated.

4.3 Intermediate Results and Discussion from Phase III
The core implementation of the methodologies starts with this phase. In this section,
we discuss how things unfolded in the third phase of development. After the second
phase we were left with the aggregated stock market index data produced by the various
aggregation operators. We then took the aggregated data of three days as a near
approximation of the fourth day. The results of this analysis were very helpful in
determining the most promising aggregation operators from a prediction point of view:
we calculated the percentage RMS difference between the OWA of three days and the
original value of the fourth day, as reported in Table 4.1.
On discussing the results of this analysis, we could clearly see striking similarities
between a few pairs of aggregation operators. The Generalized Ordered Weighted
Harmonic Averaging operator is exactly the same as the Reciprocal Quasi Ordered
Weighted Averaging operator, and the same is the case with OWA and Linear-QOWA. We can

also notice that POWA and Exponential-QOWA can be discarded on the basis of the
non-applicability of these operators to the current system. GOWGA is generally a good
aggregator, but in this case it gives an absurdly high rate of error, so it too can be
discarded from further consideration. We are thus left with Basic-OWA, GOWQA,
GOWCA, GOWHA and IOWA.

Subsequently, we started the Fuzzy C Means clustering of the aggregated data for all
the years under consideration. Although we had written the code for FCM from scratch,
for providing the input to the ANFIS tool in MATLAB we used genfis3. The membership
curve extracted on applying Fuzzy C Means clustering to year 2012 is as follows.
After FCM we were fully equipped to implement ANFIS, which is the final part of our
research work. Using the training data, the testing data, the Generate-FIS MATLAB
functions and the anfisedit interface, we could finally reach the schematic diagram of
our ANFIS in MATLAB. Some important details first: we have a 5-layer ANFIS with
two inputs and one output, and for every year containing n working days we used n-100
days for training and 100 days for testing. The schematic overview of the ANFIS
implemented in MATLAB is shown in Figure 4.1.
Next we compared the RMSEs for the year 2012 using all the remaining operators; the
results are shown in Figure 4.2. There we can clearly see that OWA, GOWQA and
GOWCA give approximately the same RMSE, while GOWHA is more or less out of the
picture, as it could not provide the best RMSE for any value of orness. The most
interesting and useful behaviour is shown by IOWA, which gives the best RMSE at
α = 0.8. Hence, we compared OWA and Induced OWA for all the years; the results
are shown in Figure 4.3.
Our goal is to get the most precise prediction with the least Root Mean Square Error
(RMSE), calculated as follows:
RMSE = \sqrt{ \frac{ \sum_{t} \left( actual(t) - forecast(t) \right)^2 }{ n } }   (4.1)
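Over the test window of the earlier pipeline sketch this is a one-liner:

    rmse = sqrt(mean((tst(:, 3) - yhat).^2));   % Eq. (4.1) over the test days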

Figure 4.1: ANFIS Implemented on MATLAB

Figure 4.2: Comparison of RMSE by all Operators for Year 2012

By analysing Figure 4.3 we can clearly notice that Induced Ordered Weighted
Averaging gives much better performance than the Basic-OWA operator. The RMSE

Figure 4.3: Comparison of RMSE by OWA and Induced-OWA operators for all years

of IOWA is consistently lower than that of OWA. We cannot be very sure about the
reason for this behaviour. One explanation we can think of is that the method of
prioritizing the stock data of previous days and the assignment of weights is quite
different: OWA gives higher priority to the index that has a larger value, whereas
IOWA gives more weight to the index that is more recent.

4.4 Intermediate Results and Discussion from Phase IV
After the successful implementation of ANFIS using FCM and various aggregation opera-
tors, we came to a position where we had at least a rough idea of the performance of the
various aggregation operators on stock market problems. Still, there was scope for
improvement: Fuzzy C Means clustering was resulting in considerable RMSE, and we
wanted to reduce it even further. Hence, we decided to implement Subtractive Clustering
with ANFIS using the genfis2 function in MATLAB.

On implementation of subtractive clustering we measured the performance of prediction
on the testing data using RMSE. Surprisingly, the RMSE of the testing data improved
to a very large extent: for most of the years the RMSE remained below 20 points. This
was remarkable when compared to the traditional OWA operator on FCM.

We then decided to apply it to the latest stock market data. Hence, we took the BSE30
data from the BSE India website and did the testing on data from 2006 to 2014. Now
we were confident that the RMSE of our prediction would be around 20 points in any
year. Having implemented subtractive clustering, however, we were forgetting a very
important point: subtractive clustering, when used for fuzzy rule-base creation, has
many parameters which can have a huge impact on the prediction.

Actually, subtractive clustering has 4 important parameters. These are as follows:

• Range of Influence

• Squash Factor

• Accept Ratio

• Reject Ratio

We have to note that the range of influence is the most decisive factor, as it directly
impacts the number of clusters. Hence, it emerges as the most significant parameter in
determining the prediction performance. In Figure 4.4 we can see the impact of changing
the Range of Influence on the RMSE of prediction using traditional OWA for the year
2012; a sketch of such a sweep follows.
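A minimal sketch of such a sweep (reusing the trn/tst matrices from the pipeline sketch
in Chapter 3; the grid and epoch count are our own choices):

    % Sweep the range of influence and record the test RMSE for each value.
    rois = 0.1:0.1:0.9;
    rmse = zeros(size(rois));
    for i = 1:numel(rois)
        fis0    = genfis2(trn(:, 1:2), trn(:, 3), rois(i));
        fis     = anfis(trn, fis0, 100);
        pred    = evalfis(tst(:, 1:2), fis);
        rmse(i) = sqrt(mean((tst(:, 3) - pred).^2));
    end
    plot(rois, rmse), xlabel('Range of Influence'), ylabel('RMSE')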
Using subtractive clustering, we can then calculate the RMSE of the different remaining
operators (Figure 4.10) and finally decide which operator is the best replacement for
traditional OWA. We compare the RMSE for different values of orness for the year 2014.
From Table 4.2 and Figure 4.10 we can clearly see that OWA, GOWQA and GOWCA
give approximately the same RMSE, while GOWHA is more or less out of the picture, as it could
Figure 4.4: Impact of Range of Influence on RMSE (OWA/2012)

not provide the best RMSE for any value of orness. The most interesting and useful
behaviour is shown by IOWA, which gives the best RMSE at α = 0.8. But we still
haven't considered the Range of Influence parameter of subtractive clustering.
In Figure 4.11 we can compare the best RMSE values of IOWA and OWA over the range
of influence. The comparison is done for all the years 2006 to 2014. The tabular
representation of the desired comparison is given in Table 4.3.
Since we have taken the best RMSE values for both operators, OWA and IOWA, we must
also estimate the corresponding RMSEs for both operators to have a rational comparison.
Here "corresponding RMSE" means the RMSE measured at the same range of influence;
this can be seen in Figure 4.12 and Table 4.4.
After the comparison using the BSE30 data of 2009, we can state beyond doubt that
IOWA with subtractive clustering performs better than any other combination when
prediction is done by ANFIS. The comparison can be seen in Figure 4.13 and Figure
4.14.

α OWA IOWA GOWHA GOWQA GOWCA
0.5 70.324 62.457 94.002 69.414 71.214
0.6 64.965 49.887 100.515 65.651 69.165
0.7 78.072 28.718 96.243 76.521 77.021
0.8 75.982 7.596 121.494 75.128 74.812
0.9 83.299 14.272 113.453 82.241 83.901

Table 4.2: Table of RMSE for different Orness values for different operators in 2014

Years Best Case OWA Best Case IOWA


2006 30.760(0.4) 5.132(0.8)
2007 71.721(0.8) 10.841(0.8)
2008 158.210(0.4) 19.513(0.5)
2009 47.210(0.4) 7.405(0.6)
2010 41.760(0.8) 6.263(0.7)
2011 86.802(0.3) 9.572(0.3)
2012 27.866(0.3) 4.655(0.8)
2013 59.916(0.9) 10.043(0.5)
2014 48.221(0.8) 7.596(0.5)

Table 4.3: Best RMSE’s for different Years for IOWA and OWA, ROI in Parenthesis

Figure 4.5: RMSE for different Orness values for OWA operator in 2014

Figure 4.6: RMSE for different Orness values for IOWA operator in 2014

Figure 4.7: RMSE for different Orness values for GOWQA operator in 2014

Figure 4.8: RMSE for different Orness values for GOWHA operator in 2014

Figure 4.9: RMSE for different Orness values for GOWCA operator in 2014

Years Corresponding OWA Best Case IOWA


2006 32.578(0.8) 5.132(0.8)
2007 71.721(0.8) 10.841(0.8)
2008 158.221(0.5) 19.513(0.5)
2009 52.912(0.6) 7.405(0.6)
2010 42.735(0.7) 6.263(0.7)
2011 86.801(0.3) 9.572(0.3)
2012 34.731(0.8) 4.655(0.8)
2013 67.522(0.5) 10.043(0.5)
2014 70.324(0.5) 7.596(0.5)

Table 4.4: Corresponding RMSE’s for different Years for IOWA and OWA, ROI in
Parenthesis

Figure 4.10: RMSE for different Orness values for different operators in 2014

Figure 4.11: Graphical Representation of Best RMSE’s for different Years for IOWA
and OWA

Figure 4.12: Graphical Representation of Corresponding RMSE’s for different Years for
IOWA and OWA

Figure 4.13: IOWA Testing of Stock Market Indices for Recession Year 2009 using Fuzzy
C Means Clustering

Figure 4.14: IOWA Testing of Stock Market Indices for Recession Year 2009 using
Subtractive Clustering

Chapter 5

CONCLUSION

5.1 CONCLUSION
Conclusively, we can say that we delivered the research output we set out to, to a very
large extent. Starting a problem from scratch takes a good amount of effort, and this
problem was no exception. We started with the idea of coming up with an aggregator
that gives better prediction performance than the traditional OWA. The base paper we
referred to is based on the Taiwanese stock market, and we were not sure about its
applicability to its Indian counterpart.
We approached the problem by searching for different variants of the OWA operator;
luckily, we could find a considerable number of aggregators to start our research work
with. Once we had all the operators, we had to derive the weight vector for the
application of OWA. We referred to Yager's formula for the weight vector, aggregated
the data of all the years with different operators using a window of 3 days, and used
the result as an approximation of the stock index of the 4th day. This helped us
eliminate some weak contenders for the best-performing operator.
After applying ANFIS with all the potentially best-performing operators for the year
2014, we compared the RMSE of the different operators and concluded that the only
operator that can outperform the traditional OWA in stock market problems is IOWA.
We compared both operators for all the years using subtractive clustering and concluded
that IOWA reduces the RMSE manyfold.

5.2 FUTURE SCOPE
We have come a long way, from 100+ down to single-digit RMSE. As the estimator is
performing well, it can be built into a piece of software with an easy-to-use interface.
Even though we have reached a good RMSE, the estimator can be further improved by
experimentation on the other parameters of subtractive clustering, such as the squash
factor, accept ratio and reject ratio. We can also check the performance obtained by
applying grid partitioning.

Bibliography

[1] Cheng, C.H., Wei, L.Y., Liu, J.W., Chen, T.L., OWA-based ANFIS model for
TAIEX forecasting, Economic Modelling 30 (2013) 442–448.

[2] G. Kaur, J. Dhar, R.K. Guha, Financial Time Series Forecasting by combining
Adaptive Network-based Fuzzy Inference System with OWA Operator, In Press.

[3] J.M. Merigó, Fuzzy multi-person decision making with fuzzy probabilistic aggrega-
tion operators, International Journal of Fuzzy Systems 13(3) (2011) 163–174.

[4] Ronald R. Yager, Dimitar P. Filev, Induced Ordered Weighted Averaging Opera-
tors, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
29 (2) (1999) 141–150.

[5] Yager R., Kacprzyk J., Beliakov G., Recent Developments In The Ordered Weighted
Averaging Operators, Springer (2011), ISBN 3642179096.

[6] M. Detyniecki, Fundamentals on Aggregation Operators, Manuscript, Computer
Science Division, University of California, Berkeley (2001).

[7] R. Yager, On ordered weighted averaging aggregation operators in multi-criteria
decision making, IEEE Transactions on Systems, Man, and Cybernetics 18
(1988) 183–190.

[8] Liu, Jinpei, Lin, Sheng, Chen, Huayou, Zhou, Ligang, The Continuous Quasi-
OWA Operator and its Application to Group Decision Making, Group Decision
and Negotiation 22(4) (2001) 715–723.

[9] Huarng, K.H., Effective lengths of intervals to improve forecasting in fuzzy time
series, Fuzzy Sets and Systems 123 (2001) 155–162.

[10] Huarng, K.H., Yu, H.K., The application of neural networks to forecast fuzzy time
series, Physica A 336 (2006) 481–491.

[11] Huarng, K.H., Yu, H.K., Hsu, Y.W., A multivariate heuristic model for fuzzy time-
series forecasting, IEEE Transactions on Systems, Man, and Cybernetics, Part B:
Cybernetics 37 (4) (2007) 836–846.

[12] Chen, S.M., Chung, N.Y., Forecasting enrollments of students by using fuzzy
time series and genetic algorithms, International Journal of Information and Man-
agement Sciences 17 (2006) 1–17.

[13] Chen, S.M., Chung, N.Y., Forecasting enrollments using high-order fuzzy time series
and genetic algorithms, International Journal of Intelligent Systems 21 (2006) 485–501.

[14] Cheng, C.H., Chen, T.L., Chiang, C.H., Trend-weighted fuzzy time-series model for
TAIEX forecasting, Lecture Notes in Computer Science 4234 (2006) 469–477.

[15] Cheng, C.H., Wang, J.W., Li, C.H., Forecasting the number of outpatient visits
using a new fuzzy time series based on weighted-transitional matrix, Expert Systems
with Applications 34 (2008) 2568–2575.

[16] Cheng, C.H., Wei, L.Y., Volatility model based on multi-stock index for TAIEX
forecasting, Expert Systems with Applications 36 (3, Part 1) (2009) 6187–6191.

[17] Yu, H.K., Weighted fuzzy time-series models for TAIEX forecasting, Physica A 349
(2005) 609–624.

[18] J. Dunn, A fuzzy relative of the isodata process and its use in detecting compact,
well-separated clusters, Journal of Cybernetics 3 (1973) 32–57.

[19] R. Fuller and P. Majlender, An analytic approach for obtaining maximal entropy
OWA operator weights, Fuzzy Sets and Systems 124 (2001) 53–57.

[20] R. Fuller and P. Majlender, On obtaining minimal variability OWA operator
weights, Fuzzy Sets and Systems 136 (2003) 203–215.

[21] S. Chopra, R. Mitra and V. Kumar, Identification of Rules Using Subtractive Clus-
tering with Application to Fuzzy Controllers, Third International Conference on
Machine Learning and Cybernetics, Shanghai (2004) 4125–4130.

