Financial Time Series Forecasting by Combining Anfis With Various Aggregation Operators
M.Tech.
by
2015
CANDIDATE’S DECLARATION
I hereby certify that I have properly checked and verified all the items as prescribed in
the checklist and ensure that my thesis/report is in the proper format as specified in the
guideline for thesis preparation. I also declare that the work contained in this report is
my own work. I understand that plagiarism is defined as any one or combination of the
following:
• To steal and pass off (the ideas or words of another) as one's own
• To present as new and original an idea or product derived from an existing source.
I understand that plagiarism involves an intentional act by the plagiarist of using someone
else's work completely/partially and claiming authorship of the work/ideas. I have
given due credit to the original authors/sources for all the words, ideas, diagrams, graphics,
computer programmes, experiments, results and websites that are not my original
contribution. I have used quotation marks to identify verbatim sentences and given credit
to the original authors/sources.
I affirm that no portion of my work is plagiarized, and the experiments and results
reported in the report/dissertation/thesis are not manipulated. In the event of a complaint
of plagiarism or manipulation of the experiments and results, I shall be fully
responsible and answerable. My faculty supervisor(s) will not be responsible for the
same.
Signature:
Name:
Roll No. :
Date :
ACKNOWLEDGEMENTS
I am highly indebted to Prof. Anupam Shukla and Dr. Joydip Dhar for giving me
the autonomy to work and experiment with ideas. I would like to take this opportunity
to express my profound gratitude to them, not only for their academic guidance but
also for their interest in my project and constant support, coupled with confidence-boosting
and motivating sessions, which proved very fruitful and were instrumental in
instilling self-assurance and trust in me.
Date:
Amit Samdarshi
Contents
LIST OF TABLES vi
3.2.1 Data Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.2.2 Fuzzy C Means Clustering . . . . . . . . . . . . . . . . . . . . . . 12
3.2.3 Subtractive Clustering . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2.4 Adaptive Neuro-Fuzzy Inference System (ANFIS) Framework . . 14
3.3 Data and Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
5 CONCLUSION 36
5.1 CONCLUSION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.2 FUTURE SCOPE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
List of Figures
3.1 Fuzzy Membership Curve for the Year 2006 using Subtractive Clustering 15
3.2 Fuzzy Membership Curve for the Year 2007 using Subtractive Clustering 15
3.3 Fuzzy Membership Curve for the Year 2008 using Subtractive Clustering 15
3.4 Fuzzy Membership Curve for the Year 2009 using Subtractive Clustering 16
3.5 Fuzzy Membership Curve for the Year 2010 using Subtractive Clustering 16
3.6 Fuzzy Membership Curve for the Year 2011 using Subtractive Clustering 16
3.7 Fuzzy Membership Curve for the Year 2012 using Subtractive Clustering 17
3.8 Fuzzy Membership Curve for the Year 2013 using Subtractive Clustering 17
3.9 Fuzzy Membership Curve for the Year 2014 using Subtractive Clustering 17
3.10 ANFIS Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.11 Flow Diagram of Proposed Methodology . . . . . . . . . . . . . . . . . . 20
4.11 Graphical Representation of Best RMSE’s for different Years for IOWA
and OWA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.12 Graphical Representation of Corresponding RMSE’s for different Years
for IOWA and OWA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.13 IOWA Testing of Stock Market Indices for Recession Year 2009 using
Fuzzy C Means Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.14 IOWA Testing of Stock Market Indices for Recession Year 2009 using
Subtractive Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
List of Tables
List of Abbreviations
Chapter 1
INTRODUCTION AND
MOTIVATION
1.1 Introduction
These days, economies have become very market-driven, and correct prediction of the
mood of the market has become the key to excelling in business, investment, policy making
and many other areas. Investors have always shown great interest in stock market
investment because of its high returns. However, stock market investment is always
risky due to its uncertain and unpredictable nature. Investors, finance managers and
researchers have been trying to find new ways to predict the stock market more accurately,
and in recent times a tremendous amount of research has been published on predicting
its uncertain behaviour. In recent years, artificial intelligence (AI) tools such
as genetic algorithms, fuzzy logic and neural networks have been used successfully for
stock market prediction. AI techniques can solve complex problems that classical techniques
handle poorly [9], so AI methods have become very popular among researchers,
and many have employed AI techniques for stock market prediction
[1, 2, 9, 10, 11, 12, 14, 17]. In most modern research in
the field of stock prediction, aggregation has been a step of great importance, but
experimentation on improving prediction by applying the various
aggregation operators developed in recent years has been lacking. Hence,
our major emphasis will be on applying different aggregation operators to the
stock prediction problem.
Clustering is used to produce a concise representation of a large data set by
dividing the data into small clusters, so that members of the same cluster are quite
similar and members of different clusters are quite different from each other. FCM
was proposed by Dunn [18]. FCM is a technique in which each element of the data belongs
to one or more clusters. FCM minimizes an objective function, given as follows:

J(U, c_1, \ldots, c_c) = \sum_{i=1}^{c} J_i = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2}    (1.1)

where U is the membership matrix, whose elements u_{ij} lie between 0 and
1, d_{ij} = \|c_i - x_j\| is the Euclidean distance between the j-th data point and the i-th cluster
center, c_i is the center of the i-th fuzzy cluster, and m (greater than 1) is a weighting
exponent.
1.3 Motivation
Stock market prediction is an area where innovation translates directly into profits.
A lot of work has been done in this area, but much more is still needed for
considerably precise predictions. Stock market prediction was initially considered purely a
topic of research for mathematicians. This norm was soon broken with the introduction of
artificial intelligence, intuition and soft computing to this area of research.
In the subsequent sections we will see that there are many variants of OWA that can
be used for aggregation. Not all of them have been tested in stock prediction
research. Hence, we would like to explore and implement all the aggregation operators
mentioned above and find out which variant of OWA helps us reach the most
precise results.
Chapter 2
Chen and Chung [12, 13] gave a new fuzzy time series model with genetic algorithms to
determine the ideal length of linguistic intervals. Subsequently, Huarng and Yu [10] came
up with a fuzzy time-series model that employed ANNs to extract the nonlinear fuzzy
relationships. However, to improve accuracy, more than one variable can be considered
when building models. Cheng and Wei [16] proposed a volatility model based on a multi-stock
index for Taiwan Stock Exchange (TAIEX) forecasting, whereas Cheng et al.
[1] presented work on TAIEX that used Ordered Weighted Averaging (OWA),
Subtractive Clustering (SC) and the Adaptive Neuro-Fuzzy Inference System (ANFIS). They
came up with predictions that were considerable from the TAIEX perspective. We have
taken the work done by Cheng et al. [1] as our primary reference and used the BSE30
dataset from the years 2006 to 2014 for testing.
In 1988, Yager came up with two important characteristic measures of the weight vector
W of an OWA operator, which later proved to be very important. The first measure is the
orness of the aggregation, denoted by α:

\alpha(W) = \frac{1}{n-1} \sum_{i=1}^{n} (n - i)\, w_i    (2.1)

The other characteristic measure is the dispersion of the weight vector, defined as:

Disp(W) = -\sum_{i=1}^{n} w_i \ln(w_i)    (2.2)

Orness has a special property. Say \alpha(W) is the orness of W = [w_1, w_2, \ldots, w_n]; then the
orness of the reversed vector W' = [w_n, w_{n-1}, \ldots, w_1] is:

\alpha(W') = 1 - \alpha(W)    (2.3)
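As a quick illustration of the two measures, they can be computed directly from a weight vector. The thesis's experiments were done in MATLAB; the Python sketch below is our own illustration, and the function names are assumptions:

```python
import math

def orness(w):
    # Yager's orness: alpha(W) = 1/(n-1) * sum_{i=1}^{n} (n - i) * w_i
    n = len(w)
    return sum((n - i) * wi for i, wi in enumerate(w, start=1)) / (n - 1)

def dispersion(w):
    # Disp(W) = -sum_i w_i * ln(w_i), with the convention 0 * ln(0) = 0
    return -sum(wi * math.log(wi) for wi in w if wi > 0)

print(orness([1.0, 0.0, 0.0]))      # 1.0 -- the "or-like" (max) extreme
print(orness([0.0, 0.0, 1.0]))      # 0.0 -- the "and-like" (min) extreme
print(orness([1/3, 1/3, 1/3]))      # 0.5 -- plain averaging
print(dispersion([1/3, 1/3, 1/3]))  # ln 3, the maximal dispersion for n = 3
```

The reversal property (2.3) falls out directly: reversing the weight vector flips its orness about 0.5.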
Orness     w1       w2       w3
α = 0.5    0.3333   0.3333   0.3333
α = 0.6    0.4384   0.3232   0.2384
α = 0.7    0.5540   0.2920   0.1540
α = 0.8    0.6819   0.2358   0.0819
α = 0.9    0.8263   0.1470   0.0263
In 2001, Fuller and Majlender [20] transformed Yager's OWA weights to a polynomial
form using Lagrange multipliers. This polynomial form is used for the calculation of the weights:

\ln w_j = \frac{j-1}{n-1} \ln w_n + \frac{n-j}{n-1} \ln w_1    (2.4)

w_j = \sqrt[n-1]{w_1^{\,n-j}\, w_n^{\,j-1}}    (2.5)

w_1 \left[(n-1)\alpha + 1 - n w_1\right]^n = \left[(n-1)\alpha\right]^{n-1} \left[((n-1)\alpha - n) w_1 + 1\right]    (2.6)

If w_1 = w_2 = \ldots = w_n = 1/n, then \alpha = 0.5 and disp(W) = \ln n.
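The weights in the table above can be recovered from these equations: solve (2.6) for w_1, obtain w_n from (2.6)'s bracketed factors, and fill in the rest with (2.5). A Python sketch (our own illustration, not the thesis's MATLAB code) using bisection on w_1:

```python
def max_entropy_owa_weights(n, alpha):
    """Maximal-entropy OWA weights for a given orness alpha (Fuller & Majlender).

    Solves eq. (2.6) for w1 by bisection, then fills in the remaining
    weights via eq. (2.5). Illustrative sketch only."""
    if abs(alpha - 0.5) < 1e-12:
        return [1.0 / n] * n
    if alpha < 0.5:
        # Reversal property (2.3): reverse the weights for orness 1 - alpha.
        return max_entropy_owa_weights(n, 1.0 - alpha)[::-1]
    s = (n - 1) * alpha
    def g(w1):  # eq. (2.6) rearranged so that g(w1) = 0 at the solution
        return w1 * (s + 1 - n * w1) ** n - s ** (n - 1) * ((s - n) * w1 + 1)
    # The nontrivial root lies between 1/n and 1/(n - s), where w_n would hit 0.
    lo, hi = 1.0 / n + 1e-9, 1.0 / (n - s) - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    w1 = 0.5 * (lo + hi)
    wn = ((s - n) * w1 + 1) / (s + 1 - n * w1)
    # eq. (2.5): geometric interpolation between w1 and wn
    return [(w1 ** (n - j) * wn ** (j - 1)) ** (1.0 / (n - 1))
            for j in range(1, n + 1)]

print(max_entropy_owa_weights(3, 0.7))  # ≈ [0.554, 0.292, 0.154], as in the table
```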
The OWA operator, first introduced by Yager [7], has been widely used by researchers. Fuller
and Majlender [19] determined the optimal weighting vector by solving a constrained
optimization problem using Lagrange multipliers. Fuller and Majlender [20]
employed the Kuhn-Tucker second-order sufficiency conditions to optimize and derive
OWA weights. An OWA operator is defined as follows: an OWA operator of dimension
n is a mapping f: R^n \to R that has an associated weighting vector W = [w_1, w_2, \ldots, w_n]^T
with the properties w_i \in [0, 1] for i \in I = \{1, 2, 3, \ldots, n\} and \sum_{i=1}^{n} w_i = 1, such
that

f(a_1, a_2, \ldots, a_n) = \sum_{i=1}^{n} w_i b_i    (2.8)

where b_i is the i-th largest of the a_j.
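The definition above amounts to sorting the arguments in descending order and taking a weighted sum; the weights attach to positions, not to particular arguments. A minimal Python sketch (our own illustration, not the thesis's MATLAB code):

```python
def owa(values, weights):
    # OWA: b_i is the i-th largest input; f(a_1..a_n) = sum_i w_i * b_i.
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    b = sorted(values, reverse=True)
    return sum(w * x for w, x in zip(weights, b))

# The weight vector interpolates between max, min and the arithmetic mean:
print(owa([0.3, 0.9, 0.6], [1.0, 0.0, 0.0]))   # 0.9  -> behaves as max
print(owa([0.3, 0.9, 0.6], [0.0, 0.0, 1.0]))   # 0.3  -> behaves as min
print(owa([0.3, 0.9, 0.6], [1/3, 1/3, 1/3]))   # ≈ 0.6 -> arithmetic mean
```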
The definition of IOWA is presented here as given by Yager and Filev in [4]; in particular,
their convention for ties is used. Given a weighting vector w and an inducing variable z,
the Induced Ordered Weighted Averaging (IOWA) function is

IOWA_w(\langle x_1, z_1 \rangle, \ldots, \langle x_n, z_n \rangle) = \sum_{i=1}^{n} w_i x_{(i)}    (2.9)

where the (\cdot) notation denotes the inputs (x_i, z_i) reordered such that z_{(1)} \ge z_{(2)} \ge \ldots \ge z_{(n)}. The
input pairs (x_i, z_i) may be two independent features of the same input, or can be related
by some function, i.e. z_i = f_i(x_i).
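The key difference from OWA is that the reordering is driven by the inducing variable z, not by the magnitudes of the x values themselves. A Python sketch (our own illustration; it ignores Yager and Filev's tie-handling convention for brevity):

```python
def iowa(pairs, weights):
    """IOWA: reorder the values x_i by their inducing variable z_i (largest z
    first), then apply the OWA weights to the reordered values.
    `pairs` is a list of (x_i, z_i) tuples; ties in z are not averaged here."""
    reordered = [x for x, z in sorted(pairs, key=lambda p: p[1], reverse=True)]
    return sum(w * x for w, x in zip(weights, reordered))

# Inducing by recency (z = day index): the newest closing price gets the
# largest weight regardless of its magnitude, unlike plain OWA.
prices = [(102.0, 1), (98.0, 2), (105.0, 3)]   # (closing price, day)
print(iowa(prices, [0.5, 0.3, 0.2]))           # 0.5*105 + 0.3*98 + 0.2*102 ≈ 102.3
```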
GOWA_w(a_1, a_2, \ldots, a_n) = \left( \sum_{i=1}^{n} w_i b_i^{\lambda} \right)^{1/\lambda}    (2.10)

where b_i is the i-th largest of the a_j, and \lambda is a parameter such that \lambda \in (-\infty, \infty).
When \lambda \to 0, the GOWA operator becomes the Generalized Ordered Weighted
Geometric Averaging (GOWGA) operator:

GOWGA_w(a_1, a_2, \ldots, a_n) = \prod_{i=1}^{n} b_i^{w_i}    (2.11)

Note that if w_i = 1/n for all i, we get the Generalized Geometric Aggregation (GGA) operator.
Generalized Harmonic Aggregation (GHA)
When \lambda = -1, the GOWA operator becomes the Generalized Ordered Weighted
Harmonic Averaging (GOWHA) operator:

GOWHA_w(a_1, a_2, \ldots, a_n) = \frac{1}{\sum_{i=1}^{n} w_i / b_i}    (2.12)

Note that if w_i = 1/n for all i, we get the Generalized Harmonic Aggregation (GHA) operator.
When \lambda = 2, the GOWA operator becomes the Generalized Ordered Weighted Quadratic
Averaging (GOWQA) operator:

GOWQA_w(a_1, a_2, \ldots, a_n) = \left( \sum_{i=1}^{n} w_i b_i^{2} \right)^{1/2}    (2.13)

Note that if w_i = 1/n for all i, we get the Generalized Quadratic Aggregation (GQA) operator.
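The whole GOWA family can live in one function parameterized by λ, with the geometric case handled as the λ → 0 limit. A Python sketch (our own illustration, not the thesis's MATLAB code):

```python
import math

def gowa(values, weights, lam):
    """GOWA: f(a) = (sum_i w_i * b_i**lam)**(1/lam), with b_i the i-th largest
    input. lam -> 0 recovers the geometric (GOWGA) case, lam = -1 the harmonic
    (GOWHA) case and lam = 2 the quadratic (GOWQA) case."""
    b = sorted(values, reverse=True)
    if abs(lam) < 1e-12:
        # Limiting geometric case: prod_i b_i ** w_i, computed via logs.
        return math.exp(sum(w * math.log(x) for w, x in zip(weights, b)))
    return sum(w * x ** lam for w, x in zip(weights, b)) ** (1.0 / lam)

vals, w = [1.0, 2.0, 4.0], [1/3, 1/3, 1/3]
print(gowa(vals, w, 1))    # arithmetic mean ≈ 2.333
print(gowa(vals, w, 0))    # geometric mean = 2.0
print(gowa(vals, w, -1))   # harmonic mean ≈ 1.714
print(gowa(vals, w, 2))    # quadratic mean ≈ 2.646
```

With equal weights the four cases reduce to the classical arithmetic, geometric, harmonic and quadratic means, illustrating the λ-ordering of the means.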
Linear QOWA
Reciprocal QOWA
Polynomial QOWA
Exponential QOWA
Chapter 3
OBJECTIVE AND
METHODOLOGY
3.1 Objective
Our objective is to identify the aggregation operator that helps us get the most
accurate predictions of the stock market index using the Adaptive Neuro-Fuzzy Inference System
and subtractive clustering.
3.2 Methodology
In the last chapter, we discussed data aggregation and different aggregation
operators at length. Averaging is very important for capturing a comprehensive impression of a set of data
in a single value. The most popular aggregation operator is the OWA operator. We
have used many variations of OWA in our project, namely: OWA, IOWA,
POWA, GOWA, GOWGA, GOWHA, GOWQA and QOWA.
3.2.2 Fuzzy C Means Clustering
Step 2: We sorted the data, and out of N data entries we assigned C_low = the (N/6)-th data point,
C_medium = the (3N/6)-th data point and C_high = the (5N/6)-th data point. These are selected
because, on dividing the sorted data into three equal parts, the medians of the parts fall
at positions N/6, 3N/6 and 5N/6.
Step 4: Once we are done with the creation of the distance matrix, we can calculate the fuzzy
membership matrix:

\mu_i(x_j) = \frac{(1/d_{ij})^{1/(m-1)}}{\sum_{k=1}^{p} (1/d_{kj})^{1/(m-1)}}    (3.1)

We also make sure that the following condition holds for every data point x_j:

\sum_{i=1}^{p} \mu_i(x_j) = 1    (3.2)

Step 5: Based on the acquired membership matrix, we found new centroids:

C_i = \frac{\sum_j [\mu_i(x_j)]^m x_j}{\sum_j [\mu_i(x_j)]^m}    (3.3)

Step 6: We kept repeating Step 3 to Step 5 until the variation in the centroids
saturated.
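The alternating update of equations (3.1) and (3.3) can be sketched compactly. The Python below is our own illustration of the loop on 1-D data (the thesis's implementation was in MATLAB), using a percentile-style initialisation in the spirit of Step 2:

```python
def fcm(data, c, m=2.0, iters=100, eps=1e-9):
    """Fuzzy C Means on 1-D data: alternate the membership update
    u_ij = 1 / sum_k (d_ij / d_kj)^(2/(m-1)) with the centroid update
    c_i = sum_j u_ij^m x_j / sum_j u_ij^m. Illustrative sketch only."""
    xs = sorted(data)
    # Initial centroids at the medians of c equal parts (Step 2 style).
    centers = [xs[int((2 * i + 1) * len(xs) / (2 * c))] for i in range(c)]
    for _ in range(iters):
        u = []
        for x in data:
            d = [max(abs(x - ci), eps) for ci in centers]  # guard zero distance
            u.append([1.0 / sum((d[i] / d[k]) ** (2 / (m - 1)) for k in range(c))
                      for i in range(c)])
        new = [sum(u[j][i] ** m * x for j, x in enumerate(data)) /
               sum(u[j][i] ** m for j in range(len(data)))
               for i in range(c)]
        done = max(abs(a - b) for a, b in zip(new, centers)) < 1e-7
        centers = new
        if done:  # Step 6: stop once the centroids saturate
            break
    return centers

data = [0.9, 1.0, 1.1, 9.9, 10.0, 10.1]
print(sorted(fcm(data, 2)))   # centroids near 1.0 and 10.0
```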
3.2.3 Subtractive Clustering
The subtractive clustering algorithm is used when it is hard to have a good idea of the number
of clusters a dataset should have. Subtractive clustering is a fast, one-pass
method for finding the number of clusters along with the cluster centers for a given
dataset. The clusters obtained by the subclust or genfis2 functions are used to initialize
iterative optimization clustering methods and model identification methods (like anfis).
The subclust function returns the clusters found by subtractive clustering; genfis2 builds upon
subclust to provide a fast, single-pass method that takes input-output
training data and generates a Sugeno FIS (fuzzy inference system) that can model
the data behavior.
There are four main parameters of subtractive clustering in genfis2. These are as follows:
• Accept Ratio: the fraction of the first cluster center's potential above
which another data point is accepted as a cluster center.
• Reject Ratio: the fraction of the first cluster center's potential below
which a data point is rejected as a cluster center [21].
• Step 1: The potential of each data point x_i is computed from its distances to all
other data points:

P_i = \sum_{j=1}^{n} e^{-\alpha \|x_i - x_j\|^2}, \quad \text{where } \alpha = 4/r_a^2

r_a is a positive constant, the radius of the neighborhood to be considered; it
depends on the range of influence. A data point that has many neighbors will have
a high potential value.
• Step 2: The data point with the highest potential becomes the first cluster center
[21]. Let us say x_1^* is the first cluster center with potential value P_1^*; then the potential
of each point is revised as:

P_i \leftarrow P_i - P_1^* \, e^{-\beta \|x_i - x_1^*\|^2}, \quad \text{where } \beta = 4/r_b^2

• Step 3: Similarly, after k cluster centers have been obtained, the potential of each data point
is revised as:

P_i \leftarrow P_i - P_k^* \, e^{-\beta \|x_i - x_k^*\|^2}

where \beta = 4/r_b^2 and x_k^* is the location of the k-th cluster center, which has potential P_k^*.
• Step 4: The process of making new clusters continues until we reach a position where
all the remaining potentials are below a certain fraction of the first cluster center's
potential. Hence, we get the optimum number of clusters based on the chosen range of
influence. Please note that the range of influence is the most decisive factor,
as it directly impacts the number of clusters: a lower range of influence results
in a higher number of clusters, and a higher range of influence results in fewer
clusters.
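The four steps above can be sketched as follows. This Python sketch is our own illustration of the core potential-based loop on 1-D data (not the thesis's MATLAB/genfis2 code); it keeps only the range of influence and a single stopping fraction, omitting the accept/reject ratio subtleties:

```python
import math

def subtractive_clustering(data, ra=1.0, rb=1.5, stop_ratio=0.15):
    """Chiu-style subtractive clustering on 1-D data (illustrative sketch).
    Each point gets a potential from its neighbours (Step 1); the
    highest-potential point becomes a centre and nearby potentials are
    suppressed (Steps 2-3); the loop stops once the best remaining potential
    falls below a fraction of the first centre's potential (Step 4)."""
    alpha, beta = 4.0 / ra ** 2, 4.0 / rb ** 2
    # Step 1: potential of every point
    p = [sum(math.exp(-alpha * (x - y) ** 2) for y in data) for x in data]
    p1 = max(p)
    centers = []
    while True:
        k = max(range(len(data)), key=lambda i: p[i])
        if p[k] < stop_ratio * p1:          # Step 4: stopping criterion
            break
        centers.append(data[k])             # Steps 2-3: accept centre, revise
        pk = p[k]
        p = [pi - pk * math.exp(-beta * (x - data[k]) ** 2)
             for pi, x in zip(p, data)]
    return centers

print(subtractive_clustering([0.0, 0.1, 0.2, 5.0, 5.1, 5.2]))  # one centre per blob
```

Shrinking `ra` raises the number of centres found, matching the remark above about the range of influence.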
Figure 3.1: Fuzzy Membership Curve for the Year 2006 using Subtractive Clustering
Figure 3.2: Fuzzy Membership Curve for the Year 2007 using Subtractive Clustering
Figure 3.3: Fuzzy Membership Curve for the Year 2008 using Subtractive Clustering
Layer 1: Every node i in this layer is a square node with a node function.
Figure 3.4: Fuzzy Membership Curve for the Year 2009 using Subtractive Clustering
Figure 3.5: Fuzzy Membership Curve for the Year 2010 using Subtractive Clustering
Figure 3.6: Fuzzy Membership Curve for the Year 2011 using Subtractive Clustering
where x and y are the inputs to node i, and A_i (or B_{i-2}) is the linguistic label associated with the node. In
Figure 3.7: Fuzzy Membership Curve for the Year 2012 using Subtractive Clustering
Figure 3.8: Fuzzy Membership Curve for the Year 2013 using Subtractive Clustering
Figure 3.9: Fuzzy Membership Curve for the Year 2014 using Subtractive Clustering
other words, O_{1,i} is the membership grade of A_i (or B_{i-2}). Normally the membership functions
Figure 3.10: ANFIS Architecture
for \mu_{A_i}(x) and \mu_{B_i}(y) are chosen to be generalized bell functions:

\mu_{A_i}(x) = \frac{1}{1 + \left| (x - c_i)/a_i \right|^{2 b_i}}    (3.9)
where ai , bi , ci are the parameters of membership function. These are also known as the
premise parameters.
Layer 2: Every node in this layer is a circular node labeled Π, whose output is the
product of all the incoming signals:

O_{2,i} = w_i = \mu_{A_i}(x)\, \mu_{B_i}(y), \quad i = 1, 2    (3.10)

Layer 3: Every node in this layer is a circular node labeled N. The i-th node calculates
the ratio of the i-th rule's firing strength to the sum of the firing strengths of all the rules:

O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2    (3.11)
Layer 4: Every node i in this layer is a square node with a node function:

O_{4,i} = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i)    (3.12)

where \bar{w}_i is the normalized firing strength from layer 3 and \{p_i, q_i, r_i\} is the
consequent parameter set.
Layer 5: This is a single circular node labeled Σ, which computes the overall output
as the summation of all incoming signals:

O_{5,1} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} = f_{out}    (3.13)
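The five layers compose into a simple forward pass. The Python sketch below is our own illustration of a minimal two-rule, two-input Sugeno ANFIS (equations (3.9) to (3.13)); the function names and parameter layout are assumptions, and no training (premise/consequent parameter learning) is shown:

```python
def bell(x, a, b, c):
    # Generalized bell membership, eq. (3.9): 1 / (1 + |(x - c)/a|^(2b))
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def anfis_forward(x, y, premise, consequent):
    """Forward pass of a minimal two-rule Sugeno ANFIS (layers 1-5).
    premise: four (a, b, c) bell parameter triples for A1, A2, B1, B2;
    consequent: one (p, q, r) triple per rule."""
    A1, A2, B1, B2 = premise
    # Layer 1: membership grades
    mu = [bell(x, *A1), bell(x, *A2), bell(y, *B1), bell(y, *B2)]
    # Layer 2: firing strengths (product nodes), eq. (3.10)
    w = [mu[0] * mu[2], mu[1] * mu[3]]
    # Layer 3: normalisation, eq. (3.11)
    wbar = [wi / sum(w) for wi in w]
    # Layer 4: rule consequents f_i = p_i x + q_i y + r_i, eq. (3.12)
    f = [p * x + q * y + r for p, q, r in consequent]
    # Layer 5: overall output, eq. (3.13)
    return sum(wb * fi for wb, fi in zip(wbar, f))

# If both rules share the consequent f = x + y, the output is x + y exactly,
# since the normalised firing strengths sum to one.
out = anfis_forward(1.0, 2.0,
                    [(1, 2, 0), (1, 2, 1), (1, 2, 0), (1, 2, 1)],
                    [(1, 1, 0), (1, 1, 0)])
print(out)   # ≈ 3.0
```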
Figure 3.11: Flow Diagram of Proposed Methodology
Chapter 4
Now we started the work of data preprocessing. As the data is collected
from recognized sources, we did not have to worry about missing values. The
important part of this step was selecting the most important stock index attribute.
For this we had many contenders: opening value, peak value, closing value, etc. Of
these, the most important was the closing value, as it captures the complete
sentiment of the market. Hence we kept the closing data and removed all the redundant
data.
Next we started the work of data aggregation, as it reduces the complexity of the
overall operation manyfold. As aggregation is the prime area of experimentation in
our project, we had to implement all the aggregation operators discussed above.
We started with the basic implementation of OWA, which gave us the platform to test
more operators with similar characteristics but entirely different behavior.
Then we implemented the IOWA operator, where we sorted the data in the order of its
significance. Then we moved to POWA; here we realized that the applicability of POWA
to our problem is very limited, as we cannot be sure about the probability distribution
of the data, and any vague or unscientific assumption can produce absurd results. Hence we
moved to GOWA, which is itself a source of many other operators. The experimentation
we did with the value of λ resulted in GOWGA, GOWHA, GOWQA and GOWCA. At
last we implemented the QOWA operator, whose behavior depends entirely on the
choice of f(x); it can take us back to OWA, OWHA and GOWA for suitable f(x).
Implementation of the exponential QOWA was the last experiment we performed as
far as aggregation is concerned.
Finally, we started the work of data clustering. As we are using ANFIS, FCM
was the best-suited clustering algorithm. We implemented FCM from
scratch: we first sorted the data and selected centroids for the very first iteration.
According to the selected centroids we built the distance matrix, which
helped us generate the fuzzy membership matrix. These operations were performed
Aggregator                                         Percentage RMS Difference
Ordered Weighted Averaging                         1.158
Induced Ordered Weighted Averaging                 1.393
Probabilistic Ordered Weighted Averaging           N/A
Generalized Ordered Weighted Geometric Averaging   203.556
Generalized Ordered Weighted Harmonic Averaging    33.146
Generalized Ordered Weighted Quadratic Averaging   1.157
Generalized Ordered Weighted Cubic Averaging       1.159
Reciprocal - Quasi Ordered Weighted Averaging      33.146
Linear - Quasi Ordered Weighted Averaging          1.158
Exponential - Quasi Ordered Weighted Averaging     N/A
We also notice that POWA and Exponential-QOWA can be discarded on the basis of the
non-applicability of the operator to the current system. GOWGA is a very good aggregator,
but in this case it gives an absurdly high rate of error, so it can also be discarded from
further consideration. We are left with Basic-OWA, GOWQA, GOWCA, GOWHA
and IOWA.
Subsequently, we started the Fuzzy C Means clustering of the aggregated data for all
the years under consideration. Although we had written the code for FCM
from scratch, for providing the input to the ANFIS tool in MATLAB we used
genfis3. The membership curve extracted on applying Fuzzy C Means clustering to the
year 2012 is as follows.
After FCM we were fully equipped to implement ANFIS, which is the latter part
of our research work. Using the training data, testing data, the Generate-FIS MATLAB
functions and the ANFIS-edit interface, we could finally reach the schematic diagram of our
ANFIS in MATLAB. Some important points before getting to the
ANFIS diagram: we have a 5-layer ANFIS with two inputs and one output. For every
year containing n working days, we used n − 100 days for training and 100 days
for testing. Following is the schematic overview of the ANFIS implemented in MATLAB:
Now we compared the RMSEs for the year 2012 using all the operators under consideration.
The results of these operations are as follows:
From the above figure we can clearly see that OWA, GOWQA and GOWCA give
approximately the same RMSE, while GOWHA is more or less out of the picture, as it could
not provide the best RMSE for any value of orness. The most interesting and useful
behaviour is shown by IOWA, which gives the best RMSE at α = 0.8. Hence, we compared
OWA and Induced OWA for all the years. The results are as follows:
Our goal is to get the most precise prediction with the least Root Mean Square Error (RMSE),
calculated as follows:

RMSE = \sqrt{ \frac{\sum_{t} \left( actual(t) - forecast(t) \right)^2}{n} }    (4.1)
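Equation (4.1) is a one-liner in code. As an illustrative Python sketch (the thesis's computations were done in MATLAB):

```python
import math

def rmse(actual, forecast):
    # RMSE = sqrt( sum_t (actual(t) - forecast(t))^2 / n ), eq. (4.1)
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)

# A single 3-point miss over three days contributes sqrt(9/3) = sqrt(3):
print(rmse([100.0, 102.0, 101.0], [100.0, 102.0, 104.0]))  # ≈ 1.732
```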
Figure 4.1: ANFIS Implemented on MATLAB
By analysing Figure 4.6 we can clearly notice that Induced Ordered Weighted
Averaging performs much better than the Basic-OWA operator. The RMSE
of IOWA has been consistently lower than that of OWA. We cannot be very sure about
the reason for this behaviour. One explanation we can think of is that the two operators
prioritize the stock data of previous days and assign weights quite differently:
OWA gives higher priority to the index with a larger value, whereas IOWA
gives more weight to the index that is more recent.
Figure 4.3: Comparison of RMSE by OWA and Induced-OWA operators for all years
Clustering with ANFIS using the genfis2 function given in MATLAB.
Next we decided to implement it on the latest stock market data. Hence, we took
the BSE30 data from the BSE India website and did the testing on data from 2006 to 2014.
Now we were confident that the RMSE of our prediction would be around 20 points in any
year. Having implemented subtractive clustering, however, we were overlooking a very important
point: subtractive clustering, when used for fuzzy rule-base creation, has several parameters
which can have a huge impact on the prediction:
• Range of Influence
• Squash Factor
• Accept Ratio
• Reject Ratio
We have to note that the range of influence is the most decisive factor, as it directly
impacts the number of clusters. Hence, it comes up as the most significant parameter in
deciding the performance of the prediction. In Figure 4.4 we can see the impact of a change in
the range of influence on the RMSE of the prediction using traditional OWA for the year 2012.
Using subtractive clustering, we can calculate the RMSE of the different operators under
consideration in Figure 4.10 and finally decide which operator is the best replacement for
traditional OWA. We compare the RMSE for different values of orness for the year 2014.
From the above figure we can clearly see that OWA, GOWQA and GOWCA give
approximately the same RMSE, while GOWHA is more or less out of the picture, as it could
Figure 4.4: Impact of Range of Influence on RMSE (OWA/2012)
not provide the best RMSE for any value of orness. The most interesting and useful
behaviour is shown by IOWA, which gives the best RMSE at α = 0.8. But we still have not
considered the range of influence parameter of subtractive clustering.
In Figure 4.11 we can compare the best RMSE values of IOWA and OWA for
some range of influence. The comparison is done over all the years 2006 to 2014. The
tabular representation of the desired comparison is as follows:
Since we have taken the best RMSE values for both operators, OWA and
IOWA, we also have to take an estimate of the corresponding RMSEs for both operators
to have a rational comparison. Here the corresponding RMSE stands for the RMSE
measured at the same range of influence; we can see this in Figure 4.12.
After the comparison using the BSE30 data of 2009, we can state beyond doubt that
IOWA with subtractive clustering performs better than any other combination when
the prediction is done by ANFIS. The comparison can be seen in Figure 4.13 and Figure
4.14.
α OWA IOWA GOWHA GOWQA GOWCA
0.5 70.324 62.457 94.002 69.414 71.214
0.6 64.965 49.887 100.515 65.651 69.165
0.7 78.072 28.718 96.243 76.521 77.021
0.8 75.982 7.596 121.494 75.128 74.812
0.9 83.299 14.272 113.453 82.241 83.901
Table 4.2: Table of RMSE for different Orness values for different operators in 2014
Table 4.3: Best RMSE’s for different Years for IOWA and OWA, ROI in Parenthesis
Figure 4.5: RMSE for different Orness values for OWA operator in 2014
Figure 4.6: RMSE for different Orness values for IOWA operator in 2014
Figure 4.7: RMSE for different Orness values for GOWQA operator in 2014
Figure 4.8: RMSE for different Orness values for GOWHA operator in 2014
Figure 4.9: RMSE for different Orness values for GOWCA operator in 2014
Table 4.4: Corresponding RMSE’s for different Years for IOWA and OWA, ROI in
Parenthesis
Figure 4.10: RMSE for different Orness values for different operators in 2014
Figure 4.11: Graphical Representation of Best RMSE’s for different Years for IOWA
and OWA
Figure 4.12: Graphical Representation of Corresponding RMSE’s for different Years for
IOWA and OWA
Figure 4.13: IOWA Testing of Stock Market Indices for Recession Year 2009 using Fuzzy
C Means Clustering
Figure 4.14: IOWA Testing of Stock Market Indices for Recession Year 2009 using
Subtractive Clustering
Chapter 5
CONCLUSION
5.1 CONCLUSION
In conclusion, we can say that, to a very large extent, we delivered the research output
we were supposed to. Starting a problem from scratch takes a good amount of
hard work, and this problem was no exception. We started with the idea of coming up
with an aggregator that gives better prediction performance than
the traditional OWA. The base paper we referred to is based on the Taiwanese stock
market, and we were not sure about its applicability to its Indian counterpart.
We approached the problem by mining for different variants of the OWA operator;
luckily, we could find a considerable number of aggregators to start our research work with.
Once we had all the operators, we had to get the weight vector for the application of OWA.
We referred to Yager's formula for the weight vector and aggregated the data of all the
years with the different operators, using a window of 3 days as an approximation
for the stock index of the 4th day. This helped us eliminate some weak contenders
for the best-performing operator.
After applying ANFIS with all the potentially best-performing operators for the year 2014,
we compared the RMSE for the different operators and concluded that the only operator that
can outperform the traditional OWA on stock market problems is IOWA. We compared
both operators for all the years using subtractive clustering and concluded that
IOWA reduces the RMSE manyfold.
5.2 FUTURE SCOPE
We have come a long way, from an RMSE of 100+ down to single digits. As the estimator is performing
well, it can be built into software with an easy-to-use interface. Even though we
have reached a good RMSE, the estimator can be further improved by experimenting with the
other parameters of subtractive clustering, such as the squash factor, accept ratio and reject
ratio. We can also check the performance obtained by applying grid partitioning.
Bibliography
[1] Cheng, C.H., Wei, L.Y., Liu, J.W., Chen, T.L., OWA-based ANFIS model for
TAIEX forecasting, Economic Modelling 30 (2013) 442–448.
[2] G. Kaur, J. Dhar, R.K. Guha, Financial Time Series Forecasting by combining
Adaptive Network based Fuzzy Inference System with OWA Operator, In Press.
[3] J.M. Merigó, Fuzzy multi-person decision making with fuzzy probabilistic aggregation
operators, International Journal of Fuzzy Systems 13(3) (2011) 163–174.
[4] Ronald R. Yager, Dimitar P. Filev, Induced Ordered Weighted Averaging Operators,
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 29
(2) (1999) 141–150.
[5] Yager, R., Kacprzyk, J., Beliakov, G., Recent Developments in the Ordered Weighted
Averaging Operators, Springer (2011), ISBN 3642179096.
[8] Liu, Jinpei, Lin, Sheng, Chen, Huayou, Zhou, Ligang, The Continuous Quasi-OWA
Operator and its Application to Group Decision Making, Group Decision and
Negotiation 22(4) (2001) 715–723.
[9] Huarng, K.H., Effective lengths of intervals to improve forecasting in fuzzy time
series, Fuzzy Sets and Systems 123 (2001) 155–162.
[10] Huarng, K.H., Yu, H.K., The application of neural networks to forecast fuzzy time
series, Physica A 336 (2006) 481–491.
[11] Huarng, K.H., Yu, H.K., Hsu, Y.W., A multivariate heuristic model for fuzzy time-series
forecasting, IEEE Transactions on Systems, Man, and Cybernetics, Part B,
Cybernetics 37 (4) (2007) 836–846.
[12] Chen, S.M., Chung, N.Y., Forecasting enrollments of students by using fuzzy
time series and genetic algorithms, International Journal of Information and Management
Sciences 17 (2006) 1–17.
[13] Chen, S.M., Chung, N.Y., Forecasting enrollments using high-order fuzzy time series
and genetic algorithms, International Journal of Intelligent Systems 21 (2006) 485–501.
[14] Cheng, C.H., Chen, T.L., Chiang, C.H., Trend-weighted fuzzy time-series model for
TAIEX forecasting, Lecture Notes in Computer Science 4234 (2006) 469–477.
[15] Cheng, C.H., Wang, J.W., Li, C.H., Forecasting the number of outpatient visits
using a new fuzzy time series based on weighted-transitional matrix, Expert Systems
with Applications 34 (2008) 2568–2575.
[16] Cheng, C.H., Wei, L.Y., Volatility model based on multi-stock index for TAIEX
forecasting, Expert Systems with Applications 36 (3, Part 1) (2009) 6187–6191.
[17] Yu, H.K., Weighted fuzzy time-series models for TAIEX forecasting, Physica A 349
(2005) 609–624.
[18] J. Dunn, A fuzzy relative of the isodata process and its use in detecting compact,
well-separated clusters, Journal of Cybernetics 3 (1973) 32–57.
[19] R. Fuller and P. Majlender, An analytic approach for obtaining maximal entropy
OWA operator weights, Fuzzy Sets and Systems 124 (2001) 53–57.
[20] R. Fuller and P. Majlender, On obtaining minimal variability OWA operator
weights, Fuzzy Sets and Systems 136 (2003) 203–215.
[21] S. Chopra, R. Mitra and V. Kumar, Identification of Rules Using Subtractive Clustering
with Application to Fuzzy Controllers, Third International Conference on
Machine Learning and Cybernetics, Shanghai (2004) 4125–4130.