Heazlewood 2015
Ian Heazlewood¹, Joe Walsh¹, Mike Climstein², Jyrki Kettunen³, Kent Adams⁴ and
Mark DeBeliso⁵
¹ Charles Darwin University
² University of Sydney
³ Arcada University of Applied Sciences
⁴ California State University
⁵ Southern Utah University
1 Introduction
larly useful in applications where the underlying process is complex, especially pat-
tern recognition and classification problems that are based on predictive and concur-
rent validity.
Neural networks used in predictive applications, such as the multilayer perceptron
(MLP) and radial basis function (RBF) networks, are supervised in the sense that the
model-predicted results can be compared against known values of the target variables.
These target variables are identified on a priori criteria by the researcher. The term
neural network applies to a loosely related family of models, characterized by a large
parameter space and flexible structure, descending from studies of brain functioning.
As the family grew, most of the new models were designed for non-biological appli-
cations, though much of the associated terminology reflects its origin in biology [1,
2]. A neural network is a massively parallel distributed processor that has a natural
propensity for storing experiential knowledge and making it available for use and is
analogous to human brain function. Specifically, it resembles the brain in two respects:
knowledge is acquired by the network through a learning process, and interneuron
connection strengths, known as synaptic weights and analogous to human synapses,
are used to store the knowledge.
A neural network can approximate a wide range of statistical models without re-
quiring that you hypothesize in advance certain relationships between the dependent
and independent variables; that is, it is a non-a priori model. Instead, the form of the
relationships is determined during the learning process [1, 3], a type of neural
processing phenomenology in this context. The trade-off for this flexibility is that the synaptic weights of
a neural network are not easily interpretable. Thus, if you are trying to explain an
underlying process that produces the relationships between the dependent and inde-
pendent variables, it would be better to use a more traditional statistical model, such
as discriminant analysis or logistic regression. However, if model interpretability is
not important, you can often obtain good model results more quickly using a neural
network [1, 3]. Although neural networks impose minimal demands on model struc-
ture and assumptions, unlike inferential statistics, it is useful to understand the general
neural architecture or neural network structure. The multilayer perceptron (MLP) and
radial basis function (RBF) networks are functions of predictors (also called inputs or
independent variables) that minimize the prediction error of target variables (also
called outputs) [1, 3].
tion equation and test theory by observing whether cases are classified correctly as
predicted. Investigate differences between or among groups and determine the most
parsimonious way to distinguish among groups. 2. Determine the percent of variance
in the dependent variable explained by the independents. Determine the percent of
variance in the dependent variable explained by the independents over and above the
variance accounted for by control variables, using sequential discriminant analysis. 3.
Assess the relative importance of the independent variables in classifying the dependent
variable and discard variables that are little related to group distinctions.
2 Methods
The sample consisted of 3687 male (age = 53.72 years, s.d. = ±10.05 years) and
3488 female (age = 49.39 years, s.d. = ±9.15 years) masters athletes. It was a
volunteer/convenience sample within a cross-sectional non-experimental research
design, drawn from the potential population of approximately 33,000 masters
athletes competing at the 2009 World Masters Games.
MLP and RBF neural networks, discriminant analysis and logistic regression were
applied, based on a set of independent/covariate variables, which were the nine
participant motivation factors of health orientation, weight concern, personal goal
achievement, competition, recognition, affiliation, psychological coping, life meaning
and self-esteem. The categorical or classification dichotomous variable utilised in the
different models was gender, specifically the male and female athletes who competed
at the 2009 World Masters Games and who completed the sports psychometric in-
strument specifically designed to measure the nine different participant motivation
factors [9]. The instrument assessed the participant motivation factors, which were
derived as factor scores from a 56-item question bank, with athletes responding on a
seven-point Likert-type scale for each question [9]. The sport psychological instrument
was completed via an online survey using the LimeSurvey™ interactive survey system
prior to, and during, competition at the 2009 World Masters Games. The four statistical
methods were compared on how accurately they classified, and so discriminated
between, male and female athlete responses. Neural networks, specifically the multilayer
perceptron (MLP) and radial basis function (RBF) networks, were applied and com-
pared with stepwise method discriminant analysis and stepwise logistic regression for
classification accuracy. The neural network multilayer perceptron architecture was
based on:
• Selecting one hidden layer where the hidden layer contains unobservable network
  nodes (units). Each hidden unit is a function of the weighted sum of the inputs.
  The function is the activation function, and the values of the weights are
  determined by the estimation algorithm.
• The selected activation function was the hyperbolic tangent, where the activation
  function links the weighted sums of units in a layer to the values of units in the
  succeeding layer.
• Hyperbolic tangent function has the form
  $\gamma(c) = \tanh(c) = \dfrac{e^{c} - e^{-c}}{e^{c} + e^{-c}}$. (1)
• It takes real-valued arguments and transforms them to the range (-1, 1). When
  automatic architecture selection is used in SPSS, this is the activation function for
  all units in the hidden layers.
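As a minimal illustration of the hidden-unit computation described in the list above, the following Python sketch writes out Eq. (1) directly and applies it to a weighted sum of inputs. The function names, the bias term, the input scores and the weights are all hypothetical values chosen for illustration; they are not taken from the fitted SPSS model.

```python
import math

def tanh_activation(c: float) -> float:
    """Hyperbolic tangent written out term by term as in Eq. (1)."""
    return (math.exp(c) - math.exp(-c)) / (math.exp(c) + math.exp(-c))

def hidden_unit(inputs, weights, bias):
    """One hidden unit: tanh of the weighted sum of the inputs plus a bias."""
    weighted_sum = bias + sum(w * x for w, x in zip(weights, inputs))
    return tanh_activation(weighted_sum)

# Hypothetical standardized scores on three motivation factors.
example_inputs = [0.5, -1.2, 0.3]
# Illustrative weights only; these are NOT estimates from the study.
example_weights = [0.8, 0.1, -0.4]

activation = hidden_unit(example_inputs, example_weights, bias=0.05)
print(activation)  # always lies strictly between -1 and 1
```

Whatever the size of the weighted sum, the output of the unit stays inside (-1, 1), which is the property the text attributes to this activation function.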
The identity function was selected and this function has the form: $\gamma(c) = c$. It takes
real-valued arguments and returns them unchanged. When automatic architecture
selection is used, this is the selected activation function for units in the output layer if
there are any scale-dependent variables. Training the network was based on the batch
method. This method updates the synaptic weights only after passing all training data
records, which means batch training uses information from all records in the training
dataset. Batch training is often preferred because it directly minimizes the total error
and is most useful for smaller datasets. In contrast to MLP networks, in the RBF net-
works it is only the output units that have a bias term.
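The batch method can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the SPSS procedure: it swaps in plain gradient descent on a squared-error loss as the optimizer, and the function name `train_batch`, the learning rate, the number of hidden units and the toy data are all hypothetical. The point it shows is that gradients are accumulated over every training record and the synaptic weights change only once per pass.

```python
import math
import random

def train_batch(X, y, n_hidden=2, epochs=200, lr=0.1, seed=0):
    """Full-batch gradient descent for a one-hidden-layer MLP.

    Hidden units use tanh, the single output unit uses the identity function,
    and the weights are updated once per pass over all records.
    """
    rng = random.Random(seed)
    n_in = len(X[0])
    # W1[j] holds the bias followed by the input weights of hidden unit j.
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    # W2 holds the output bias followed by one weight per hidden unit.
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(epochs):
        gW1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        gW2 = [0.0] * (n_hidden + 1)
        total_error = 0.0
        for xi, ti in zip(X, y):
            h = [math.tanh(w[0] + sum(wk * xk for wk, xk in zip(w[1:], xi))) for w in W1]
            out = W2[0] + sum(wj * hj for wj, hj in zip(W2[1:], h))  # identity output
            err = out - ti
            total_error += 0.5 * err * err
            # Accumulate gradients; no weight changes inside this loop.
            gW2[0] += err
            for j in range(n_hidden):
                gW2[j + 1] += err * h[j]
                delta = err * W2[j + 1] * (1.0 - h[j] ** 2)  # tanh derivative
                gW1[j][0] += delta
                for k in range(n_in):
                    gW1[j][k + 1] += delta * xi[k]
        # One update per epoch, using the gradient averaged over all records.
        n = len(X)
        W2 = [w - lr * g / n for w, g in zip(W2, gW2)]
        W1 = [[w - lr * g / n for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)]
        history.append(total_error)
    return W1, W2, history

# Toy data (hypothetical, not study data): two predictors, target coded 0/1.
X_toy = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y_toy = [0.0, 0.0, 1.0, 1.0]
_, _, errors = train_batch(X_toy, y_toy)
print(errors[0], errors[-1])  # total error before the first and the last update
```

Because each update uses the whole training set, the quantity being reduced is the total error itself, which is the reason the text gives for preferring batch training on smaller datasets.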
Discriminant analysis used the nine factors/variables as the starting point for the
stepwise method, in which variables enter the model at each calculation step according
to statistical criteria. Gender was used as the dependent (grouping) dichotomous
variable in the analysis. It must be emphasised that the comparison of discriminant
analysis with neural network analysis was based on the identical nine factors. The
stepwise method was applied to generate a hierarchy of importance among the predictor
variables and to assess which variables contributed to significant differences between genders.
In the logistic regression, the dependent variable was the nominal dichotomous
variable gender, dummy coded zero for male and one for female. The predictor
or independent variables were the nine factors of participant motivation utilised in the
MLP, RBF and discriminant analysis. The logistic regression variable selection
method was the forward selection likelihood ratio stepwise selection method with
entry testing based on the significance of the score statistic, and removal testing based
on the probability of a likelihood-ratio statistic based on the maximum partial likeli-
hood estimates.
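To make the dummy coding concrete, the sketch below shows how a fitted logistic model turns predictor scores into a predicted gender code. The coefficients, the intercept and the 0.5 classification cutoff are assumptions for illustration only; the source does not report the fitted coefficients or the cutoff used, and the function names are hypothetical.

```python
import math

def predict_female_probability(factor_scores, coefficients, intercept):
    """Logistic model: P(gender = 1, i.e. female) given the predictor scores."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, factor_scores))
    return 1.0 / (1.0 + math.exp(-linear))

def classify(probability, cutoff=0.5):
    """Return the dummy code: 0 for male, 1 for female (cutoff is an assumption)."""
    return 1 if probability >= cutoff else 0

# Hypothetical coefficients for two retained predictors (NOT the fitted model).
coefs = [0.6, -0.4]
p = predict_female_probability([1.0, 0.5], coefs, intercept=-0.1)
print(p, classify(p))
```

Classification accuracy for this method is then the share of athletes whose predicted code matches their recorded gender.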
3 Results
Table 1. Classification accuracy for males, females and combined genders based on MLP, RBF,
discriminant analysis and logistic regression.
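The per-gender and combined percentages that a classification table of this kind reports can be computed as in the following sketch. The labels are toy values invented for illustration and are not the study's data or results; the function name is hypothetical.

```python
def classification_accuracy(actual, predicted):
    """Percent correctly classified per group and for both groups combined."""
    result = {}
    for group in sorted(set(actual)):
        pairs = [(a, p) for a, p in zip(actual, predicted) if a == group]
        correct = sum(1 for a, p in pairs if a == p)
        result[group] = 100.0 * correct / len(pairs)
    overall = sum(1 for a, p in zip(actual, predicted) if a == p)
    result["combined"] = 100.0 * overall / len(actual)
    return result

# Toy labels (hypothetical, NOT study data): 0 = male, 1 = female.
actual_toy = [0, 0, 0, 0, 1, 1, 1, 1]
predicted_toy = [0, 0, 0, 1, 1, 1, 0, 0]
print(classification_accuracy(actual_toy, predicted_toy))
# → {0: 75.0, 1: 50.0, 'combined': 62.5}
```

Reporting the groups separately matters because a method can reach a similar combined accuracy while classifying one gender much better than the other.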
4 Conclusion
The multilayer perceptron (MLP) network was the most effective of the four methods in
predicting group membership based on gender, using the nine factors that represented
the multiple dimensions of participant motivation in male and female athletes competing
at the 2009 World Masters Games (WMG). It displayed a reasonable level of predictive
validity and was marginally more predictive than the radial basis function (RBF) network.
However, MLP only marginally outperformed the general linear multivariate methods of
discriminant analysis and logistic regression. The MLP and RBF utilised all nine factors
in the analysis, whereas stepwise discriminant analysis and logistic regression required
only six discriminating variables to provide a solution nearly as accurate as MLP.
In terms of which participant motivation factors were the best discriminators between
the genders, discriminant analysis and logistic regression produced identical
hierarchies of importance. From most to least important, the factors were affiliation,
competition, self-esteem, recognition, weight concern and health orientation. The
variables excluded from the model as not contributing significantly were life meaning,
psychological coping and goal achievement. The orders of importance identified by MLP
and RBF differed from that of discriminant analysis and logistic regression, and the two
networks derived slightly different orders from each other.
The order of importance for MLP was competition, self-esteem, affiliation, recognition,
weight concern, health orientation, goal achievement, psychological coping and life
meaning, whereas for RBF the order was competition, affiliation, recognition,
psychological coping, weight concern, life meaning, self-esteem, goal achievement and
health orientation. This indicates that, although the two methods differed considerably
in the order of importance, they were somewhat similar in classification accuracy.
One of the problems with neural networks is that they can produce different solutions
from the same dataset, because they are open-ended learning structures; multiple
solutions are therefore possible from identical data. To overcome this problem and
replicate results, the researcher has to use the same initialization value for the
random number generator, the same data order and the same variable order, in addition
to the same procedure settings [3]. Alternatively, as stated in the introduction, if the
researcher is trying to explain an underlying process that produces the relationships
between the dependent and independent variables, it would be better to use a more
traditional statistical model, such as discriminant analysis or logistic regression. In
this research these two methods produced essentially identical solutions.
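The role of the random number generator's initialization value can be shown with a short Python sketch. It assumes, for illustration only, that the starting synaptic weights are drawn uniformly from a small interval; the function name, the interval and the seed values are hypothetical and do not describe the SPSS implementation.

```python
import random

def initial_weights(n_weights, seed):
    """Draw starting synaptic weights from a seeded random number generator."""
    rng = random.Random(seed)
    return [rng.uniform(-0.5, 0.5) for _ in range(n_weights)]

same_a = initial_weights(5, seed=2009)
same_b = initial_weights(5, seed=2009)
other = initial_weights(5, seed=2010)

# Identical seeds reproduce the starting weights; a different seed does not.
print(same_a == same_b, same_a == other)
```

With the starting weights, the data order and the variable order all held fixed, a deterministic training procedure follows the same path on every run, which is what makes a reported network solution replicable.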
References
1. Fausett, L.: Fundamentals of Neural Networks: Architectures, Algorithms and Applica-
tions. Upper Saddle River NJ: Prentice Hall, (1994)
2. SPSS Inc.: SPSS Statistics Base User’s Guide 17.0. Chicago, IL: SPSS Inc, (2007)
3. SPSS Inc.: SPSS Neural Networks™ 17.0. Chicago, IL: SPSS Inc, (2007)
4. Norusis, M.: Advanced Statistics Guide: SPSSX. Chicago, IL: SPSS Inc, (1985)
5. StatSoft, Inc.: Electronic Statistics Textbook. Tulsa, OK: StatSoft. WEB:
http://www.statsoft.com/textbook/, (2010)
6. Hair, J., Black, W., Babin, B., Anderson, R., Tatham, R.: Multivariate Data Analysis. (6th
Ed.). Upper Saddle River: Pearson - Prentice Hall, (2006)
7. SPSS Inc.: IBM SPSS Statistics 22. Chicago, IL: SPSS Inc, (2013)
8. SPSS Inc.: SPSS Regression 17.0. Chicago, IL: SPSS Inc, (2007)
9. Heazlewood, I., Keshishian, H.: A Comparison of Classification Accuracy for Karate
Ability Using Neural Networks and Discriminant Function Analysis Based on Physiologi-
cal and Biomechanical Measures Of Karate Athletes. Refereed Proceedings of the Tenth
Australasian Conference on Mathematics and Computers in Sport. July 5-7, 2010. Crowne
Plaza, Darwin, Northern Territory, Australia. Pp. 197-204. (2010)
A Comparison of Classification Accuracy for Gender Using ... 101
10. Heazlewood, I., Walsh, J., Burke, S., Climstein, M., Kettunen, J., Adams, K., DeBeliso,
M.: A Comparison of Classification Accuracy for Gender Using Neural Networks Multi-
layer Perceptron (MLP) and Radial Basis Function (RBF) Procedures and Discriminant
Function Analysis Based On Nine Sports Psychological Constructs to Measure Motiva-
tions to Participate in Masters Sports. Proceedings of 2012 Pre-Olympic Congress-IACSS
2012, pp. 88-94: Liverpool, England, UK, July 24-25, 2012 ISBN 978-1-84626-094-0.
(2012)
11. Heazlewood, I., Walsh, J.: A Comparison of Classification Accuracy for Using Neural
Networks Multilayer Perceptron (MLP) and Radial Basis Function (RBF) Procedures and
Discriminant Function Analysis. Proceedings of the International Association of Computer
Science in Sport Conference (IACSS2014). Ed. Assoc. Prof. Ian Heazlewood, Assoc. Prof.
Anthony Bedford, Darwin, Australia, June 22-24. 2014. Pp. 116-120. (2014)
12. Masters, K., Ogles, B., Jolton, J.: The development of an instrument to measure
motivation for marathon running: the Motivations of Marathoners Scales (MOMS). Research
Quarterly for Exercise and Sport. 64 (2):134-143. (1993)