ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


Artificial Neural Networks

PART A
- Introduction
- An overview of the biological neuron
- The synthetic neuron
- Structure and operation of an ANN
- Problem solving by an ANN
- Learning in ANNs
- ANN models
- Applications

PART B
- Developing neural network applications
- Design of the network
- Training issues
- A comparison of ANN and ES
- Hybrid ANN systems
- Case Studies

Developing neural network applications

Neural Network Implementations

Three possible practical implementations of ANNs are:
1. A software simulation program running on a digital computer
2. A hardware emulator connected to a host computer, called a neurocomputer
3. True electronic circuits

Software Simulations of ANN

- Currently the cheapest and simplest implementation method for ANNs, at least for general-purpose use
- Simulates parallel processing on a conventional sequential digital computer
- Replicates the temporal behaviour of the network by updating the activation level and output of each node for successive time steps
- These steps are represented by iterations or loops
- Within each loop, the updates for all nodes in a layer are performed (see the sketch below)

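A minimal sketch of such a simulation loop (hypothetical Python, not from the lecture; a sigmoid transfer function is assumed):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulate_step(layers, inputs):
    """One simulated time step: update the activation level and output
    of every node, handling one layer per loop iteration."""
    activation = np.asarray(inputs, dtype=float)
    for W, b in layers:                           # (weights, biases) per layer
        activation = sigmoid(W @ activation + b)  # update all nodes in the layer
    return activation

# Toy usage: a 3-input, 2-hidden, 1-output net with invented weights
layers = [(np.ones((2, 3)), np.zeros(2)), (np.ones((1, 2)), np.zeros(1))]
print(simulate_step(layers, [0.5, -0.2, 0.1]))
```
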
Software simulations of ANN (cont'd)

- In multilayer ANNs, processing for a layer is completed and its output used to calculate the states of the nodes in the following layer

- Typical additional features of ANN simulators:
  1. Configuring the net according to a chosen architecture and node operational characteristics
  2. Implementation of the training phase using a chosen training algorithm
  3. Tools for visualising and analysing the behaviour of nets

- ANN simulators are written in high-level languages such as C, C++ and Java.

Advantages and possible problems with software simulators

- Main attraction of ANN simulators is the relatively low cost and wide availability of ready-made commercial packages
- They are also compact, flexible and highly portable.
- Writing your own simulator requires programming skills and would be time-consuming (except that you don't have to now!)
- Training of ANNs using software simulators can be slow for larger networks (more than a few hundred nodes)

Commercially available neural net packages

- Prewritten shells with convenient user interfaces
- Cost a few hundred to tens of thousands of dollars
- Allow users to specify the ANN design and training parameters
- Usually provide graphic interfaces to enable monitoring of the net's training and operation
- Likely to provide interfacing with other software systems such as spreadsheets and databases.

Neurocomputers

- Dedicated special-purpose digital computers (aka accelerator boards)
- Optimised to perform operations common in neural network simulation
- Acts as a coprocessor to a host computer and is controlled by a program running on the host.
- Can be tens to thousands of times faster than simulators
- Systems are available with approximately 1,000 million connection updates per second for networks with 8,192 neurons, e.g. the ACC Neural Network Processor

Neurocomputers

Genobyte's CAM-Brain Machine was developed between 1997 and 2000.

True Networks in Hardware

- Closer to biological neural networks than simulations
- Consist of synthetic neurons actually fabricated on silicon chips
- Commercially available hardwired ANNs are limited to a few thousand neurons per chip¹
- Chips connected in parallel to achieve larger networks.
- Problems: interconnection and interference, fixed-valued weights (work progressing on modifiable synapses).

¹ Figures more than five years old.

Neural Network Development Methodology

- Aims to add structure and organisation to ANN application development, reducing cost and increasing accuracy, consistency, user confidence and user-friendliness

- Splits development into the following phases:
  - The Concept Phase
  - The Design Phase
  - The Implementation Phase
  - The Maintenance Phase

Neural Network Development Methodology - the Concept Phase

Involves
- Validating the proposed application
- Selecting an appropriate neural paradigm.

Application validation
Problem characteristics suitable for neural network application are:
- Data intensive
- Multiple interacting parameters
- Incomplete, erroneous, noisy data
- Solution function unknown or expensive
- Requires flexibility, generalisation, fault-tolerance, speed

ANN Development Methodology - the Concept Phase (cont'd)

- Common examples of applications with the above attributes are:
  - pattern recognition (e.g., printed or handwritten characters, consumer behaviour, risk patterns)
  - forecasting (e.g., stock market), signal (audio, video, ultrasound) processing

- Problems not suitable for ANN-based solutions include those where:
  - A mathematically accurate and precise solution is available
  - A solution involving deduction and step-wise logic is appropriate
  - The application involves explanation or reporting

- One application area that is unsuitable for ANNs is resource management, e.g., inventory, accounts, sales data analysis

Selecting an ANN paradigm

- Decision based on comparison of application requirements to the capabilities of different paradigms
  - e.g., the multilayer perceptron is well known for its pattern recognition capabilities, while the Kohonen net is more suited to applications involving data clustering

- Choice of paradigm also influenced by the training method that can be employed
  - e.g., supervised training must have an adequate number of input/correct-output pairs available, and training may take a relatively long time

- Technical and economic feasibility assessments should be carried out to complete the concept phase

The Design Phase

- The design phase specifies initial values and conditions at the node, network and training levels

- Decisions to be made at the node level include:
  - Types of input: binary (0,1), bipolar (-1,+1), trivalent (-1, 0, +1), discrete, continuous-valued
  - Transfer function: step or threshold, hyperbolic tangent, sigmoid; consider possible use of lookup tables for speeding up calculations (see the sketch below)

- Decisions to be made at the network architecture level:
  - The number and size of layers and their connectivity (fully or sparsely interconnected, feedforward or recurrent, other?)

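To make the node-level choices concrete, here is a hedged sketch (hypothetical Python, not part of the methodology itself) of two common transfer functions and a lookup-table version of the sigmoid for faster evaluation:

```python
import numpy as np

def step(x, threshold=0.0):
    """Step/threshold transfer function for binary nodes."""
    return np.where(x >= threshold, 1.0, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Lookup table: precompute the sigmoid on a fixed grid, then answer
# queries by rounding to the nearest grid point; avoids exp() at run time.
GRID_MIN, GRID_MAX, GRID_POINTS = -8.0, 8.0, 4097
TABLE = sigmoid(np.linspace(GRID_MIN, GRID_MAX, GRID_POINTS))

def sigmoid_lut(x):
    idx = np.round((np.clip(x, GRID_MIN, GRID_MAX) - GRID_MIN)
                   / (GRID_MAX - GRID_MIN) * (GRID_POINTS - 1)).astype(int)
    return TABLE[idx]
```
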
The Design Phase (cont'd)

- 'Size' of a layer is the number of nodes in the layer

- For the input layer, size is determined by the number of data sources (input vector components) and possibly the mathematical transformations done

- The number of nodes in the output layer is determined by the number of classes or decision values to be output

- Finding the optimal size of the hidden layer needs some experimentation (see the sketch below)
  - Too few nodes will produce an inadequate mapping, while too many may result in inadequate generalisation

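A minimal sketch of that experimentation, assuming a scikit-learn-style environment (the synthetic dataset, candidate sizes and library choice are all illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for a real, preprocessed training set
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25,
                                                  random_state=0)

best_size, best_score = None, -1.0
for n_hidden in (2, 4, 8, 16, 32):             # candidate hidden-layer sizes
    net = MLPClassifier(hidden_layer_sizes=(n_hidden,), max_iter=1000,
                        random_state=0)
    net.fit(X_train, y_train)
    score = net.score(X_val, y_val)            # accuracy on held-out data
    if score > best_score:
        best_size, best_score = n_hidden, score

print(f"best hidden-layer size: {best_size} "
      f"(validation accuracy {best_score:.2f})")
```
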
The Design Phase (cont'd)

Connectivity
- Connectivity determines the flow of signals between neurons in the same or different layers

- Some ANN models, such as the multilayer perceptron, have only interlayer connections; there is no intralayer connection

- The Hopfield net is an example of a model with intralayer connections

The Design Phase (cont'd)

Feedback
- There may be no feedback of output values, e.g., the multilayer perceptron
- Or there may be feedback, as in a recurrent network, e.g., the Hopfield net

- Other design questions include:
  - Setting of parameters for the learning phase, e.g., stopping criterion, learning rate.
  - Possible addition of noise to speed up training.

The Implementation Phase

Typical steps:
- Gathering the training set
- Selecting the development environment
- Implementing the neural network
- Testing and debugging the network

Gathering the training set
- Aims to get the right type of data in adequate amount and in the right format

Gathering training data (cont'd)

- How much data to gather?
  - Increasing the data amount increases training time but may help earlier convergence
  - Quality more important than quantity

- Collection of data
  - Potential sources: historical records, instrument readings, simulation results

- Preparation of data
  - Involves preprocessing, including scaling, normalisation, binarisation, mapping to a logarithmic scale, etc. (see the sketch below)

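A hedged sketch of typical preparation steps (hypothetical Python; the choice of min-max scaling and the binarisation threshold are illustrative):

```python
import numpy as np

def prepare(raw, log_scale=False):
    """Scale each input component to [0, 1], optionally after
    mapping to a logarithmic scale first."""
    data = np.log1p(raw) if log_scale else np.asarray(raw, dtype=float)
    lo, hi = data.min(axis=0), data.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant columns
    return (data - lo) / span                # min-max normalisation

def binarise(data, threshold=0.5):
    """Map continuous values to binary (0, 1) node inputs."""
    return (data >= threshold).astype(int)
```
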
Gathering training data (cont'd)

- Type of data to collect should be representative of the given problem, including routine, unusual and boundary-condition cases

- Mix of good as well as imperfect data, but not ambiguous or too erroneous.

Selecting the development environment

Hardware and software aspects
- Hardware requirements based on:
  - speed of operation
  - memory and storage capacity
  - software availability
  - cost
  - compatibility

- The most popular platforms are workstations and high-end PCs (with an accelerator board option)

Selecting the development environment

Two options in choosing software:
1. Custom-coded simulators, which require more expertise on the part of the user but provide maximum flexibility
2. Commercial development packages, which are usually easy to use because of a more sophisticated interface

Selecting the development environment (cont'd)

- Selection of the hardware and software environment is usually based on the following considerations:
  - ANN paradigm to be implemented
  - Speed in training and recall
  - Transportability
  - Vendor support
  - Extensibility
  - Price

Implementing the neural network

Common steps involved are:
- Selection of an appropriate neural paradigm
- Setting the network size
- Deciding on the learning algorithm
- Creation of screen displays
- Determining the halting criteria
- Collecting data for training and testing
- Data preparation, including preprocessing
- Organising data into training and test sets

Implementation - Training

- Training the net, which consists of (see the sketch below):
  - Loading the training set
  - Initialisation of network weights, usually to small random values
  - Starting the training process
  - Monitoring the training process until training is completed
  - Saving of weight values in a file for use during operation mode

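A hedged sketch of those training steps for a small backpropagation net (hypothetical Python; the synthetic data, layer sizes and learning rate are stand-ins for a real training set and tuned parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in training set (a real application would load its own data)
X = rng.uniform(-1.0, 1.0, (200, 4))                        # inputs
Y = np.where(X.sum(axis=1, keepdims=True) > 0, 1.0, -1.0)   # targets

n_hidden, lr = 8, 0.01
W1 = rng.uniform(-0.1, 0.1, (X.shape[1], n_hidden))   # initialise weights to
W2 = rng.uniform(-0.1, 0.1, (n_hidden, Y.shape[1]))   # small random values

for epoch in range(10_000):                 # the training process
    H = np.tanh(X @ W1)                     # hidden-layer outputs
    out = np.tanh(H @ W2)                   # network outputs
    err = Y - out
    if np.mean(err ** 2) < 1e-3:            # monitor until training completes
        break
    d_out = err * (1.0 - out ** 2)          # tanh' = 1 - tanh^2
    d_hid = (d_out @ W2.T) * (1.0 - H ** 2) # backpropagated error
    W2 += lr * H.T @ d_out                  # gradient-descent weight updates
    W1 += lr * X.T @ d_hid

np.savez("weights.npz", W1=W1, W2=W2)       # save weights for operation mode
```
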
Implementation - Training (cont'd)

Possible problems arising during training:
- Failure to converge to a set of optimal weight values
  - Further weight adjustments fail to reduce the output error: stuck in a local minimum
  - Remedied by resetting the learning parameters and reinitialising the weights

- Overtraining
  - Net fails to generalise, i.e., fails to classify less-than-perfect patterns
  - A mix of good and imperfect patterns for training helps

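The slide's remedy for overtraining is a mixed training set; another common guard, not mentioned above, is early stopping against a held-out validation set. A sketch, with the simulator-specific operations passed in as functions:

```python
def train_with_early_stopping(train_one_epoch, validation_error, save_weights,
                              max_epochs=1000, patience_limit=10):
    """Stop training once validation error has not improved for a while.
    The three callables stand in for whatever simulator API is in use."""
    best_val, patience = float("inf"), 0
    for epoch in range(max_epochs):
        train_one_epoch()
        val = validation_error()
        if val < best_val:
            best_val, patience = val, 0
            save_weights()              # keep the best-generalising weights
        else:
            patience += 1
            if patience >= patience_limit:
                break                   # validation error stopped improving
    return best_val
```
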
Implementation - Training (cont'd)

- Training results may be affected by the method of presenting the data set to the network.

- Adjustments may be made by varying the layer sizes and fine-tuning the learning parameters.

- To ensure optimal results, several variations of a neural network may be trained and each tested for accuracy

Implementation - Testing and Debugging

Testing can be done by:
1. Observing the operational behaviour of the net
2. Analysing actual weights
3. Studying network behaviour under specific conditions

Observing operational behaviour
- Network treated as a black box and its response to a series of test cases is evaluated

Test data
- Should contain training cases as well as new cases
- Routine, unusual as well as boundary-condition cases should be tried

Implementation - Testing and Debugging (cont'd)

Testing by weight analysis
- Weights entering and exiting nodes are analysed for relatively small and large values (see the sketch below)

- If significant errors are detected in testing, debugging would involve examining:
  - the training cases for representativeness, accuracy and adequacy of number
  - learning algorithm parameters, such as the rate at which weights are adjusted
  - the neural network architecture, node characteristics, and connectivity
  - the training set-network interface and user-network interface

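A hedged sketch of such a weight analysis (the weights file name follows the earlier training sketch; the "small" and "large" thresholds are illustrative):

```python
import numpy as np

weights = np.load("weights.npz")["W1"]     # weights entering the hidden layer

for j in range(weights.shape[1]):          # examine each hidden node in turn
    max_w = np.abs(weights[:, j]).max()
    if max_w < 0.01:                       # all incoming weights tiny: node
        print(f"node {j}: near-dead (max |w| = {max_w:.3g})")
    elif max_w > 100.0:                    # unusually large weights: suspect
        print(f"node {j}: suspiciously large weights (max |w| = {max_w:.3g})")
```
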
The Maintenance Phase

Consists of:
- placing the neural network in an operational environment, with possible integration
- periodic performance evaluation and maintenance

- Although often designed as stand-alone systems, some neural network systems are integrated with other information systems using:
  - Loose coupling: preprocessor, postprocessor, distributed component
  - Tight coupling or full integration as an embedded component

The Maintenance Phase

Possible ANN operational environments:
[Diagram not reproduced in this text version]

System evaluation

- Continual evaluation is necessary to:
  - ensure satisfactory performance in solving dynamic problems
  - check for damaged or retrained networks.

- Evaluation can be carried out by reusing original test procedures with current data.

ANN Maintenance

Involves modification necessitated by:
- Decreasing accuracy
- Enhancements

System modification falls into two categories, involving either data or software.
- Data modification steps:
  - Training data is modified or replaced
  - Network retrained and re-evaluated.

ANN Maintenance (cont'd)

- Software changes include changes in:
  - interfaces
  - cooperating programs
  - the structure of the network.

- If the network is changed, part of the design and most of the implementation phase may have to be repeated.

- Backup copies should be used for maintenance and research.

A comparison of ANN and ES

Similarities between ES and ANN
- Both aim to create intelligent computer systems by mimicking human intelligence, although at different levels

- The design process of neither ES nor ANN is automatic
  - Knowledge extraction in ES is a time- and labour-intensive process
  - ANNs are capable of learning, but selection and preprocessing of data have to be done carefully.

A comparison of ANN and ES (cont'd)

Differences between ANN and ES
- Differ in aspects of design, operation and use

- Logic vs. brain
  - ES simulate the human reasoning process based on formal logic
  - ANNs are based on modelling the brain, both in structure and operation

- Sequential vs. parallel
  - The nature of processing in ES is sequential
  - ANNs are inherently parallel

A comparison of ANN and ES (cont'd)

- External and static vs. internal and dynamic
  - Learning is performed external to the ES
  - The ANN itself is responsible for its knowledge acquisition, during the training phase.
  - Learning is always off-line in ES: knowledge remains static during operation
  - Learning in ANNs, although mostly off-line, can be on-line

- Deductive vs. inductive inferencing
  - Knowledge in an ES is always used in a deductive reasoning process
  - An ANN constructs its knowledge base inductively from examples, and uses it to produce decisions through generalisation

A comparison of ANN and ES (cont'd)

- Knowledge representation: explicit vs. implicit
  - ES store knowledge in explicit form; it is possible to inspect and modify individual rules
  - An ANN's knowledge is stored implicitly in the interconnection weight values

- Design issues: simple vs. complex
  - The technical side of ES development is relatively simple, without difficult design choices.
  - The ANN design process is often one of trial and error

A comparison of ANN and ES (cont'd)

- User interface: white box vs. black box
  - ES have explanation capability
  - Difficulty in interpreting an ANN's knowledge base effectively makes it a black box to the user

- State of maturity and recognition: well-established vs. early
  - ES already well established as a methodology in commercial applications
  - ANN recognition and development tools at a relatively early stage.

Hybrid systems

- Neuro-symbolic computing utilises the complementary nature of computing in neural networks (numerical) and expert systems (symbolic).
- Neuro-fuzzy systems combine neural networks with fuzzy logic
- ANNs can also be combined with genetic algorithm methodology

Hybrid ES-ANN systems
- The strengths of the ES can be utilised to overcome the weaknesses of an ANN-based system, and vice versa.
  - For example, the ANN's extraction of knowledge from data
  - The ES's explanation capability

Hybrid ES-ANN systems

- Rule extraction by inference justification in an ANN
  - MACIE, an ANN-based decision support system described in (Gallant 1993)
  - Extracts a single rule that justifies an inference in an ANN

- An inference in an ANN is represented by the output of a single node
  - This output is based upon incomplete input values fed from a number of nodes, as shown in the diagram below.

Hybrid ES-ANN systems (cont'd)

- A node ui is defined to be a contributing node to node uj if wij·ui ≥ 0.

[Diagram: an output node receiving weighted inputs from several input nodes; not reproduced in this text version]

Hybrid ES-ANN systems (cont'd)

- In this example, the contributing variables are {u2, u3, u5, u6}.
- The rule produced in this example is:

  IF u6 = Unknown
  AND u2 = TRUE
  AND u3 = FALSE
  AND u5 = TRUE
  THEN conclude u7 = TRUE.

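A hedged sketch of this kind of inference justification (hypothetical Python; MACIE's actual algorithm in Gallant 1993 is more involved):

```python
def justify(node_name, inputs, weights):
    """inputs: dict node -> value (+1 TRUE, -1 FALSE, 0 Unknown);
    weights: dict node -> weight into the inferred node."""
    total = sum(weights[u] * inputs[u] for u in inputs)
    sign = 1 if total > 0 else -1                    # the inferred value
    # Contributing nodes: weighted input does not oppose the inference
    contributing = [u for u in inputs if sign * weights[u] * inputs[u] >= 0]
    label = {1: "TRUE", -1: "FALSE", 0: "Unknown"}
    conds = " AND ".join(f"{u} = {label[inputs[u]]}" for u in contributing)
    return f"IF {conds} THEN conclude {node_name} = {label[sign]}"

# Toy usage (weights and values invented for illustration):
print(justify("u7",
              {"u2": 1, "u3": -1, "u5": 1, "u6": 0},
              {"u2": 2.0, "u3": -1.0, "u5": 1.5, "u6": 3.0}))
# -> IF u2 = TRUE AND u3 = FALSE AND u5 = TRUE AND u6 = Unknown
#    THEN conclude u7 = TRUE
```
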
Hybrid ES-ANN systems (cont'd)

- One approach to hybrid systems divides a problem into tasks suitable for either ES or ANN
  - These tasks are then performed by the appropriate methodology

- One example of such a system (Caudill 1991) is an intelligent system for delivering packages
  - The ES performs the task of producing the best strategy for loading packages into trucks
  - The ANN works out the best route for delivering the packages efficiently.

Hybrid ES-ANN systems (cont'd)

- Hybrid ES-ANN systems with ANNs embedded within expert systems
  - ANN used to determine which rule to fire, given the current state of facts.

- Another approach to hybrid ES-ANN uses an ANN as a preprocessor (see the sketch below)
  - One or more ANNs produce classifications.
  - Numerical outputs produced by the ANN are interpreted symbolically by an ES as facts
  - The ES applies the facts for deductive reasoning

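A toy sketch of the preprocessor arrangement (everything here is invented for illustration: the network stub, the threshold and the rule):

```python
def ann_classify(features):
    """Stand-in for a trained network's numeric output in [0, 1]."""
    return 0.91

# ANN side: numeric output interpreted symbolically as a fact...
confidence = ann_classify([0.2, 0.7, 0.4])
facts = {"fault_likely"} if confidence > 0.8 else {"fault_unlikely"}

# ES side: deductive rule applied to the symbolic facts
if "fault_likely" in facts:
    print("RULE fired: IF fault_likely THEN recommend inspection")
```
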
Case Study

Case: Application of ANNs in bankruptcy prediction (Coleman et al., AI Review, Summer 1991, in Zahedi 1993)
- Predicts banks that are likely to fail within a year
- A prediction certainty is given to bank examiners dealing with the bank in question.

- Developed by NeuralWare's Application Development Services and Support Group (ADSS)
  - Software used: the NeuralWorks Professional neural network development system.
  - Uses the standard backpropagation (multilayer perceptron) network.

Case Study (cont'd)

- The ANN has 11 inputs, each a ratio developed by Peat Marwick.
- Inputs connected to a single hidden layer, which in turn is connected to a single node in the output layer.
- The network outputs a single value denoting whether the bank would or would not fail within that calendar year

- Employed the hyperbolic-tangent transfer function and a proprietary error function created by the ADSS staff.
- Trained on a set of 1,000 examples, 900 of which were viable banks and 100 of which were banks that had actually gone bankrupt
- Training consisted of about 50,000 iterations of the training set.
- Correctly predicted 50% of the banks that were viable, and 99% of the banks that actually failed.

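The topology described, sketched with scikit-learn for concreteness (the hidden-layer size and training settings are guesses; the ADSS proprietary error function is not reproduced):

```python
from sklearn.neural_network import MLPClassifier

# 11 financial-ratio inputs -> one hidden layer -> one fail/viable output.
# hidden_layer_sizes is illustrative; the case study does not state it.
net = MLPClassifier(hidden_layer_sizes=(6,), activation="tanh", max_iter=2000)
# net.fit(ratios, failed)   # ratios: (1000, 11) array; failed: 0/1 labels
```
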
REFERENCES

- AI Expert (special issue on ANN), June 1990.
- BYTE (special issue on ANN), Aug. 1989.
- Caudill, M., "The View from Now", AI Expert, June 1992, pp. 27-31.
- Dhar, V., & Stein, R., Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall, 1997.
- Kirrmann, H., "Neural Computing: The new gold rush in informatics", IEEE Micro, June 1989, pp. 7-9.
- Lippman, R.P., "An Introduction to Computing with Neural Nets", IEEE ASSP Magazine, April 1987, pp. 4-21.
- Lisboa, P. (Ed.), Neural Networks: Current Applications, Chapman & Hall, 1992.
- Negnevitsky, M., Artificial Intelligence: A Guide to Intelligent Systems, Addison-Wesley, 2005.

REFERENCES (cont'd)

- Bailey, D., & Thompson, D., "How to Develop Neural Network Applications", AI Expert, June 1990, pp. 38-47.
- Caudill, M., & Butler, C., Naturally Intelligent Systems, MIT Press, 1989, pp. 227-240.
- Caudill, M., "Expert networks", BYTE, October 1991, pp. 109-116.
- Gallant, S., Neural Network Learning and Expert Systems, MIT Press, 1993.
- Medsker, L., Hybrid Intelligent Systems, Kluwer Academic Press, Boston, 1995.
- Zahedi, F., Intelligent Systems for Business, Wadsworth Publishing, Belmont, California, 1993.
