ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


Artificial Neural Networks

PART A
- Introduction
- An overview of the biological neuron
- The synthetic neuron
- Structure and operation of an ANN
- Problem solving by an ANN
- Learning in ANNs
- ANN models
- Applications

PART B
- Developing neural network applications
- Design of the network
- Training issues
- A comparison of ANN and ES
- Hybrid ANN systems
- Case Studies

Developing neural network applications

Neural Network Implementations

Three possible practical implementations of ANNs are:
1. A software simulation program running on a digital computer
2. A hardware emulator connected to a host computer, called a neurocomputer
3. True electronic circuits

Software Simulations of ANN

- Currently the cheapest and simplest implementation method for ANNs, at least for general-purpose use
- Simulates parallel processing on a conventional sequential digital computer
- Replicates the temporal behaviour of the network by updating the activation level and output of each node for successive time steps
- These steps are represented by iterations or loops
- Within each loop, the updates for all nodes in a layer are performed (see the sketch below)

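A minimal sketch of such a simulation loop (hypothetical Python, not from the lecture; a sigmoid transfer function is assumed):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulate_step(layers, inputs):
    """One simulated time step: update the activation level and output
    of every node, handling one layer per loop iteration."""
    activation = np.asarray(inputs, dtype=float)
    for W, b in layers:                           # (weights, biases) per layer
        activation = sigmoid(W @ activation + b)  # update all nodes in the layer
    return activation

# Toy usage: a 3-input, 2-hidden, 1-output net with invented weights
layers = [(np.ones((2, 3)), np.zeros(2)), (np.ones((1, 2)), np.zeros(1))]
print(simulate_step(layers, [0.5, -0.2, 0.1]))
```
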
Software simulations of ANN (cont'd)

- In multilayer ANNs, processing for a layer is completed and its output used to calculate the states of the nodes in the following layer

- Typical additional features of ANN simulators:
  1. Configuring the net according to a chosen architecture and node operational characteristics
  2. Implementation of the training phase using a chosen training algorithm
  3. Tools for visualising and analysing the behaviour of nets

- ANN simulators are written in high-level languages such as C, C++ and Java.

Advantages and possible problems with software simulators

- Main attraction of ANN simulators is the relatively low cost and wide availability of ready-made commercial packages
- They are also compact, flexible and highly portable.
- Writing your own simulator requires programming skills and would be time-consuming (except that you don't have to now!)
- Training of ANNs using software simulators can be slow for larger networks (more than a few hundred nodes)

Commercially available neural net packages

- Prewritten shells with convenient user interfaces
- Cost a few hundred to tens of thousands of dollars
- Allow users to specify the ANN design and training parameters
- Usually provide graphic interfaces to enable monitoring of the net's training and operation
- Likely to provide interfacing with other software systems such as spreadsheets and databases.

Neurocomputers

- Dedicated special-purpose digital computers (aka accelerator boards)
- Optimised to perform operations common in neural network simulation
- Acts as a coprocessor to a host computer and is controlled by a program running on the host.
- Can be tens to thousands of times faster than simulators
- Systems are available with approximately 1,000 million connection updates per second for networks with 8,192 neurons, e.g. the ACC Neural Network Processor

Neurocomputers

Genobyte's CAM-Brain Machine was developed between 1997 and 2000.

True Networks in Hardware

- Closer to biological neural networks than simulations
- Consist of synthetic neurons actually fabricated on silicon chips
- Commercially available hardwired ANNs are limited to a few thousand neurons per chip¹
- Chips connected in parallel to achieve larger networks.
- Problems: interconnection and interference, fixed-valued weights (work progressing on modifiable synapses).

¹ Figures more than five years old.

Neural Network Development Methodology

- Aims to add structure and organisation to ANN application development, reducing cost and increasing accuracy, consistency, user confidence and user-friendliness

- Splits development into the following phases:
  - The Concept Phase
  - The Design Phase
  - The Implementation Phase
  - The Maintenance Phase

Neural Network Development Methodology - the Concept Phase

Involves
- Validating the proposed application
- Selecting an appropriate neural paradigm.

Application validation
Problem characteristics suitable for neural network application are:
- Data intensive
- Multiple interacting parameters
- Incomplete, erroneous, noisy data
- Solution function unknown or expensive
- Requires flexibility, generalisation, fault-tolerance, speed

ANN Development Methodology - the Concept Phase (cont'd)

- Common examples of applications with the above attributes are:
  - pattern recognition (e.g., printed or handwritten characters, consumer behaviour, risk patterns)
  - forecasting (e.g., stock market), signal (audio, video, ultrasound) processing

- Problems not suitable for ANN-based solutions include those where:
  - A mathematically accurate and precise solution is available
  - A solution involving deduction and step-wise logic is appropriate
  - The application involves explanation or reporting

- One application area that is unsuitable for ANNs is resource management, e.g., inventory, accounts, sales data analysis

Selecting an ANN paradigm

- Decision based on comparison of application requirements to the capabilities of different paradigms
  - e.g., the multilayer perceptron is well known for its pattern recognition capabilities, while the Kohonen net is more suited to applications involving data clustering

- Choice of paradigm also influenced by the training method that can be employed
  - e.g., supervised training must have an adequate number of input/correct-output pairs available, and training may take a relatively long time

- Technical and economic feasibility assessments should be carried out to complete the concept phase

The Design Phase

- The design phase specifies initial values and conditions at the node, network and training levels

- Decisions to be made at the node level include:
  - Types of input: binary (0,1), bipolar (-1,+1), trivalent (-1, 0, +1), discrete, continuous-valued
  - Transfer function: step or threshold, hyperbolic tangent, sigmoid; consider possible use of lookup tables for speeding up calculations (see the sketch below)

- Decisions to be made at the network architecture level:
  - The number and size of layers and their connectivity (fully or sparsely interconnected, feedforward or recurrent, other?)

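To make the node-level choices concrete, here is a hedged sketch (hypothetical Python, not part of the methodology itself) of two common transfer functions and a lookup-table version of the sigmoid for faster evaluation:

```python
import numpy as np

def step(x, threshold=0.0):
    """Step/threshold transfer function for binary nodes."""
    return np.where(x >= threshold, 1.0, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Lookup table: precompute the sigmoid on a fixed grid, then answer
# queries by rounding to the nearest grid point; avoids exp() at run time.
GRID_MIN, GRID_MAX, GRID_POINTS = -8.0, 8.0, 4097
TABLE = sigmoid(np.linspace(GRID_MIN, GRID_MAX, GRID_POINTS))

def sigmoid_lut(x):
    idx = np.round((np.clip(x, GRID_MIN, GRID_MAX) - GRID_MIN)
                   / (GRID_MAX - GRID_MIN) * (GRID_POINTS - 1)).astype(int)
    return TABLE[idx]
```
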
The Design Phase (cont'd)

- 'Size' of a layer is the number of nodes in the layer

- For the input layer, size is determined by the number of data sources (input vector components) and possibly the mathematical transformations done

- The number of nodes in the output layer is determined by the number of classes or decision values to be output

- Finding the optimal size of the hidden layer needs some experimentation (see the sketch below)
  - Too few nodes will produce an inadequate mapping, while too many may result in inadequate generalisation

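A minimal sketch of that experimentation, assuming a scikit-learn-style environment (the synthetic dataset, candidate sizes and library choice are all illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for a real, preprocessed training set
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25,
                                                  random_state=0)

best_size, best_score = None, -1.0
for n_hidden in (2, 4, 8, 16, 32):             # candidate hidden-layer sizes
    net = MLPClassifier(hidden_layer_sizes=(n_hidden,), max_iter=1000,
                        random_state=0)
    net.fit(X_train, y_train)
    score = net.score(X_val, y_val)            # accuracy on held-out data
    if score > best_score:
        best_size, best_score = n_hidden, score

print(f"best hidden-layer size: {best_size} "
      f"(validation accuracy {best_score:.2f})")
```
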
The Design Phase (cont'd)

Connectivity
- Connectivity determines the flow of signals between neurons in the same or different layers

- Some ANN models, such as the multilayer perceptron, have only interlayer connections; there is no intralayer connection

- The Hopfield net is an example of a model with intralayer connections

The Design Phase (cont'd)

Feedback
- There may be no feedback of output values, e.g., the multilayer perceptron
- Or there may be feedback, as in a recurrent network, e.g., the Hopfield net

- Other design questions include:
  - Setting of parameters for the learning phase, e.g., stopping criterion, learning rate.
  - Possible addition of noise to speed up training.

The Implementation Phase

Typical steps:
- Gathering the training set
- Selecting the development environment
- Implementing the neural network
- Testing and debugging the network

Gathering the training set
- Aims to get the right type of data in adequate amount and in the right format

Gathering training data (cont'd)

- How much data to gather?
  - Increasing the data amount increases training time but may help earlier convergence
  - Quality more important than quantity

- Collection of data
  - Potential sources: historical records, instrument readings, simulation results

- Preparation of data
  - Involves preprocessing, including scaling, normalisation, binarisation, mapping to a logarithmic scale, etc. (see the sketch below)

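A hedged sketch of typical preparation steps (hypothetical Python; the choice of min-max scaling and the binarisation threshold are illustrative):

```python
import numpy as np

def prepare(raw, log_scale=False):
    """Scale each input component to [0, 1], optionally after
    mapping to a logarithmic scale first."""
    data = np.log1p(raw) if log_scale else np.asarray(raw, dtype=float)
    lo, hi = data.min(axis=0), data.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant columns
    return (data - lo) / span                # min-max normalisation

def binarise(data, threshold=0.5):
    """Map continuous values to binary (0, 1) node inputs."""
    return (data >= threshold).astype(int)
```
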
Gathering training data (cont'd)

- Type of data to collect should be representative of the given problem, including routine, unusual and boundary-condition cases

- Mix of good as well as imperfect data, but not ambiguous or too erroneous.

Selecting the development environment

Hardware and software aspects
- Hardware requirements based on:
  - speed of operation
  - memory and storage capacity
  - software availability
  - cost
  - compatibility

- The most popular platforms are workstations and high-end PCs (with an accelerator board option)

Selecting the development environment

Two options in choosing software:
1. Custom-coded simulators, which require more expertise on the part of the user but provide maximum flexibility
2. Commercial development packages, which are usually easy to use because of a more sophisticated interface

Selecting the development environment (cont'd)

- Selection of the hardware and software environment is usually based on the following considerations:
  - ANN paradigm to be implemented
  - Speed in training and recall
  - Transportability
  - Vendor support
  - Extensibility
  - Price

Implementing the neural network

Common steps involved are:
- Selection of an appropriate neural paradigm
- Setting the network size
- Deciding on the learning algorithm
- Creation of screen displays
- Determining the halting criteria
- Collecting data for training and testing
- Data preparation, including preprocessing
- Organising data into training and test sets

Implementation - Training

- Training the net, which consists of (see the sketch below):
  - Loading the training set
  - Initialisation of network weights, usually to small random values
  - Starting the training process
  - Monitoring the training process until training is completed
  - Saving of weight values in a file for use during operation mode

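A hedged sketch of those training steps for a small backpropagation net (hypothetical Python; the synthetic data, layer sizes and learning rate are stand-ins for a real training set and tuned parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in training set (a real application would load its own data)
X = rng.uniform(-1.0, 1.0, (200, 4))                        # inputs
Y = np.where(X.sum(axis=1, keepdims=True) > 0, 1.0, -1.0)   # targets

n_hidden, lr = 8, 0.01
W1 = rng.uniform(-0.1, 0.1, (X.shape[1], n_hidden))   # initialise weights to
W2 = rng.uniform(-0.1, 0.1, (n_hidden, Y.shape[1]))   # small random values

for epoch in range(10_000):                 # the training process
    H = np.tanh(X @ W1)                     # hidden-layer outputs
    out = np.tanh(H @ W2)                   # network outputs
    err = Y - out
    if np.mean(err ** 2) < 1e-3:            # monitor until training completes
        break
    d_out = err * (1.0 - out ** 2)          # tanh' = 1 - tanh^2
    d_hid = (d_out @ W2.T) * (1.0 - H ** 2) # backpropagated error
    W2 += lr * H.T @ d_out                  # gradient-descent weight updates
    W1 += lr * X.T @ d_hid

np.savez("weights.npz", W1=W1, W2=W2)       # save weights for operation mode
```
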
Implementation - Training (cont'd)

Possible problems arising during training:
- Failure to converge to a set of optimal weight values
  - Further weight adjustments fail to reduce the output error: stuck in a local minimum
  - Remedied by resetting the learning parameters and reinitialising the weights

- Overtraining
  - Net fails to generalise, i.e., fails to classify less-than-perfect patterns
  - A mix of good and imperfect patterns for training helps

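The slide's remedy for overtraining is a mixed training set; another common guard, not mentioned above, is early stopping against a held-out validation set. A sketch, with the simulator-specific operations passed in as functions:

```python
def train_with_early_stopping(train_one_epoch, validation_error, save_weights,
                              max_epochs=1000, patience_limit=10):
    """Stop training once validation error has not improved for a while.
    The three callables stand in for whatever simulator API is in use."""
    best_val, patience = float("inf"), 0
    for epoch in range(max_epochs):
        train_one_epoch()
        val = validation_error()
        if val < best_val:
            best_val, patience = val, 0
            save_weights()              # keep the best-generalising weights
        else:
            patience += 1
            if patience >= patience_limit:
                break                   # validation error stopped improving
    return best_val
```
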
Implementation - Training (cont'd)

- Training results may be affected by the method of presenting the data set to the network.

- Adjustments may be made by varying the layer sizes and fine-tuning the learning parameters.

- To ensure optimal results, several variations of a neural network may be trained and each tested for accuracy

Implementation - Testing and Debugging

Testing can be done by:
1. Observing the operational behaviour of the net
2. Analysing actual weights
3. Studying network behaviour under specific conditions

Observing operational behaviour
- Network treated as a black box and its response to a series of test cases is evaluated

Test data
- Should contain training cases as well as new cases
- Routine, unusual as well as boundary-condition cases should be tried

Implementation - Testing and Debugging (cont'd)

Testing by weight analysis
- Weights entering and exiting nodes are analysed for relatively small and large values (see the sketch below)

- If significant errors are detected in testing, debugging would involve examining:
  - the training cases for representativeness, accuracy and adequacy of number
  - learning algorithm parameters, such as the rate at which weights are adjusted
  - the neural network architecture, node characteristics, and connectivity
  - the training set-network interface and user-network interface

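A hedged sketch of such a weight analysis (the weights file name follows the earlier training sketch; the "small" and "large" thresholds are illustrative):

```python
import numpy as np

weights = np.load("weights.npz")["W1"]     # weights entering the hidden layer

for j in range(weights.shape[1]):          # examine each hidden node in turn
    max_w = np.abs(weights[:, j]).max()
    if max_w < 0.01:                       # all incoming weights tiny: node
        print(f"node {j}: near-dead (max |w| = {max_w:.3g})")
    elif max_w > 100.0:                    # unusually large weights: suspect
        print(f"node {j}: suspiciously large weights (max |w| = {max_w:.3g})")
```
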
The Maintenance Phase

Consists of:
- placing the neural network in an operational environment, with possible integration
- periodic performance evaluation and maintenance

- Although often designed as stand-alone systems, some neural network systems are integrated with other information systems using:
  - Loose coupling: preprocessor, postprocessor, distributed component
  - Tight coupling or full integration as an embedded component

The Maintenance Phase

Possible ANN operational environments:
[Diagram not reproduced in this text version]

System evaluation

- Continual evaluation is necessary to:
  - ensure satisfactory performance in solving dynamic problems
  - check for damaged or retrained networks.

- Evaluation can be carried out by reusing original test procedures with current data.

ANN Maintenance

Involves modification necessitated by:
- Decreasing accuracy
- Enhancements

System modification falls into two categories, involving either data or software.
- Data modification steps:
  - Training data is modified or replaced
  - Network retrained and re-evaluated.

ANN Maintenance (cont'd)

- Software changes include changes in:
  - interfaces
  - cooperating programs
  - the structure of the network.

- If the network is changed, part of the design and most of the implementation phase may have to be repeated.

- Backup copies should be used for maintenance and research.

A comparison of ANN and ES

Similarities between ES and ANN
- Both aim to create intelligent computer systems by mimicking human intelligence, although at different levels

- The design process of neither ES nor ANN is automatic
  - Knowledge extraction in ES is a time- and labour-intensive process
  - ANNs are capable of learning, but selection and preprocessing of data have to be done carefully.

A comparison of ANN and ES (cont'd)

Differences between ANN and ES
- Differ in aspects of design, operation and use

- Logic vs. brain
  - ES simulate the human reasoning process based on formal logic
  - ANNs are based on modelling the brain, both in structure and operation

- Sequential vs. parallel
  - The nature of processing in ES is sequential
  - ANNs are inherently parallel

A comparison of ANN and ES (cont'd)

- External and static vs. internal and dynamic
  - Learning is performed external to the ES
  - The ANN itself is responsible for its knowledge acquisition, during the training phase.
  - Learning is always off-line in ES: knowledge remains static during operation
  - Learning in ANNs, although mostly off-line, can be on-line

- Deductive vs. inductive inferencing
  - Knowledge in an ES is always used in a deductive reasoning process
  - An ANN constructs its knowledge base inductively from examples, and uses it to produce decisions through generalisation

A comparison of ANN and ES (cont'd)

- Knowledge representation: explicit vs. implicit
  - ES store knowledge in explicit form; it is possible to inspect and modify individual rules
  - An ANN's knowledge is stored implicitly in the interconnection weight values

- Design issues: simple vs. complex
  - The technical side of ES development is relatively simple, without difficult design choices.
  - The ANN design process is often one of trial and error

A comparison of ANN and ES (cont'd)

- User interface: white box vs. black box
  - ES have explanation capability
  - Difficulty in interpreting an ANN's knowledge base effectively makes it a black box to the user

- State of maturity and recognition: well-established vs. early
  - ES already well established as a methodology in commercial applications
  - ANN recognition and development tools at a relatively early stage.

Hybrid systems

- Neuro-symbolic computing utilises the complementary nature of computing in neural networks (numerical) and expert systems (symbolic).
- Neuro-fuzzy systems combine neural networks with fuzzy logic
- ANNs can also be combined with genetic algorithm methodology

Hybrid ES-ANN systems
- The strengths of the ES can be utilised to overcome the weaknesses of an ANN-based system, and vice versa.
  - For example, the ANN's extraction of knowledge from data
  - The ES's explanation capability

Hybrid ES-ANN systems

- Rule extraction by inference justification in an ANN
  - MACIE, an ANN-based decision support system described in (Gallant 1993)
  - Extracts a single rule that justifies an inference in an ANN

- An inference in an ANN is represented by the output of a single node
  - This output is based upon incomplete input values fed from a number of nodes, as shown in the diagram below.

Hybrid ES-ANN systems (cont'd)

- A node ui is defined to be a contributing node to node uj if wij·ui ≥ 0.

[Diagram: an output node receiving weighted inputs from several input nodes; not reproduced in this text version]

Hybrid ES-ANN systems (cont'd)

- In this example, the contributing variables are {u2, u3, u5, u6}.
- The rule produced in this example is:

  IF u6 = Unknown
  AND u2 = TRUE
  AND u3 = FALSE
  AND u5 = TRUE
  THEN conclude u7 = TRUE.

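A hedged sketch of this kind of inference justification (hypothetical Python; MACIE's actual algorithm in Gallant 1993 is more involved):

```python
def justify(node_name, inputs, weights):
    """inputs: dict node -> value (+1 TRUE, -1 FALSE, 0 Unknown);
    weights: dict node -> weight into the inferred node."""
    total = sum(weights[u] * inputs[u] for u in inputs)
    sign = 1 if total > 0 else -1                    # the inferred value
    # Contributing nodes: weighted input does not oppose the inference
    contributing = [u for u in inputs if sign * weights[u] * inputs[u] >= 0]
    label = {1: "TRUE", -1: "FALSE", 0: "Unknown"}
    conds = " AND ".join(f"{u} = {label[inputs[u]]}" for u in contributing)
    return f"IF {conds} THEN conclude {node_name} = {label[sign]}"

# Toy usage (weights and values invented for illustration):
print(justify("u7",
              {"u2": 1, "u3": -1, "u5": 1, "u6": 0},
              {"u2": 2.0, "u3": -1.0, "u5": 1.5, "u6": 3.0}))
# -> IF u2 = TRUE AND u3 = FALSE AND u5 = TRUE AND u6 = Unknown
#    THEN conclude u7 = TRUE
```
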
Hybrid ES-ANN systems (cont'd)

- One approach to hybrid systems divides a problem into tasks suitable for either ES or ANN
  - These tasks are then performed by the appropriate methodology

- One example of such a system (Caudill 1991) is an intelligent system for delivering packages
  - The ES performs the task of producing the best strategy for loading packages into trucks
  - The ANN works out the best route for delivering the packages efficiently.

Hybrid ES-ANN systems (cont'd)

- Hybrid ES-ANN systems with ANNs embedded within expert systems
  - ANN used to determine which rule to fire, given the current state of facts.

- Another approach to hybrid ES-ANN uses an ANN as a preprocessor (see the sketch below)
  - One or more ANNs produce classifications.
  - Numerical outputs produced by the ANN are interpreted symbolically by an ES as facts
  - The ES applies the facts for deductive reasoning

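A toy sketch of the preprocessor arrangement (everything here is invented for illustration: the network stub, the threshold and the rule):

```python
def ann_classify(features):
    """Stand-in for a trained network's numeric output in [0, 1]."""
    return 0.91

# ANN side: numeric output interpreted symbolically as a fact...
confidence = ann_classify([0.2, 0.7, 0.4])
facts = {"fault_likely"} if confidence > 0.8 else {"fault_unlikely"}

# ES side: deductive rule applied to the symbolic facts
if "fault_likely" in facts:
    print("RULE fired: IF fault_likely THEN recommend inspection")
```
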
Case Study

Case: Application of ANNs in bankruptcy prediction (Coleman et al., AI Review, Summer 1991, in Zahedi 1993)
- Predicts banks that are likely to fail within a year
- A prediction certainty is given to bank examiners dealing with the bank in question.

- Developed by NeuralWare's Application Development Services and Support Group (ADSS)
  - Software used: the NeuralWorks Professional neural network development system.
  - Uses the standard backpropagation (multilayer perceptron) network.

Case Study (cont'd)

- The ANN has 11 inputs, each a ratio developed by Peat Marwick.
- Inputs connected to a single hidden layer, which in turn is connected to a single node in the output layer.
- The network outputs a single value denoting whether the bank would or would not fail within that calendar year

- Employed the hyperbolic-tangent transfer function and a proprietary error function created by the ADSS staff.
- Trained on a set of 1,000 examples, 900 of which were viable banks and 100 of which were banks that had actually gone bankrupt
- Training consisted of about 50,000 iterations of the training set.
- Correctly predicted 50% of the banks that were viable, and 99% of the banks that actually failed.

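The topology described, sketched with scikit-learn for concreteness (the hidden-layer size and training settings are guesses; the ADSS proprietary error function is not reproduced):

```python
from sklearn.neural_network import MLPClassifier

# 11 financial-ratio inputs -> one hidden layer -> one fail/viable output.
# hidden_layer_sizes is illustrative; the case study does not state it.
net = MLPClassifier(hidden_layer_sizes=(6,), activation="tanh", max_iter=2000)
# net.fit(ratios, failed)   # ratios: (1000, 11) array; failed: 0/1 labels
```
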
REFERENCES

- AI Expert (special issue on ANN), June 1990.
- BYTE (special issue on ANN), Aug. 1989.
- Caudill, M., "The View from Now", AI Expert, June 1992, pp. 27-31.
- Dhar, V., & Stein, R., Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall, 1997.
- Kirrmann, H., "Neural Computing: The new gold rush in informatics", IEEE Micro, June 1989, pp. 7-9.
- Lippman, R.P., "An Introduction to Computing with Neural Nets", IEEE ASSP Magazine, April 1987, pp. 4-21.
- Lisboa, P. (Ed.), Neural Networks: Current Applications, Chapman & Hall, 1992.
- Negnevitsky, M., Artificial Intelligence: A Guide to Intelligent Systems, Addison-Wesley, 2005.

REFERENCES (cont'd)

- Bailey, D., & Thompson, D., "How to Develop Neural Network Applications", AI Expert, June 1990, pp. 38-47.
- Caudill, M., & Butler, C., Naturally Intelligent Systems, MIT Press, 1989, pp. 227-240.
- Caudill, M., "Expert networks", BYTE, October 1991, pp. 109-116.
- Gallant, S., Neural Network Learning and Expert Systems, MIT Press, 1993.
- Medsker, L., Hybrid Intelligent Systems, Kluwer Academic Press, Boston, 1995.
- Zahedi, F., Intelligent Systems for Business, Wadsworth Publishing, Belmont, California, 1993.
