Intelligent Decision Support Methods

From the book: Intelligent Decision Support Methods

by Vasant Dhar and Roger Stein
Information Systems
 “I know of no commodity more valuable than

 Management Information System (MIS)

• Transaction Processing Systems
– Accurate Record Keeping
• Decision Support Systems (DSS)
– Model-Driven DSS
– Data-Driven DSS

Intelligence Density
 DEF: A Metric for Knowledge Work Productivity.

 Knowledge Intensive organizations transform raw data
into something useful (knowledge) and deliver the
knowledge to the part of the organization where it can be
used most effectively.

 Intelligence Density: How quickly can you get the essence
of the underlying data from the output?

The Vocabulary of Intelligence Density

 Quality of Model
• Accuracy, Explainability, Speed, Reliability, ...
 Engineering Dimension
• Flexibility, Scalability, Ease of Use, ...
 Quality of Available Resource
• Learning Curve, Tolerances for Noise, Complexity, ...
 Logistical Constraints
• Independence from Experts, Computational Ease, Development Time, ...

Dimensions of Problems and Solutions

 Intelligence Density Dimensions: Quality of Systems

 How Well is the System Engineered?

 Quality of Available Resources

 Logistical Constraints

Intelligence Density Dimensions:
Quality of Systems (1/2)

 Accuracy
• measures how close the outputs of a system are to the correct or best
decision. Can you be confident that the errors (results that are not
accurate) are not so severe as to make the system too costly or dangerous
to use?

 Explainability
• is the description of the process by which a conclusion was reached.
Statistical models explain the output to some degree, in the sense that each
independent variable influences or 'explains' the dependent variable in that
it accounts for some portion of the variance of the dependent variable.

Intelligence Density Dimensions:
Quality of Systems (2/2)

• Other systems, where rule-based reasoning is involved, show explicitly how
conclusions are derived, yet others, such as neural networks, generate opaque
mathematical formulas. These are sometimes referred to as 'black boxes',
because for the user they are the mathematical equivalent of the magician's
black box: data go in at one end and results come out the other, but you
cannot (easily) see the rationale behind the conclusion.
 Response speed
• is the time it takes for a system to complete analysis at the desired level of
accuracy. The flip side to this dimension is confidence in the sense that you
can ask how confident you are that a certain period of time, within which the
system must provide an answer, will be sufficient to perform the analysis. In
applications that require that results be produced within a specified
timeframe, missing that time frame means that no matter how accurate and
otherwise desirable the results are, they will be useless in practice.

How Well is the System Engineered? (1/3)

 Scalability
• involves adding more variables to the problem or increasing the range of
values that variables can take. For example, scalability is a major issue
when you're interested in going from a prototype system involving 10
variables to one with 30 variables. Scalability can be a real problem when the
interactions among variables increase rapidly in unpredictable ways with
the introduction of additional variables (making the system brittle) or where
the computational complexity increases rapidly.
 Compactness
• refers to how small (literally, the number of bytes) the system can be
made. Once a system has been developed and tested, it needs to be put into the
hands of the decision makers within an organization. It must be taken out
into the field, be that the shop floor, the trading floor, or the ocean floor.

How Well is the System Engineered? (2/3)

 Flexibility
• is the ease with which the relationships among the variables or their domains can be
changed, or the goals of the system modified. Most systems are not designed to be
used once and then thrown away. Instead they must be robust enough to perform well as
additional functionality is added over time. In addition, many of the business
processes that you might model are not static (i.e., they change over time). As a result, the
ability to update a system or to have the system adapt itself to new phenomena is important.
 Embeddability
• refers to the ease with which a system can be coupled with or incorporated into the
infrastructure of an organization. In some situations, systems will be components of
larger systems or other databases. If this is the case, systems must be able to
communicate well and mesh smoothly with the other components of the organization's
infrastructure. A system that requires proprietary software engineer, or specific hardware will not
necessarily be able to integrate itself into this infrastructure.

How Well is the System Engineered? (3/3)

 Ease of use
• describes how complicated the system is to use for the businesspeople
who will be using it on a daily basis. Is it an application that requires a lot
of expertise or training, or is it something a user can apply right out of the
box?

Quality of Available Resources
 Tolerance for noise in data
• is the degree to which the quality of a system, most notably its accuracy, is
affected by noise in the electronic data.
 Tolerance for data sparseness
• is the degree to which the quality of a system is affected by incompleteness
or lack of data.
 Tolerance for complexity
• is the degree to which the quality of a system is affected by interactions
among the various components of the process being modeled or in the
knowledge used to model a process.
 Learning curve requirements
• indicate the degree to which the organization needs to experiment in order
to become sufficiently competent at solving a problem or using a technique.

Logistical Constraints
 Independence from experts
• is the degree to which the system can be designed, built, and tested
without experts. While expertise is valuable, access to experts within an
organization can be a logistical nightmare and can be very expensive.
 Computational ease
• is the degree to which a system can be implemented without requiring
special-purpose hardware or software.
 Development speed
• is the time that the organization can afford to develop a system or,
conversely, the time a modeling technology would require to develop a
system.

 Data-Driven Decision Support

 Evolving Solutions: Genetic Algorithms
 Neural Networks
 Rule-Based Expert Systems
 Fuzzy Logic
 Case-Based Reasoning
 Machine Learning

Data-Driven Decision Support
 OLTP: On-Line Transaction Processing
 ISAM: Indexed Sequential Access Method, early DBMS
 RDBMS: Relational Database Management Systems
• Data Normalization
• SQL: Structured Query Language
 EIS: Executive Information Systems
• Friendly & Intelligent User-interface
 Data Warehousing and OLAP: On-Line Analytical Processing
• LAN: Local Area Network
• Data Loader->Converter->Scrubber->Transformer->Warehouse->OLAP

Evolving Solutions - Genetic Algorithms (I)
 Optimization Problems:
• A set of problem variables
• A set of constraints
• A set of objectives
 Example:
• ACME Transport, Inc., a shipping firm, needs to plan a delivery route that
will minimize the time and cost of shipping but, at the same time, make
deliveries to all 10 of its overseas clients.
• Exhaustive Search: evaluate all possible 10! = 3,628,800 routes.
– Problem: If the number of clients increases to 25, then there are 25! ≈ 1.55 × 10^25
possible routes. Even a very fast computer (evaluating a million routes per second)
could evaluate only about 0.8% of the possible routes in 4 billion years.
• Often not an LP (linear programming) problem
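
The growth of the search space can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function names are illustrative, and the rate of one million routes per second is the one assumed in the example above.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600
ROUTES_PER_SECOND = 1_000_000  # evaluation speed assumed in the example


def route_count(n_clients: int) -> int:
    """Number of different visiting orders for n_clients clients (n!)."""
    return math.factorial(n_clients)


def years_to_enumerate(n_clients: int, rate: float = ROUTES_PER_SECOND) -> float:
    """Years needed to evaluate every route at `rate` routes per second."""
    return route_count(n_clients) / rate / SECONDS_PER_YEAR


for n in (10, 25):
    print(n, route_count(n), years_to_enumerate(n))
```

For 10 clients the full enumeration finishes in a few seconds; for 25 clients it takes far longer than 4 billion years, which is why exhaustive search stops being an option.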

The Example - Genetic Algorithms (II)
 Possible constraints to the ACME problem
• Shipping costs must be less than 70% of fee charged.
• Customer waiting time must be less than 90 days.
• If a customer does more than $x of business with ACME then waiting
time must be less than 60 days.

 Possible objectives to the ACME problem

• Overall delivery time is minimized.
• Overall profit is maximized.
• Ship fleet wear is minimized.
• Number of repeated country visits is minimized.
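
One way to make these constraints and objectives concrete is to write them as small functions over a delivery plan. The sketch below is an assumption-laden illustration: the `Delivery` record, its field names, and the `big_customer_threshold` parameter (standing in for the unspecified $x) are all hypothetical, and only the profit objective is shown.

```python
from dataclasses import dataclass


@dataclass
class Delivery:
    fee: float              # fee charged to the customer
    shipping_cost: float    # cost of making the delivery
    waiting_days: float     # customer waiting time in days
    business_volume: float  # customer's business with ACME, in dollars


def is_feasible(d: Delivery, big_customer_threshold: float) -> bool:
    """Check the three constraints for a single delivery."""
    if d.shipping_cost >= 0.70 * d.fee:   # cost must be under 70% of the fee
        return False
    if d.waiting_days >= 90:              # waiting time must be under 90 days
        return False
    if d.business_volume > big_customer_threshold and d.waiting_days >= 60:
        return False                      # large customers: under 60 days
    return True


def plan_is_feasible(deliveries, big_customer_threshold: float) -> bool:
    """A plan is feasible only if every delivery satisfies the constraints."""
    return all(is_feasible(d, big_customer_threshold) for d in deliveries)


def total_profit(deliveries) -> float:
    """One possible objective: overall profit, to be maximized."""
    return sum(d.fee - d.shipping_cost for d in deliveries)
```

The other objectives (delivery time, fleet wear, repeated country visits) would be further functions of the same plan; how to trade them off against each other is a modeling decision the slides leave open.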

The Origin - Genetic Algorithms (III)
 GAs were originally developed by computer scientist John
Holland in the 1970s as experiments to see if computer
programs could evolve in a Darwinian sense.

 GAs are very useful for solving classes of problems that
were previously computationally prohibitive, especially in
the area of optimization.

 GAs are heuristic techniques that cannot guarantee optimal
solutions. Only near-optimal solutions can be expected.

The Theory of Evolution - Genetic Algorithms (IV)

 Basic Concept:
• Natural Selection, i.e., Survival of the Fittest

 Different kromes will survive based on the compatibility of
their attributes with their environment. They are hunted by
their predators at night.

 Each type of krome represents one solution to the survival
problem. Kromes with better attributes have a higher
probability of surviving and therefore reproducing.

Introduction - Genetic Algorithms (V)
 The smallest unit of a GA is called a gene, representing a unit
of information in your problem domain.
 A series of these genes, or a chromosome, represents one
possible complete solution to the problem.
 A decoder converts the chromosome into a solution to the
problem. (or interprets the meaning of a chromosome)
 A fitness function is then used to determine which
chromosome solutions are good and which are not very good.
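
For a routing problem, the four terms above might map onto code as follows. This is a minimal sketch under stated assumptions: the client names and coordinates are invented for illustration, a gene is taken to be a client index, and fitness is taken to be the negative straight-line route length.

```python
import math

# Hypothetical clients and map coordinates, for illustration only.
CLIENTS = {"A": (0, 0), "B": (3, 0), "C": (3, 4), "D": (0, 4)}
NAMES = sorted(CLIENTS)


def decode(chromosome):
    """Decoder: turn a list of genes (client indices) into a route of names."""
    return [NAMES[gene] for gene in chromosome]


def route_length(route):
    """Total straight-line distance travelled along the route, in order."""
    return sum(math.dist(CLIENTS[a], CLIENTS[b]) for a, b in zip(route, route[1:]))


def fitness(chromosome):
    """Fitness function: shorter routes score higher."""
    return -route_length(decode(chromosome))


print(decode([0, 1, 2, 3]), fitness([0, 1, 2, 3]))
print(decode([0, 2, 1, 3]), fitness([0, 2, 1, 3]))
```

The GA itself never looks inside a chromosome; it only needs the decoder and the fitness function to compare candidates.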

Introduction - Genetic Algorithms (VI)
 A GA randomly creates an initial population of
chromosomes and evaluates their fitness.
 A new generation (new population of chromosomes) is
created by combining and refining the information in the
chromosome using
• Selection
• Crossover
• Mutation

 The process is repeated until a satisfactory solution is found.
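
The loop described above can be sketched in a few dozen lines. The choices here are assumptions rather than anything prescribed by the slides: bit-string chromosomes, tournament selection, one-point crossover, bit-flip mutation, elitism, and a fixed number of generations in place of a "satisfactory solution" test.

```python
import random


def evolve(fitness, n_genes, pop_size=30, generations=60,
           crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Evolve bit-string chromosomes and return the best one found."""
    rng = random.Random(seed)
    # Random initial population of chromosomes.
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]

    def select():
        # Selection: the fitter of two randomly drawn chromosomes wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    best = max(population, key=fitness)
    for _ in range(generations):
        next_population = [best[:]]  # elitism: keep the best chromosome as is
        while len(next_population) < pop_size:
            mother, father = select(), select()
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_genes)  # one-point crossover
                child = mother[:cut] + father[cut:]
            else:
                child = mother[:]
            # Mutation: flip each gene with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best


# Toy problem: maximize the number of 1s in a 20-gene chromosome.
best = evolve(fitness=sum, n_genes=20)
print(best, sum(best))
```

Because the best chromosome is copied unchanged into each new generation, the best fitness can never get worse from one generation to the next; it is still not guaranteed to reach the optimum.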

Notes - Genetic Algorithms (VII)
 Do not guarantee an optimal solution.

 You can use a GA to solve problems that you don’t even
know how to solve. All you need to be able to do is
describe a good solution and provide a fitness function
that can rate a given chromosome.

 The quality of a solution provided by a GA is determined by
how well the problem is formulated.

Simulating the Brain to Solve Problems
- Neural Networks (I)

 Learning preserves the errors of the past, as well as its

 The Learning Process: Induction
• Data
• Generalization
• Model

 The Example:
• Over the years, you must have developed a very good idea of how much time you
need to spend on, and how to prepare for, a quiz to get a certain grade.
• That is, you build mental models based on the past experiences (data) by

The Origin - Neural Networks (II)
 Neural networks were first theorized as early as the 1940s by
two scientists at the University of Chicago (McCulloch and
Pitts). Work was done in the mid-1950s as well (McCarthy
1956; Rosenblatt 1957) when researchers developed simple
neural nets in attempts to simulate the brain’s cognitive
learning processes.
 ANNs are simple computer programs that build models from
data by trial and error.
 Very useful in modeling complex, poorly understood
problems for which sufficient data can be collected.

Nervous Systems - Neural Networks (III)
 Our nervous systems consist of a network of individual but
interconnected nerve cells called neurons.
 Neurons can receive information (stimuli) from the outside
world at various points in the network.
 The information travels through the network by generating
new internal signals that are passed from neuron to neuron.
These new signals ultimately produce a response.
 A neuron passes information on to neighbor neurons by
firing or releasing chemicals called neurotransmitters.

Nervous Systems - Neural Networks (IV)
 The connections between neurons at which information
transfers are called synapses.
 Information can either excite or inhibit neurons.
 Synaptic connections can be strengthened (learning) or
weakened (forgetting) over time with experience.
 With repeated learning, one can generalize his/her
experience, modifying the response to stimuli, and thus
ultimately reach the level of reflexes.

Introduction - Neural Networks (V)
 An ANN involves a system of neurons (or nodes) and weighted
connections (the equivalent of synapses) inside the memory
of a computer.
 Nodes are arranged in layers:
• Input layer
• Hidden layer
• Output layer

 Through learning (trial and error, propagating, other
algorithms), an ANN adjusts the weights on each connection to
match the desired response (minimize the amount of error).
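
The layered structure can be illustrated with a forward pass through a very small network. This is a sketch only: the sigmoid activation, the 2-2-1 layout, and every weight value below are assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(inputs, weights, biases):
    """Each node sums its weighted inputs, adds a bias, and applies the sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
            for node_weights, b in zip(weights, biases)]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer_output(inputs, hidden_w, hidden_b)
    return layer_output(hidden, output_w, output_b)


# Hypothetical weights: 2 inputs -> 2 hidden nodes -> 1 output node.
hidden_w = [[0.5, -0.4], [0.3, 0.8]]
hidden_b = [0.0, 0.0]
output_w = [[1.0, -1.0]]
output_b = [0.0]

print(forward([1.0, 0.0], hidden_w, hidden_b, output_w, output_b))
```

Learning consists of changing the numbers in `hidden_w` and `output_w` so that the printed output moves toward the desired response.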
Training Steps of a Neural Network (1/2)

 Step 1: The network makes a guess based on its current weights and the input.
 Step 2: The net calculates the error associated with the output (at the output
node). For example, if the desired output were 1, but the network output were 0,
the error would be +1, based on the difference between 1 and 0.
 Step 3: The net determines by how much and in what direction each of the
weights leading in to this node needs to be adjusted. How?
This is accomplished by calculating how much each of the individual weighted
inputs to the node contributed to the error, given the particular input value.
So, for example, if a node's output were too small, the net might need to
concentrate on (that is, increase) small or negative weights that lead up to that
node. In essence, the network feeds back the information about how well it's
doing to the neurodes in the net, and where possible problems might be.

Training Steps of a Neural Network (2/2)

 Step 4: The net adjusts the weights of each node in the layer according to the
analysis in the previous step. For example, in the case where the output was
too small, the neural network will try to increase the values of the positive
weights since that would make the weighted sum larger. This would bring the
output closer to 1, which is what you want in this case. Similarly, the neural net
should also try to decrease the size of the negative weights (or even make them
positive).
 Step 5: The net repeats the process by performing a similar set of calculations
(Step 1 to Step 3) for each node in the hidden layer below it. But since you
cannot tell the net what the desired output of each of the hidden nodes should be
(they are internal and hidden), the neural network does a kind of sensitivity
analysis to determine how large the error of each of these nodes is.
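
The five steps can be followed in code for a net with one hidden layer and a single output node. The sketch assumes sigmoid nodes, no bias terms, and a gradient-style update with a hypothetical learning rate of 0.5; the slides do not commit to a particular update rule, so treat this as one common way to realize the steps.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_step(x, target, w_hidden, w_out, rate=0.5):
    """One pass of Steps 1-5. Weights are updated in place; the error
    measured before the update is returned."""
    # Step 1: guess from the current weights and the input.
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    # Step 2: error at the output node (desired minus actual).
    error = target - output
    # Step 3: how much, and in which direction, the output node should move.
    delta_out = error * output * (1.0 - output)
    # Step 5 (sensitivity analysis): share of the error attributed to each
    # hidden node, computed from the weights as they were during the guess.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    # Step 4: adjust the weights leading into the output node.
    for j, h in enumerate(hidden):
        w_out[j] += rate * delta_out * h
    # Step 5 (continued): adjust the weights leading into each hidden node.
    for j, ws in enumerate(w_hidden):
        for i, xi in enumerate(x):
            ws[i] += rate * delta_hidden[j] * xi
    return error


# Hypothetical starting weights for a 2-input, 2-hidden-node, 1-output net.
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
w_out = [rng.uniform(-1, 1) for _ in range(2)]

# Repeatedly present one example whose desired output is 1.
errors = [train_step([1.0, 0.0], 1.0, w_hidden, w_out) for _ in range(200)]
print(errors[0], errors[-1])
```

On this single example the error shrinks with each pass. Training on only one example is for illustration; a real data set would cycle through many examples.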

Note - Neural Networks (VI)
 No domain experts are needed, unlike Rule-based Systems
or Fuzzy Systems.

 Excel at mapping relationships onto data that are noisy
and incomplete.

 Need an adequate learning rate (step size).

 Avoid over-training. (may accidentally learn from noise)

Putting Expert Reasoning in a Box
- Rule-Based Systems (I)

 Learn to reason forward and backward on both sides of a


 You can view much of problem solving as consisting of

• Automobile/Car Repair
• Medical Care
• Accounting and Tax Practice
• Quality Control

 The most famous RBS is XCON, developed in 1979 by
Digital Equipment Corp.

The Basic Concept
- Rule-Based Systems (II)

 CreditBank loan application example

 IF employment stability is very low

AND credit history is very low
THEN credit risk is very high

 The region to which the rules apply, as in Fig. 7.1, is called the
problem space. Each cube is essentially a rule. In other
words, a rule “samples” a region of the problem space.
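
The CreditBank rule can be written down as plain data plus a small matching function. This is a sketch, not how any particular expert-system shell stores rules; the dictionary layout and the `apply_rule` name are assumptions.

```python
# The CreditBank rule: conditions under "if", conclusion under "then".
RULE = {
    "if": {"employment stability": "very low", "credit history": "very low"},
    "then": ("credit risk", "very high"),
}


def apply_rule(rule, facts):
    """Return the rule's conclusion if every condition matches the facts, else None."""
    if all(facts.get(attr) == value for attr, value in rule["if"].items()):
        return rule["then"]
    return None


applicant = {"employment stability": "very low", "credit history": "very low"}
print(apply_rule(RULE, applicant))
```

An applicant whose facts fall outside the region covered by the rule simply gets no conclusion from it, which is why a rule base needs enough rules to cover the problem space.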

The Basic Concept
- Rule-Based Systems (III)

 The part before the “then” is referred to as the condition
part of the rule or the left-hand side (LHS), and the part
after the “then” as the action part, or the right-hand side (RHS).

 Forward Chaining

 Hypothesize

 Backward Chaining

The Basic Concept
- Rule-Based Systems (IV)

 How the rules are used is flexible and is referred to as the
control strategy.

 Three basic components

• a rule base
• working memory
• a rule interpreter

 Steps (Recognize-Act Cycle)

• Rules are matched against the data
• The interpreter selects one instantiated rule
• The selected rule is fired
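
The three components and the recognize-act cycle fit together as in the sketch below. Several things here are assumptions: the second rule and its "decision" attribute are invented for illustration, the control strategy is simply "first matching rule wins", and each rule is allowed to fire only once so that the loop always terminates.

```python
# Rule base. The first rule is the CreditBank example; the second is hypothetical.
RULES = [
    {"name": "risk",
     "if": {"employment stability": "very low", "credit history": "very low"},
     "then": ("credit risk", "very high")},
    {"name": "decision",
     "if": {"credit risk": "very high"},
     "then": ("decision", "reject")},
]


def forward_chain(rule_base, working_memory):
    """Rule interpreter: run the recognize-act cycle until no rule can fire."""
    memory = dict(working_memory)  # working memory: the facts known so far
    fired = []
    while True:
        # Recognize: match every unfired rule's conditions against the data.
        matched = [rule for rule in rule_base
                   if rule["name"] not in fired
                   and all(memory.get(k) == v for k, v in rule["if"].items())]
        if not matched:
            return memory, fired
        # Select: a deliberately simple control strategy, first match wins.
        rule = matched[0]
        # Act: fire the rule by writing its conclusion into working memory.
        attribute, value = rule["then"]
        memory[attribute] = value
        fired.append(rule["name"])


facts = {"employment stability": "very low", "credit history": "very low"}
memory, fired = forward_chain(RULES, facts)
print(fired, memory)
```

Firing the first rule adds a new fact that lets the second rule match on the next cycle, which is the chaining behavior the cycle is meant to produce.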

The Basic Concept
- Rule-Based Systems (V)

 Differences between RBS and Decision Tree

• How well you understand the problem at that time
• Modification
• RBS tells nothing about how to do things
• One direction vs. multiple directions

 The order in which rules are processed affects the results

• Meta Rules

- Rule-Based Systems

 60% to 70% of the time taken to develop rule-based
systems is spent on knowledge acquisition.

 Only worth considering when you have experts available.

 What-if analysis using dependency network.

 The difficulty: getting the right rules to fire at the right time.

