Software Reliability2
RELIABILITY
Part 2
Where are we now?
• Previous course
• System vs software reliability
• Model
• Module vs operation mode
• Software Reliability Prediction
• Metrics
• Software FRACAS
• Musa model
• This course
• Operational profile
• Human reliability
• SRE best practices
Factors Influencing Software Reliability
• A user’s perception of the reliability of a software
depends upon two categories of information.
• The number of faults present in the software.
• The ways users operate the system.
• This is known as the operational profile.
Cognitive approach
• Models and theories of the cognitive functions that underlie human behavior
• Cognitive psychology is still immature
• Problem: human cognition is not directly observable
Quantification techniques
• HEART: a human performance model-based technique
utilizing some standard probabilities
• Source: J.C. Williams (1988), "A data-based method for assessing and reducing human error to improve operational performance," IEEE Fourth Conference on Human Factors and Power Plants (pp. 436-450)
• Based on long-term sizeable human reliability database; weighting
factors based on HF literature.
• Assumes human performance usually deteriorates when Error
Producing Conditions (EPCs) interact
• SLIM: a utility-based technique using team-based judgments
• THERP: earliest method
HEART generic categories
Generic task: nominal human unreliability (5th-95th percentile bounds)
(A) Totally unfamiliar, performed at speed with no real idea of likely consequences: 0.55 (0.35-0.97)
(B) Shift or restore system to a new or original state on a single attempt without supervision or procedures: 0.26 (0.14-0.42)
(C) Complex task requiring high level of comprehension and skill: 0.16 (0.12-0.28)
(D) Fairly simple task performed rapidly or given scant attention: 0.09 (0.06-0.13)
(E) Routine, highly practiced, rapid task involving relatively low level of skill: 0.02 (0.007-0.045)
(F) Restore or shift a system to original or new state following procedures, with some checking: 0.003 (0.0008-0.007)
(G) Completely familiar, well-designed, highly practiced, routine task occurring several times per hour, performed to highest possible standards by highly motivated, highly-trained and experienced personnel, with time to correct potential error, but without the benefit of significant job aids: 0.0004 (0.00008-0.009)
(H) Respond correctly to system command even when there is an augmented or automated supervisory system providing accurate interpretation of system state: 0.00002 (0.000006-0.0009)
Error producing Conditions (EPCs) (selection) Factor
Unfamiliarity with a situation which is potentially important but which only occurs infrequently or which is novel 17
A shortage of time available for error detection and correction 11
A low signal-noise ratio 10
A means of suppressing or over-riding information or features which is too easily accessible 9
No obvious means of reversing an unintended action 8
A need to unlearn a technique and apply one which requires the application of an opposing philosophy 6
The need to transfer specific knowledge from task to task without loss 5.5
Ambiguity in the required performance standards 5
Poor, ambiguous or ill-matched system feedback 4
A mismatch between perceived and real risk. 4
No clear, direct and timely confirmation of an intended action from the portion of the system over which control is exerted 4
Operator inexperience (e.g., a newly qualified tradesman but not an expert) 3
A mismatch between the educational achievement level of an individual and the requirements of the task 2
Little opportunity to exercise mind and body outside the immediate confines of a job 1.8
Little or no intrinsic meaning in a task 1.4
High level emotional stress 1.3
Evidence of ill-health amongst operatives especially fever. 1.2
Low workforce morale 1.2
A poor or hostile environment 1.15
Prolonged inactivity or highly repetitious cycling of low mental workload tasks (1st half hour) 1.1
Prolonged inactivity or highly repetitious cycling of low mental workload tasks (thereafter) 1.05
Disruption of normal work sleep cycles 1.1
Task pacing caused by the intervention of others 1.06
Additional team members over and above those necessary to perform task normally and satisfactorily (per additional team member) 1.03
How does it all come together?
• Find out task level:
• (E) Routine, highly practiced, rapid task involving relatively low level of
skill
• Nominal unreliability 0.02, so R = 1 - 0.02 = 0.98
• Are there any EPCs?
• A mismatch between perceived and real risk. E1=4
• Little or no intrinsic meaning in a task E2=1.4
• Additional team members (per member) E3=1.03*3
• Assess proportion of EPC ( ≠ 1)
• P1= 0.5, P2=0.2, P3=0.5
• Assess effect of each EPC: F = ((E-1)*P)+1
• F1=2.5, F2=1.08, F3=2.045
• Assessed probability of failure
• 0.02*2.5*1.08*2.045=0.11043
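The calculation above can be reproduced with a short script. This is a minimal sketch that uses only the numbers from the worked example; the function and variable names are illustrative assumptions, not part of HEART itself.

```python
from math import prod


def assessed_effect(max_effect: float, proportion: float) -> float:
    """Assessed effect of one EPC: ((E - 1) * P) + 1."""
    return (max_effect - 1.0) * proportion + 1.0


def heart_failure_probability(nominal: float, epcs) -> float:
    """Nominal unreliability multiplied by the assessed effect of every EPC."""
    return nominal * prod(assessed_effect(e, p) for e, p in epcs)


# Values taken from the worked example: generic task (E) and three EPCs.
nominal_e = 0.02
epcs = [
    (4.0, 0.5),       # mismatch between perceived and real risk
    (1.4, 0.2),       # little or no intrinsic meaning in a task
    (1.03 * 3, 0.5),  # three additional team members, combined as on the slide
]

effects = [assessed_effect(e, p) for e, p in epcs]
p_fail = heart_failure_probability(nominal_e, epcs)

print([round(f, 3) for f in effects])
print(round(p_fail, 5))
```

The script keeps the slide's way of combining the three additional team members (1.03*3) so that the result matches the figure shown above.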
Human + software reliability
How do they interact?
The operational profile
• A software-based product’s reliability depends on just how
a customer will use it.
• Making a good reliability estimate depends on testing the product
as if it were in the field.
• The operational profile
• quantitative characterization of how a system will be used
• Works also for hardware, human components
• Can be used for the whole system
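One way to "test as if in the field" is to draw test operations at random in proportion to their occurrence probabilities. The sketch below assumes a hypothetical four-operation profile; the operation names, the probabilities, and the function name are illustrative only.

```python
import random

# Hypothetical operational profile: operation -> occurrence probability.
# These names and numbers are assumptions made for the illustration.
profile = {
    "create message": 0.2,
    "look up address": 0.1,
    "send message": 0.3,
    "open message": 0.4,
}


def draw_test_operations(profile, n, seed=0):
    """Draw n operations with frequencies that follow the profile."""
    rng = random.Random(seed)  # fixed seed so a test run is repeatable
    operations = list(profile)
    weights = [profile[op] for op in operations]
    return rng.choices(operations, weights=weights, k=n)


sample = draw_test_operations(profile, 1000)
counts = {op: sample.count(op) for op in profile}
print(counts)
```

With enough draws, the share of each operation in the test run approaches its probability in the profile, so frequently used operations receive proportionally more testing.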
Who develops an operational profile?
• Developed by:
• systems engineers
• high-level designers (architecture)
• test planners
• Strong participation by:
• product planning
• marketing professionals
• key customers, if available
[Figure: profile levels: user groups, system modes, functional, operational]
Example: retail store market
1. Customer profile
• Customer groups:
• large retail stores
• small chains
• grocery chains
2. User profile
• User groups:
• cashiers
• marketing analysts
• I-S specialists
3. System-mode profile
• System modes:
• I-S specialists do database cleanup and also report generation
4. Functional profile
• Functions:
• each mode has several functions (e.g., various reports in report generation mode)
• Note: the word "function" is used from the user's perspective, i.e. a user task
5. Operational profile
• Operations:
• user functions are mapped onto the software product's operations
Notes:
• Some steps may be unnecessary
• Uniformity of detail is not required
Step 1: Customer Profile
Customer                   Occurrence Probability
Educational Institution    0.45
Business Organization      0.35
Individual Home User       0.20
Ex: software spreadsheet package
For instance, schools might use them for tabulating and updating student grades. Businesses might use them mainly for financial and operations controls. Home users could keep track of their monthly income and expenses, as well as investments and savings plans.
The customer profile is the list of customer types and the associated probabilities. These probabilities are simply the proportions of time that each type of customer would be using the system.
Step 2: User Profile
Customer / User            Occurrence Probability
Educational Institution    0.45
    Secretary              50%
    Managers               30%
    Teachers               20%
Business Organization      0.35
    Secretary              40%
    Managers               60%
Individual Home User       0.20
    Individuals            100%
The user profile is the set of user types and their associated probabilities of using the system.
Within a customer group: use the proportion of the customer group's usage that the user group represents.
If usage cannot be determined, use the number of users as a proportion of the total users in that group.
Combine the same user groups found in different customer groups.
Step 2: User Profile (combined across customer groups)
User                 Occurrence Probability
Secretary            0.5*0.45+0.4*0.35=0.365
Managers             0.3*0.45+0.6*0.35=0.345
Teachers             0.2*0.45=0.09
Other individuals    0.2
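The combination across customer groups can be checked with a few lines of code. This is a sketch built from the spreadsheet-package numbers in Steps 1 and 2; the dictionary layout and the function name are assumptions made for the illustration.

```python
# Step 1: customer profile (occurrence probability per customer type).
customer_profile = {
    "Educational Institution": 0.45,
    "Business Organization": 0.35,
    "Individual Home User": 0.20,
}

# Step 2: share of each customer group's usage taken by each user type.
users_within_customer = {
    "Educational Institution": {"Secretary": 0.5, "Managers": 0.3, "Teachers": 0.2},
    "Business Organization": {"Secretary": 0.4, "Managers": 0.6},
    "Individual Home User": {"Individuals": 1.0},
}


def overall_user_profile(customers, users):
    """Weight each user share by its customer probability and merge
    user types that appear under more than one customer group."""
    profile = {}
    for customer, p_customer in customers.items():
        for user, p_user in users[customer].items():
            profile[user] = profile.get(user, 0.0) + p_customer * p_user
    return profile


user_profile = overall_user_profile(customer_profile, users_within_customer)
for user, p in user_profile.items():
    print(f"{user}: {p:.3f}")
```

The resulting probabilities sum to 1, which is a quick sanity check that no customer group or user type was dropped.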
Step 3: System Mode Profile
System mode              Occurrence Probability
Batch Mode               0.35
User-Interactive Mode    0.65
A system mode is a way that a system can operate. The system includes both hardware and software. Most systems have more than one mode of operation. For example, system testing may take place in batch mode or user-interactive mode. An airplane flight consists of takeoff and ascent mode, level flight mode and descent and land mode. An automobile may be in normal mode or four-wheel drive; it may also be in normal mode or cruise control. System modes can be thought of as independent segments of a system operation or various different ways of using a system. A system can switch among modes sequentially, or it can permit several modes to operate concurrently, sharing the same system resources. For each system mode, if there are more than one or two, an operational profile (and sometimes functional profile) should be developed. There are no technical limits on how many system modes may be established.
Short recap - Operational Profile Development
• Musa, J.D., “Operational Profiles in Software Reliability
Engineering,” IEEE Software Magazine, March 1993
Functional profile – 1/2
• After a good system mode profile has been developed, the
focus should turn to evaluation of each system mode for the
functions performed during that mode, and then assigning
probabilities to each of the functions.
• Functions
• are essentially tasks that an external entity such as a user can perform
with the system.
• Example: a user of an e-mail system would want the following functions: create
message, look up address, send message, open message
• are based on what activities the customer wants the system to be able
to perform.
• Developing a functional profile is, in that respect, a part of developing
requirements.
• A functional profile need not have a defined number of
functions, but generally contains 20 to more than a hundred.
The number will vary based on project size, number of system
modes, environmental considerations, and function breadth.
Functional profile – 2/2
• The functional profile can be either explicit or implicit,
depending on the key input variables
• A key input variable is an external parameter which affects the
execution path a software system traverses based on the
different values the parameter takes on.
• Key input variables take on ranges of values that cause different operations to be
performed.
• These various ranges are referred to as levels.
• A profile is explicit if each element is designated by
simultaneously specifying the levels of all key input variables
needed for its identification.
• A profile is implicit if it is expressed by subprofiles of each key
variable.
• That is, each key environmental parameter is assigned probabilities
associated with the ranges it can legally use.
Implicit Profile
[Figure: implicit profile built from subprofiles; labels shown: Subprofile C, Subprofile D]
• Wait a second….
It’s not that simple…
1. Generate an initial function list
• features and capabilities needed by the users
• organized by functions relevant to each key input variable if an implicit
profile is used
2. Determine environmental variables
• environmental variables characterize the conditions that influence the paths
traversed by a program, but do not correspond directly to features
• Ex: hardware configuration and traffic load
3. Create final function list
• environmental and feature variables should be examined for dependencies
• Partial dependencies can cause difficulties because all possible
combinations of levels of both variables may need to be listed
• The final number of functions in the list is then calculated as the product of
the number of functions in the initial list and the number of environmental
variable levels, minus the combinations of initial functions and
environmental variable values that do not occur.
4. Assign occurrence probabilities
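Step 3 can be sketched as a cross product followed by a subtraction. The function names and environmental levels below come from the statistical-package sample in this deck; the excluded combination is hypothetical and only illustrates the "minus the combinations that do not occur" rule.

```python
from itertools import product

# Initial function list and environmental variable levels (from the sample).
initial_functions = [
    "Standard Deviation",
    "Correlation",
    "Analysis of Variance",
    "Regression",
]
environment_levels = ["X", "Y"]


def final_function_list(functions, levels, excluded=frozenset()):
    """All (function, level) pairs except combinations that never occur."""
    return [pair for pair in product(functions, levels) if pair not in excluded]


# No exclusions: size is len(functions) * len(levels).
full_list = final_function_list(initial_functions, environment_levels)

# Hypothetical exclusion, assumed only to show the subtraction step.
never_occurs = {("Regression", "X")}
reduced_list = final_function_list(initial_functions, environment_levels, never_occurs)

print(len(full_list), len(reduced_list))
```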
Sample final function list
Function                Environmental Variable
Standard Deviation      X
Standard Deviation      Y
Correlation             X
Correlation             Y
Analysis of Variance    X
Analysis of Variance    Y
Regression              X
Regression              Y
Final function list
Chi-Square
Functional Profile Segment
Function                System Mode Occurrence Probability    Overall Occurrence Probability
Standard Deviation      0.60                                  0.12
Correlation             0.22                                  0.044
Analysis of Variance    0.10                                  0.02
Regression              0.08                                  0.016
Environmental Profile
Variable count    Occurrence Probability
One (X)           0.6
Multiple (Y)      0.4
For the assignment of occurrence probabilities, the ideal data source consists of usage measurements taken on the latest release or a similar system. These measurements may be obtained from system logs or data storage devices. Occurrence probabilities computed with the historical data should be updated to account for new functions, users, or environments. In the event that a system is completely new the functional profile might be very inaccurate. It should still be developed, however, and updated as more is known about how the system will be operated. The process of predicting usage forces interaction with the customer, which can be very important. The required dialogue may highlight the relative importance of the various functions, indicating that some functions may not be necessary while others are most significant.
Reducing the number of functions should increase reliability.
Chi-Square
Final Functional Profile Segment
Function    System Mode Occurrence Probability    Overall Occurrence Probability
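The rows of the final functional profile segment are not shown here. As a hedged sketch, one common way to fill such a table is to multiply each function's overall occurrence probability by the occurrence probability of each environmental level. This assumes the function and the environmental variable are independent, which the slides do not state; the function name is also an assumption.

```python
# Overall occurrence probabilities from the Functional Profile Segment table.
overall_occurrence = {
    "Standard Deviation": 0.12,
    "Correlation": 0.044,
    "Analysis of Variance": 0.02,
    "Regression": 0.016,
}

# Environmental profile: one variable (X) vs multiple variables (Y).
environmental_profile = {"X": 0.6, "Y": 0.4}


def final_functional_profile(functions, environment):
    """Probability of each (function, level) pair, ASSUMING independence
    between the function and the environmental variable."""
    return {
        (name, level): p_function * p_level
        for name, p_function in functions.items()
        for level, p_level in environment.items()
    }


final_profile = final_functional_profile(overall_occurrence, environmental_profile)
for (name, level), p in final_profile.items():
    print(f"{name} / {level}: {p:.4f}")
```

Because the segment's overall probabilities sum to 0.2, the eight final entries also sum to 0.2; the remaining probability belongs to parts of the profile outside this segment.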