Basic Definitions: Testing: What Is Software Testing?

Basic Definitions: Testing
 What is software testing?

• Running a program
• In order to find faults
• a.k.a. defects
• a.k.a. errors
• a.k.a. flaws
• a.k.a. faults
• a.k.a. BUGS
 Hrm. . . that’s a lot of “a.k.a”s

• Let’s refine this terminology a bit
1
Faults, Errors, and Failures
 Fault: a static flaw in a program
• What we usually think of as “a bug”
 Error: a bad program state that results from a fault
• Not every fault always produces an error
 Failure: an observable incorrect behavior of a program as a result of an error
• Not every error ever becomes visible
2
To Expose a Fault with a Test
 Reachability: the test must actually reach
and execute the location of the fault
 Infection: the fault must actually corrupt
the program state (produce an error)
 Propagation: the error must persist and
cause an incorrect output – a failure
3
An Example
Find the fault:

int findLast (int a[], int n, int x) {
  // Returns index of last element in a
  // equal to x, or -1 if no such.
  // n is length of a
  int i;
  for (i = n-1; i > 0; i--) {
    if (a[i] == x)
      return i;
  }
  return -1;
}
4
An Example
Here’s a test case: a = {}, n = 0, x = 2

int findLast (int a[], int n, int x) {
  // n is length of a
  int i;
  for (i = n-1; i > 0; i--) {
    if (a[i] == x)
      return i;
  }
  return -1;
}

Does not even reach the fault
5
An Example
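Here’s another: a = {3, 9, 4}, n = 3, x = 2

int findLast (int a[], int n, int x) {
  // n is length of a
  int i;
  for (i = n-1; i > 0; i--) {
    if (a[i] == x)
      return i;
  }
  return -1;
}

Reaches the fault
Infects state with error (the loop exits without examining a[0])
But no failure (2 is not in a, so -1 is still the correct answer)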
6
An Example
And finally: a = {2, 9, 4}, n = 3, x = 2

int findLast (int a[], int n, int x) {
  // n is length of a
  int i;
  for (i = n-1; i > 0; i--) {
    if (a[i] == x)
      return i;
  }
  return -1;
}

Reaches the fault
Infects state with error
And fails: returns -1 instead of 0
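The three test cases above can be replayed in code. This is a minimal sketch: findLastFixed and exposesFailure are hypothetical helpers that are not part of the slides, and the corrected guard (i >= 0) is used as the oracle. The empty array is modelled as a one-element array passed with n = 0, because C does not allow zero-length arrays.

```c
#include <assert.h>

/* The faulty findLast from the slides: the loop guard should be i >= 0. */
int findLast(const int a[], int n, int x) {
    int i;
    assert(n >= 0);
    for (i = n - 1; i > 0; i--) {   /* fault: index 0 is never examined */
        if (a[i] == x)
            return i;
    }
    return -1;
}

/* Hypothetical reference version with the guard corrected (the oracle). */
int findLastFixed(const int a[], int n, int x) {
    int i;
    assert(n >= 0);
    for (i = n - 1; i >= 0; i--) {
        if (a[i] == x)
            return i;
    }
    return -1;
}

/* Returns 1 if this input makes the faulty version fail, 0 otherwise. */
int exposesFailure(const int a[], int n, int x) {
    return findLast(a, n, x) != findLastFixed(a, n, x);
}
```

Only the third input, a = {2, 9, 4} with x = 2, makes exposesFailure return 1. The first two inputs give the same output from both versions, even though the second one puts the faulty version into an erroneous state.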
7
Controllability and Observability
 Goals for a test case:
• Reach a fault
• Produce an error
• Make the error visible as a failure
 In order to make this easy the program must be
controllable and observable
• Controllability:
• How easy it is to drive the program where we want
to go
• Observability:
• How easy it is to tell what the program is doing
8
Design for Testability
 If a program is not designed to be
controllable and observable, it generally
won’t be
 We have to start preparing for testing
before we write any code
• Testing as an after-the-fact, ad hoc, exercise is
often limited by earlier design choices
9
Test-Driven Development
 One way to design for testability is to write the
test cases before the code
• Idea arising from Extreme Programming and agile
development
• Write automated test cases first
• Then write the code to satisfy tests
• Helps focus attention on making software well-specified

• Forces observability and controllability: you have to be
able to handle the test cases you’ve already written
(before deciding they were impractical)
• Reduces temptation to tailor tests to idiosyncratic
behaviors of implementation
10
Controllability: Simulation and Stubbing
 A key to controllable code is effective
simulation and stubbing
• Simulation of low-level hardware devices
through a clean driver interface
• Real hardware may be slow
• May be impossible/expensive to induce some
hardware failure modes on real hardware
• Real hardware may be a limited resource
• Stubbing for other routines and code
• Other code/modules may not be complete
• May be slow and irrelevant to test
• May need to simulate failure of other modules
11
Simulation and Stubbing: JPL Example
 When testing JPL flash storage modules we
rely on software simulation of flash devices
• Real flash devices are slow
• Can’t do aggressive random testing
• Real flash devices are expensive
• JPL only has a few boards – constant competition to
test on these
• Running hundreds of thousands of tests will wear the
flash hardware out
• Enables us to introduce rare hardware failures
• System resets, spontaneous bad blocks and write
failures, etc.
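As an illustration of simulation with fault injection, here is a minimal sketch of a simulated flash device behind a driver interface. It is not the JPL code: FlashDriver, sim_flash_init, store_pages and the writes_until_failure knob are all hypothetical names chosen for this example.

```c
#include <assert.h>
#include <string.h>

#define SIM_PAGES 8
#define SIM_PAGE_SIZE 16

/* Hypothetical driver interface: code under test only calls write_page. */
typedef struct FlashDriver {
    int (*write_page)(struct FlashDriver *drv, int page, const unsigned char *data);
    unsigned char storage[SIM_PAGES][SIM_PAGE_SIZE];
    int writes_until_failure;   /* negative means "never fail" */
} FlashDriver;

/* Simulated write: succeeds until the injected failure point is reached. */
static int sim_write_page(FlashDriver *drv, int page, const unsigned char *data) {
    assert(page >= 0 && page < SIM_PAGES);
    if (drv->writes_until_failure == 0)
        return -1;                      /* injected write failure */
    if (drv->writes_until_failure > 0)
        drv->writes_until_failure--;
    memcpy(drv->storage[page], data, SIM_PAGE_SIZE);
    return 0;
}

/* Erases the simulated device (all bytes 0xFF) and arms the failure knob. */
void sim_flash_init(FlashDriver *drv, int writes_until_failure) {
    memset(drv->storage, 0xFF, sizeof drv->storage);
    drv->write_page = sim_write_page;
    drv->writes_until_failure = writes_until_failure;
}

/* Code under test: writes `count` pages, returns how many succeeded. */
int store_pages(FlashDriver *drv, const unsigned char *data, int count) {
    int written = 0;
    for (int page = 0; page < count; page++) {
        if (drv->write_page(drv, page, data) != 0)
            break;
        written++;
    }
    return written;
}
```

Because store_pages only sees the function pointer, a test can make the third write fail on demand, which would be hard or expensive to arrange on real hardware.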
12
Controllability: Downwards Scalability
 Another important aspect of controllability is
to make code “downwards scalable”
• Many faults cause an error only in a corner
case due to a resource limit
• An effective strategy for finding errors is to
reduce the resource limits
• Test a version of the program with very tight bounds
• Finding corner cases is easier if the corners are
close together
• Too many programs hard-code resource limits
or make assumptions about resources
unconnected to defined limits
• E.g., not checking the result of malloc
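A minimal sketch of downwards scalability, assuming a simple ring-buffer queue (Queue, queue_init, queue_push and queue_pop are hypothetical names, not from the slides). The resource limit is a run-time parameter rather than a hard-coded constant, so a test can shrink it to 2 and hit the "full" and "wrap-around" corners after a handful of operations.

```c
#include <assert.h>

#define QUEUE_MAX 1024   /* production-sized backing store */

typedef struct {
    int items[QUEUE_MAX];
    int capacity;        /* configurable limit, 1..QUEUE_MAX */
    int head, count;
} Queue;

void queue_init(Queue *q, int capacity) {
    assert(capacity >= 1 && capacity <= QUEUE_MAX);
    q->capacity = capacity;
    q->head = 0;
    q->count = 0;
}

/* Returns 0 on success, -1 if the queue is full. */
int queue_push(Queue *q, int value) {
    if (q->count == q->capacity)
        return -1;
    q->items[(q->head + q->count) % q->capacity] = value;
    q->count++;
    return 0;
}

/* Returns 0 on success, -1 if the queue is empty. */
int queue_pop(Queue *q, int *out) {
    if (q->count == 0)
        return -1;
    *out = q->items[q->head];
    q->head = (q->head + 1) % q->capacity;
    q->count--;
    return 0;
}
```

With capacity 2, the third push already reports "full" and the next push after a pop wraps around; with capacity 1024 the same corners need over a thousand operations to reach.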
13
Downwards Scalability: JPL Example
 Flight flash hardware is usually a 1-4 GB
device
• E.g., 64 blocks of 32 pages of 8192 bytes
 We primarily test with much smaller “devices”
(using software simulation)
• 6 blocks of 4 pages of 64 bytes
• Forces flash file system to compact storage
more often
• Tests assumptions about how space is used on
flash
• Forces more multi-page writes and directory
entries over multiple pages
14
Downwards Scalability: JPL Example
 Easier to explore various combinations of
states of blocks/pages of the device
[Figure: states of the blocks/pages of the device; legend: used page, free page, dirty page, bad block]
15
Controllability
 Other important themes for controllability
• Network/file access
• If program reads from the network or from remote files,
this is hard to control
• Again, simulation and stubbing are key
• System calls
• Similarly, reading the time from the operating system
can be hard to control
• Simulation and stubbing – Operating System
Abstraction Layer etc.
• GUI control
• Allow scripted control of GUI elements so tests can
be automated
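For the system-call case, one possible shape of an abstraction layer is a clock passed in as a function pointer. This is a sketch under assumptions: ClockFn, real_clock, fake_clock, fake_now and session_expired are hypothetical names, and only time() and difftime() come from the standard library.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical abstraction layer: code asks this function for the time. */
typedef time_t (*ClockFn)(void);

/* Production clock: reads the time from the operating system. */
time_t real_clock(void) { return time(NULL); }

/* Stub clock for tests: returns whatever the test last stored in fake_now. */
time_t fake_now = 0;
time_t fake_clock(void) { return fake_now; }

/* Code under test: has the session that began at `started` timed out? */
int session_expired(time_t started, long timeout_seconds, ClockFn clock_fn) {
    assert(clock_fn != NULL);
    return difftime(clock_fn(), started) >= (double)timeout_seconds;
}
```

A test sets fake_now to exactly one second before and exactly at the timeout boundary, which cannot be done reliably against the real clock.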
16
Observability: Assertions
 Assertions improve observability by making
(some) errors into failures
• Even if the effect of a fault doesn’t propagate, it
may be visible if an assertion checks the state
at the right time
 Assertions also improve observability by
making the error, rather than failure, visible
• Know how the state was corrupted
directly, not just eventual effect
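Applied to the running example, a single internal check turns the silent error into something a test can see. This is a sketch: the CHECK macro and the check_failures counter are hypothetical stand-ins for assert(), used here so that a violated check can be counted instead of aborting the program.

```c
#include <assert.h>

/* Counts violated checks so a test harness can inspect them;
   a real build would more likely call assert() directly. */
int check_failures = 0;
#define CHECK(cond) do { if (!(cond)) check_failures++; } while (0)

/* Faulty findLast from the running example, plus one internal check. */
int findLastChecked(const int a[], int n, int x) {
    int i;
    assert(n >= 0);
    for (i = n - 1; i > 0; i--) {   /* fault: should be i >= 0 */
        if (a[i] == x)
            return i;
    }
    /* Falling out of the loop should mean every index was examined. */
    CHECK(i < 0);
    return -1;
}
```

For a = {3, 9, 4} and x = 2 the function still returns the correct -1, so there is no failure in the output, but the check fires because the loop stopped at i == 0. The error is now observable at the point where the state went wrong.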
17
Observability: Invariant Checkers
 Can extend the idea of assertions to writing
“full” invariant checkers
• Do a crawl of code’s basic data structures
• Check various invariants that would be
too expensive to check at runtime
• Invariant checker can use whatever makes it
easy to write: recursion, memory
allocation, etc.
• It won’t run on the actual system
• But be careful! If your invariant checker has
a bug and changes the system state. . .
18
Observability
 Other important themes for observability
• Logging
• Especially critical for GUI interfaces, to mirror
GUI events in ordered parseable messages
• Network/file access
• If program writes to the network or to remote
files, this is hard to observe
19
Controllability & Observability: Memory
Allocation
 More extreme case: embedded code for
mission or safety critical systems
• May be running without memory protection
• Dynamic allocation often forbidden
 Design module to accept a static block allocated
elsewhere, and only access this memory
• Controllability: allows us to introduce memory
faults, simulate warm reboots
• Observability: allows us to easily instrument
code with low-overhead checks to find memory
safety violations during testing
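A minimal sketch of the static-block design, assuming a simple bump allocator (Arena, arena_init, arena_alloc and arena_guard_intact are hypothetical names; alignment is ignored for brevity). The module touches only the block it is handed, and a guard zone at the end of the block gives a cheap memory-safety check during testing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GUARD_BYTE 0xA5
#define GUARD_SIZE 8

/* Hypothetical module state: all memory comes from a caller-supplied block. */
typedef struct {
    unsigned char *block;
    size_t size;      /* usable bytes, excluding the trailing guard zone */
    size_t used;
} Arena;

/* The caller owns `block`; its last GUARD_SIZE bytes become the guard zone. */
void arena_init(Arena *ar, unsigned char *block, size_t block_size) {
    assert(block != NULL && block_size > GUARD_SIZE);
    ar->block = block;
    ar->size = block_size - GUARD_SIZE;
    ar->used = 0;
    memset(block + ar->size, GUARD_BYTE, GUARD_SIZE);
}

/* Bump allocation inside the block; NULL when the block is exhausted. */
void *arena_alloc(Arena *ar, size_t n) {
    if (n > ar->size - ar->used)
        return NULL;
    void *p = ar->block + ar->used;
    ar->used += n;
    return p;
}

/* Check for tests: has anything written past the usable area? */
int arena_guard_intact(const Arena *ar) {
    for (size_t i = 0; i < GUARD_SIZE; i++)
        if (ar->block[ar->size + i] != GUARD_BYTE)
            return 0;
    return 1;
}
```

Because the test owns the block, it can also fill it with garbage before arena_init to mimic a warm reboot, or flip bytes in it to mimic a memory fault.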
20
Coverage
 Literature of software testing is primarily
concerned with various notions of coverage
 Ammann and Offutt identify four basic kinds of
coverage:
• Graph coverage
• Logic coverage
• Input space partitioning
• Syntax-based coverage
21
Graph Coverage
 Cover all the nodes, edges, or paths of some graph related to the program
 Examples:
• Statement coverage
• Branch coverage
• Path coverage
• Data flow (def-use) coverage
• Model-based testing coverage
• Many more – most common kind of
coverage, by far
22
Graph Coverage
 Most FSM testing algorithms can be seen as graph coverage
• Consider VC – computing a spanning tree
to nodes is standard graph exploration
• Beizer: “find a graph and cover it”
23
Statement/Basic Block Coverage
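Statement coverage: cover every node of these graphs

if (x < y)
{
  y = 0;
  x = x + 1;
}
else
{
  x = y;
}

[Figure: control flow graph. Node 1 branches on x < y to node 2 (y = 0; x = x + 1) and on x >= y to node 3 (x = y); nodes 2 and 3 both lead to node 4.]

if (x < y)
{
  y = 0;
  x = x + 1;
}

[Figure: control flow graph. Node 1 branches on x < y to node 2 (y = 0; x = x + 1), which leads to node 3; on x >= y, node 1 goes directly to node 3.]

Treat y = 0; x = x + 1; as one node because if one statement executes the other must also execute (code is a basic block)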
24
Branch Coverage
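Branch coverage vs. statement coverage: same for if-then-else

if (x < y)
{
  y = 0;
  x = x + 1;
}
else
{
  x = y;
}

[Figure: control flow graph. Node 1 branches on x < y to node 2 (y = 0; x = x + 1) and on x >= y to node 3 (x = y); nodes 2 and 3 both lead to node 4.]

But consider this if-then structure:

if (x < y)
{
  y = 0;
  x = x + 1;
}

[Figure: control flow graph. Node 1 branches on x < y to node 2 (y = 0; x = x + 1), which leads to node 3; on x >= y, node 1 goes directly to node 3.]

For branch coverage we can’t just cover all nodes, but must cover all edges: get to node 3 both after 2 and without executing 2!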
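The difference can be checked with hand-written instrumentation of the if-then fragment. This is a sketch: the counters and the helper functions (reset_coverage, fragment, all_nodes_covered, all_edges_covered) are hypothetical names, and a real project would normally get these numbers from a coverage tool instead.

```c
#include <assert.h>

/* Coverage counters, one per node and per edge of the if-then graph. */
int node_hits[4];                  /* indices 1..3 are used */
int edge_1_2, edge_1_3, edge_2_3;

void reset_coverage(void) {
    for (int i = 0; i < 4; i++)
        node_hits[i] = 0;
    edge_1_2 = edge_1_3 = edge_2_3 = 0;
}

/* The if-then fragment from the slide, instrumented by hand. */
int fragment(int x, int y) {
    node_hits[1]++;
    if (x < y) {
        edge_1_2++;
        node_hits[2]++;
        y = 0;
        x = x + 1;
        edge_2_3++;
    } else {
        edge_1_3++;                /* the edge that skips node 2 */
    }
    node_hits[3]++;
    return x + y;
}

int all_nodes_covered(void) {
    return node_hits[1] && node_hits[2] && node_hits[3];
}

int all_edges_covered(void) {
    return edge_1_2 && edge_1_3 && edge_2_3;
}
```

A single call such as fragment(1, 2) covers every node but leaves the 1-to-3 edge unexecuted; a second call with x >= y, such as fragment(2, 1), is needed for branch coverage.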
25
Path Coverage
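How many paths through this code are there? Need one test case for each to get path coverage

if (x < y)
{
  y = 0;
  x = x + 1;
}
else
{
  x = y;
}
if (x < y)
{
  y = 0;
  x = x + 1;
}

[Figure: control flow graph. Node 1 branches on x < y to node 2 (y = 0; x = x + 1) and on x >= y to node 3 (x = y); nodes 2 and 3 both lead to node 4. Node 4 branches on x < y to node 5 (y = 0; x = x + 1), which leads to node 6, and on x >= y directly to node 6.]

To get statement and branch coverage, we only need two test cases:
1 2 4 5 6 and 1 3 4 6

Path coverage needs two more:
1 2 4 5 6
1 3 4 6
1 2 4 6
1 3 4 5 6

In general: exponential in the number of conditional branches!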
26
Data Flow Coverage
Annotate program with locations where variables are defined and used (very basic static analysis)

x = 3;
y = 3;
if (w) {
  x = y + 2;
}
if (z) {
  y = x - 2;
}
n = x + y;

[Figure: control flow graph with def/use annotations]
Node  Statement    Annotations
1     x = 3        Def(x)
2     y = 3        Def(y)
3     branch on w  (w: go to 4, !w: go to 5)
4     x = y + 2    Def(x), Use(y)
5     branch on z  (z: go to 6, !z: go to 7)
6     y = x - 2    Def(y), Use(x)
7     n = x + y    Use(x), Use(y)

Def-use pair coverage requires executing all possible pairs of nodes where a variable is first defined and then used, without any intervening re-definitions

E.g., this path covers the pair where x is defined at 1 and used at 7: 1 2 3 5 6 7
But this path does NOT: 1 2 3 4 5 6 7

May be many pairs, some not actually executable
27
Logic Coverage
What if, instead of:

if (x < y)
{
  y = 0;
  x = x + 1;
}

we have:

if (((a > b) || G) && (x < y))
{
  y = 0;
  x = x + 1;
}

[Figure: control flow graph. Node 1 goes to node 2 (y = 0; x = x + 1) when ((a > b) || G) && (x < y) holds, and directly to node 3 when ((a <= b) && !G) || (x >= y) holds; node 2 leads to node 3.]

Now, branch coverage will guarantee that we cover all the edges, but does not guarantee we will do so for all the different logical reasons

We want to test the logic of the guard of the if statement
28
Active Clause Coverage
( (a > b) or G ) and (x < y)

Row  (a > b)  G  (x < y)  Predicate
1    T        F  T        T
2    F        F  T        F
3    F        T  T        T
4    F        F  T        F   (duplicate of row 2)
5    T        T  T        T
6    T        T  F        F

Rows 1-2: with these values for G and (x < y), (a > b) determines the value of the predicate
Rows 3-4: with these values for (a > b) and (x < y), G determines the value of the predicate
Rows 5-6: with these values for (a > b) and G, (x < y) determines the value of the predicate
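Whether a clause "determines" the predicate can be checked mechanically: flip only that clause and see whether the predicate's value changes. The sketch below does this for the predicate above, with each clause abstracted to a boolean; predicate, clause_is_active and count_active are hypothetical names for this illustration.

```c
#include <assert.h>

/* The predicate from the slide: ((a > b) or G) and (x < y),
   with each clause abstracted to a 0/1 value. */
int predicate(int a_gt_b, int g, int x_lt_y) {
    return (a_gt_b || g) && x_lt_y;
}

/* A clause is active for an assignment if flipping only that clause
   flips the predicate. Clause 0 is (a > b), 1 is G, 2 is (x < y). */
int clause_is_active(int clause, int a_gt_b, int g, int x_lt_y) {
    int v[3] = { a_gt_b, g, x_lt_y };
    assert(clause >= 0 && clause < 3);
    int before = predicate(v[0], v[1], v[2]);
    v[clause] = !v[clause];
    int after = predicate(v[0], v[1], v[2]);
    return before != after;
}

/* Counts the truth assignments (out of 8) in which the clause is active. */
int count_active(int clause) {
    int count = 0;
    for (int bits = 0; bits < 8; bits++)
        count += clause_is_active(clause, (bits >> 2) & 1, (bits >> 1) & 1, bits & 1);
    return count;
}
```

(a > b) and G are each active in only two of the eight assignments, namely when the other one is false and (x < y) is true. (x < y) is active in six.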
29
Input Domain Partitioning
 Partition scheme q of domain D
 The partition q defines a set of blocks, Bq = b1, b2, … bQ
 The partition must satisfy two properties:
1. blocks must be pairwise disjoint (no overlap): $b_i \cap b_j = \emptyset, \ \forall i \neq j, \ b_i, b_j \in B_q$
2. together the blocks cover the domain D (complete): $\bigcup_{b \in B_q} b = D$
[Figure: domain D divided into blocks b1, b2, b3]
Coverage then means using at least one input from each
of b1, b2, b3, . . .
30
Input Domain Partitioning
 Some subtleties here…
 What’s wrong with this partition of file contents?
• {
• b1: Sorted ascending file
• b2: Sorted descending file
• b3: Neither sorted ascending nor sorted descending
• }
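[Figure: domain D divided into blocks b1, b2, b3]
Recall: $b_i \cap b_j = \emptyset, \ \forall i \neq j, \ b_i, b_j \in B_q$ and $\bigcup_{b \in B_q} b = D$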
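One way to probe a proposed partition is to count, for a given input, how many blocks it falls into; a valid partition gives exactly 1 every time. The sketch below makes two assumptions that are not in the slides: the file contents are modelled as an int array, and "sorted" is read as non-strict ordering. The function names are hypothetical.

```c
#include <assert.h>

/* b1: every element is <= the next one. */
int is_ascending(const int a[], int n) {
    for (int i = 1; i < n; i++)
        if (a[i - 1] > a[i]) return 0;
    return 1;
}

/* b2: every element is >= the next one. */
int is_descending(const int a[], int n) {
    for (int i = 1; i < n; i++)
        if (a[i - 1] < a[i]) return 0;
    return 1;
}

/* Number of blocks (b1, b2, b3) that contain this input.
   A valid partition gives exactly 1 for every input. */
int blocks_containing(const int a[], int n) {
    assert(n >= 0);
    int b1 = is_ascending(a, n);
    int b2 = is_descending(a, n);
    int b3 = !b1 && !b2;
    return b1 + b2 + b3;
}
```

Under these assumptions a one-element input, or one whose elements are all equal, lands in both b1 and b2, so the blocks are not pairwise disjoint.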
31
Syntax-Based Coverage
 Based on mutation testing (a pet topic of Ammann and Offutt, who are heavily into this
research area)
 A somewhat different kind of creature than the other
coverages we’ve looked at
 Idea: generate many syntactic mutants of
the original program
 Coverage: how many mutants does a test
suite kill (detect)?
32
Mutating Our Buggy Program
int findLast (int a[], int n, int x) {
  // n is length of a
  int i;
  for (i = n-1; i > 0; i--) {
    if (a[i] == x)
      return i;
  }
  return -1;
}
33
Mutant #1
int findLast (int a[], int n, int x) {
  // n is length of a
  int i;
  for (i = n; i > 0; i--) {   // mutated: n-1 became n
    if (a[i] == x)
      return i;
  }
  return -1;
}
34
Mutant #2
int findLast (int a[], int n, int x) {
  // n is length of a
  int i;
  for (i = n-1; i > 0; i--) {
    if (a[i] == x)
      return i;
  }
  return 0;   // mutated: -1 became 0
}
35
Mutant #3
int findLast (int a[], int n, int x) {
  // n is length of a
  int i;
  for (i = n-1; i > 0; i--) {
    if (a[i] != x)   // mutated: == became !=
      return i;
  }
  return -1;
}
36
Mutant #4
int findLast (int a[], int n, int x) {
  // n is length of a
  int i;
  for (i = n-1; i > 0; i--) {
    if (a[i] == n)   // mutated: x became n
      return i;
  }
  return -1;
}
37
Mutant #5: Wait, this one’s the fix!
int findLast (int a[], int n, int x) {
  // n is length of a
  int i;
  for (i = n-1; i >= 0; i--) {   // mutated: > became >=
    if (a[i] == x)
      return i;
  }
  return -1;
}
38
Syntax-Based Coverage
[Figure: program P surrounded by its mutants]
100% coverage means you kill all the mutants with your test suite
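A mutation score can be computed by running each mutant against the suite and counting how many are killed. The sketch below assumes a mutant counts as killed when some test makes its output differ from that of the original program P; all function and type names are hypothetical. Mutant #1 is left out because it reads a[n], which is out of bounds and therefore undefined behavior in C.

```c
#include <assert.h>

typedef int (*FindLastFn)(const int a[], int n, int x);

/* Original program P (still containing the i > 0 fault). */
int original(const int a[], int n, int x) {
    for (int i = n - 1; i > 0; i--) if (a[i] == x) return i;
    return -1;
}
/* Mutant #2: returns 0 instead of -1. */
int mutant2(const int a[], int n, int x) {
    for (int i = n - 1; i > 0; i--) if (a[i] == x) return i;
    return 0;
}
/* Mutant #3: == became !=. */
int mutant3(const int a[], int n, int x) {
    for (int i = n - 1; i > 0; i--) if (a[i] != x) return i;
    return -1;
}
/* Mutant #4: compares against n instead of x. */
int mutant4(const int a[], int n, int x) {
    (void)x;
    for (int i = n - 1; i > 0; i--) if (a[i] == n) return i;
    return -1;
}
/* Mutant #5: i > 0 became i >= 0 (this one is the fix). */
int mutant5(const int a[], int n, int x) {
    for (int i = n - 1; i >= 0; i--) if (a[i] == x) return i;
    return -1;
}

typedef struct { const int *a; int n; int x; } TestCase;

/* A mutant is killed if some test makes its output differ from P's. */
int is_killed(FindLastFn mutant, const TestCase *suite, int num_tests) {
    assert(num_tests >= 0);
    for (int t = 0; t < num_tests; t++)
        if (mutant(suite[t].a, suite[t].n, suite[t].x) !=
            original(suite[t].a, suite[t].n, suite[t].x))
            return 1;
    return 0;
}

/* Number of the four mutants killed by the first num_tests tests. */
int count_killed(const TestCase *suite, int num_tests) {
    FindLastFn mutants[] = { mutant2, mutant3, mutant4, mutant5 };
    int killed = 0;
    for (int m = 0; m < 4; m++)
        killed += is_killed(mutants[m], suite, num_tests);
    return killed;
}
```

With the suite ({3, 9, 4}, x = 2), ({2, 9, 4}, x = 2), ({1, 3, 5}, x = 5), the first test alone kills two of the four mutants, the first two tests kill three, and all three tests kill all four.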
39
Generation vs. Recognition
 Generation of tests based on coverage
means producing a test suite to achieve a
certain level of coverage
• As you can imagine, generally very hard
• Consider: generating a suite for 100%
statement coverage easily reaches
“solving the halting problem” level
• Obviously hard for, say, mutant-killing
 Recognition means seeing what level of coverage an existing test suite reaches
40
Coverage and Subsumption
 Sometimes one coverage approach subsumes another
• If you achieve 100% coverage of criterion A, you are
guaranteed to satisfy B as well
• For example, consider node and edge coverage
• (there’s a subtlety here, actually – can you spot it?)
 What does this mean?
• Unfortunately, not a great deal
• If test suite X satisfies “stronger” criterion A and test suite
Y satisfies “weaker” criterion B
• Y may still reveal bugs that X does not!
• For example, consider our running example and statement
vs. branch coverage
• It means we should take coverage with a grain of salt,
for one thing
41
Testing “for” Coverage
 Never seek to improve coverage just for the
sake of increasing coverage
• Well, unless it’s a command from on high
 Coverage is not the goal
• Finding failures that expose faults is the goal
• No amount of coverage will prove that the
program cannot fail
“Program testing can be used to show the presence of bugs, but never to show their
absence!” – E. Dijkstra, Notes On Structured Programming
42
The Purpose of Testing
“Program testing can be used to show the presence of bugs, but never to show their
absence!” – E. Dijkstra, Notes On Structured Programming
 Dijkstra meant this as a criticism of testing and an argument in favor of more disciplined and total
approaches (proving programs correct)
 But he also points out what testing is good for:
exposing errors
 Coverage is valuable if and only if test sets with
higher coverage are more likely to expose failures
43
The Purpose of Testing
“Program testing can be used to show the presence of bugs”
 When we first start “testing,” we often want to “see that the program works”
• Try out some scenarios and watch the program
“do its stuff”
• Surprised (annoyed) when (if) the program fails
• This is not really testing: testing is not the
same as a demonstration
• Aim to break (your) code, if it can be broken
44
Levels of Testing
 Adapted from Beizer, by Ammann and Offutt
• Level 0: Testing is debugging
• Level 1: Testing is to show the program works
• Level 2: Testing is to show the program
doesn’t work
• Level 3: Testing is not to prove anything
specific, but to reduce risk of using program
• Level 4: Testing is a mental discipline that
helps develop higher quality software
45
What’s So Good About Coverage?
 Consider a fault that causes failure every time the code is executed
 Don’t execute the code: cannot possibly find the fault!
 That’s a pretty good argument for statement coverage

int findLast (int a[], int n, int x) {
  // Returns index of last element
  // in a equal to x, or -1 if no
  // such. n is length of a
  int i;
  for (i = n-1; i >= 0; i--) {
    if (a[i] == x)
      return i;
  }
  return 0;   // fault: should return -1
}
46
What’s So Good About Coverage?
 We should have an argument for any kind of coverage:
• “If I don’t cover this, then there is more chance I’ll miss a fault like that”
• Backed with empirical data, preferably!

int findLast (int a[], int n, int x) {
  // Returns index of last element
  // in a equal to x, or -1 if no
  // such. n is length of a
  int i;
  for (i = n-1; i >= 0; i--) {
    if (a[i] == x)
      return i;
  }
  return 0;   // fault: should return -1
}
47
Return to Our Example
Let’s write a tester for this version of the program (back to the first off-by-one bug)
Forget for a moment that we know what the bug is!

int findLast (int a[], int n, int x) {
  // n is length of a
  int i;
  for (i = n-1; i > 0; i--) {
    if (a[i] == x)
      return i;
  }
  return -1;
}
48
What kind of coverage might we want to think about when testing this code?

int findLast (int a[], int n, int x) {
  // n is length of a
  int i;
  for (i = n-1; i > 0; i--) {
    if (a[i] == x)
      return i;
  }
  return -1;
}
49
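#include <assert.h>
#include <stdio.h>

#define N 5 // 5 is "big enough"?

// Helpers assumed to be defined elsewhere
void random_assign (int a[], int n);
void print_array (int a[], int n);
int findLast (int a[], int n, int x);

int testFind () {
  int a[N];
  int p, i;
  for (p = 0; p < N; p++) {
    random_assign(a, N);
    a[p] = 3;
    // Make a[p] the last 3: change any later 3 to a 2
    for (i = p + 1; i < N; i++) {
      if (a[i] == 3)
        a[i] = a[i] - 1;
    }
    printf ("TEST: findLast({");
    print_array(a, N);
    printf ("}, %d, 3)\n", N);
    assert (findLast(a, N, 3) == p);
  }
  return 0;
}

What kind of coverage does this tester exploit?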
50
