Software Testing and Quality Assurance

Chapter 5: Structural (White Box) Testing

What is Structural Testing?
 White-box testing (also called clear box testing, glass box
testing, transparent box testing, or structural testing) is a
method of testing software that tests the internal structures or
workings of an application. The tester has access to the internal
data structures and algorithms, including the code that
implements them.
 Structural testing techniques include control flow testing and
data flow testing
 Using the White Box Approach to Test Case Design:
 White box testing methods are especially useful for revealing
design and code-based control, logic and sequence defects,
initialization defects, and data flow defects.

Figure 5.1: White-box testing


What is Structural Testing?
Types of White Box Testing
 Basis Path Testing
 Flow Graph Notation
 Cyclomatic Complexity
 Coverage Testing
 Statement Coverage
 Branch Coverage
 Condition Coverage
 Data Flow Testing

Control Flow Testing
 Control flow testing is a white box testing technique for
finding bugs.
 Control flow testing applies to almost all software and
is effective for most software.
 Control flow testing is more effective for unstructured
code than for structured code.
 It is used to determine the execution order of statements or
instructions of the program through a control structure.
 The control structure of a program is used to develop
test cases for the program.
 It is mostly used in unit testing. Test cases are represented
by the control flow graph of the program.

Basis Path Testing
 A path through a program is a node and edge sequence
from the starting node to a terminal node of the control
flow graph of a program.
 There can be more than one terminal node in a program.
 A basis set for a flow graph is the set of paths that cover
every statement in the program.
 Basis path testing allows the design and definition of a
basis set of execution paths.
 The test cases created from the basis set exercise the
program in such a way that each path in the basis set is
examined and each statement is executed at least
once.

Path coverage testing
 In path coverage, we write test cases to ensure that each
and every path has been traversed at least once.
 Writing test cases to cover all the paths of a typical
program is impractical.
 For this reason, path-coverage testing does not
require coverage of all paths but only coverage of linearly
independent paths.
 The path coverage-based testing strategy requires us to
design test cases such that all linearly independent paths
in the program are executed at least once.
 A linearly independent path can be defined in terms of
the control flow graph (CFG) of a program.
Linearly independent path
 A linearly independent path is any path through the
program that introduces at least one new edge that is
not included in any other linearly independent path.
 If a path has one new node compared to all other
linearly independent paths, then the path is also
linearly independent.
 This is because any path having a new node
automatically has a new edge.
 Thus, a path that is a sub-path of another path is not
considered to be a linearly independent path.
 Path-coverage testing does not require coverage of
all paths but only coverage of linearly independent
paths.
Continued…
• The following steps can be applied to derive the
basis set:
1. Using the design or code as a foundation, draw a
corresponding flow graph.
2. Determine the cyclomatic complexity of the
resultant flow graph.
3. Determine the basis set of linearly independent
paths.
4. Prepare test cases that will force execution of each
path in the basis set.

1. Control Flow Graph (CFG)
 A control flow graph describes the sequence in which the different
instructions of a program get executed.
 In order to draw the control flow graph of a program, all the
statements of a program must be numbered first.
 The different numbered statements serve as nodes of the control flow
graph
 An edge from one node to another node exists if the execution of the
statement representing the first node can result in the transfer of
control to the other node.
Flow Graph Notation
• The simple notation used for representing control flow is called a flow
graph. The flow graph represents the logical control flow and is used to
depict the program control structure.
 Each circle in a flow graph is called a flow graph node.
 The arrows are called edges or links (they represent flow of control).
 Areas bounded by nodes and edges are called regions.

Continued…
 The CFG for any program can be easily drawn by knowing
how to represent the sequence, selection, and iteration type of
statements in the CFG.
 After all, a program is made up from these types of
statements.
 Fig. 5.2 summarizes how the CFG for these three types of
statements can be drawn.
 It is important to note that for the iteration type of constructs
such as the while construct, the loop condition is tested only
at the beginning of the loop and therefore the control flow
from the last statement of the loop is always to the top of the
loop.
 Using these basic ideas, the CFG of Euclid’s GCD
computation algorithm can be drawn as shown in fig. 5.3.

Continued…
 Sequence:
    a = 5;
    b = a*2 - 1;

Fig. 5.2 (a): CFG for sequence constructs

 Selection:
    if (a > b)
        c = 3;
    else
        c = 5;
    c = c*c;
Fig. 5.2 (b): CFG for selection constructs

Continued…
 Iteration:
    while (a > b)
    {
        b = b - 1;
        b = b*a;
    }
    c = a + b;
Fig. 5.2 (c): CFG for iteration constructs
 EUCLID’S GCD Computation Algorithm

Continued…

Fig. 5.3: Control flow diagram

2. Cyclomatic Complexity
• Cyclomatic complexity is a software metric that gives a
quantitative measure of the logical complexity of a program.
• It defines the number of independent paths. An independent path is
any path through the program that introduces at least one new set of
processing statements or a new condition.
 For more complicated programs it is not easy to determine the
number of independent paths of the program.
 McCabe’s cyclomatic complexity defines an upper bound for the
number of linearly independent paths through a program.
 Also, McCabe’s cyclomatic complexity is very simple to
compute.
 Although McCabe’s metric does not directly identify the linearly
independent paths, it indicates approximately how many paths
to look for.
 There are three different ways to compute the cyclomatic
complexity. The answers computed by the three methods are
guaranteed to agree.
Continued…
Method 1:
 Given a control flow graph G of a program, the cyclomatic
complexity V(G) can be computed as:
 V(G) = E – N + 2
 where N is the number of nodes of the control flow graph and E is
the number of edges in the control flow graph.
 For the CFG shown in Fig. 5.3, E = 7 and N = 6. Therefore,
the cyclomatic complexity is 7 – 6 + 2 = 3.
Method 2:
 The number of regions of the flow graph corresponds to the
cyclomatic complexity.
Method 3:
 The cyclomatic complexity of a program can also be easily computed
by counting the number of decision statements of the program.
 If D is the number of decision statements of a program, then
McCabe’s metric is equal to D + 1.
Continued…
 To measure cyclomatic complexity:
 Number of regions, R = 4
 Number of nodes, N = 8
 Number of edges, E = 10
 Number of predicate nodes = 3
 Cyclomatic complexity:
 V(G) = R = 4, or
 V(G) = Predicate nodes + 1 = 3 + 1 = 4, or
 V(G) = E – N + 2 = 10 – 8 + 2 = 4
 A set of independent paths for the flow graph:
 Path 1: 1-2-4-7-8
 Path 2: 1-2-3-5-7-8
 Path 3: 1-2-3-6-7-8
 Path 4: 1-2-4-7-2-4…..-7-8
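
The edge-based and predicate-based formulas can be cross-checked on this example with a short C sketch. It is illustrative only: the edge list is an assumption inferred from the four paths listed above (the flow graph figure itself is not reproduced here), and the names cyclomatic_by_edges, cyclomatic_by_predicates and example_edges are hypothetical.

```c
#include <assert.h>

#define MAX_NODES 32

/* A directed edge of a control flow graph; nodes are numbered from 1. */
struct edge {
    int from;
    int to;
};

/* Method 1: V(G) = E - N + 2, for a graph with one entry and one exit. */
int cyclomatic_by_edges(int num_edges, int num_nodes)
{
    assert(num_edges >= 0 && num_nodes > 0);
    return num_edges - num_nodes + 2;
}

/* Method 3: V(G) = (number of predicate nodes) + 1.
   A predicate node is counted here as a node with exactly two outgoing
   edges, so this sketch only holds when every decision is two-way. */
int cyclomatic_by_predicates(const struct edge *edges, int num_edges)
{
    int out_degree[MAX_NODES + 1] = {0};
    int predicates = 0;

    for (int i = 0; i < num_edges; i++) {
        assert(edges[i].from >= 1 && edges[i].from <= MAX_NODES);
        out_degree[edges[i].from]++;
    }
    for (int n = 1; n <= MAX_NODES; n++) {
        if (out_degree[n] == 2)
            predicates++;
    }
    return predicates + 1;
}

/* Edge list inferred from Paths 1-4 (an assumption, not the figure). */
const struct edge example_edges[] = {
    {1, 2}, {2, 3}, {2, 4}, {3, 5}, {3, 6},
    {4, 7}, {5, 7}, {6, 7}, {7, 2}, {7, 8}
};
const int example_num_edges = 10;
const int example_num_nodes = 8;
```

Under these assumptions nodes 2, 3 and 7 are the predicate nodes, and both functions give 4 for the example, matching the values computed above.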

Statement Coverage
 The statement coverage strategy aims to design test cases
so that every statement in a program is executed at least
once.
 The principal idea governing the statement coverage
strategy is that unless a statement is executed, it is very
hard to determine if an error exists in that statement.
 Unless a statement is executed, it is very difficult to
observe whether it causes failure due to some illegal
memory access, wrong result computation, etc.
 However, executing some statement once and observing
that it behaves properly for that input value is no
guarantee that it will behave correctly for all input values.
 Statement coverage = (Number of statements executed / Total
number of statements in the source code) × 100
Statement Coverage
 Example: Consider Euclid’s GCD computation algorithm:
/* Computes the GCD by repeated subtraction; x and y must be positive. */
int compute_gcd(int x, int y)
{
    while (x != y)
    {
        if (x > y)
            x = x - y;
        else
            y = y - x;
    }
    return x;
}
 By choosing the test set {(x=3, y=3), (x=4, y=3), (x=3, y=4)}, we can
exercise the program such that all statements are executed at least once.
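
One way to check such a claim is to instrument the function and count the statements reached. The sketch below is a minimal illustration, not a coverage tool: the hit array, the statement numbering and the names compute_gcd_traced and statement_coverage are all assumptions made for this example.

```c
#include <assert.h>

#define NUM_STATEMENTS 5

/* hit[i] is set to 1 once statement i has executed. The numbering is an
   assumption made for this sketch:
   0: while test, 1: if test, 2: x = x - y, 3: y = y - x, 4: return. */
int hit[NUM_STATEMENTS];

/* Euclid's GCD by repeated subtraction, instrumented to record which
   statements run. Both inputs must be positive. */
int compute_gcd_traced(int x, int y)
{
    assert(x > 0 && y > 0);
    /* The comma operator marks the loop test as executed, then tests it. */
    while (hit[0] = 1, x != y) {
        hit[1] = 1;
        if (x > y) {
            hit[2] = 1;
            x = x - y;
        } else {
            hit[3] = 1;
            y = y - x;
        }
    }
    hit[4] = 1;
    return x;
}

/* Statement coverage = executed statements / total statements * 100. */
double statement_coverage(void)
{
    int executed = 0;

    for (int i = 0; i < NUM_STATEMENTS; i++)
        executed += hit[i];
    return 100.0 * executed / NUM_STATEMENTS;
}
```

With this numbering, running only (x=3, y=3) reaches the loop test and the return, so 2 of the 5 statements are covered; running the remaining test cases of the set brings the coverage to 100%.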

Example Scenario 1:
If a = 5, b = 4
print (int a, int b) {
    int sum = a+b;
    if (sum>0)
        print ("This is a positive result")
    else
        print ("This is a negative result")
}
Here sum > 0, so the else line and the negative print are skipped and
5 of the 7 lines are executed:
SC = 5/7*100 = 500/7 ≈ 71.4%
When the sum is not positive, only the positive print is skipped and
6 of the 7 lines are executed:
SC = 6/7*100 = 600/7 ≈ 85.7%
Branch Coverage
 In the branch coverage-based testing strategy, test cases
are designed to make each branch condition assume
true and false values in turn.
 Branch testing is also known as edge testing as in this
testing scheme, each edge of a program’s control flow
graph is traversed at least once.
 It is obvious that branch testing guarantees statement
coverage and thus is a stronger testing strategy
compared to the statement coverage-based testing.
 For Euclid’s GCD computation algorithm, the test cases
for branch coverage can be {(x=3, y=3), (x=3, y=2), (x=4,
y=3), (x=3, y=4)}.
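
The branch outcomes reached by a test set can be recorded in the same spirit. The following C sketch is an assumption-laden illustration: the flag arrays and the names compute_gcd_branches and branch_outcomes_covered are hypothetical, and the loop is written with an explicit break so that the outcome of each condition can be stored before it is acted on.

```c
#include <assert.h>

/* Outcome flags for the two branch conditions of compute_gcd.
   Index 0 records a false outcome, index 1 a true outcome. */
int while_outcome[2];   /* condition x != y */
int if_outcome[2];      /* condition x > y  */

/* Euclid's GCD by repeated subtraction, recording every branch outcome.
   Both inputs must be positive. */
int compute_gcd_branches(int x, int y)
{
    assert(x > 0 && y > 0);
    for (;;) {
        int differ = (x != y);
        while_outcome[differ] = 1;
        if (!differ)
            break;

        int greater = (x > y);
        if_outcome[greater] = 1;
        if (greater)
            x = x - y;
        else
            y = y - x;
    }
    return x;
}

/* Number of distinct branch outcomes observed so far, out of 4. */
int branch_outcomes_covered(void)
{
    return while_outcome[0] + while_outcome[1]
         + if_outcome[0] + if_outcome[1];
}
```

In this sketch the test case (x=3, y=3) alone records only the false outcome of the loop condition; once (x=3, y=2) is also run, all four outcomes have been seen.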

Branch Coverage
 Test cases are designed such that:
 each branch condition is given true and false
values in turn.
 Branch testing guarantees statement coverage:
 it is a stronger testing strategy than statement
coverage-based testing.

Condition Coverage
 In this structural testing, test cases are designed to make
each component of a composite conditional expression
assume both true and false values.
 For example, in the conditional expression
((c1.and.c2).or.c3), the components c1, c2 and c3 are each
made to assume both true and false values.
 Branch testing is probably the simplest condition testing
strategy, where only the compound conditions appearing
in the different branch statements are made to assume
the true and false values.
 Thus, condition testing is a stronger testing strategy than
branch testing, and branch testing is a stronger testing
strategy than statement coverage-based testing.

Continued…
 For a composite conditional expression of n
components, exercising every combination of truth
values of the components requires 2ⁿ test cases.
 Thus, for condition coverage of this kind, the number
of test cases increases exponentially with the number of
component conditions.
 Therefore, a condition coverage-based testing
technique is practical only if n (the number of
conditions) is small.
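
The growth can be made concrete by enumerating the truth assignments for the three-component expression used earlier. This is a small C sketch under stated assumptions: the function names are hypothetical, and the expression is written with the C operators && and || in place of .and. and .or.

```c
#include <assert.h>

/* The composite condition from the text: ((c1 and c2) or c3). */
int composite(int c1, int c2, int c3)
{
    return (c1 && c2) || c3;
}

/* Enumerates all 2^n truth assignments of the n = 3 components by counting
   from 0 to 2^n - 1 and reading each bit as one component's value.
   Returns how many assignments make the composite condition true. */
int count_true_assignments(void)
{
    const int n = 3;
    int true_count = 0;

    for (unsigned bits = 0; bits < (1u << n); bits++) {
        int c1 = (bits >> 0) & 1;
        int c2 = (bits >> 1) & 1;
        int c3 = (bits >> 2) & 1;

        if (composite(c1, c2, c3))
            true_count++;
    }
    return true_count;
}

/* Number of test cases needed to try every combination of n components. */
unsigned long combinations(unsigned n)
{
    assert(n < 32);
    return 1ul << n;
}
```

For n = 3 there are 8 combinations, of which 5 make the expression true; for n = 10 there are already 1024 combinations.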

Data Flow-Based Testing
 Basic idea: test the connections between variable
definitions (“write”) and variable uses (“read”)
 Data flow-based testing method selects test paths of a
program according to the locations of the definitions
and uses of different variables in a program.
 For a statement numbered S, let
 DEF(S) = {X | statement S contains a definition of X},
and
 USES(S) = {X | statement S contains a use of X}
 For the statement S: a = b + c;, DEF(S) = {a} and USES(S) =
{b, c}.

Continued…
 The definition of variable X at statement S is said to be
live at statement S1, if there exists a path from
statement S to statement S1 which does not contain any
definition of X.
 The definition-use chain (or DU chain) of a variable X is
of the form [X, S, S1], where S and S1 are statement
numbers, such that X ∈ DEF(S) and X ∈ USES(S1), and
the definition of X in the statement S is live at statement
S1.
 One simple data flow testing strategy is to require that
every DU chain be covered at least once.
 Data flow testing strategies are useful for selecting test
paths of a program containing nested if and loop
statements.
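
For a straight-line program (no branches or loops), DU chains can be listed with a simple scan, because a definition stays live until the same variable is defined again. The C sketch below rests on that simplifying assumption; the types stmt and du_chain and the function find_du_chains are hypothetical names, and programs with branches or loops would need a reaching-definitions analysis over the CFG instead.

```c
#include <assert.h>

#define MAX_USES 4

/* One statement of a straight-line program. Variables are single
   characters; 0 means "no variable". */
struct stmt {
    char def;              /* variable defined (written), or 0 */
    char uses[MAX_USES];   /* variables used (read), padded with 0 */
};

/* A DU chain [var, def_stmt, use_stmt]; statements are numbered from 1. */
struct du_chain {
    char var;
    int def_stmt;
    int use_stmt;
};

/* Returns 1 if statement s uses variable x, else 0. */
static int uses_var(const struct stmt *s, char x)
{
    for (int i = 0; i < MAX_USES; i++) {
        if (s->uses[i] == x)
            return 1;
    }
    return 0;
}

/* Writes up to max_out DU chains of prog[0..n-1] into out and returns
   the number written. */
int find_du_chains(const struct stmt *prog, int n,
                   struct du_chain *out, int max_out)
{
    int count = 0;

    assert(prog != 0 && out != 0 && n >= 0);
    for (int s = 0; s < n; s++) {
        char x = prog[s].def;
        if (x == 0)
            continue;
        for (int s1 = s + 1; s1 < n; s1++) {
            /* A statement reads its operands before it writes its result,
               so the use is checked before the redefinition. */
            if (uses_var(&prog[s1], x) && count < max_out) {
                out[count].var = x;
                out[count].def_stmt = s + 1;
                out[count].use_stmt = s1 + 1;
                count++;
            }
            if (prog[s1].def == x)
                break;   /* x is redefined: the earlier definition dies */
        }
    }
    return count;
}
```

For the four statements a = b + c; d = a * 2; a = d - 1; e = a + d; this sketch reports the chains [a, 1, 2], [d, 2, 3], [d, 2, 4] and [a, 3, 4]. The definition of a at statement 1 is not live at statement 4 because statement 3 redefines a.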
Example

Reaching Definitions

Def-use Pairs
 A def-use pair (DU) for variable x is a pair of nodes
(n1,n2) such that
 x is in DEF(n1)
 The definition of x at n1 reaches n2
 x is in USE(n2)
 In other words, the value that is assigned to x at n1 is
used at n2
 Since the definition reaches n2, the value is not killed
along some path n1...n2.

Examples of Def-Use Pairs

Continued…
 Identify all DU pairs and construct test cases that
cover these pairs.
 Several variations exist, with different “relative strength”:
 All-DU-paths: for each DU pair (n1,n2) for x, exercise
all possible paths n1..n2 that are clear of definitions
of x
 All-uses: for each DU pair (n1,n2) for x, exercise at
least one path n1..n2 that is clear of definitions of x

Continued…
 All-definitions: for each definition, cover at least one
DU pair for that definition
 i.e., if x is defined at n1, execute at least one path n1..n2
such that x is in USE(n2) and the path is clear of
definitions of x
 Clearly, all-definitions is subsumed by all-uses which is
subsumed by all-DU-paths
 Motivation: see the effects of using the values produced
by computations
 Focuses on the data, while control-flow-based testing
focuses on the control

Data-flow anomalies
 Data-flow anomalies are patterns of data usage which may lead to
an incorrect execution of the code.
Examples of data flow anomalies
a) It is an abnormal situation to successively assign two
values to a variable without using the first value ==>
Defined and then defined again
b) It is abnormal to use a value of a variable before assigning
a value to the variable ==> Undefined but referenced
c) Another abnormal situation is to generate a data value
and never use it ==> Defined but not referenced
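
These three patterns can be recognised by scanning the sequence of actions performed on one variable along a single path. The C sketch below is only an illustration under assumptions: the event encoding ('d' for a definition, 'r' for a reference) and the names find_anomalies, DEFINED_TWICE, UNDEFINED_REFERENCED and DEFINED_NOT_USED are invented for this example.

```c
#include <assert.h>

/* Bit flags for the three anomaly types listed above. */
enum {
    DEFINED_TWICE        = 1,  /* (a) defined and then defined again */
    UNDEFINED_REFERENCED = 2,  /* (b) undefined but referenced */
    DEFINED_NOT_USED     = 4   /* (c) defined but not referenced */
};

/* Scans the actions on one variable along one path. events is a string
   of 'd' (definition) and 'r' (reference); other characters are ignored.
   Returns the OR of the anomaly flags found. */
int find_anomalies(const char *events)
{
    enum { ST_UNDEFINED, ST_DEFINED_UNUSED, ST_DEFINED_USED } state;
    int found = 0;

    assert(events != 0);
    state = ST_UNDEFINED;
    for (; *events != '\0'; events++) {
        if (*events == 'd') {
            if (state == ST_DEFINED_UNUSED)
                found |= DEFINED_TWICE;
            state = ST_DEFINED_UNUSED;
        } else if (*events == 'r') {
            if (state == ST_UNDEFINED)
                found |= UNDEFINED_REFERENCED;
            else
                state = ST_DEFINED_USED;
        }
    }
    /* A value that is still unused at the end of the path was never read. */
    if (state == ST_DEFINED_UNUSED)
        found |= DEFINED_NOT_USED;
    return found;
}
```

With this encoding, the sequence "dr" is clean, "ddr" is flagged as defined twice, and "rd" is flagged both as undefined but referenced and as defined but not referenced.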

Continued…
 After detecting a data flow anomaly, the programmers must
analyse its cause and eliminate
it.
 Investigate the cause of the anomaly.
 To fix an anomaly, write new code or modify the
existing code.
 The presence of a data flow anomaly in a program does
not necessarily mean that execution of the program will
result in a failure.
 A data flow anomaly simply means that the program
may fail, and therefore the programmer must investigate
the cause of the anomaly.

Mutation Testing
 In mutation testing, the software is first tested by using an
initial test suite built up from the different white box testing
strategies.
 In mutation testing, small changes are made in a module
and then the original and mutant modules are compared.
 After the initial testing is complete, mutation testing is taken
up.
 The idea behind mutation testing is to make a few arbitrary
changes to a program at a time.
 Each time the program is changed, it is called a mutated
program, and the change effected is called a mutant.
 A mutated program is tested against the full test suite of the
program.
Continued…
 If there exists at least one test case in the test suite for
which a mutant gives an incorrect result, then the
mutant is said to be dead.
 If a mutant remains alive even after all the test cases
have been exhausted, the test data is enhanced to kill
the mutant.
 The process of generation and killing of mutants can be
automated by predefining a set of primitive changes
that can be applied to the program.
 These primitive changes can be alterations such as
changing an arithmetic operator, changing the value of
a constant, changing a data type, etc.
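
A very small C sketch of this cycle follows. It is an illustration under assumptions, not a mutation tool: the function area, its mutant (the arithmetic operator * changed to +), the test_case type and the helper is_killed are all hypothetical, and the original function's result is treated as the correct one.

```c
#include <assert.h>

/* One test input for a two-argument integer function. */
struct test_case {
    int w;
    int h;
};

/* Original program under test. */
int area(int w, int h)
{
    return w * h;
}

/* Mutant: the arithmetic operator * has been changed to +. */
int area_mutant(int w, int h)
{
    return w + h;
}

/* Returns 1 if at least one test case makes the mutant's result differ
   from the original's (the mutant is dead), and 0 if the mutant survives
   the whole suite. */
int is_killed(int (*original)(int, int), int (*mutant)(int, int),
              const struct test_case *suite, int n)
{
    assert(suite != 0 && n >= 0);
    for (int i = 0; i < n; i++) {
        if (original(suite[i].w, suite[i].h) != mutant(suite[i].w, suite[i].h))
            return 1;
    }
    return 0;
}
```

A suite containing only the input (2, 2) leaves this mutant alive, since 2 * 2 and 2 + 2 are both 4. Enhancing the test data with (2, 3) kills it, because the original returns 6 and the mutant returns 5.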
When should we use mutation testing?
 Structural test suites are directed at identifying defects in
the code. One goal of mutation testing is to assess or
improve the efficiency of test suites in discovering
defects.
 When we are carrying out structural testing we are
worried about defects remaining in the code. Often we
are keen to measure the Residual Defect Density (RDD)
in the program P under test.
 The Residual Defect Density is usually measured in
defects per thousand lines of code.
 Advocates of mutation testing argue that it can provide
us with an estimate of the RDD of a program P that has
satisfied all the tests in a test suite T.
Thank you!
Questions?
