
Unit 8.

Control Flow and Data Flow Testing


1. Introduction
2. Control Flow Graph
3. Test Coverage Analysis
4. Data Flow Testing

Reading: TB-Chapters 4 (4.1-4.9), 5 (5.1-5.10)

1. Introduction
-Control Flow Testing (CFT) and Data Flow Testing (DFT) are structural (i.e., white-box) techniques suitable for small units of source code, e.g., a function.
-CFT and DFT help with interactions along the execution path and interactions among data items in execution, respectively.
-CFT and DFT use control flow graphs (CFG) as test models; a CFG gives an abstract representation of the code.
-The CFG is also the main model used for code coverage analysis.

2. Control Flow Graph


Code Segments
-A code segment consists of one or several contiguous statements with no conditionally executed statements.
  Once a segment is entered, all of its statements execute.
  The last statement in a segment must be a predicate, a method exit, a loop control, a break, or a goto.
  The last part of a segment includes the predicate or exit expression that selects another segment, but does not include any of the subsequent segment's code.

-A predicate expression contains one or more conditions that evaluate to true or false. One condition corresponds to each boolean operator in the predicate expression.
  Predicates are used in control statements: if, case, do, while, do until, for, and so on.
  The evaluation of a predicate transfers control to one of several code segments.
  A predicate with multiple conditions is called a compound predicate.

Examples: Code segments in canonical loop structures

For Loop

    buffer = new char[nchars + 1];       // A
    for (n = 0; n < nchars; ++n) {       // B, D
        buffer[n] = newChar;             // C
    }

While Loop

    buffer = new char[nchars + 1];
    int n = 0;
    while (n < nchars) {                 // B
        buffer[n] = newChar;             // C
        ++n;
    }

Until Loop

    buffer = new char[nchars + 1];
    int n = 0;
    do {
        buffer[n] = newChar;
        ++n;
    } while (n < nchars);

Representation
-A control flow graph (CFG) describes code segments and their sequencing in a program. It is a directed graph in which:
  A node corresponds to a code segment; nodes are labeled using letters or numbers.
  An edge corresponds to a conditional transfer of control between code segments; edges are represented as arrows.
  Entry node: the entry point of a method, represented by a node with no inbound edges.
  Exit node: the exit point of a method, represented by a node with no outbound edges.
  Decision node: more than one outgoing edge.
  Junction node: more than one incoming edge.
  Processing node: one incoming and one outgoing edge.
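The node categories above can be checked mechanically from edge counts. The following is a minimal sketch, not a definitive implementation: the `Cfg` class, its method names, and the four-node while-loop graph (A = initialization, B = loop test, C = loop body, X = exit) are illustrative assumptions. Note that a loop test can be both a decision and a junction node at the same time, which the sketch reports explicitly.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal CFG sketch: nodes are labels, edges are kept as adjacency lists.
class Cfg {
    private final Map<String, List<String>> successors = new LinkedHashMap<>();
    private final Map<String, Integer> inDegree = new LinkedHashMap<>();

    void addEdge(String from, String to) {
        successors.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        successors.computeIfAbsent(to, k -> new ArrayList<>());
        inDegree.merge(to, 1, Integer::sum);
        inDegree.putIfAbsent(from, 0);
    }

    // Classifies a node by its number of incoming and outgoing edges.
    String classify(String node) {
        int in = inDegree.getOrDefault(node, 0);
        int out = successors.getOrDefault(node, List.of()).size();
        if (in == 0) return "entry";
        if (out == 0) return "exit";
        if (out > 1 && in > 1) return "decision+junction";
        if (out > 1) return "decision";
        if (in > 1) return "junction";
        return "processing";
    }

    // Hypothetical graph for a canonical while loop (labels are assumptions).
    static Cfg whileLoop() {
        Cfg g = new Cfg();
        g.addEdge("A", "B"); // initialization -> loop test
        g.addEdge("B", "C"); // test true -> loop body
        g.addEdge("C", "B"); // loop body -> loop test
        g.addEdge("B", "X"); // test false -> exit
        return g;
    }

    public static void main(String[] args) {
        Cfg g = whileLoop();
        for (String n : List.of("A", "B", "C", "X")) {
            System.out.println(n + ": " + g.classify(n));
        }
    }
}
```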

Flow graphs for canonical loop structures

For Loop

[Figure: flow graph of the for loop, with nodes A, B, C, and D]

Example: CFG for a method

    public class Authenticator {
        final int MAX = 10000;
        String[] userids = new String[MAX];
        String[] passwords = new String[MAX];

        public boolean verify(String uid, String pwd) {               // Segments
            boolean result = false;                                   // A
            int i = 0;                                                // A
            while ((result == false) && (i < MAX)) {                  // B, C
                if ((userids[i] == uid) && (passwords[i] == pwd))     // D, E
                    result = true;                                    // F
                ++i;                                                  // G
            }
            return result;                                            // H
        }
        // Other methods
    } // Class end

[Figure: flow graph of verify() with nodes A through H]

Identify each node as: decision, junction, processing, entry, or exit.

Is there any harm in merging two neighbor nodes in a CFG?

Path Expressions

-A path corresponds to a sequence of segments connected by arrows.
  A path is denoted by the nodes that comprise the path; e.g., path ABH.
  Loops are represented by segments within parentheses, followed by an asterisk to show that this group may iterate from zero to n times, e.g., path A(BCDEFG)*BH.

-An entry-exit path is a path starting with the entry node and ending with the exit node.

Examples of entry-exit paths:
  -ABH
  -ABCH
  -A(BCDEFG)*BH
  -A(BCDEG)*BH
  -A(BCDG)*BH
  -A(BCDG)*BCH
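A concrete node sequence can be checked against the graph edge by edge. The sketch below assumes an edge set inferred from the entry-exit paths listed above for verify() (the original figure is not reproduced here, so the edge list is an assumption); the class and method names are hypothetical.

```java
import java.util.Set;

// Sketch: validates that a node sequence such as "ABCDEFGBH" is an entry-exit path.
class PathCheck {
    // Edges inferred from the listed entry-exit paths; an assumption, not taken from the figure.
    static final Set<String> EDGES = Set.of(
            "AB", "BC", "BH", "CD", "CH", "DE", "DG", "EF", "EG", "FG", "GB");

    static boolean isEntryExitPath(String path) {
        // Must start at the entry node A and end at the exit node H.
        if (path.length() < 2 || path.charAt(0) != 'A' || path.charAt(path.length() - 1) != 'H') {
            return false;
        }
        // Every consecutive pair of nodes must be connected by an edge.
        for (int i = 0; i + 1 < path.length(); i++) {
            if (!EDGES.contains(path.substring(i, i + 2))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isEntryExitPath("ABCDEFGBH")); // one iteration of A(BCDEFG)*BH
        System.out.println(isEntryExitPath("ABDH"));      // B has no edge to D
    }
}
```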

Compound Predicates

-Modeled separately by specifying a node for each individual predicate involved.
-All true-false combinations in a compound predicate should be analyzed, and the effects of short-circuit boolean evaluation should be made explicit.
  For instance, C++, Java, and Objective-C use C semantics: short-circuit boolean evaluation is automatically applied to all multiple-condition boolean expressions.

Example:

    while ((result == false) && (i < MAX)) {
        if ((userids[i] == uid) && (passwords[i] == pwd))
            result = true;
        ++i;
    }

[Figure: flow graph with a separate node for each condition of the compound predicates]
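Short-circuit evaluation can be observed directly by counting how often the second condition is evaluated. This is a minimal sketch; the class, the counter, and the helper `second` are hypothetical names introduced only for illustration.

```java
// Sketch: with &&, the right-hand condition is skipped when the left-hand one is false.
class ShortCircuit {
    static int evaluations = 0; // how many times the second condition was evaluated

    static boolean second(boolean value) {
        evaluations++;
        return value;
    }

    static boolean both(boolean first, boolean secondValue) {
        return first && second(secondValue);
    }

    public static void main(String[] args) {
        both(false, true);
        System.out.println("after (false && ...): " + evaluations);
        both(true, true);
        System.out.println("after (true && ...): " + evaluations);
    }
}
```

This is why each condition gets its own node in the CFG: the edge out of the first condition can bypass the second one entirely.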

Case and Multiple-if Statements

-Modeled by specifying a separate node for each predicate, each conditional action, and the default action.

    if cond1
        P
    else if cond2
        Q
    else if cond3
        R
    else
        Default

[Figure: flow graph with predicate nodes Cond1, Cond2, Cond3 and action nodes P, Q, R, Default, all leading to "Next segment"]

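Switch Statements

-Modeled by specifying a node for the switch expression, and a separate node for each action.

    switch (e) {
        case 1: P; break;
        case 2: Q; break;
        case 3: R; break;
    }

[Figure: flow graph with a "Computed goto" node for the switch expression and action nodes P, Q, R, each leading through its break to "Next segment"]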

3. Test Coverage Analysis

Test Coverage Overview

-Test coverage attempts to address questions about when to stop testing, or how much testing is enough for a given program.
-Ideal testing would explore the entire test domain exhaustively, which in general is impossible.
  Some code may never be executed because test cases are missing.
  The effectiveness of a test suite can be established only by knowing what code is, or isn't, executed.

-A code coverage model calls out the parts of an implementation that must be exercised to satisfy an implementation-based test model.
  Coverage, as a metric, is the percentage of these parts exercised by a test suite.

Test Coverage Overview (ctd.)


-Hundreds of coverage models have been published and used since the late 1960s.
Most coverage models rely on control flow graphs (CFG) to provide an abstract representation of the code.

-It is recommended not to use a code coverage model as a test model.
  Instead, established test strategies (e.g., equivalence, domain) should be used to devise test suites, while coverage is used at the same time to analyze the adequacy of the generated test suites.
  Coverage reports can point out a grossly inadequate test suite.

-Common coverage criteria include:
  Statement coverage
  Branch coverage
  Condition/Multiple-condition coverage
  Basis-path coverage
  Data flow coverage

Common Code Coverage Criteria


Statement coverage

-Achieved when all statements in a method have been executed at least once.
  Also known as line coverage, segment coverage, or basic block coverage.
  Segment and basic block coverage count segments instead of individual statements.
  Segment coverage ensures that all code segments defined in the CFG are covered.

-Example:

    int foo(int x) {
        if (a == b) {
            ++x;
        } else {
            --x;
        }
        return x;
    }

  Statement coverage is achieved by having test cases involving both:
  1. a == b
  2. a != b

Statement coverage (ctd.)


If a bug exists in a statement, and that statement is not executed, there is almost no chance of revealing that bug; hence statement coverage is the minimum coverage required by IEEE software engineering standards.

However, it is a very weak criterion and should be viewed as the barest minimum.

Example:

    void foo() {
        char* c = NULL;
        if (x == y) {
            c = &aString;
        }
        *c = "test code";
    }

Statement coverage is achieved for foo() by setting x equal to y; however, when the condition is false, c is still NULL, so the final assignment dereferences a null pointer and the code crashes.
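The same weakness can be reproduced in Java. The sketch below is an analogue of the C example above, not a translation of it: the class name, the method `length`, and its parameters are hypothetical. A single test with x == y executes every statement, yet the failure only appears on the untested false outcome of the condition.

```java
// Sketch: 100% statement coverage with one test, but the untested branch outcome crashes.
class StatementCoverageGap {
    static int length(int x, int y) {
        String c = null;
        if (x == y) {
            c = "test code";
        }
        return c.length(); // throws NullPointerException when x != y
    }

    public static void main(String[] args) {
        System.out.println(length(3, 3)); // executes every statement
        try {
            length(3, 4);                 // same statements, different branch outcome
        } catch (NullPointerException e) {
            System.out.println("crash on the false outcome of the condition");
        }
    }
}
```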


Branch Coverage

-Achieved when every branch from a node is executed at least once by a test suite.
  Also known as decision coverage or all-edge coverage.
  Improves on statement coverage by requiring that each branch be taken at least once. Hence each outcome of a predicate expression (i.e., true and false) is exercised at least once.
  However, it is limited by the fact that it treats a compound predicate as a single statement: branch coverage may be achieved without exercising all of the conditions.

Branch Coverage (ctd.)

-Example:

    int foo(int x) {
        if (a == b || (x == y && isEmpty())) {
            ++x;
        } else {
            --x;
        }
        return x;
    }

  Branch coverage may be achieved using:
  1. a == b
  2. a != b and x != y
  without ever exercising this.isEmpty().

Branch coverage misses several of the possible entry-exit paths.
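The claim can be checked with light instrumentation. The following sketch wraps the example in a hypothetical class whose fields `a`, `b`, `y`, the call counter, and the `branchesTaken` set are assumptions added for illustration; the two test inputs are the ones listed above.

```java
import java.util.Set;
import java.util.TreeSet;

// Sketch: both outcomes of the if are taken while isEmpty() is never called.
class BranchCoverageDemo {
    int a, b, y;
    int isEmptyCalls = 0;                                   // instrumentation only
    final Set<String> branchesTaken = new TreeSet<>();      // instrumentation only

    boolean isEmpty() {
        isEmptyCalls++;
        return true;
    }

    int foo(int x) {
        if (a == b || (x == y && isEmpty())) {
            branchesTaken.add("true");
            ++x;
        } else {
            branchesTaken.add("false");
            --x;
        }
        return x;
    }

    public static void main(String[] args) {
        BranchCoverageDemo d = new BranchCoverageDemo();
        d.a = 1; d.b = 1; d.y = 0;
        d.foo(5);            // test 1: a == b
        d.b = 2;
        d.foo(5);            // test 2: a != b and x != y
        System.out.println(d.branchesTaken + ", isEmpty() calls: " + d.isEmptyCalls);
    }
}
```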


Condition coverage/Multiple-condition coverage

-Condition coverage requires that each condition be evaluated as true and false at least once.
-Multiple-condition coverage requires that all true-false combinations of simple conditions be exercised at least once.
  There are at most $2^n$ true-false combinations for a compound predicate with n simple conditions.
  Multiple-condition coverage ensures that all statements, branches, and conditions are covered, but not all paths.
  Reaching all condition combinations may be impossible due to short-circuit evaluation.
  Mutually exclusive conditions spanning several predicate expressions may preclude certain combinations.
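To make the effect of short-circuit evaluation concrete, the sketch below enumerates all $2^3 = 8$ input combinations for a predicate of the shape `a || (b && c)` and records which conditions are actually evaluated. The class, the trace encoding (`T`, `F`, and `-` for "not evaluated"), and the method names are illustrative assumptions.

```java
import java.util.Set;
import java.util.TreeSet;

// Sketch: under short-circuit evaluation, many of the 2^n combinations are indistinguishable.
class MultipleCondition {
    // Evaluation trace of a || (b && c); '-' marks a condition that is never evaluated.
    static String trace(boolean a, boolean b, boolean c) {
        StringBuilder t = new StringBuilder();
        t.append(a ? 'T' : 'F');
        if (a) return t.append("--").toString();      // b and c are skipped
        t.append(b ? 'T' : 'F');
        if (!b) return t.append('-').toString();      // c is skipped
        return t.append(c ? 'T' : 'F').toString();
    }

    static Set<String> distinctTraces() {
        Set<String> traces = new TreeSet<>();
        for (int bits = 0; bits < 8; bits++) {
            traces.add(trace((bits & 4) != 0, (bits & 2) != 0, (bits & 1) != 0));
        }
        return traces;
    }

    public static void main(String[] args) {
        System.out.println(distinctTraces());
    }
}
```

Of the eight input combinations, only four distinct evaluation sequences can be observed for this predicate shape.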

Basis-Path Model

Cyclomatic Complexity Metric

-Defined as the number of edges minus the number of nodes plus 2:

  $C = e - n + 2$

  where e and n stand for the number of edges and the number of nodes in the corresponding CFG, respectively.

-Also called the McCabe complexity metric.
  Evaluates the complexity of the algorithms involved in a method.
  Gives a count of the minimum number of test cases needed to test a method comprehensively.
  A low C value means reduced testing and better understandability.

-Example:

  $\mathit{CC} = e - n + 2 = 8 - 7 + 2 = 3$
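The formula is easy to compute from an edge list. The sketch below is an illustration under assumptions: edges are written as two-letter strings, the formula is applied to a connected graph with a single entry and a single exit, and the edge list for verify() is inferred from the entry-exit paths listed earlier in this unit rather than read from the original figure.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: C = e - n + 2 computed from a list of edges such as "AB" (A -> B).
class Cyclomatic {
    static int complexity(String... edges) {
        Set<Character> nodes = new HashSet<>();
        for (String edge : edges) {
            nodes.add(edge.charAt(0));
            nodes.add(edge.charAt(1));
        }
        return edges.length - nodes.size() + 2;
    }

    public static void main(String[] args) {
        // Simple if-else: 4 edges, 4 nodes.
        System.out.println(complexity("AB", "AC", "BD", "CD"));
        // verify() with one node per condition (assumed edges): 11 edges, 8 nodes.
        System.out.println(complexity(
                "AB", "BC", "BH", "CD", "CH", "DE", "DG", "EF", "EG", "FG", "GB"));
    }
}
```

For graphs built from binary decisions, the result equals the number of decision nodes plus one, which gives a quick sanity check on a hand-counted edge total.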

The Basis-Path Test Model

-Calls for testing C distinct entry-exit paths.
  A test suite is developed by finding C distinct entry-exit paths and producing test cases by path sensitization.
  Traditionally, it is suggested that the longest entry-exit path be included in the test suite.
  Basis-path coverage is achieved when C distinct entry-exit paths have been exercised.

-The basis-path model is appealing because it is simple and pragmatic.
-However, it is unreliable as a coverage metric, because it can be satisfied without meeting the barest minimum generally accepted standards of code coverage:
  On one hand, branch coverage may be achieved with fewer than C paths in some methods.
  On the other hand, it is possible to select C entry-exit paths and achieve neither statement coverage nor branch coverage.

The Basis-Path Test Model (ctd.)


Example 1:
  $C = 24 - 19 + 2 = 7$
  Branch coverage can be achieved in four paths.

Example 2:
  $C = 8 - 7 + 2 = 3$
  Branch coverage can be achieved in two paths.
  Depending on how the paths are selected, neither branch coverage nor statement coverage may be achieved using C paths.

Example: Binary Search Routine

Code

    class BinSearch {
        public static void search(int key, int[] elemArray, Result r) {
            int bottom = 0;                              // A
            int top = elemArray.length - 1;
            int mid;
            r.found = false;
            r.index = -1;
            while (bottom <= top) {                      // B
                mid = (top + bottom) / 2;                // C
                if (elemArray[mid] == key) {             // D
                    r.index = mid;                       // E
                    r.found = true;
                    return;
                } else {
                    if (elemArray[mid] < key)            // F
                        bottom = mid + 1;                // G
                    else
                        top = mid - 1;                   // H
                }
            }
        }                                                // I
    }

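Flow Graph

[Figure: flow graph of search(). A leads to B. B (while bottom <= top) leads to C, or to I when bottom > top. C leads to D (if elemArray[mid] == key), which leads to E, or to F on else. F (if elemArray[mid] < key) leads to G, or to H on else.]

Derive 3 test cases:
  ABI
  ABCDE
  ABCDFGBI
Is this enough?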

Independent Paths and Number of Test Cases

  ABI
  ABCDEI
  A(BCDFG)*BI
  A(BCDFH)*BI

-The minimum number of test cases required according to the basis-path criterion is equal to the cyclomatic complexity (CC).
  $\mathit{CC} = \text{Number(edges)} - \text{Number(nodes)} + 2 = 10 - 9 + 2 = 3$

Path Sensitization
-Process of determining argument and instance variable values that will cause a particular path to be taken.
-Path sensitization is undecidable; no algorithm can solve this problem in all cases.

-Instead, it must be solved heuristically:
  For small, well-structured methods, this task is usually not difficult.
  It becomes increasingly difficult as size and complexity increase.

-Approach:
  1. Initially use other test strategies (e.g., domain analysis) to select test inputs (this reduces the amount of work required).
  2. Then find additional test values to exercise the missing paths to meet the coverage goal.
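For the binary search routine, path sensitization can be done by hand. The sketch below re-implements the search so that it returns the sequence of segments visited; the segment labels A through I follow the flow graph of the example, while the class name, the `trace` method, and the chosen inputs are assumptions made for illustration.

```java
// Sketch: binary search instrumented to report which CFG segments a given input visits.
class BinSearchPaths {
    static String trace(int key, int[] elems) {
        StringBuilder path = new StringBuilder("A");
        int bottom = 0;
        int top = elems.length - 1;
        while (true) {
            path.append('B');                       // loop test
            if (!(bottom <= top)) break;
            path.append('C');
            int mid = (top + bottom) / 2;
            path.append('D');                       // elems[mid] == key ?
            if (elems[mid] == key) {
                path.append('E');
                break;                              // corresponds to the early return
            }
            path.append('F');                       // elems[mid] < key ?
            if (elems[mid] < key) {
                path.append('G');
                bottom = mid + 1;
            } else {
                path.append('H');
                top = mid - 1;
            }
        }
        return path.append('I').toString();
    }

    public static void main(String[] args) {
        System.out.println(trace(1, new int[] {}));   // empty array: loop never entered
        System.out.println(trace(5, new int[] {5}));  // key found at mid
        System.out.println(trace(9, new int[] {5}));  // key larger than the only element
        System.out.println(trace(1, new int[] {5}));  // key smaller than the only element
    }
}
```

Each input was chosen by reasoning backwards from the branch outcomes a path requires, which is the heuristic process described above.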

Test Coverage, in Practice


-In practice, the three basic criteria most commonly used are statement coverage, branch coverage, and condition coverage.
  It has been suggested that the combination of these three criteria can meet 80-90% or more of coverage needs in most cases.

-It is important to note that test coverage is not enough by itself:
  100% test coverage cannot guarantee error-free software.

-However, test coverage in combination with appropriate test strategies can mitigate the impact of uncovered code during software testing.
  It helps the tester find rational points at which to stop testing.

4. Data Flow Testing


-The basic idea of data flow testing (DFT) is to test the correct handling of data dependencies.
  Data dependencies can be viewed as the mechanism by which data are carried along during program execution.

-Data flow testing investigates how values of data that are associated with variables can affect the execution of programs.
  It reveals data flow anomalies, e.g., data values are not available when they are supposed to be, or they are available and not used.

-Data flow anomalies can occur in various ways in program code, e.g.:
  Using, destroying, or reinitializing a variable without defining or initializing it.
  Repeatedly assigning a value to a variable without an intermediate reference.

-Test cases are derived from data dependency analysis (DDA) and an abstract model (e.g., a flow graph).

Data Dependency Analysis


Data Operations

-Three basic types of data operations:
  1. Data definition: through data creation, initialization, or assignment, either explicitly or through side effects such as shared memory locations, mailboxes, read/write parameters, etc.
     Abbreviated as D-operation or just D.
     Key characteristic: destructive operation; whatever was stored in the data item is destroyed by this operation.
  2. Undefinition or kill: occurs when the value assigned to a variable and the corresponding memory location become unbound.
  3. Data use in computation or in predicate, referred to as C-use or P-use, respectively.
     Both of these types of uses are collectively called U-operation, or just U.
     Main characteristic: non-destructive operation; the value of the data item remains the same after this operation.

Data Operations (ctd.)

Example:

Data Dependency Analysis (ctd.)


Data Relations

-Define pair-wise ordinal relations on the same data objects:
  1. D-U relation: the normal usage case; when a data item is used, we need to obtain its previously defined value.
  2. D-D relation: represents the overloading or masking situation, where the later D-operation destroys the previous contents. This relation can be a sign of errors or inefficiencies in the software.
  3. U-U relation: there is no effect or data dependency, because U-operations are non-destructive. Therefore these relations are ignored in data dependency analysis.
  4. U-D relation: this is called anti-usage. The only interesting situation is when a data item is used without ever having been defined previously, which usually indicates a problem.
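The suspicious relations can be detected by scanning the sequence of operations applied to one data item. The following is a minimal sketch under stated assumptions: operations are encoded as a string of `D` and `U` characters in execution order, kill operations are left out, and the class and message texts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: flags D-D pairs and uses that occur before any definition.
class DataRelations {
    // ops: 'D' (definition) and 'U' (use) operations on one data item, in execution order.
    static List<String> anomalies(String ops) {
        List<String> found = new ArrayList<>();
        boolean defined = false;
        char previous = 0; // 0 means no operation has been seen yet
        for (int i = 0; i < ops.length(); i++) {
            char op = ops.charAt(i);
            if (op == 'U' && !defined) {
                found.add("use before definition at " + i);
            }
            if (op == 'D' && previous == 'D') {
                found.add("D-D at " + i);
            }
            if (op == 'D') {
                defined = true;
            }
            previous = op;
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(anomalies("DUU")); // normal D-U usage, U-U ignored
        System.out.println(anomalies("DDU")); // first definition is overwritten unused
        System.out.println(anomalies("UD"));  // used before ever being defined
    }
}
```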


Data Flow Graph


-Drawn with the objective of identifying data definitions and their uses.

-Consists of a directed graph constructed as follows:
  A sequence of definitions and c-uses is associated with each node of the graph.
  A set of p-uses is associated with each edge of the graph.
  The entry node has a definition of each parameter and each nonlocal variable that occurs in the subprogram.
  The exit node has an undefinition of each local variable.

Data Flow Graph (ctd.)


Example:


Data Flow Graph (ctd.)


Notion of Def-Clear Path

-A path (i, n1, ..., nm, j) is called a definition-clear path (def-clear path) with respect to variable x, from node i to node j, if x has been neither defined nor undefined in nodes n1, ..., nm.

-Example:
  A,D,E is def-clear for x but not for y
  A,B,E is def-clear for y but not for x
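The definition translates directly into a check over the intermediate nodes of a path. In the sketch below, the assignment of definitions to nodes (B defines x, D defines y) is an assumption chosen to be consistent with the example above, since the original graph is not reproduced; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch: a path is def-clear for a variable if no intermediate node defines or undefines it.
class DefClear {
    static boolean isDefClear(List<String> path, String variable,
                              Map<String, Set<String>> definedIn) {
        // Only the intermediate nodes n1..nm are inspected, not the end points i and j.
        for (int k = 1; k < path.size() - 1; k++) {
            if (definedIn.getOrDefault(path.get(k), Set.of()).contains(variable)) {
                return false;
            }
        }
        return true;
    }

    // Assumed definitions per node: B (re)defines x, D (re)defines y.
    static final Map<String, Set<String>> EXAMPLE = Map.of(
            "B", Set.of("x"),
            "D", Set.of("y"));

    public static void main(String[] args) {
        System.out.println(isDefClear(List.of("A", "D", "E"), "x", EXAMPLE));
        System.out.println(isDefClear(List.of("A", "D", "E"), "y", EXAMPLE));
    }
}
```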


Data Flow Coverage


-Popular data flow coverage criteria include:

  All-definitions (all-defs) criterion:
  -Requires that for each definition of a variable x, the set of paths executed by the test set T contains a def-clear path from the definition to at least one c-use or one p-use of x.
  -i.e., all definitions get used.

  All-c-uses criterion:
  -Requires that for each definition of a variable x, and each c-use of x reachable from the definition, there is a def-clear path from the definition to the c-use.
  -i.e., all computations affected by each definition are exercised.

  All-p-uses criterion:
  -Requires that for each definition of a variable x, and each p-use of x reachable from the definition, there is a def-clear path from the definition to the p-use.
  -i.e., all branches affected by each definition are exercised.

Data Flow Coverage (ctd.)


-Data flow coverage criteria (ctd.):

  All-uses criterion:
  -Requires that for each definition of a variable x, and for each use of x reachable from the definition, there is a def-clear path from the definition to the use node.
  -i.e., all uses of a definition should be covered.
  -Stronger than the all-definitions criterion.

  All-definition-use-paths criterion:
  -Requires that for each definition of a variable x, and for each use of x reachable from the definition, all def-clear paths from the definition to the use node be covered, such that each path is loop-free or contains at most one loop.
  -i.e., coverage of all possible definition-use paths that either are cycle-free or have only simple cycles (i.e., one loop).
  -Stronger than all of the above criteria.

Data Flow Testing Strategy


-Track the definition and use of a variable (def-use pair), in which abnormal actions could happen, during program execution.
-Involves selecting entry-exit paths with the objective of achieving certain data flow coverage criteria.
-Consists of the following steps:
  1. Draw a flow graph from the program.
  2. Select one or more data flow coverage criteria.
  3. Identify paths in the data flow graph satisfying the coverage criteria.
  4. Derive path predicate expressions from the selected paths and solve those expressions to derive test inputs.

Example:

     1  i = get_a_number_from_console();
     2  j = get_a_number_from_console();
     3  if (a<30) {
     4      i = get_a_number_from_console();
     5  } else {
     6      j=5;
     7  }
     8  if (a>b) {
     9      i=8;
    10      j=6;
    11  } else {
    12      printf("%d\n",j);
    13  }
    14  if (b<30) {
    15      printf("%d\n",i);
    16      j=6;
    17  } else {
    18      printf("%d\n",i);
    19  }

Flow Graph:

    A: {1,2}         D(i)
    B: {3}           branches: a<30, a>=30
    C: {4}           D(i)
    B: {5,6,7}
    D: {8}           branches: a>b, a<=b
    E: {9,10}        D(i)
    F: {11,12,13}
    G: {14}          branches: b<30, b>=30
    H: {15,16}       U(i)
    I: {17,18,19}    U(i)
    J: {DONE}

(1) Which lines are definition occurrences of variable i?
(2) Which lines are use occurrences of variable i?
(3) For variable i, how many feasible def-clear paths are there? (Note that a feasible def-clear path can be covered by a test case.) Identify these def-use pairs using their line numbers.
(4) Apply adequate data flow coverage criteria (with respect to i) by identifying sample test cases.

Review Questions:
Mark each of the following as true or false:
(1) Segment coverage implies statement coverage: T
(2) Branch coverage implies segment coverage: T
(3) Branch coverage implies multiple conditional coverage: F
(4) Multiple conditional coverage implies branch coverage: T
(5) Path coverage implies branch coverage: T
(6) All def-use path coverage implies all definition coverage: T

-Example

    int foo(int x) {
        if (a == b || (x == y && isEmpty())) {
            ++x;
        } else {
            --x;
        }
        return x;
    }

-What test cases will give 100% branch coverage? a==b; a!=b && x!=y
-Does this achieve 100% multiple conditional coverage? No
-Which coverage criteria should be used here to maximize coverage?