
Coverage and Profiling for Real-Time Tiny Kernels

Sital Prasad Kedia, Anusree Bhattacharjee, Rajeshwar Kailash and Saurabh Dongre
Memory Solutions Group
Samsung India Software Operations
Email: sital.kedia@samsung.com

Abstract: Real-time tiny kernels run on embedded devices with limited resources (especially memory) and usually with no file system. This paper describes how these real-time tiny kernels can be enhanced to enable coverage and profiling for the applications running on them, using the gcov and gprof tools provided by GCC. It describes the changes made in the gcc-tool-chain to separate the gprof tool (based on glibc 2.8 and binutils 2.18.50) so that it can be used for any target architecture and for any real-time OS without a file system. Furthermore, when an application is enabled for coverage or profiling, its memory footprint as well as its execution time increases significantly. Due to the stringent memory constraints on embedded devices, it is necessary to keep the memory footprint of the running applications as small as possible. Hence, this paper describes a feature called function level coverage and profiling, which can be used to reduce the memory footprint and the execution time of coverage and profiling enabled applications. The feature is incorporated in GCC (version 4.3), and the changes required in the gcc-tool-chain to implement it are described. Function level coverage and profiling was tested on an ARM target board with the uC/OS-II RTOS, and the results show that it can save a significant amount of memory on the target board and reduce the execution time of coverage/profiling enabled applications.

I. INTRODUCTION

A. Motivation

Coverage and profiling tools such as gcov and gprof give us some basic performance statistics, which allow us to optimize our code efficiently. Gcov [2] is used to create more efficient, faster running code and to discover untested parts of the program, and gprof [3] is used to determine which parts of our code use the greatest amount of computation time. These profilers examine the temporal (execution time) and spatial (code size) [12] efficiency of the program. Since memory space in embedded devices is at a premium, such coverage and profiling tools are indispensable in the development of applications for embedded systems. Many such tools [13] [14] have been developed for embedded systems. However, these tools require file system support on the embedded device, and real-time tiny kernels cannot afford the luxury of a file system. Porting of gcov to an embedded target board without file system support has already been described in [6]. In this paper we describe how the profiling tool gprof was ported to an embedded target board without file system support.

Moreover, enabling coverage or profiling for an application significantly increases its memory footprint (approximately 108 bytes per basic block on an ARM target board for coverage) and its execution time. The increase is directly proportional to the number of functions (in case of profiling) or the number of basic blocks (in case of coverage) in the code, and our target board may not have sufficient memory to satisfy this additional memory requirement. Hence we have implemented a feature called function level coverage and profiling in GCC, by which we can disable coverage/profiling for some functions in a file, saving the extra memory space required for coverage/profiling of those functions.

To enhance tiny kernels with the gcov and gprof tools, one has to understand how these tools really work in a Linux environment. On Linux, the GCC compiler links these tools into the application for which code coverage or profiling analysis has to be done. So, to enhance tiny kernels with these tools, the tiny kernel and the application have to be compiled and linked with GCC (a cross gcc-tool-chain for the target architecture).

The rest of the paper is organized as follows. Section II discusses coverage analysis and gcov. Section III introduces profiling and the operation of the gprof tool on Linux. Section IV presents the porting of gprof to a real-time tiny kernel. Section V presents function level coverage and profiling. Simulation results are discussed in Section VI, and we conclude in Section VII.

II. COVERAGE ANALYSIS AND GCOV

Code coverage [8] analysis is about finding the areas of code not exercised by a set of test cases. It can be used in the following two ways. First, it measures the quality of the test cases by finding to what extent they cover our source code. This allows us to create more test cases to increase the code coverage. The quantitative measure of code coverage determines the end product quality; hence coverage analysis is helpful for improving the end product quality. Second, it can be used to eliminate dead code from our program, which reduces the code size and in turn the memory footprint of the application. Due to the stringent memory constraints of embedded devices, coverage analysis is very important when designing applications for real-time tiny kernels.

Gcov is the code coverage analysis infrastructure provided with the gcc compiler. It can be used to analyze C and C++ as well as assembly files [9]. It uses a technique called code instrumentation [12] to generate coverage information for a program. The gcov framework consists of the following 3 phases [6].
1) Compilation Phase
2) Execution Phase
3) Reporting Phase

When a program is compiled in gcc with the -fprofile-arcs -ftest-coverage options, GCC inserts counting instructions at the beginning of each basic block. It also generates a graph file (.gcno) [2] for each compilation unit, which contains the information needed to reconstruct the basic block graph and assign source line numbers to blocks in the reporting phase. During execution of the instrumented binary, the counting instructions for each basic block are executed before the block itself. These counting instructions simply increment the counter variable for the corresponding basic block, thus keeping track of the number of times each basic block in the source file was executed. At the end of program execution, these counter variables are written to a data file (.gcda) [2]. During the reporting phase, the gcov utility reads both the graph files and the data files, maps the counter variables in the data files to the corresponding line numbers in the source file, and generates a coverage report (.gcov). In real-time tiny OSs, file system support is not available, so all the coverage data has to be written into memory at the end of the execution phase; later it must be manually transferred to the host PC for the reporting phase. In [6] the authors describe a method of porting the gcov tool to an embedded target without file system support, which involves major steps like:
• Allocating memory for counter variables before the call to the main function.
• Replacing standard C library calls such as file operations and dynamic memory allocation.
• Manually calling the gcov_exit function to write the data into memory.

III. PERFORMANCE PROFILING AND GPROF

Software performance analysis, more commonly known as profiling [11], is an investigation of a program's behavior based on information collected as the program executes. Profiling allows us to learn where our program spent most of its execution time, so that these parts can be rewritten to make the program run faster. Generally, profiling can be accomplished in two ways: sampling and instrumentation [11] [12]. In sampling, the target's program counter is probed at regular intervals by an interrupt generated by the operating system. Instrumentation, on the other hand, involves adding some extra instructions to the target program to collect the profiling information. Instrumentation typically slows down program execution, but it is considered more accurate than sampling.

Gprof is a profiling tool provided with the gcc compiler which uses both statistical sampling and instrumentation techniques to gather profiling information. For each function, it records the number of calls to the function and the amount of time spent there. From gprof's output, we can identify the functions which consume a large fraction of the execution time and try to optimize them. Similar to gcov, the gprof framework also consists of the following 3 phases:

1) Compilation Phase: When the program is compiled in gcc with profiling enabled (-pg option), gcc causes every function to call the mcount routine as its first operation. The mcount routine, which is included in the profiling library, increases the execution count of the current function and updates the call graph by retrieving the caller's identity through the return address on the stack and the callee's identity through the value of the program counter. Since this is a very machine-dependent operation, mcount itself is typically a short, architecture-dependent assembly-language stub routine. During the linking phase, gcc links the object files to the mcount routine corresponding to the processor's architecture.

2) Execution Phase: Every executable file (.elf) in Linux has a constructor and a destructor section. Functions in the constructor and destructor sections are executed before and after the execution of the main function, respectively. For a profiling enabled application to collect the histogram data, the memory array for the histogram must be allocated before the call to the main function. To accomplish this, gcc inserts the gmonstartup function as a constructor in the executable of a profiling enabled application. This function calls the monstartup function with the start and end of the text segment as arguments. The monstartup function, by calling the profil() function, registers a memory array with the kernel (for storing the histogram record), along with a scale factor that determines how the program's address space maps into the array. The profil() function also sets a timer interrupt with profil_count as the callback function for the interrupt. The frequency of the interrupt is determined by the profil_frequency function. The profil_count function, called at equal intervals, examines the value of the program counter and increments the corresponding slot of the histogram array.

After program execution, the profiling data is written by a call to the mcleanup function (in the destructor section). The mcleanup function opens the gmon.out file and writes the histogram, call-graph and basic-block count records to it.

3) Reporting Phase: During the reporting phase, the gprof utility reads the profile data written into the gmon.out file in the execution phase and generates mainly two styles of output: the flat profile and the call graph [3]. The flat profile shows the time spent in each function, while the call graph shows the caller and callee information of a function.

IV. PORTING OF GPROF TO REAL-TIME TINY KERNEL

Gprof works on Linux, which supports features like a file system and dynamic memory allocation. To port it to real-time tiny kernels, we have to change gprof so that it does not use any of these features. The following changes were made in gprof in order to port it to the real-time tiny kernel.

A. Ensure call to monstartup routine before main

As previously discussed, the monstartup function initializes the memory array that stores the histogram
data. Hence, before the main routine of the program is called, the monstartup routine should be called with the start and end addresses of the image's text section as arguments. These addresses can be obtained from the linker script.

B. Remove OS and target architecture dependencies

Functions like profil and profil_frequency should be ported so that they do not use any Linux system calls (like setitimer). The profil_frequency function should return the frequency with which the program counter is examined by the profil_count function. This frequency should be chosen very carefully, depending on the context switch time of the target board and the desired accuracy of the profiling result. If the profiling frequency is very high, context switching takes up most of the program execution time and slows down the program significantly. On the other hand, if it is low, the accuracy of the profiling result suffers [3]. As already discussed, the profil function sets a timer interrupt with profil_count as the callback function. On the target board we need to make sure that profil_count is called at the equal intervals determined by the profil_frequency function. This can be accomplished by calling profil_count from the Interrupt Service Routine (ISR) at those intervals.

All C standard library dependencies (like file system calls and dynamic memory allocation) should be replaced with the system calls provided by the target OS, and file system usage should be handled by writing data into static buffers.

C. Exit point

At the end of execution, an explicit call to the mcleanup function is required to write the data into memory (into the static buffer allocated for gprof data). The profiling data in memory should then be transferred to the host PC for the reporting phase.

V. FUNCTION LEVEL COVERAGE AND PROFILING

As already discussed, when coverage is enabled for a particular file, GCC adds counting code for every basic block in the file. Similarly, if profiling is enabled, GCC adds profiling code for every function in the file. So enabling coverage or profiling for a file increases the text segment size as well as the data segment size, which in turn increases the memory footprint of the resulting binary for the target machine. But real-time tiny kernels generally lack the memory space to satisfy the extra memory requirement imposed by GCC for coverage and profiling. For this reason, we have implemented a feature called function level coverage and profiling, which allows us to select or deselect individual functions in a file for coverage or profiling. This feature can be implemented in GCC by some minor changes in the gcc-tool-chain. Although GCC implements file level coverage and profiling, by which we can enable or disable some files for coverage/profiling in our application, there are situations when we want to profile only a few functions in a file. Following are some advantages of this feature:
• We can disable coverage/profiling for trivial functions in a file which we do not want to profile, thus saving the extra memory that would be required for coverage/profiling of those functions.
• Coverage/profiling of the whole application can be performed in multiple runs, enabling a part of the application for coverage/profiling in each run, in case we are short of memory on the target board.
• The feature is useful when we are interested in obtaining the percentage of coverage (line coverage or function coverage) only for selected functions in a file.

To implement this feature, changes have to be made in the gcc tool chain itself. Following are the two major changes required in GCC.
1) Implement a mechanism through which we can specify to GCC the list of functions that are enabled for coverage/profiling.
2) For each function under compilation, GCC should add the coverage/profiling code to the function only if it is enabled for coverage/profiling.

The first change can be implemented in several ways. The conventional way is to add a new command-line option to GCC so that the list of functions enabled or disabled for coverage/profiling can be specified on the command line itself. However, to keep things simple, we specified the list of functions enabled for coverage/profiling in the source file itself. Dummy macros like COVERAGE_ENABLE(function_name) or PROFILE_ENABLE(function_name) were defined. These macros were defined to NULL and only served as placeholders for the names of the functions that are enabled for coverage/profiling. A separate parser function was implemented in GCC which opens the file currently under compilation (represented by the global variable input_filename), searches it for the above macros (as keywords), and constructs a list of functions that are enabled for coverage/profiling. For each function under compilation, GCC searches this list for the function name (obtained from the global function current_function_name) and, if the name is present in the list, adds the coverage/profiling code to the function.

A. Function Level Coverage

As previously discussed, GCC creates two types of files, a graph file (.gcno) and a data file (.gcda), for each coverage enabled file. The graph file contains information to reconstruct the basic block graph and assign source line numbers to blocks, and is generated when the source file is compiled with the GCC -ftest-coverage option. The data file contains arc transition counts and some summary information, and is generated when a program containing object files built with the GCC -fprofile-arcs option is executed. When a source file is enabled for coverage, GCC defines counter variables and adds counting code to every basic block in the file.
These counter variables are incremented by the corresponding counting code during program execution and are written into the data file at the end of program execution by the function gcov_exit, which is invoked through atexit.
Fig. 1. Data segment size vs. number of coverage enabled functions

To implement function level coverage, changes have to be made in GCC such that both the data file and the graph file contain information only for coverage enabled functions. For each function in the source file, the branch_prob function (in profile.c) adds information for that function to the graph file by a call to the function coverage_begin_output. In addition, the branch_prob function calls the function instrument_edges, which instruments all the basic blocks in the current function. It does so by defining counter variables and adding counting code (for each basic block) to the resulting object file. The branch_prob function should be modified so that it calls coverage_begin_output and instrument_edges only for those functions which are enabled for coverage. It can determine whether the current function is enabled for coverage by searching for the current function name in the list of functions enabled for coverage.
Fig. 2. Text segment size vs. number of coverage enabled functions

B. Function Level Profiling

As discussed in Section III, for profiling enabled files GCC adds profile hooks, that is, a call to the monitoring routine (mcount) at the beginning of every function in the file. Although inserting a call to the mcount routine does not increase the size of the text segment significantly, it introduces significant overhead in terms of the time required to execute the program.
To implement function level profiling, the mcount routine should be called only for those functions which are enabled for profiling. The function expand_function_start (in the file function.c) is responsible for adding profiling hooks to each function. It does so only if the flag current_function_profile is set for the function being compiled. To implement function level profiling, the current_function_profile flag should be set to 0 for all functions that are not enabled for profiling, and this has to be done at the beginning of the function expand_function_start.

VI. RESULTS

We ported the gcov and gprof utilities to an ARM target board with the uC/OS-II RTOS as operating system. Changes were made in the gcc-tool-chain to implement function level coverage and profiling. We analyzed the memory overhead (per function) imposed on coverage and profiling enabled applications on this platform. A C program having 1000 simple functions, each with exactly 3 basic blocks, was written. Memory analysis was done by enabling a varying number of functions for coverage and profiling in different runs of the application and noting the size of the text segment, the data segment, the object file and the binary. To obtain the sizes of the text and data segments, a separate map file was created by the linker with the Map option.

Figures 1, 2, 3 and 4 show the results of memory analysis for coverage enabled applications. Figure 1 plots the size of the data segment against the number of functions enabled for coverage. As the number of functions enabled for coverage increases, the number of basic blocks enabled for coverage increases and consequently the size of the data segment increases. This is because GCC defines a counter variable for each basic block: as the number of basic blocks enabled for coverage increases, the number of counter variables, and hence the data segment size, increases linearly. The increase in data segment size for 100 functions (= 300 basic blocks) is approximately 2.4 KB, so the corresponding increase per basic block is 8 bytes. Figure 2 depicts a linear increase in the text segment size against the number of functions enabled for coverage. Again, this linear increase is due to the counting code that GCC adds for each basic block enabled for coverage. The approximate increase in text segment size for 100 functions is 25 KB, so the increase per basic block is approximately 83.3 bytes. The increases in object size and binary size are shown in Figures 3 and 4 respectively. They are almost the same, approximately 100 bytes per basic block.

Fig. 3. Object size vs. number of coverage enabled functions
Fig. 4. Binary size vs. number of coverage enabled functions

The extra memory space required to execute a coverage enabled binary on the target board can be estimated as the sum of the increase in data segment size and the increase in binary size, which adds up to 108 bytes per basic block. In other words, 108 bytes of memory can be saved on the ARM target board for each basic block that is disabled for coverage.

Figures 5, 6 and 7 show the results of memory analysis for profiling enabled applications. In Figure 5, we observe that the size of the text segment increases with the number of functions enabled for profiling. This increase is due to the call to mcount inserted in the prologue of each function enabled for profiling. A similar increase in object size and binary size can be observed in Figures 6 and 7. The data segment size was found to be independent of the number of functions enabled for profiling. It should be noted that in case of profiling, the increase in text segment, object size and binary size is significantly less than the corresponding increase in case of coverage.

Fig. 5. Text segment size vs. number of profiling enabled functions
Fig. 6. Object size vs. number of profiling enabled functions
Fig. 7. Binary size vs. number of profiling enabled functions

Time analysis for coverage and profiling is given in Figures 8 and 9. As the number of functions enabled for coverage/profiling increases, the execution time of the coverage/profiling enabled application increases. In case of coverage, the increase in execution time is due to the extra counting instructions added at the beginning of every basic block in the coverage enabled functions; in case of profiling, it is due to the call to the monitoring routine in the prologue of every profiling enabled function.
Fig. 8. Execution time vs. number of coverage enabled functions
Fig. 9. Execution time vs. number of profiling enabled functions

VII. CONCLUSION

We have ported the gprof tool provided by GCC to real-time OSs without file system support. We have also implemented a feature called function level coverage and profiling in GCC, and have described in detail the changes made in the gcc-tool-chain to implement it. The feature was tested on an ARM target board using the uC/OS-II RTOS, and the results show that it can save a significant amount of memory on the target board and reduce the execution time of coverage/profiling enabled applications.

REFERENCES

[1] GNU GCC Compiler Collection, Accessed February, 2010. [Online]. Available: http://gcc.gnu.org/
[2] Free Software Foundation, gcov - a Test Coverage Program, in: Using the GNU Compiler Collection (GCC), Accessed February, 2010. [Online]. Available: http://gcc.gnu.org/onlinedocs/gcc-3.4.4/gcc/Gcov.html
[3] GNU gprof, in: Documentation for binutils 2.20, Accessed February, 2010. [Online]. Available: http://sourceware.org/binutils/docs-2.20/gprof/index.html
[4] Implementation of profiling in GNU gprof, in: Documentation for binutils 2.20, Accessed February, 2010. [Online]. Available: http://sourceware.org/binutils/docs/gprof/Implementation.html#Implementation
[5] S. Graham, P. Kessler, M. McKusick, "gprof: A Call Graph Execution Profiler", Proceedings of the SIGPLAN '82 Symposium on Compiler Construction, SIGPLAN Notices, Vol. 17, No. 6, pp. 120-126, June 1982.
[6] Holger Blasum, Frank Gorgen, Jurgen Urban, "Gcov on an embedded system", GCC for Research in Embedded and Parallel Systems, Brasov, 16 Sept 2007.
[7] Dominic A. Varley, "Practical Experience of the limitations of Gprof", Software Practice and Experience 23, pp. 461-463.
[8] S. Cornett, Code coverage analysis, Accessed February, 2010. [Online]. Available: http://www.bullseye.com/coverage.html
[9] X. Fei, L. Luo, 2004, Using the GNU tools to measure assembly program coverage (Liyong GNU gongju shixian huibian chengxu fugai ceshi), Journal of Computer Applications (Jisuanji Yingyong, ISSN 1001-9081), Vol. 24, No. 12, pp. 95-98.
[10] Q. Yang, J. J. Li, D. Weiss, 2006, "A survey of coverage based testing tools", in: Proceedings of the 2006 International Workshop on Automation, AST '06, Shanghai, 2006, pp. 99-103.
[11] Profiling (Wikipedia), Accessed February, 2010. [Online]. Available: http://en.wikipedia.org/wiki/Profiling_(computer_programming)
[12] Ramesh V. Peri, Sanjay Jinturkar and Lincoln Fajardo, 1999, A novel technique for profiling programs in embedded systems, in: ACM Workshop on Feedback-Directed and Dynamic Optimization (FDDO-2).
[13] OProfile - A System Profiler for Linux, Accessed February, 2010. [Online]. Available: http://oprofile.sourceforge.net/news/
[14] Valgrind, Accessed February, 2010. [Online]. Available: http://valgrind.org/
