
Program Execution in Linux

David Ferry, Chris Gill, Brian Kocoloski, Marion Sudvarg


CSE 422S - Operating Systems Organization
Washington University in St. Louis
St. Louis, MO 63143

CSE 361 Review
• If you took CSE 361 (undergraduates
must have), most of today’s material will
be review
• If you’re a grad student some of it may be
new, but hopefully most will be review as
well
• Either way, we will highlight material that
is new in this course (beyond CSE 361)

Creating an Executable File
Two stages:
• Compilation
• Linking

The compiler translates source code to machine code.
The linker connects binary files to libraries to create an executable.

//Source code
#include <stdio.h>

int foo = 20;

int main( int argc, char* argv[]){
    printf("Hello, world!\n");
    return 0;
}

[Diagram: source code → Compiler → relocatable object file → Linker → executable file]

Relocatable Object file:
00000000 D foo
00000000 T main
         U puts

Static vs. Dynamic Linking
Static linking – required code and data is copied into the executable at link time, when the program is built

Dynamic linking – required code and data is linked to the executable at runtime

[Diagram: Static: my_program.o contains Program code, Program Data, and Library Code/Data. Dynamic: my_program.o contains Program code and Program Data; libc.so contains Library Code/Data]

Static Linking
• With multiple programs, what happens to
the size of the executables?

[Diagram: one statically linked executable containing Program code, Program Data, and Library Code]

[Diagram: several statically linked executables, each containing its own Program code, its own Program Data, and a separate copy of the Library Code]

Dynamic Linking
• With multiple programs, what happens to
the size of the executables?
[Diagram: several dynamically linked executables, each containing only its own Program code and Program Data, all sharing a single copy of the Library Code]

Running a Statically Linked Program
• All functions and data needed by the process are linked in as the last step of compilation
• Only the code/data needed by the program is loaded into virtual memory

[Diagram: process address space, top to bottom: Stack, Static library A code/data, Static library B code/data, Heap, .bss, .data, .text]

Running a Statically Linked Program
A statically linked program is entirely self-contained

1. An existing process is fork()ed to get a new process address space
2. execve() reads program into memory
3. The new process starts executing at _start() in the
C runtime (added automatically by the linker), which
sets up environment (not covered in CSE 361)
4. The C runtime eventually calls the program’s main()
function
5. After main() returns, C runtime does some cleanup

The fork() System Call
• Creates a new process by duplicating the calling process
• Uses copy-on-write (COW), a lazy optimization of memory copy for
new process
• Duplicates file descriptors
• Example:
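#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main (void)
{
    pid_t pid = fork();
    if (pid > 0)          // parent: fork() returned the child's pid
        printf ("I am the parent of pid=%d!\n", (int) pid);
    else if (!pid)        // child: fork() returned 0
        printf ("I am the child!\n");
    else if (pid == -1)   // error: no child was created
        perror ("fork");
    return 0;
}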
• Cannot predict whether parent or child will run first!

The exec() System Call Family
• Loads and executes a new program image
• Often called after fork() creates a new process
• Returns -1 on error, otherwise jumps to entry point of
new program and does not return
• Keeps pid, parent pid, priority, owning user and group
• Resets most attributes, e.g.: signals, memory locks,
thread attributes, process statistics, address space

Examples of exec()
• Several versions: execl, execlp, execle, execv, execvp, execve
• l versus v: arguments provided via list or vector
• p: user’s path is searched for the specified file
• e: A new environment is supplied
#include <stdio.h>
#include <unistd.h>

//Edit file /home/pi/program.c with arguments as list
int ret = execlp ("vi", "vi", "/home/pi/program.c", (char *) NULL);
if (ret == -1)
    perror ("execlp");

//Edit file /home/pi/program.c with arguments as vector
//(an alternative to the call above; exec only returns on failure)
char * args[] = { "vi", "/home/pi/program.c", NULL };
ret = execvp ("vi", args);
if (ret == -1)
    perror ("execvp");

Running a Dynamically Linked Program
• Some functions and data do not exist in the process address space when the program starts
• The dynamic linker (ld.so) maps these into the memory map segment on demand

[Diagram: process address space, top to bottom: Stack, Memory Map Segment(s) (added at runtime by linker), Heap, .bss, .data, .text]

Linking at Runtime
• Procedure Linkage Table (PLT)
– Used by dynamic linker to resolve locations of library
functions
– All function calls to shared libraries are replaced by stub
functions that query the PLT at runtime for the address of
the function

• Global Offset Table (GOT)
– Used by dynamic linker to resolve locations of library data
– All references to variables in shared libraries are replaced
with references to the GOT
– Shared libraries may be compiled as Position Independent
Code (allows branch and jump instructions to use addresses
relative to the GOT)

Linking at Runtime
At compile time:
• A reference to the dynamic linker (ld.so) is embedded in the program
• Addresses of dynamic functions are replaced with calls to
the linker

At runtime the linker does lazy binding:
• The program runs as normal until it encounters an
unresolved function
• Execution jumps to the linker
• The linker maps the shared library into the process’ address
space and replaces the unresolved address with the
resolved address
• This is called “lazy update of GOT” in CSE 361

Runtime Linker Implementation
Uses a Procedure Linkage Table (PLT) to do lazy binding

//Source code
#include <stdio.h>

int foo = 20;

int main( int argc, char* argv[]){
    printf("Hello, world!\n");
    return 0;
}

[Diagram, before printf() is resolved: process address space with Stack, Heap, .bss, .data, .text; the Procedure Linkage Table (PLT) entry points to linker_stub()]

[Diagram, after printf() is resolved: the library with the printf() function is mapped between Stack and Heap; the PLT entry points to the library printf()]

Static vs. Dynamic Linking Tradeoffs
Static:
• Does not need to look up libraries at runtime
• Does not need extra PLT indirection
• Consumes more memory with copies of each library
in every program

Dynamic:
• Uses less disk space and memory (7K vs 571K for hello world)
• Shared libraries are often already in memory and
in hot cache
• Incurs lookup and indirection overheads

Executable File Format
The current binary file format is called
ELF - Executable and Linking Format
• ELF header
– Word size, byte ordering, file type (.o, exec, .so),
machine type, etc.
• .text section
– Code
• .rodata section
– Read only data: jump tables, …
• .data section
– Initialized global variables
• .bss section
– Uninitialized global variables or ones initialized with 0
– “Block Started by Symbol”
– Has section header but occupies no space
• Other sections added by gcc…

[Diagram: ELF file layout, top to bottom: ELF header, .text, .rodata, .data, .bss, .symtab, .rel.text, .rel.data, .debug, Section header table]
Slide borrowed, with slight modifications, from CSE 361 materials presented by Dr. Steve Cole in Spring 2019

Binary File Utilities
Not covered in CSE 361:
• nm – prints symbol table
• objdump – prints all binary data
• readelf – prints ELF data
• pmap – prints memory map of a running
process
• ldd – prints dynamic library dependencies
of a binary
• strip – strips symbol data from a binary

GNU Debugger (gdb)
Not covered in CSE 361:
• Allows you to debug a program while it
executes
• Parent process (gdb) forks to launch a program
• Uses ptrace system call to observe execution of
child program
• Can set breakpoints to pause execution of child
at specific instruction addresses
• Can set watchpoints to monitor changes to
particular memory locations
