Download as pdf or txt
Download as pdf or txt
You are on page 1of 35

CELF ELC Europe 2009

On Threads,
Processes and
Co-Processes
Gilad Ben-Yossef
Codefidence Ltd.

(C) 2008,2009 Codefidence


About Me

 Chief Coffee Drinker of Codefidence Ltd.



Co-Author of Building Embedded Linux System, 2nd
edition
 Israeli FOSS NPO Hamakor co-founder
 August Penguin co-founder
 git blame -- FOSS.*
 Linux
 Asterisk
 cfgsh (RIP) ...

(C) 2008,2009 Codefidence


Linux Process and Threads

Stack Stack Stack Stack Stack

State State State State State

Signal Signal Signal Signal Signal


Mask Mask Mask Mask Mask

Priority Priority Priority Priority Priority

Thread Thread Thread Thread


Thread 1
1 2 3 4

Process 123 Process 124

File Signal File Signal


Memory Memory
Descriptors Handlers Descriptors Handlers

(C) 2008,2009 Codefidence


The Question

Should we use threads?


Given an application that calls for several tasks, should we
implement each task as a thread in the same processes or in a
separate processes?

(C) 2008,2009 Codefidence


What Do The Old Wise Men Say?

 "If you think you need threads then your


processes are too fat."
 Rob Pike, co-author of The Practice of Programming and
The Unix Programming Environment.
 “A Computer is a state machine. Threads are
for people who can't program state
machines."
 Alan Cox, Linux kernel programmer

(C) 2008,2009 Codefidence


Spot the Difference (1)

 The Linux scheduler always operates at a


thread granularity level.
 You can start a new process without loading a
new program, just like with a thread
 fork() vs. exec()
 Process can communicate with each other just
like threads
 Shared memory, mutexes, semaphores, message
queues etc. work between processes too.

(C) 2008,2009 Codefidence


Spot the Difference (2)

 Process creation time is roughly double that of


a thread.
 ... but Linux process creation time is still very small.
 ... but most embedded systems pre-create all tasks
in advance anyway.
 Each process has it's own virtual memory
address space.
 Threads (of the same process) share the virtual
address space.

(C) 2008,2009 Codefidence


Physical and Virtual Memory
Physical address space Virtual address spaces
0xFFFFFFFFF 0xFFFFFFFFF 0xFFFFFFFFF

I/O memory 3
Process 0 Process1
I/O memory 2

0x00000000
I/O memory 1 0x00000000

Flash MMU CPU

0xFFFFFFFFF
Memory
RAM 1 Management All the processes have 
Unit their own virtual 
RAM 0 Process2
address space, and run 
as if they had access to 
0x00000000
the whole address 
0x00000000 space.

This slide is © 2004 – 2009 Free Electrons. (C) 2008,2009 Codefidence


Page Tables
Virtual Page Page Frame
Address Address Address
Space Read
Number * Write
ASN Virtual Physical Permission Execute
Cached
12 0x8000 0x5340 RWXC
15 0x8000 0x3390 RX

Read Only
Execute
Non Cached

Memory
CPU MMU
Bus

* Does not exists on all architectures.

(C) 2008,2009 Codefidence


Translation Look-aside Buffers

 The MMU caches the content of page tables in


a CPU local cache called the TLB.
 TLBs can be managed in hardware
automatically (x86, Sparc) or by software (Mips,
PowerPC)
 Making changes to the page tables might
require a TLB cache flush, if the architecture
does not support TLB ASN (Alpha, Intel
Nehalem, AMD SVM)

(C) 2008,2009 Codefidence


Cache Indexes

 Data and Instruction caches may use either


virtual or physical address as the key to the
cache.
 Physically indexed caches don't care about
different address spaces.
 Virtually indexed caches cannot keep more
then one alias to the same physical address
 Or need to carefully manage aliasing using tagging.

(C) 2008,2009 Codefidence


LMBench

 LMBench is a suite of simple, portable


benchmarks by Larry McVoy and Carl Staelin
 lat_ctx measures context switching time.
 Original lat_ctx supported only measurement of
inter process context switches.
 I have extended it to measure also inter thread
content switches.
 All bugs are mine, not Larry and Carl :-)

(C) 2008,2009 Codefidence


How lat_ctx works (1)
2. Pass
8923478234972364972349723469234692346923462397
token on a
462937y923236497234693246928923478234972364972 Unix pipe to
349723469234692346923462397462937y923236497234
6932469289234782349723649723497234692346923469 next task
23462397462937y9232364972346932469289234782349
72364972349723469234692346923462397462937y9232
3649723469324692892347823497236497234972346923
4692346923462397462937y92323649723469324692892
3478234972364972349723469234692346923462397462
937y923236497234693246928923478234972364972349
723469234692346923462397462937y923236497234693
2469289234782349723649723497234692346923469234 Task 1
62397462937y9232364972346932469289234782349723
64972349723469234692346923462397462937y9232364
9723469324692892347823497236497234972346923469
2346923462397462937y92323649723469324692892347
8234972364972349723469234692346923462397462937
y923236497234693246923246928923478234972364972
349723469234692346923462397462937y923236497234
Task 3
6932469289234782349723649723497234692346923469
23462397462937y9232364972346932469289234782349
72364972349723469234692346923462397462937y9232
3649723469324692892347823497236497234972346923
4692346923462397462937y92323649723469324692892
3478234972364972349723469234692346923462397462
937y92323649723469324692892347823497236497234

Task 2
Can set task size 1. Perform
and number of tasks calculation on a
variable size array

(C) 2008,2009 Codefidence


How lat_ctx works (2)

 Both the data and the instruction cache get


polluted by some amount before the token is
passed on.
 The data cache gets polluted by approximately the
process ``size''.
 The instruction cache gets polluted by a constant
amount, approximately 2.7 thousand instructions.
 The benchmark measures only the context switch
time, not including the overhead of doing the work.
 A warm up run with hot caches is used as a reference.

(C) 2008,2009 Codefidence


lat_ctx Accuracy

 The numbers produced by this benchmark are


somewhat inaccurate;
 They vary by about 10 to 15% from run to run.
 The possible reasons for the inaccuracies are
detailed in LMBench documentation
 They aren't really sure either.

(C) 2008,2009 Codefidence


Context Switch Costs

 Using the modified lat_ctx we can measure the


difference between the context switch times of
threads and processes.
 Two systems used:
 Intel x86 Core2 Duo 2Ghz
 Dual Core
 Hardware TLB
 Freescale PowerPC MPC8568 MDS
 Single core
 Software TLB

(C) 2008,2009 Codefidence


X86 Core2 Duo 2Ghz 0k data

1.85 1.83
1.81 1.81
Context Switch Time in Usec

1.8
1.76
1.75 1.74
Threads
1.7 1.69 Proccesses

1.65

1.6
5 10 20
Number of Threads/Processes

(C) 2008,2009 Codefidence


X86 Core2 Duo 2Ghz 16k data

2.24
2.23
2.23
Context Switch Time in Usec

2.23
2.22 2.22 2.22
2.22
2.22
2.21
2.21 Threads
2.21 Proccesses
2.2
2.2
2.2
2.19
2.19
5 10 20
Number of Threads/Processes

(C) 2008,2009 Codefidence


X86 Core2 Duo 2Ghz 128k data

18 16.61
15.85
16
Context Switch Time in Usec

14
12
10
Threads
8 Proccesses
6
3.73 3.28
4
1.99 1.92
2
0
5 10 20
Number of Threads/Processes

(C) 2008,2009 Codefidence


PPC MPC8568 MDS 0k data

4
3.44
3.5 3.24 3.26
Context Switch Time in Usec

3.05
3
2.61
2.5 2.36

2 Threads
Proccesses
1.5
1
0.5
0
5 10 20
Number of Threads/Processes

(C) 2008,2009 Codefidence


PPC MPC8568 MDS 16k data

20
17.77 17.87
18
Context Switch Time in Usec

16
14.25
14 13.18 13.7 13.35

12
10 Threads
8 Proccesses
6
4
2
0
5 10 20
Number of Threads/Processes

(C) 2008,2009 Codefidence


PPC MPC8568 MDS 128k data

250
222.41227.63 227.68228.23
Context Switch Time in Usec

200
178.06
167.88

150
Threads
100 Proccesses

50

0
5 10 20
Number of Threads/Processes

(C) 2008,2009 Codefidence


Conclusions

 Context switch times change between threads


and processes.
 It is not a priori obvious that threads are better.
 The difference is quite small.
 The results vary between architectures and
platforms.
 Why do we really use threads?

(C) 2008,2009 Codefidence


Why People Use Threads?

It's the API, Silly.


POSIX thread API offers simple mental model.

(C) 2008,2009 Codefidence


API Complexity: Task Creation

 fork()  pthread_create()
 Zero parameters.  Can specify most
 Set everything yourself attributes for new
after process creation. thread during create
 New process begins
 Can specify function
with a virtual copy of for new thread to
parent at same location start with
 Copy on write semantics
 Easy to grasp
require shared memory address space
setup sharing

(C) 2008,2009 Codefidence


API Complexity: Shared Memory

 Threads share all the memory.


 You can easily setup a segment of shared
memory for processes using shm_create()
 Share only what you need!
 BUT... each process may map shared segment
at different virtual address.
 Pointers to shared memory cannot be shared!
 A simple linked list becomes complex.

(C) 2008,2009 Codefidence


API Complexity: File Handles

 In Unix, Everything is a File.


 Threads share file descriptors.
 Processes do not.
 Although Unix Domain Socket can be used to pass
file descriptors between processes.
 System V semaphores undo values not shared as
well, unlike threads.
 Can make life more complicated.

(C) 2008,2009 Codefidence


Processes API Advantages

 Processes  Threads
 PID visible in the  No way to name task
system. in a unique name.
 Can set process  Kernel thread id not
name via related to internal
program_invocation_n thread handle.
ame.  Difficult to ID a thread
 Easy to identify a in the system.
processes in the
system.

(C) 2008,2009 Codefidence


The CoProc Library

 CoProc is a proof of concept library that


provides an API implementing share-as-you-
need semantics for tasks
 Wrapper around Linux clone() and friends.
 Co Processes offer a golden path between
traditional threads and processes.
 Check out the code at:
http://github.com/gby/coproc

(C) 2008,2009 Codefidence


CoProc Highlights
 A managed shared memory segment,
guaranteed to be mapped at the same virtual
address
 Coproc PID and name visible to system
 Decide to share file descriptors or not at coproc
creation time.
 Set attributes at coproc creation time.
 Detached/joinable, address space size, core size,
max CPU time, stack size, scheduling policy,
priority, and CPU affinity supported.

(C) 2008,2009 Codefidence


CoProc API Overview

 int coproc_init(size_t shm_max_size);
 pid_t coproc_create(char * coproc_name, struct 
coproc_attributes * attrib, int flags, int (* 
start_routine)(void *), void * arg);
 int coproc_exit(void);
 void * coproc_alloc(size_t size);
 void coproc_free(void * ptr);
 int coproc_join(pid_t pid, int * status);

(C) 2008,2009 Codefidence


CoProc Attributes
struct coproc_attributes {
 
  rlim_t address_space_size;   /* The maximum size of the process
       address space in bytes */
  rlim_t core_file_size;     /* Maximum size of core file */
  rlim_t cpu_time;     /* CPU time limit in seconds */
  rlim_t stack_size;     /* The maximum size of the process
                 stack, in bytes */
  int    scheduling_policy;   /* Scheduling policy. */
  int    scheduling_param;   /* Scheduling priority or 
      nice level */
  cpu_set_t cpu_affinity_mask;   /* The CPU mask of the co­proc */
};

(C) 2008,2009 Codefidence


CoProc Simple Usage Example
  int pid, ret;
  char * test_mem;
  struct coproc_attributes = { ... };

  coproc_init(1024 * 1024);

  test_mem = coproc_alloc(1024); 

  if(!test_mem) abort();
 
  pid = coproc_create("test_coproc", &test_coproc_attr, \
    COPROC_SHARE_FS ,test_coproc_func, test_mem);

  if(pid < 0) abort();

  coproc_join(pid, &ret);

  coproc_free(test_mem);

(C) 2008,2009 Codefidence


Credits

 The author wishes to acknowledge the


contribution of the following people:
 Larry McVoy and Carl Staelin for LMBench.
 Joel Issacason, for an eye opening paper about a
different approach to the same issue
 Rusty Russel, for libantithreads, yet another
approach to the same issue.
 Free Electrons, for Virt/Phy slide.
 Sergio Leone, Clint Eastwood, Lee Van Cleef, and
Eli Wallach for the movie :-)

(C) 2008,2009 Codefidence


Thank You For Listening!

Questions?
 Codefidece Ltd.: http://codefidence.com
 Community site: http://tuxology.net
 Email: gilad@codefidence.com
 Phone: +972-52-8260388
 Twitter: @giladby
 SIP: gilad@pbx.codefidence.com
 Skype: gilad_codefidence

(C) 2008,2009 Codefidence

You might also like