
Garbage Collection

Garbage collection makes memory management easier for programmers by automatically reclaiming unused memory. The garbage collector in the CLR makes tradeoffs to ensure reasonable performance for a wide variety of situations.

1
A fundamental tradeoff

• Memory management has always involved tradeoffs between numerous optimization possibilities:
– CPU overhead
– Working Set size
– Determinism
– Pause times
– Cache coherency
– Ease of development
• Schemes to manage the problem fall roughly into two camps
– “Deterministic” camp vs. “heuristic” camp

2
Basic tradeoffs by different strategies

[Chart: CPU overhead vs. RAM overhead of collection strategies, ranging from the deterministic camp (manual management, reference counting) to the heuristic camp (mark/compact, mark/sweep, copying collectors)]
3
GC and the CLR

• The CLR uses a Mark / Compact collector as a good choice to handle a wide variety of situations
– With optimizations to reduce pause times
• Mark / Compact leads to good object locality
– Over time objects tend to coalesce together
– Tight locality leads to fewer page faults
• Mark / Compact has very low memory overhead per object

4
CLR heap usage

• Objects are always allocated sequentially on the heap
– Heap allocation very fast as a result

A func()
{
    A a = new A();       // heap: a
    a.b = new B();       // heap: a, a.b
    A a2 = new A();      // heap: a, a.b, a2
    a2.b = new B();      // heap: a, a.b, a2, a2.b, then start of free space
    return a2;
}
5
GC

• When the heap becomes too full, a GC cycle ensues
– Exact criteria guarded by MS
• GC can be ordered by programmer with a call to GC.Collect
– Generally discouraged
• Collection proceeds in two phases
• Phase 1: Mark
– Find memory that can be reclaimed
• Phase 2: Compact
– Move all “live” objects to the bottom of the heap, leaving free space
at the top
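The two-phase cycle can be observed indirectly with GC.GetTotalMemory; a minimal sketch (the class name CollectDemo and allocation sizes are invented for illustration):

```csharp
using System;

class CollectDemo
{
    static void Main()
    {
        // Create ~10 MB of objects that immediately become garbage.
        for (int i = 0; i < 10000; i++)
        {
            byte[] garbage = new byte[1024];
        }
        long before = GC.GetTotalMemory(false);  // heap size with the garbage still present
        GC.Collect();                            // explicit collection (normally discouraged)
        long after = GC.GetTotalMemory(true);    // true: wait for the collection to finish
        Console.WriteLine(after <= before);      // the dead buffers have been reclaimed
    }
}
```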

6
Phase I: Mark

• GC begins by identifying live object references in well-known places
– AppDomain properties
– CPU Registers
– TLS slots
– Local variables on the stack
– Static members of classes
• This set of objects is called the root set
• Collector then recursively follows references to find all live
objects in the process
– Cycles are handled correctly

7
Marking objects

[Diagram: the root set points into the live objects, which may reference each other in cycles (cycles are not a problem); everything unreachable is available for collection, with free space at the top]
8
Phase II: Compact

• After finding live objects the Garbage Collector will initiate a compaction phase
• All live objects are relocated to the bottom of the heap,
squeezing free space to the top
– All pointers to live objects are adjusted as well

9
After Compaction

[Diagram: after compaction the root set points to live objects packed at the bottom of the heap, with the reclaimed free space above]

10
A real life example

[Chart: memory usage over time for a program that allocates only garbage]

A program that simply spews garbage objects shows sequential allocation intermingled with regular compaction.
11
Finalization

• Dead objects with finalizers can't be collected at first GC
– Finalizer has to run first
• Finalizer execution deferred until after collection
– During collection goal is to quickly find memory that is available
immediately
• Reference to object is placed on “freachable” queue
– These references become part of root set
– GC will dequeue references in background and call finalizers
• Call GC.WaitForPendingFinalizers to wait synchronously for
finalization queue to empty
• Beware order of finalization!
– No guarantees in parent / child relationship which gets called first
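The freachable mechanics above can be seen in a small sketch (the class Noisy and its static flag are invented here; the flag records that the finalizer thread ran):

```csharp
using System;

class Noisy
{
    public static bool Finalized;
    ~Noisy() { Finalized = true; }      // runs later, on the finalizer thread
}

class Program
{
    static void Main()
    {
        new Noisy();                    // immediately unreachable
        GC.Collect();                   // object is found dead and queued as freachable
        GC.WaitForPendingFinalizers();  // block until the finalizer thread drains the queue
        Console.WriteLine(Noisy.Finalized);
    }
}
```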

12
Finalizers and compaction
[Diagram: references on the freachable queue keep pending-finalize objects alive alongside the root set; live objects sit at the bottom of the heap, free space above]

13
Memory cost of finalizers

[Chart: memory usage with finalizers vs. without finalizers; pending finalizers pile up]

Finalized objects take longer to be collected, resulting in greater memory usage for an otherwise identical program.

14
Controlling Finalization

• Objects with Finalizers can request that their finalizer not be called with GC.SuppressFinalize
– Sets the “finalize” bit on object’s Sync Block
– Good thing to call in IDisposable::Dispose
• SuppressFinalize is currently pretty expensive
– Should be faster post beta2
• Objects that change their mind can call
GC.ReRegisterForFinalize
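A minimal sketch of the two calls (the class Resource and its counter are invented for illustration):

```csharp
using System;

class Resource
{
    public static int Finalized;
    ~Resource() { Finalized++; }
}

class Program
{
    static void Main()
    {
        Resource r = new Resource();
        GC.SuppressFinalize(r);        // sets the finalize bit: finalizer will be skipped
        GC.ReRegisterForFinalize(r);   // changed our mind: finalizer is eligible again
        r = null;                      // drop the only reference
        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine(Resource.Finalized);
    }
}
```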

15
Example: IDisposable and SuppressFinalize
public class Transaction : IDisposable {
    private int lowLevelTX;    // handle from the low-level API ("raw" in these slides)

    public Transaction() {
        lowLevelTX = raw.BeginTransaction();
    }

    public void Commit() {
        raw.CommitTransaction(lowLevelTX);
        lowLevelTX = 0;
    }

    private void cleanUp() {
        if (lowLevelTX != 0)
            raw.AbortTransaction(lowLevelTX);
    }

    public void Dispose() {
        System.GC.SuppressFinalize(this);   // Dispose ran: no need to finalize later
        cleanUp();
        // call base.Dispose(); if necessary
    }

    ~Transaction() {
        cleanUp();
    }
}
16
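A hedged usage sketch: the using statement guarantees Dispose (and therefore SuppressFinalize) runs. The raw class below is a stub invented here so the sketch is self-contained; the slides' real low-level API is not shown.

```csharp
using System;

// Invented stub standing in for the slides' low-level transaction API.
static class raw
{
    public static int BeginTransaction() { return 1; }
    public static void CommitTransaction(int tx) { }
    public static void AbortTransaction(int tx) { Console.WriteLine("aborted"); }
}

public class Transaction : IDisposable
{
    private int lowLevelTX;
    public Transaction() { lowLevelTX = raw.BeginTransaction(); }
    public void Commit() { raw.CommitTransaction(lowLevelTX); lowLevelTX = 0; }
    private void cleanUp() { if (lowLevelTX != 0) raw.AbortTransaction(lowLevelTX); }
    public void Dispose() { GC.SuppressFinalize(this); cleanUp(); }
    ~Transaction() { cleanUp(); }
}

class Demo
{
    static void Main()
    {
        using (Transaction tx = new Transaction())
        {
            tx.Commit();    // lowLevelTX reset to 0, so cleanUp will not abort
        }                   // Dispose runs here, even if an exception was thrown
        Console.WriteLine("committed");
    }
}
```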
Generations

• Compaction and object relocation can lead to large pauses
– Not good for programs that need to be responsive
• GC should concentrate on areas where objects are likely to be
dead
• In most programs most objects die young
– This assertion known as the “Weak Generational Hypothesis”
– So it makes sense to concentrate on recently allocated objects for
reclamation
• The .NET Collector segregates objects by age on the heap
– Into segments called generations
• Generations allow the collector to spend more time scavenging
young objects
– Where more garbage is likely to be found

17
Generations, cont’d

• Objects that survive a single GC are promoted to a higher generation
• Current implementation uses 3 generations
Objects are always compacted down

[Diagram: heap divided into Gen2 | Gen1 | Gen0]
– Gen2: objects that have survived two or more collections
– Gen1: objects that have survived one collection
– Gen0: objects that have never been collected
18
Generations, cont’d

• GC.GetGeneration() returns the current generation of an object
– Value may change
• Each older generation is collected much more rarely than the
one before it.
– Up to ten times less often, depending on app
• During Collection collector does not follow references that point
into older generations
– GC.Collect() takes an optional parameter to specify how many
generations to search
• Note that finalized objects always get promoted at least once!
– Since they always survive their first collection attempt
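Promotion can be watched directly with GC.GetGeneration; a small sketch (the exact generation numbers printed are an assumption about current runtimes):

```csharp
using System;

class Program
{
    static void Main()
    {
        object o = new object();
        Console.WriteLine(GC.GetGeneration(o));   // freshly allocated: generation 0
        GC.Collect();                             // o is reachable, so it survives...
        Console.WriteLine(GC.GetGeneration(o));   // ...and is promoted (1 on current runtimes)
        Console.WriteLine(GC.MaxGeneration);      // oldest generation (2: three generations total)
        GC.KeepAlive(o);                          // keep o reachable across the collection
    }
}
```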

19
The Large Object heap

• Really large objects are too expensive to move around
– And usually somewhat rare
• The CLR maintains a large object heap for really big objects
– Threshold ~20kb
• Objects on the LOH are never relocated
– Heap managed more like a classic C/C++ heap
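Since LOH objects are never relocated, they sit outside the normal promotion scheme; a sketch probing this (the 200 KB size and the exact reported generation are assumptions about current runtimes, not the beta threshold quoted above):

```csharp
using System;

class Program
{
    static void Main()
    {
        byte[] small = new byte[100];        // ordinary heap: generation 0
        byte[] big = new byte[200 * 1024];   // well past the large-object threshold
        Console.WriteLine(GC.GetGeneration(small));
        // LOH objects report the oldest generation: they are examined only
        // during full collections and are never compacted.
        Console.WriteLine(GC.GetGeneration(big));
        GC.KeepAlive(small);
        GC.KeepAlive(big);
    }
}
```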

20
Large object heap behavior

[Chart: allocation address over time on the large object heap]

Once the LOH is filled, new requests are satisfied by searching for available space.
21
Threads and GC

• Different versions of the runtime have different GC behavior
– mscorwks.dll used on single CPU boxes
– mscorsvr.dll used on multi-CPU boxes
• Every process has one dedicated Finalization thread
– runs at highest non-realtime priority
• GC can happen on dedicated background thread
– Finalization always happens on dedicated thread
• When GC runs all managed threads must be brought to a halt
• Runtime has several methods for doing this
– Thread Hijacking
– Safe Points
– Interruptible code

22
Thread hijacking

• For short methods runtime can wallop return address on stack to "return" into GC code

int bar()
{
    return 5;    // ← return address on the stack is overwritten to jump into the GC
}

int foo()
{
    int ret = bar();
    return ret;
}

23
Safe Points

• For longer methods Jitter will add callouts to code to check for a
GC
– Callouts are made at places deemed safe for GC to occur
• If runtime is beginning a GC cycle then method will halt at safe
point until GC is completed

24
Fully Interruptible code

• For tight loops neither of the above methods really works


• The JIT compiler emits tables that are used by the collector to
determine which object references are live at each point in the
method
• Thread is simply suspended, GC takes place, and thread is then
resumed
Line   Code                      Live objects
1      void method()
2      {
3      foo a = new foo();        {}
4      bar b = new bar();        {a}
5      foo c = new foo();        {a}
6      a.Method();               {a, c}
7      c.Method();               {c}
8      }                         {}

25
Arenas

• On Multi-CPU systems heap contention is an issue
• Current implementation deals with this by divvying heap into processor-specific "arenas"
– Initially each arena is ~24kb each
– Arenas allocated contiguously on heap
• Each CPU’s objects get allocated in that CPU’s arena
• Each CPU manages GC of its own arenas
– GC still synchronized across CPUs
• If one Arena fills up new ones are allocated further up the heap
– New arenas currently ~8kb in size

26
Arenas: Single CPU

[Diagram: allocations from Thread 1 and Thread 2 interleaved in a single heap]

On a 1 CPU box memory is allocated contiguously regardless of thread.
27
Arenas: Two CPUs

[Diagram: Thread 1 and Thread 2 each allocating into their own arena]

On a 2 CPU box memory is allocated from two dynamically managed arenas.
28
Summary

• GC involves tradeoffs
• Techniques chosen by MS give reasonable performance for a
wide variety of languages / patterns
• GC is mostly automatic
– But can be influenced via System.GC class
• Try to minimize use of Finalizers
– Saves CPU and memory
• Details certain to change in future
– As MS tweaks implementation

29
