Using Linearity To Allow Heap Recycling in Haskell
Abstract
This project investigates destructive updates and heap-space recycling in Haskell through the use of linear types. I provide a semantics for an extension to the STG language, an intermediate language used in the Glasgow Haskell Compiler (GHC), that allows arbitrary data types to be updated in place. A type system based on uniqueness typing is also introduced that allows the use of the new semantics without breaking referential transparency. The type system aims to be simple and syntactically light, allowing a programmer to introduce destructive updates with minimal changes to source code. I have implemented this semantic extension both in an interpreter for the STG language and in the GHC backend. Finally, I have written a type checker for this system that works over a subset of Haskell.
Contents
Abstract
1 Introduction
    The Problem With Persistence
2 Uniqueness
3 Uniqueness in Type Systems
    Linear Logic
    Clean
    Monads
    Hage & Holdermans' Heap Recycling for Lazy Languages
    Uniqueness in Imperative Languages
    A Simpler Type System for Unique Values
4 Implementation
    The STG Language
    Operational Semantics of STG
    Closure Representation
    Adding an Overwrite Construct
    Ministg
    GHC
    Garbage Collection
5 Results
6 Conclusion
Chapter 1
Introduction
The Problem With Persistence
One striking feature of pure functional programming in languages such as Haskell is the lack of state. As all data structures are persistent, updating a value does not destroy it but instead creates a new copy. The advantages of this are well known[1][2], but so, conversely, are the disadvantages[3][5]. In particular, persistence can lead to excessive memory consumption when structures remain in memory long after they have ceased to be useful[6]. The reason Haskell does not allow state is to avoid side effects, and the reason side effects are avoided is that they can make understanding and reasoning about programs difficult. Indeed, from a theoretical point of view, side effects simply aren't required for computation. Yet undeniably, side effects are useful, particularly when implementing efficient data structures[4]. Whilst the lack of destructive update in Haskell is useful in accomplishing the goal of referential transparency, it is not strictly necessary. It is sometimes possible to allow destructive updates without introducing observable side effects.
Chapter 2
Uniqueness
Imagine a program that reads a list of integers from a file, sorts them and then continues to process the sorted list in some manner. In an imperative setting, we might expect this sorting to be done in-place, but in Haskell we must allocate the space for a new, sorted version of the list. However, if the original list is not referred to in the rest of the program, then any changes made to the data contained in the list will never be observed. Thus there is no need to maintain the original list. This means we could re-use the space occupied by the unsorted list, and since we know that sorting preserves length, we might begin to wonder if we can do the sorting in-place.

The reason we could not use destructive updates in the example above is that doing so may introduce side effects into our program. For instance, if we are able to sort a list in-place then the following code becomes problematic:

foo :: [a] -> ([a], [a])
foo xs = (xs, sort_in_place xs)

Does fst (foo [3, 2, 1]) refer to a sorted list or an unsorted list? With lazy evaluation we have no way of knowing. Notice however that modifying the original, unsorted list is only a problem if it is referred to again elsewhere in the program. If the list is not used anywhere else, then there can be no observable side effects of updating it in place, as any data that cannot be referenced again can have no semantic effect on the rest of the program. If this were the case then the compiler would be free to re-use the space previously taken up by the list, perhaps updating the data structure in-place, and referential transparency would not be broken. This condition, that there is only ever one reference to the list, is known as uniqueness: we say that the list is unique.

Consider an algorithm that inserts an element into a binary tree (fig 2.1). In an imperative language this would normally involve walking the tree until we find the correct place to insert the element and updating the node at that
(a) A binary tree. An element is to be inserted in the outlined position. (b) After insertion. A new tree has been created from the old one.
Figure 2.1: Inserting an element into a binary tree

position. However in a functional language, we must instead copy all the nodes above the one to be updated and create a new binary tree. If the original tree was unique, that is, the only reference to a was passed to the function that inserted m, then there will no longer be any references to a. Consequently, there will no longer be any references to nodes c or g either. All three nodes will be wasting space in memory. If a large number of nodes are inserted then it is possible that the space wasted will be many times greater than the space taken up by the tree! Clearly a lot of space can be wasted. In general it is not possible to predict when an object in a Haskell program will become garbage, so garbage collection must be a dynamic run-time process. Because garbage collection happens at run-time, there is a performance penalty associated with it. Indeed, whilst garbage collection can be very efficient when large amounts of memory are available[8], it can often take up non-trivial percentages of a program's execution time in memory-constrained environments. But when an object is known always to be unique, its lifetime can be determined statically and so the run-time cost of garbage collection can be avoided.
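The path copying just described can be sketched directly in Haskell (a sketch only; the Ord constraint and the toList helper are assumptions added for illustration, not part of the figure):

```haskell
data BinTree a = Empty | Node a (BinTree a) (BinTree a)

-- Persistent insert: every node on the path from the root to the
-- insertion point is copied; subtrees off the path are shared with
-- the old tree, which remains fully intact.
insert :: Ord a => a -> BinTree a -> BinTree a
insert x Empty = Node x Empty Empty
insert x t@(Node y l r)
  | x < y     = Node y (insert x l) r  -- copy this node, share r
  | x > y     = Node y l (insert x r)  -- copy this node, share l
  | otherwise = t

-- In-order traversal, used only to observe the trees.
toList :: BinTree a -> [a]
toList Empty        = []
toList (Node x l r) = toList l ++ [x] ++ toList r
```

If the old tree was unique, the copied path leaves the original nodes unreachable, which is exactly the wasted space described above.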
Chapter 3

Uniqueness in Type Systems
Clean
Clean[23] is a language very similar to Haskell that features a uniqueness type system based on linear logic. Clean allows users to specify particular variables as being unique. The type system exposed to the user is large, and often simple functions can have complex types. However, the de-facto implementation has proved to be very efficient. One particularly interesting feature of Clean is that the state of the world is explicit. Every Clean program passes around a unique object, the world. The world represents the state of the system and is explicitly threaded throughout the program; destructive updates to the world can thus be used to sequence IO operations. Unique objects cannot be duplicated, so no more than one world can exist at a time and hence there is no danger of referring to an old state by accident.
Monads
Haskell takes a different approach towards IO. Monads, as presented by Wadler and Peyton-Jones [27], can do much of the work of uniqueness typing by the use of encapsulation. Indeed, they are much simpler in terms of both syntax and type system. However, monads do not solve every problem as elegantly. Suppose we have a program that makes use of a binary tree:

data BinTree a = Empty | Node a (BinTree a) (BinTree a)

insert :: a -> BinTree a -> BinTree a
removeMin :: BinTree a -> (a, BinTree a)
isEmpty :: BinTree a -> Bool
If we want to allow the tree to be updated destructively we can employ the ST monad, replacing each branch by a mutable reference, an STRef. However, as STRefs require a state parameter, we must also add a type parameter to our binary trees.

data BinTree s a = Empty | Node a (STRef s (BinTree s a)) (STRef s (BinTree s a))

Unfortunately, none of the code we have written to work over binary trees will work any more! Not only are the type signatures incorrect, but the whole implementation must be re-written to work within the state monad.

insert :: a -> BinTree s a -> ST s (BinTree s a)
removeMin :: BinTree s a -> ST s (a, BinTree s a)
...
Monadic code can often differ significantly in style from idiomatic functional code, so this may end up affecting large portions of our code. This can clearly cause problems if we were trying to optimise a large program in which the binary tree implementation had been identified as a bottleneck.
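To make the cost of the rewrite concrete, here is a minimal sketch of the mutable insert (the Ord constraint and the toList helper are assumptions added for illustration):

```haskell
import Control.Monad (foldM)
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)

data BinTree s a = Empty | Node a (STRef s (BinTree s a)) (STRef s (BinTree s a))

-- Destructive insert: walk the tree and overwrite the STRef at the
-- insertion point, rather than copying the path from the root.
insert :: Ord a => a -> BinTree s a -> ST s (BinTree s a)
insert x Empty = Node x <$> newSTRef Empty <*> newSTRef Empty
insert x t@(Node y l r)
  | x < y     = descend l
  | x > y     = descend r
  | otherwise = pure t
  where
    descend ref = do
      child  <- readSTRef ref
      child' <- insert x child
      writeSTRef ref child'  -- the update happens in place
      pure t

-- Even a read-only traversal is forced into ST.
toList :: BinTree s a -> ST s [a]
toList Empty        = pure []
toList (Node x l r) = do
  ls <- toList =<< readSTRef l
  rs <- toList =<< readSTRef r
  pure (ls ++ [x] ++ rs)
```

Note how even the read-only toList must live in ST: this is the pervasiveness the text complains about.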
As an example, here is an implementation of quicksort in this theoretical language:

qsort (consumed xs :: [Int]) -> [Int] = {
  return sort (xs, Nil)
}
sort (consumed xs :: [Int], end :: [Int]) -> [Int] = {
  case xs of
    Nil -> return end;
    Cons (x, xs') -> {
      ys, zs := split (x, xs');
      zs' := sort (zs, end);
      return sort (ys, Cons (x, zs'));
    }
}
split (viewed p :: Int; consumed xs :: [Int]) -> ([Int], [Int]) = {
  case xs of
    Nil -> return ([], [])
    Cons (x, xs') -> {
      ys, zs := split (p, xs');
      if x > p then:
        return (ys, Cons (x, zs));
      else:
        return (Cons (x, ys), zs);
    }
}

In the body of sort, xs will be out of scope after the case expression, and after the line split (x, xs'), xs' will be out of scope but x will remain in scope, since split consumes its second argument but only views its first. These rules ensure that at any point in the program's execution, if x is consumable in the current environment then there is no more than a single reference to it. Conversely, if there is more than one reference to x then x must be immutable. A sufficiently smart compiler would be able to tell that in each case expression, the list under scrutiny is never referred to again, only its elements are. Thus, in the case that the list was a Cons cell, the cell can be re-used when a Cons cell is created later on. In this way, the function sort can avoid allocating any new cells and instead operate in-place.
map :: (a ⊸ b) -> [a] ⊸ [b]
map f [] = []
map f (x : xs) = f x : map f xs
-- map takes a unique list and updates it in place.
-- Notice the function f is not unique itself, as it is used twice
-- on the right hand side.

id :: x ⊸ x
id x = x

compose :: (b ⊸ c) -> (a ⊸ b) -> a ⊸ c
compose f g x = f (g x)

double1 :: a ⊸ (a, a)
double1 a = (a, a) -- error: unique variable a is used twice.

double2 :: a -> (a, a)
double2 a = (a, a)

apply1 :: (a -> b) -> a ⊸ b
apply1 f x = f x -- error: result of applying f to x will not be unique

apply2 :: (a ⊸ b) -> a -> b
apply2 f x = f x -- error: f expects a unique argument, x is not unique

twice :: (a ⊸ a) -> a ⊸ a
twice f = compose f f

fold :: (b ⊸ a ⊸ a) -> a ⊸ List b ⊸ a
fold f e [] = e
fold f e (x : xs) = fold f (f x e) xs

f1 :: a ⊸ (a -> b) -> b
f1 x g = g x

f2 :: a -> (a ⊸ b) -> b
f2 x g = g x -- error: g expects a unique argument, x is not unique

f3 :: a -> (a -> b) ⊸ b
f3 x g = g x -- error: the result of applying g to x will not be unique.

-- A unique variable may be passed to an argument expecting a
-- non-unique variable, but not the other way round.
-- Note that in f1, the type signature is implicitly bracketed like this:
--   f1 :: a ⊸ ((a -> b) -> b)
-- so the result of a partial application would be a function that is
-- itself unique.
Figure 3.1: Some examples of functions with possible type signatures and type errors.
Semantically, this can be viewed in terms of the system proposed by Hage and Holdermans, equivalent to:

f :: a¹ →¹ b¹
g :: aω →ω bω

Many functions can be converted to use this type system without needing to alter their definition at all. For instance, a function that reverses a list in-place can be constructed simply by altering the type signature of the standard Haskell function reverse.

reverse :: [a] ⊸ [a]
reverse = rev []
  where
    rev :: [a] ⊸ [a] ⊸ [a]
    rev xs [] = xs
    rev xs (y : ys) = rev (y : xs) ys

There is a significant drawback to this system in the fact that there is more than one possible way to assign a type to some fragments of code. If we want to use both in-place reverse and regular reverse, then we must create two separate functions that differ only by name and type signature. I have implemented a typechecker for this system over a subset of Haskell. Due to time constraints and the complexity of GHC's type system, resulting from the vast number of type system extensions already present, the new type system has not been integrated into GHC. Despite this, the backend mechanisms to allow closure-recycling are fully functional: the example above will compile and run, sorting the list in-place, although it will not be typechecked by GHC.
Chapter 4
Implementation
I have implemented the backend mechanisms for dealing with overwriting as an extension to the Glasgow Haskell Compiler. This section includes just enough detail about the inner workings of the compiler to explain this extension. There are several main stages in the compilation pipeline:

- The Front End contains the parser and the type checker.
- The Desugarer converts from the abstract syntax of Haskell into the tiny intermediate Core language.
- A set of Core-to-Core optimisations and other transformations.
- Translation into the STG language.
- Code generation.

This chapter deals with the details of the final two phases.
There are also several properties of STG code that are of interest:

- Every argument to a function or data constructor is a simple variable or constant. Operationally, this means that arguments to functions are prepared (either by evaluating them or constructing a closure) prior to the call.
- All constructors and built-in operations are saturated. This cannot be guaranteed for every function, since Haskell is a higher order language and the arity of functions is not necessarily known, but it simplifies the operational semantics. Functions of known arity can be eta-expanded to ensure saturation.
- Pattern matching and evaluation is only ever performed via case expressions, and each case expression matches one-level patterns.
- Each closure has an associated update flag. More is explained about these further down.
- Bindings in the STG language carry with them a list of free variables. This has no semantic effect but is useful for code generation.
prog    -> binds
binds   -> var1 = lf1; ...; varn = lfn            (n > 0)
lf      -> vars_f \pi vars_a -> expr              (pi = u: Updatable; pi = n: Not updatable)
expr    -> let binds in expr                      (Local definition)
         | letrec binds in expr                   (Local recursion)
         | case expr of alts                      (Case statements)
         | var atoms                              (Application)
         | constr atoms                           (Saturated constructor)
         | prim atoms                             (Saturated built-in op)
         | literal
alts    -> aalt1; ...; aaltn; default             (n > 0, Algebraic)
         | palt1; ...; paltn; default             (n > 0, Primitive)
aalt    -> constr vars -> expr                    (Algebraic alt)
palt    -> literal -> expr                        (Primitive alt)
default -> var -> expr                            (Default alt)
literal -> 0# | 1# | ...                          (Primitive integers)
prim    -> +# | -# | *# | /# | ...                (Primitive integer ops)
vars    -> {var1, ..., varn}                      (Variable lists)
atoms   -> {atom1, ..., atomn}                    (Atom lists)
atom    -> var | literal
The small-step rules (configurations are expression; stack; heap):

(LET)      let x = bind in e ; s ; H   =>   e[x'/x] ; s ; H[x' |-> bind]        (x' fresh)

(CASECON)  case v of alts ; s ; H[v |-> C a1 ... an]   =>   e[a1/x1 ... an/xn] ; s ; H
           where alts = {...; C x1 ... xn -> e; ...}

(CASEANY)  case v of alts ; s ; H   =>   e[v/x] ; s ; H
           where v is a literal or H[v] is a value, v does not match any other
           case alternative, and the default alternative is x -> e

(THUNK)    x ; s ; H[x |-> Thunk e]   =>   e ; Upd x : s ; H[x |-> BLACKHOLE]

(UPDATE)   y ; Upd x : s ; H[y |-> v]   =>   y ; s ; H[x |-> v]                 (v a value)
The first rule, LET, states that to evaluate a let-expression the heap H is extended to map a fresh variable to the right hand side bind of the expression. The fresh variable corresponds to allocating a new address in memory. After allocation, we enter the code for e with x' substituted for x.

Here is the Haskell code for the function reverse, taken from the standard prelude, and the corresponding STG code:

reverse = rev []
  where
    rev xs [] = xs
    rev xs (y : ys) = rev (y : xs) ys

reverse = {} \n {} -> rev {Nil}
rev = {} \n {xs ys} ->
  case ys of
    Nil {} -> xs
    Cons {z, zs} -> let rs = {z, xs} \n Cons {z, xs}
                    in rev {rs, zs}

which should be read in the following way: First bind reverse to a function closure whose code pushes the argument Nil onto the stack and then enters the code for rev. Then bind rev to a function closure that expects two arguments, xs and ys. The code for this closure should force evaluation of ys and examine the result: if it matches Nil, then return xs; if it matches Cons z zs, then allocate a Cons cell with arguments z and xs, load rs and zs onto the stack and enter the code for rev.

Update flags

One feature of lazy evaluation is that each closure should be replaced by its (head) normal form upon evaluation, so that the same closure is never evaluated more than once. The update flag attached to each closure specifies whether this update should take place. If the flag is set to u then the closure will be updated, and if it is set to n then no update will be performed. Naïvely, every flag can be set to u, but this is not always necessary. For instance, if a closure is already in head normal form, then updating is not required. Much more detail about this is given in Simon Peyton-Jones' paper Implementing functional languages on stock hardware: the Spineless Tagless G-Machine [26].
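The effect of an updatable (u) closure can be observed from ordinary Haskell with Debug.Trace (a sketch; the trace call stands in for arbitrary expensive work):

```haskell
import Debug.Trace (trace)

-- The thunk for shared is built once; on first demand the closure is
-- overwritten (updated) with the value 2, so the trace message fires
-- only once even though shared is demanded twice below.
shared :: Int
shared = trace "thunk entered" (1 + 1)

observe :: IO ()
observe = print (shared + shared)
```

Running observe prints "thunk entered" a single time before the result, evidence that the closure was updated in place after its first evaluation.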
Closure Representation
Every heap object in GHC is in one of three forms: a head normal form (a value), a thunk which represents an unevaluated expression, or an indirection to another object. A value can either be a function value or a data value formed by a saturated constructor application. The term closure is used to refer to any of these objects. A distinctive feature of GHC is that all closures are represented, and indeed handled, in a uniform way.
All closures are in this form, with a pointer to an info table, containing code and other details about the closure, and a list of the free variables that the closure needs access to. For example, a closure for a function application will store the code for the function in the info table and the arguments in the free variable list. When the closure is evaluated, the arguments can be reached via a known offset from the start of the closure. For a data constructor, the code will return to the continuation of the case statement that forced evaluation, providing the arguments of the constructor application. These arguments are, again, simply stored at an offset from the start of the closure.
Adding an Overwrite Construct

The STG language is extended with a new expression form:

expr -> ...
      | overwrite var with lf in expr             (Overwrite)

together with a corresponding evaluation rule:

(OVERWRITE)  overwrite x with bind in e ; s ; H   =>   e ; s ; H[x |-> bind]
Ministg
Ministg [29] is an interpreter for the STG language that implements the operational semantics as given above. It offers a good place to investigate the new semantics. Here is an outline of the relevant code: the code dealing with overwrite-expressions is largely similar to the code for let-expressions, and usually simpler. For instance, no fresh variable need be generated, unlike in the let-expression, and no substitution need be performed. Performing substitutions over overwrite-expressions is also simpler than over the corresponding let-expression, as there is no variable capture to be avoided. The final difference is in calculating free variables: the variable appearing on the left-hand-side of a let expression is not free, but it is free in an overwrite expression.
GHC
At an operational level, these are the only differences between let-expressions and overwrite-expressions. When it comes to implementing the STG language in GHC, however, there are a few more hurdles to overcome. Unsurprisingly, much of the code remains the same as for let-expressions, but the translation is not as direct as in the Ministg interpreter. Firstly, overwriting variables reacts badly with the generational garbage collector employed in GHC. More detail about this is provided in the next section. Secondly, whereas in the Ministg interpreter variable locations are stored in a data structure representing a finite mapping, in GHC variable locations are stored as pointers kept in registers or as offsets from the current closure. In the case that the location of a variable is stored at an offset from a closure that is to be overwritten, we must make sure to save this location before performing the update, otherwise the location will be lost and we will no longer be able to access the variable. In the example below, the addresses for x and xs will be located at an
smallStep :: Exp -> Stack -> Heap -> Eval (Maybe (Exp, Stack, Heap))

-- LET
smallStep (Let var object exp) stack heap = do
  newVar <- freshVar
  let newHeap = updateHeap newVar object heap
  let newExp  = subs (mkSub var (Variable newVar)) exp
  return $ Just (newExp, stack, newHeap)

-- OVERWRITE
smallStep (Overwrite var object exp) stack heap = do
  let newHeap = updateHeap var object heap
  return $ Just (exp, stack, newHeap)

-- CASECON
smallStep (Case (Atom (Variable v)) alts) stack heap
  | Con constructor args <- lookupHeap v heap,
    Just (vars, exp) <- exactPatternMatch constructor alts =
      return $ Just (subs (mkSubList (zip vars args)) exp, stack, heap)

-- CASEANY
smallStep (Case (Atom v) alts) stack heap
  | isLiteral v || isValue (lookupHeapAtom v heap),
    Just (x, exp) <- defaultPatternMatch alts =
      return $ Just (subs (mkSub x v) exp, stack, heap)

-- CASE
smallStep (Case exp alts) stack heap =
  return $ Just (exp, CaseCont alts : stack, heap)

-- RET
smallStep exp@(Atom atom) (CaseCont alts : stackRest) heap
  | isLiteral atom || isValue (lookupHeapAtom atom heap) =
      return $ Just (Case exp alts, stackRest, heap)

-- THUNK
smallStep (Atom (Variable x)) stack heap
  | Thunk exp <- lookupHeap x heap = do
      let newHeap = updateHeap x BlackHole heap
      return $ Just (exp, UpdateCont x : stack, newHeap)

-- UPDATE
smallStep atom@(Atom (Variable y)) (UpdateCont x : stackRest) heap
  | object <- lookupHeap y heap, isValue object =
      return $ Just (atom, stackRest, updateHeap x object heap)

Figure 4.4: Outline of the Ministg implementation of the evaluation rules given in figure 4.2, plus the new overwrite expression.
offset from the closure for ls. When that closure is overwritten, we lose these addresses, so we must take care to save them in temporary variables first.

...
case ls of
  Cons x xs ->
    ...
    overwrite ls with Cons y ys
    in ... x ... xs ...
Figure 4.5: Overwriting a Cons cell. Any references that pointed to xs in (a) will point to ys after the update (b) and similarly for x and y.
Let us now consider another example, map. Intuitively, map seems like a good candidate for in-place updates: we scan across the list, updating each element with a function application. But there is a problem. Looking at the code for map and the corresponding STG binding, we see that map does not allocate any Cons cells! At least not directly:

map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x : xs) = f x : map f xs

map = {} \n {f, xs} ->
  case xs of
    Nil {} -> Nil
    Cons {y, ys} -> let fy = {f, y} \u f {y}
                    in let mfys = {f, ys} \u map {f, ys}
                    in Cons {fy, mfys}

The two closures allocated in the body of map are both thunks allocated on the heap, whereas the Cons cell is placed on the return stack. In a strict language, the recursive call to map would allocate the rest of the list, but in a lazy language a thunk representing the suspended computation is allocated instead. This thunk will later be updated with its normal form (either a Cons cell or Nil) if examined in a case statement. In general, the updatee's closure and the thunk will not be the same size, so we cannot blindly overwrite the former with the latter. One can imagine a mechanism whereby, upon seeing a unique value in a case statement, the code generator searches the rest of the code for the closure that fits best. If a closure of the same type is built, then we select that. Otherwise we try to reuse as much space as possible by selecting the largest closure that will take up no more space than the closure we wish to overwrite. There is also the possibility of reusing the thunk allocated for the recursive call itself, since once evaluated, it is no longer needed. I have not been able to try implementing this feature, but it would be an interesting improvement to make.

There is one more optimisation that could potentially be included. When a variable x known to be unique goes out of scope, we know that it has become garbage, whether or not x appears in a case statement. The compiler would then be free to overwrite x with a new variable y without making any assumptions about the uniqueness of y. There is some difficulty here, as we do not know if x refers to a value or an unevaluated thunk. If x has not been evaluated then in general we can infer nothing about the size of the thunk it refers to, as it may have been formed from an arbitrary expression.
Garbage Collection
GHC uses an n-stage generational garbage collector. A copying collector partitions the heap into two spaces: the from-space and the to-space. Initially objects are only allocated in the from-space. Once this space is full, the live objects in the from-space are copied into the to-space. Live objects are objects that are reachable from the current code. Any unreachable object (garbage) is never copied, so it will not take up space in the new heap area, and the new heap will be smaller than the old heap (provided there was unreachable data in the heap). Now the from-space becomes the to-space and vice-versa, and the program continues to run. If no space was reclaimed, then the size of the two spaces must be increased, if this is possible. This can be generalised to more than two spaces, so that there are many heap-spaces, any one of which may be acting as the to-space at a given time. This process clearly cannot be employed in a language that allows pointer arithmetic, for example, since closures are frequently being relocated in memory and pointers would be left dangling, or pointing to nonsense. But are things any better in a functional language? Ignoring for the moment lazy evaluation and overwriting, Haskell has the property that any new data value will only point to old data, never the other way round, since values are immutable. This means the references in memory form a directed acyclic graph, with older values at the leaves and newer values nearer the root. The idea behind generations is that since structures are immutable, old structures don't usually point to structures created more recently. Because of this it is possible to partition the heap into generations, where old generations do not reference new generations. In this way, the garbage collector can re-arrange the new generations without affecting the old generations.
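The from-space/to-space copy described above can be sketched over a toy heap (all names here are illustrative assumptions; a real collector works over raw memory with forwarding pointers, not a Data.Map):

```haskell
import qualified Data.Map as M

type Addr = Int

-- A toy closure: a payload plus the addresses it points to.
data Cell = Cell String [Addr] deriving (Eq, Show)

-- Copy everything reachable from the roots into an empty to-space.
-- Unreachable cells are simply never copied, so garbage left in the
-- from-space disappears without ever being visited.
copyCollect :: [Addr] -> M.Map Addr Cell -> M.Map Addr Cell
copyCollect roots fromSpace = go roots M.empty
  where
    go []       toSpace = toSpace
    go (a : as) toSpace
      | a `M.member` toSpace = go as toSpace  -- already copied
      | otherwise = case M.lookup a fromSpace of
          Nothing              -> go as toSpace
          Just c@(Cell _ kids) -> go (kids ++ as) (M.insert a c toSpace)
```

A real two-space collector also relocates cells to fresh addresses and leaves forwarding pointers behind so sharing (and cycles) are preserved; the membership check here plays that role in miniature.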
It has been observed that in functional programming, old data tends to stay around for much longer than new data [reference], so most unreachable data is newly created. This means that a large proportion of the garbage to be collected usually lies in the youngest generation, so by collecting this we can reclaim decent amounts of space without having to traverse the whole heap. Occasionally however, garbage collecting a young generation will not free up enough space, in which case older generations must also be collected. A generational collector will also use a method to age objects from younger generations into older generations, if they have been around long enough. The usual way of doing this is by recording how many collections an object survives in a particular generation. Once this number exceeds some threshold, the object is moved up into the older generation. By default, GHC uses two generations. This scheme leads to frequent, small collections with occasional, much larger collections of the entire heap. Up until this point, we have been considering only garbage collection in directed acyclic graphs. Things become much less neat when we allow closures to be overwritten, as a closure in an old generation may well be
(a) A block of memory split into two generations. The grey blocks are garbage.
Figure 4.6: Generational garbage collection

updated to reference a newer closure in a younger generation. When a garbage collection takes place, the younger closure will be moved to a new location and the reference inside the older closure will no longer point to the correct location. It is worth noting that this can happen even without the overwrite construct, owing to lazy evaluation. For example, the following code can be used to create a cyclic list:

cyclic :: [a] -> [a]
cyclic xs = rs
  where
    rs = link xs rs
    link []       ys = ys
    link (x : xs) ys = x : link xs ys

To work around this, for each generation GHC keeps track of the set of closures that contain pointers to younger generations, called the remembered set. During a garbage collection, the pointers to younger generations are maintained so as to keep pointing to the correct locations. This remembered set must be updated whenever a closure is overwritten; this is known as the write-barrier. This means that whenever a closure is overwritten, we must check for any old-to-new pointers being created, by considering the generation
of the closure being overwritten. Unfortunately, this does incur a significant performance penalty for the overwrite expression. This will be discussed later on in more detail.
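The check can be sketched as a small pure function (a toy model; the generation numbering, the genOf oracle and the set representation are assumptions for illustration, with higher numbers meaning older generations):

```haskell
import qualified Data.Set as S

type Addr = Int
type Gen  = Int  -- 0 is the youngest generation; larger is older

-- Write barrier: overwriting the closure at address a so that it now
-- points at address b creates an old-to-new pointer whenever a lives
-- in an older generation than b. Such closures must be remembered so
-- a minor collection can fix the pointer up after moving b.
writeBarrier :: (Addr -> Gen) -> Addr -> Addr -> S.Set Addr -> S.Set Addr
writeBarrier genOf a b remembered
  | genOf a > genOf b = S.insert a remembered
  | otherwise         = remembered
```

This per-overwrite check, trivial as it looks, is the overhead that largely cancels the garbage collection savings reported in the next chapter.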
Chapter 5
Results
Due to the write-barrier overhead, performance gains for this new optimisation are slight. In a benchmark program that sorts a large list of integers, we find that although the time spent in garbage collection drops by around 10%, the extra cost of the write-barrier and saving free variables almost exactly counteracts the benefits. By comparison, if we restrict GHC to use a single-space copying collector, thus avoiding the problems associated with older generations, we see a much bigger improvement from in-place updates. However, the overall execution time is worse than when using the generational collector, so there is little point in doing so. For larger, more realistic programs the gain is usually even smaller. Typically, there is little difference between an optimised program and one without closure-overwriting. It is interesting to note, though, that in no case has it been observed that the extension causes a program to run noticeably slower. However, this is only the case in conditions where the run time system is allowed access to much more heap space than is needed. When the amount of heap space available is restricted to be close to the amount of live data, very different results can be seen. Fig 5.2 shows how the performance of the sorting program varies with the size of the heap. For small heap sizes, including destructive update makes a big difference to the speed of the program. Without destructive update, reducing the size of the heap dramatically increases the amount of time spent garbage collecting. For a heap size of 8MB, garbage collection accounts for approximately 50% of execution time, and for a heap size of 2MB this increases to around 75%. By contrast, with destructive overwriting turned on, reducing the size of the heap has little effect on the program. Indeed, the program actually runs slightly faster with a smaller heap! This may be due to improved data locality in a smaller heap and fewer cache misses.
27
Options: none

                        Time(s)   %GC
With optimisation       6.43      46%
Without optimisation    6.41      57%
Figure 5.1: Results of running a sorting algorithm with various options affecting the run time system. The code under analysis here is exactly the quicksort example given earlier, used to sort a list of 20000 integers, taking the minimum time over three runs.
Figure 5.2: Performance of the sorting program as the available heap size varies.
Chapter 6
Conclusion
As the run time system of GHC has been highly optimised for persistent data structures, overwriting closures provides little benefit under typical conditions. Despite this, the technique appears promising for environments where a large amount of excess heap space is not available. A number of possibilities for further optimisation remain open that may improve the impact of this technique. It is likely that being more aggressive in deciding which closures can be overwritten would lead to better results. In particular, allowing the closures allocated for function calls to be updated is likely to be useful for optimising recursive functions that are not tail recursive.
Bibliography
[1] P. Hudak. Conception, evolution, and application of functional programming languages. ACM Computing Surveys, 1989.
[2] J. Hughes. Why Functional Programming Matters.
[3] David B. MacQueen. Reflections on Standard ML. Lecture Notes in Computer Science, Volume 693/1993, pages 32-46.
[4] Sylvain Conchon, Jean-Christophe Filliâtre. A Persistent Union-Find Data Structure.
[5] P. Wadler. Functional Programming: Why no one uses functional languages.
[6] Niklas Röjemo, Colin Runciman. Lag, Drag and Void: heap-profiling and space-efficient compilation revisited. Department of Computer Science, University of York.
[7] David Wakeling, Colin Runciman. Linearity and Laziness.
[8] Andrew W. Appel. Garbage Collection Can Be Faster Than Stack Allocation. Department of Computer Science, Princeton University.
[9] Philip Wadler. The marriage of effects and monads. Bell Laboratories.
[10] Philip Wadler. Is there a use for linear logic? Bell Laboratories.
[11] Philip Wadler. A taste of linear logic. Bell Laboratories.
[12] Philip Wadler. Comprehending Monads. University of Glasgow.
[13] David N. Turner, Philip Wadler. Operational Interpretations of Linear Logic.
[14] Simon Peyton-Jones. Implementing functional languages on stock hardware: The Spineless Tagless G-machine, version 2.5. University of Glasgow.
[15] Simon Peyton-Jones. Making a Fast Curry: Push/Enter vs Eval/Apply for Higher-order Languages.
[16] Antony L. Hosking. Memory Management for Persistence. University of Massachusetts.
[17] Henry G. Baker. Lively Linear Lisp: 'Look Ma, No Garbage!'.
[18] Edsko de Vries, Rinus Plasmeijer, David M. Abrahamson. Uniqueness Typing Redefined.
[19] Jurriaan Hage, Stefan Holdermans. Heap Recycling for Lazy Languages. Department of Information and Computing Sciences, Utrecht University.
[20] Jon Mountjoy. The Spineless Tagless G-machine, naturally. Department of Computer Science, University of Amsterdam.
[21] François Pottier. Wandering through linear types, capabilities, and regions.
[22] A.M. Cheadle, A.J. Field, S. Marlow, S.L. Peyton Jones, R.L. While. Exploring the Barrier to Entry: Incremental Generational Garbage Collection for Haskell.
[23] Nöcker E.G.J.M.H., Smetsers J.E.W., van Eekelen M.C.J.D., Plasmeijer M.J. Concurrent Clean.
[24] Simon Peyton-Jones, Simon Marlow. The STG Runtime System (revised).
[25] Simon Peyton-Jones, Simon Marlow. The New GHC/Hugs Runtime System.
[26] Simon Peyton-Jones. Implementing Functional Languages on Stock Hardware: the Spineless Tagless G-Machine.
[27] Simon Peyton-Jones, Philip Wadler. Imperative Functional Programming.
[28] J. Launchbury, S. Peyton-Jones. State in Haskell. Lisp and Symbolic Computation, volume 8, pages 293-342.
[29] The Mini STG Language: http://www.haskell.org/haskellwiki/Ministg