
Some notes on λ-calculus

Sergei Winitzki
January 12, 2014

Abstract
These notes document some thoughts that occurred to Sergei while
attending the λ-calculus meetups presented by Tikhon Jelvis and reading
various books and articles. Use at your own risk.

1 Untyped λ-calculus
1.1 Motivation
Suppose we want to design a programming language for doing some compu-
tations, and we want this language to be as radically simple and bare-bones
as possible. We would like to introduce as few constructions as possible, but
nevertheless have the full power of an ordinary programming language.
The λ-calculus is one approach to solving this problem. In this approach,
we imagine that we are designing a functional programming language, that
is, we are going to write some functions and apply them to expressions in order
to compute some result values. So our programming language needs to have,
at least, two constructions: a construction for making a new function, and a
construction for applying a function to some arguments.
The mathematical notation for functions, f (x, y), is adopted in ALGOL-
family languages but may be confusing in many ways. Other programming
languages adopt other conventions, such as (f x) of LISP. In the languages of
the ML family, the notation can be described as "LISP without parentheses",
that is, just f x. In these languages, parentheses may be used as in mathematics,
to surround any part of an expression without changing the meaning of the
expression. So, instead of f x we might write (f x), or f (x), or (f ) x, or even
((f ) (x)).
It seems clear that our programming language will need a notation for the
arguments of functions, such as x in f (x), and for function names such as f
in f (x). It turns out that it is sufficient to have just one kind of names (both
for arguments and for functions).
The programming language that has only these constructions is the untyped
λ-calculus.

1.2 Definition
Here is a more formal definition. The λ-calculus has only one type of value,
called the λ-term. A λ-term can be either a variable (denoted by an identifier,
such as x) or a compound expression. We will assume that we can use infinitely
many different names for variables, such as x, y, z, a1, a2, and so on. There
are only two constructions that make compound expressions: λ-abstraction
(making a new function) and application (applying a function to an argument).

If x is a variable and e1 is any λ-term, then λx.e1 is also a λ-term, called
the λ-abstraction of e1 with respect to the variable x. The variable x is
called bound by this λ-abstraction.
If e1 and e2 are two λ-terms, then e1 e2 is also a λ-term and is called an
application of e1 to e2.
There is only one rule for computing the values of functions:
A λ-term can be reduced (i.e. simplified or computed) only if it is
an application, i.e. only if it has the form e1 e2, and only if e1 is a λ-
abstraction, i.e. if e1 has the form λx.e for some e. All other λ-terms
cannot be reduced. The result of computing (λx.e) e2 is the λ-term e in
which all occurrences of x are replaced by e2.
The simplest λ-term is a variable such as x. This λ-term cannot be reduced any
further. We can now build compound expressions out of variables. There are
only two ways of doing this: abstraction and application. So, for example, we
can make the following λ-terms out of two variables: x x; z y; λx.y; λz.z; and
so on. Since these expressions are valid λ-terms, we can build larger λ-terms,
for example: λx.(x x); λx.(λy.y); (z y) x; (λz.z) (λx.x); and so on.
If we want to build intuition for the meaning of these formal definitions, we
will need to go through a few examples and exercises. It might be helpful to fall
back to another, familiar programming language where λ-abstractions are
available (such as Haskell, ML/OCaml/F#, Scala, LISP, or even JavaScript,
Python, Ruby). All these languages have a subset that approximately corresponds
to λ-calculus.
λ-calculus    Haskell      Scala       JavaScript
λx.e          \x->e        x => e      function(x){return e;}
f x y         f x y        f(x,y)      f(x,y)
λx.λy.x       \x y -> x    (x,y)=>x    function(x,y){return x;}
(λx.x) a      (\x->x) a    (x=>x)(a)   (function(x){return x;})(a)
An example of a computation: We make a function λx.(x x) and apply it
to the expression λy.y:

(λx.(x x)) (λy.y) = (λy.y) (λy.y) = λy.y
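This computation can be replayed mechanically. Below is a minimal sketch in Haskell of λ-terms as a data type with a substitution-based reducer (the names T, sub, step, eval are my own; the substitution here is deliberately naive and ignores the variable-capture issue discussed in section 1.4, which is fine for this closed example):

```haskell
-- A λ-term is a variable, an abstraction, or an application.
data T = V String | L String T | A T T deriving (Eq, Show)

-- Naive textual substitution: sub x e t replaces x by e in t.
sub :: String -> T -> T -> T
sub x e (V y)   = if y == x then e else V y
sub x e (L y b) = if y == x then L y b else L y (sub x e b)
sub x e (A a b) = A (sub x e a) (sub x e b)

-- One reduction step: rewrite (λx.b) a to b with x replaced by a.
step :: T -> Maybe T
step (A (L x b) a) = Just (sub x a b)
step (A a b)       = case step a of
                       Just a' -> Just (A a' b)
                       Nothing -> fmap (A a) (step b)
step (L x b)       = fmap (L x) (step b)
step _             = Nothing

-- Reduce until no redex remains.
eval :: T -> T
eval t = maybe t eval (step t)

-- (λx.(x x)) (λy.y)
example :: T
example = A (L "x" (A (V "x") (V "x"))) (L "y" (V "y"))
```

Here eval example yields L "y" (V "y"), matching the hand computation above.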

1.3 Lots of Irritating Silly Parentheses
There is no restriction on how many or what kind of -terms we may combine
when we make compound expressions out of them. For example,

(λx.(((x x) x) x)) (λx.(((x x) x) x))

is a valid λ-term. Note how we have to use lots of parentheses to separate
clearly some parts of λ-terms that we are using to build larger λ-terms. To make
the notation simpler to read, we would like to reduce the number of required
parentheses.
Usually one adopts the following two conventions for the scope of parenthe-
ses:

Every application starts an opening parenthesis that closes right away
after the first argument. So we can write

a b c d

instead of
((a b) c) d.

Every λ-abstraction starts an opening parenthesis that goes as far to the
right as possible, until a closing parenthesis is encountered. So we can
write

λx.(λy.λz.x y z) x

instead of

λx.((λy.(λz.(x y z))) x).

Note that functions of many arguments are simulated through functions that
return functions.
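These two conventions are exactly the conventions of Haskell and ML. A tiny Haskell check (the function names applyAll and addAll are my own):

```haskell
-- Application associates to the left: f 1 2 3 parses as ((f 1) 2) 3.
applyAll :: (Int -> Int -> Int -> Int) -> Int
applyAll f = f 1 2 3

-- A lambda body extends as far right as possible, so
-- \x -> \y -> \z -> x + y + z parses as \x -> (\y -> (\z -> x + y + z)):
-- a three-argument function is a function returning a function returning a function.
addAll :: Int -> Int -> Int -> Int
addAll = \x -> \y -> \z -> x + y + z
```

So applyAll addAll evaluates to 6, and no parentheses are needed in either definition.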

1.4 Scoping rules and variable names


When we compute (λx.x) y, we obtain y as the answer. The same answer will
be obtained if we compute (λa.a) y or (λq.q) y. Our method of computing is to
proceed purely by substitution, so the name of the argument under λ can be
changed to any other name.
We would certainly like to have this property hold as a general rule, the free
renaming rule: a variable bound by λ can be renamed at will.
For example, in every computation, λx.x is always equivalent to λy.y, and
the name x can be changed to any other name.
There is a price to pay for this freedom: for instance, if we are allowed
to rename x to anything else in (λx.x) y, then we are also allowed to write the
expression (λy.y) y. We would like (λy.y) y to be the same as (λx.x) y! So we say
that the variable y inside the λ is local to the scope of the λ-abstraction
and is not the same as the y outside the scope of λ.

Clearly, it makes sense to introduce the local scoping rule: any variable,
such as y, once bound by λy.(...), will be considered a new local variable within
the scope of that λ and so will be a different variable than any other y used
outside that scope.
This rule is applied recursively, i.e. to each λ inside every scope.

Example: What does

λx.λx.x (λx.x) x

mean? To disentangle the variable names, we need to rename variables, and we
need to do this consistently in various local scopes. We start with the innermost
scope where we find (λx.x); let's rename this to (λz.z). So we get

λx.λx.x (λz.z) x.

The next scope looks like this, λx.x (...) x, so let's rename this x to y. We get

λx.λy.y (λz.z) y.

So now it is clearer what this expression does.


The local scoping rule naturally gives rise to a distinction between variables
that are bound in a given scope, and variables that are not bound. For example,
if we have a large -term and find that some part of it looks like this,

λx.z x,

then obviously we are looking at a local scope where x is bound by λ, while z
is not bound. We say that z is free in this scope.
The local scoping rule logically forces us to consider carefully what happens
when we compute -terms containing names that are already bound in the ex-
ternal scope. If our computation rule is really just a simple textual substitution,
it will lead us into trouble! Suppose we have a λ-term that contains a bound
variable x, and we substitute into it another λ-term that also contains an x that
is free (or bound in another scope, but we see it as free here):

(λy.λx.x y) (λz.x).

Now if we compute this expression by substituting λz.x instead of y, we find

(λy.λx.x y) (λz.x) = λx.x (λz.x)

the x has become bound! This is inconsistent with the free renaming rule,
because we can first rename x to t in the original expression and compute the
expression again:

(λy.λx.x y) (λz.x) = (λy.λt.t y) (λz.x) = λt.t (λz.x).

In this last expression, x remains free. So it appears that the same initial
expression can be computed in two different ways and gives two different results.

It is certainly vital that our computations should be always consistent regardless
of the names of the variables! Therefore we need to modify the substitution
procedure for evaluating (λx.e) e′. Our original procedure was simply to
substitute e′ instead of x every time it occurs in e; that is, we were just doing a
textual substitution. This is insufficient: we first have to check that there are no
name clashes, i.e. that no variable that is free in e′ is bound in e. If there is a
name clash, the bound variables in e should be renamed so that none of them
has the same name as the free variables in e′. Only then is it safe to substitute
e′ textually into e for each occurrence of x.
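This check-then-rename procedure can be written down directly. Here is a small Haskell sketch of capture-avoiding substitution (the type Term and the helper names freeVars, fresh, subst are my own):

```haskell
-- λ-terms: variables, abstractions, applications.
data Term = Var String | Lam String Term | App Term Term
  deriving (Eq, Show)

-- The variables occurring free in a term.
freeVars :: Term -> [String]
freeVars (Var x)   = [x]
freeVars (Lam x e) = filter (/= x) (freeVars e)
freeVars (App a b) = freeVars a ++ freeVars b

-- A variant of x that does not occur in the given list of used names.
fresh :: [String] -> String -> String
fresh used x =
  head [v | n <- [1 :: Int ..], let v = x ++ show n, v `notElem` used]

-- subst x e' e computes e with x replaced by e', renaming bound
-- variables of e when they clash with the free variables of e'.
subst :: String -> Term -> Term -> Term
subst x e' (Var y)
  | y == x    = e'
  | otherwise = Var y
subst x e' (App a b) = App (subst x e' a) (subst x e' b)
subst x e' (Lam y body)
  | y == x    = Lam y body                 -- x is shadowed here; stop
  | y `elem` freeVars e' =                 -- name clash: rename y first
      let y' = fresh (freeVars e' ++ freeVars body) y
      in Lam y' (subst x e' (subst y (Var y') body))
  | otherwise = Lam y (subst x e' body)
```

On the problematic example above, subst "y" (Lam "z" (Var "x")) (Lam "x" (App (Var "x") (Var "y"))) renames the bound x before substituting, so the free x of λz.x is not captured.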

2 Data types and structures


The untyped λ-calculus seems to be a rather barren programming language. It
does not have any integers, Booleans, arrays, or anything like that. Nevertheless,
it turns out to be as powerful (in terms of computations in principle) as any
other programming language. How is this possible?
The only kind of values we have seen so far were λ-terms such as λx.x or
λx.λy.x. These λ-terms cannot be simplified any more and can be regarded as
constant values. So, any computation we perform in λ-calculus will have such
terms as the initial data and as the final answer. It is in this way that we can
build up a programming environment that is equivalent in power to ordinary
programming languages.

2.1 Booleans
Define

T = λx.λy.x, F = λx.λy.y, IF = λp.λx.λy.p x y

and consider what these λ-terms would do if used together. It is easy to see
that

IF T x y = x, IF F x y = y.

So the λ-terms T, F, IF behave like the values "true", "false", and the "if"
operation.
We can define λ-terms that simulate the operations of Boolean algebra. For
example,

AND = λp.λq.IF p (IF q T F) F, OR = λp.λq.IF p T (IF q T F).

It is important to note that the terms we defined as T and F do not have
any special meaning in λ-calculus. The λ-calculus does not know that we use
them to represent Boolean values. It does not make sense to apply a Boolean
value as a function to another Boolean value, but we can write λ-terms such
as T F and compute them, because λ-calculus allows any two λ-terms to be
applied to each other:

T F = (λx.λy.x) (λx.λy.y) = λy.λx.λz.z.

The result is meaningless: it is neither T nor F.
So, programming with Booleans in the pure λ-calculus is error-prone.
Nothing stops the programmer from applying the IF operation, by mistake,
to a term which is neither T nor F. The result will be, most probably, useless,
but λ-calculus has no means of preventing us from applying IF to a non-Boolean
term.
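The Boolean encoding can be checked in Haskell, where the same λ-terms typecheck and run (the names true', false', ifT, andT, toBool are my own):

```haskell
-- T = λx.λy.x selects its first argument; F = λx.λy.y its second.
true' :: a -> b -> a
true' x _ = x

false' :: a -> b -> b
false' _ y = y

-- IF = λp.λx.λy.p x y applies the "Boolean" p to the two branches.
ifT :: (a -> b -> c) -> a -> b -> c
ifT p x y = p x y

-- AND = λp.λq.IF p (IF q T F) F
andT p q = ifT p (ifT q true' false') false'

-- Decode a Church Boolean into Haskell's Bool for inspection.
toBool :: (Bool -> Bool -> Bool) -> Bool
toBool c = c True False
```

For example, toBool (andT true' false') evaluates to False, just as the hand computation of AND T F yields F.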

2.2 Natural numbers (Church encoding)


Natural numbers are simulated through the following λ-terms:

Zero = λf.λx.x, One = λf.λx.f x, Two = λf.λx.f (f x), ...

(Note that Zero applies f zero times, so its body is just x.) The integer n is
represented by a function that applies a given function f to a given argument
n times. For example,

Two f x = f (f x).

To add integers n1 and n2 , we need to create a new integer that will apply
f to a given argument (n1 + n2 ) times.

Add = λn1.λn2.λf.λx.n1 f (n2 f x).

It is quite difficult to define Subtract in this way. Several textbooks on
λ-calculus explain the devious tricks needed to define Subtract.
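Here is the numeral encoding in Haskell, with a decoder for inspection (the names zero, suc, addC, toInt are my own):

```haskell
-- A Church numeral applies f to x exactly n times.
zero :: (a -> a) -> a -> a
zero _ x = x

-- Successor: apply f once more.
suc :: ((a -> a) -> a -> a) -> (a -> a) -> a -> a
suc n f x = f (n f x)

-- Add = λn1.λn2.λf.λx.n1 f (n2 f x): apply f n2 times, then n1 more times.
addC :: ((a -> a) -> a -> a) -> ((a -> a) -> a -> a) -> (a -> a) -> a -> a
addC m n f x = m f (n f x)

-- Decode a numeral by applying (+ 1) to 0.
toInt :: ((Int -> Int) -> Int -> Int) -> Int
toInt n = n (+ 1) 0

two :: (a -> a) -> a -> a
two = suc (suc zero)
```

For example, toInt (addC two two) evaluates to 4.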

2.3 Product (pair)


Given two λ-terms a and b, we would like to produce a λ-term that represents
the pair (a, b). We need a constructor, i.e. a λ-term Pair such that Pair a b
makes the pair, and two accessors, i.e. two λ-terms P1 and P2 that extract the
first and the second value from a given pair.
A possible encoding is this:

Pair = λa.λb.λf.f a b, P1 = λp.p T, P2 = λp.p F,

where T and F are the Booleans defined above.


Note that the constructor Pair a b returns an unevaluated λ-abstraction,
λf.f a b, that holds the given values a and b and waits for the function f to be
given. The function f must have two arguments. When we give the function f
that selects its first argument, we get the first term from the pair.
Again, it is clear that this works only as long as the programmer keeps the
conventions. Nothing prevents the programmer from writing a λ-term that, by
mistake, applies a pair to a function f that does not accept two arguments.
There will be no error message. The result will certainly be some λ-term, and
most probably a useless one.
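The pair encoding also runs in Haskell; the explicit polymorphic signature on the example pair lets us use both accessors on the same pair, a point that becomes the crux of section 3.5.2 (the names pairC, fstC, sndC, example are my own):

```haskell
-- Pair = λa.λb.λf.f a b: the pair holds a and b, waiting for f.
pairC :: a -> b -> (a -> b -> c) -> c
pairC a b f = f a b

-- P1 = λp.p T and P2 = λp.p F, with the selectors T and F inlined.
fstC :: ((a -> b -> a) -> c) -> c
fstC p = p (\x _ -> x)

sndC :: ((a -> b -> b) -> c) -> c
sndC p = p (\_ y -> y)

-- Keeping the result type c polymorphic is what allows both accessors
-- to be applied to one and the same pair.
example :: (Int -> Bool -> c) -> c
example = pairC 1 True
```

Then fstC example is 1 and sndC example is True.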

2.4 Disjoint union
We would like to have a λ-term that represents a choice between two possible
λ-terms. So we need two constructors and one accessor; the constructors Left
and Right will make the union, and the accessor Case will make the choice.
A possible encoding is this:

Left = λx.λf.λg.f x, Right = λx.λf.λg.g x, Case = λc.λf.λg.c f g.

For example, the constructor Left a returns the λ-term λf.λg.f a that waits
for two functions, holding the value a. The Case function takes a choice term
and two functions, f and g. The first function will be applied to a, the second
ignored. This is quite similar to what happens in the usual "case" construct.
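The same encoding in Haskell (the names leftC, rightC, caseC are my own):

```haskell
-- Left = λx.λf.λg.f x and Right = λx.λf.λg.g x: each constructor
-- holds its value and waits for the two branch functions.
leftC :: a -> (a -> c) -> (b -> c) -> c
leftC x f _ = f x

rightC :: b -> (a -> c) -> (b -> c) -> c
rightC y _ g = g y

-- Case = λc.λf.λg.c f g simply hands the branches to the choice term.
caseC :: ((a -> c) -> (b -> c) -> c) -> (a -> c) -> (b -> c) -> c
caseC c f g = c f g
```

For example, caseC (leftC 2) (+ 1) (\b -> if b then 0 else 9) applies the first branch and yields 3, while the same caseC on rightC True applies the second branch.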

3 Simply typed λ-calculus (à la Church)

The λ-calculus just introduced does not have a notion of type. Any λ-term can
be applied to any other λ-term. In any reasonable programming language, this
would be strange: functions usually expect only certain types of arguments.
For example, consider a function that subtracts 8 from its argument (Haskell:
\x -> x-8). This function expects, at least, that x is a number; no reasonable
computation can be performed if x were a non-numeric value (say, if x were a
list of strings). We might say that this function expects an argument of integer
type. So the type of a function's argument can be understood as the set of all
possible values that the function accepts.
Now, we would like to introduce this feature into λ-calculus.

3.1 Syntax
We introduce a new construction into the language: a λ-term may have a type
annotation. We may write an annotated term as e^σ, where e is a λ-term and
σ is the name of a type. For simplicity, let us assume that we have types such
as void, bool, int. Then an annotated term could look like x^int.
There is a special notation for the type of functions: it looks like σ→σ′,
where σ and σ′ are types, for example void→void or int→(bool→int).
By convention, we write α→β→γ instead of α→(β→γ).
Examples of type annotations are

x^int; (λy^void.y)^(void→void).

We now require that every variable bound by a λ should have a type annotation.
So we do not allow expressions like λx.y anymore because here the bound
x is not annotated. Instead we must always write something like λx^void.y or
λx^bool.y x, etc.
This is actually quite a nuisance because in λ-calculus we need to write a
huge number of type annotations! So there is an alternative notation where the
types are implicit (the Curry notation). But let's keep going with the Church
notation for now.

3.2 Primitive types and constants
The only types we can write so far are either primitive types (void, bool, int)
or function types (void→bool etc.).
What exactly is the void type? It is a type that has only one value, the
empty value written as ().
The name void is something I prefer; this type is called unit in ML-family
languages and None in Python, for instance. The choice of the word void is
motivated by the C language where, when a function does not require any
arguments, we write this function as if it accepts an argument of type void, and
we invoke this function by writing () after the name of the function.
The value () is a constant of type void. This means that we do not have
to annotate this symbol; the programming language already knows that the
symbol () always has type void.
Similarly, we might introduce constants of other types. For example, we
might introduce true and false as constants of type bool, and the symbols
0, 1, 2, etc. as constants of type int. These symbols are, technically speaking,
a different kind of λ-term in λ-calculus. They are not compound expressions
(neither abstractions nor applications); they cannot be reduced or applied to a
value; but other λ-terms can be applied to them. The constants are also not
variables, because we are not allowed to use these symbols as bound variables in
λ-terms: we cannot write λ().x or λ1.2.
So we are forced to introduce constants as a separate kind of λ-term.

3.3 Type checking


The point of introducing types is to make sure that every function is applied to
arguments of the type that the function expects. So now the typed λ-calculus
introduces a rule: a function can be applied to a λ-term only if the λ-term
has the type that equals the argument type of the function. For example,
λx^int.λy^void.x can be applied only to some term of type int. We cannot
write

(λx^int.λy^void.x) (λz^void.z),

because the function λx^int.λy^void.x expects an argument of type int, but the
term λz^void.z has the type void→void.
Note here that all functions are considered to take only a single argument,
so we only need to check the type of that single argument for each λ.
To check types in this way throughout some long expression, we must be
able to figure out the type for all λ-terms that occur in that expression. In the
example shown above, only z^void in the term λz^void.z is annotated. We are
not required to annotate every term; we annotate only the variables bound by
λ's. Nevertheless, we need to know that λz^void.z has type void→void. So we
need a procedure that will infer the types of the λ-terms from the annotations.
Otherwise we cannot check that our λ-terms have consistent types.

3.3.1 Example: booleans
Let us work through a simple example of type annotations.
Earlier we defined the Boolean constants T and F, as well as the IF
function whose argument must be one of T or F. We have seen that the untyped
λ-calculus cannot guarantee that the IF function gets a Boolean argument. Let
us try to do this now using types.
We first need to find the correct type annotations for the definitions of T,
F and IF. The function IF contains three λ's, so let's assume that its type
annotation is

IF = λp^α.λa^β.λb^γ.p a b,

where the types α, β, γ are, so far, unknown. It follows that both T and F
must have type α and accept arguments of types β and γ:

T = λx^β.λy^γ.x, F = λx^β.λy^γ.y.

So we need to find types α, β, γ such that these definitions are consistent. The
definition of T implies that α = β→γ→β, and the definition of F implies
that α = β→γ→γ. So the only way to achieve consistency is to have γ = β;
then α = β→β→β. Therefore, IF must have the type

(β→β→β)→β→β→β.
Which type is β? It is easy to see that the definitions will work for any type
β. However, we must choose a particular type in the simply-typed λ-calculus.
This type β will also be the type of the output value of an IF expression. So
first we need to know where we are going to use the IF function in our program,
and then we will see which specific type this IF expression needs, and define
IF accordingly. For example, if our program needs an IF with an integer result,
we need to set β = int and define IF accordingly, e.g.

IF = λp^(int→int→int).λa^int.λb^int.p a b.

Now the IF function will not accept arbitrary λ-terms for its first argument.
Only λ-terms of type int→int→int will be accepted. So the T and F terms must
also be defined with the same type int→int→int.
It follows that we will have to define different T, F, and IF terms separately
for every type β for which we need an IF operation in our program! This is
indeed a severe limitation of simply-typed λ-calculus.
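The monomorphic Boolean encoding at β = int looks like this in Haskell, where I deliberately avoid polymorphism to mirror the simply typed setting (the names BInt, tInt, fInt, ifInt are my own):

```haskell
-- β = Int, so a typed Boolean is a term of type Int -> Int -> Int.
type BInt = Int -> Int -> Int

tInt :: BInt          -- T = λx^int.λy^int.x
tInt x _ = x

fInt :: BInt          -- F = λx^int.λy^int.y
fInt _ y = y

-- IF = λp^(int→int→int).λa^int.λb^int.p a b
ifInt :: BInt -> Int -> Int -> Int
ifInt p a b = p a b
```

This IF works only for integer results; a Boolean-result IF would need a second copy of all three definitions, which is exactly the limitation described above.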

3.4 Termination
The simply typed λ-calculus has the property that all programs terminate; for
instance, the infinitely looping term

(λx.x x) (λx.x x)

is impossible. Let us try proving this (very informally; we are just going over
the basic ideas of a possible method of proof).

Each λ-term has a type that is ultimately built up from primitive types and
some function arrows. So each type must ultimately have the form (α→β)→
γ→..., where α, β, γ, etc. are primitive types. Now we can count the number
of function arrows in the type. Let us call this the length of the type. For
example, int has length 0 and void→void has length 1.
The only computation in λ-calculus is application. When we apply a term
of type α→β to a term of type α, the result is of type β. So initially we had a
λ-term

(λx^α.E^β) A^α,

containing the types α→β, α, and β; but after reduction we have just a term
of type β. We see that each reduction decreases the number of arrows in the
types used in the expression. In other words, if we compute the sum of the
lengths of all used types, this total type length will never grow during computation.
Now we can start with any initial expression and perform any reductions that
are possible, in arbitrary order. Each time we will certainly decrease the total
type length. So it is impossible that our expression will be reduced infinitely
many times. Eventually there will be no more reductions, either because we
arrived at a value of primitive type that can't be reduced, or because we arrived
at a λ-term that contains no more applications.
By a similar reasoning, we can show that the variable x in the λ-term λx.x x
cannot be consistently assigned any type in the simply typed λ-calculus. If it
were possible to assign a type to x here, say λx^α.x x, then α must be a function
type whose argument is again α. So α must be equal to α→β for some other
type β. This, however, is impossible because the number of arrows in α cannot
be equal to the number of arrows in α→β. The type α→β is by necessity
longer than α because α→β contains at least one more arrow than α. The
simply typed λ-calculus enforces the consistency of type length in this way, and
prohibits self-application.
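The length argument is easy to make concrete. A sketch in Haskell of types and their arrow counts (the names Ty, lenTy, redexLen are my own):

```haskell
-- Simple types: primitives and function arrows.
data Ty = Prim String | Arr Ty Ty deriving (Eq, Show)

-- The "length" of a type: the number of arrows in it.
lenTy :: Ty -> Int
lenTy (Prim _)  = 0
lenTy (Arr a b) = 1 + lenTy a + lenTy b

-- A redex (λx^α.E^β) A^α uses the types α→β, α, and β;
-- its reduct uses only β, so the total length strictly drops.
redexLen :: Ty -> Ty -> Int
redexLen a b = lenTy (Arr a b) + lenTy a + lenTy b
```

For example, with α = void→void and β = int, the redex carries total length 3, while the reduct carries only lenTy β = 0. And the equation α = α→β of the self-application term is unsolvable, since lenTy α = 1 + lenTy α + lenTy β has no solution in non-negative integers.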

3.5 Product and sum types


We would like to implement the product and the disjoint union types through
the typed λ-calculus. We will see that there is a problem implementing the
product (but not the sum).

3.5.1 Sum (disjoint union)


No difficulty is found if we want to define the disjoint union with the λ-calculus
terms Left, Right, and Case as we did before. Suppose that the fully
constructed disjoint union term has type σ. Let us try adding annotations to the
Case term,

Case = λc^σ.λf^???.λg^???.c f g.

The Case term will be used with three arguments: the first argument (c) is
the constructed disjoint union itself, and the second and the third arguments
are functions f and g that map from α and from β to some result type ρ.
The result of a Case expression (the term c f g) must have the result type ρ.
Therefore the Case term must have the following type annotations,

Case = λc^σ.λf^(α→ρ).λg^(β→ρ).c f g,

where

σ = (α→ρ)→(β→ρ)→ρ.

Now consider the definitions of the constructors Left and Right:

Left = λx^α.λf^(α→ρ).λg^(β→ρ).f x, Right = λx^β.λf^(α→ρ).λg^(β→ρ).g x.

All terms are well-typed. We will need, of course, a separate definition of Left,
Right, and Case for each pair of types α, β and for each result type ρ.
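This typed encoding, specialized to α = int, β = unit, ρ = bool, is the same specialization that the OCaml demonstration in section 3.6 runs; here it is in Haskell (the alias Sum and the names leftT, rightT, caseT are my own):

```haskell
-- σ = (α→ρ) → (β→ρ) → ρ, specialized to α = Int, β = (), ρ = Bool.
type Sum = (Int -> Bool) -> (() -> Bool) -> Bool

leftT :: Int -> Sum      -- Left = λx^α.λf.λg.f x
leftT x f _ = f x

rightT :: () -> Sum      -- Right = λx^β.λf.λg.g x
rightT y _ g = g y

caseT :: Sum -> (Int -> Bool) -> (() -> Bool) -> Bool
caseT c f g = c f g
```

For example, caseT (leftT 1) (== 1) (\_ -> False) is True. A different result type ρ would require a separate copy of all three definitions.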

3.5.2 Product (pair)


Suppose we have types α and β, and we would like to define a product type. In
the untyped λ-calculus, we used the following λ-terms,

Pair = λa.λb.λf.f a b, P1 = λp.p (λx.λy.x), P2 = λp.p (λx.λy.y).

To use this implementation in the typed λ-calculus, we need to annotate these
terms with types. A fully constructed pair, p = Pair a b, must have some
definite type; let us denote this type by σ. We expect that P1 p is of type α,
while P2 p is of type β. Now, P1 and P2 involve the terms p T and p F, where

T = λx.λy.x, F = λx.λy.y,

and we do not yet know the type annotations for T and F. We see that we
must be able to apply the term p to T or to F. This forces T and F to have the
same type (the type of the argument of p). The only solution is to have T and
F both of the type α→β→γ for some γ. Hence, σ has to be a function type
whose argument is of type α→β→γ for some definite γ. In other words,

σ = (α→β→γ)→γ

for some type γ. It follows that both p T and p F must be of the same type
γ. Therefore, the type system forces α = β; it is impossible that p T yields a
value of type α while p F yields a value of a different type β. We are unable to
define the product type if the two types α and β are different!
The solution is to add the product type to the language. We postulate the
syntax α×β for the product type. In other words, we regard the symbol α×β
as a valid type symbol, just like α→β is considered to be a valid type symbol.
Can we now implement the product construction? No, because the new
type symbol α×β cannot be used productively unless we also have some built-
in terms that produce or consume values of that type. We have to define Pair,
P1, P2 as built-in constants. Without these constants, we are back to finding a
simple type for the pair.
It appears that every time we add new types to the language, we also have
to add new built-in operations. Merely adding new type symbols does not help!
3.6 Demonstration (OCaml)
The problem with the product type: The type of the constructed pair ex12 is
inconsistent if we want to use both p1 and p2 on it.

Objective Caml version 3.12.1

# let pair (a:int) (b:bool) = fun f -> f a b;;
val pair : int -> bool -> (int -> bool -> 'a) -> 'a = <fun>
# let p1 p = p (fun (x:int) (y:bool) -> x);;
val p1 : ((int -> bool -> int) -> 'a) -> 'a = <fun>
# let ex12 = pair 1 true;;
val ex12 : (int -> bool -> '_a) -> '_a = <fun>
# p1 ex12;;
- : int = 1
# let p2 p = p (fun (x:int) (y:bool) -> y);;
val p2 : ((int -> bool -> bool) -> 'a) -> 'a = <fun>
# p2 ex12;;
Error: This expression has type (int -> bool -> int) -> int
       but an expression was expected of type (int -> bool -> bool) -> 'a
# ex12;;
- : (int -> bool -> int) -> int = <fun>
#

No such problem with the sum type:

# let lft (x:int) (f:int->bool) (g:unit->bool) = f x;;
val lft : int -> (int -> bool) -> (unit -> bool) -> bool = <fun>
# let rgt (x:unit) (f:int->bool) (g:unit->bool) = g x;;
val rgt : unit -> (int -> bool) -> (unit -> bool) -> bool = <fun>
# let case c f g = c f g;;
val case : ('a -> 'b -> 'c) -> 'a -> 'b -> 'c = <fun>
# let la = lft 1;;
val la : (int -> bool) -> (unit -> bool) -> bool = <fun>
# let rb = rgt ();;
val rb : (int -> bool) -> (unit -> bool) -> bool = <fun>
# case la (fun x -> x = 1) (fun y -> false);;
- : bool = true
# case rb (fun x -> x = 1) (fun y -> false);;
- : bool = false
#

3.7 Questions
Can we prove that no λ-calculus construction can be found to implement
the product type?

I tried to use category-theoretic methods, and I also tried to use the Curry-
Howard isomorphism. I was not able to find a proof either way.
Why is it that we were able to implement the sum type but not the product
type?

It seemed to me that we can implement the product type if we already have
the sum type; this idea now seems incorrect. We actually have a consistent
implementation of the sum type in simply typed λ-calculus, and yet I was not
able to produce a consistent implementation of the product type.

3.8 Some answers (from Mathoverflow)


First, it is not quite true that we were able to implement the sum type.
According to our scheme, we need to define a different sum type for every result
type of the Case expression. This is not what the correct sum type should be: it
should be the same for all required result types. We "cheated" when we claimed
that the sum type is definable in simply typed λ-calculus!
One approach to understanding the lack of sum and product types in the
simply typed λ-calculus is to use the Curry-Howard isomorphism: to map types
into propositions. Here I am quite lacking in mathematical rigor; I have only a
superficial understanding of the CH isomorphism and of the underlying issues.
Here is what I imagine. If we map types into propositions, then functions are
mapped into statements of the form A → B, that is, into implications between
propositions. So our computations in λ-calculus are equivalent to some
computations in propositional logic with expressions such as (A → B) → (A → B),
etc. We must, however, keep in mind that we are not computing with ordinary,
classical propositional logic; instead, we are computing in the intuitionistic logic.
All theorems derivable in the intuitionistic logic are valid programs in the simply
typed λ-calculus. What we are able to derive are not simply true propositions
but propositions that correspond to implementable programs in λ-calculus.
So for instance we do not have the constant "false", and we do not have
the law A ∨ ¬A = true (and actually we do not have the negation operator!).
Generally, a tautology of classical propositional logic is not necessarily
something that is derivable in the intuitionistic logic. An example is the formula
((A → B) → A) → A, which is a tautology in classical propositional logic (called
Peirce's law) but is not derivable in the intuitionistic logic. Accordingly, we
cannot define a λ-term of this type in the λ-calculus (such that A and B are
arbitrary types!).
Now, a sum type is an "or" proposition, A ∨ B, while a product type is an
"and" proposition, A ∧ B. The task of implementing, say, the product type
in the λ-calculus is now translated into the task of expressing A ∧ B through a
formula that only contains the implication. Now, it is easy to see that no formula
containing only implications can reproduce A ∧ B even in classical propositional
logic. As for the "or" case: a formula such as (A → B) → B is equivalent to A ∨ B
in classical propositional logic. However, this formula is not derivable in the
intuitionistic logic. One can certainly prove this, as people on Mathoverflow
have indicated, but I do not know enough details here.
The second interesting comment from Mathoverflow is that we can implement
the product type if we are willing to fix the type of the results, quite
similarly to the way we "cheated" when we implemented the sum type. So, there
is indeed a symmetry between the sum type and the product type.
Here is how one can "cheat" while defining the product type. We assume
that when we get values out of a fully constructed Pair, we will always need to
provide a function that maps these values to a fixed result type. For example,
a pair of int and bool will not be defined just like that; instead, there will
be a type "pair of int and bool, out of which we will always be getting an
answer as a bool". The accessor first will not simply return the first element
of the pair; instead, first will take a function that will map the first element
of the pair to bool.
Let me reproduce the Haskell code given on Mathoverflow:

pair :: Integer -> Bool -> (Integer -> Bool -> Bool) -> Bool
pair a b = \c -> c a b

first :: ((Integer -> Bool -> Bool) -> Bool) -> (Integer -> Bool) -> Bool
first p = \f -> p (\a b -> f a)

second :: ((Integer -> Bool -> Bool) -> Bool) -> (Bool -> Bool) -> Bool
second p = \g -> p (\a b -> g b)

*Main> let p = pair 2 True in first p ((<)1)
True
*Main> let p = pair 2 True in first p ((<)3)
False
*Main> let p = pair 2 True in second p ((||) False)
True
*Main> let p = pair 2 False in second p ((||) False)
False

So, the type of pair is different depending on what kind of result we will be
getting out of the pair. This is cheating: we should have defined the type of
pair regardless of the computations that will use it.

4 More about the Curry-Howard isomorphism
4.1 Intuitionistic logic
I am trying to avoid studying logic for the sake of logic alone, and I'm trying to
avoid studying long proofs that have no immediate relevance.
Classical logic is based on Boolean values "true" and "false", with Boolean
operations "and", "or", "not". There are some atomic propositions p1, p2, ..., that
can be either true or false (the law of excluded middle). Other propositions
are made up from the atomic propositions with the help of the Boolean operations.
The implication A → B is defined as the Boolean expression (¬A) ∨ B. The
truth value of any Boolean expression is readily computed if we know the truth
values of all the atomic propositions.
It is straightforward to verify any theorem of classical logic. For example,
A → (B → A) is a valid theorem in the classical logic; to check validity, we
merely need to check that the truth value of A → (B → A) is "true" for any
truth values of A and B. On the other hand, A → (A → B) is not a valid
theorem because it is false when B is false and A is true.
Intuitionistic logic is a different sort of enterprise. Propositions can be
true, false, or unprovable; the latter means that we cannot say whether a
statement is true or false because we cannot find a proof either way. So the
most important thing in intuitionistic logic is being able to find a proof of
a proposition; whereas in classical logic, the most important thing is the
truth value of a proposition.
The proof procedure in intuitionistic logic is based on Modus ponens (if
we know A → B and if we know A, then we know B) and some axioms. We
begin with just one logical operation, the implication; then we have two axioms:

A → (B → A);   (axiom K)
(A → (B → C)) → ((A → B) → (A → C)).   (axiom S)

Note that these axioms are valid for any propositions A, B, C.

We can now derive other propositions, or theorems, from this pair of
axioms. For example, we can derive that A → A for any A. This derivation,
however, is not very short.
If any proposition is a theorem in the intuitionistic logic, it is also valid
in classical logic (since the axioms K and S are both valid in classical logic).
The converse is, however, not true: not all theorems valid in classical logic are
provable intuitionistically.
The point of intuitionistic logic is that a theorem P is proved only if we
directly derive P from the given axioms, using Modus ponens. We are not
allowed to prove P by contradiction, i.e. by showing that the negation of P
is contradictory. We need to prove P constructively. In mathematics, a
constructive proof would, say, show that there exists a number satisfying
certain properties by actually computing one such number. A non-constructive
proof would show that the absence of such numbers is somehow contradictory;
but this kind of proof does not give any actual examples of such numbers, nor
does it give a method of computing them.
Mathematics normally admits non-constructive proofs. However, in pro-
gramming we must produce specific programs computing our answers, not merely
proofs that answers must somehow exist. So intuitionistic logic seems to be bet-
ter adapted to programming tasks.
At this point, let us make a connection with the simply typed λ-calculus.
This connection is the Curry-Howard isomorphism. Each proposition corresponds
to a type represented by the proposition's pattern of implications. Primitive
types correspond to atomic propositions.
For example, λx^A. λy^B. y has type A → B → B, so this λ-term corresponds
to the intuitionistic proposition A → (B → B).
The interpretation of a "true" proposition is "we have a program that computes
a value of this type", or in other words, "this type is inhabited".
In the Curry-Howard isomorphism, Modus ponens corresponds to the application
of two λ-terms:

(x^{A→B} y^A)^B.

The result of the application x y is a term of type B. Since the only way to
prove a statement is by using Modus ponens and the axioms, a proof of a
statement corresponds to the calculation of some big λ-term, and this λ-term
must be built up from the two axiom λ-terms, denoted by K and S:

K = λx^A. λy^B. x,   S = λf^{A→B→C}. λg^{A→B}. λx^A. f x (g x).

It is easy to see that K has type A → B → A and S has the type of the second
axiom above.
Now, the terms K and S are just the standard "combinators" of the λ-calculus
in the Curry formulation. Omitting the types, we have

K = λx. λy. x,   S = λf. λg. λx. f x (g x).

So we can compute the term SKK and find

SKK = λx. K x (K x) = λx. x

because K x is just λy. x, so K x P = x for any P. The result is: SKK has
type A → A for any A. (This is the I combinator.) Therefore, A → A follows
from the axioms.
Now, the computation of SKK was rather quick. This computation can be
written out laboriously as a proof of A → A obtained by applying the axiom S
to two copies of axiom K and using Modus ponens several times. So λ-calculus
can be seen as a quick and useful notation for obtaining proofs in
intuitionistic logic.
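The combinators and the SKK = I computation can be replayed in Haskell, where type inference recovers exactly the axiom types (the lowercase names s, k, i are mine, to avoid clashing with data constructors):

```haskell
-- Untyped S and K rendered as Haskell functions; their most general
-- inferred types are exactly the axioms K and S.
k :: a -> b -> a
k x _ = x

s :: (a -> b -> c) -> (a -> b) -> a -> c
s f g x = f x (g x)

-- s k k reduces to the identity; its inferred type is a -> a.
i :: a -> a
i = s k k

main :: IO ()
main = print (i (42 :: Int), i "hello")  -- (42,"hello")
```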

4.2 Example: programs as proofs

As an example, consider the following little theorem,

(A → B) → (A → (C → B)).
The interpretation of this theorem is this: if we know A → B, we can also
infer B given not only A but also some arbitrary other proposition C.
How can we prove this theorem (valid for any A, B, C)? We would have to
use Modus ponens and to manipulate the axioms S and K, which is cumbersome.
Alternatively, let us consider this as a programming task in λ-calculus. Our
task is to write a λ-term that has the type (A → B) → (A → (C → B)), where
the types A, B, and C are arbitrary.
It can be proved (rather laboriously, I'm afraid, but it's in the books)
that any λ-calculus term can be expressed through the combinators S and K
alone. Therefore, we do not actually need to confine ourselves to the
combinators S and K. We can just write some λ-term that has the correct type.
This λ-term must be a function that takes as argument something of type
A → B and returns something of type A → (C → B). So we begin by writing

λf^{A→B}. (???)

Since we need to return A → (C → B), we have to write a λ-term of the form
λx^A. λy^C. (???). The result must be of type B. Now, obviously, we can get a
value of type B only if we apply f to x. So the solution is

λf^{A→B}. λx^A. λy^C. f x

And that's it! This λ-term is a program of λ-calculus that gives a proof of
the above theorem of intuitionistic logic. It is a lot easier to come up with
such λ-terms than to manipulate axioms directly, trying to obtain the desired
implications.
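Transcribed into Haskell, the proof term is a one-liner whose type is the theorem itself, with the type variables a, b, c standing for the arbitrary propositions (the name proof is mine):

```haskell
-- The theorem (A -> B) -> (A -> (C -> B)) as a program.
proof :: (a -> b) -> a -> c -> b
proof f x _y = f x   -- the argument of type c is simply ignored

main :: IO ()
main = print (proof (+ 1) (41 :: Int) "ignored")  -- 42
```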
In this way, we can understand the cryptic description of the Curry-Howard
isomorphism: "types are propositions, programs are proofs".
4.3 Sum and product types

If we have only implication (A → B) as the logical operator, intuitionistic
logic is quite limited, because it corresponds to the simply typed λ-calculus.
As we have seen, the sum and the product types are not available in λ-calculus.
The only way to have the sum and product operations is to add them to the
λ-calculus by hand. The same situation holds for intuitionistic logic: we
have to add the operations "or" (∨) and "and" (∧) to the logic by hand.
So we introduce the operators ∧ and ∨ with the following axioms:

A ∧ B → A;   A ∧ B → B;   A → A ∨ B;   B → A ∨ B;
A → (B → A ∧ B);   (A → C) → ((B → C) → (A ∨ B → C)).

These axioms must correspond to some λ-terms; however, as we have seen,
simply typed λ-calculus is unable to provide us with such λ-terms, and they
have to be postulated. It is easy to see that these six axioms exactly
correspond to the λ-terms first, second, left, right, pair, and case.
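In Haskell these six postulated terms already exist for the built-in product and sum types; here they are written out explicitly (the ax- prefixed names are mine):

```haskell
-- The six sum/product axioms as Haskell terms. Here (a, b) is the
-- product type and Either a b is the sum type.
axFirst :: (a, b) -> a
axFirst (x, _) = x

axSecond :: (a, b) -> b
axSecond (_, y) = y

axLeft :: a -> Either a b
axLeft = Left

axRight :: b -> Either a b
axRight = Right

axPair :: a -> b -> (a, b)
axPair x y = (x, y)

axCase :: (a -> c) -> (b -> c) -> Either a b -> c
axCase f _ (Left x)  = f x
axCase _ g (Right y) = g y

main :: IO ()
main = print (axCase axFirst axSecond (Left (axPair (1 :: Int) (2 :: Int))))
</imports-note-removed>
```

These are just fst, snd, Left, Right, (,), and either from the Prelude, written out to match the names in the text.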

4.4 Intuitionistic first-order logic?

First-order logic has predicates and variables, and contains propositions
such as ∀x. p(x) → q(x). Here p, q are predicates and x is a variable. It is
important that quantifiers are used only for variables but not for predicates.
(Using quantifiers for predicates means that we are using second-order logic.)
Now, so far we have always used λ-calculus terms such as λx. λy. x that
work with arbitrary types, not specific types like bool. The interesting
theorems were always theorems about arbitrary propositions A, B, C. So for
our purposes we would like to apply quantifiers to propositions, so that we
could write ∀A. ∀B. A → (B → A) and so on. Here A and B are not necessarily
atomic propositions, but arbitrary propositions of the logic.
But I am not sure how to implement this in first-order logic. We could
introduce a single predicate, called true(x), which is true when the
proposition x holds. Then we could write

∀A. ∀B. true(A) → (true(B) → true(A)).

However, this can be used only with atomic propositions A and B, as far as I
understand.
If we had this kind of first-order intuitionistic logic, we could omit the
predicate true and write simply

∀A. ∀B. A → (B → A)

and similarly for other such theorems that are valid for any propositions.
Now, it is clear that the Curry-Howard isomorphism will map this theorem
to the polymorphic λ-term

Λα. Λβ. λx^α. λy^β. x

In this way, we find another use for the Curry-Howard isomorphism. Namely,
we know that some implications can be derived in intuitionistic logic, while
other implications cannot be derived. As a simple example, the implication
∀A. ∀B. A → B cannot be derived for arbitrary A and B. (This implication is
not valid even in classical logic.) Accordingly, it is not possible to have a
polymorphic λ-term of type ∀α. ∀β. α → β.
So, we conclude that polymorphic λ-calculus admits only such fully
polymorphic terms as correspond to valid theorems of intuitionistic logic. We
can have polymorphic terms such as

λx. λy. λz. x (y z) z

or whatever else; but we do not have polymorphic terms of type ∀α. α;
∀α. ∀β. α → β; etc.
4.5 Continuation-passing style (CPS); the current continuation
There is a trick that transforms λ-calculus, without changing its semantics,
into another form. Consider any λ-calculus computation that contains several
layers of functions, such as

f ((λx. λy. y) a b) c = f b c.

In this computation, we first reduce the term (λx. λy. y) a b to just b, and
then apply f to b and c.
When we reduce (λx. λy. y) a b in this computation, we do not know that
the result (b) will be used by f. Continuation-passing style makes this
information explicit. Every function gets an additional argument, called the
continuation, which is itself a function that describes what needs to be done
next with the current result. Every function does not simply return its
computed value, but explicitly applies the continuation to that value.
For any given λ-term within an expression, we can rewrite the whole
expression as a λ-abstraction applied to that term. This λ-abstraction is
called the current continuation for the selected term. For instance, in the
computation (λx. λy. y) (λt. t) b, the current continuation for the term
λt. t is the λ-term

λa. (λx. λy. y) a b.
In the example

f ((λx. λy. y) a b) c,

the term (λx. λy. y) a b is reduced and its result b is inserted into the
computation of f b c. This can be described as the calculation of

(λt. f t c) ((λx. λy. y) a b) = (λt. f t c) b.

Therefore, the λ-term λt. f t c is the current continuation for the term
(λx. λy. y) a b in the full expression.
The current continuation of a λ-term can be understood as the
computational context in which the selected subterm is currently being
evaluated. The current continuation represents the rest of the program that
is waiting for the value of the selected subterm.
What is the type of the current continuation? The current continuation
is defined with respect to a selected subterm in a big λ-term. The current
continuation is a function that takes the value of the selected subterm and
returns the value of the entire big λ-term. So, in general, the type of the
current continuation is some particular function type α → ρ, where α is the
type of the selected subterm and ρ is the type of the entire λ-term.
Transforming λ-calculus into CPS is done like this: each λ-abstraction,
λx. e, is transformed to λk. λx. k e, where k is the continuation. Each
application, x y, where y is itself already in the CPS style, is transformed
to y (λk. x k). (Here I am not really sure whether this formula is correct...)
What is the current continuation with respect to the entire λ-term? Well,
we need to realize that our λ-term is not being evaluated in a vacuum; there
is a computer and a running operating system that are evaluating our λ-term.
So, in principle, we may consider that there is a bigger continuation that is
waiting for our λ-term when it is finally completely evaluated. This bigger
continuation, for instance, involves printing the resulting value onto the
screen. In order to emphasize that the current continuation is not just a
small λ-term, we might write "..." in front of it, for example as
... (λt. f t c).
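Here is a small concrete sketch of the same idea in Haskell: each function takes an extra continuation argument and applies it to its result instead of returning it (the -Cps names are mine; this illustrates the style, not a general CPS transform):

```haskell
-- Direct style would be: square (add 1 2).  In CPS, every function
-- takes a continuation describing what happens next with its result.
addCps :: Int -> Int -> (Int -> r) -> r
addCps x y k = k (x + y)

squareCps :: Int -> (Int -> r) -> r
squareCps x k = k (x * x)

-- The current continuation of (add 1 2) is \t -> squareCps t id.
result :: Int
result = addCps 1 2 (\t -> squareCps t id)

main :: IO ()
main = print result  -- 9
```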

4.6 Call with current continuation (call/cc)
A few programming languages (Scheme, Standard ML) have call/cc. This is a
curious and confusing construction that, in some sense, represents a declarative
GOTO statement. In most cases, call/cc is used in a very restricted way: for
example, to implement exceptions.
Just as with the GOTO statement of imperative languages, unrestricted
and copious use of call/cc makes programs buggy and unmaintainable much
faster than if the programmer avoids using call/cc. So call/cc is dangerous;
nevertheless, we need to know what it is and to understand the dangers.

Definition: The call/cc operation is a new postulated λ-term with type

((α → β) → α) → α.

Note that in the simply typed λ-calculus, it is impossible to write a λ-term
with the type ((α → β) → α) → α. In correspondence with this,
((A → B) → A) → A is not a theorem of intuitionistic logic, i.e. it cannot
be derived from the axioms S and K; it is called "Peirce's law". (Note that
Peirce's law is a valid theorem in classical logic.)
If we add Peirce's law as an axiom to intuitionistic logic, we will obtain
classical logic (i.e. the law of excluded middle, the Boolean truth values,
and all that). So, we can expect that adding call/cc is a very significant
change in the power of the programming language.
Let us approach the understanding of call/cc by looking at its type. Why is
it impossible to write a λ-term with that type? A term with type
((α → β) → α) → α takes a function of type (α → β) → α and must yield a
value of type α, so it must be

λf^{(α→β)→α}. (...value of type α),

but where would that value come from? It can only come from applying f to
something. But the argument of f is of type α → β, and there is no way to
provide that kind of value to f. The reason we cannot have an ordinary λ-term
with type ((α → β) → α) → α is, essentially, that we cannot have a term with
type α → β for arbitrary α and β. If we had α = β, there would be no problem:

λf^{(α→α)→α}. f (λx^α. x);

here we have used the I combinator, λx^α. x, which has type α → α. But in
general, with α ≠ β, we cannot produce a polymorphic term with type α → β.
So how does call/cc work? It can work only in this way: receive an f of
type (α → β) → α, take somewhere a value k of type α → β, and evaluate (f k).
The first decisive trick comes here: as the value k of type α → β we use the
current continuation with respect to the call/cc term itself. The current
continuation is, in general, a function of type α → ρ, where α is the type of
the result of call/cc, and ρ is the type of the final result of the entire
λ-term we are evaluating.
We can invent a notation for call/cc like this:

CC f^{(α→β)→α} = CC (λk^{α→β}. B),
where B is the body of the function f (which must use k somehow). Now
suppose we use this kind of expression inside another function application:

g a (CC λk. B) c,

say. The current continuation for the CC term is λb. g a b c, therefore this
will be the value of k passed to B when the term CC λk. B is evaluated.
Let us continue following the logic of computations: as CC (λk. B) is
evaluating the body B, the expression B might apply k to something. However,
k is not just some little λ-term: the value of k is actually the entire
computational context, the entire running program, including the computer's
operating system. When k is applied to something, it could mean that our
entire program has printed its result and stopped!
So, we must conclude that applying k within B will escape the computational
context of B and return to the bigger expression around CC. For example, let
us assume a simply typed λ-calculus with integers and the product type, and
compute some λ-term, say,

Pair (CC λk. Pair (First (k 2)) 3) 1.

This entire term has type ρ = int × int, and the CC symbol has type
((α → β) → α) → α, where α = int.
Let us start by finding the current continuation for the CC term. To do
this, we rewrite the above as

(λx. Pair x 1) (CC λk. Pair (First (k 2)) 3).

So the CC term will receive ... (λx. Pair x 1) as the current continuation.
The CC term will apply its argument,

λk. Pair (First (k 2)) 3,

to this value. Therefore, this will be the value of k when
Pair (First (k 2)) 3 is evaluated.
Now, it seems we need to evaluate the application k 2. The result of this
application is

... (λx. Pair x 1) 2 = ... Pair 2 1,

where the "..." indicates that the result is the final result of the
computation. In other words, the entire computation has finished with this
result.
Note that inside the body of the CC term, we were not finished computing
the term Pair (First (k 2)) 3. If k 2 finished with the value Pair 2 1, then
it would appear that the application of f to k inside CC must yield the
result value Pair 2 3. However, this value is not computed. In fact, its
computation is never finished because the application k 2 means that we
return to the continuation outside CC and abandon the computation of the
body inside CC. This is the second decisive trick involved in call/cc.
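This escape behavior can be reproduced in Haskell. The sketch below hand-rolls a minimal continuation monad with callCC (the mtl library's Control.Monad.Cont provides an equivalent); calling the captured k abandons the rest of the body, just as applying k abandons the body of the CC term above. The names Cont, callCC, escapeDemo here are my own definitions:

```haskell
-- A minimal continuation monad, defined from scratch so that no
-- libraries are needed.
newtype Cont r a = Cont { runCont :: (a -> r) -> r }

instance Functor (Cont r) where
  fmap f (Cont c) = Cont (\k -> c (k . f))

instance Applicative (Cont r) where
  pure a = Cont (\k -> k a)
  Cont cf <*> Cont ca = Cont (\k -> cf (\f -> ca (k . f)))

instance Monad (Cont r) where
  Cont c >>= f = Cont (\k -> c (\a -> runCont (f a) k))

-- The captured continuation ignores its own continuation argument:
-- invoking it discards the rest of the body.
callCC :: ((a -> Cont r b) -> Cont r a) -> Cont r a
callCC f = Cont (\k -> runCont (f (\a -> Cont (\_ -> k a))) k)

escapeDemo :: Int
escapeDemo = runCont (callCC (\k -> do
               _ <- k 2           -- escapes; the next line never runs
               pure 1000)) (+ 10) -- the "rest of the program"

main :: IO ()
main = print escapeDemo  -- 12
```

The body nominally produces 1000, but that value is never computed: k 2 jumps straight to the outer continuation (+ 10).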

Note that the CC term must compute some value nominally of type α, so
that the type system is satisfied, even though this value is never actually
computed at run time.
What happens if B never applies k but just computes some value? In that
case, for example

(CC λk. 0),

the computation will never terminate. One could argue that in this case the
CC term should simply return 0. However, the CC term cannot know in advance
whether its argument will actually run the continuation k or not. So the CC
term will never compute its nominal value. (I might be mistaken here!)
4.7 Negation and the laws of de Morgan

De Morgan's laws of logic are

¬(A ∧ B) = ¬A ∨ ¬B;   ¬(A ∨ B) = ¬A ∧ ¬B.

The "=" sign means implication in both directions, i.e. we have written four
implications here. These laws are valid in classical logic. However,
intuitionistic logic admits only three of these four implications; namely,

¬(A ∧ B) → ¬A ∨ ¬B

does not hold in intuitionistic logic.

Negation is expressible through implication, if we have the "false" value ⊥:

¬A = A → ⊥.

However, there is a problem: there is nothing in λ-calculus that can
represent a type corresponding to ⊥. This would be a type that is
uninhabited, i.e. a type without values. But this is not realizable in a
programming language, i.e. we cannot actually write any code for a function
mapping, say, bool to a type with no values.
Polymorphic λ-calculus can help a bit. Instead of a function of type
α → ⊥, we represent ¬α through a function of type ∀β. α → β, where β is a
free type variable. (Anything can be derived from a falsehood.) This kind of
function might be possible to construct as an actual λ-term.
So now we can use the CH isomorphism to prove three of the four laws of
de Morgan. (See the Math Stackexchange entry "does de morgan's laws hold in
propositional intuitionistic logic", answer by Henning Makholm.) The proof is
simply by exhibiting a certain λ-term that has the required type.
To prove ¬(A ∨ B) → (¬A ∧ ¬B):

λf^{α+β→ρ}. Pair (λx^α. f (Left x)) (λy^β. f (Right y))

The result of applying this λ-term to a function of type ∀ρ. α + β → ρ is a
function of type (∀ρ. α → ρ) × (∀ρ. β → ρ). Let us omit the quantifiers to
save space here.
To prove (¬A ∨ ¬B) → ¬(A ∧ B):

λf^{(α→ρ)+(β→ρ)}. λg^{α×β}. Case f (λx^{α→ρ}. x (First g)) (λy^{β→ρ}. y (Second g))

To prove (¬A ∧ ¬B) → ¬(A ∨ B):

λf^{(α→ρ)×(β→ρ)}. λg^{α+β}. Case g (λx^α. (First f) x) (λy^β. (Second f) y)

It is impossible to prove ¬(A ∧ B) → ¬A ∨ ¬B in intuitionistic logic;
accordingly, there is no λ-calculus term with the type
∀ρ. (α × β → ρ) → ((α → ρ) + (β → ρ)).
However, it is possible if we use call/cc, which brings us back to classical
logic!
The following comment by Henning Makholm says (in my notation):

"First save the continuation and return a ¬α. Then, if anyone ever calls
that ¬α, remember the α and use the saved continuation to go back in time
and return a ¬β instead. When the latter return value is called, you have
both an α and a β and can therefore use the original function to get a ρ."

I tried to translate this into polymorphic λ-calculus with call/cc:

CC λk. k (λf^{α×β→ρ}. Left (λx^α. k (λg^{α×β→ρ}. Right (λy^β. k (λh^{α×β→ρ}. h x y))))),

where ρ is a fixed other type (coming from the current continuation), and I
denoted for brevity

σ = (α × β → ρ) → ((α → ρ) + (β → ρ)).
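The three intuitionistically valid directions can be transcribed into Haskell and type-checked. In this sketch (names are mine), negation is encoded as a function into a fixed answer type r, in the spirit of the ∀β. α → β representation above:

```haskell
-- The three intuitionistically valid de Morgan implications as terms.
-- "Not r a" encodes negation of a, with r playing the role of falsehood.
type Not r a = a -> r

-- not(A or B) -> (not A and not B)
deMorgan1 :: Not r (Either a b) -> (Not r a, Not r b)
deMorgan1 f = (f . Left, f . Right)

-- (not A or not B) -> not(A and B)
deMorgan2 :: Either (Not r a) (Not r b) -> Not r (a, b)
deMorgan2 (Left f)  = \(x, _) -> f x
deMorgan2 (Right g) = \(_, y) -> g y

-- (not A and not B) -> not(A or B)
deMorgan3 :: (Not r a, Not r b) -> Not r (Either a b)
deMorgan3 (f, _) (Left x)  = f x
deMorgan3 (_, g) (Right y) = g y

main :: IO ()
main = print (deMorgan2 (Left show) ((42 :: Int), True))  -- "42"
```

The fourth direction, not(A and B) -> (not A or not B), has no such term; the type checker rejects every attempt, which mirrors its unprovability.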

5 Polymorphic λ-calculus

Simply typed λ-calculus is not very useful or convenient for general-purpose
programming. Also, the Curry-Howard isomorphism is almost useless without
polymorphic λ-calculus: all propositions are trivially true, since all types
are inhabited, and all implications are proved, since we can always write a
function of any type. For example, as we know, A → B is not a valid theorem
for arbitrary A and B; but for any specific propositions A and B it is always
a theorem. Say A = bool and B = int; then we can write a λ-term of type
bool → int very easily,

λx^bool. 1234^int

or whatever. The true usefulness of the Curry-Howard isomorphism comes only
when we start using the polymorphically typed λ-calculus.