Simply Typed Lambda Calculus

11/9/19

Introduction

The Lambda Calculus  that we have been looking at thus far has been the untyped Lambda Calculus. It is plenty interesting for sure, being able to provide a foundation for computability, programming languages, and many areas of linguistics and philosophy. But untyped Lambda calculus also has some features to be wary of; for example, some reduction sequences go on forever and never reduce the expression to a normal form. We wouldn't really want that happening unintentionally with an ordinary computer program running on an ordinary computer. Also Russell's paradox (considering 'the set of all sets which are not members of themselves' and asking whether that set is a member of itself— if it is, it isn't and if it isn't, it is) can be mimicked in the untyped Lambda Calculus, as can many other paradoxes and inconsistencies.

Church himself, the author of Lambda Calculus, presented a typed Lambda Calculus in his 1940 paper A Formulation of the Simple Theory of Types and this is a route we are going to follow. Simply Typed Lambda Calculus (STLC), at a basic conceptual level, is not that hard to understand (although type theory can ascend to the clouds and beyond).

Types and Values

Types are collections or sets of values. For our purposes, to produce an illustrative example, we can get by with just two basic Types, one with a finite number of values, and the other with an infinite number of values. 

The Type Boolean, or Bool, or just 'B' is a type with the two values: True, and False.

The Type Integer, or Int, or just 'I', is a type with the values 0,1,2,3... etc

Constructing New Types from Existing Types

The base types are not the only types there are. It is possible to construct new types from old types using the type constructor '→' 

If t1 is a type and t2 is a type then t1→t2 is a type.

This means that, for example, B→I is a type, and so too are B→I→I→I and I→I→I. For clarity, we will permit parentheses in type expressions. Also, if there are no parentheses, the type constructor → associates to the right. So, for example, B→I→I→I means B→(I→(I→I)).

There are now infinitely many types in our STLC.

The intended interpretation of t1→t2  is that of denoting a function from t1 to t2 . So, for example, a function from I to B would have type I→B. To make this a little more concrete, imagine, outside of STLC, a function from the Integers to Booleans such as the '... is even' function; this function returns True if the integer is even and False if it is not; that function has the type I→B.

 

Changes in the Syntax to form a Simply Typed Lambda Calculus

There is only one main change. It is required that a type be given for the 'binding' variable on a lambda expression, and this can be done by appending a colon and the type. So, in untyped Lambda Calculus, a lambda expression might look like this

 

λx.x

or this

λx.(x x)

or this

λx.λy.λz.(x (y z))

In the Simply Typed Lambda Calculus (STLC), similar lambda expressions would need to look like this

 

λx:B.x

or this

λx:I.(x x)

or this

λx:I.λy:B.λz:I.(x (y z))

i.e. each variable governed by the lambda in a lambda expression needs to have an annotation saying what its type is (and that can be done using ':<type>').

There is another smaller change. The values of Ints and Bools are going to be allowed into the expressions as though they were free variables, in fact they are constants but they go in places where a free variable might be found. So

 

λx:B.True

or this

λx:I.(x 49)

or this

λx:I.λy:B.λz:I.(False (True 0))

are all well-formed formula.

Now, of course, 0, 1, 2, 3, False, True etc. all have types, but we have no need to say what those are explicitly by annotation in the formulas. We know what their types are: 0, 1, 2, 3 have type I and False, True have type B.

Ordinary variables in expressions exist unchanged. Bound variables are understood to denote a value of their underlying implicit type; thus, that type is the type they have. Free occurrences of variables do not play a role; if they appear in a top level  expression we can often treat them as though they do not have a type.

If we do need to type occurrences of a free variable, and there will be occasions when we will, we can imagine that there is a collection of assumptions, which is a list of key-value pairs i.e. a dictionary, and those assumptions can provide the information that, say, k is an Integer.

With this wider range of types (indeed there are infinitely many), you might see expressions like

    λx:B→(I→(I→I)).(x False)

Typing an Expression

We can type, or try to type, an expression according to some typing rules. These work on what sort of expression the expression is: expressions are either i) atomic ii) applications or iii) lambda abstractions. 

i) if an expression is atomic, then it is either a Boolean or an Integer (and we know the types of those two) or a variable (and if we need to type a variable we just look it up in the assumptions).

For applications and lambdas, we invoke an additional technique. These are complex expressions, involving proper sub-expressions which are simpler than they are. We can make the inductive assumption that we have been able to type the simpler expressions and then provide rules for getting from that to typing the more complex expression. 

ii) if an expression is an application, it has the form (<function expression> <argument expression>) and, by the inductive assumption we have types for both the function and the argument. For the application itself to be well typed, the function must have type t1→t2 and the argument type t1 for some types, tand t2 . If so, the application is well-typed, and has type t2

iii) if an expression is a lambda, it has the form λx:<type>.<scope> . Now, a lambda entire expression is intended to be a function, so we are expecting, or requiring, its type to be t1→t2  ,for some types, tand t2. We know what the t is going to be here. It is going to be the annotation on the bound variable. So this is where we are: the expression is λx:t.<scope> and its type is t1→? , we just need to find the right value for the question mark. Typically the binding variable will appear in the <scope>, it might not, but typically it would. Obviously what we want for the value of the question mark is the type of the <scope> under the assumption that any occurrences of the binding variable x in it have type t(that is why we earlier had annotated x like this x:t1 )  If the scope is not well typed, then neither is the lambda. If the scope is well typed and has type tthen the entire lambda expression well-typed has type t1→t2

Well-typed expressions

You can see immediately that some expressions are well-typed and others are not

(λx:B→B.x False)  this expression is an application; its lambda (i.e. its function), actually is the identity function, here has the type B→B i.e. Boolean to Boolean; its argument is a Boolean; so this makes sense, it is well-typed, and the reduction (the result) will have type Boolean

(λx:I→B.x False)  this expression is an application; its lambda (i.e. its function), also the identity function, is said to have the type I→B i.e. Integer to Boolean; its argument is a Boolean; so this makes no sense at all; for a start, the function is supposed to be being applied to an Integer, but in fact it is applied to a Boolean, and then the function is the identity function which going to return a result of the same type as its argument ; the expression is a nonsense and it is not well typed

Well-formed expressions and Well-typed expressions

As we have just seen above, not all well-formed expressions are well-typed ((λx:I→B.x False) is well-formed (i.e. it is a proper grammatical expression in STLC)  and it does not have a type). All expressions with a type are well-formed (the typing rules build off that).

Finding the Type of an Expression

Ideally, if an expression has a type, we would like to find that type and prove that the expression had it; and if the expression does not have a type, we would like to prove that it does not. There are different approaches here as to how to do this. In more advanced work, with more advanced type systems, there would be type inference and even type inference by algorithm. The standard approach, more or less, is Hindley-Milner type inference. But, in turn, Hindley-Milner can be carried out by different algorithms. Usually in them there is an interleaving of allocating values to variables and extracting constraints that need to be satisfied. But there is something to be said for an approach that extracts the constraints first, then solves the constraints (apparently Hindley was in favor of this). This is clearer at a pedagogical level and it generalizes better to other areas. So that is what we will try in the next set of Notes.

In the meantime, you will be able to type expressions, or realize that they do not have a type, just by using some thought, a pencil and paper and the above rules.

We will temporarily turn to something else.

Typed expressions in STLC and 'stopping'

Reductions to Normal Form in STLC are exactly the same as those in ordinary lambda calculus. We don't worry about the types while doing reductions. There is a theorem, mentioned earlier, that if an expression has a Normal Form, then Normal Order Reduction (NOR) will find it. So if we want to explore whether there is a relationship between reducing to Normal Form and having a type, we need consider only NOR.

 

Statements you might like to explore (and these refer to STLC)

  • If an expression has a type, it is always reducible to normal form (i.e. NOR will find the normal form and stop, within a finite number of steps).
  • If an expression does not have a type, it is never reducible to normal form.
  • If an expression is in normal form, it has a type.
  • If an attempted NOR reduction of an expression 'runs forever', without stopping, that expression does not have a type.

There is practical example or analog here. The crypto-currency Bitcoin has a scripting language called 'Script' which is central to all its currency transactions. Script is not 'Turing Complete', which means that there are computations which are computable which cannot be programmed in Script. However, all Script programs will halt or stop. A 'stopping' language like Script was chosen on purpose— it was deemed desirable that no financial transaction should take forever to complete. Another crypto-technology ecosystem is Ethereum which hosts and runs it own contract language Solidity. Solidity is Turing Complete, but that comes at a price. Some Solidity programs will not stop— they will run forever (also, it is not possible, in the general case, to say whether a program will stop or whether it will not stop). Roughly speaking, it's not too wide of the mark to say that Solidity can be represented in lambda calculus and Script in simply typed lambda calculus with the expression restricted to those which are well typed.