Unannotated/Annotated Abstraction and Polymorphism

11/29/19

DRAFT

[In this section, we will sometimes use annotated abstractions, and sometimes use unannotated abstractions. This is just to be flexible in discussing their respective merits.]

There certainly is an unusual feature possessed by the annotated STLC that we have thus far. Consider two abstractions with annotation

λx:I.x

and

λx:B.x

and also, this abstraction, without an annotation,

λx.x

λx.x is the Identity function. It takes x as an argument, or input, and returns x as the value, or output.  It will work perfectly with any x whatsoever, no matter what the type of x.

Going back to λx:I.x and λx:B.x. They too are identity functions, but the first is only well-typed in contexts where it is a function Int->Int (i.e. for example (λx:I.x 5)), and similarly for the second function with Booleans (e.g (λx:B.x True)). In general, being more specific with types is a good thing (that is how we catch type errors), but in a case like this the excessive type annotation is 'busy'. 

Annotation here just seems 'fussy'. What we feel we might want is something like

λx:T.x where T is a type variable that can be instantiated to Int, or Bool, or Int->Int etc. Remember we do not have type variables in the STLC itself.

As we have seen, there are algorithms, such as the Hindley-Millner one, that can produce the annotations (basically without too much trouble).

We could just omit the annotations in cases where we wished to, and algorithmic type inference could produce the types for us. Behind the scenes, the algorithm would regard λx.x as having type T->T where T is a type variable and then instantiating the type to Int etc. where required.

In cases where expression may have a type that involves type variables its type is said to be polymorphic (i.e. may have difference instances of the same general form).

Let us build something up for discussion. Consider

λfirst.λsecond.second (that is a longer and clearer version of λf.λs.s). This expression is an abstraction, which has a body that is an abstraction (in fact, identity).  If this 'function' is applied to two arguments, it will drop the first and return the second Say we apply this to two identity applications as arguments
 
((λfirst.λsecond.second (λx.x 2)) (λx.x True))

Here is the Normal Order Reduction for that expression

((λsecond.second) (λx.x True ))

(λx.x True)

True

Now, let us annotate the variables going in the reverse order

True is an Bool

(λx:B.x True)

((λsecond:B.second) (λx:B.x True ))

((λfirst:I.λsecond:B.second (λx:I.x 2)) (λx:B.x True)) and that whole expression has type Bool.

How were these variables annotated?

Well, we just looked at the applications and saw what made sense. For example, in the application (λx.x True) the abstraction has to be Bool->Bool so we annotate the abstraction, within the application, as (λx:B.x True). Similarly, in the application (λx.x 2) the abstraction has to be Int->Int so we annotate the abstraction, within the application, as (λx:I.x 2).

Annotating these variables by algorithm— might Type Variables be a good idea?

As we have seen, there are algorithms, such as the Hindley-Millner one, that can produce the annotations (basically without too much trouble).

Then earlier we mentioned that having, for example, infinitely many identity functions λx:I.x, λx:B.x, λx:I->I.x, etc. seems a bit excessive.

Why couldn't we have

λx:T.x where T is a type variable that can be instantiated to Int, or Bool, or Int->Int etc. ?

then we could probably drop the annotation and let an algorithm produce the type for us. The actual type in use would an instantiation of the type variable.

An issue: passing functions as arguments ('let bound polymorphism')

Let's make our earlier example slightly more complex by passing the identity function in.

[A preliminary: passing functions/abstractions in is not particularly mystifying. Consider

(λfunction.(function True) λx.x)

this is an application, and it reduces


(λfunction.(function True) λx.x)

(λx.x True)

True

Similarly

(λfunction.(function 2) λx.x)

(λx.x 2)

2

]

Now we can build this into our more complex expression


(λfunction ((λfirst.λsecond.second (function 2)) (function True))     λx.x)

and this reduces to

((λfirst.λsecond.second (λx.x 2)) (λx.x True))

and then the same as before down to True.

But there is an issue here, and it arises with the first reduction step involving identity


function.((λfirst.λsecond.second (function 2)) (function True))     λx.x)

The identity function  λx.x on the outer right presumably needs to be polymorphic with a type variable. The 'same' function when it is substituted into (function 2) needs to have type Int->Int. Yet the 'same' function when it is substituted into (function True) needs to have type Bool->Bool.

So, what we have here is a question, or a consideration. Can a polymorphic expression have two separate and distinct type instances, or instance occurrences, when substituted into the same expression?

Part of the problem arises from this. In STLC, when we annotated a binding variable, say like this λx:I.x, that annotation applies throughout the scope or the body of the abstraction (in this case, x). So, one would think that were we to use a variable as an annotation, say, λx:TypeVar.x all reasoning, and instantiation, involving that variable would depend solely on the scope of the abstraction. However, when we wish, for example, to use the argument 2 to help us decide the type of the function in
(function 2) then we are going outside the scope of the function and into the local environment.

Now there is, so-to-speak, no right or wrong answers here. A type system is being devised and it can be better or worse for certain purposes. What Haskell, a modern functional programming language, does is to allow you to have both schemes in these settings (multiple instances of the same type, and no multiple instances of the same type). If you write the Haskell equivalent of

(λfunction.((λfirst.λsecond.second (function 2)) (function True))     λx.x)

the compiler will produce an error message that the expression is ill-typed because you cannot have a function which is both Int->Int and Bool->Bool. But then Haskell has the keyword 'let' which can be used to introduce local definitions. If you write the Haskell equivalent of

let f = λx.x

in

((λfirst.λsecond.second (f 2)) (f True))

then that will type check and will be type correct. The 'let' keyword invites multiple separate instances of the same type variable, if they are required.

How does STLC fare with this?

STLC has no type variables and every abstraction has to have a type annotation on its binding variable. The effect of this is that, roughly speaking, most of the problematic cases that we have been looking at will fail to have a type (although many of them will reduce to normal forms that do have a type). This failure to have a type initially comes about largely because a type cannot be both an Int and a Bool at the same time. They can reduce to a normal form with a type because one of the problematic components is redundant and disappears in the reduction. Also, many normal forms have a type (for example, a normal form might be an Int, say 39, or a Bool, True)

Neither of these expressions are going to have a type, no matter what other annotations are added.

(λfunction.((λfirst.λsecond.second (function 2)) (function True))     λx:B.x)


(λfunction.((λfirst.λsecond.second (function 2)) (function True))     λx:I.x)

But this can be annotated successfully

((λfirst:I.λsecond:B.second ( λx:I.x 2)) ( λx:B.x True))