Lambda

From Wikipedia:

"In mathematical logic and computer science, lambda calculus is a formal system for function definition, function application and recursion. The portion of lambda calculus relevant to computation is now called the untyped lambda calculus. In both typed and untyped versions, ideas from lambda calculus have found application in the fields of logic, recursion theory (computability), and linguistics, and have played an important role in the development of the theory of programming languages (with untyped lambda calculus being the original inspiration for functional programming, in particular Lisp, and typed lambda calculi serving as the foundation for modern type systems)."

As another example of use of Xsemantics we also developed a prototype implementation of a lambda-calculus in XTEXT (we'll show the grammar in the following); in this lambda-calculus we can specify the type of the parameter of the abstraction, but we can also leave it empty; we can then infer the type of each lambda term. In particular, we infer types using type variables when the type of a term can be generic. The types of this lambda-calculus can be basic types (in this example integer or string), arrow types, and type variables (denoted by identifiers).

The challenging part in writing a type system for this language is that we need to perform unification in order to infer the most general type (see, e.g., J. A. Robinson. Computational logic: The unification computation. Machine Intelligence, 6, 1971.).

Again, this is just a tutorial example, but this technique can be used to infer types in another language implemented in Xtext, especially for functional languages.

Lambda in a nutshell

You can think of lambda abstraction as a function definition (without a name), with a parameter (in this version we consider one single parameter) and a body, such as

lambda x. x

which is the identity function (given an argument it returns the same argument). Lambda application, which corresponds to function invocation, is denoted without the parenthesis, thus if we have a lambda abstraction M and an argument N we write M N to mean "invoke the function M passing N as the argument." Of course, M N is a well typed lambda term only if M is a function which takes as a parameter of a type T and N is a lambda term with a type conformant to T.

Both of the following definitions with an explicit type for the parameter are correct:

lambda x : string . x
lambda x : int . x

These two functions have types, respectively, string -> string (given a string it returns a string) and int -> int. Note that arrow types associate to the right, thus a -> b -> c is to be intended as a -> (b -> c); otherwise, we must use parenthesis.

Indeed, we can be more general and say that the parameter x can be "any" type, using a type variable (similar to Java generics):

lambda x : a . x

This function then has type a -> a; note that since we return the argument as we received it, then the return type must be the same as the argument type, thus the type variable a must be the same in a -> a.

In other cases, we cannot be generic; for instance, consider that in our language we have the unary operator - which can be used on integers only. Then, the function

lambda x . -x

imposes x to be an integer, thus this function has type int -> int.

Other functions can be partially generic, like the following one (which makes a little sense, it's used only as an example)

lambda x . 10

which has type a -> int.

We might also let the system infer the type (and that's what we intend to do with our type system definition).

For non trivial cases the type inference is more interesting than the examples we saw so far; for instance, consider this lambda abstraction

lambda x . lambda y. x y

which has type (a -> b) -> a -> b: how can this be inferred? Informally, x cannot have any type, since in the body we read x y then x must be a function; for the moment we give it a generic type X1 -> X2; what can the type of y be? It can be a generic type, say X3, but since we pass it to x then it must have the same type of the argument of x, thus we require X1 to be the same as X3. The result of x y will have the same type as the return type of x, i.e., X2. Thus, the above function has the following type: it takes an argument for the parameter x of type X1 -> X2, and it returns a function (the inner lambda abstraction) which takes an argument for the parameter y of type X1 and returns something of type X2. Thus, using different (and more readable) type variable names, (a -> b) -> a -> b (we used the parenthesis since by default arrow types associate to the right). Again, the type variables make the function generic, provided that the same type is used for all occurrences of a and the same type is used for all occurrences of b.

lambda x .lambda y.y x a -> (a -> b) -> b
lambda f .(lambda x.(f (f x))) (a -> a) -> a -> a
lambda f .lambda g.lambda x.(f(g x)) (a -> b) -> (c -> a) -> c -> b

Note that there are functions which cannot be typed (at least with simple type systems we're used to), e.g.,

lambda x . x x

cannot be typed, since x should be a function, say with type a -> b, but since we apply x to x it should also be of type a; however, a -> b and a cannot be unified, since a occurs in a -> b.

Lambda implemented in Xtext

This is the grammar for our simple lambda calculus in Xtext:

grammar org.eclipse.xsemantics.example.lambda.Lambda with org.eclipse.xtext.common.Terminals
generate lambda "http://xsemantics.sf.net/example/lambda/Lambda"

Program: term=Term;

// left associative
Term: TerminalTerm (=>({Application.fun=current} arg=TerminalTerm)*) ;

TerminalTerm returns Term:
    '(' Term ')' |
    Constant |
    Arithmetics |
    Variable |
    Abstraction
;

Constant: StringConstant | IntConstant ;
StringConstant: string=STRING;
IntConstant: int=INT;

Arithmetics: '-' term=Term;

Variable: ref=[Parameter];

Abstraction: 'lambda' param=Parameter '.' term=Term ;

Parameter: name=ID (':' type=Type)? ;

// right associative
Type: TerminalType ({ArrowType.left = current'->' right=Type)? ;

TerminalType returns Type:
    '(' Type ')' |
    BasicType |
    TypeVariable
;

BasicType:
    {IntType} 'int' |
    {StringType} 'string'
;

TypeVariable: typevarName=ID;

Lambda Type Inference

Thus, we want to write the type system definition in Xsemantics for Lambda, also inferring types (performing unification for inferring the most general type).

In the following we will skip import statements.

We start with the auxiliary functions and judgments:

system org.eclipse.xsemantics.example.lambda.xsemantics.LambdaXsemanticsSystem

validatorExtends org.eclipse.xsemantics.example.lambda.validation.AbstractLambdaJavaValidator

inject LambdaUtils lambdaUtils

auxiliary {
    notoccur(Type type, Type other)
        error stringRep(type) + " occurs in " + stringRep(other)
    typesubstitution(TypeSubstitutions substitutions, Type original) : Type
    unify(TypeSubstitutions substitutions, 
            Type left, Type right) : UnifyResult
        error "cannot unify " + stringRep(left) + 
            " with " + stringRep(right)
}

judgments {
    type |- TypeSubstitutions substitutions |> Term term : output Type
    paramtype |~ Parameter param : output Type
}

Note that this type system uses a utility class implementing a map for type substitutions which we will use during type inference and unification

package org.eclipse.xsemantics.example.lambda.xsemantics;

public class TypeSubstitutions {

    protected Map<String, Type> substitutions = new HashMap<String, Type>();

    public void reset() {
        substitutions.clear();
    }

    public void add(String typeVariableName, Type mapped) {
        substitutions.put(typeVariableName, mapped);
    }

    public Type mapped(String typeVariableName) {
        return substitutions.get(typeVariableName);
    }

    public Set<Entry<String, Type>> getSubstitutions() {
        return substitutions.entrySet();
    }
}

It uses UnionResult that is basically an implementation of a pair of Types.

Let's start with the auxiliary function implementations that check that a type variable does NOT occur in another type term (judgment notoccur, which takes two input parameters):

auxiliary notoccur(Type type, Type other) { true }

auxiliary notoccur(TypeVariable variable, TypeVariable other)
{
    variable.typevarName != other.typevarName
}

auxiliary notoccur(TypeVariable variable, ArrowType arrowType)
{
    notoccur(variable, arrowType.left)
    notoccur(variable, arrowType.right)
}

The base case (which simply always holds) catches all the other situations not handled by the specific cases; the second case states that a type variable does not occur in another one, if it has a different name, and the third one that a type variable does not in an arrow type if it occurs neither in the left nor in the right part of the arrow.

Now we show the implementation of the auxiliary function typesubstitution, which takes as input parameters a map for type substitution and the original type and returns the new type after performing substitutions.

auxiliary typesubstitution(TypeSubstitutions substitutions, Type type)
{
    type
}

auxiliary typesubstitution(TypeSubstitutions substitutions, TypeVariable variable)
{
    var mapped = substitutions.mapped(variable.typevarName)
    if (mapped != null) {
        val result = EcoreUtil::copy(mapped)
        typesubstitution(substitutions, result) // recursive
    } else
        variable
}

auxiliary typesubstitution(TypeSubstitutions substitutions, ArrowType arrowType)
{
    var Type subResult
    val result = EcoreUtil::copy(arrowType)
    subResult = typesubstitution(substitutions, arrowType.left)
    result.left = subResult
    subResult = typesubstitution(substitutions, arrowType.right)
    result.right = subResult
    result
}

The general case simply returns the input parameter as the result (no substitution needed). When we substitute a type variable we create a new Type by cloning the value the type variable maps to (if there's no existing mapping, it means that there's no need of substitution, and the result is simply the input argument). However, we cannot stop here: a type variable can map to another type variable (see the cases below for unification) which in turn can be mapped to something else and so on, thus, we must make sure we apply all the substitutions that concern a type variable: we apply the substitution recursively on the result.

For instance, if we have the mappings X1 to (int->X2) and X2 to string and we have apply the substitutions to the the type X1->X2, the result must be (int->string)->string.

The substitution for an arrow type simply delegates it to its components (but again, it clones the original type, and acts on the clone).

Cloning is mandatory, as explained in section WhyCloning.

Note that, if we do not need to have a reference to the original input type, the substitution judgment can also be invoked by passing the same argument both for the input and as the output parameter (as we will see in the following).

The auxiliary function for unification

unify(TypeSubstitutions substitutions, 
        Type left, Type right) : Type
    error "cannot unify " + stringRep(left) + 
        " with " + stringRep(right)

takes the substitutions map, two types and outputs the unified type: it tries to unify the two input types, and if the unification succeeds it outputs two new types which are the unified version of the two input types after performing the substitutions (which are also recorded in the substitutions parameter).

The implementation of this auxiliary function considers all possible cases that make sense (the default case simply fails); let's start considering the simpler ones:

auxiliary unify(TypeSubstitutions substitutions, Type t1, Type t2) {
    // if we get here we cannot unify the two types
    fail
    null


auxiliary unify(TypeSubstitutions substitutions, StringType t1, StringType t2) 
{
    EcoreUtil.copy(t1)
}

auxiliary unify(TypeSubstitutions substitutions, IntType t1, IntType t2)
{
    EcoreUtil.copy(t1)
}

auxiliary unify(TypeSubstitutions substitutions, TypeVariable typeVar, BasicType basicType)
{
    substitutions.add(typeVar.typevarName, basicType)
    EcoreUtil.copy(basicType)
}

auxiliary unify(TypeSubstitutions substitutions, BasicType basicType, TypeVariable typeVar)
{
    unify(substitutions, typeVar, basicType)
}

Besides the trivial cases, a type variable unifies with a basic type, and the type variable is replaced by (a clone) of that basic type (and the substitution is recorded). Note that we must also consider the symmetric case.

auxiliary unify(TypeSubstitutions substitutions, TypeVariable left, TypeVariable right)
{
    // unify both variables with a fresh new variable
    val fresh = lambdaUtils.createFreshTypeVariable
    substitutions.add(left.typevarName, fresh)
    substitutions.add(right.typevarName, fresh)
    fresh
}

When we need to unify two type variables we create a fresh new type variable (using the utility class LambdaUtils, and we map both variables to that new fresh variable.

auxiliary unify(TypeSubstitutions substitutions, TypeVariable v, ArrowType arrow)
{
    // occur check
    notoccur(v, arrow)
    substitutions.add(v.typevarName, arrow)
    EcoreUtil.copy(arrow)
}

auxiliary unify(TypeSubstitutions substitutions, ArrowType arrow, TypeVariable v)
{
    unify(substitutions, v, arrow)
}

auxiliary unify(TypeSubstitutions substitutions, ArrowType arrow1, ArrowType arrow2)
{
    val newArrow1 = EcoreUtil.copy(arrow1)
    val newArrow2 = EcoreUtil.copy(arrow2)
    var result = unify(substitutions, arrow1.left, arrow2.left)
    newArrow1.left = EcoreUtil.copy(result)
    newArrow2.left = EcoreUtil.copy(result)
    result = unify(substitutions, arrow1.right, arrow2.right)
    newArrow1.right = EcoreUtil.copy(result)
    newArrow2.right = EcoreUtil.copy(result)
    newArrow1
}

When we unify a type variable with an arrow type, before substituting that type variable with the arrow type, we must first check that the variable does not occur in the arrow type (otherwise the unification fails). And two arrow types are unified by unifying their components.

Now we can start examining the actual typing rules.

First of all we need a dedicated judgment for Param, since it is not a subclass of Term:

rule ParameterType
    G |~ Parameter param : Type type
from {
    {
        param.type != null
        type = EcoreUtil::copy(param.type)
    }
    or
    type = lambdaUtils.createFreshTypeVariable
}

Remember that we can write the type of abstraction's parameter, and in that case the type is the clone of the original one; otherwise, we give the parameter a brand new type variable.

The judgment for typing takes as a parameter also the substitutions mapping table, since, as we will see, when inferring a type, we must also apply the possible substitutions computed so far.

axiom StringConstantType
    G |- TypeSubstitutions substitutions |> 
        StringConstant stringConstant : lambdaUtils.createStringType

axiom IntConstantType
    G |- TypeSubstitutions substitutions |> 
        IntConstant intConstant : lambdaUtils.createIntType

The typing of constants is trivial.

For the typing of a variable, one might be tempted to write

// WRONG!
rule VariableType
    G |- TypeSubstitutions substitutions |> 
        Variable variable : Type type
from {
    G |~ variable.ref : type
}

that is, the type of a variable is the type of the referred parameter. However, this would not work, since in the meantime we might have collected substitutions for possible type variables occurring in the type type of the referred parameter. Thus, the type of a variable, is the type of the referred parameter AFTER we have applied to it possible substitutions, that is:

rule VariableType
    G |- TypeSubstitutions substitutions |> 
        Variable variable : Type type
from {
    type = typesubstitution(substitutions, 
        EcoreUtil::copy(env(G, variable.ref, Type)))
}

Note also that we rely on the fact that the type of the parameter is in the environment (the environment will be updated with such information later in the typing rule for abstraction).

rule ArithmeticsType
    G |- TypeSubstitutions substitutions |> 
        Arithmetics arithmetics : IntType intType
from {
    intType = lambdaUtils.createIntType
    G |- substitutions |> arithmetics.term : var Type termType
    // the term type must be unifiable with int type
    unify(substitutions, termType, intType)
}

The code for Arithmetics will be integer, however, we must check whether the type of the subterm has a type which can be unified with IntType.

rule AbstractionType
    G |- TypeSubstitutions substitutions |> Abstraction abstraction : ArrowType type
from {
    G |~ abstraction.param : var Type paramType
    G, abstraction.param <- paramType |- 
        substitutions |> abstraction.term : var Type termType
    paramType = typesubstitution(substitutions, paramType)
    termType = typesubstitution(substitutions, termType)
    type = lambdaUtils.createArrowType(paramType, termType)
}

For typing a lambda abstraction we take the type of the parameter and we type the body of the abstraction after putting the mapping for the type of the parameter in the environment (see also section Environment). Then we apply possible substitutions to the type of the parameter and the type of the body, and we return a brand new arrow type, using the two components.

rule ApplicationType
    G |- TypeSubstitutions substitutions |> Application application : Type type
from {
    G |- substitutions |> application.fun : var Type funType
    
    // make sure funType can be unified ArrowType
    var arrowType = lambdaUtils.createFreshArrowType
    arrowType = unify(substitutions, funType, arrowType) as ArrowType
    
    G |- substitutions |> application.arg : var Type argType
    
    // unify arrow left with the type of the argument
    unify(substitutions, arrowType.left, argType)
    
    // the result is the arrow right after substitutions
    type = typesubstitution(substitutions, arrowType.right)
}

For typing a lambda application, we require the left part of the application to be a function, thus we take its type and we make sure it can be unified with a generic arrow type. Then, we take the type of the argument, and we try to unify it with the left part of the arrow type. If also this unification succeeds, the resulting type is the right part of the arrow type (after substitutions).

Let's see, informally, how these rules infer the type for

lambda x. lambda f. f -x

To avoid ambiguities with the -> of ArrowType in the following example we use --> for substitution mappings (which are shown in { }). Try not to confuse mappings for parameters and mappings representing type variable substitutions, which have strings as keys. Mappings for the environment are represented as key <- value.

This is informally the trace of the rules:

* AbstractionType
** what is the type of x? a fresh type variable X1
** type the body (lambda f. f -x) with environment mapping x <- X1
*** AbstractionType
**** what is the type of f? a fresh type variable X2
**** type the body (f -x) with (additional) environment mapping f <- X2
***** ApplicationType
****** what is the type of the left part? the type variable X2
****** (we used VariableType)
****** but it must be an arrow type
****** unify X2 with a fresh arrow type (X3 -> X4)
****** this will generate the substitution 'X2' --> (X3 -> X4)
****** {'X2' --> (X3 -> X4)}
****** thus now the type of f is (X3 -> X4)
****** what is the type of the right part? it is int
****** (we used ArithmeticsType, and we unified type of x, X1 with int)
****** {'X2' --> (X3 -> X4), 'X1' --> int}
****** try to unify X3 with int
****** {'X2' --> (X3 -> X4), 'X1' --> int'X3' --> int}
****** thus now the type of f is (int -> X4)
****** the type of the application is X4
**** apply the substitution to the type of param and body
**** f of type X2, becomes f of type (int -> X4)
**** the type of body (f -x) becomes X4
**** the resulting type is then (int -> X4) -> X4
** apply the substitution to the type of param and body
** x of type X1 becomes x of type int
** the body has type (int -> X4) -> X4
** the type of our term is int -> (int -> X4) -> X4

This type, let's write it for simplicity as int -> (int -> a) -> a, says that we must pass to this lambda term a function which takes an integer, and a function which takes an integer and returns "any type"; as a result we will have that "any type". Thus the following lambda application terms are well typed (remember that application associates to the left):

(lambda x. lambda f. f -x) (10) (lambda y. -y)
(lambda x. lambda f. f -x) (1) (lambda y. "foo")

The first one will return 10 (actually --10), the second one "foo".

Why Cloning?

Remember that the types we are manipulating might have been explicitly specified in the text of the program by the programmer, thus, if we did not use clones, during substitutions, we would directly modifying the AST of the original program, which is not what we want.

Lambda Type Checking

Now that we wrote all the rules for inferring types of Lambda terms, we can write a checkrule (see section CheckRules) that says that a Lambda Program is correct if we can assign a type to the term (starting from an empty environment):

checkrule CheckProgram for
    Program program
from {
    lambdaUtils.resetCounter
    // if we can type the program term
    empty |- new TypeSubstitutions() |> program.term : var Type type
}

The invocation of resetCounter is just to reset the counter used by LambdaUtils to create fresh type variables.

Customizations for Lambda

Since type variables during the type inference have the form X<number> which makes it hard to visually interpret an inferred type, we use a custom StringRepresentation, namely LambdaStringRepresentationWithTypeBeautifier, which relies on LambdaTypeBeautifier: this will consistently replace type variables in an inferred type with letters. For instance, X1 -> X3 -> (X4 -> X3) would become a -> b -> (c -> b).

Another customization that we perform in this example, is the way error markers will be generated.

For instance, due to the way error markers are generated (section ErrorGeneration), if we have the following code

lambda f . -'f'

we would get an error since the rule ArithmeticsType cannot be applied. However, we would like also to know why it cannot be applied, i.e., because the string type of the string literal 'f' cannot be unified with the int type required by ArithmeticsType.

We can fix this by injecting a custom XsemanticsValidatorFilter, so that it also shows the rules which failed from that point on:

class LambdaValidatorFilter extends XsemanticsValidatorFilter {
    
    @Inject extension TraceUtils
    
    override filterRuleFailedExceptions(RuleFailedException e) {
        val inner = e.innermostRuleFailedExceptionWithNodeModelSources
        if (inner != null)
            return inner.failureAsList
        else // we must return at least a failure, so we default to the passed one
            return Lists::newArrayList(e)
    }
}

Use of Type Inference in the IDE

We can also use the generated Java code for the inference type system to create an editor popup action (for the editor of Lambda) to automatically infer types and add the types in the program text, in particular we set the types of the parameters of abstractions. Most of the code we show here deals with Eclipse plugin development: the part which uses the code of the type system generated by Xsemantics is minimal.

This is done in the class LambdaTypeModifier (you can see the complete code in the sources of the Lambda example); we show here just the main parts:

public class LambdaTypeModifier {

    @Inject
    protected LambdaXsemanticsSystem typeSystem; // generated by Xsemantics

    @Inject
    protected Provider<LambdaTypeBeautifier> lambdaTypeBeautifierProvider;

    public RuleFailedException setAllTypes(Term term) {
        Result<Type> result = typeSystem.type(new TypeSubstitutions(),
                LambdaTermUtils.cloneWithoutTypes(term));
        if (result.getRuleFailedException() != null)
            return result.getRuleFailedException();
        Type inferredType = result.getValue();
        beautifyType(inferredType);
        setAllTypes(term, inferredType);
        return null;
    }
    
    protected void beautifyType(Type paramType) {
        lambdaTypeBeautifierProvider.get().beautifyTypeVariables(paramType);
    }
    ...

Through the generated type system we infer the type of the term and then we update all the types of the parameters of the abstractions in the term. Note that before we "beautify" the type variable names.

As explained in the Xtext documentation, since we need to modify the EMF model of the program in the open editor, we need to use the IXtextDocument (src) and its modify method, with a IUnitOfWork (src); this is done by the class LambdaTermModifier:

public class LambdaTermModifier {

    @Inject
    protected LambdaTypeModifier lambdaTypeModifier;

    public void modifyTermWithInferredType(IXtextDocument xtextDocument) {
        xtextDocument.modify(new IUnitOfWork.Void<XtextResource>() {
            public void process(XtextResource resource) {
                Program program = (Program) resource.getContents().get(0);
                lambdaTypeModifier.setAllTypes(program.getTerm());
            }
        });
    }

}

Then we have the code implementing the editor action (InferTypesAction):

public class InferTypesAction implements IEditorActionDelegate {
    protected IEditorPart editor;

    protected LambdaTermModifier lambdaTermModifier;

    public InferTypesAction() {
        lambdaTermModifier = LambdaUiUtil.getInjector().getInstance(
                LambdaTermModifier.class);
    }

    public void run(IAction action) {
        IXtextDocument xtextDocument = ((XtextEditor) editor).getDocument();
        lambdaTermModifier.modifyTermWithInferredType(xtextDocument);
    }

    public void selectionChanged(IAction action, ISelection selection) {

    }

    public void setActiveEditor(IAction action, IEditorPart targetEditor) {
        this.editor = targetEditor;
    }

}

Let's see this editor action in action: suppose you have the following opened lambda file (this is the abstraction for composing functions):

lambda f .
    lambda g .
        lambda x .
            f (g x)

We can right click on the editor and select the infer type action. And the content of the editor will be updated with the inferred types:

The original lambda program

Invoke the context menu action

The modified lambda program with inferred types