Shlomo Yona <yona@cs.technion.ac.il> http://yeda.cs.technion.ac.il/~yona/
I am pretty sure we will not have time to go through all the slides in depth.
I'll occasionally skip a slide or a few, if I see we're running out of time.
I hope to pass on the ideas and the simple implementation, and the rest, you can complete later using the references given at the end of the lecture.
Closure is a notion out of the Lisp world that says if you define an anonymous function in a particular lexical context, it pretends to run in that context even when it's called outside the context.
It's useful for setting up little bits of code to run later, such as callbacks.
sub newprint {
my $x = shift;
return sub { my $y = shift; print "$x, $y!\n"; };
}
$h = newprint("Howdy");
$g = newprint("Greetings");
# Time passes...
&$h("world");
&$g("earthlings");
This prints
Howdy, world!
Greetings, earthlings!
Note particularly that $x continues to refer to the value passed into newprint() despite "my $x" having gone out of scope by the time the anonymous subroutine runs. That's what a closure is all about.
my $print_hello = sub { print "Hello, world!\n"; };
# later on...
$print_hello->(); # prints the Hello, world!
&$print_hello; # also prints the Hello, world!
sub make_hello_printer {
return sub { print "Hello, world!"; }
}
# later on...
my $print_hello = make_hello_printer();
$print_hello->()
sub make_hello_printer {
my $message = "Hello, world!";
return sub { print $message; }
}
# later on...
my $print_hello = make_hello_printer();
$print_hello->()
As you'd expect, that prints out the Hello, world! message.
Nothing special going on here, is there? Well, actually, there is.
This is a closure. Did you notice?
What's special is that the subroutine reference we created refers to a lexical variable called $message.
The lexical is defined in make_hello_printer, so by rights, it shouldn't be visible outside of make_hello_printer, right? We call make_hello_printer, $message gets created, we return the subroutine reference, and then $message goes away, out of scope.
Except it doesn't. When we call our subroutine reference, outside of make_hello_printer, it can still see and retrieve the correct value of $message.
The subroutine reference forms a closure, ``enclosing'' the lexical variables it refers to.
sub make_counter {
my $start = shift;
return sub { $start++ }
}
my $from_ten = make_counter(10);
my $from_three = make_counter(3);
print $from_ten->(); # 10
print $from_ten->(); # 11
print $from_three->(); # 3
print $from_ten->(); # 12
print $from_three->(); # 4
We've created two "counter" subroutines, which have completely independent values.
This happens because each time we call make_counter, Perl creates a new lexical for $start, which gets wrapped up in the closure we return.
So $from_ten encloses one $start which is initialized to 10, and $from_three encloses a totally different $start, which starts at 3.
An interesting thing happens when an anonymous subroutine uses a my variable defined in an enclosing scope.
Each time the enclosing scope is entered, the subroutine gets a different copy of the my variable.
An anonymous subroutine that refers to a my variable from an enclosing scope is called a closure.
Or, stated another way, a closure is an anonymous subroutine that has access to private variables of its own that are otherwise inaccessible.
Objects are data that have some subroutines attached to them.
Closures are subroutines that have some data attached to them.
for (0..2) {
my $time = time; # a new $time each time
push @stamp, sub { $time }; # each new anonymous
sleep 2; # subroutine has its
} # very own $time
for (0..2) {
print "stamp->($_): ", $stamp[$_]->(), "\n";
}
Each of those anonymous subroutines has its own copy of $time, created when my $time was encountered at the top of the loop.
Each copy of $time can be accessed by the copy of sub { $time } that it is bound to, even (or especially) later in the execution of the program when the my variable has gone out of scope.
Here is the classical counter example again:
sub make_counter {
my $i = 0;
sub { $i++ };
}
$count1 = make_counter;
$count2 = make_counter;
print "count1 is ", $count1->(), "\n"; # count1 is 0
print "count1 is ", $count1->(), "\n"; # count1 is 1
print "count2 is ", $count2->(), "\n"; # count2 is 0
print "count2 is ", $count2->(), "\n"; # count2 is 1
Two or more closures can even share a common set of variables, allowing a programming style that starts to look very object-oriented indeed.
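A minimal sketch of two closures sharing one variable. The name make_account and the deposit/balance keys are hypothetical, chosen only for this illustration:

```perl
use strict;
use warnings;

# make_account is a hypothetical name used only for this illustration.
sub make_account {
    my $balance = shift;                       # one private lexical per call
    my %method = (
        deposit => sub { $balance += shift },  # both closures capture
        balance => sub { $balance },           # the very same $balance
    );
    return \%method;
}

my $acct = make_account(100);
$acct->{deposit}->(50);
print $acct->{balance}->(), "\n";    # prints 150
```

Nothing outside the two closures can reach $balance, which is why this style starts to resemble an object with private data and methods.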
sub make_binary { # creates a code
eval "sub { $_[0] }"; # ref with string eval.
}
while (<DATA>) { # read subroutine
my ($name, $code) = # bodies from DATA
split /\s+/, $_, 2; # and create
$op{$name} = make_binary $code; # code refs out of them.
}
for (sort keys %op) { # call each of the
print "2 $_ 3 = ", # subroutines for
$op{$_}->(2,3),"\n"; # arguments (2,3).
}
__DATA__
add $_[0] + $_[1]
sub $_[0] - $_[1]
mul $_[0] * $_[1]
div $_[0] / $_[1]
max $_[0] > $_[1] ? $_[0] : $_[1]
Runtime generation of functions -- output
When run, this prints:
2 add 3 = 5
2 div 3 = 0.666666666666667
2 max 3 = 3
2 mul 3 = 6
2 sub 3 = -1
Note the sorted order.
Creating subroutines for pattern matching
It's not uncommon to need to perform one or more pattern matches that are specified at run time.
For example, you might be writing a Perl program to sort through your mail or news. Such a program would likely read in a "kill file" of patterns to match against the headers. You can specify matches at run time by interpolating variables into regular expressions, but such regular expressions will be repeatedly compiled, at a considerable cost in speed.
The /o option provides a means for compiling a pattern match containing interpolated variables only once. However, if you have several such matches to deal with, you are faced with a bit of a problem.
Creating subroutines for pattern matching (cont.)
Using closures in combination with eval allows you to generate subroutines that have particular regular expressions "locked in" with the same flexibility (and efficiency!) as if the expressions were specified at compile time.
sub make_grep {
my $pat = shift;
eval 'sub { grep /$pat/o, @_ }';
}
$find_us =
make_grep q/(?i)\b(joseph|randal)\b/; # q// takes no flags, so (?i) goes inside
@found = &$find_us(<STDIN>);
The make_grep subroutine is a subroutine factory. The pattern is passed in as the first (only) argument as a string. $find_us contains a reference to a subroutine that looks for joseph or randal case ignored. And @found contains all matching lines from STDIN.
Creating subroutines for pattern matching (cont.)
The key to this construct is the use of string eval in make_grep. Using /o inside string eval still means "compile once", but now "once" means once per eval.
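On Perl 5.005 and later, a similar "locked in" pattern can be had without string eval by using the qr// operator. This is a different technique from the one on the slide, shown only for comparison; make_grep_qr is a hypothetical name:

```perl
use strict;
use warnings;

# make_grep_qr is a hypothetical name; it uses qr// instead of string eval.
sub make_grep_qr {
    my $pat = shift;
    my $re  = qr/$pat/;                    # compiled here, when the factory runs
    return sub { grep { $_ =~ $re } @_ };  # the closure keeps its own $re
}

my $find_us = make_grep_qr('(?i)\b(joseph|randal)\b');
my @found   = $find_us->("Randal wrote this\n", "nobody here\n", "ask joseph\n");
print @found;    # prints the first and third lines
```

Each call to make_grep_qr produces a closure with its own compiled pattern, so several such matchers can coexist.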
Lambda Calculus -- Background
When computer scientists want to study what is computable, they need a model of computation that is simpler than real computers are. The usual model they use involves a Turing Machine, which has the following parts:
- One state register which can hold a single number, called the state; the state register has a maximum size specified in advance.
- An infinite tape of memory cells, each of which can hold a single character, and a read-write head that examines a single square at any given time.
- A finite program, which is just a big table. For any possible number N in the register, and any character in the currently-scanned memory cell, the table says to do three things: it has a number to put into the register, replacing what was there before; it has a new character to write into the current memory cell, replacing what was there before; and it has an instruction to the read-write head to move to the next cell, the previous cell, or to stay still.
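The three parts above can be sketched as a tiny simulator. This is only an illustration under stated assumptions: the sample program (which appends a 1 to a block of 1s), the use of _ for a blank cell, and the rule that the machine halts when the table has no entry are all choices made for this sketch, not part of the lecture.

```perl
use strict;
use warnings;

# Transition table: "state,char" => [ new state, char to write, head move ].
# Head move is +1 (next cell), -1 (previous cell), or 0 (stay still).
# This hypothetical program appends a 1 to a block of 1s.
my %program = (
    '0,1' => [ 0, '1', +1 ],   # skip over the existing 1s
    '0,_' => [ 1, '1',  0 ],   # first blank: write a 1, enter state 1
);

sub run_machine {
    my ($program, $input) = @_;
    my %tape;                                # sparse tape: cell index => char
    my @chars = split //, $input;
    @tape{ 0 .. $#chars } = @chars;
    my ($state, $head) = (0, 0);
    while (1) {
        my $char = exists $tape{$head} ? $tape{$head} : '_';
        # Assumption of this sketch: halt when the table has no entry.
        my $rule = $program->{"$state,$char"} or last;
        my ($new_state, $write, $move) = @$rule;
        $tape{$head} = $write;
        $state = $new_state;
        $head += $move;
    }
    return join '', map { $tape{$_} } sort { $a <=> $b } keys %tape;
}

my $result = run_machine(\%program, '111');
print "$result\n";    # prints 1111
```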
Lambda Calculus -- Background (cont.)
This may not seem like a very reasonable model of computation, but computer scientists have exhibited Turing machines that can do all the things you usually want computers to be able to do, such as performing arithmetic computations and running interpreter programs to simulate the behavior of other computers.
They've also shown that a lot of obvious `improvements' to the Turing machine model, such as adding more memory tapes, random-access memory, more read-write heads, more registers, or whatever, don't actually add any power at all; anything that could be computed by such an extended machine could also have been computed by the original machine, although perhaps slowly.
Lambda Calculus -- Background (cont.)
Finally, a lot of other totally different models for computation turn out to be equivalent in power to the Turing machine model. Each of these models has some feature about it that suggests that it really does correspond well to our intuitive idea of what is computable. One such model is the Lambda Calculus, invented by Alonzo Church. Lambda Calculus is intended to capture, in a very simple way, the idea of creating a function and then calling it on an argument.
Miraculously, the Lambda Calculus model is both simpler than the Turing machine model and more practical for real computation. The programming language Lisp, for example, is little more than Lambda Calculus in disguise---and not a very disguising disguise, either.
Lambda Calculus
If you wanted to investigate Turing machines, you'd have to start by writing a simulator; then you'd have to write Turing machine programs to perform functions like addition and multiplication, and that soon becomes impossibly difficult. But to investigate Lambda Calculus, you don't need a simulator. Lambda Calculus isn't a machine; it's more like a programming language. In fact, it is a programming language. It's a programming language with a syntax for defining functions and invoking them, and nothing else. The language is so simple that you can view it as a subset of Perl.
The Lambda Calculus model of computation is very, very simple. There are only two operations. You can create a function of one argument with a specified body, and you can apply one of these functions to an argument.
Lambda Calculus -- Adding numbers
Let's fast forward to the future and see what we're after:
print_number($ADD->($ONE)->($ONE));
# prints: "This number is actually 2."
Let's look inside and see how this was done:
lambda-brief.pl
Let's learn why it was done this way (as time permits).
Lambda Calculus -- %x.B
The first operation is denoted like this:
%x.B
where B is the body and x is the formal parameter of the function. The % is usually written with a lowercase letter lambda, but my typewriter has no lambda, so I'll use % instead. B here is some arbitrary expression.
When the function is invoked on some argument A, you locate the occurrences of the formal parameter x inside the body B, and you replace each x with a copy of the argument A.
This is just what any language does when it calls a function: It looks in the body of the function, and replaces copies of the formal parameter with the actual argument.
Lambda Calculus -- Example
Here's a function:
%x.(x x y)
If we apply this function to the argument (p q), we get
((p q) (p q) y)
To denote function application, we just juxtapose the function and its argument. So the function application we just discussed is denoted this way:
(%x.(x x y) (p q))
We say that this expression reduces to the simpler expression
((p q) (p q) y)
Lambda-expressions in Perl
If we want to write %x.B, we'll just say
sub {
my $x = shift;
B
}
%x.B means to create a function, which, when applied to an argument A, substitutes it into B in place of x, and yields the result. And that's exactly what sub { my $x = shift; B } does.
Lambda-expressions in Perl (cont.)
In Lambda Calculus, (p q) means to apply a function p to an argument q; Perl has that too. We'll just say:
$p->($q)
That's all we need; we can translate our Lambda Calculus expressions directly into Perl and have the Perl interpreter evaluate them immediately. Only the syntax is different, and not even very different at that.
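As a small worked illustration of the translation, here are two lambda terms written as Perl subs. The names $identity, $twice, and $add_one are hypothetical, and $add_one uses a native Perl number only so that there is something visible to print:

```perl
use strict;
use warnings;

# %x.x  (the identity function)
my $identity = sub { my $x = shift; $x };

# %f.(%x.(f (f x)))  (apply f twice)
my $twice = sub {
    my $f = shift;
    sub { my $x = shift; $f->( $f->($x) ) };
};

# A native Perl function stands in for an arbitrary lambda term here.
my $add_one = sub { my $n = shift; $n + 1 };

my $same  = $identity->("q");            # (%x.x q) reduces to q
my $three = $twice->($add_one)->(1);     # f applied twice to 1
print "$same $three\n";                  # prints "q 3"
```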
Lambda Calculus -- Function application associates to the left
so that
(a b c)
is an abbreviation for
((a b) c)
here we apply the function a to the argument b, and then apply the result to the argument c.
Lambda Calculus
We can spend some time to build up all the stuff we need for our calculus, using the terminology of Lambda Calculus, but I suppose you're probably not interested in learning new syntax...
Since you're here for Perl - let me briefly summarize some fundamental details, and you'll have to believe me I'm telling you the truth, and then we can directly go see how to use this stuff in Perl.
Lambda Calculus -- Normal Form
In 1936, Church and Rosser proved that no expression has more than one normal form, and so any two sequences of reductions that end with a normal form give you the same normal form.
This means that we can consider the normal form to be the `value' of an expression: It's what's left when we finish evaluating it, and it doesn't matter how we carry out the steps in the evaluation.
Lambda Calculus -- What do we need for real computation?
Now what does all this have to do with real computation?
For computation we need several things:
- We need boolean constants and an if-then test,
- We need numbers,
- We need arithmetic,
and it would be nice to have some data structures too.
Currying
At first it's not clear that we're going to be able to accomplish this.
We only have functions of one argument; how are we going to express addition, which is a function of two arguments?
It turns out that the restriction to one-argument functions is no restriction at all. That's because we can play a trick called currying.
Currying (cont.)
First, imagine the function f3 that takes one argument and adds 3 to it.
Now imagine the similar function f4 that takes one argument and adds 4 to it.
Now imagine all the other functions in this family. Certainly they're all functions of one argument.
Our `add' function is going to take one argument.
If that argument is 3, it'll return f3. If the argument is 4, it'll return f4. If it's some other number, add will do the analogous thing.
Currying (cont.)
Now let's see what happens when we try to evaluate
(add 3 4)
This is just short for
((add 3) 4)
We already said that (add 3) will produce f3:
(f3 4)
f3 is the function that adds 3 to its argument, so the result of this is
7
that's just what we wanted---we put in 3 and 4 and got 7.
Currying -- Doing it in Perl
In Perl, we could express it this way:
sub add {
my $x = shift;
my $f = sub {
my $y = shift;
return $x + $y;
}; # the assignment to $f needs its terminating semicolon
return $f;
}
add gets one argument $x, and constructs a function, $f.
$f is the function that gets a number $y, adds $x to it, and returns the result. add itself returns $f. When you apply $f to a number $y, it adds $x to it; when you apply add to a number $x, you get $f.
Currying -- Doing it in Perl (cont.)
In Perl the notation is a little cumbersome; add actually returns a code reference, and to call it you have to use the &{...}(...) syntax, so it looks like this:
$sum = &{add(3)}(4); # Sum is 7
In Perl 5.004 and later, there's a tidier notation, which we'll use from now on:
$sum = add(3)->(4); # Sum is 7
If $add is a reference to the add function itself, it looks like this:
$sum = $add->(3)->(4); # Sum is 7
As promised, we have an addition function that is made of functions of only one argument.
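The same trick works for any two-argument function. A minimal sketch, assuming a hypothetical helper named curry2 that is not part of the lecture:

```perl
use strict;
use warnings;

# curry2 is a hypothetical helper: it turns a two-argument function
# into a chain of one-argument functions.
sub curry2 {
    my $f = shift;
    return sub {
        my $x = shift;
        return sub {
            my $y = shift;
            return $f->($x, $y);
        };
    };
}

my $add  = curry2(sub { $_[0] + $_[1] });
my $mul  = curry2(sub { $_[0] * $_[1] });
my $add3 = $add->(3);          # plays the role of f3 from the text
my $sum  = $add3->(4);
my $prod = $mul->(3)->(4);
print "$sum $prod\n";          # prints "7 12"
```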
Back to general programming -- our goals
Our goals are to develop Lambda Calculus versions of if-then tests, boolean and integer constants, arithmetic operators, and some simple data structures.
if-then
The if-then test is most fundamental.
We can imagine that `if' is a function of three arguments.
The first is a boolean value; the second is the `then' clause, and the third is the `else' clause.
Function Call Semantics
We have a problem, and the problem is that at any particular step, there might be any number of ways to reduce a given Lambda Calculus expression.
And we have to pick the right one.
The Church-Rosser theorem says that no matter how we do it, the normal form that we get at the end will be the same one---if we actually reach a normal form.
But if we follow the wrong path, we might never reach any normal form at all.
Function Call Semantics (cont.)
sub loop_forever { while (1) { 1; } }
sub return_Q { my $arg = shift; return "Q" }
In Perl, the expression
return_Q(loop_forever())
does indeed loop forever, even though return_Q never uses its argument, because Perl has call by value semantics: This means that a function's arguments are always fully evaluated before the function is called. The other option is call by name, in which the argument is passed to the function without being evaluated, and is only evaluated later if it is required.
Function Call Semantics (cont.)
You might think that since call-by-value is workable for Perl, it will work in practice in Lambda Calculus also. But you'd be wrong: call by value doesn't always work in Perl; there are some essential places where Perl (and every other language) uses call by name instead. To see why, suppose you were to try to write your own function to replace Perl's if construction:
trying to replace Perl's if construction
sub my_if {
my ($condition, $then_part, $else_part) = @_;
if ($condition) { $then_part } else { $else_part }
}
Then you'd write
$max = my_if($x > $y, $x, $y);
and in fact this works, for this example.
But it doesn't work in general:
trying to replace Perl's if construction (cont.)
sub factorial {
my $n = shift;
return my_if($n == 0, 1, $n*factorial($n-1));
}
This function loops forever on any input. When n=0, it's important that we not make another recursive call---but this function does try to evaluate 0*factorial(-1) and pass this result to my_if; it doesn't matter that my_if would eventually have chosen the other branch, which was 1; with call by value we have to compute both branches before we can ask which one we want. And that defeats the whole point of if which is to compute one or the other, but not both.
trying to replace Perl's if construction (cont.)
What we'd like to do is somehow express IF in a way that delays the evaluation of x and y until after the IF has selected one or the other of them.
The simplest way to delay evaluation is to wrap up $x and $y into functions. This does not evaluate x, but rather creates a function that will evaluate it sometime in the future, when we apply the function to an argument; the argument is ignored, and out pops x.
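Applied to the earlier my_if example, the wrapping might look like the sketch below. The name lazy_if is hypothetical, and for brevity the wrapper subs here take no argument at all rather than ignoring one:

```perl
use strict;
use warnings;

# lazy_if is a hypothetical variant of my_if: the branches arrive
# wrapped in subs, and only the chosen one is ever called.
sub lazy_if {
    my ($condition, $then_part, $else_part) = @_;
    return $condition ? $then_part->() : $else_part->();
}

sub factorial {
    my $n = shift;
    return lazy_if($n == 0,
                   sub { 1 },
                   sub { $n * factorial($n - 1) });
}

my $result = factorial(5);
print "$result\n";    # prints 120
```

Because the recursive call now sits inside a sub that is never invoked when $n is 0, the recursion stops as intended.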
Lambda-expressions in Perl
So, for example, the straightforward Perl translation of
IF = %b.(%x.(%y.((b x) y)))
is:
$IF = sub {
my $b = shift;
sub {
my $x = shift;
sub {
my $y = shift;
$b->($x)->($y);
}
}
};
Lambda-expressions in Perl (cont.)
Here are a few other basic definitions:
# TRUE = %x.(%y.x)
$TRUE = sub {
my $x = shift;
sub {
my $y = shift;
$x;
}
};
# FALSE = %x.(%y.y)
$FALSE = sub {
my $x = shift;
sub {
my $y = shift;
$y;
}
};
Lambda-expressions in Perl (cont.)
Let's try that out to see that it works right:
print $IF->($TRUE)->("then")->("else"); # prints "then"
print $IF->($FALSE)->("then")->("else"); # prints "else"
We didn't need to use the delayed-evaluation trick here because it didn't matter that the strings "then" and "else" were evaluated before the IF was called.
Data Structures -- ordered pair
The simplest data structure is the ordered pair.
We'll define a pair function that takes two values and combines them into one, from which the original values can still be extracted.
In Lisp this operation is called cons, although the object produced by cons is in fact called a `pair'.
Data Structures -- ordered pair (cont.)
We want to have PAIR, FIRST, and SECOND functions such that
(FIRST (PAIR A B)) -=- A
(SECOND (PAIR A B)) -=- B
The insight here is that we can use a partially-evaluated IF construction.
Lambda-expressions in Perl -- implementing PAIR
# PAIR = %a.(%b.(%f.f a b))
$PAIR = sub {
my $a = shift;
sub {
my $b = shift;
sub {
my $f = shift;
$f->($a)->($b);
}
}
};
Lambda-expressions in Perl -- implementing FIRST and SECOND
# FIRST = %p.(p TRUE)
$FIRST = sub {
my $p = shift;
$p->($TRUE);
};
# SECOND = %p.(p FALSE)
$SECOND = sub {
my $p = shift;
$p->($FALSE);
};
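To check the two required properties, the definitions can be gathered into one self-contained script. This is a sketch that uses lexical (my) variables and plain Perl strings as stand-ins for the arbitrary terms A and B:

```perl
use strict;
use warnings;

my $TRUE  = sub { my $x = shift; sub { my $y = shift; $x } };
my $FALSE = sub { my $x = shift; sub { my $y = shift; $y } };
my $PAIR  = sub {
    my $left = shift;
    sub {
        my $right = shift;
        sub { my $f = shift; $f->($left)->($right) };
    };
};
my $FIRST  = sub { my $p = shift; $p->($TRUE) };
my $SECOND = sub { my $p = shift; $p->($FALSE) };

# Perl strings stand in for arbitrary lambda terms A and B.
my $pair       = $PAIR->("A")->("B");
my $got_first  = $FIRST->($pair);     # the pair is handed TRUE, which picks A
my $got_second = $SECOND->($pair);    # the pair is handed FALSE, which picks B
print "$got_first $got_second\n";     # prints "A B"
```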
Lambda-expressions in Perl -- implementing ZERO and IS_ZERO
# ZERO = PAIR TRUE TRUE
$ZERO = $PAIR->($TRUE)->($TRUE);
# IS_ZERO = %n.(FIRST n)
$IS_ZERO = sub {
my $n = shift;
$FIRST->($n);
};
Ordered pair -- example
As Lisp programmers know, with pairs you can construct almost any interesting data structure.
For example, a linked list is a sequence of pairs, each of whose SECOND elements is the pair that represents the `rest' of the list.
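A short sketch of such a list, built from nested pairs. The string "NIL" is only a placeholder chosen here to mark the end of the list; the lecture does not specify an end marker:

```perl
use strict;
use warnings;

my $TRUE   = sub { my $x = shift; sub { my $y = shift; $x } };
my $FALSE  = sub { my $x = shift; sub { my $y = shift; $y } };
my $PAIR   = sub { my $l = shift; sub { my $r = shift;
                 sub { my $f = shift; $f->($l)->($r) } } };
my $FIRST  = sub { my $p = shift; $p->($TRUE) };
my $SECOND = sub { my $p = shift; $p->($FALSE) };

# The list (a b c); "NIL" is a placeholder end marker for this sketch.
my $list = $PAIR->("a")->( $PAIR->("b")->( $PAIR->("c")->("NIL") ) );

my $item1 = $FIRST->($list);                            # head of the list
my $item2 = $FIRST->( $SECOND->($list) );               # head of the rest
my $item3 = $FIRST->( $SECOND->( $SECOND->($list) ) );  # and so on
print "$item1 $item2 $item3\n";                         # prints "a b c"
```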
Numbers
Now we're in good shape to do numbers.
Integers have a few fundamental properties: There's a ZERO element, and an IS_ZERO function to recognize it.
Every number has a successor (the next number), which is generated by the SUCC function.
And finally, every number except zero has a predecessor, computed by the PRED function.
These four values must satisfy the following properties:
Numbers (cont.)
IS_ZERO ZERO -=- TRUE
IS_ZERO (SUCC x) -=- FALSE
PRED (SUCC x) -=- x
There are a few other properties they should satisfy: if x and y are different numbers, then (SUCC x) and (SUCC y) should be different also, and so on. But these three are enough for now.
Numbers (cont.)
Now, for example, we can construct the number ONE:
ONE = SUCC ZERO
TWO = SUCC ONE
and so on.
It turns out that these few operations are enough to define any arithmetic function on the natural numbers. For example, here's the addition function:
sub add {
my ($m, $n) = @_;
if (IS_ZERO($m)) { return $n }
else { return add(PRED($m),SUCC($n)) }
}
Numbers (cont.)
All the tools we need here are available: We can construct functions that bind arguments and return values; we have an if-then-else test; we have IS_ZERO and SUCC and PRED. Similarly, here's the function that computes whether or not two numbers are equal:
sub equal {
my ($m, $n) = @_;
if (IS_ZERO($m)) {
return IS_ZERO($n);
} else {
if (IS_ZERO($n)) {
return FALSE;
} else {
return equal(PRED($m), PRED($n));
}
}
}
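The two functions above are written as outlines. A runnable sketch follows, with the caveat that it leans on Perl's own recursion and conditionals rather than building them from lambda terms. The helpers is_true and to_perl are hypothetical names introduced here, and the number representation is the pair-based one used in this lecture:

```perl
use strict;
use warnings;

my $TRUE    = sub { my $x = shift; sub { my $y = shift; $x } };
my $FALSE   = sub { my $x = shift; sub { my $y = shift; $y } };
my $PAIR    = sub { my $l = shift; sub { my $r = shift;
                  sub { my $f = shift; $f->($l)->($r) } } };
my $FIRST   = sub { my $p = shift; $p->($TRUE) };
my $SECOND  = sub { my $p = shift; $p->($FALSE) };
my $ZERO    = $PAIR->($TRUE)->($TRUE);
my $IS_ZERO = sub { my $n = shift; $FIRST->($n) };
my $SUCC    = sub { my $n = shift; $PAIR->($FALSE)->($n) };
my $PRED    = sub { my $n = shift; $SECOND->($n) };

# Hypothetical helper: turn a lambda boolean into a Perl boolean.
sub is_true { my $bool = shift; return $bool->(1)->(0) }

sub add {
    my ($m, $n) = @_;
    return $n if is_true($IS_ZERO->($m));
    return add($PRED->($m), $SUCC->($n));
}

sub equal {
    my ($m, $n) = @_;
    return $IS_ZERO->($n) if is_true($IS_ZERO->($m));
    return $FALSE         if is_true($IS_ZERO->($n));
    return equal($PRED->($m), $PRED->($n));
}

# Hypothetical helper: count PRED steps down to ZERO.
sub to_perl {
    my $n = shift;
    my $count = 0;
    until (is_true($IS_ZERO->($n))) { $count++; $n = $PRED->($n) }
    return $count;
}

my $TWO   = $SUCC->($SUCC->($ZERO));
my $THREE = $SUCC->($TWO);
my $sum   = to_perl(add($TWO, $THREE));
my $same  = is_true(equal(add($TWO, $SUCC->($ZERO)), $THREE));
print "$sum $same\n";    # prints "5 1"
```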
Debugging Lambda Calculus Programs in Perl
Let's check to make sure that this works:
print $IS_ZERO->($ZERO); # prints CODE(0x81179cc)
Oops! This anonymous function that we got might be the TRUE function %x.(%y.x), or it might be something else, and there's no simple way to peek inside to see. Let's define a utility function that analyzes a Lambda Calculus boolean and tells us which it is:
sub print_bool {
my $f = shift;
print $IF->($f)->("It's true")->("It's false");
}
print_bool($IS_ZERO->($ZERO)); # prints "It's true"
# SUCC = %n.(PAIR FALSE n)
$SUCC = sub {
my $n = shift;
$PAIR->($FALSE)->($n);
};
# ONE = SUCC ZERO
$ONE = $SUCC->($ZERO);
# TWO = SUCC ONE
$TWO = $SUCC->($ONE);
print_bool($IS_ZERO->($ONE)); # prints "It's false"
# PRED = %n.(SECOND n)
$PRED = sub {
my $n = shift;
$SECOND->($n);
};
sub convert_to_perl_number {
my $n = shift;
return "OOPS($n)" unless ref $n eq 'CODE';
my $o = 0;
until ($IF->($IS_ZERO->($n))->(1)->(undef)) {
$o++; $n = $PRED->($n);
}
$o;
}
sub print_number {
print
"If this is a number, its value is ",
convert_to_perl_number(shift()), "\n" ;
}
print_number($SUCC->($TWO));
# prints: "If this is a number, its value is 3"
The peculiar ($IF->($IS_ZERO->($n))->(1)->(undef)) expression tests a Lambda Calculus number with $IS_ZERO, then converts the Lambda Calculus-style boolean result into a Perl-style boolean value, either 1 or undef, for use in the until loop.
I still owe you explanations about how we implement (really implement) addition.
This is actually done using recursion. To implement recursion we need to be able to apply the lambda expression without looping forever. There is a nice trick to do that, which is surprisingly simple, but we won't have time for it.
Also, in order to be able to do this trick we need to be able to force delayed computation (to overcome Perl's semantics). We won't see that either, I'm afraid.
See http://perl.plover.com/lambda/tpj.html for the full detailed discussion.
The material of this lecture on the lambda calculus was provided by Mark Jason Dominus. His original article and other material is available here.