Recursive closures =====================================================
Haldean Brown First draft: Nov 2016
Last updated: Jan 2018
Status: In progress, SELFREF not yet implemented
------------------------------------------------------------------------
The closure transformation turns closures into partially-applied
functions, by turning values that have been closed over into arguments
to the function itself. The specific algorithm for doing this is
detailed greatly in the header closure.c, but the general gist is that:
It transforms this:
\x -> (\y -> + x y)
To this:
\x -> ((\x0 y -> + x0 y) x)
It transforms this:
\x -> (\y -> (\z -> x))
Into this:
\x -> ((\x0 y -> ((\x1 z -> x1) x0)) x)
It transforms this:
\x -> {
: y = + x 10
! \z -> y
}
Into this:
\x -> {
: y = + x 10
! (\y0 z -> y0) y
}
This closes these functions over the enclosing scope by making all
enclosing scope information an explicit argument to the function, and
then partially applying the function over that information.
This document describes how the closure transformation interacts with
recursion for values that are not bound to a global name. For example,
this snippet:
! {
: t = \x -> ? {
. eq x 0 => "ok\n"
. => t 0
}
! t 1
}
In this example, t needs access to its own value from within its
definition. However, since t is a local binding, it's never actually
assigned to a name that persists past its own scope. To handle this, we
transform t to take itself as a parameter; callers then call t by
passing it itself. The resulting function then looks like:
: t = \$t x -> ? {
. eq x 0 => "ok\n"
. => $t $t 0
}
This poses a problem for the callers of this function, though; now all
callers have to know that t needs to be passed itself as its first
argument, so the immediate value of the block would be transformed to:
! t t 1
But this is clunky; we don't want to have to transform every call site
that references t. Instead, we transform t to a partially-applied
function, with itself already applied (for brevity, B is used to
represent the body of the function t as given in the previous listing):
: t = (\$t x -> B) (\$t x -> B)
Note that this can be very succinctly represented in Ubik bytecode,
without requiring a double-definition of the function, as the function
must be its own value:
000 VALUE &B
001 APPLY 000 000
This seems nice in practice, but there's one more sticky bit: the
original closure transformation. Let's take a slightly more complicated
example:
: top = \x -> {
: inner = \y -> inner (+ x y)
! inner
}
Never mind that this is infinitely recursive: it's a nice simple example
for our syntactic transformations. The closure transformation takes
inner from that given in the previous listing to this:
: inner = (\x y -> inner (+ x y)) x
Which works fine. The recursion transformation would then take it to:
: inner =
(\inner x y -> inner inner (+ x y))
(\inner x y -> inner inner (+ x y))
x
This looks okay, but doesn't properly work, because the reference to
inner in the body of the function doesn't pass any of the closed-over
parameters. What we want is something more like:
: top = \x -> {
: inner_ = \x inner y -> inner (+ x y)
: inner_with_x_ = inner_ x
: inner_with_inner_ = inner_with_x_ inner_with_x_
}
But that doesn't work either. Instead, we introduce a secret atom type,
SELFREF, which is a reference to the innermost enclosing function. So
we end up rewriting inner to be:
: inner = (\x y -> {
: inner = SELFREF x
! inner (+ x y)
}) x
Now the "inner" on the inside doesn't actually reference a named binding
on the outside. When this is compiled down to a graph, SELFREF will be
transformed into a VALUE node that points at the value currently being
compiled.
With more-heavily-nested recursion, this ends up transforming:
: outer = \a -> {
: inner = \b -> outer (+ (inner a) b)
! inner a
}
Into:
: outer = \a -> {
: outer = SELFREF
: inner = (\a outer b -> {
: inner = SELFREF a outer
! outer (+ (inner a) b)
}
}
Which actually ends up being pretty pleasantly readable.