Ports ==================================================================
Haldean Brown                                      First draft: Mar 2017
                                                  Last updated: Mar 2017

Status: Draft

To do: extract, zip-new (tell the zip which value triggered it)


Ports are how a Ubik program interacts with the world outside the
interpreter, and also form the highest-level control flow construct of
Ubik. Ports are pluggable interfaces that allow for values to "flow"
through a directed acyclic graph of computation. They can be backed by
hooks (these are called "external ports" or "eports") or created from
within the runtime (these are "virtual ports" or "vports"), and can be
sources, sinks, or pipes (both source and sink). Ports are also typed;
sources only produce values of a certain type, and sinks only support a
certain type. This allows for the safe composition of ports into larger
graphs of computation.

Speaking of composition, ports are connected and composed using a
special set of operators. Similar to bindings, tests, or type
definitions, port connections are defined using a special top-level
statement: a plug. Plugs begin with the plug operator >>, and then are
composed of ports, functions and port operators. The available port
operators are:

    >   sink
    |   map
    .   reduce
    /   zip
    %   left-zip

All of the operators are left-associative, infix (yes, infix!), and have
equal precedence. There are no parenthesized statements; only simple
pipelines are allowed (more complex branching will be discussed in a
moment). Infix operators are used here instead of prefix operators,
which is admittedly inconsistent with the rest of Ubik.  However, plugs
are special in that they are expected to form chains; you want it to be
easy to express things like:

    >> weather-data
        | get-temperature
        / dow-jones-index
        . find-correlation
        | correlation-to-string
        > stdout

The equivalent prefix operation would be:

    >> > (| (. (/ (| weather-data get-temperature) dow-jones-index)
                find-correlation) correlation-to-string) stdout

That is much harder to read. Ubik uses prefix notation for expressions
because it handles functions of multiple arities cleanly, it's a syntax
that is familiar to almost all programmers (even C-like languages use
it for function calls, if you ignore arithmetic), and it requires no
precedence rules. With plugs, we only have one arity, we have no
user-definable operators, we don't need multiple precedence levels, and
binary operators forming an arithmetic (an arithmetic of streams, sure,
but an arithmetic nonetheless) are also a familiar form to most
programmers. Also, it's just way prettier.
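
To make the "left-associative, equal precedence" rule concrete, here is
a small Python model (not Ubik's actual parser; the function names
to_prefix and render are made up for this illustration) that folds the
flat infix plug above into its nested prefix form:

```python
# Hypothetical model: fold a flat infix plug into nested prefix form.
PORT_OPERATORS = {">", "|", ".", "/", "%"}


def to_prefix(tokens):
    """Fold [operand, op, operand, op, ...] left-associatively."""
    if len(tokens) % 2 == 0:
        raise ValueError("expected: operand (operator operand)*")
    tree = tokens[0]
    for i in range(1, len(tokens), 2):
        op, operand = tokens[i], tokens[i + 1]
        if op not in PORT_OPERATORS:
            raise ValueError(f"unknown port operator: {op!r}")
        # Equal precedence: everything seen so far becomes the left operand.
        tree = (op, tree, operand)
    return tree


def render(tree):
    """Print a nested (op, left, right) tuple in prefix notation."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return f"({op} {render(left)} {render(right)})"


plug = ("weather-data | get-temperature / dow-jones-index "
        ". find-correlation | correlation-to-string > stdout").split()
prefix = render(to_prefix(plug))
print(prefix)
```

Because every operator has the same precedence, a single left fold is
enough; no precedence table or parenthesis handling is needed.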

Let's go over what each of the operators does!

Sink >  ----------------------------------------------------------------

Sink is by far the simplest of the operators: it takes a port of type T
on the left, a port of type T on the right, and it puts each value from
the port on the left into the port on the right.

The sink operator has no result, and a plug without a sink produces no
observable change to the system. Each plug therefore has exactly one
sink, as the rightmost operator.

The Ubik implementation of cat looks like:

    >> io:stdin > io:stdout

Map |  -----------------------------------------------------------------

Map takes a port of type A on the left and a function of type A -> B on
the right and produces a port of type B. It does this by taking each
element from the port, calling the function on it, and placing the
result of that function into the result port.

For example, here's a program that takes every line on standard in,
concatenates it to itself, and then prints it on standard out:

    : double-line ^ String -> String = \a -> concat a a
    >> io:stdin | double-line > io:stdout
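
As a rough model of what map does, here is a Python sketch that treats
a port as an iterable of values. This is an assumption made for
illustration only; real ports are fed by the runtime, and map_port is a
made-up name.

```python
from typing import Callable, Iterable, Iterator, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def map_port(source: Iterable[A], f: Callable[[A], B]) -> Iterator[B]:
    """Model of `source | f`: apply f to each value, in arrival order."""
    for value in source:
        yield f(value)


def double_line(a: str) -> str:
    # Python stand-in for: \a -> concat a a
    return a + a


stdin_lines = ["ab", "xyz"]  # stand-in for io:stdin
doubled = list(map_port(stdin_lines, double_line))
print(doubled)  # ['abab', 'xyzxyz']
```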

Reduce .  --------------------------------------------------------------

The reduce operator takes a port of type A, a function (let's call it f)
of type maybe:Maybe B -> A -> B, and produces a port of type B. When the
first value, v, is seen on the left port, the reduction behaves as if
this were executed:

    : res = f (maybe:No) v

and then it stashes that value away in an accumulator. Note that it does
not push it into the result port! For each subsequent value read off the
source port, it executes:

    : res = f (maybe:Yes accumulator) v

It then both stashes the result in the accumulator and pushes it into
the result port.

This pattern is handy for doing things that require maintaining some
kind of state between calls to the function. For example, keeping a
cumulative sum of a stream of numbers is easy:

    : sum-reducer ^ maybe:Maybe Number -> Number -> Number
        = \acc x -> ? acc {
            . maybe:Yes n => + n x
            . maybe:No    => x
        }
    >> numbers . sum-reducer > sums
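
The first-value rule is easy to miss, so here is a Python model of the
reduce semantics described above. It is a sketch under assumptions:
ports are modelled as iterables, maybe:No is modelled as None (so the
accumulated values themselves should not be None), and reduce_port is a
made-up name.

```python
def reduce_port(source, f):
    """Model of `source . f` following the rules in this section."""
    accumulator = None
    seen_first = False
    for value in source:
        if not seen_first:
            # First value: call f with maybe:No, stash, do NOT push.
            accumulator = f(None, value)
            seen_first = True
            continue
        # Later values: call f with maybe:Yes accumulator, stash and push.
        accumulator = f(accumulator, value)
        yield accumulator


def sum_reducer(acc, x):
    # Python stand-in for sum-reducer; None plays the role of maybe:No.
    return x if acc is None else acc + x


sums = list(reduce_port([1, 2, 3, 4], sum_reducer))
print(sums)  # [3, 6, 10]
```

Note that the first value (1) seeds the accumulator but never appears
on the result port, exactly as the text above describes.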

Or, to calculate a mean of a source:

    ^ RunningSum = RunningSum Number Number
    : running-sum
        ^ maybe:Maybe RunningSum -> weather:WeatherUpdate -> RunningSum
        = \mrs wu -> ? wu {
            . weather:Temperature t => ? mrs {
                . maybe:Yes (RunningSum sum n) => RunningSum (+ t sum) (+ n 1)
                . maybe:No => RunningSum t 1
            }
        }
    : find-mean ^ RunningSum -> Number
        = \rs -> ? rs {
            . RunningSum sum n => / sum n
        }
    >> weather:temperatures
       . running-sum
       | find-mean
       | humanize
       > io:stdout

Zip /  -----------------------------------------------------------------

Zip takes two ports of types A and B and produces a stream of type
port:Zip A B. These are 2-tuples that can be unpacked by maps, reducers
or whatever downstream of the zip. A new Zip is created for each value
on each port; zips do not coalesce updates that occur simultaneously on
the two input ports. That means that, if you have values A1 and A2 on
the left port, and B1 and B2 on your right port, your zip port will end
up with either:

    port:Zip A1 B1
    port:Zip A1 B2
    port:Zip A2 B2

or:

    port:Zip A1 B1
    port:Zip A2 B1
    port:Zip A2 B2

This is an important note! You cannot depend on the order in which
things are delivered to separate ports at any point in the Ubik system,
and this is an example of that. Both of those orderings for events are
valid.
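
Both orderings fall out of a very small model. The Python sketch below
represents deliveries as a list of (side, value) events and assumes
that a zip only starts producing once each side has delivered at least
one value; that assumption, and the name zip_port, are illustrative and
not taken from the Ubik runtime.

```python
def zip_port(events):
    """Model of `left / right`: one pair per incoming value, no coalescing."""
    left = right = None
    have_left = have_right = False
    for side, value in events:
        if side == "left":
            left, have_left = value, True
        elif side == "right":
            right, have_right = value, True
        else:
            raise ValueError(f"unknown side: {side!r}")
        if have_left and have_right:
            yield (left, right)


# B2 happens to be delivered before A2 ...
order_one = list(zip_port(
    [("left", "A1"), ("right", "B1"), ("right", "B2"), ("left", "A2")]))
# ... or A2 happens to be delivered before B2.
order_two = list(zip_port(
    [("left", "A1"), ("right", "B1"), ("left", "A2"), ("right", "B2")]))

print(order_one)  # [('A1', 'B1'), ('A1', 'B2'), ('A2', 'B2')]
print(order_two)  # [('A1', 'B1'), ('A2', 'B1'), ('A2', 'B2')]
```

The same four values produce different intermediate pairs depending
only on delivery order, which is why downstream code must not rely on
it.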

The reasoning behind this is simple: coalescing those updates into two
values (port:Zip A1 B1, port:Zip A2 B2) would require a determination of
intent, one in which the mechanism for specifying intent would be far
more complex than the solution that can be implemented by the user. If
you want the result to be updated only when one particular stream
changes, you're looking for...

Left-zip %  ------------------------------------------------------------

Left-zip is the same as zip, except a result is only pushed into the
resulting stream when the left stream is updated. This is provided to
associate a continuously-varying port with values from a discrete port,
the most common case of which is timestamping values. This is probably
easiest to explain with an example:

    >> tweets-about-ubik
       % time:clock
       | extract-time
       . difference-pairs
       | extract-delta
       > time-between-tweets-about-ubik

    : extract-time ^ port:Zip Tweet Time -> Time
        = \z -> ? z { . port:Zip _ time => time }

    ^ TimeReducer = TimeReducer Time TimeDelta

    : difference-pairs
        ^ maybe:Maybe TimeReducer -> Time -> TimeReducer
        = \mtr t -> ? mtr {
            . maybe:Yes (TimeReducer last _) => TimeReducer t (time-diff last t)
            . maybe:No => TimeReducer t (time-diff t t)
        }

    : extract-delta ^ TimeReducer -> TimeDelta
        = \tr -> ? tr { . TimeReducer _ delta => delta }

This calculates the amount of time between events occurring on the
tweets-about-ubik port, and saves that to a result port. A normal zip is
not appropriate here, because it would update every time the clock
ticked; instead, we use a left zip, and the resulting port only has
values that were triggered by values from the tweets-about-ubik port.
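
Here is a Python model of that pipeline, as a sketch only. Deliveries
are a list of (side, value) events, times are plain integers, time-diff
is assumed to be "later minus earlier", and a left value that arrives
before any right value is assumed to be dropped. None of these choices
come from the Ubik runtime, and the function names are made up.

```python
def left_zip_port(events):
    """Model of `left % right`: emit only when the left side updates."""
    latest_right, have_right = None, False
    for side, value in events:
        if side == "right":
            latest_right, have_right = value, True
        elif side == "left" and have_right:
            yield (value, latest_right)


def reduce_port(source, f):
    """Reduce as described earlier: the first result is stashed, not pushed."""
    accumulator, seen_first = None, False
    for value in source:
        if not seen_first:
            accumulator, seen_first = f(None, value), True
            continue
        accumulator = f(accumulator, value)
        yield accumulator


def difference_pairs(mtr, t):
    # Stand-in for difference-pairs; a (last, delta) tuple is a TimeReducer.
    if mtr is None:
        return (t, 0)
    last, _ = mtr
    return (t, t - last)


events = [
    ("right", 0), ("left", "tweet-1"),
    ("right", 1), ("right", 2), ("right", 5), ("left", "tweet-2"),
    ("right", 9), ("left", "tweet-3"),
]
times = [time for _tweet, time in left_zip_port(events)]          # extract-time
deltas = [d for _t, d in reduce_port(times, difference_pairs)]    # extract-delta
print(times)   # [0, 5, 9]
print(deltas)  # [5, 4]
```

Five clock ticks go in, but only three pairs come out, one per tweet;
with a plain zip every tick would have produced a pair as well.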

Special ports ----------------------------------------------------------

Sometimes you want to sink values into a sink without sourcing those
values from somewhere else. Tough shit! That doesn't work in Ubik. There
is, however, one special port that contains a single token at the start
of the interpreter:

    ~ my-module
    ` port
    >> port:once | (\x -> 7) | humanize > io:stdout
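
In the same modelling style, port:once can be pictured as a source that
yields exactly one token and then stays quiet. The Python sketch below
assumes the token's value is irrelevant and that humanize behaves like
string conversion; both are guesses made for illustration.

```python
def once():
    """Model of port:once: a single token, then nothing."""
    yield object()  # the token's value is assumed not to matter


def map_port(source, f):
    """Model of `source | f`."""
    for value in source:
        yield f(value)


def sink(source, write):
    """Model of `source > sink`: push every value into the sink."""
    for value in source:
        write(value)


output = []  # stand-in for io:stdout
sink(map_port(map_port(once(), lambda x: 7), str), output.append)
print(output)  # ['7']
```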