## Tree @02a5277 (Download .tar.gz)

## syntax-2.txt @02a5277 — raw · history · blame

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 | ```
Expression Syntax 2.0 ==================================================
Haldean Brown First draft: Feb 2017
Last updated: Feb 2017
Status: Rejected
Once I started playing with the syntax, I realized this was fun
but also a completely different language from Ubik. I sort of
got carried away here.
------------------------------------------------------------------------
Currently, Ubik expressions are LISP-like, with some special forms and
some redundant parentheses removed. In this way, it is similar to
Haskell; Haskell further removes redundant parentheses using significant
whitespace and operators like <$>, which are only possible when combined
with laziness.
Ubik's desire to be (a) context-free and (b) eager mean that these
options are not available to us. Also, Forth is cool. For that reason,
I'm switching syntax over to a postfix, stack-style syntax.
Ubik has a few expression forms:
* Atoms; things like:
"hello" -3.4 list:map
* Application, which is defined by adjacency with a function at the
head and an argument at the tail:
+ 8 3
* Lambda statements, currently of the form:
\x y z -> body
* Pattern blocks, like:
? expr {
. pattern => result
. pattern => result
}
* Condition blocks, of the form:
? {
. boolean-expr => result
. boolean-expr => result
}
* Scope blocks, such as:
{
: x = 8
: y = -3
! - x y
}
* ADT declarations, in the style of:
^ List a
= Cons a (List a)
= Nil
* Type application, which looks like:
String -> Number
Each of these needs a new form in our new stack syntax. During this
document, I'll refer back to a simple program for checking whether a
number is a prime:
: is-prime
^ Number -> Boolean
= \x -> {
: divisors = list:range 2 (- x 1)
! list:all (list:map (\y -> not (eq (% x y) 0)) divisors)
}
Atoms won't change; atoms are atomic. Application changes, though.
The first switch is to change everything to postfix, so (- 8 3) becomes
(8 3 -). Note that this mixes with partial application really nicely;
(3 -) becomes a partial function that subtracts 3 from another number.
: is-prime
^ Number -> Boolean
= \x -> {
: divisors = 2 (x 1 -) list:range
! divisors (\y -> x y % 0 eq not) list:map list:all
}
Look at all of the parens we got rid of! The challenge here is actually
parsing this. For that inner function, we would still like to get out a
parse tree that looks kind of like this:
not --- eq --- 0
\
+- % --- x
\
+- y
Because that's much more helpful for code generation. We can do this,
but it means we have to know the arity of all functions before we can
parse stack operations like this into a tree. That means that parsing
now happens in two steps: one gives us a sequence of stack operations
and the arities of all functions (from their type signatures), and once
we have the arities in hand we can take a second pass to generate the
trees.
Unfortunately, to get this syntax, we either have to drop existing type
inference for lambdas that are bound to a name, or we have to go full
Hindley-Milner: imagine the following:
: z = \x y -> 0 1 x y
It is impossible to know whether x is a monadic function and y is a
dyadic function, or vice-versa. Adding a type lets us know:
: z ^ (a -> b) -> (c -> d -> e) -> f
= \x y -> 0 1 x y
One day we'll have real type inferrence and that won't be an issue. For
now, c'est la vie. It's not so bad, too, because for lambdas that are
used at a call site:
my-list (\x -> x 2 *) list:map
We have the knowledge of the type of the other argument and the
function, which less us narrow down the type of x.
While we're here, let's call a spade a spade and say that the choice of
"a -> b" for type application was the misguided addition of the only
infix operator in Ubik, and fix that right up:
: is-prime
^ Boolean Number <
= \x -> {
: divisors = 2 (x 1 -) list:range
! divisors (\y -> x y % 0 eq not) list:map list:all
}
Hey look a new operator! I just picked that off the top of my head, but
I kind of like it. Let's transform some other type signatures:
: z ^ (a -> b) -> (c -> d -> e) -> f
: z ^ b a < e d < c < < f <
Okay so that's not the easiest to read at first glance, but that's a
noisy one. How about:
^ List String -> List (List Char)
^ String List Char List List <
Again, this is only possible with static knowledge of fixed arity for
types, but we've got that, so we're good.
So what about lambdas? They're still pretty infixy:
\y -> x y % 0 eq not
What if they looked like this?
[ x y % 0 eq not ] y <
That's kind of neat! Multiple arguments? You bet:
\x y z -> + x (+ y z)
[ x y z + + ] z y x < | all three of these are
[[ x y z + + ] z < ] y x < | equivalent (there's still
[[[ x y z + + ] z < ] y < ] x < | currying!)
So now we've got
: is-prime
^ Boolean Number <
= [{
: divisors = 2 x 1 - list:range
! divisors [ x y % 0 eq not ] y < list:map list:all
}] x <
While we're at it, let's make lambdas implicitly add a new scope:
: is-prime
^ Boolean Number <
= [ : divisors = 2 x 1 - list:range
! divisors [ x y % 0 eq not ] y < list:map list:all
] x <
So let's talk about condition and pattern blocks, then. Right now, we've got:
: fib
^ Number Number <
= [? {
. x 0 eq => 1
. x 1 eq => 1
. => x 1 - fib x 2 - fib +
}] x <
: maybe-string
^ String String Maybe <
= [? s {
. Just str => str
. Nothing => "nothing"
}] s <
These feel pretty infixy and Haskelly to me. We can do better.
: fib
^ Number Number <
= [{
1 [ x 0 eq ]*
1 [ x 1 eq ]*
x 1 - fib x 2 - fib + []*
} ? ] x <
: maybe-string
^ String String Maybe <
= [s {
str [ str Just ]*
"nothing" [ Nothing ]*
} ?] s <
[]* becomes a pattern constructor, {} becomes a block constructor, and ?
evaluates the block. ? is a clever thing; it might be the only thing in Ubik
without a fixed arity. When the block is a pattern block, ? is dyadic and
consumes the block and a value to match. When the block is a predicate block,
it consumes only the block.
Let's switch over to ADT declarations:
^ List a
= Cons a (List a)
= Nil
^ Expr
= Add Expr Expr
= Multiply Expr Expr
= Log Expr
= Value Number
Really the biggest change here is switching to postfix, but I'm also going to
make it a little more explicit that the heads are constructors as well, by
making users write it out like a type signature.
^ List a
= Cons a List < a <
= Nil
^ Expr
= Add Expr < Expr <
= Multiply Expr < Expr <
= Log Expr <
= Value Number <
As a sidenote, check out how much nicer nested-conses are:
(Cons "a" (Cons "hello" (Cons "world" Nil)))
"a" "hello" "world" Nil Cons Cons Cons
Still a little awkward to have all the Cons-es at the end, but at least it's
not paren soup.
I glossed over a thing earlier; now that just mentioning a function applies it
to things ahead of it in the stack, what do we do when we want to pass
functions to other functions?
mylist 0 + list:reduce
During stack evaluation, you would expect + to try to add 0 and mylist, which
is no good; we intended for it to be passed to reduce. For that reason, we add
a new function reference syntax which doesn't immediately apply the function:
mylist 0 @+ list:reduce
Note that while quoting is not always strictly required for disambiguation,
Ubik requires that all function references be quoted:
: double = 2 *
: double-list = @double list:map
Technically there is no ambiguity here, because the alternate interpretation
(that double should be applied to the stack) would result in a compile-time
error. To avoid the ridiculousness of C++'s SFINAE rule, we just say all
function references have to be unambiguous, and sometimes they are doubly so.
``` |