Parentheses are Just Typechecking

June 22, 2021

Consider this Scheme expression, taken from the Rosetta Code example for the Mandelbrot set:

(boolean->integer
  (inside?
    (make-rectangular (+ x-offset (* pixel-size i))
                      (- y-offset (* pixel-size j))))

How would we translate this to a stack language, like Forth? Stack languages are a mirror image of Lisps, written in postfix rather than prefix notation. So a Forth-like version of this expression would be the same words in reverse, without the parentheses:

j pixel-size * y-offset -
i pixel-size * x-offset +
make-rectangular inside? boolean->integer

But why does this necessarily have to be written postfix? A Forth that evaluates right-to-left would look almost identical to a prefix notation language, just with no delimiters.

boolean->integer
  inside?
    make-rectangular + x-offset * pixel-size i
                     - y-offset * pixel-size j

It's not exactly more readable than the original. But if you reverse the words and evaluate it as if it's Forth, the expression is the same, with or without the parens.

Here's one way to look at this: for simple expressions, parentheses are a form of static typechecking. If you model a Lisp as a stack language that evaluates from right to left, each parenthesized block must push exactly one value onto the stack, and must not consume any values. In Forth terms, each ( ... ) enforces that its contents are of the type -- a.

This equivalence makes some assumptions. For one, it's assuming a Lisp-2 (even though Scheme is a Lisp-1), since a closure returned from a stack-language function would just be pushed onto the stack, so ((compose a b) c) wouldn't work.

I'm also only talking about simple expressions here, ignoring the other more significant differences between Lisps and Forths, like lists and variadic functions (which would still need parentheses), as well as variable definitions versus stack munging operators like dup or swap.

Even with these caveats, the equivalence is still fascinating. And it makes me aware of some gaps in the design space of stack languages and Lisps:

Why have I never seen...

...a stack language that evaluates right-to-left? Non-stack-based languages evaluate right-to-left within individual expressions, so this could be surprisingly intuitive. Yes, we often expect function arguments to evaluate left-to-right, but some mainstream languages (notably C) leave argument evaluation order undefined.
...a stack language that allows parentheses as a sanity check? -- a expressions are common, and allowing the programmer to wrap them in parentheses to enforce this type could make stack code much more readable. (I haven't actually used many stack languages, so this probably exists, I just haven't seen it.)
...a Lisp that can drop nested parentheses when the meaning is clear from context? This one takes some explanation.

Consider a Lisp-2 where functions are written in UpperCamelCase to distinguish them from values in lower-kebab-case. Then an expression like (Foo bar (Baz qux)) could be rewritten as (Foo bar Baz qux).

The simplest version of this assumes each argument goes to the first function to its left. That's already useful for saving parentheses, but it's not the reverse-Forth syntax I've been talking about.

If we don't make this simplifying assumption, then expressions like (A B C) are ambiguous. This would be (A (B) (C)) if B and C take 0 arguments and A takes 2, but (A (B (C))) if A and B both take 1 argument. Fixing this ambiguity requires a static type system. We could use a stack instead, but then it's not really a Lisp anymore.

(Or is it?)