An Object-Oriented Language for the '20s

March 12, 2021

Object-oriented programming is out of fashion now, and it has been for a while. Rarely are new programming languages intentionally object-oriented. And there are good reasons for this: OO often requires a lot of boilerplate, it forces code into unnatural object hierarchies, and it encourages hidden mutable state.

But, if we made a new statically-typed OO language from scratch in 2021, something in the vein of Java or C#, taking everything we've learned from functional programming and a decade-plus of scathing OO criticism, could we fix this? Especially if we had no expectation of compatibility with legacy code?

First, a few obvious, non-negotiable things any new OO language should have:

No nulls
No unsafe casts
Optional named arguments
Generics
Immutability by default

Enough has been said about these topics that they don't need much justification today.

Here are some less obvious choices that I, personally, would want in a new OO language:

Class-based discoverability
Multiple inheritance
Minimal syntax
Higher-kinded types
No exceptions
Unified classes and typeclasses
Pattern matching without destructuring

The common factor in all of these is keeping the language purely OO (not "multiparadigm"), but merging functional features into its type system while implementing them in an OO way.

Class-based discoverability

Scala has been my favorite programming language for a while now. It gets pretty close to my ideal union of OO and functional concepts. But Scala's main problem is that it has a dozen ways to do everything: Java classes, algebraic datatypes, freestanding functions, implicits, and everything in between.

Like almost all alternative JVM languages, Scala escapes the Kingdom of Nouns and defies Java's requirement that everything belongs to a class. In fact, Scala defies several pieces of OO orthodoxy, in the name of practicality and functional programming.

But perhaps we've been unfair to the Kingdom of Nouns. Java's grammar doesn't forbid verbs, it just forces a consistent word order, where the noun always comes first. And this has one major advantage: IDE and documentation discoverability.

When everything is a member of a class, every nonlocal method call comes after a dot, and that means every nonlocal method call can be autocompleted with a dropdown. A Java API can be explored 100% in the IDE, using only dropdowns and popup documentation. We can provide extension methods (like C#) to handle new operations on existing types, so utility functions shouldn't be floating around as free functions or static methods on utility classes.

Ideally, this language would only have three kinds of top-level declarations: class, extension, and alias.

Multiple inheritance

Inheritance gets a bad rap. Admittedly, there are reasons for that: most of them time, when you extend a superclass, what you really want is either an interface or some kind of composition.

But, once we've decided that everything is an object and every function belongs to an object, every form of code reuse starts to look like inheritance: mixins, traits, interfaces with default methods, composition with proxy methods, etc. And, as OO languages develop new features, these tend toward multiple inheritance in all but name. Why not just accept it?

Inheritance problems are (arguably) a mental model issue, not a language feature issue. Interfaces are a subset of classes. Connecting interface to implementation is sometimes necessary, to enforce invariants that the type system can't describe. Is-a relationships are useful (see: collections libraries), but the line is often drawn in the wrong place. Fragile base classes are just an anti-pattern to consciously avoid, and you can make the same mistake with default methods on typeclasses or interfaces.

And multiple inheritance solves most of the overthinking and bikeshedding involved in Java class hierarchies. In a language with sane multiple inheritance, inheritance looks more like composition: just pick the functionality you want and mash it together. The hierarchy is a DAG, not a tree. And once you have multiple inheritance, you don't need interfaces, because there's no longer a distinction between an interface and an abstract class.

So how do you solve name collisions? Just make method and field names fully-qualified. If Foo extends Bar and Baz, both of which define a method qux, you can refer to either method on an instance foo via foo.Bar\qux or foo.Baz\qux. I'm using backslash as a namespace separator here because it's like / but not already used as a division symbol.

Safe casts can also resolve some name collisions. If Bar and Baz both extend Qux, and a method expects a Qux argument, foo can be passed as either foo as Bar or foo as Baz.

The one place where this doesn't work is the infamous diamond problem:

class A {
  foo() { print("A") }
}

class B extends A {
  foo() { print("B") }
}

class C extends A {
  foo() { print("C") }
}

class D extends B & C

Suppose you have an object with static type A. It's an instance of D, but the compiler doesn't know that. If you call foo(), what happens? You can't disambiguate at compile time, because you don't know that it's an instance of B or C.

The easiest approach would be to take a page from Kotlin and require D to override foo and disambiguate.

Minimal syntax

Scala uses reserved words to automate common patterns: sealed, case class, object. You can make a singleton object or a sum type without these keywords, just like you could in Java, but the compiler won't know what you're doing. Why should it be possible to create these second-class-citizen types?

This is what I think OO algebraic datatypes could look like, in my experimental Scala-but-better syntax:

class Address {
  new(
    this.state: String,
    this.city: String,
    this.street: String,
    this.zip: Int
  )
}

class List(T) {
  private new()

  head: Option(T)
  tail: Option(List(T))

  static class Nil extends List(Nothing) {
    static self = new()

    head = None.self
    tail = None.self
  }

  static(T) class Cons extends List(T) {
    private(List) new(private this.car: T, private this.cdr: List(T))

    head = Some.new(car)
    tail = Some.new(cdr)
  }

  static nil = Nil.self
  static(T) alias cons = Cons\new
}

This example contains a product type, a sum type, an abstract class with two abstract fields, and a singleton object. But it doesn't use the words abstract, sealed, case class, or object. A field or method with no body is abstract. A class with any abstract members, or a protected constructor, is abstract. A class with a private constructor and static inner subclasses, all of which also have private constructors, is a sealed sum type. A class with a static self property (which is a magic keyword, can't escape those entirely) is a singleton, and implies a private constructor.

The compiler should just know these patterns when it sees them being used. They don't need extra keywords. Error messages should be informative enough to point out where the user accidentally made a class abstract, or didn't make a class completely sealed.

Some notes about the syntax:

The constructor is named new. Objects are constructed with ClassName.new(...).
Constructor parameters starting with this. are just like Scala constructor parameters prefixed with val.
Capitalization is significant, and means the same thing it does in Haskell: if it starts with a capital letter, it's a type.
Generics just use parentheses, instead of <> or []. Because classes are distinguished by capitalization, this is unambiguous. Generic methods can use two sets of parentheses, since, unlike Scala, there is no currying.
Generics are reified (like C#, unlike Java). So static sometimes has to be qualified with type parameters, to know whether it's static to List or List(T). List(A).Nil and List(B).Nil are the same class, but List(A).Cons and List(B).Cons are different.

Higher-kinded types

These are type parameters that take type parameters of their own, like T(_). The only language I know of that supports both OO classes and HKT is Scala, but it's a weird case. Scala is Java with ML bolted onto the side. If you want to do functional things in Scala, you end up in an alternate functional universe, with typeclasses instead of objects. Why can't Monad just be a class you can extend?

Here's what I think that would look like:

class Functor(F(_), A) where This extends F(A) {
  map(B)(fn: (A) {B}): F(B)
}

class Monad(F(_), A) extends Functor(F, A) where This extends F(A) {
  static(F, A) unit(value: A): F(A)

  flatMap(B)(fn: (A) {F(B)}): F(B)

  map(B)(fn: (A) {B}): F(B) = flatMap((it) { unit(fn.call(it)) })
}

More syntax notes:

Function types are written (Args) {Return}, which mirrors the lambda syntax (args) {body}.
Static members can be abstract, and can be overridden. You can call a static method on a type parameter.

The real trick here is where This extends F(A). The where clause is borrowed from Ceylon's given clause, and defines constraints on type parameters. This is a magic type that always refers to the current type, even after inheritance. If a class T inherits Monad, the constraint where This extends F(A) will become where T extends F(A).

So, to define a list monad, you would write class List(A) extends Monad(List, A). An Either monad would look like class Either(A, B) extends Monad(Either(_, B), A)—note that the first argument of Monad is significant now, not just boilerplate.

This is a real monad type. But it's also completely OO: no free functions, no algebraic datatypes, no typeclasses.

No exceptions

Alternative approaches to error handling are common in modern languages. I think that a new OO language should still have typed, hierarchically-organized error objects. But exception throwing and stack unwinding is unnecessary.

Go and Rust provide two alternate approaches to error handling. Both treat errors as values, but Go uses multiple return values to return error states, while Rust uses a Result monad. Since we've already established that this language can easily define Monad as a class, an error monad seems ideal, possibly with syntax support like Rust's try!.

Unified classes and typeclasses

Classes and typeclasses are very similar. Most languages have one or the other, but not both. Scala has both, but its typeclasses are kind of a kludge, and they're a completely separate thing from ordinary classes and interfaces.

The Monad example shows that, with a few extra features, classes can do most of what typeclasses already do in functional languages.

But we can also add extension methods and typeclass-like standalone instances, which create a type algebra similar to Rust's or Haskell's.

// Add a sum method to Lists of Ints
extension Sum of List(Int) {
  sum(): Number = reduce((a, b) {a + b}, 0)
}

// Add a Comparable instance to Lists of Comparables
extension ComparableList(A extends Comparable(A)) of List(A) as Comparable(List(A)) {
  compare(that: List(A)): Comparison = // ... implementation omitted
}

extension adds new methods, and possibly new superclasses, to an existing type. An extension must have a name (a restriction inspired by PureScript), and that name can be used in fully-qualified method and field names, or in as casts. The name ensures that disambiguation is always possible.

Here's another example from functional programming: various Monoid implementations.

class Monoid(A) where This extends A {
  static(A) empty: A
  combine(x: A): A
}

extension Add of Int as Monoid(Int) {
  static empty = 0
  combine(x: Int): Int = this + x
}

extension Multiply of Int as Monoid(Int) {
  static empty = 1
  combine(x: Int): Int = this * x
}

extension Concat(A) of List(A) as Monoid(List(A)) {
  static empty = List.nil
  combine(suffix: List(A)) = foldRight(List\cons, suffix)

  // yes, I know, I didn't define List\foldRight
}

If you wanted to pass an Int to a function that takes a Monoid, you would have to specify its Monoid instance (1 as Add, 1 as Multiply). A more common use case would be a fold method on lists of monoids: listOfInts.List(Add)\fold() or listOfInts.List(Multiply)\fold().

Pattern matching without destructuring

Sum types based on inner classes almost work, but there's still one missing piece of the algebraic data type puzzle: what about pattern matching? Case classes define a pattern-matching destructuring shape, but normal constructors don't.

Pattern matching doesn't work in pure OO. You create sum types with inheritance, and only the subtypes themselves can distinguish their type; if you need an external function to match on a sum type, you either use the visitor pattern, or cheat with instanceof. This is clunky, so Scala instead provides match and case classes, and lets you do it the functional way.

// Visitor pattern (awkward)

class PrintVisitor extends OptionVisitor(String) {
  visitSome(value: String) {
    print("got some ${value}")
  }
  visitNone() {
    print("got nothing")
  }
}

Some.new("foo").visit(PrintVistor.new())

// Match (easy, but not OO)

Some.new("foo") match {
  case Some(value) => print("got some ${value}")
  case None => print("got nothing")
}

In theory, you could use extension methods to create something like a visitor pattern on an existing sum type that doesn't support it.

private class Visitor {
  visit()
}

private extension VisitOptionBase of Option as Visitor {
  visit() {}
}

private extension VisitSome of Some(String) {
  visit() {
    // Note that we're in the scope of a `Some(String)`,
    // so `value` refers to the `Some`'s public `value` field.
    print("got some ${value}")
  }
}

private extension VisitNone of None {
  visit() {
    print("got nothing")
  }
}

Some.new("foo").visit()

But this is still a lot of boilerplate. Let's create a new match syntax that does this under the hood, while remaining as simple as Scala's match expression:

Some.new("foo").match {
  as Some = print("got some ${value}")
  as None = print("got nothing")
}

Each as clause matches a type, then calls its body in the scope of a value of that type. Notice the magic here: value is in scope for the body of as Some, and it refers to Some\value. This... might be too confusing. But it's how inner classes already work in Java (layering the inner class's scope on top of the outer class's), and it gives us a match statement without needing an entire destructuring syntax.

We could also add guards (possibly using where or if) to make these match statements more powerful, getting closer to a full-featured ML-like match.

The problem

You might have noticed something "off" about the way I used extension methods in the last section. It would be nice to use them this way, and it syncs up with OO intuition about inheritance and polymorphism. But, if extension methods are resolved like typeclasses, they aren't virtual. The extension instance is picked based on an object's compile-time type, not its runtime type, so the match statement would always call VisitOptionBase\visit on an unknown Option, and it would do nothing.

I noticed this bug as I was writing the visit example, so I might as well work it out here.

Technically, we don't need virtual extension methods. The equivalence between match and extension methods is nice but mostly irrelevant. match can just use instanceof matching under the hood, and that gets the job done.

But non-virtual extension methods still seem like a footgun. If you're thinking in OO, you expect every method to be virtual. If you define a method on Vehicle, override it with an extension on Car, then call the method on a Vehicle parameter that happens to be a Car, you'd expect to get the overridden method, but you won't.

And virtual extension methods are nearly impossible. Unlike the diamond problem, overlapping overrides can't be easily detected; they could come from anywhere. And extension methods depend on import scope, but how does that apply to subtypes that aren't even in scope?

The actual solution is probably to forbid extension methods from overriding existing methods at all. This makes them non-virtual by definition, but removes the footgun. It makes the visit example above invalid, but...

Summary

Aside from that brief detour into virtual-extension-method madness, I really like what this language is becoming. It's a lot like Scala, but conceptually simpler, with only a small set of necessary features. Which makes it similar to Kotlin, but without the baggage of JVM compatibility.

The new features that this language adds, beyond what Scala and Kotlin already provide, are OO-compatible higher-kinded types, typeclass-like extension method resolution, and an alternative approach to error handling.

I realize there's not much need for a language like this—the space is crowded enough as it is—but it's fun to dream.

My tentative name for this language is "Lift". Because scala means ladder, and an elevator is a better ladder; and also because "lifting" is a common operation in monadic functional programming. I'm open to better suggestions.