An Object-Oriented Language for the '20s
Object-oriented programming is out of fashion now, and it has been for a while. Rarely are new programming languages intentionally object-oriented. And there are good reasons for this: OO often requires a lot of boilerplate, it forces code into unnatural object hierarchies, and it encourages hidden mutable state.
But, if we made a new statically-typed OO language from scratch in 2021, something in the vein of Java or C#, taking everything we've learned from functional programming and a decade-plus of scathing OO criticism, could we fix this? Especially if we had no expectation of compatibility with legacy code?
First, a few obvious, non-negotiable things any new OO language should have:
- No nulls
- No unsafe casts
- Optional named arguments
- Generics
- Immutability by default
Enough has been said about these topics that they don't need much justification today.
Here are some less obvious choices that I, personally, would want in a new OO language:
- Class-based discoverability
- Multiple inheritance
- Minimal syntax
- Higher-kinded types
- No exceptions
- Unified classes and typeclasses
- Pattern matching without destructuring
The common factor in all of these is keeping the language purely OO (not "multiparadigm"), but merging functional features into its type system while implementing them in an OO way.
Class-based discoverability
Scala has been my favorite programming language for a while now. It gets pretty close to my ideal union of OO and functional concepts. But Scala's main problem is that it has a dozen ways to do everything: Java classes, algebraic datatypes, freestanding functions, implicits, and everything in between.
Like almost all alternative JVM languages, Scala escapes the Kingdom of Nouns and defies Java's requirement that everything belongs to a class. In fact, Scala defies several pieces of OO orthodoxy, in the name of practicality and functional programming.
But perhaps we've been unfair to the Kingdom of Nouns. Java's grammar doesn't forbid verbs, it just forces a consistent word order, where the noun always comes first. And this has one major advantage: IDE and documentation discoverability.
When everything is a member of a class, every nonlocal method call comes after a dot, and that means every nonlocal method call can be autocompleted with a dropdown. A Java API can be explored 100% in the IDE, using only dropdowns and popup documentation. We can provide extension methods (like C#) to handle new operations on existing types, so utility functions shouldn't be floating around as free functions or static methods on utility classes.
Ideally, this language would only have three kinds of top-level declarations:
class
, extension
, and alias
.
Multiple inheritance
Inheritance gets a bad rap. Admittedly, there are reasons for that: most of them time, when you extend a superclass, what you really want is either an interface or some kind of composition.
But, once we've decided that everything is an object and every function belongs to an object, every form of code reuse starts to look like inheritance: mixins, traits, interfaces with default methods, composition with proxy methods, etc. And, as OO languages develop new features, these tend toward multiple inheritance in all but name. Why not just accept it?
Inheritance problems are (arguably) a mental model issue, not a language feature issue. Interfaces are a subset of classes. Connecting interface to implementation is sometimes necessary, to enforce invariants that the type system can't describe. Is-a relationships are useful (see: collections libraries), but the line is often drawn in the wrong place. Fragile base classes are just an anti-pattern to consciously avoid, and you can make the same mistake with default methods on typeclasses or interfaces.
And multiple inheritance solves most of the overthinking and bikeshedding involved in Java class hierarchies. In a language with sane multiple inheritance, inheritance looks more like composition: just pick the functionality you want and mash it together. The hierarchy is a DAG, not a tree. And once you have multiple inheritance, you don't need interfaces, because there's no longer a distinction between an interface and an abstract class.
So how do you solve name collisions? Just make method and field names
fully-qualified. If Foo
extends Bar
and Baz
, both of which define a method
qux
, you can refer to either method on an instance foo
via foo.Bar\qux
or
foo.Baz\qux
. I'm using backslash as a namespace separator here because it's
like /
but not already used as a division symbol.
Safe casts can also resolve some name collisions. If Bar
and Baz
both extend
Qux
, and a method expects a Qux
argument, foo
can be passed as either foo as Bar
or foo as Baz
.
The one place where this doesn't work is the infamous diamond problem:
class A {
foo() { print("A") }
}
class B extends A {
foo() { print("B") }
}
class C extends A {
foo() { print("C") }
}
class D extends B & C
Suppose you have an object with static type A
. It's an instance of D
, but
the compiler doesn't know that. If you call foo()
, what happens? You can't
disambiguate at compile time, because you don't know that it's an instance of
B
or C
.
The easiest approach would be to take a page from
Kotlin and require D
to override foo
and
disambiguate.
Minimal syntax
Scala uses reserved words to automate common patterns: sealed
, case class
,
object
. You can make a singleton object or a sum type without these
keywords, just like you could in Java, but the compiler won't know what you're
doing. Why should it be possible to create these second-class-citizen types?
This is what I think OO algebraic datatypes could look like, in my experimental Scala-but-better syntax:
class Address {
new(
this.state: String,
this.city: String,
this.street: String,
this.zip: Int
)
}
class List(T) {
private new()
head: Option(T)
tail: Option(List(T))
static class Nil extends List(Nothing) {
static self = new()
head = None.self
tail = None.self
}
static(T) class Cons extends List(T) {
private(List) new(private this.car: T, private this.cdr: List(T))
head = Some.new(car)
tail = Some.new(cdr)
}
static nil = Nil.self
static(T) alias cons = Cons\new
}
This example contains a product type, a sum type, an abstract class with two
abstract fields, and a singleton object. But it doesn't use the words
abstract
, sealed
, case class
, or object
. A field or method with no body
is abstract. A class with any abstract members, or a protected
constructor, is
abstract. A class with a private constructor and static inner subclasses, all of
which also have private constructors, is a sealed sum type. A class with
a static self
property (which is a magic keyword, can't escape those entirely)
is a singleton, and implies a private constructor.
The compiler should just know these patterns when it sees them being used. They don't need extra keywords. Error messages should be informative enough to point out where the user accidentally made a class abstract, or didn't make a class completely sealed.
Some notes about the syntax:
- The constructor is named
new
. Objects are constructed withClassName.new(...)
. - Constructor parameters starting with
this.
are just like Scala constructor parameters prefixed withval
. - Capitalization is significant, and means the same thing it does in Haskell: if it starts with a capital letter, it's a type.
- Generics just use parentheses, instead of
<>
or[]
. Because classes are distinguished by capitalization, this is unambiguous. Generic methods can use two sets of parentheses, since, unlike Scala, there is no currying. - Generics are reified (like C#, unlike Java). So
static
sometimes has to be qualified with type parameters, to know whether it's static toList
orList(T)
.List(A).Nil
andList(B).Nil
are the same class, butList(A).Cons
andList(B).Cons
are different.
Higher-kinded types
These are type parameters that take type parameters of their own, like T(_)
.
The only language I know of that supports both OO classes and HKT is Scala, but
it's a weird case. Scala is Java with ML bolted onto the side. If you want to do
functional things in Scala, you end up in an alternate functional universe, with
typeclasses instead of objects. Why can't Monad
just be a class you can
extend?
Here's what I think that would look like:
class Functor(F(_), A) where This extends F(A) {
map(B)(fn: (A) {B}): F(B)
}
class Monad(F(_), A) extends Functor(F, A) where This extends F(A) {
static(F, A) unit(value: A): F(A)
flatMap(B)(fn: (A) {F(B)}): F(B)
map(B)(fn: (A) {B}): F(B) = flatMap((it) { unit(fn.call(it)) })
}
More syntax notes:
- Function types are written
(Args) {Return}
, which mirrors the lambda syntax(args) {body}
. - Static members can be abstract, and can be overridden. You can call a static method on a type parameter.
The real trick here is where This extends F(A)
. The where
clause is borrowed
from Ceylon's given
clause, and defines constraints on type
parameters. This
is a magic type that always refers to the current type, even
after inheritance. If a class T
inherits Monad
, the constraint where This extends F(A)
will become where T extends F(A)
.
So, to define a list monad, you would write class List(A) extends Monad(List, A)
. An Either
monad would look like class Either(A, B) extends Monad(Either(_, B), A)
—note that the first argument of Monad
is significant
now, not just boilerplate.
This is a real monad type. But it's also completely OO: no free functions, no algebraic datatypes, no typeclasses.
No exceptions
Alternative approaches to error handling are common in modern languages. I think that a new OO language should still have typed, hierarchically-organized error objects. But exception throwing and stack unwinding is unnecessary.
Go and Rust provide two alternate approaches to error handling. Both treat
errors as values, but Go uses multiple return values to return error states,
while Rust uses a Result
monad. Since we've already established that this
language can easily define Monad
as a class, an error monad seems ideal,
possibly with syntax support like Rust's try!
.
Unified classes and typeclasses
Classes and typeclasses are very similar. Most languages have one or the other, but not both. Scala has both, but its typeclasses are kind of a kludge, and they're a completely separate thing from ordinary classes and interfaces.
The Monad
example shows that, with a few extra features, classes can do most
of what typeclasses already do in functional languages.
But we can also add extension methods and typeclass-like standalone instances, which create a type algebra similar to Rust's or Haskell's.
// Add a sum method to Lists of Ints
extension Sum of List(Int) {
sum(): Number = reduce((a, b) {a + b}, 0)
}
// Add a Comparable instance to Lists of Comparables
extension ComparableList(A extends Comparable(A)) of List(A) as Comparable(List(A)) {
compare(that: List(A)): Comparison = // ... implementation omitted
}
extension
adds new methods, and possibly new superclasses, to an existing
type. An extension must have a name (a restriction inspired by
PureScript), and that name can be used in
fully-qualified method and field names, or in as
casts. The name ensures that
disambiguation is always possible.
Here's another example from functional programming: various Monoid implementations.
class Monoid(A) where This extends A {
static(A) empty: A
combine(x: A): A
}
extension Add of Int as Monoid(Int) {
static empty = 0
combine(x: Int): Int = this + x
}
extension Multiply of Int as Monoid(Int) {
static empty = 1
combine(x: Int): Int = this * x
}
extension Concat(A) of List(A) as Monoid(List(A)) {
static empty = List.nil
combine(suffix: List(A)) = foldRight(List\cons, suffix)
// yes, I know, I didn't define List\foldRight
}
If you wanted to pass an Int
to a function that takes a Monoid
, you would
have to specify its Monoid
instance (1 as Add
, 1 as Multiply
). A more
common use case would be a fold
method on lists of monoids:
listOfInts.List(Add)\fold()
or listOfInts.List(Multiply)\fold()
.
Pattern matching without destructuring
Sum types based on inner classes almost work, but there's still one missing piece of the algebraic data type puzzle: what about pattern matching? Case classes define a pattern-matching destructuring shape, but normal constructors don't.
Pattern matching doesn't work in pure OO. You create sum types with inheritance,
and only the subtypes themselves can distinguish their type; if you need an
external function to match on a sum type, you either use the visitor pattern, or
cheat with instanceof
. This is clunky, so Scala instead provides match
and
case classes, and lets you do it the functional way.
// Visitor pattern (awkward)
class PrintVisitor extends OptionVisitor(String) {
visitSome(value: String) {
print("got some ${value}")
}
visitNone() {
print("got nothing")
}
}
Some.new("foo").visit(PrintVistor.new())
// Match (easy, but not OO)
Some.new("foo") match {
case Some(value) => print("got some ${value}")
case None => print("got nothing")
}
In theory, you could use extension methods to create something like a visitor pattern on an existing sum type that doesn't support it.
private class Visitor {
visit()
}
private extension VisitOptionBase of Option as Visitor {
visit() {}
}
private extension VisitSome of Some(String) {
visit() {
// Note that we're in the scope of a `Some(String)`,
// so `value` refers to the `Some`'s public `value` field.
print("got some ${value}")
}
}
private extension VisitNone of None {
visit() {
print("got nothing")
}
}
Some.new("foo").visit()
But this is still a lot of boilerplate. Let's create a new match
syntax that
does this under the hood, while remaining as simple as Scala's match
expression:
Some.new("foo").match {
as Some = print("got some ${value}")
as None = print("got nothing")
}
Each as
clause matches a type, then calls its body in the scope of a value of
that type. Notice the magic here: value
is in scope for the body of as Some
, and it refers to Some\value
. This... might be too confusing. But it's
how inner classes already work in Java (layering the inner class's scope on top
of the outer class's), and it gives us a match
statement without needing an
entire destructuring syntax.
We could also add guards (possibly using where
or if
) to make these match
statements more powerful, getting closer to a full-featured ML-like match
.
The problem
You might have noticed something "off" about the way I used extension methods in
the last section. It would be nice to use them this way, and it syncs up with OO
intuition about inheritance and polymorphism. But, if extension methods are
resolved like typeclasses, they aren't virtual. The extension instance is
picked based on an object's compile-time type, not its runtime type, so the
match
statement would always call VisitOptionBase\visit
on an unknown
Option
, and it would do nothing.
I noticed this bug as I was writing the visit
example, so I might as well work
it out here.
Technically, we don't need virtual extension methods. The equivalence between
match
and extension methods is nice but mostly irrelevant. match
can just
use instanceof
matching under the hood, and that gets the job done.
But non-virtual extension methods still seem like a footgun. If you're thinking
in OO, you expect every method to be virtual. If you define a method on
Vehicle
, override it with an extension on Car
, then call the method on
a Vehicle
parameter that happens to be a Car
, you'd expect to get the
overridden method, but you won't.
And virtual extension methods are nearly impossible. Unlike the diamond problem, overlapping overrides can't be easily detected; they could come from anywhere. And extension methods depend on import scope, but how does that apply to subtypes that aren't even in scope?
The actual solution is probably to forbid extension methods from overriding
existing methods at all. This makes them non-virtual by definition, but removes
the footgun. It makes the visit
example above invalid, but...
Summary
Aside from that brief detour into virtual-extension-method madness, I really like what this language is becoming. It's a lot like Scala, but conceptually simpler, with only a small set of necessary features. Which makes it similar to Kotlin, but without the baggage of JVM compatibility.
The new features that this language adds, beyond what Scala and Kotlin already provide, are OO-compatible higher-kinded types, typeclass-like extension method resolution, and an alternative approach to error handling.
I realize there's not much need for a language like this—the space is crowded enough as it is—but it's fun to dream.
My tentative name for this language is "Lift". Because scala means ladder, and an elevator is a better ladder; and also because "lifting" is a common operation in monadic functional programming. I'm open to better suggestions.