Title page of The Social Contract, by Jean-Jacques Rousseau. Public domain.

Type systems1 are the wellspring of some of the most interesting work in computer science, and practitioners like myself deal with them every day. This post collects some thoughts I’ve had about how type systems interact with the communities that use them.

Types as communication

Let’s say you write a simple Python function that concatenates two strings:

def concat_strings(a, b):
    return "{}{}".format(a, b)

For example, concat_strings("cat", "dog") returns the string "catdog". But format in Python works on numbers as well, so concat_strings(3, 5) returns the string "35". You really care that your concat_strings function is only used to concatenate strings, so you decide to add a comment explaining that you shouldn’t do that. In Python, that’s usually done through docstrings:

def concat_strings(a, b):
    """Concatenates two strings and returns the result."""
    return "{}{}".format(a, b)

This may be enough if you’re working by yourself, but many people work in teams. Let’s say you come back to this code a few months later and realize a teammate has accidentally used it to add two integers. So you decide to add a little bit of enforcement, using a Python function called isinstance to produce an error if either a or b isn’t a string:

def concat_strings(a, b):
    """Concatenates two strings and returns the result.

    Raises an exception if the arguments aren't strings."""
    if not isinstance(a, str) or not isinstance(b, str):
        raise ValueError("invalid arguments")

    return "{}{}".format(a, b)

This is a really big jump. With this change, concat_strings now produces an error in cases where it used to work. This is a form of what is often called dynamic analysis: every time the function is called at runtime, it checks that its inputs look correct.

Dynamic analysis works quite well! Many high-quality codebases rely on it, in combination with tests, in various ways. But soon you start noticing patterns (the isinstance one is really common!), so you try to automate those checks a little. You turn those patterns into a system of annotations, for example:

def concat_strings(a: str, b: str) -> str:
    """Concatenates two strings and returns the result."""
    return "{}{}".format(a, b)

These annotations are similar to docstrings: they are communications from the author of the code to the other people working on it. Anyone looking at the definition of concat_strings can instantly grasp that the function is meant to take in two strings and output another. But these annotations are a special kind of communication: one that is understandable by machines just as well. You can build tools on top of them.

Here are some examples of tools built on top of these annotations that I’ve seen in practice2:

  1. Do nothing at all. The communication would purely be to humans, just like any other documentation.
  2. Insert dynamic checks that, in case of a failure, log the error somewhere and proceed as normal.
  3. Insert dynamic checks that cause the program to fail at runtime. This is sometimes called a contract system.
  4. Check ahead of time, without running the program, that concat_strings is always called with string inputs. Such tools are called type checkers and are examples of what are called static analysis tools.

The most immediate effect is that types form a kind of documentation that is kept up-to-date. I can’t overemphasize how important this is—in my experience working on large projects, documentation falls out of date all the time. Types don’t.

Options 1–3 are possible in languages like Python, where the runtime representation of data doesn’t need to be known in advance. Other languages, like C++ and Rust, use these annotations to decide the layout of data in memory, so they must use a static type checker.

The politics of labels

Calling a type checker an example of a static analysis tool might raise some eyebrows. The term is usually reserved for a kind of aftermarket tool (often proprietary, though some are free) that tells you about things like missing null checks.

Gender Binary is a Lie, by Tjook. Used under CC BY-NC-ND 2.0.

I’ve been thinking a lot about labels lately. I identify with the labels queer3, trans and nonbinary. They describe who I am and mark the communities I belong to. Another label I sometimes use, but don’t necessarily identify with, is bisexual: the term describes me well, but even though the bisexual community has many lovely people, I don’t personally feel a strong affinity towards it.

Labels like static analysis are similar. They are both technical descriptions and community markers. There’s a static analysis community and a separate type systems community, which is sometimes considered part of the formal verification community. All of these communities work on similar kinds of static analysis tools, but historically they’ve had different focuses and haven’t had much overlap.

When separate communities work on things that technically look similar, there’s often a lot of untapped potential in synthesizing those approaches.

Carrots and sticks

Rust’s advances over C++ involve taking many of the ideas pioneered in static analysis tools—null checks, use-after-free and so on—and moving them into the type system4. At a technical level this doesn’t sound like that big a deal—why write software in Rust if you can just use C++ with a static analysis tool? But, for the communities using Rust, this move has profound social effects.
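As a concrete (if minimal) sketch of what moving a check into the type system means: a use-after-free that a C++ static analyzer might flag after the fact becomes an ordinary compile error in Rust, because ownership of the allocation moves:

```rust
fn main() {
    let s = String::from("hello");
    let t = s; // ownership of the heap buffer moves from `s` to `t`

    // Using `s` here would be a use-after-move, the type-system
    // cousin of a use-after-free. The compiler rejects it outright:
    // println!("{}", s); // error[E0382]: borrow of moved value: `s`

    assert_eq!(t, "hello"); // `t` is now the sole owner and works fine
}
```

The check runs on every single build, not just when someone remembers to run an extra tool.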

A typical static analysis tool is run separately, usually as part of continuous integration rather than a standard developer build. It flags issues that are worth looking at, some of which are real and others false positives. Usually, you can disable a false positive with a simple flag or comment. These tools have to walk a fine line: too many false positives and users will end up ignoring the tool; too many false negatives and it misses bugs.

Rust programmer fighting the borrow checker, by TheZoq2 on Reddit. Image originally from Dark Souls III.

A static type system is terribly annoying, especially when you’re learning it. It’s in your face. You have to deal with it every time you compile your code. It has false positives, but they must be dealt with. They can’t be silenced through a mere comment; you must either:

  • rewrite your code to eliminate those false positives
  • bypass the compiler’s checks using unsafe code, putting a big 🚨 flashing 🚨 warning 🚨 sign on it5.

But what makes a good type system have a learning curve is also what makes it useful once you’re over the hump. I get a real sense of visceral, psychological safety while writing Rust that I don’t get with C, C++ or even Python. Once you understand the peculiar ways in which a type system talks, a type check starts feeling like a conversation with a learned colleague—one who’s got your back.

A good type system becomes even more pertinent once you start using other people’s code. You may not be sure whether your dependencies use a particular static analysis tool, but you do know that everyone’s had to fight the borrow checker. A high-quality type system is a shared, community-wide investment in a way of doing things, and one that is nearly impossible to retrofit into an existing ecosystem.

Rust bet its entire existence on producing a type system that is good enough, and error messages that are clear enough, that developers were willing to accept all the pain of learning it. I think that bet’s paid off.

Liminal code

The relationship between type systems and communities isn’t just one-way: after all, developers decide how type systems are used in practice. Nowhere is this more apparent than in languages like Rust and Haskell, whose communities use type-level proofs quite extensively.

What do I mean by that? Well, consider Rust’s String type for example. It is a wrapper around a sequence of bytes, with the added, purely semantic constraint that the bytes can be interpreted as valid UTF-8. The runtime representations of String and Vec<u8> are identical.

Landscape of Thorns, from a proposal to mark nuclear waste disposal sites. Concept by Mike Brill, drawing by Safdar Abidi.

Every String is a valid Vec<u8> so a “downgrade” is free, but not every Vec<u8> is a valid String. If you’re converting a Vec<u8> to a String, you must go through some sort of validation step. I like to think of this as liminal code: it lies at the edge of the Vec<u8> world, guarding access to the realm of Strings.

In my experience, liminal code tends to be a very common source of bugs. People like myself spend a lot of our time building tools and systems to ensure that liminal code is correct. The whole point of fuzzing, for example, is to find bugs in such code.

What type-level proofs do is usually not eliminate liminal code6 so much as concentrate it in one place.

  • Without them, as in a language like C with char *, it is impossible to tell at a glance if a char * is supposed to be valid UTF-8. It is common for redundant, dynamic checks to be scattered throughout a codebase as bugs are discovered.

  • In Rust, every path from Vec<u8> to String exists in one file, where it can be carefully audited. All other Rust code can trust that a String is valid UTF-8.

    There’s also an escape hatch that converts a Vec<u8> to a String without validating it, but—as you might expect—it requires unsafe.
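Both paths are visible in the standard library: String::from_utf8 is the liminal, validating conversion, and the unsafe String::from_utf8_unchecked is the escape hatch. A minimal sketch:

```rust
fn main() {
    // The liminal, validating path: String::from_utf8 checks every byte.
    let crab = vec![0xf0, 0x9f, 0xa6, 0x80]; // the UTF-8 encoding of 🦀
    assert_eq!(String::from_utf8(crab).unwrap(), "🦀");

    // Invalid bytes are rejected here, at the boundary. The error type
    // even hands the original Vec<u8> back to the caller.
    let bad = vec![0x80, 0xff]; // 0x80 is not a valid leading byte
    assert!(String::from_utf8(bad).is_err());

    // The escape hatch: no validation, so the caller (not the standard
    // library) is now the one vouching for UTF-8 validity.
    let bytes = vec![b'h', b'i'];
    let s = unsafe { String::from_utf8_unchecked(bytes) };
    assert_eq!(s, "hi");
}
```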

This method of scaling up a difficult, careful, time-consuming local analysis to the rest of a system is something type systems excel at. And there’s nothing preventing you from using type-level proofs in your own Rust code! For example, you may want to model a language identifier as a wrapper over a String7, with the restriction that it is non-empty and doesn’t start with a digit. The code to validate a string would live in one place and be tested and fuzzed. The rest of your system can trust that identifiers are valid.
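A sketch of what that hypothetical identifier wrapper might look like (the type and its constructor are illustrative names, not a real library):

```rust
/// A hypothetical identifier newtype: the inner String is private, so the
/// only way to obtain an Identifier is through the validating constructor.
pub struct Identifier(String);

impl Identifier {
    /// The liminal code: all validation lives here and nowhere else.
    pub fn new(s: String) -> Result<Identifier, &'static str> {
        match s.chars().next() {
            None => Err("identifier must be non-empty"),
            Some(c) if c.is_ascii_digit() => {
                Err("identifier must not start with a digit")
            }
            Some(_) => Ok(Identifier(s)),
        }
    }

    /// Read-only access; callers can't mutate the inner String into invalidity.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

fn main() {
    assert!(Identifier::new("x86_64".to_owned()).is_ok());
    assert!(Identifier::new(String::new()).is_err());
    assert!(Identifier::new("9lives".to_owned()).is_err());
}
```

Because the field is private, every Identifier in the system provably passed through `new`.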

You could also use them for many other sorts of proofs, such as showing that a set of state transitions is valid or that a cryptographic library is used correctly.
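For state transitions, one common shape is the so-called typestate pattern: each state is its own type, and the only way to reach the next state is through a method on the previous one, so an illegal transition simply has no method to call. A toy sketch (all names here are illustrative):

```rust
// Each state is a distinct type; the only constructors are the legal
// transitions, so an invalid sequence of states cannot even compile.
struct Disconnected;
struct Connected;
struct Authenticated;

impl Disconnected {
    // Consumes the old state, so you can't accidentally reuse it.
    fn connect(self) -> Connected {
        Connected
    }
}

impl Connected {
    fn authenticate(self) -> Authenticated {
        Authenticated
    }
}

impl Authenticated {
    // Queries are only possible once authenticated.
    fn query(&self) -> &'static str {
        "ok"
    }
}

fn main() {
    let conn = Disconnected.connect().authenticate();
    assert_eq!(conn.query(), "ok");
    // `Disconnected.query()` or `Connected.query()` would not compile:
    // the method exists only on Authenticated.
}
```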

You may have noticed that the arguments for type-level proofs look a lot like the ones for static type systems. Yet, in practice they’re more common in some environments—particularly functional programming languages and Rust—than in others. I’m not exactly sure why, but if I may hazard a guess for Rust: its affine typing, clear demarcation of unsafe code, and trait system all contribute to type-level proofs being easy to express and hard to bypass. This attracts the sorts of people who want to use them in their own code, and thus is created a virtuous cycle between technology and culture.

Type systems shape communities

Scene from the science fiction movie Arrival, which tackles ideas of linguistic relativity. Image copyright Paramount.

The principle of linguistic relativity, sometimes called the Sapir-Whorf hypothesis, claims that the structure of a language affects how we think. Linguists have gone back and forth over the last few decades on its validity, with most now rejecting the original, strong formulation but accepting a more balanced view.

There’s a case to be made for its relevance to programming languages as well; I’ve tried to put together some ways in which type systems influence the way developers use them. I think this is a useful lens to look at type systems through, and I hope you agree.

The idea of writing this post arose from conversations with Inanna Malick. Thanks to Inanna, kt, Manish Goregaokar, Jane Lusby and Michael Gattozzi for reviewing drafts of this post. Any errors in it are my own.

  1. It’s useful to draw a distinction between type theory and type systems. Type theory originated in mathematics, starting with the early 20th century’s foundational crisis and Bertrand Russell’s attempt to solve a paradox that occurs in naive set theories.

    While type theory successfully addresses Russell’s paradox and other issues, most practicing mathematicians choose to use axiomatic set theories like ZFC instead. Type theory has lived on in formal logic and computer science, and has been applied in various ways through type systems. There’s been some recent foundational mathematics work within the field of homotopy type theory. ↩︎

  2. There are all sorts of in-between things possible as well, such as using dynamic analysis in some parts of your program and static analysis in others. Gradual typing is a rich, widely used set of ideas in this space. ↩︎

  3. Within the greater LGBTQ+ community, queer is a bit special because it’s a label that signals a rejection of labels, and an opposition to the norms of straight and gay society. This excerpt from the Xenofeminist Manifesto captures a kind of queerness:

    When the possibility of transition became real and known, the tomb under Nature’s shrine cracked, and new histories–bristling with futures–escaped the old order of ‘sex’. The disciplinary grid of gender is in no small part an attempt to mend that shattered foundation, and tame the lives that escaped it. The time has now come to tear down this shrine entirely, and not bow down before it in a piteous apology for what little autonomy has been won.

  4. There are also aftermarket dynamic analysis tools that are widely used, such as Valgrind and sanitizers. The Rust compiler moves many of these dynamic analyses into the type system as well.

    Note that I’m using type system to describe the combined effects of both the type checker and the borrow checker here: in a nutshell, anything that gets in the way of you being able to run your code. ↩︎

  5. The borrow checker used to reject many valid programs because they weren’t written in precisely the right way. The non-lexical lifetimes project has addressed many of those issues, but others remain, and will continue to do so because of the halting problem.

    The Rust community’s reaction to gratuitous use of unsafe is, I think, best understood as a defense mechanism against ingroup trust violations. This is not a justification for terrible behavior. ↩︎

  6. Though sometimes they can, especially if it is possible to make invalid states unrepresentable. These techniques synergize well: you may model, say, a number between 0–127 as a u7 wrapper around a u8, allowing for the numbers 128–255 to be invalid but internally representable. But at a higher level, a struct that has a u7 field has made those numbers unrepresentable. The practical advantage is that u7 can be tested independently from any struct that uses it.

    What I’m trying to get at here is the general idea of reducing the surface area of sensitive code. I think type-level proofs are very good at achieving this. ↩︎
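A sketch of that u7 wrapper (the name and API are illustrative, not a standard type):

```rust
/// A hypothetical seven-bit integer: a newtype over u8 whose constructor
/// rejects 128..=255, so every value that exists holds a number in 0..=127.
#[derive(Clone, Copy, Debug, PartialEq)]
struct U7(u8);

impl U7 {
    fn new(value: u8) -> Option<U7> {
        if value < 128 {
            Some(U7(value))
        } else {
            None // representable in the underlying u8, but invalid for a U7
        }
    }

    fn get(self) -> u8 {
        self.0
    }
}

fn main() {
    assert_eq!(U7::new(127).map(U7::get), Some(127));
    assert_eq!(U7::new(128), None);
}
```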

  7. For performance reasons, a production system may use a wrapper over a Box<str>, &'static str or &'bump str instead. The point still holds. ↩︎