iddqd, or the hardest kind of unsafe Rust

Mirrored from the canonical version on the Oxide blog.

Update 2026-06-16: I talked about this post on Oxide and Friends, in an episode titled The Hardest Kind of Unsafe Rust. Check it out on YouTube, Transistor, or wherever else you get your podcasts.

I’m the main author of iddqd, a Rust library for maps (named after the Doom cheat code) where keys are borrowed from values. At Oxide we use it extensively in Omicron, our control plane—the software that sits at the heart of every Oxide rack, provisions resources like compute and storage for our customers, and ensures the rack stays up and running over time. iddqd maintains in-memory indexes of the kinds of large records that show up everywhere in a system like that, such as disks or sled inventories. As a result, it must be correct: if it misbehaves, our control plane can malfunction in ways that are unpredictable and hard to diagnose.

iddqd consists of a fair amount of unsafe code underneath. There’s been some recent concern over the amount of unsafe code in Rust rewrites, so I thought I’d write about some of the unsafe code in iddqd and how we try to tame it.

What problem does iddqd solve?#

With Rust’s standard library maps, keys are stored separately from values. Let’s say you want to store a map of records keyed by an email address. With std::collections::BTreeMap, you might write something like:

// Email is typically a newtype which validates that the email address is well-formed.
// In this example, we alias it to String for simplicity.
type Email = String;

struct User {
    name: String,
    age: u8,
}

let mut users = BTreeMap::<Email, User>::new();
users.insert(
    "alice@example.com".to_string(),
    User { name: "Alice".to_string(), age: 30 },
);
users.insert(
    "bob@example.com".to_string(),
    User { name: "Bob".to_string(), age: 35 },
);

This approach has what I consider to be a pretty major downside: the key (email) is not stored in the same struct as the value (the rest of the record). How would you handle this? One way is to pass around both the email and the user, for example with get_key_value:

fn process_user(email: &Email, user: &User) {
    /* ... */
}

let (email, user) = users.get_key_value("alice@example.com").unwrap();
process_user(email, user);

As an extension, you could maybe have a struct which combines both at fetch time:

struct UserRecord<'a> {
    email: &'a Email,
    user: &'a User,
}

let (email, user) = users.get_key_value("alice@example.com").unwrap();
let record = UserRecord { email, user };

In practice, this gets quite unwieldy when you have lots of different types of records that need this kind of treatment.

Alternatively, you could duplicate the email across the key and the value:

struct User {
    email: Email,
    name: String,
    age: u8,
}

let mut users = BTreeMap::<Email, User>::new();
let email = "alice@example.com".to_string();
users.insert(
    email.clone(),
    User { email, name: "Alice".to_string(), age: 30 },
);

But that has the risk that the emails stored in the key and the value fall out of sync.

iddqd provides a better alternative. With IdOrdMap, you can write something like:

struct User {
    email: Email,
    name: String,
    age: u8,
}

// You implement a trait which tells iddqd how to extract the key from the
// record.
impl IdOrdItem for User {
    // The key type can borrow from the value!
    type Key<'a> = &'a Email;

    fn key(&self) -> Self::Key<'_> {
        &self.email
    }

    // This macro expands into a small amount of boilerplate to work around
    // a Rust borrow checker limitation.
    id_upcast!();
}

// Then, you can insert records:
let mut users = IdOrdMap::<User>::new();
users
    .insert_unique(User {
        email: "alice@example.com".to_string(),
        name: "Alice".to_string(),
        age: 30,
    })
    .unwrap();
users
    .insert_unique(User {
        email: "bob@example.com".to_string(),
        name: "Bob".to_string(),
        age: 35,
    })
    .unwrap();

// And look them up by email:
assert_eq!(&users.get("alice@example.com").unwrap().name, "Alice");
assert_eq!(users.get("bob@example.com").unwrap().age, 35);

At Oxide, this has proven to be an invaluable pattern: many of the records in our control plane are quite large (think database lookups), and iddqd is a great fit for them. It also comes with several other features that directly address pain points we’ve dealt with at Oxide. A few worth calling out:

First-class support for complex keys that borrow from more than one field, without having to resort to workarounds like dynamic dispatch.
Maps with two or three keys per item, each independently indexing the same record, without the usual pattern of maintaining multiple maps by hand.
Serde implementations that serialize as sequences rather than maps, so that non-string keys can be serialized in JSON¹. Importantly, these implementations reject duplicate keys. (For backwards compatibility, serialization as maps is also supported.)

Like many of our other crates, iddqd is built for Oxide’s needs but is generally useful to the Rust community. You’re welcome to use it in your own projects as well.

On unsafe Rust#

Before I move on, I want to talk about what it means for unsafe to exist in a memory-safe language. The big concern is undefined behavior (UB): a program behaves in an unpredictable way because core assumptions made by the language or compiler have been violated. Rust calls an abstraction sound if no safe code can use it to cause UB, and unsound otherwise.

The vast majority of Rust code is safe, which (assuming that any unsafe used by the safe code is sound) means that no UB can occur. However, due to the fundamental undecidability of static analysis (see Rice’s theorem), it is impossible for the Rust compiler—or any kind of algorithm that terminates—to accept all programs without UB and reject all programs with it. Therefore, when writing such an algorithm, its authors have to make a decision: do they reject some programs without UB, accept some programs with UB, or both? The Rust compiler does the first: within the context of safe Rust, it rejects all programs with UB but also some without UB. (This is the correct choice!)

What if your program is in the no-man’s-land where it gets rejected even though you know it doesn’t have UB? To express those kinds of programs, the Rust compiler provides an escape hatch: the unsafe keyword. By writing it, the author takes responsibility for soundness: they vouch that no safe code can use this to cause UB, and that the compiler should trust them.

What does it mean for some unsafe code to not have UB? Informally, it must follow the rules of Rust—all the rules that the compiler proves for safe Rust. Some examples of these rules are:

There are no data races.
There must be no reads of uninitialized or freed memory.
There must not ever be multiple aliases of &mut references to the same region of memory.
Immutable data must not be mutated.

The rules of Rust are very hard to reason about! I consider them to be significantly harder than C, for example, particularly due to the rules around mutable aliasing². So judicious use of unsafe Rust usually involves encapsulating it behind a safe abstraction.

Unsafe Rust requires reasoning about safe Rust#

In all but the simplest of cases, it is not possible to be sure that such an abstraction is sound without also reasoning about the surrounding safe Rust. In this section, I’ll walk through three examples in increasing order of difficulty.

The first example is the split_at_mut method on slices. This method splits a mutable slice into two separate parts. split_at_mut is a safe method, but safe Rust lacks the ability to express this kind of partitioning of a slice. Thus, split_at_mut requires unsafe Rust.

The implementation is roughly:

fn split_at_mut<T>(slice: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    assert!(mid <= slice.len());
    let ptr = slice.as_mut_ptr();
    let remaining = slice.len() - mid;
    unsafe {
        (
            std::slice::from_raw_parts_mut(ptr, mid),
            std::slice::from_raw_parts_mut(ptr.add(mid), remaining),
        )
    }
}

To be sure of the correctness of the unsafe block, you also have to consider several things about the surrounding safe code:

that the function was provided a &mut [T], and not, for example, a &[T]
the assert! ensuring that mid is in bounds
the fact that remaining is slice.len() - mid, and not, for example, (slice.len() - mid) + 1

If any of these invariants (all of which are safe Rust) are not upheld, the safe abstraction becomes unsound.

The next level up is that in many cases you also have to analyze the entire module the unsafe code is present in. Consider this vector type:

struct MyVec<T> {
    ptr: NonNull<T>,
    len: usize,
    cap: usize,
}

impl<T> MyVec<T> {
    pub fn get(&self, i: usize) -> Option<&T> {
        (i < self.len).then(|| unsafe { &*self.ptr.as_ptr().add(i) })
    }
}

The soundness of get relies on every other method in the module, safe and unsafe, ensuring that len is correct and in bounds. Encapsulation and privacy (len cannot be mutated from outside the module) are load-bearing here.

The hardest situations, though, are when unsafe code is generic and calls back into user-supplied code. In those cases, the fundamental principle is:

Safe Rust code, no matter how pathological or adversarially written, must never cause unsafe code to exhibit undefined behavior.

In other words, you no longer have the luxury of analyzing just the safe code you see; you must put yourself in the shoes of an attacker, considering every safe Rust program one could write, and be resilient to all of them. For example, it is possible for user code to misbehave in a way that leads to internal tables or data structures being corrupted. Safe but buggy Rust may cause unsafe Rust to be slower, produce incorrect results, or even panic, but it must not cause UB!

(Why assume the worst? You might argue that well-meaning Rust developers are quite unlikely to write adversarial code. One reason for having a firm rule is that guarding against adversarial code means guarding against innocent mistakes as a corollary; a bright line makes it easier to assign responsibility for protection against UB.)

An example is the ExactSizeIterator trait, which has a len method that returns the exact size of an iterator. A developer might be tempted to take advantage of this in unsafe Rust:

pub fn collect_exact<T, I: ExactSizeIterator<Item = T>>(iter: I) -> MyVec<T> {
    let n = iter.len();
    let mut v = MyVec::with_capacity(n);
    for (i, item) in iter.enumerate() {
        unsafe { std::ptr::write(v.ptr.as_ptr().add(i), item); }
    }
    v.len = n;
    v
}

This code does not cause UB if the iterator is well-behaved. But there’s no guarantee of such behavior! It is entirely possible to write a broken ExactSizeIterator implementation that returns a bogus result, all in safe Rust:

struct BadIterator {
    remaining: usize,
}

impl Iterator for BadIterator {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        (self.remaining > 0).then(|| { self.remaining -= 1; 42 })
    }
}

impl ExactSizeIterator for BadIterator {
    fn len(&self) -> usize {
        0 // but this actually yields `self.remaining` items!
    }
}

collect_exact will write into unallocated memory in this case, so it is unsound in general³.

Another example is the std::io::Read trait, where a safe but buggy Read implementation can return the wrong number of bytes read. See Rust RFC 2930 for more discussion about this.

iddqd is in exactly this situation: it consists of generic data structures which call into user-supplied code, so it is working at this highest level of difficulty⁴.

How iddqd works#

iddqd’s architecture is pretty straightforward: it uses a combination of a list of items (internally called an ItemSet) and a table with indexes into the slots. For example, IdHashMap<T> consists of:

an ItemSet<T>, which (similar to the slab crate) internally has:
- a Vec<ItemSlot<T>>, where ItemSlot<T> is an enum with two variants:
  - Occupied(T)
  - Vacant { next: ItemIndex } (where ItemIndex represents an integer index into the Vec<ItemSlot<T>>)
- a free_head: ItemIndex, which points to either the most recently freed Vacant slot, or is a sentinel (marked as SENT in the diagrams below) indicating that no free vacant slots are available. (The combination of free_head and the Vacant slots forms a free chain.)
a hashbrown::HashTable over ItemIndex (hashbrown is the fast hash table implementation used by the standard library)

For example, an IdHashMap holding three items might look like the diagram below. The top row is the hash table over ItemIndex; the middle row is the Vec<ItemSlot<T>>.

An IdHashMap holding three items. The top row is a hash table mapping hash(k₁) → 0, hash(k₀) → 3, and hash(k₂) → 4. The middle row is the item vector: slot 0 Occupied(t₁), slot 1 Vacant (next: SENT), slot 2 Vacant (next: 1), slot 3 Occupied(t₀), slot 4 Occupied(t₂). free_head points to slot 2.

Adding a new item consists of checking free_head to see if any Vacant slots are available. If so, we read the next index out of that Vacant slot, overwrite the slot with Occupied(...), then set free_head to the next value we read. Otherwise, we push a new Occupied slot to the end. Finally, the hash table is updated, fetching the key and computing the hash, then recording the new index in the hash table.

Starting from the state above: Let’s say we want to insert a new item t₃. Calling insert(t₃) recycles slot 2 from the free chain, advances free_head to 1, and records the new hash entry:

The same IdHashMap after inserting t₃. Slot 2 is recycled from the free chain and now holds Occupied(t₃), and a new hash entry hash(k₃) → 2 is added. free_head advances to slot 1.

Removing an item given a key involves this process in reverse: first, consult the hash table using the hash computed from the key, and remove the found index from the hash table. With the index in hand, replace the ItemSlot with Vacant, setting the internal next to the current value of free_head. Finally, set free_head to the index that was just replaced with Vacant.

Continuing from the post-insert state above: Calling remove(k₀) drops the hash entry for k₀, marks slot 3 as Vacant with its next pointing at the old free_head (1), and updates free_head to 3:

The IdHashMap after removing k₀. The hash entry for k₀ is dropped, slot 3 becomes Vacant with next pointing at the old free_head (1), and free_head updates to 3.

This pattern generalizes to the other map types:

IdOrdMap follows a similar structure, except the hash table is replaced with a B-tree index⁵.
BiHashMap and TriHashMap store two and three hash tables respectively.

Unsafe Rust in iddqd#

iddqd has a few different kinds of unsafe Rust within it. Most uses of unsafe follow well-established patterns also used within Rust’s standard library—these are not dependent on user code, and a sufficiently smart borrow checker would make these uses of unsafe unnecessary. But there are a few that stand out.

Perhaps the most challenging one to reason about is mutable iteration over items. For IdOrdMap::iter_mut, iteration is ordered by the key. One might start off by writing something like:

pub struct IterMut<'a, T: IdOrdItem> {
    items: &'a mut ItemSet<T>,
    // This is an ordered iterator over the B-tree equivalent to a hash table.
    iter: btree_table::Iter<'a>,
}

impl<'a, T: IdOrdItem + 'a> Iterator for IterMut<'a, T> {
    type Item = &'a mut T;

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        // Obtain the next index to return, in order.
        let index = self.iter.next()?;

        // Fetch the item.
        let item: &mut T = &mut self.items[index];

        Some(item)
    }
}

But this would immediately return an error:

error: lifetime may not live long enough
   --> crates/iddqd/src/id_ord_map/iter.rs:90:9
    |
 79 | impl<'a, T: IdOrdItem + 'a> Iterator for IterMut<'a, T> {
    |      -- lifetime `'a` defined here
...
 83 |     fn next(&mut self) -> Option<Self::Item> {
    |             - let's call the lifetime of this reference `'1`
...
 90 |         Some(item)
    |         ^^^^^^^^^^ method was supposed to return data with lifetime `'a`
    |                    but it is returning data with lifetime `'1`

The reason this occurs is that Rust’s iterators require that the returned values do not borrow from the iterator itself⁶. Many uses of iterators do not need this capability—in particular, most for loops only operate on one item at a time:

let mut items = IdOrdMap::<T>::new();
// ... insert some items

for item in &mut items {
    item.value += 1;
}

But Rust’s iterators also let you create values that outlive the iterator, via e.g. the collect combinator:

let mut items = IdOrdMap::<T>::new();
// ... insert some items

let items_mut: Vec<&mut T> = items.iter_mut().collect();

items_mut cannot outlive items itself, but can outlive the iterator returned by items.iter_mut().

Crucially, this relies on knowing that all of the &mut T returned by the iterator are disjoint—or, in other words, that the iterator never yields the same value twice. (Remember that one of the rules of Rust is that there must never be multiple &mut references to the same memory.) But the Rust borrow checker just sees a succession of accesses into a list, and it has no way of knowing that the indexes are all different. If the human writing the code has knowledge that all of the indexes are different, though, they can use an unsafe pattern called lifetime extension to tell the borrow checker to let this code through:

impl<'a, T: IdOrdItem + 'a> Iterator for IterMut<'a, T> {
    type Item = &'a mut T;

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        // Obtain the next index to return, in order.
        let index = self.iter.next()?;

        // Fetch the item.
        let item: &mut T = &mut self.items[index];

        // Extend the lifetime of the item to 'a.
        // 
        // SAFETY: The indexes returned by `self.iter` are known to all be different,
        // so successive calls to `IterMut::next` never create mutable aliases to
        // the same memory.
        let item = unsafe { std::mem::transmute::<&mut T, &'a mut T>(item) };

        Some(item)
    }
}

A bug and a fix#

In the previous section, we saw how unsafe Rust acts as an escape hatch for facts that the Rust compiler cannot establish by itself—in this case, that the indexes returned by self.iter are all different. But are the indexes actually different? Based on the storage scheme described above, it would seem like that is the case. But this is generic code which calls into user-provided functions, so there’s a chance that sufficiently pathological user code can trick the map into storing duplicate indexes.

Let’s take one such example we recently found and fixed. Suppose you have an IdOrdMap called items storing five items with integer keys 0 through 4, inserted and stored in order. Here, the top row is the B-tree storing indexes in key order, while the bottom row is the Vec<ItemSlot<T>>:

An IdOrdMap storing five items with integer keys 0 through 4. The top row is a B-tree of (key, index) pairs in key order; the bottom row is the item vector, with each B-tree entry pointing to its matching Occupied slot.

Now suppose you use the Entry API (a standard Rust data structure pattern) to fetch an item by index 0:

// In this example, the integer keys are wrapped in another type. We're going
// to swap in a different Ord for Key in the next step.
let entry = items.entry(Key(0));

This will successfully look up the item stored at index 0 and return an OccupiedEntry. The OccupiedEntry internally stores the fact that the item was found at index 0.

Now, at this point, let’s deliberately switch the Ord implementation for Key to always return Equal, no matter the values inside. There are a few different ways to do this in safe Rust, most of which use some form of interior mutability. (This is unlikely in ordinary Rust, but that doesn’t matter: unsafe generic Rust must handle arbitrarily pathological Rust!)

Now remove the entry from the map using OccupiedEntry::remove:

let item = entry.remove();

At this point, iddqd would attempt to remove index 0 from the B-tree, descending into the tree again but comparing only by key⁷. Because we made comparisons by key always return Equal, the comparison would short-circuit at the first element it compares against. For example, if the first comparison lands on the B-tree entry (key = 2, index = 2), then that entry would be removed. But the OccupiedEntry was carrying index = 0, so it would also remove item 0 from the item set:

The IdOrdMap left inconsistent after a buggy removal. The B-tree entry for key=2/index=2 was removed instead of key=0/index=0; slot 0 is now Vacant with free_head = 0 even though its B-tree entry remains, and slot 2 (key=2) is orphaned with nothing pointing to it.

The map is no longer in a consistent state. The index 0 is still in the B-tree index but the corresponding slot in the ItemSet is vacant; meanwhile, item 2 in the ItemSet does not have any pointers to it.

At this point, though, the invariant that there aren’t duplicate indexes hasn’t been violated. Where it does get violated is with this next step: suppose the Ord implementation now switches back to being honest⁸, and you insert a new entry with key 1000. Since free_head points to item 0, iddqd will attempt to reuse that slot. It is this step that results in duplicate indexes:

The IdOrdMap after inserting key=1000 into the reused slot 0. Two B-tree entries, key=0/index=0 and key=1000/index=0, now both point at slot 0, creating duplicate indexes, while slot 2 remains orphaned.

There are now duplicate indexes pointing to the same item, and IterMut becomes unsound.

This was fixed through a combination of two things:

When descending into the B-tree, also check equality against the index. In other words, if we’re looking for a key with a known index in the B-tree, compare against both key and index. As covered in the example above, previously, we would just check that the key is the same; the pathological Ord implementation returned Equal, tricking the map into believing that the entry at (key = 2, index = 2) was the one to remove. We changed this to additionally use the index as a tie-breaker. The comparison against (key = 2, index = 2) now resolves to Less (since 0 < 2), so the search doesn’t spuriously match here.
By construction, the only way the search can now succeed is if the stored index actually equals the one we’re looking for.
If there are no matches, fall back to a linear scan. The tie-breaker eliminates spurious matches on the wrong entry, but it doesn’t guarantee that the search finds the right entry. (The B-tree is sorted by key, while the tie-breaker compares by index, and in general these two orderings are independent.) If the tree search yields no results, we must still keep the tree and the ItemSet in sync. In that case, we remove the index from the B-tree by doing a linear scan without calling into user code. The remove operation then takes linear rather than logarithmic time, but that is an acceptable tradeoff, since buggy user code is the only way to encounter the fallback.

This is the kind of long-range reasoning that is sometimes required to establish the soundness of a safe abstraction. The overall system relies on all of these moving parts working correctly across arbitrarily complex series of operations.

Validating iddqd#

Because iddqd is such a foundational data structure at Oxide, we go through great lengths to validate its correctness, including the soundness of its abstractions. There is a great deal of analytical reasoning performed by experienced Rust authors and reviewers, as described in the section above. But we also empirically validate iddqd along several different dimensions. None of the layers is sufficient by itself, but by rigorously doing all of them we can be more confident that iddqd is correct.

The layers of validation we apply are:

Technique	Catches
Analytical reasoning	Soundness errors
Example-based tests (including doctests)	Regressions, broken examples
Pathological tests with Miri	Adversarial trait impls causing UB
Model-based tests (PBT with `NaiveMap` oracle)	Divergence from oracle over random operation sequences
Panic safety fault injection	Invariant violations on mid-operation panics
LLM adversarial review	Blind spots in adversarial user code
Formal verification	Proofs over arbitrarily adversarial trait impls

In this section, I’ll provide a brief overview of each layer. Click through the links to see more details and code samples.

Analytical reasoning#

All blocks and patterns of unsafe Rust have been analyzed by at least one human reviewer, and up to three. Thanks to several of my Oxide colleagues for lending their time and expertise to these reviews—these discussions helped sharpen our reasoning considerably.

The reasoning for each unsafe block is captured in a // SAFETY: comment above the block. We use Clippy’s undocumented_unsafe_blocks lint to verify that these comments are present.

Example-based tests#

The simplest kind of empirical validation is with example-based tests: create a map, perform some operations, and ensure that the results of those operations are as expected. iddqd has unit tests for its internals and integration tests for the public API. These also live as doctests, where they double as documentation.

Pathological tests with Miri#

iddqd also has a battery of pathological tests which supply buggy implementations of Ord and other traits.

In CI, both regular and pathological tests are run under the Miri interpreter, which can detect many classes of undefined behavior. (If you’re curious about the details, we run with Miri under both Stacked and Tree Borrows.) But note that some classes of UB can be detected in the regular test environment as well. For example, in the previous section we established that the tables not containing duplicate indexes is a necessary condition for soundness; this invariant can be verified outside the Miri context as well.

Model-based testing and fault injection#

iddqd uses two layers of randomized testing: model-based comparison against a known-correct oracle, and fault injection on top of it.

For data structures, example-based tests alone are generally considered insufficient. A much stronger kind of testing is model-based testing, also known as stateful property-based testing (stateful PBT): random sequences of operations are applied to an instance of the type, and the results are compared against an oracle that is known to be correct. iddqd has extensive model-based tests against a NaiveMap oracle that is inefficient but clearly correct. (I’ve talked about this style of testing before on Oxide and Friends.)

An extension of model-based testing is fault injection, where bugs are randomly inserted into user code. For iddqd, a fruitful avenue for fault injection has been panic safety (or unwind safety): user code panicking in the middle of an operation must not cause the map’s invariants to be violated⁹. We systematically explore fault injection by generating random sequences of map operations, where each operation is associated with a (randomly selected) panic countdown. Each call into user code decrements the countdown by 1; if it reaches 0, the code panics. Randomized panic safety testing found several subtle bugs in iddqd (example), including some that escalated into unsoundness.

The model-based tests also verify internal invariants after each operation, such as the no-duplicate-index condition described in the previous section. We’ve found in practice that model-based tests are too slow to run under Miri, so we verify the invariants on which soundness (and correctness) is known to be dependent instead.

LLM adversarial review#

Another kind of validation that’s recently become available is LLM-driven adversarial review. Current-generation frontier models¹⁰ found several ways to write pathological implementations of user code that would corrupt the map. In one notable case, an LLM constructed a way for the map’s invariants to break on a custom allocator panicking and unwinding. This is a panic safety issue (and a failure mode I hadn’t thought about before), but distinct from the existing panic safety tests that only covered panics in regular user code like an Ord implementation.

LLMs can sometimes produce plausible-but-wrong soundness claims. An effective way to guard against that is red-green TDD, using Miri as an oracle: For a soundness bug, first have the LLM write a test case demonstrating the bug, running it under Miri to show undefined behavior—the red phase. Then, after fixing the bug, rerun the test to show that it now passes—the green phase.

What about formal methods?#

To formally verify iddqd, one’s first thought would be to use a model checker like Kani to establish that the maps don’t violate internal invariants. But iddqd is unfortunately a bit too complicated for Kani to handle, and the required proofs blow up beyond what is feasible for the tool.

The Creusot deductive verifier can help Rust developers prove their code is free of panics and other errors, but as of this writing it’s unable to prove invariants that must hold even if user code panics or otherwise misbehaves.

The infeasibility of proofs is a common problem with applying formal methods to implementations, so they are often applied to a higher-level specification instead. For iddqd, the closest thing to a specification is NaiveMap, but it can be easily observed to be correct without needing a formal proof.

How do we ensure that the implementation matches the specification? Model-based testing does a lot of heavy lifting here. While they aren’t a formal proof that iddqd matches NaiveMap, running model-based tests thousands of times in CI provides fairly high confidence that it does.

We keep an eye on developments in this space; the in-development cargo-anneal, which lets you embed Lean soundness proofs alongside unsafe Rust as doc comments, looks quite interesting.

If you’re working on formal verification tooling, iddqd is a great candidate to benchmark against because of its crisp invariants and relatively constrained yet non-trivial scope. We’d be especially interested in proofs that hold over arbitrary trait implementations, and of refinement between iddqd and NaiveMap. Please reach out by filing an issue or emailing me!

Update 2026-06-16: Since publication, we’ve used Soteria to verify that iddqd’s no-duplicate-index invariant holds under arbitrarily adversarial Hash and Ord implementations. Soteria is similar to Kani in spirit, though with different internals that make it more compatible with iddqd. Soteria is implemented as a symbolic execution engine that lets you declare points of nondeterminism in your code. (For example, within iddqd, we represent an adversarial Hash or Ord return value as nondeterministic, since we want to model adversarially bad user code.) Then, it:

Compiles the Rust program into a form suitable for formal verification.
Interprets this form until it reaches a point of nondeterminism, treating each such point as a symbol rather than a concrete value.
Branches at each point where control flow depends on a symbol, validating that there is no UB and that the invariants specified by the program are upheld. A lot of the cleverness lies in how the possibility space is collapsed down to something manageable. For example, Soteria doesn’t have to enumerate every possible Hash value; rather, it only has to fork in places where control flow depends on the Hash value.

For IdOrdMap from the example above, the proof runs against the real B-tree and its unsafe code, and it checks exactly the index tie-breaker and linear search fallback described here. (Remove either of them and the proof fails.)

The benefits and limits of this approach are both apparent. It covers every possible return value at once, but to stay fast enough for CI, we have to restrict the proof to operation sequences of depth 3. (In contrast, the property-based tests generate sequences of hundreds of operations.) So it acts as a complement to the existing layers of validation, not a replacement.

We plan to keep expanding the scope of formal verification within iddqd as the tooling continues to mature.

Conclusion#

Writing unsafe generic Rust is extremely difficult. Each invariant that the unsafe code relies upon has to be carefully upheld over arbitrary trait implementations, including adversarial ones. This post covers one such example: how an Ord implementation could be carefully written to trick the map into creating mutable aliases to the same memory, and how we fixed it.

No single technique can hope to catch all bugs, which is why iddqd uses several layers of validation. Humans carefully reasoning about every line of unsafe code goes a long way; example-based, pathological, and randomized tests provide empirical evidence; and frontier models can find new and surprising ways to break code that humans might not have thought of. This is a lot of machinery, but at Oxide we believe that for foundational infrastructure, this level of rigor is justified. And if you agree, we’re hiring!

Diagrams courtesy Ben Leonard. Discuss on Hacker News and Lobsters.

serde_json also supports integer keys; it serializes and deserializes them as strings. ↩︎
In C, the aliasing rules are about accesses: multiple pointers to the same memory are fine, as long as you don’t have a data race on them or violate strict aliasing. In Rust, it is UB for multiple &mut references to the same memory to ever exist, even if they’re not actively being mutated simultaneously. ↩︎
An exercise for the reader. There is a second, entirely different, kind of bug in collect_exact: it can leak memory with some user-supplied iterators. This isn’t a soundness bug, but it is a bug nonetheless. Try working out how this can happen and why this isn’t a soundness bug. (Hint: think about user code panicking.) ↩︎
For this discussion we’re restricting ourselves to single-threaded Rust, since iddqd is not a concurrent data structure. Concurrency amplifies the difficulty of reasoning about Rust several-fold further. ↩︎
The index is built on top of the standard library BTreeMap. This required making BTreeMap work with an external comparator through some further unsafe trickery. ↩︎
For a proposed version of an iterator which does let you borrow values from the iterator, see LendingIterator. ↩︎
Is this second descent necessary? BTreeMap’s own OccupiedEntry stores a cursor into the map to avoid a second descent, and iddqd could be restructured to hold one. But there are other ways to trigger the same issue without using Entry, for example by using a chaotic Ord implementation that randomly returns Equal at times. I’m using Entry here as the clearest way to demonstrate the bug. ↩︎
Restoring Ord is necessary here because, with Equal, the duplicate-detection step at the start of insert_unique would short-circuit and reject the insert. The point is that user code is free to misbehave in arbitrary ways, including changing its mind from one call to the next. ↩︎
Rust binaries can be configured to either unwind (be recovered from) or abort (tear down the process). Panic safety only matters with panic = 'unwind'. At Oxide, we ship our system software with panic = 'abort'. However, libraries cannot assume abort and must work with unwind. ↩︎
Claude Opus 4.7 and GPT-5.5. ↩︎