10 December 2025

Escape from Lean 4

I took this year’s Advent of Code as an opportunity to learn Lean 4.

Lean is a language where if you want to write arr[i] you need a proof that i is in range. If the reason i is in range is because its value is (hi + lo) / 2, and hi and lo are both in range, then your proof might require some Loogle searches, and if that comes up empty, 20 minutes of work and 1-8 lines of code.

This post is about the escape hatches in Lean.

Ways of knowing

First, Lean tries to find a proof for you in the current environment.

arr[i] is syntactic sugar for something like GetElem.getElem arr i (by get_elem_tactic). The getElem function takes a proof that the index is valid as its third argument. The by-expression invokes a tactic to find the proof for you; tactics are allowed to see the static environment in order to do their job. It reminds me of how direct eval in JS can see the dynamic environment.

So one way to brute-force the proof is a run-time check:

if valid : i < arr.size
then arr[i]
else ...

The if expression with a : puts a proof of the fact being checked into the environment. Then get_elem_tactic sees that fact and uses it to satisfy getElem’s type signature.

Of course, you then have to figure out something to do in the else branch.

Panic!

Another escape is panic.

Lean has a panic! expression and an array indexing notation arr[i]! that panics if i happens to be out of bounds.

However, panic doesn’t terminate execution in Lean. A panicking expression prints a message, then returns a default value of whatever type the context expects. Execution carries on. Absolutely wild.

Partial functions

By default, Lean requires you to prove that your functions terminate. To some extent, this is necessary for the logic to be sound.

This is usually no trouble, but if you use recursion (as the language has no built-in looping constructs) you need to establish that each recursive call operates on a "smaller" problem than its caller. Lean provides termination_by and decreasing_by clauses to help with this, but alternatively there’s an escape.

You can label a function as partial. Then it’s OK if it doesn’t terminate. (To keep the logic sound, this is only allowed if the return type can be shown to have at least one value. For data types this is, of course, super trivial, but for proposition types the only way you know there’s a value is to prove the proposition. And in generic code, you might have to impose a funny [Inhabited t] bound on the type parameter.)

Oddly, non-partial functions can call partial functions.

Printf debugging

Most I/O has to use the IO monad in Lean, but for debugging there is a welcome escape:

dbg_trace "got here"
f x y

This logs a message just before f x y gets evaluated. The fact that newlines (or equivalently ;) are used for sequencing here is a little unusual outside of a monad, but it works out really nicely.

I have not tried to understand Lean’s grammar, which is (as usual for languages in this neighborhood) wack.

Sorry

It’s possible to work for a while on a proof only to find out the line you’ve taken won’t work. Maybe it relies on an assumption that simply isn’t true. So doing things with proofs is risky in terms of your time.

And the risk is not totally optional. Sometimes values carry around proofs. You can use a Fin x instead of an Int - it contains an integer value and a proof that the value is in the range 0..x. This is great because arr[fin] always works, but it means your program is using this Fin arr.size type now, so the type structure of your program is that much stricter. Anyone wanting to enter your code will have to prove their indices are actually that type at the door.

Which is to say, sometimes Lean demands a proof and it would be hours or days of work to provide one. As someone starting out, I seem to stumble to the edge of these unseen chasms in the landscape at random times.

For better or worse... there is an escape.

The sorry keyword can stand in for any value, including proofs. The compiler warns about it, but it will produce an executable anyway! If the sorry is actually reached at run time, Lean prints a nice hardcore INTERNAL PANIC: executed 'sorry' and exits the process. But if you use sorry in a proof, you’re in Undefined Behavior territory. This program segfaults on my machine:

def main : IO Unit :=
  let arr := #["human", "world", "jorendorff"]
  let i := 3000
  have : i < arr.size := sorry
  let guy := arr[i]
  IO.println s!"Hello, {guy}!"

Cthulhu almost tamed, Cthulhu expert announces

Back in March, Herb Sutter posted a long blog post about progress taming the undefined behavior in C++.

Herb Sutter is great, the progress he's reporting seems really important to me, and I applaud all the efforts documented there. However, I feel like the post has some misleading parts and someone should point them out.

(1) Since C++11 in 2011, more and more C++ code has already become UB-free. Most people just didn’t notice.

  • Spoiler: All constexpr/consteval compile-time code is UB-free. As of C++26 almost the entire language and much of the standard library is available at compile time, and is UB-free when executed at compile time (but not when the code is executed at run time, hence the following additional work all of which is about run-time execution).

It is really significant to have a compile-time mode that's defined in the standard and includes most of the langugage. If you haven't played with it, you should. C++ compilers now include a C++ interpreter that you can use to interrogate the language. Rust has this for a subset of the language (minus the standard) and it's been super useful.

It's very strange to ask readers to believe this has had a significant impact on the safety of C++ code in the real world. The use of the phrase "more and more C++ code" reads that way to me, at least. But of course practically all C++ code runs at run time, not at compile time. That's why they call it run time. An example of something you can't do in constexpr code at compile time, even in C++26, is open a file. Or react to user input in any way.

If successful, these steps would achieve parity with the other modern memory-safe languages as measured by the number of security vulnerabilities, which would eliminate any safety-related reasons not to use C++.

The steps in question are, in my words:

  1. Standardize compile-time constexpr/consteval without UB, as described above. (done!)
  2. Ratify C++26 to add contracts, eliminate UB from uninitialized locals, and add optional standard library hardening.
  3. Catalog and address all the other run-time UB.

Needless to say, step 3 is the lion's share of the work. It's the closest thing to the “draw the rest of the owl” meme I've ever seen in real life.

I'm not saying this to downplay the work that went into C++26. Again, I actually do think that work is smart and important. In particular, Thomas Köppe's work on uninitialized reads is great stuff. There's been a sea change: the C++ standard committee as a whole is interested in reducing UB and making the language a productive partner in the work of building secure systems. That hasn't always been the case.

I appreciate Herb Sutter's positive attitude and I think everything that can be done for C++ should be done. I can imagine a future where C++ can be memory-safe, with acceptable run-time costs. But let's be honest about how long this road is and how far along it we can see.