The C language specification, in an attempt to be as portable and flexible as possible, leaves a lot of behavior undefined. There are many things the spec avoids specifying so that the language can be ported to platforms which may lack certain features or would incur excessive overhead from implementing them. Because undefined behavior is, after all, undefined, compilers are free to do whatever they want when they encounter it (or nothing in particular at all), so long as the resulting code still meets the requirements of the C language spec.
Let’s start by taking a look at the following function:
void f1() {
    for (int i = 0; i >= 0; ++i) {}
}
This function contains undefined behavior. Do you see it? The variable i is defined as int (aka signed int). The C language specification does not specify what happens when a signed integer overflows. Assuming int is 32 bits, this means that when i is equal to 2,147,483,647, the behavior of ++i is undefined. With some compilers on some systems, it will simply wrap around to -2,147,483,648, but there is no guarantee that that will happen!
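As a minimal sketch of how code can sidestep this overflow, the increment can be guarded by a check against INT_MAX, or the counter can be made unsigned, since unsigned arithmetic is defined to wrap. The helper names checked_increment and wrapping_increment are made up for this illustration and are not part of the original program.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: increments *value only if the result is representable.
 * Returns false (leaving *value untouched) when ++*value would overflow. */
bool checked_increment(int *value) {
    if (*value == INT_MAX) {
        return false; /* incrementing here would be undefined behavior */
    }
    ++*value;
    return true;
}

/* Hypothetical helper: unsigned arithmetic wraps modulo 2^N by definition,
 * so UINT_MAX + 1u is well-defined and yields 0. */
unsigned int wrapping_increment(unsigned int value) {
    return value + 1u;
}
```

Calling checked_increment on a variable holding INT_MAX reports failure instead of overflowing, while wrapping_increment(UINT_MAX) is guaranteed to return 0.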
So, this function contains undefined behavior. Where does the clang bug come into play? Let’s put this into a full program and see what happens:
#include <stdio.h>

void f1() {
    for (int i = 0; i >= 0; ++i) {}
}

void f2() {
    puts("Hello from f2!");
}

/* volatile keeps the compiler from resolving the indirect calls at compile time */
void (*volatile p1)(void) = &f1;
void (*volatile p2)(void) = &f2;

int main(void) {
    p1();
    return 0;
}
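Intuitively, we would think that by calling p1() (which is a pointer to f1), we should get an infinite loop. If you compile this with clang 13 with no optimizations, the loop does run as written: the program spins without printing anything. (On typical two's-complement hardware, unoptimized code lets i wrap around to a negative value after about two billion iterations, at which point the loop ends and the program exits quietly, so the loop is long rather than truly infinite.)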
However, when optimizations are enabled, compilers are notorious for doing all sorts of tomfoolery. Let’s compile this program with clang 13 with optimizations enabled (-O3, in this case). After running it this time, a funny thing happens: it prints Hello from f2! to the console and exits successfully. What happened here?
You can see both of these taking place on Godbolt here.
It turns out that clang 13 notices that every path through that function leads to undefined behavior. Because the behavior is undefined, it is valid (though perhaps unintuitive to the programmer) for the compiler to just remove the offending loop entirely. At first glance this seems reasonable. However, it’s clearly still doing something wrong, since f2 is called when we call f1! As it turns out, clang 13 is taking it a step further than you might expect: instead of optimizing out the loop and continuing through the function, it simply stops emitting code for the function entirely, resulting in an empty function. With no instructions of its own, not even a return, f1 falls straight through into whatever code comes next, which here is f2.
We’ve previously discussed that the compiler can do whatever it wants when it finds undefined behavior, so why is this a miscompile? As it happens, the C standard says that no two distinct functions may have the same address. This means that if you define two functions (as we have done with f1 and f2) they are not allowed to point to the same instructions. However, if we look at the disassembly of this program, we see the following:
f1:
f2:
    mov edi, offset .L.str
    jmp puts
We see here that the labels f1 and f2 appear back to back and both point to the same instructions: the body of f2. Thus, this is a miscompile, as two distinct functions f1 and f2 have the same address. Neat!
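The address rule at stake here can be checked directly by comparing function pointers. The sketch below is only an illustration: return_one, return_two, ptr_one, ptr_two, and addresses_differ are made-up names, and the functions are given different bodies so the example does not depend on how a toolchain handles identical code. The volatile pointers mirror p1 and p2 from the program above, forcing the addresses to be loaded at run time.

```c
#include <stdbool.h>

/* Two hypothetical, distinct functions. */
int return_one(void) { return 1; }
int return_two(void) { return 2; }

/* volatile forces each pointer to be read at run time, as with p1/p2 */
int (*volatile ptr_one)(void) = &return_one;
int (*volatile ptr_two)(void) = &return_two;

/* On a conforming implementation, pointers to distinct functions
 * must compare unequal, so this returns true. */
bool addresses_differ(void) {
    int (*a)(void) = ptr_one;
    int (*b)(void) = ptr_two;
    return a != b;
}
```

In the miscompiled program, the equivalent comparison between f1 and f2 would not hold, since both labels resolve to the same address.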
Thanks to @m13253 on Twitter for bringing this bug to my attention. You can also read the LLVM bug report if you are interested in the nitty-gritty details.