Notes
2017-02-15
Undefined behavior in C is a common source
of bugs and, sometimes, of funny ones.
Here is my story about one of them.
A few months ago I was working on a
function that contained a loop like this:
for (i = 0; arr[i] != 0 && i < 2; ++i) {
    /* do some work */
}
When my program was compiled without
optimizations it would behave correctly,
but turning optimizations on made it
incorrect. I was able to quickly
pinpoint the above loop as the root of
my issues, but the symptom of the bug
was quite unusual: stepping through it in
the debugger revealed that the loop body
was executed only once, when it should
have run twice! After all, the array arr
contained two non-zero values.
After a little bit of head scratching,
I eventually realized what the compiler
was doing. The variable arr was declared
as int arr[2], so accessing its third
element, arr[2], is undefined behavior.
Because of this, a valid program cannot
access arr[2];
but if the loop body is run twice, i
becomes 2 and the test condition
evaluates arr[2]!=0 before it ever gets
to i<2. The consequence of this reasoning
is that, assuming the input program is
valid, the second loop iteration can
never run, so the compiler is free to get
rid of it!
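To make that reasoning concrete, here is
a minimal sketch of what an optimizer is
allowed to reduce the loop to. It is an
illustration, not the actual output of any
particular compiler. The names runs,
do_some_work and loop_as_optimized are
hypothetical stand-ins that do not appear
in the original program.

```c
/* Hypothetical stand-in for the loop body: counts how often it runs. */
static int runs = 0;

static void do_some_work(int i) {
    (void)i;
    ++runs;
}

/* One transformation the compiler may legally apply to
 *
 *     for (i = 0; arr[i] != 0 && i < 2; ++i) { ... }
 *
 * when arr is declared as int arr[2]. If the body ran for i == 1,
 * the next test would read arr[2], which is undefined behavior.
 * The compiler may therefore assume arr[1] == 0, so only the
 * i == 0 iteration can ever execute. */
static void loop_as_optimized(const int arr[2]) {
    if (arr[0] != 0) {
        do_some_work(0);
    }
}
```

Calling loop_as_optimized on an array that
holds two non-zero values runs the body
once, which matches the symptom seen in
the debugger.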
I thought this was quite a remarkable
use of undefined behavior: Invalid array
accesses do not happen in valid programs,
so if the compiler can prove that such an
access happens in a branch, it may treat
that branch as dead code.
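One way to avoid the problem is to test
the index before reading the array.
Because && short-circuits, arr[i] is then
only evaluated when i is in bounds. The
sketch below assumes a hypothetical helper
count_leading_nonzero that takes the array
length as a parameter n. Neither name
comes from the original program.

```c
#include <stddef.h>

/* Hypothetical helper: returns how many leading non-zero elements
 * of arr (at most n) the loop visits. */
static size_t count_leading_nonzero(const int *arr, size_t n) {
    size_t i;
    /* Bounds check first: && evaluates left to right and stops as
     * soon as i < n is false, so arr[n] is never read. */
    for (i = 0; i < n && arr[i] != 0; ++i) {
        /* do some work */
    }
    return i;
}
```

With the operands in this order, the loop
visits both elements of a two-element
array of non-zero values, and the
optimizer has no out-of-bounds read to
reason from.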