Undefined behaviour in C gives the compiler permission to silently remove safety checks and delete code the programmer intended to execute. The result is compiled output that does not match what the source code appears to do. This is not a compiler bug. It is an optimisation strategy that the C standard explicitly permits.
By William Murray, Founder of SpeyTech - deterministic computing for safety-critical systems. Inverness, Scottish Highlands.
This article explains what undefined behaviour is and how compilers exploit it to transform correct-looking code into incorrect programs. It then demonstrates five patterns where UB silently changes program behaviour, and provides the compiler flags and coding practices that make UB visible.
What Undefined Behaviour Means
Undefined behaviour is any operation whose result the C standard does not define, granting the compiler unrestricted freedom to assume it never occurs and to optimise accordingly.
The C standard (C99 section 3.4.3) defines undefined behaviour as “behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.” The key phrase is “no requirements.” The compiler is not required to produce an error. It is not required to produce a warning. It is not required to do what the programmer intended. It is free to do anything at all.
In practice, “anything at all” means the compiler treats undefined behaviour as an optimisation opportunity. If the standard says a particular operation has no defined result, the compiler assumes the operation never happens. This assumption propagates through the optimiser, eliminating branches, removing checks, and reordering operations in ways that change the program’s observable behaviour.
The programmer writes a safety check. The compiler proves the check is unreachable (because reaching it would require undefined behaviour). The compiler removes the check. The safety check the programmer relied on is not present in the compiled binary.
How Compilers Exploit Undefined Behaviour
Compilers do not introduce bugs through undefined behaviour. They remove code that appears to handle situations that the standard says cannot occur. The optimiser’s reasoning is formally correct according to the standard. The problem is that the standard’s model and the programmer’s model of the code do not agree.
The Optimiser’s Logic
The optimiser works backward from the standard’s guarantees:
- The standard says operation X is undefined behaviour
- A well-formed program never exhibits undefined behaviour
- Therefore operation X never occurs in this program
- Therefore any code that only executes when X occurs is dead code
- Dead code can be removed
This chain of reasoning is applied at every optimisation level above -O0. At -O2 and -O3, the optimiser aggressively exploits UB assumptions to eliminate branches, hoist operations out of loops, and reduce instruction count. The resulting binary may bear little resemblance to the source code’s apparent control flow.
Five Patterns Where UB Deletes Your Code
Pattern 1: Signed Integer Overflow
Signed integer overflow is undefined behaviour in C. The compiler is free to assume it never happens. This has consequences for overflow checks written after the operation:
int x = get_value();
int result = x + 1;
if (result < x) {
    // Overflow check - programmer expects this to catch wraparound
    handle_overflow();
}
// Compiler: signed overflow is UB, so result < x is never true
// Compiler removes the entire if block

The programmer intended to detect overflow by checking whether the result wrapped around to a negative value. The compiler reasons that since signed overflow is undefined, x + 1 is always greater than x. The overflow check is removed. If x is INT_MAX, the addition overflows silently and handle_overflow() is never called.
GCC with -O2 will remove this check entirely. The compiled binary contains no branch, no comparison, and no call to handle_overflow().
The fix is to check before the operation, not after:
int x = get_value();
if (x == INT_MAX) {
    handle_overflow();
    return;
}
int result = x + 1; // Safe: overflow cannot occur

Alternatively, -fwrapv forces GCC and Clang to treat signed overflow as two’s complement wraparound, making the post-operation check valid. This disables some optimisations but makes signed arithmetic behave as most programmers expect.
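The check-before-operation idea generalises beyond x + 1. The following is a minimal sketch for adding two arbitrary int values; the name checked_add and its bool-plus-out-parameter shape are illustrative assumptions, not something the example above prescribes.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *out and returns true,
 * or returns false (leaving *out untouched) if the sum would overflow.
 * The range test runs before the addition, so no UB is ever executed. */
bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) ||   /* would exceed INT_MAX */
        (b < 0 && a < INT_MIN - b)) {   /* would fall below INT_MIN */
        return false;
    }
    *out = a + b;
    return true;
}
```

A caller would test the return value and call its own handler on false. GCC and Clang also offer the __builtin_add_overflow extension, and C23 adds ckd_add in <stdckdint.h>, either of which can replace a hand-written check where available.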
Pattern 2: Null Pointer Elimination
If the compiler sees a pointer dereference before a null check, it can remove the null check:
void process(struct device *dev) {
    int type = dev->type; // Dereference first
    if (dev == NULL) {
        // Null check after dereference
        return;
    }
    configure(dev, type);
}

The compiler reasons: dev->type dereferences dev. Dereferencing a null pointer is undefined behaviour. Therefore dev cannot be null at that point. Therefore dev cannot be null anywhere in the function (since no assignment occurs between the dereference and the check). Therefore the null check is dead code. The check is removed.
The Linux kernel encountered this exact bug in the TUN/TAP driver (CVE-2009-1897). A null pointer dereference occurred before the null check, GCC removed the check at -O2, and the kernel became vulnerable to a privilege escalation attack.
The fix is straightforward - check before you dereference:
void process(struct device *dev) {
    if (dev == NULL) {
        return;
    }
    int type = dev->type; // Safe: dev is known non-null
    configure(dev, type);
}

Pattern 3: Strict Aliasing Violations
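The strict aliasing rule (C99 section 6.5, paragraph 7) states that an object may only be accessed through an lvalue of a type compatible with the object’s effective type (including qualified and signed or unsigned variants of it), an aggregate or union type that contains such a type, or a character type. Accessing an object through a pointer to any other type is undefined behaviour.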
This commonly breaks when programmers use type-punning to reinterpret data:
uint32_t float_to_bits(float f) {
    // UB: accessing float through uint32_t pointer
    return *(uint32_t *)&f;
}

The compiler may assume that the uint32_t * pointer and the float variable do not alias (refer to the same memory). The optimiser may reorder reads and writes, return stale values, or eliminate the read entirely because it “knows” the uint32_t pointer cannot refer to the float.
The portable approach uses memcpy, which copies the object representation without violating the aliasing rule (C also permits type punning through a union, a guarantee that C++ does not share):

#include <stdint.h>
#include <string.h>

uint32_t float_to_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits)); // Defined behaviour
    return bits;
}

Modern compilers typically optimise the memcpy approach to the same single instruction as the pointer cast, but without the undefined behaviour. The compiled output is the same. The semantics are defined.
-fno-strict-aliasing disables the aliasing assumption, making type-punning through pointer casts safe at the cost of some optimisations. The Linux kernel compiles with this flag because the kernel relies heavily on type-punning.
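As a sketch of how the memcpy technique extends to a round trip, the block below adds an inverse conversion and a sign-bit helper. The names bits_to_float and float_sign_bit are hypothetical, and the code assumes float is a 32-bit IEEE 754 binary32 value, which the C standard does not guarantee.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide (checked at compile time, C11). */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Copy the object representation of a float into an integer. */
uint32_t float_to_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Inverse conversion (hypothetical helper). */
float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Returns 1 if the sign bit is set; assumes IEEE 754 binary32 layout. */
int float_sign_bit(float f) {
    return (int)(float_to_bits(f) >> 31);
}
```

Because every access goes through memcpy, none of these functions depends on -fno-strict-aliasing being set.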
Pattern 4: Infinite Loop Removal
C11 introduced a rule (section 6.8.5, paragraph 6) that allows the compiler to assume a loop terminates when its controlling expression is not a constant expression and the loop performs no I/O, no volatile access, and no synchronisation or atomic operations. If such a loop never actually terminates, that assumption is false and the compiled program may skip the loop entirely. A loop whose controlling expression is a constant, such as while (1) { }, is exempt from this rule in C.

void wait_for_hardware(volatile int *flag) {
    // This loop is fine - volatile prevents removal
    while (*flag == 0) { }
}
void spin_on_plain_flag(int *flag) {
    // Non-volatile, non-constant condition, no side effects:
    // C11 lets the compiler assume this loop terminates
    while (*flag == 0) { }
    unreachable_code(); // Compiler may execute this even if *flag stays 0
}

The practical impact is limited because most real loops have side effects. However, busy-wait loops in embedded firmware that poll non-volatile memory-mapped registers can be silently removed by an optimising compiler. The volatile qualifier prevents removal by making each read an observable side effect.
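A polling loop on a volatile register can also be given an explicit bound, so that it terminates by construction. The sketch below is illustrative only: wait_for_flag, ready_mask, and max_polls are hypothetical names, and the poll limit is an addition that the pattern above does not require.

```c
#include <stdbool.h>
#include <stdint.h>

/* Polls a status register until any bit in ready_mask is set.
 * The volatile qualifier forces a real read on every iteration,
 * and the max_polls bound guarantees the loop terminates.
 * Returns true if the flag was seen, false if the bound was reached. */
bool wait_for_flag(volatile const uint32_t *status,
                   uint32_t ready_mask,
                   uint32_t max_polls) {
    for (uint32_t i = 0; i < max_polls; i++) {
        if ((*status & ready_mask) != 0u) {
            return true;
        }
    }
    return false;
}
```

On real hardware the pointer would refer to a memory-mapped register address taken from the device documentation; an ordinary volatile variable can stand in for it when testing on a host.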
Pattern 5: Out-of-Bounds Access Assumptions
Accessing an array out of bounds is undefined behaviour. The compiler can use this to eliminate range checks in surprising ways:
int table[4] = {10, 20, 30, 40};

int lookup(int index) {
    if (index < 0 || index >= 4) {
        return -1; // Bounds check
    }
    return table[index];
}

int get_value(int index) {
    int val = table[index]; // Access before check - UB if out of bounds
    if (index < 0 || index >= 4) {
        return -1; // Compiler may remove this
    }
    return val;
}

In get_value, the array access occurs before the bounds check. The compiler reasons that table[index] with an out-of-bounds index is UB, so index must be in bounds. The bounds check is dead code. This mirrors the null pointer pattern: the operation that would be UB is performed before the check that would prevent it.
The fix is the same principle: validate before you use.
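Applied to the lookup example, validate-before-use might look like the sketch below. The name get_value_checked and the TABLE_SIZE macro are hypothetical, and -1 is kept as the error value only because the original example uses it.

```c
#define TABLE_SIZE 4

static const int table[TABLE_SIZE] = {10, 20, 30, 40};

/* The bounds check runs before the array access, so the access
 * is only ever reached with an index the check has accepted. */
int get_value_checked(int index) {
    if (index < 0 || index >= TABLE_SIZE) {
        return -1;  /* out of range: no array access performed */
    }
    return table[index];
}
```

Since the compiler can no longer derive anything about index from an earlier access, it has no grounds to discard the check.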
Why the Standard Permits Undefined Behaviour
Undefined behaviour exists in C for two historical reasons, both rooted in portability and performance.
Hardware Diversity
When C was standardised, it needed to run on machines with different integer representations (two’s complement, one’s complement, sign-magnitude), different byte orders, different alignment requirements, and different overflow behaviour. Rather than mandating one behaviour and making C slow or impossible on some hardware, the standard left these operations undefined. The compiler could generate whatever the hardware naturally produced.
C23 finally mandated two’s complement for signed integers, eliminating one source of undefined behaviour. But signed overflow remains undefined because compilers have spent decades building optimisations around the assumption that it does not occur.
Optimisation Opportunities
Undefined behaviour gives the compiler room to optimise. If signed overflow is undefined, the compiler can assume x + 1 > x is always true. This enables loop unrolling, strength reduction, and branch elimination that would be impossible if overflow had defined semantics. The performance impact can be real, though it varies with workload and compiler - some benchmarks show 5-15% improvement from UB-based optimisations.
The trade-off is that code which accidentally triggers UB may be silently transformed into code with different behaviour. The compiler is optimising for the common case (well-formed code) at the expense of the uncommon case (code with latent bugs).
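A small loop illustrates the kind of code where the no-overflow assumption helps. This is a sketch with a hypothetical function name, sum_inclusive. Because i++ is assumed never to overflow, the compiler can conclude that the loop runs exactly n + 1 times for non-negative n, which is what transformations such as unrolling and vectorisation need. With wrapping semantics, n == INT_MAX would make i <= n permanently true and the trip count could not be computed that simply.

```c
/* Sums data[0] through data[n] inclusive.
 * The caller must supply at least n + 1 elements, and n must be
 * less than INT_MAX so that i++ never overflows. */
long sum_inclusive(const int *data, int n) {
    long total = 0;
    for (int i = 0; i <= n; i++) {
        total += data[i];
    }
    return total;
}
```

The function is well defined for every input that meets its stated preconditions; the optimisation only changes what happens for inputs that already violate them.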
Compiler Flags That Control UB
Several compiler flags modify how GCC and Clang handle undefined behaviour:
-fwrapv defines signed integer overflow as two’s complement wraparound. This eliminates Pattern 1 (overflow check removal) at the cost of some optimisations. Programs that depend on overflow detection should use this flag.
-fno-strict-aliasing disables the strict aliasing assumption, allowing type-punning through pointer casts. This eliminates Pattern 3. The Linux kernel uses this flag.
-fsanitize=undefined (UBSan) instruments the program to detect undefined behaviour at runtime. Instead of silently exploiting UB, the compiler inserts checks that report violations. This is a development and testing tool, not a production flag, because of the performance overhead.
-ftrapv causes signed integer overflow to generate a trap (abort) rather than continuing silently. This is more aggressive than -fwrapv - instead of defining the overflow, it terminates the program.
The recommended approach for safety-critical development is to compile with UBSan during testing, use -fwrapv in production if overflow handling is required, and use -fno-strict-aliasing if type-punning is unavoidable.
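One way to try these flags is to build a single small file several ways. In the sketch below, the file name overflow_demo.c and the function next_id are hypothetical; the command lines in the comment use only the flags described above plus -fno-sanitize-recover=all, which asks the sanitizer to stop at the first report.

```c
/*
 * overflow_demo.c (hypothetical file name)
 *
 * Testing build, reports UB at runtime and stops at the first report:
 *   gcc -O2 -g -fsanitize=undefined -fno-sanitize-recover=all -c overflow_demo.c
 *
 * Build with wrapping signed arithmetic:
 *   gcc -O2 -fwrapv -c overflow_demo.c
 *
 * Build that traps on signed overflow:
 *   gcc -O2 -ftrapv -c overflow_demo.c
 */

/* Returns the identifier after `current`.
 * Calling it with INT_MAX overflows: undefined by default,
 * reported by UBSan in the testing build, wrapped to INT_MIN
 * under -fwrapv, and trapped under -ftrapv. */
int next_id(int current) {
    return current + 1;
}
```

The same source therefore has four different behaviours for one input depending on the build, which is the reason to fix the flag set early and test with the flags that will ship.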
The following table summarises which flags address which patterns. For -fsanitize=undefined, "Detects" means the problem is reported at runtime when it occurs, not that the code's behaviour becomes defined:
| Flag | Pattern 1 | Pattern 2 | Pattern 3 | Pattern 4 | Pattern 5 |
|---|---|---|---|---|---|
| -fwrapv | Yes | - | - | - | - |
| -fno-strict-aliasing | - | - | Yes | - | - |
| -fsanitize=undefined | Detects | Detects | - | - | Detects |
| -ftrapv | Yes | - | - | - | - |
No single flag addresses all patterns. UBSan covers the most patterns, but it does not check strict aliasing violations or loop termination, and it is not suitable for production deployment. The fundamental fix is to write code that does not invoke undefined behaviour rather than relying on compiler flags to change how UB is handled.
The Safety-Critical Connection
In safety-critical systems, undefined behaviour is a certification obstacle. DO-178C requires demonstrating that the source code accurately represents the intended behaviour. If the compiler removes a safety check through UB exploitation, the compiled binary no longer matches the source code’s intent. The traceability requirement between source and object code is violated.
MISRA C addresses this directly. MISRA C:2012 Rule 1.3 states that there shall be no occurrence of undefined or critical unspecified behaviour. This is a required rule, not an advisory one. MISRA C also includes specific rules that target individual UB sources: Rule 13.2 requires that an expression’s value and side effects be the same under every permitted evaluation order, Rule 12.2 prohibits shift counts that are negative or that reach the bit width of the operand, and Rule 17.2 prohibits recursion, which guards against stack overflow.
Fixed-point arithmetic can eliminate one major source of UB by design. A Q16.16 fixed-point implementation that uses explicit saturation arithmetic range-checks every operation before the result is stored. Signed overflow never occurs because no out-of-range result is ever computed in the 32-bit type. The arithmetic is fully defined under C99.
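A minimal sketch of saturating Q16.16 arithmetic follows. The type and function names (q16_16, q_add_sat, q_mul_sat) are hypothetical, and the range check is done by widening to 64 bits and clamping, which is one possible technique rather than the only way to implement the checks described above.

```c
#include <stdint.h>

/* Q16.16: 16 integer bits, 16 fractional bits, stored in int32_t. */
typedef int32_t q16_16;

#define Q16_16_ONE 65536  /* 1.0 in Q16.16 */

/* Clamp a 64-bit intermediate into the 32-bit Q16.16 range. */
static q16_16 q_saturate(int64_t wide) {
    if (wide > INT32_MAX) return INT32_MAX;
    if (wide < INT32_MIN) return INT32_MIN;
    return (q16_16)wide;
}

/* The sum of two int32_t values always fits in int64_t,
 * so the addition itself cannot overflow. */
q16_16 q_add_sat(q16_16 a, q16_16 b) {
    return q_saturate((int64_t)a + (int64_t)b);
}

/* The product of two int32_t values always fits in int64_t.
 * Division (truncating toward zero in C99) rescales the Q32.32
 * product and avoids right-shifting a negative value. */
q16_16 q_mul_sat(q16_16 a, q16_16 b) {
    int64_t product = (int64_t)a * (int64_t)b;
    return q_saturate(product / Q16_16_ONE);
}
```

Every intermediate value fits in its type, so no input combination reaches signed overflow; results outside the Q16.16 range clamp to the nearest representable value instead.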
The c-from-scratch course covers undefined behaviour, compiler optimisation, and defensive coding practices in depth.
Frequently Asked Questions
What is undefined behaviour in C?
Undefined behaviour is any operation for which the C standard imposes no requirements on the result. The compiler may assume UB never occurs in a correct program and optimise accordingly, which can result in code removal, reordering, or transformation that changes the program’s observable behaviour.
How does the compiler remove safety checks through undefined behaviour?
The compiler reasons that if an operation before a safety check would be UB, those inputs cannot occur. The safety check for those inputs is therefore dead code and can be removed. This happens silently at optimisation levels above -O0.
Why does signed integer overflow cause undefined behaviour?
The C standard leaves signed overflow undefined to enable optimisations that assume arithmetic behaves mathematically. C23 mandated two’s complement representation but kept overflow undefined because changing it would break existing optimisations.
How do I detect undefined behaviour in my code?
Compile with -fsanitize=undefined (UBSan) during development and testing. UBSan instruments the program to detect and report UB at runtime, including signed overflow, null pointer dereferences, out-of-bounds access, and alignment violations. Run UBSan across the full test suite before production deployment.
Does -O0 prevent undefined behaviour problems?
-O0 reduces the likelihood of UB-related surprises because the compiler performs fewer optimisations that exploit UB assumptions. However, UB is still present in the source code and will manifest when the code is compiled at -O2 or higher for production. Developing at -O0 and shipping at -O2 creates a class of bugs that only appear in production builds.
Conclusion
Undefined behaviour in C grants the compiler permission to remove, reorder, or transform any code that depends on operations the standard does not define. The only reliable defence is to eliminate undefined behaviour from the source rather than relying on compiler flags to change how it is exploited. As with any architectural approach, suitability depends on system requirements, risk classification, and regulatory context.