Why memcmp Fails on Structs: Padding, Floats, Silent Bugs

Figure: memcmp compares every byte of a struct, padding included, so two structs with identical field values can compare as unequal when their padding bytes differ.

memcmp on a struct compares bytes, not values, and produces silently incorrect results when structs contain padding or floating-point fields. The function answers “are the bytes identical?” which is not the same question as “are the values equal?” This distinction is responsible for an entire class of equality bugs that compile without errors, pass unit tests on one platform, and fail in production on another.

By William Murray, Founder of SpeyTech - deterministic computing for safety-critical systems. Inverness, Scottish Highlands.

This article explains the two categories of memcmp failure on structs - padding bytes and floating-point representation - demonstrates each with concrete examples, and provides the compiler flags and coding practices that prevent these bugs.

What memcmp Actually Does

Definition: Struct Byte Comparison

Struct byte comparison using memcmp reads every byte in a structure sequentially and reports whether all bytes match, without knowledge of field boundaries, types, or semantic meaning.

memcmp is defined in C99 section 7.21.4.1. It compares n bytes starting from two memory addresses and returns zero if all bytes match. It has no knowledge of what those bytes represent. It does not know where one struct field ends and another begins. It does not know whether a byte is part of a float, an integer, or padding inserted by the compiler.

This makes memcmp a byte-identity function, not an equality function. The difference matters whenever the byte representation of a value does not uniquely correspond to its semantic meaning.

memcmp on a struct is unreliable whenever the struct contains compiler-inserted padding or IEEE 754 floating-point fields.

Bug One: Padding Bytes

C compilers insert invisible padding bytes between struct fields to satisfy alignment requirements. These padding bytes are uninitialised by default. memcmp reads them. Two structs with identical field values can compare as different because their padding bytes do not match.

How Padding Works

The C standard requires that each data type be stored at a memory address that is a multiple of its alignment requirement. On most platforms, a 4-byte int must start at an address divisible by 4. An 8-byte double must start at an address divisible by 8. When struct fields have different alignment requirements, the compiler inserts padding bytes between them to satisfy these constraints.

Consider this struct on a typical 64-bit platform:

struct sensor {
    char  id;        // 1 byte at offset 0
    // 3 bytes padding (to align int to offset 4)
    int   reading;   // 4 bytes at offset 4
};
// sizeof(struct sensor) == 8, not 5

The struct occupies 8 bytes, not 5. The compiler inserts 3 bytes of padding after id to align reading on a 4-byte boundary. These 3 bytes are not initialised by field assignment.

The proportion of padding grows when a member with a large alignment requirement sits between small members:

struct record {
    char    flag;      // 1 byte at offset 0
    // 7 bytes padding (to align double to offset 8)
    double  value;     // 8 bytes at offset 8
    char    status;    // 1 byte at offset 16
    // 7 bytes trailing padding (to align struct size to 8)
};
// sizeof(struct record) == 24, not 10

This struct has 14 bytes of padding out of 24 total bytes. More than half of the bytes that memcmp reads are uninitialised padding. The sizeof operator reports the padded size, so memcmp(&a, &b, sizeof(struct record)) reads all 24 bytes including every padding byte.
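The layout can be checked rather than assumed. The following is a minimal sketch using sizeof and offsetof; the helper names are hypothetical and chosen only for illustration, and the numbers they return depend on the target ABI.

```c
#include <stddef.h>  /* offsetof, size_t */

struct record {
    char   flag;
    double value;
    char   status;
};

/* Bytes occupied by the fields themselves. */
static size_t record_field_bytes(void) {
    return sizeof(char) + sizeof(double) + sizeof(char);
}

/* Bytes that memcmp would read but that belong to no field. */
static size_t record_padding_bytes(void) {
    return sizeof(struct record) - record_field_bytes();
}

/* Gap between the end of flag and the start of value. */
static size_t record_gap_after_flag(void) {
    return offsetof(struct record, value)
         - (offsetof(struct record, flag) + sizeof(char));
}
```

On a typical 64-bit platform these helpers would be expected to report 10 field bytes, 14 padding bytes, and a 7-byte gap after flag, matching the comments in the struct definition above. Other ABIs may give different figures, which is the point of measuring.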

The memcmp Failure

struct sensor a, b;
a.id = 'A';
a.reading = 42;
b.id = 'A';
b.reading = 42;

// a and b have identical field values
// memcmp(&a, &b, sizeof(struct sensor)) may return non-zero

The field values are identical. The padding bytes are whatever happened to be in memory at allocation time. memcmp reads all 8 bytes, including the 3 padding bytes, and reports a difference that does not exist at the value level.
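The failure can be made deterministic for demonstration purposes. The sketch below is an illustration, not production code: it fills the whole object with a chosen byte and then writes the fields with memcpy at their offsets, so the padding bytes are known to hold the fill value. The helper names (sensor_build, sensor_fields_equal, sensor_bytes_equal) are assumptions introduced here.

```c
#include <stddef.h>  /* offsetof */
#include <string.h>  /* memcmp, memcpy, memset */

struct sensor {
    char id;
    int  reading;
};

/* Build a sensor whose non-field bytes are all `fill`.
 * Fields are written with memcpy at their offsets, so any
 * padding bytes keep the value that memset gave them. */
static void sensor_build(struct sensor *out, unsigned char fill,
                         char id, int reading) {
    unsigned char raw[sizeof(struct sensor)];
    memset(raw, fill, sizeof raw);
    memcpy(raw + offsetof(struct sensor, id), &id, sizeof id);
    memcpy(raw + offsetof(struct sensor, reading), &reading, sizeof reading);
    memcpy(out, raw, sizeof raw);
}

/* Value-level equality: looks only at the fields. */
static int sensor_fields_equal(const struct sensor *a,
                               const struct sensor *b) {
    return a->id == b->id && a->reading == b->reading;
}

/* Byte-level identity: the question memcmp actually answers. */
static int sensor_bytes_equal(const struct sensor *a,
                              const struct sensor *b) {
    return memcmp(a, b, sizeof *a) == 0;
}
```

Building one sensor with fill 0xAA and another with fill 0x55, both with id 'A' and reading 42, gives two objects for which sensor_fields_equal is true while sensor_bytes_equal is false on any platform where the struct contains padding.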

Why memset Does Not Fully Solve Padding

The common fix is to zero the struct before use:

memset(&a, 0, sizeof(struct sensor));
memset(&b, 0, sizeof(struct sensor));
a.id = 'A';
a.reading = 42;
b.id = 'A';
b.reading = 42;

// Now memcmp returns 0 - padding is zeroed

This works if every code path that creates or modifies the struct remembers to zero it first. In practice, this discipline breaks down. A struct returned from a function, copied from a network buffer, or deserialised from storage may not have zeroed padding. A single missed memset in a large codebase reintroduces the bug.

memset also does not help when structs are copied with assignment. The C standard does not require assignment to copy padding bytes, and compilers may or may not copy them. Two structs that were zeroed and then assigned identical values through different code paths may still have different padding.

Padding bytes are a reliability problem because they are invisible in the source code and their values depend on memory history rather than field assignment.

Bug Two: IEEE 754 Float Representation

The second category of memcmp failure is more subtle. Even with zeroed padding, memcmp produces incorrect results on structs containing floating-point fields. IEEE 754 defines values whose byte representation violates the assumption that “same bytes means same value” and “different bytes means different value.”

Positive Zero and Negative Zero

IEEE 754 defines two representations of zero:

float pos_zero = +0.0f;  // bytes: 0x00000000
float neg_zero = -0.0f;  // bytes: 0x80000000

// (pos_zero == neg_zero) is true  - same value
// memcmp: not equal               - different bytes

Positive zero and negative zero are equal values with different byte representations. The sign bit differs, but the IEEE 754 standard specifies that +0.0 and -0.0 compare as equal. memcmp sees different bytes and reports inequality. A struct containing a float field set to +0.0 in one instance and -0.0 in another will fail a memcmp equality check despite having semantically identical values.

Negative zero arises naturally from operations like 1.0f / -INFINITY or 0.0f * -1.0f. Code does not need to explicitly assign -0.0 for this bug to manifest.
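A short sketch of the signed-zero case, assuming an IEEE 754 float and a compiler that is not running with fast-math style options. The helper names are hypothetical.

```c
#include <math.h>    /* signbit */
#include <string.h>  /* memcmp */

/* Byte identity for two floats: the question memcmp answers. */
static int float_bytes_equal(float a, float b) {
    return memcmp(&a, &b, sizeof a) == 0;
}

/* Value equality as defined by IEEE 754 comparison. */
static int float_values_equal(float a, float b) {
    return a == b;
}

/* A negative zero produced by arithmetic rather than by a literal. */
static float computed_negative_zero(void) {
    volatile float zero = 0.0f;  /* volatile keeps the multiply at run time */
    return zero * -1.0f;
}
```

Comparing 0.0f with the result of computed_negative_zero() gives true from float_values_equal and false from float_bytes_equal, and signbit confirms that the computed value carries the sign bit.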

NaN is Not Equal to Itself

IEEE 754 defines that NaN (Not a Number) is not equal to any value, including itself:

float nan1 = 0.0f / 0.0f;  // NaN; exact bits are platform-dependent
                           // (0x7FC00000 is typical, x86 hardware produces 0xFFC00000)
float nan2 = nan1;         // same NaN, same bytes

// (nan1 == nan2) is false  - NaN != NaN by IEEE 754
// memcmp: equal             - identical bytes

NaN values with identical bit patterns compare as unequal under IEEE 754 arithmetic but as equal under memcmp. A struct containing a NaN field will pass a memcmp check when it should fail an equality check. This is the opposite direction from the zero problem: memcmp reports equality when the values are not equal.

IEEE 754 also permits multiple NaN representations. Quiet NaN and signalling NaN have different bit patterns, and the payload bits of a NaN can vary. Two NaN values that are both “not a number” may have different byte representations, causing memcmp to report inequality between values that are both NaN.
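Both NaN behaviours can be shown by constructing NaN values from explicit bit patterns. This sketch assumes a 32-bit IEEE 754 float; float_from_bits and float_bytes_equal are hypothetical helpers, and the two patterns are quiet NaNs that differ only in their payload.

```c
#include <math.h>    /* isnan */
#include <stdint.h>  /* uint32_t */
#include <string.h>  /* memcmp, memcpy */

/* Reinterpret a 32-bit pattern as a float (assumes 32-bit IEEE 754). */
static float float_from_bits(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Byte identity for two floats: the question memcmp answers. */
static int float_bytes_equal(float a, float b) {
    return memcmp(&a, &b, sizeof a) == 0;
}
```

With q1 and q2 both built from 0x7FC00000 and q3 built from 0x7FC00001, all three satisfy isnan, q1 == q2 is false, float_bytes_equal(q1, q2) is true, and float_bytes_equal(q1, q3) is false.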

The Two Directions of Failure

memcmp fails in both directions on float-containing structs:

Scenario	Value comparison	memcmp result	Correct answer
+0.0 vs -0.0	equal	not equal	equal
NaN vs NaN (same bits)	not equal	equal	not equal
NaN vs NaN (different bits)	not equal	not equal	not equal

One comparison says different when the values are the same. Another says same when the values are different. Both are silent. Both compile without error. memcmp has no mechanism to detect or report these cases because it operates on bytes, not on IEEE 754 semantics.

memcmp on a float-containing struct is fundamentally broken for equality testing, regardless of padding.

Where This Bug Hides in Practice

Deduplication and Caching

Systems that deduplicate records or cache computed results often use memcmp for fast equality checking. A cache keyed on a struct containing a float field will store duplicate entries for +0.0 and -0.0 values, or will return a cached result for a NaN input that value comparison would never treat as a match. In a sensor aggregation pipeline processing millions of readings per day, the duplicate entries accumulate silently. The cache grows beyond its intended size because readings that should be deduplicated are stored as distinct entries.

Change Detection

Monitoring systems that detect state changes by comparing current and previous struct snapshots with memcmp will generate false change notifications when padding bytes differ or when a float field flips between +0.0 and -0.0. They will also treat a field that stays NaN with the same bit pattern as “unchanged”, even though value comparison never considers NaN equal to anything. A safety monitoring system that suppresses alerts because “the state hasn’t changed” while a sensor keeps reporting NaN is a potential safety failure.
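One way to avoid both failure modes is to classify each pair of readings explicitly, with NaN handled as its own case. The sketch below is a hypothetical policy, not a recommendation for any particular system: it assumes that a NaN reading should always be reported as a fault and never suppressed as “unchanged”.

```c
#include <math.h>  /* isnan, NAN */

enum reading_status {
    READING_UNCHANGED,
    READING_CHANGED,
    READING_FAULT      /* current reading is NaN: never suppressed */
};

/* Classify a float field from two consecutive snapshots. */
static enum reading_status classify_reading(float previous, float current) {
    if (isnan(current)) {
        return READING_FAULT;
    }
    if (isnan(previous)) {
        return READING_CHANGED;  /* recovered from a fault */
    }
    /* Ordinary comparison: +0.0 and -0.0 count as unchanged. */
    return (previous == current) ? READING_UNCHANGED : READING_CHANGED;
}
```

Under this policy, classify_reading(1.5f, NAN) and classify_reading(NAN, NAN) both report READING_FAULT, while classify_reading(0.0f, -0.0f) reports READING_UNCHANGED.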

Network Protocol Comparison

Structs received from network buffers may have different padding byte values depending on the sender’s memory state. Two messages with identical payloads can fail memcmp comparison because the padding bytes were not zeroed before transmission. This is particularly common in embedded systems where structs are transmitted directly over serial or CAN bus without serialisation. The sender and receiver may have different compilers, different alignment rules, and different padding layouts for the same struct definition.

Serialisation Round-Trip

A struct serialised to disk and deserialised back may not have the same padding byte values as the original. The serialisation format may not preserve padding, and the deserialised struct may be allocated in different memory with different residual byte values. A system that uses memcmp to verify serialisation integrity will report corruption that does not exist.
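Encoding each field explicitly removes padding from the stored or transmitted form altogether. The following is a minimal sketch under stated assumptions: the reading field is declared as int32_t rather than int so that the wire width is fixed, the byte order is chosen as little-endian, and the names sensor_encode, sensor_decode, and SENSOR_WIRE_SIZE are invented for this example.

```c
#include <stdint.h>  /* uint8_t, uint32_t, int32_t */
#include <string.h>  /* memcpy */

struct sensor {
    char    id;
    int32_t reading;
};

enum { SENSOR_WIRE_SIZE = 5 };  /* 1 byte id + 4 bytes reading, no padding */

/* Encode field by field, little-endian, into a padding-free buffer. */
static void sensor_encode(const struct sensor *s,
                          uint8_t out[SENSOR_WIRE_SIZE]) {
    uint32_t r = (uint32_t)s->reading;
    out[0] = (uint8_t)s->id;
    out[1] = (uint8_t)(r & 0xFFu);
    out[2] = (uint8_t)((r >> 8) & 0xFFu);
    out[3] = (uint8_t)((r >> 16) & 0xFFu);
    out[4] = (uint8_t)((r >> 24) & 0xFFu);
}

/* Decode into a struct; padding in *s is never part of the wire form. */
static void sensor_decode(struct sensor *s,
                          const uint8_t in[SENSOR_WIRE_SIZE]) {
    uint32_t r = (uint32_t)in[1]
               | ((uint32_t)in[2] << 8)
               | ((uint32_t)in[3] << 16)
               | ((uint32_t)in[4] << 24);
    int32_t reading;
    memcpy(&reading, &r, sizeof reading);  /* avoids an implementation-defined cast */
    s->id = (char)in[0];
    s->reading = reading;
}
```

Because the encoded buffer contains only field bytes, memcmp on two encoded buffers is a meaningful integrity check for this integer-only struct. The same would not hold for a buffer carrying raw float bytes.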

Hash Table Keys

Using memcmp as the equality function for hash table keys containing structs leads to lookups that miss entries which are logically present. If the hash function also reads padding, two structs with identical field values but different padding usually land in different buckets and are never compared. If the hash function reads only the fields, they land in the same bucket, but the memcmp equality check still fails. Either way, the table accumulates duplicate entries that should have been matches.
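A hash table needs its hash function and its equality function to agree: keys that compare equal must hash equally. The sketch below shows one way to arrange that, hashing the fields rather than the object bytes and normalising -0.0 to +0.0 before hashing. The struct, the function names, and the choice of 32-bit FNV-1a are assumptions for illustration. NaN keys never compare equal here, so this sketch assumes they are rejected before insertion.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>  /* size_t */

struct key {
    char  tag;
    float level;
};

/* Field-wise equality: +0.0 and -0.0 are the same key. */
static bool key_equal(const struct key *a, const struct key *b) {
    return a->tag == b->tag && a->level == b->level;
}

/* 32-bit FNV-1a over an explicit byte range. */
static uint32_t fnv1a(uint32_t h, const void *data, size_t n) {
    const unsigned char *p = data;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Hash only the fields, never the padding. */
static uint32_t key_hash(const struct key *k) {
    float level = k->level;
    if (level == 0.0f) {
        level = 0.0f;  /* canonicalise -0.0 to +0.0 so equal keys hash equally */
    }
    uint32_t h = 2166136261u;  /* FNV offset basis */
    h = fnv1a(h, &k->tag, sizeof k->tag);
    h = fnv1a(h, &level, sizeof level);
    return h;
}
```

With this pairing, a key holding +0.0 and a key holding -0.0 compare equal and produce the same hash, so they resolve to a single table entry.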

Test Assertions

Unit tests that use memcmp to assert struct equality may pass on one platform and fail on another. Padding layout depends on the target’s alignment rules, so a struct that has no padding on one ABI can gain padding on another, and the residual values of those bytes depend on the compiler, optimisation level, and allocator. A test suite that passes on x86 and fails on ARM for this reason is testing memory history rather than the code under test, and it wastes debugging time.

memcmp-based equality checks are unreliable in any system where structs contain padding or floating-point fields.

The Correct Approach: Field-by-Field Comparison

The only reliable approach to struct equality is comparing each field individually:

#include <stdbool.h>

// Compares values only; padding bytes are never read.
bool sensor_equal(const struct sensor *a, const struct sensor *b) {
    return a->id == b->id && a->reading == b->reading;
}

For structs containing floating-point fields, the comparison function must handle the IEEE 754 edge cases explicitly:

#include <math.h>
#include <stdbool.h>

struct measurement {
    int   sensor_id;
    float value;
    float uncertainty;
};

// Compare one float field.
// Domain decision: two NaN readings are treated as equal here.
// Whether that is right depends on the application.
// Ordinary == already treats +0.0 and -0.0 as equal.
static bool float_field_equal(float a, float b) {
    if (isnan(a) && isnan(b)) return true;
    return a == b;
}

bool measurement_equal(const struct measurement *a,
                       const struct measurement *b) {
    return a->sensor_id == b->sensor_id
        && float_field_equal(a->value, b->value)
        && float_field_equal(a->uncertainty, b->uncertainty);
}

The field-by-field approach makes the comparison semantics explicit. The programmer decides whether NaN equals NaN, whether +0.0 equals -0.0, and whether floating-point comparison should use exact equality or an epsilon tolerance. memcmp makes all of these decisions for you, and makes them incorrectly.
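Those decisions can be written down as data rather than left implicit. The following sketch is one hypothetical way to do so: struct float_policy and float_equal_with_policy are names invented here, and an absolute tolerance is only one of several possible tolerance schemes.

```c
#include <math.h>     /* isnan, NAN */
#include <stdbool.h>

/* Hypothetical policy knobs for comparing one float field. */
struct float_policy {
    float abs_tolerance;   /* 0.0f means exact comparison */
    bool  nan_equals_nan;  /* domain decision, not an IEEE 754 rule */
};

static bool float_equal_with_policy(float a, float b,
                                    struct float_policy p) {
    if (isnan(a) || isnan(b)) {
        return p.nan_equals_nan && isnan(a) && isnan(b);
    }
    if (a == b) {
        return true;  /* covers +0.0 vs -0.0 and equal infinities */
    }
    float diff = a - b;
    if (diff < 0.0f) {
        diff = -diff;
    }
    return diff <= p.abs_tolerance;
}
```

A tolerance-based comparison is not transitive, so it is unsuitable as the equality function for a hash table or a sort. It fits checks such as test assertions, where “close enough” is the intended meaning.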

The Cost of Field-by-Field Comparison

Field-by-field comparison can be slower than memcmp for large structs. memcmp implementations typically use word-aligned reads and SIMD instructions, while field-by-field comparison generates separate load and compare instructions per field. For a struct with 20 fields, the performance difference can be measurable.

In practice, the performance difference rarely matters. Struct comparison is almost never the bottleneck. And when it is, a hand-optimised comparison function that handles padding and floats correctly will still be faster than debugging a memcmp failure in production.

Performance is not a valid reason to use memcmp on structs that contain padding or floats. Correctness is not optional.
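Where byte comparison really is wanted for speed, the struct can be restricted to fixed-width integer members with no padding, and that restriction can be enforced at compile time. This is a sketch assuming a C11 compiler (for _Static_assert); struct packet_id and packet_id_equal are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>  /* memcmp */

/* Integer-only struct with members ordered so no padding is expected. */
struct packet_id {
    uint32_t source;
    uint32_t sequence;
    uint16_t channel;
    uint16_t flags;
};

/* Fail the build if a compiler ever inserts padding into this layout. */
_Static_assert(sizeof(struct packet_id) ==
                   2 * sizeof(uint32_t) + 2 * sizeof(uint16_t),
               "struct packet_id must not contain padding");

/* Byte comparison is value comparison here: no padding, no floats. */
static bool packet_id_equal(const struct packet_id *a,
                            const struct packet_id *b) {
    return memcmp(a, b, sizeof *a) == 0;
}
```

The static assertion turns a silent run-time bug into a build failure if someone later adds a member that introduces padding. Adding a float member would not trip it, so that part of the restriction still relies on review.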

Compiler Flags and Static Analysis

GCC and Clang do not warn about memcmp on structs by default. The compiler cannot always determine whether a struct contains padding or floats at the call site, particularly when memcmp receives void * arguments.

-Wclass-memaccess (C++ only, GCC) warns when raw memory functions such as memset or memcpy write into objects of non-trivial class type. It targets writes rather than comparisons and does not apply to C, so it does not help here.

Static analysis tools provide better coverage:

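Clang Static Analyzer can detect memcmp on structs with padding in some cases
clang-tidy provides the bugprone-suspicious-memory-comparison check, which flags memcmp on types with padding or floating-point members
Coverity flags memcmp on structs as a potential defect
MISRA C:2012 Rule 21.16 restricts the pointer arguments of memcmp to pointer, essentially signed, unsigned, Boolean, or enum types, which rules out structs and floating-point objects
CERT C EXP42-C warns against comparing padding data, and FLP37-C warns against comparing floating-point values through their object representation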

For safety-critical code developed under DO-178C or IEC 62304, memcmp on structs is likely to be raised as an audit finding. Certification requires demonstrating that every comparison produces the intended result. A comparison that reads uninitialised padding or misinterprets IEEE 754 values is a gap in that demonstration.

The recommended approach is to ban memcmp on structs by policy and enforce it through code review and static analysis.

The Safety-Critical Connection

In safety-critical systems, struct comparison errors are not just correctness issues - they are potential safety failures. A sensor reading comparison that reports “no change” when the reading has changed can suppress critical alerts. A state machine transition guard that uses memcmp to detect state changes can miss transitions or trigger spurious transitions.

Fixed-point arithmetic addresses the float comparison problem by construction. Q16.16 fixed-point uses int32_t exclusively for all values. Integer comparison has none of the IEEE 754 edge cases: there is no negative zero, no NaN, and identical values always have identical bit patterns. Field-by-field comparison of a fixed-point struct reduces to integer comparison, which is both correct and fast.
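A minimal sketch of what that looks like, assuming a Q16.16 value stored in an int32_t. The type name q16_16, the helper q16_from_int, and struct measurement_fx are hypothetical and not taken from any particular library.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int32_t q16_16;  /* 16 integer bits, 16 fractional bits */

#define Q16_ONE ((q16_16)1 << 16)

/* Convert a whole number to Q16.16; int16_t keeps the result in range. */
static q16_16 q16_from_int(int16_t whole) {
    return (q16_16)whole * Q16_ONE;
}

struct measurement_fx {
    int32_t sensor_id;
    q16_16  value;
    q16_16  uncertainty;
};

/* Plain integer comparison: no NaN, no negative zero to special-case. */
static bool measurement_fx_equal(const struct measurement_fx *a,
                                 const struct measurement_fx *b) {
    return a->sensor_id == b->sensor_id
        && a->value == b->value
        && a->uncertainty == b->uncertainty;
}
```

Negating a zero value yields the same integer zero, so the case that needed explicit handling for float simply does not arise. Range and rounding become the design concerns instead.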

The c-from-scratch course covers struct layout, padding, and comparison as part of the C type system fundamentals.

Frequently Asked Questions

Why does memcmp fail on structs in C?

memcmp compares raw bytes without knowledge of struct field boundaries, compiler-inserted padding, or IEEE 754 floating-point semantics. Padding bytes are uninitialised and vary between allocations. Float values like +0.0 and -0.0 have different byte representations despite being equal. These cause memcmp to report incorrect equality results.

How do I correctly compare two structs in C?

Compare each field individually using a dedicated comparison function. For integer fields, use ==. For float fields, handle NaN and signed zero explicitly based on the application’s equality semantics. Never use memcmp on structs that contain padding or floating-point members.

Does memset zeroing the struct before use fix the memcmp problem?

memset zeroing eliminates the padding problem only if every code path that creates or modifies the struct zeros it consistently. Assignment does not guarantee padding bytes are copied, so structs initialised through different paths may still differ. memset does not fix the IEEE 754 float problem - +0.0 and -0.0 remain byte-different even in zeroed structs.

What compiler flags detect memcmp struct comparison bugs?

No standard GCC or Clang warning specifically targets memcmp on padded structs in C. Static analysis tools like Coverity and the Clang Static Analyzer provide partial coverage. MISRA C Rule 21.16 prohibits memcmp for struct comparison. The most reliable prevention is a project-wide policy enforced through code review.

Why does fixed-point arithmetic avoid the float comparison problem?

Fixed-point arithmetic represents values as integers, eliminating IEEE 754 edge cases entirely. Integer comparison in C is exact: identical values always have identical bit patterns, there is no negative zero, and there is no NaN. Struct comparison reduces to integer field comparison, which is both correct and deterministic.

Conclusion

memcmp on structs compares bytes rather than values, producing silently incorrect results when structs contain padding or IEEE 754 floating-point fields. The only correct approach is field-by-field comparison that makes equality semantics explicit. As with any architectural approach, suitability depends on system requirements, risk classification, and regulatory context.
