Shuffling data is fundamental to ML training. Without shuffling, models learn the order of examples, not their content. Batch composition matters. Epoch ordering matters.
But shuffling typically relies on random number generators. Different seeds, different shuffles. Different platforms, potentially different shuffles even with the same seed. For deterministic ML, that’s a problem.
The Feistel shuffle solves this: a deterministic permutation that looks random but is completely reproducible.
The Problem with Random Shuffling
Consider the standard approach: Fisher-Yates shuffle with a PRNG.
import random
random.seed(42)
random.shuffle(data)

This is “deterministic” in the sense that the same seed produces the same shuffle — on the same platform, with the same library version, using the same PRNG implementation.
Change any of those, and guarantees disappear. Python’s random module has changed algorithms between versions. NumPy’s shuffle differs from Python’s. C’s rand() is notoriously platform-dependent.
For certifiable systems, “it works on my machine with my library version” isn’t acceptable.
Enter the Feistel Network
A Feistel network is a cryptographic construction that transforms a block of data through multiple rounds. It’s the basis of DES, Blowfish, and other block ciphers. The key property: it’s invertible. Every input maps to exactly one output, and you can reverse the process.
Applied to shuffling, we use the Feistel network as a permutation generator. Given an index from 0 to N-1, the network produces a permuted index — also from 0 to N-1. Every index maps to a unique output. The permutation is determined entirely by the seed, with no external randomness.
The Cycle-Walking Extension
There’s a catch. Feistel networks operate on fixed-size blocks — typically powers of two. If you have 1000 training samples, you need a permutation of exactly 1000 elements, not 1024.
Cycle-walking handles this. When the Feistel output falls outside your range, you apply the Feistel function again, “walking” until you land in the valid range.
uint32_t ct_feistel_permute(const ct_feistel_t *ctx, uint32_t x)
{
    uint32_t result = x;
    /* Walk until we land in [0, n) */
    do {
        result = feistel_round(ctx, result);
    } while (result >= ctx->n);
    return result;
}

This always terminates: feistel_round is a bijection on the full power-of-two domain, so repeated application traces a cycle that must eventually return to the starting index, which is itself in [0, N). The output is always a valid permutation of [0, N).
The Implementation
Here’s how certifiable-data implements the Feistel permutation:
typedef struct {
    uint32_t n;          /* Domain size */
    uint32_t half_bits;  /* Bits per half */
    uint32_t mask;       /* Mask for half */
    uint64_t keys[4];    /* Round keys */
} ct_feistel_t;
void ct_feistel_init(ct_feistel_t *ctx, uint32_t n, uint64_t seed)
{
    ctx->n = n;
    /* Find smallest power of 2 >= n */
    uint32_t bits = 1;
    while ((1u << bits) < n) bits++;
    ctx->half_bits = (bits + 1) / 2;
    ctx->mask = (1u << ctx->half_bits) - 1;
    /* Derive round keys from seed */
    ctx->keys[0] = seed ^ 0x243F6A8885A308D3ULL;
    ctx->keys[1] = seed ^ 0x13198A2E03707344ULL;
    ctx->keys[2] = seed ^ 0xA4093822299F31D0ULL;
    ctx->keys[3] = seed ^ 0x082EFA98EC4E6C89ULL;
}

The round keys are derived from the seed using constants (first digits of pi, for those curious). Same seed → same keys → same permutation.
The core Feistel round:
static uint32_t round_function(uint32_t x, uint64_t key, uint32_t mask);

static uint32_t feistel_round(const ct_feistel_t *ctx, uint32_t x)
{
    uint32_t left = x >> ctx->half_bits;
    uint32_t right = x & ctx->mask;
    for (int round = 0; round < 4; round++) {
        uint32_t f = round_function(right, ctx->keys[round], ctx->mask);
        uint32_t new_right = left ^ f;
        left = right;
        right = new_right;
    }
    return (left << ctx->half_bits) | right;
}

static uint32_t round_function(uint32_t x, uint64_t key, uint32_t mask)
{
    uint64_t mixed = x * 0x9E3779B97F4A7C15ULL;
    mixed ^= key;
    mixed ^= mixed >> 32;
    return (uint32_t)(mixed & mask);
}

Four rounds is sufficient for good diffusion. The round function uses multiplication and XOR — fast, portable, deterministic.
Why This Matters for ML
Reproducible Epochs
Each epoch shuffles the data differently. With Feistel, you parameterise by epoch:
ct_feistel_init(&shuffle, num_samples, seed + epoch);

Epoch 0 with seed 42 always produces the same permutation. Epoch 1 produces a different permutation, but equally deterministic. Resume training at epoch 50, and batches are identical to the original run.
Batch Construction
Given the permutation, batch construction is trivial:
for (uint32_t i = 0; i < batch_size; i++) {
    uint32_t global_idx = batch_start + i;
    uint32_t sample_idx = ct_feistel_permute(&shuffle, global_idx);
    batch[i] = dataset[sample_idx];
}

No random sampling. No reservoir algorithms. Just index transformation.
Audit Trails
The Merkle chain commits batch composition. With Feistel shuffling, you can reconstruct any batch from (seed, epoch, batch_index). Verification doesn’t require storing the actual indices — just the parameters that generated them.
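To make this concrete, here is a hedged sketch of such a reconstruction. reconstruct_batch_indices is an illustrative helper, not part of certifiable-data's API; it assumes the seed + epoch convention shown above and reproduces the article's Feistel code so the snippet stands alone.

```c
#include <assert.h>
#include <stdint.h>

/* Context and functions as defined in the article. */
typedef struct {
    uint32_t n;
    uint32_t half_bits;
    uint32_t mask;
    uint64_t keys[4];
} ct_feistel_t;

static uint32_t round_function(uint32_t x, uint64_t key, uint32_t mask)
{
    uint64_t mixed = x * 0x9E3779B97F4A7C15ULL;
    mixed ^= key;
    mixed ^= mixed >> 32;
    return (uint32_t)(mixed & mask);
}

static uint32_t feistel_round(const ct_feistel_t *ctx, uint32_t x)
{
    uint32_t left = x >> ctx->half_bits;
    uint32_t right = x & ctx->mask;
    for (int round = 0; round < 4; round++) {
        uint32_t f = round_function(right, ctx->keys[round], ctx->mask);
        uint32_t new_right = left ^ f;
        left = right;
        right = new_right;
    }
    return (left << ctx->half_bits) | right;
}

void ct_feistel_init(ct_feistel_t *ctx, uint32_t n, uint64_t seed)
{
    ctx->n = n;
    uint32_t bits = 1;
    while ((1u << bits) < n) bits++;
    ctx->half_bits = (bits + 1) / 2;
    ctx->mask = (1u << ctx->half_bits) - 1;
    ctx->keys[0] = seed ^ 0x243F6A8885A308D3ULL;
    ctx->keys[1] = seed ^ 0x13198A2E03707344ULL;
    ctx->keys[2] = seed ^ 0xA4093822299F31D0ULL;
    ctx->keys[3] = seed ^ 0x082EFA98EC4E6C89ULL;
}

uint32_t ct_feistel_permute(const ct_feistel_t *ctx, uint32_t x)
{
    uint32_t result = x;
    do {
        result = feistel_round(ctx, result);
    } while (result >= ctx->n);
    return result;
}

/* Hypothetical helper: recover the sample indices of one batch purely
 * from the parameters committed in the audit trail. The caller ensures
 * the batch's global index range lies within [0, num_samples). */
void reconstruct_batch_indices(uint32_t num_samples, uint64_t seed,
                               uint32_t epoch, uint32_t batch_index,
                               uint32_t batch_size, uint32_t *out)
{
    ct_feistel_t shuffle;
    ct_feistel_init(&shuffle, num_samples, seed + epoch); /* per-epoch permutation */
    for (uint32_t i = 0; i < batch_size; i++)
        out[i] = ct_feistel_permute(&shuffle, batch_index * batch_size + i);
}
```

Calling this twice with the same parameters must yield the same indices, which is exactly the property a verifier relies on.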
Properties
Bijection: Every input maps to exactly one output. No collisions, no missing values.
Determinism: Same (seed, domain_size) always produces the same permutation.
Uniformity: The permutation is statistically uniform — no obvious patterns.
Efficiency: O(1) per index lookup (amortised, accounting for cycle-walking).
Invertibility: Given the output, you can recover the input. Useful for debugging.
The Inverse
Sometimes you need the reverse mapping: “sample 42 appeared at what position in the shuffled order?”
uint32_t ct_feistel_invert(const ct_feistel_t *ctx, uint32_t y)
{
    uint32_t result = y;
    do {
        result = feistel_inverse_round(ctx, result);
    } while (result >= ctx->n);
    return result;
}

The inverse applies the same round keys in reverse order, with the left/right swap undone at each step. Inverting a Feistel network reuses the forward structure almost unchanged, one of its elegant properties.
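The inverse round itself isn't shown above; a minimal sketch of how feistel_inverse_round could look, reproducing the forward definitions so the snippet compiles on its own (the name matches the call in ct_feistel_invert, but this is an illustration, not certifiable-data's exact code):

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t n;
    uint32_t half_bits;
    uint32_t mask;
    uint64_t keys[4];
} ct_feistel_t;

static uint32_t round_function(uint32_t x, uint64_t key, uint32_t mask)
{
    uint64_t mixed = x * 0x9E3779B97F4A7C15ULL;
    mixed ^= key;
    mixed ^= mixed >> 32;
    return (uint32_t)(mixed & mask);
}

static uint32_t feistel_round(const ct_feistel_t *ctx, uint32_t x)
{
    uint32_t left = x >> ctx->half_bits;
    uint32_t right = x & ctx->mask;
    for (int round = 0; round < 4; round++) {
        uint32_t f = round_function(right, ctx->keys[round], ctx->mask);
        uint32_t new_right = left ^ f;
        left = right;
        right = new_right;
    }
    return (left << ctx->half_bits) | right;
}

/* Exactly reverses feistel_round on the 2*half_bits-bit domain:
 * keys in reverse order, swap undone at each step. */
static uint32_t feistel_inverse_round(const ct_feistel_t *ctx, uint32_t y)
{
    uint32_t left = y >> ctx->half_bits;
    uint32_t right = y & ctx->mask;
    for (int round = 3; round >= 0; round--) {
        /* Forward did: (L, R) -> (R, L ^ F(R, k)).
         * Undo: previous R = current L,
         *       previous L = current R ^ F(current L, k). */
        uint32_t f = round_function(left, ctx->keys[round], ctx->mask);
        uint32_t new_left = right ^ f;
        right = left;
        left = new_left;
    }
    return (left << ctx->half_bits) | right;
}
```

Because each out-of-range intermediate value stays out of range when walked backwards, ct_feistel_invert's cycle-walk retraces ct_feistel_permute's walk exactly, so the round-trip recovers the original index.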
Comparison to Alternatives
Fisher-Yates
- Pro: Simple, well-understood
- Con: Requires storing the full permutation or generating sequentially
- Con: PRNG-dependent
Knuth Shuffle
Same as Fisher-Yates (they’re equivalent algorithms).
Format-Preserving Encryption (FPE)
- Pro: Cryptographically strong
- Con: Typically requires AES, which is overkill for shuffling
- Con: More complex implementation
Feistel
- Pro: Self-contained, no external dependencies
- Pro: Simple to implement correctly
- Pro: Invertible by construction
- Pro: Works for any domain size (with cycle-walking)
- Con: Slightly slower than direct random shuffle
For safety-critical ML, the determinism and simplicity outweigh the minor performance cost.
Test Vectors
For any Feistel implementation to be compliant, these test cases must pass:
/* Domain size 100, seed 0 */
ct_feistel_init(&ctx, 100, 0);
assert(ct_feistel_permute(&ctx, 0) == 73);
assert(ct_feistel_permute(&ctx, 1) == 24);
assert(ct_feistel_permute(&ctx, 50) == 91);
assert(ct_feistel_permute(&ctx, 99) == 8);

/* Bijection check */
bool seen[100] = {false};
for (uint32_t i = 0; i < 100; i++) {
    uint32_t j = ct_feistel_permute(&ctx, i);
    assert(j < 100);
    assert(!seen[j]);
    seen[j] = true;
}

If your implementation produces different values, it’s not compatible. Cross-platform bit-identity requires exact agreement.
Practical Considerations
Domain Size Changes
If your dataset size changes (new data added, samples removed), the permutation changes entirely. For incremental training, you may want to fix the domain size and handle growth separately.
Large Datasets
The cycle-walking can require multiple iterations when the domain isn't a power of two. Treating the Feistel output as roughly uniform over its power-of-two domain, each application lands in [0, n) with probability n divided by the domain size, so the expected number of iterations is the domain-to-n ratio: under 2 when the bit width splits evenly between the two halves, and up to about 4 when half_bits rounds up (as for n = 2^k + 1 with even k). For billion-element datasets, this is negligible.
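The expected walk length can be computed directly from the initialisation logic above. A small sketch, assuming the round output is roughly uniform over its domain (expected_walk_iterations is an illustrative helper, not part of certifiable-data):

```c
#include <assert.h>
#include <stdint.h>

/* Expected cycle-walk iterations for a domain of size n. The Feistel
 * network permutes 2^(2*half_bits) values, and each application lands
 * in [0, n) with probability n / domain, so the expected walk length
 * is domain / n. */
double expected_walk_iterations(uint32_t n)
{
    uint32_t bits = 1;
    while ((1u << bits) < n) bits++;      /* smallest 2^bits >= n */
    uint32_t half_bits = (bits + 1) / 2;  /* as in ct_feistel_init */
    uint32_t domain = 1u << (2 * half_bits);
    return (double)domain / (double)n;
}
```

For n = 1024 the walk always succeeds on the first try (ratio 1.0); for n = 1000 the expected cost is about 1.02 iterations; for n = 1025 the bit width rounds up and the expected cost rises to roughly 4.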
Parallelisation
Each index lookup is independent. You can parallelise batch construction trivially — each thread computes its own indices without synchronisation.
Conclusion
The Feistel shuffle replaces “shuffle with random seed and hope for the best” with “deterministic permutation from first principles.” Same seed, same dataset size, same shuffle — guaranteed, across platforms, across time.
For ML training that must be reproducible, this isn’t a minor detail. It’s a foundational requirement. When you claim “we can reproduce this training run,” the data ordering is part of that claim.
The certifiable-* ecosystem uses Feistel shuffling throughout certifiable-data. Combined with the Merkle chain, every batch is cryptographically committed and independently reconstructible.
As with any architectural approach, suitability depends on system requirements, risk classification, and regulatory context. For deterministic ML pipelines, Feistel shuffling provides the mathematical foundation for reproducible data ordering.
Explore the implementation in certifiable-data or see the permutation tests for the complete test vector suite.