There’s a question that blocks AI adoption in safety-critical systems:
“Can you prove the model running on deployed hardware is exactly the same as what you tested?”
Not “similar”. Not “statistically equivalent”. The same — bit for bit, hash for hash, across different platforms, compilers, and architectures.
With TensorFlow Lite, PyTorch, or ONNX Runtime, the answer is no. Floating-point arithmetic varies by platform. Hash table iteration order depends on memory allocation. Thread scheduling is inherently non-deterministic.
For most applications, that doesn’t matter. For aerospace, medical devices, and autonomous vehicles — where certification requires evidence, not assumptions — it’s a fundamental barrier.
The certifiable-* ecosystem removes that barrier.
## Eight Projects, One Pipeline
The ecosystem consists of eight interconnected projects, each handling one stage of the ML pipeline:
| Stage | Project | Purpose | Commitment |
|---|---|---|---|
| 0 | certifiable-data | Data pipeline | Merkle root of batches |
| 1 | certifiable-training | Model training | Gradient chain hash |
| 2 | certifiable-quant | Quantization | Error certificate |
| 3 | certifiable-deploy | Deployment packaging | Attestation tree |
| 4 | certifiable-inference | Forward pass | Predictions hash |
| 5 | certifiable-monitor | Runtime monitoring | Ledger digest |
| 6 | certifiable-verify | Verification | Report hash |
| — | certifiable-harness | End-to-end orchestration | Golden reference |
Every stage produces a cryptographic commitment. Every commitment chains to the next. Break any link, and verification fails.
## The Core Problem: Non-Determinism
Traditional ML frameworks aren’t designed for determinism. They’re optimised for flexibility and performance:
- Floating-point variance: The same model produces different outputs on different CPUs due to FMA (fused multiply-add) availability, SIMD instruction selection, and compiler optimisations.
- Memory allocation: Python dictionaries, hash maps, and sets iterate in order determined by memory layout — which varies between runs.
- Threading: Parallel operations complete in unpredictable order. Reduce operations accumulate floating-point errors differently depending on execution timing.
- Dynamic allocation: malloc() returns different addresses, affecting pointer-based data structures and timing.
For consumer applications, these differences are invisible. For certification, they’re disqualifying.
## The Solution: Determinism by Design
The certifiable-* ecosystem takes a different approach:
### Fixed-Point Arithmetic (Q16.16)
Every calculation uses 32-bit fixed-point representation:
- 16 bits for the integer part
- 16 bits for the fractional part
- Range: -32768.0 to +32767.99998
No floating-point operations anywhere in the pipeline. Same inputs produce same outputs on any platform that implements integer arithmetic correctly — which is all of them.
```c
/* Q16.16 multiplication with overflow detection */
int32_t q16_mul(int32_t a, int32_t b, q16_fault_t *fault) {
    int64_t result = (int64_t)a * (int64_t)b;
    result >>= 16;
    if (result > Q16_MAX || result < Q16_MIN) {
        fault->overflow = 1;
        return (result > 0) ? Q16_MAX : Q16_MIN;
    }
    return (int32_t)result;
}
```

### Static Allocation
No malloc(). All buffers declared at compile time or allocated by the caller:
```c
/* Caller provides the buffer */
void ci_forward(const ci_model_t *model,
                const int32_t *input,
                int32_t *output,      /* Caller-allocated */
                int32_t *workspace,   /* Caller-allocated */
                ci_fault_t *fault);
```

No heap fragmentation. No allocation failures. Bounded memory usage provable at compile time.
### Deterministic Algorithms
- Sorting: Merge sort (stable, O(n log n) worst case)
- Shuffling: Feistel network with cycle-walking (deterministic given seed)
- Hashing: SHA-256 throughout
- Reduction: Ordered accumulation (no parallel reduce)
Every algorithm chosen for determinism first, performance second.
## Cryptographic Provenance
Each stage produces a 32-byte SHA-256 commitment that includes:
- The stage’s own output
- The previous stage’s commitment
This creates an unbroken chain from training data to deployed inference:
```
M_data   = MerkleRoot(batch_hashes)
H_train  = SHA256(M_data || gradient_chain)
H_cert   = SHA256(H_train || quantization_certificate)
R_attest = SHA256(H_cert || bundle_files)
H_pred   = SHA256(R_attest || predictions)
L_n      = SHA256(H_pred || ledger_entries)
H_report = SHA256(L_n || verification_results)
```

Modify any input, and every downstream commitment changes. The chain is tamper-evident by construction.
## The Harness: Proving Bit-Identity
certifiable-harness orchestrates all seven stages and compares results against a golden reference:
```text
$ ./certifiable-harness --golden reference.golden --output result.json
═══════════════════════════════════════════════════════════════
  Certifiable Harness v1.0.0
  Platform: x86_64
═══════════════════════════════════════════════════════════════
  [0] data       ✓ (OK, 4 µs)
  [1] training   ✓ (OK, 3 µs)
  [2] quant      ✓ (OK, 3 µs)
  [3] deploy     ✓ (OK, 3 µs)
  [4] inference  ✓ (OK, 3 µs)
  [5] monitor    ✓ (OK, 4 µs)
  [6] verify     ✓ (OK, 8 µs)
  Status: ALL STAGES PASSED ✓
  Bit-identical: YES ✓
═══════════════════════════════════════════════════════════════
```

The golden reference is a 368-byte binary containing commitments from all seven stages. Run the harness on any platform — if the hashes match, you have mathematical proof of identical execution.
## Verified Cross-Platform
The harness has been tested on:
| Platform | OS | Compiler | Result |
|---|---|---|---|
| x86_64 | Linux (Ubuntu) | GCC 12.2.0 | ✓ Bit-identical |
| x86_64 | macOS 11.7 | Apple Clang | ✓ Bit-identical |
Different operating systems. Different compilers. Same hashes.
## What’s Implemented
| Project | Tests | Key Features |
|---|---|---|
| certifiable-data | 142 | CSV parsing, Merkle trees, deterministic shuffle |
| certifiable-training | 10 suites | Gradient descent, weight updates, chain hashing |
| certifiable-quant | 134 | FP32→Q16.16, error bounds, certificates |
| certifiable-deploy | 147 | Bundle format, manifest, attestation |
| certifiable-inference | 8 suites | Conv2D, pooling, dense layers, activations |
| certifiable-monitor | 253 | Drift detection, ledger, policy enforcement |
| certifiable-verify | 10 suites | Binding verification, report generation |
| certifiable-harness | 4 suites | Orchestration, golden comparison |
Total: 700+ tests across 8 projects.
## Documentation for Certification
Each project includes formal documentation designed for regulatory review:
- MATH-001 — Mathematical specification (definitions, algorithms, proofs)
- STRUCT-001 — Data structure specification (types, layouts, invariants)
- SRS-xxx — Software requirements (traceable, testable requirements)
certifiable-harness alone has 81 traceable requirements across 4 SRS documents.
## Compliance Context
The ecosystem is designed to support certification under:
| Standard | Domain | Key Requirements |
|---|---|---|
| DO-178C Level A | Aerospace | MC/DC coverage, traceability, determinism |
| IEC 62304 Class C | Medical devices | Risk management, verification, documentation |
| ISO 26262 ASIL-D | Automotive | Fault tolerance, diagnostic coverage |
| ISO 21448 (SOTIF) | Automotive AI | Behaviour verification, edge cases |
| UL 4600 | Autonomous systems | Safety case, operational design domain |
Deterministic execution simplifies verification. If the same inputs always produce the same outputs, testing becomes meaningful. If you can prove cross-platform identity, deployment becomes traceable.
## The Trade-Offs
This approach has costs:
- Performance: Fixed-point is slower than optimised floating-point on modern GPUs. The ecosystem is designed for edge deployment where determinism matters more than throughput.
- Precision: Q16.16 has less dynamic range than FP32. For safety-critical applications, bounded precision with known error bounds is often preferable to unbounded precision with unknown variance.
- Complexity: Eight projects is more infrastructure than dropping in TensorFlow Lite. The question is whether that infrastructure is justified by the assurance it provides.
- Ecosystem: No pre-trained models, no model zoo, no community of contributors (yet). You’re building from scratch.
For consumer applications, these costs aren’t justified. For systems where certification is mandatory and determinism is required, the alternative is often “don’t use ML at all.”
## Getting Started
Clone any project and run the tests:
```shell
git clone https://github.com/williamofai/certifiable-inference.git
cd certifiable-inference
mkdir build && cd build
cmake ..
make
ctest --output-on-failure
```

For end-to-end verification:
```shell
git clone https://github.com/williamofai/certifiable-harness.git
cd certifiable-harness
mkdir build && cd build
cmake ..
make
# Generate golden reference
./certifiable-harness --generate-golden --output result.json
# Verify (should show Bit-identical: YES)
./certifiable-harness --golden result.json.golden --output verify.json
```

## What This Enables
When a regulator asks “how do you know the deployed model is the same as what you tested?”, the answer changes:
Before: “We have a careful deployment process.”
After: “Here’s a 368-byte golden reference. Run it on the deployed hardware. If the seven SHA-256 hashes match, the execution is mathematically identical. If they don’t, I can tell you exactly which stage diverged.”
That’s a different kind of answer.
## Repositories
| Project | URL |
|---|---|
| certifiable-data | https://github.com/williamofai/certifiable-data |
| certifiable-training | https://github.com/williamofai/certifiable-training |
| certifiable-quant | https://github.com/williamofai/certifiable-quant |
| certifiable-deploy | https://github.com/williamofai/certifiable-deploy |
| certifiable-inference | https://github.com/williamofai/certifiable-inference |
| certifiable-monitor | https://github.com/williamofai/certifiable-monitor |
| certifiable-verify | https://github.com/williamofai/certifiable-verify |
| certifiable-harness | https://github.com/williamofai/certifiable-harness |
All projects are GPL-3.0 licensed. Commercial licensing available for organisations requiring proprietary deployment.
The certifiable-* ecosystem represents one approach to deterministic ML. As with any architectural choice, suitability depends on system requirements, risk classification, and regulatory context. The goal isn’t to replace general-purpose ML frameworks — it’s to enable ML in domains where those frameworks can’t currently go.
UK Patent Application GB2521625.0 — Murray Deterministic Computing Platform