
The Certifiable-* Ecosystem: Eight Projects, One Deterministic ML Pipeline

From training data to deployed inference — bit-identical, auditable, certifiable

Published: January 19, 2026 · Reading time: 8 min
The complete certifiable-* ecosystem: 8 projects, 700+ tests, proven bit-identical across platforms

There’s a question that blocks AI adoption in safety-critical systems:

“Can you prove the model running on deployed hardware is exactly the same as what you tested?”

Not “similar”. Not “statistically equivalent”. The same — bit for bit, hash for hash, across different platforms, compilers, and architectures.

With TensorFlow Lite, PyTorch, or ONNX Runtime, the answer is no. Floating-point arithmetic varies by platform. Hash table iteration order depends on memory allocation. Thread scheduling is inherently non-deterministic.

For most applications, that doesn’t matter. For aerospace, medical devices, and autonomous vehicles — where certification requires evidence, not assumptions — it’s a fundamental barrier.

The certifiable-* ecosystem removes that barrier.

Eight Projects, One Pipeline

The ecosystem consists of eight interconnected projects, each handling one stage of the ML pipeline:

| Stage | Project | Purpose | Commitment |
|-------|---------|---------|------------|
| 0 | certifiable-data | Data pipeline | Merkle root of batches |
| 1 | certifiable-training | Model training | Gradient chain hash |
| 2 | certifiable-quant | Quantization | Error certificate |
| 3 | certifiable-deploy | Deployment packaging | Attestation tree |
| 4 | certifiable-inference | Forward pass | Predictions hash |
| 5 | certifiable-monitor | Runtime monitoring | Ledger digest |
| 6 | certifiable-verify | Verification | Report hash |
| – | certifiable-harness | End-to-end orchestration | Golden reference |

Every stage produces a cryptographic commitment. Every commitment chains to the next. Break any link, and verification fails.

The Core Problem: Non-Determinism

Traditional ML frameworks aren’t designed for determinism. They’re optimised for flexibility and performance:

Floating-point variance: The same model produces different outputs on different CPUs due to FMA (fused multiply-add) availability, SIMD instruction selection, and compiler optimisations.

Memory allocation: Python dictionaries, hash maps, and sets iterate in order determined by memory layout — which varies between runs.

Threading: Parallel operations complete in unpredictable order. Reduce operations accumulate floating-point errors differently depending on execution timing.

Dynamic allocation: malloc() returns different addresses, affecting pointer-based data structures and timing.

For consumer applications, these differences are invisible. For certification, they’re disqualifying.

The Solution: Determinism by Design

The certifiable-* ecosystem takes a different approach:

Fixed-Point Arithmetic (Q16.16)

Every calculation uses 32-bit fixed-point representation:

  • 16 bits for the integer part
  • 16 bits for the fractional part
  • Range: -32768.0 to +32767.99998

No floating-point operations anywhere in the pipeline. Same inputs produce same outputs on any platform that implements integer arithmetic correctly — which is all of them.

/* Q16.16 multiplication with overflow detection */
int32_t q16_mul(int32_t a, int32_t b, q16_fault_t *fault) {
    int64_t result = (int64_t)a * (int64_t)b;
    result >>= 16;
    
    if (result > Q16_MAX || result < Q16_MIN) {
        fault->overflow = 1;
        return (result > 0) ? Q16_MAX : Q16_MIN;
    }
    return (int32_t)result;
}

Static Allocation

No malloc(). All buffers declared at compile time or allocated by the caller:

/* Caller provides the buffer */
void ci_forward(const ci_model_t *model,
                const int32_t *input,
                int32_t *output,        /* Caller-allocated */
                int32_t *workspace,     /* Caller-allocated */
                ci_fault_t *fault);

No heap fragmentation. No allocation failures. Bounded memory usage provable at compile time.

Deterministic Algorithms

  • Sorting: Merge sort (stable, O(n log n) worst case)
  • Shuffling: Feistel network with cycle-walking (deterministic given seed)
  • Hashing: SHA-256 throughout
  • Reduction: Ordered accumulation (no parallel reduce)

Every algorithm chosen for determinism first, performance second.

Cryptographic Provenance

Each stage produces a 32-byte SHA-256 commitment that includes:

  1. The stage’s own output
  2. The previous stage’s commitment

This creates an unbroken chain from training data to deployed inference:

M_data = MerkleRoot(batch_hashes)
H_train = SHA256(M_data || gradient_chain)
H_cert = SHA256(H_train || quantization_certificate)
R_attest = SHA256(H_cert || bundle_files)
H_pred = SHA256(R_attest || predictions)
L_n = SHA256(H_pred || ledger_entries)
H_report = SHA256(L_n || verification_results)

Modify any input, and every downstream commitment changes. The chain is tamper-evident by construction.

The Harness: Proving Bit-Identity

certifiable-harness orchestrates all seven stages and compares results against a golden reference:

$ ./certifiable-harness --golden reference.golden --output result.json

═══════════════════════════════════════════════════════════════
  Certifiable Harness v1.0.0
  Platform: x86_64
═══════════════════════════════════════════════════════════════

  [0] data  (OK, 4 µs)
  [1] training  (OK, 3 µs)
  [2] quant  (OK, 3 µs)
  [3] deploy  (OK, 3 µs)
  [4] inference  (OK, 3 µs)
  [5] monitor  (OK, 4 µs)
  [6] verify  (OK, 8 µs)

  Status: ALL STAGES PASSED
  Bit-identical: YES
═══════════════════════════════════════════════════════════════

The golden reference is a 368-byte binary containing commitments from all seven stages. Run the harness on any platform — if the hashes match, you have mathematical proof of identical execution.

Verified Cross-Platform

The harness has been tested on:

| Platform | OS | Compiler | Result |
|----------|----|----------|--------|
| x86_64 | Linux (Ubuntu) | GCC 12.2.0 | ✓ Bit-identical |
| x86_64 | macOS 11.7 | Apple Clang | ✓ Bit-identical |

Different operating systems. Different compilers. Same hashes.

What’s Implemented

| Project | Tests | Key Features |
|---------|-------|--------------|
| certifiable-data | 142 | CSV parsing, Merkle trees, deterministic shuffle |
| certifiable-training | 10 suites | Gradient descent, weight updates, chain hashing |
| certifiable-quant | 134 | FP32→Q16.16, error bounds, certificates |
| certifiable-deploy | 147 | Bundle format, manifest, attestation |
| certifiable-inference | 8 suites | Conv2D, pooling, dense layers, activations |
| certifiable-monitor | 253 | Drift detection, ledger, policy enforcement |
| certifiable-verify | 10 suites | Binding verification, report generation |
| certifiable-harness | 4 suites | Orchestration, golden comparison |

Total: 700+ tests across 8 projects.

Documentation for Certification

Each project includes formal documentation designed for regulatory review:

  • MATH-001 — Mathematical specification (definitions, algorithms, proofs)
  • STRUCT-001 — Data structure specification (types, layouts, invariants)
  • SRS-xxx — Software requirements (traceable, testable requirements)

certifiable-harness alone has 81 traceable requirements across 4 SRS documents.

Compliance Context

The ecosystem is designed to support certification under:

| Standard | Domain | Key Requirements |
|----------|--------|------------------|
| DO-178C Level A | Aerospace | MC/DC coverage, traceability, determinism |
| IEC 62304 Class C | Medical devices | Risk management, verification, documentation |
| ISO 26262 ASIL-D | Automotive | Fault tolerance, diagnostic coverage |
| ISO 21448 (SOTIF) | Automotive AI | Behaviour verification, edge cases |
| UL 4600 | Autonomous systems | Safety case, operational design domain |

Deterministic execution simplifies verification. If the same inputs always produce the same outputs, testing becomes meaningful. If you can prove cross-platform identity, deployment becomes traceable.

The Trade-Offs

This approach has costs:

Performance: Fixed-point is slower than optimised floating-point on modern GPUs. The ecosystem is designed for edge deployment where determinism matters more than throughput.

Precision: Q16.16 has less dynamic range than FP32. For safety-critical applications, bounded precision with known error bounds is often preferable to unbounded precision with unknown variance.

Complexity: Eight projects are considerably more infrastructure than a single TensorFlow Lite dependency. The question is whether that infrastructure is justified by the assurance it provides.

Ecosystem: No pre-trained models, no model zoo, no community of contributors (yet). You’re building from scratch.

For consumer applications, these costs aren’t justified. For systems where certification is mandatory and determinism is required, the alternative is often “don’t use ML at all.”

Getting Started

Clone any project and run the tests:

git clone https://github.com/williamofai/certifiable-inference.git
cd certifiable-inference
mkdir build && cd build
cmake ..
make
ctest --output-on-failure

For end-to-end verification:

git clone https://github.com/williamofai/certifiable-harness.git
cd certifiable-harness
mkdir build && cd build
cmake ..
make

# Generate golden reference
./certifiable-harness --generate-golden --output result.json

# Verify (should show Bit-identical: YES)
./certifiable-harness --golden result.json.golden --output verify.json

What This Enables

When a regulator asks “how do you know the deployed model is the same as what you tested?”, the answer changes:

Before: “We have a careful deployment process.”

After: “Here’s a 368-byte golden reference. Run it on the deployed hardware. If the seven SHA-256 hashes match, the execution is mathematically identical. If they don’t, I can tell you exactly which stage diverged.”

That’s a different kind of answer.

Repositories

| Project | URL |
|---------|-----|
| certifiable-data | https://github.com/williamofai/certifiable-data |
| certifiable-training | https://github.com/williamofai/certifiable-training |
| certifiable-quant | https://github.com/williamofai/certifiable-quant |
| certifiable-deploy | https://github.com/williamofai/certifiable-deploy |
| certifiable-inference | https://github.com/williamofai/certifiable-inference |
| certifiable-monitor | https://github.com/williamofai/certifiable-monitor |
| certifiable-verify | https://github.com/williamofai/certifiable-verify |
| certifiable-harness | https://github.com/williamofai/certifiable-harness |

All projects are GPL-3.0 licensed. Commercial licensing available for organisations requiring proprietary deployment.


The certifiable-* ecosystem represents one approach to deterministic ML. As with any architectural choice, suitability depends on system requirements, risk classification, and regulatory context. The goal isn’t to replace general-purpose ML frameworks — it’s to enable ML in domains where those frameworks can’t currently go.

UK Patent Application GB2521625.0 — Murray Deterministic Computing Platform

About the Author

William Murray is a Regenerative Systems Architect with 30 years of UNIX infrastructure experience, specialising in deterministic computing for safety-critical systems. Based in the Scottish Highlands, he operates SpeyTech and maintains several open-source projects including C-Sentinel and c-from-scratch.

Let's Discuss Your AI Infrastructure

Available for UK-based consulting on production ML systems and infrastructure architecture.
