Open Source GPL-3.0

certifiable-monitor

Deterministic runtime monitoring — because 'the model drifted' isn't certifiable

GitHub Repository
Published: January 19, 2026 17:00
Reading time: 6 min
certifiable-monitor: Deterministic runtime monitoring with drift detection, health FSM, and cryptographic audit ledger

Deploying an ML model isn’t the end of the story. In safety-critical systems, you need to know — with cryptographic certainty — when that model is operating outside its certified envelope. Standard monitoring tools use floating-point statistics, produce non-deterministic results, and leave no verifiable audit trail.

certifiable-monitor changes that.

View on GitHub

The Problem

When an ML model runs in production, things drift:

  • Input distributions shift — Real-world data diverges from training
  • Activations exceed bounds — Internal values go where they shouldn’t
  • Output patterns change — Predictions no longer match expectations
  • Faults accumulate silently — Overflow and saturation events go unnoticed

Current monitoring approaches have fundamental problems for certification:

Non-deterministic metrics. Floating-point drift calculations produce different results on different platforms. How do you validate something that changes?

No audit trail. When an incident occurs, there’s no cryptographic proof of what the monitor observed. It’s your word against the logs.

Ambiguous reactions. “Log a warning” isn’t a deterministic specification. What action, exactly, should the system take when TV exceeds 0.15?

For DO-178C Level A, IEC 62304 Class C, or ISO 26262 ASIL-D certification, “the model drifted so we logged it” isn’t acceptable evidence.

The Solution

certifiable-monitor provides deterministic runtime monitoring through three core mechanisms:

1. Fixed-Point Drift Detection

All statistical metrics computed in fixed-point arithmetic:

Total Variation (TV) — The safest detector, no logarithms required:

TV(p, q) = (1/2) Σ_b |p_b - q_b|

Output in Q0.32. Zero means identical distributions. UINT32_MAX means completely disjoint.

Jensen-Shannon Divergence (JSD) — Symmetric divergence measure:

JSD(p, q) = (1/2) KL(p ∥ m) + (1/2) KL(q ∥ m),  where m = (p + q)/2

Uses a 512-entry LUT for log2 computation. No floating-point. Bit-identical on x86, ARM, and RISC-V.

Population Stability Index (PSI) — Directional sensitivity:

PSI(p, q) = Σ_b (p_b - q_b) ln(p_b / q_b)

Epsilon smoothing prevents log(0). Policy defines operational thresholds.

Same inputs produce the same drift scores. Every time. Every platform.

2. Cryptographic Audit Ledger

Every monitoring event is logged to a SHA-256 hash chain:

L_0 = SHA256("CM:LEDGER:GENESIS:v1" ∥ R ∥ H_P)
L_t = SHA256("CM:LEDGER:v1" ∥ L_{t-1} ∥ e_t)

The genesis block binds to the deployment bundle root R and policy hash H_P. Every subsequent entry chains to the previous digest.

Tampering is detectable. Truncation is detectable. Reordering is detectable. Post-incident analysis can replay the entire monitoring history with cryptographic verification.

3. Deterministic Health FSM

A state machine with formally defined transitions:

UNINIT → INIT → ENABLED → ALARM → DEGRADED → STOPPED

Fault budgets define thresholds. Violations trigger transitions. Once stopped, only manual intervention restarts. No ambiguity about system state.

What’s Implemented

253 tests passing · 11 test suites · ~13,700 lines of code
Module                Purpose                            Tests
DVM Primitives        Saturating arithmetic, LUT log2       33
Audit Ledger          SHA-256 hash chain                    18
Drift Detectors       TV, JSD, PSI computation              20
Policy Parser         COE JSON parsing, JCS hash            25
Input Monitor         Feature envelope checking             22
Activation Monitor    Layer bounds checking                 24
Output Monitor        Output envelope checking              19
Health FSM            Monitor state machine                 19
Reaction Handler      Violation → action mapping            14
Ledger Verification   Offline chain verification            32
Bit-Identity          Cross-platform determinism            27

Every module traces to formal specifications in CM-MATH-001, CM-STRUCT-001, and the SRS documents.

Usage Example

#include "policy.h"
#include "input.h"
#include "health.h"
#include "ledger.h"
#include "react.h"

ct_fault_flags_t faults = {0};

// Load policy and initialize ledger
cm_policy_t policy;
cm_policy_parse(policy_json, policy_len, &policy, &faults);

cm_ledger_ctx_t ledger;
cm_ledger_init(&ledger);
cm_ledger_genesis(&ledger, policy.bundle_root, policy.policy_hash, &faults);

// Initialize monitors
cm_input_ctx_t input_mon;
cm_input_init(&input_mon, &policy.input);

cm_health_ctx_t health;
cm_health_init(&health, &policy.fault_budget);
cm_health_enable(&health);

// Per-inference: check input envelope
cm_input_result_t result;
cm_input_check(&input_mon, input_vector, num_features, &result, &faults);

if (result.violations > 0) {
    // Log to cryptographic ledger
    uint8_t L_out[32];
    cm_ledger_append_violation(&ledger, window_id, CM_VIOL_INPUT_RANGE,
                               result.first_violation_idx,
                               result.first_violation_value,
                               result.first_violation_bound,
                               L_out, &faults);
    
    // Get policy-defined reaction
    cm_reaction_t action = cm_policy_get_reaction(&policy, CM_VIOL_INPUT_RANGE);
    
    // Update health state
    cm_health_report_violation(&health, CM_VIOL_INPUT_RANGE);
}

// Check if system should halt
if (cm_health_get_state(&health) == CM_HEALTH_STOPPED) {
    // Emergency stop — do not proceed with inference
}

All buffers statically allocated. No malloc. Deterministic execution path.

The Pipeline

certifiable-monitor completes the deterministic ML ecosystem:

certifiable-data → certifiable-training → certifiable-quant → certifiable-deploy → certifiable-inference
                                                                                          ↓
                                                                                certifiable-monitor
                                                                                          ↓
                                                                                   Audit Ledger

The monitor receives:

  • From certifiable-deploy: Bundle attestation root and policy hash
  • From certifiable-inference: Input vectors, activation values, output vectors, fault flags
  • From policy: Thresholds, envelopes, reaction mappings

Six interlocking projects. One coherent vision: deterministic ML from data to monitored production.

Why This Matters

Medical Devices

IEC 62304 Class C requires traceable, reproducible software. When a diagnostic AI flags an anomaly, the response must be deterministic. The audit trail must be verifiable.

Autonomous Vehicles

ISO 26262 ASIL-D demands provable behavior under all conditions. Input drift detection with cryptographic proof isn’t optional — it’s the difference between “we think the model was stable” and “here’s the hash chain proving it.”

Aerospace

DO-178C Level A requires complete requirements traceability. Every drift metric traces to CM-MATH-001. Every state transition traces to CM-ARCH-MATH-001. Every test traces to an SRS requirement.

This is the monitoring layer that makes ML certification possible.

Getting Started

git clone https://github.com/williamofai/certifiable-monitor
cd certifiable-monitor
mkdir build && cd build
cmake ..
make
make test-all  # 253 tests

Expected output:

100% tests passed, 0 tests failed out of 11

Documentation

The implementation traces to formal specifications:

  • CM-MATH-001 — Mathematical foundations (drift metrics, ledger hashing, log2 LUT)
  • CM-STRUCT-001 — Data structure specifications
  • CM-ARCH-MATH-001 — Architecture-level math (health FSM, window semantics)
  • SRS-001 through SRS-008 — Module requirements with full traceability

Every function documents its traceability reference. Every test validates a specification clause.

The Trade-Off

Deterministic monitoring isn’t free. Fixed-point arithmetic requires careful scaling. Hash chain updates add overhead. Static allocation means pre-sized buffers.

For systems where “it probably works” is acceptable, standard monitoring tools are simpler.

For systems where lives depend on the answer — where regulators demand proof, where post-incident analysis requires cryptographic verification, where “the model drifted” needs to be a traceable, reproducible, auditable event — certifiable-monitor provides the foundation.

As with any architectural approach, suitability depends on system requirements, risk classification, and regulatory context.


Built by SpeyTech in the Scottish Highlands. 30 years of UNIX systems engineering applied to making ML safe enough to certify.

View on GitHub · Documentation

About the Author

William Murray is a Regenerative Systems Architect with 30 years of UNIX infrastructure experience, specializing in deterministic computing for safety-critical systems. Based in the Scottish Highlands, he operates SpeyTech and maintains several open-source projects including C-Sentinel and c-from-scratch.

Questions or Contributions?

Open an issue on GitHub or get in touch directly.
