Runtime model identity, artifact identity, and signed verification infrastructure.
Each paper opened a question the previous one couldn't answer. Together they trace the same line: a neural network's structural identity is mathematically distinct from its outputs, its weights' bytes, and its agent credentials — and that distinction is measurable, formally verifiable, and operationally useful.
13 research papers · 4 technical notes · 4 patents · 0 retracted · All open-access on Zenodo
Reading paths
Four entry points through the corpus, by audience and purpose. Each path is three works long.
Structural Identity. These papers define and test the measurement primitive behind Fall Risk: a structural fingerprint observed during ordinary computation. Start here for the scientific basis of runtime model identity.
Endpoint / API. When the model is behind an API and weights are out of reach, the same identity can still be measured through public logprob endpoints. These papers cover the API-side observable, its formal security properties, and its limits.
Distillation & Provenance. Distilled models inherit some of their teacher's behavior but not its structural identity. These papers measure, falsify, and bound what survives passive and adversarial copying — across families, scales, and training recipes.
Governance & Compliance. How model identity becomes admissible evidence: regulatory mappings (EU AI Act, NIST), enterprise IAM composition, and the threat-model gap that current agent-identity standards leave open.
Artifact Identity. Verifying what is on disk, before runtime. The boundary between artifact identity (what Trustfall Lite verifies) and runtime identity (what Trustfall Deep verifies) is part of the claim hygiene.
Technical Notes. Operational notes published alongside the research series: the agent-vs-model identity distinction, a gap-invariance proof for API measurement, and a measured-substitution scenario against a live agent.
Core research
13 papers · publication order
Each paper extends a question left open by the one before it. The natural reading path is publication order — the program's questions unfolded that way for a reason.
Neural networks have a structural fingerprint — the third pre-softmax logit gap — that is invariant to temperature, architecture-stable across six families, and unforgeable under any adversarial KL budget.
When the model is behind an API and the weights are out of reach, the same identity can be measured through public logprob endpoints using PPP-residualized order-statistic geometry.
Distilled models inherit a measurable trace of their teacher; passive fine-tuning erases the trace faster than adversarial erasure does, and same-family spoofing is geometrically anti-aligned.
Provenance detection generalizes across teachers, students, and training protocols — but the cosine alignment diagnostic is mandatory; scalar distance alone produces wrong answers.
An AI system's structural identity is mathematically distinct from its behavioral character. Two models can produce identical outputs while having different identities, and the same model produces wildly different characters under different prompts.
Inference verification proofs are only as trustworthy as the binding between the proof and the model that actually ran. Hybrid proof-and-bridge attestation closes the gap.
Identity evidence comes in distinct classes — artifact, structural, provenance, behavior — and substituting one for another is formally insufficient. A theorem makes the constraint legally citable.
Structural model identity composes cleanly with JWT, SPIFFE, and existing enterprise identity primitives. Four formal composition properties make it stack-safe.
Two models trained on identical data and architecture produce different structural identities. Endpoint statistics cannot recover the formative trajectory.
Structural identity verification scales to 70B+ parameters. When a frontier vendor's model lineage is disputed, only runtime measurement settles it — disclosure after the fact does not.
Reasoning distillation produces family-dependent structural and functional responses. Mistral, Llama, and Qwen react differently — and the structural layer can decouple entirely from the functional layer.
Public toolchains strip a model's safety constraints while preserving observable behavior. The structural fingerprint changes anyway — and we can detect it.
Authorizing an agent is not the same as verifying which neural network produced its response. The 2026 identity-management products solve the first problem and assume the second.
Order-statistic gaps are invariant to log-softmax, temperature, and constant shifts. The endpoint-verification protocol's robustness is provable, not just empirical.
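The invariance claim above can be checked algebraically in a few lines. The sketch below is an illustration of the underlying arithmetic, not the papers' protocol: it shows that consecutive order-statistic gaps of a logit vector are unchanged by log-softmax (which subtracts a single scalar, the log-partition) and by any constant shift, while temperature scaling rescales all gaps uniformly — so normalized gap ratios survive temperature as well. The function names and tolerances are illustrative choices, not identifiers from the papers.

```python
import math
import random

def log_softmax(z):
    # log-softmax = logits minus one scalar (the log-sum-exp),
    # i.e. a constant shift in log space
    m = max(z)
    lse = m + math.log(sum(math.exp(v - m) for v in z))
    return [v - lse for v in z]

def gaps(z):
    # sort descending, take consecutive differences (order-statistic gaps)
    s = sorted(z, reverse=True)
    return [s[i] - s[i + 1] for i in range(len(s) - 1)]

random.seed(0)
z = [random.gauss(0.0, 3.0) for _ in range(10)]
g = gaps(z)

# Invariance 1: log-softmax leaves every gap unchanged.
assert all(abs(a - b) < 1e-9 for a, b in zip(g, gaps(log_softmax(z))))

# Invariance 2: a constant shift leaves every gap unchanged.
assert all(abs(a - b) < 1e-9 for a, b in zip(g, gaps([v + 7.5 for v in z])))

# Temperature T rescales all gaps by 1/T, so gap *ratios* are invariant.
T = 0.7
gT = gaps([v / T for v in z])
assert all(abs(gT[i] / gT[0] - g[i] / g[0]) < 1e-9 for i in range(len(g)))
```

Because an API endpoint can only apply transformations of these kinds before returning logprobs, any statistic built from shift-invariant gap ratios passes through them intact — which is the shape of the robustness argument the note proves formally.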
Three substitution scenarios run against a live gateway with valid agent credentials. Three detected. Zero false accepts. Warm-path latency under seven seconds.
Trustfall Lite verifies whether a local artifact's bytes match a signed enrollment record. It does not — and cannot — verify what runs at inference time. The boundary is the product.