Runtime Governance
Runtime governance is governance applied at the moment of execution: it intercepts each action an AI agent proposes and classifies it against defined integrity constraints before the action fires and before any effect is produced in the world.
The dominant model for AI safety is training-time alignment: shape the model's behavior during training so that its outputs at runtime are aligned with human values. This is necessary but not sufficient for autonomous agent systems. A well-aligned model can still call the wrong API, exceed its scope, or drift from the original intent across a chain of delegations — and training-time alignment has no mechanism to catch these failures at the moment they occur.
Runtime governance adds the missing layer: classification and enforcement at execution time, before the action fires and before the effect is irreversible.
Training-time vs. runtime
Training-time alignment shapes what a model wants to do. Runtime governance controls what the system is allowed to do at the moment of action. The two operate on different objects: training acts on model weights; runtime governance acts on individual proposed actions within a live session.
A system with training-time alignment but no runtime governance can still execute a deletion it was never meant to perform — because the alignment operates at the level of tendency, not at the level of the specific tool call. Runtime governance operates at the level of the specific tool call.
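The difference can be made concrete with a minimal sketch of a gate that sits between the agent and its tools. Everything here is an illustrative assumption rather than Reshimu's actual interface: the `ProposedAction` shape, the `FORBIDDEN_TOOLS` policy, and the `governed_call` wrapper are hypothetical names. The point it shows is that the check runs on one specific tool call, and that a blocked call never reaches the tool.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class ProposedAction:
    """A single tool call the agent wants to make (hypothetical shape)."""
    tool: str
    args: Dict[str, Any] = field(default_factory=dict)


class ActionBlocked(Exception):
    """Raised when the gate refuses a proposed action."""


# Hypothetical policy: tools this agent was never meant to call.
FORBIDDEN_TOOLS = {"delete_records"}


def is_permitted(action: ProposedAction) -> bool:
    """Classify one specific proposed call, not the model's general tendency."""
    return action.tool not in FORBIDDEN_TOOLS


def governed_call(action: ProposedAction,
                  tools: Dict[str, Callable[..., Any]]) -> Any:
    """Run the tool only if the gate permits it; otherwise nothing executes."""
    if not is_permitted(action):
        raise ActionBlocked(f"blocked before execution: {action.tool}")
    return tools[action.tool](**action.args)
```

In this sketch a well-aligned model that nonetheless proposes `delete_records` gets an `ActionBlocked` exception, and the deletion function is never invoked.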
What runtime governance classifies
Every proposed action in an agent loop has four dimensions that determine whether it should be allowed to proceed: its reversibility (can this be undone?), its grounding (is this output traceable to actual source material?), its scope (does this fall within the agent's assigned domain?), and its ambiguity (is this action genuinely interpretable in more than one way with materially different consequences?).
Each dimension requires a distinct classifier. Reshimu implements these as four runtime validators — NESHER, SHOR, ARYEH, and PANIM ADAM — running in parallel on every message before it is delivered downstream. The combined result is a classification that either permits execution, blocks it, or escalates to a human or superior agent.
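As a rough sketch of that structure, the following runs four independent checks in parallel and folds their verdicts into a single permit, block, or escalate decision. It is not Reshimu's implementation. The check functions are named after the four dimensions rather than the actual validators, the `Action` fields are pre-labeled stand-ins for what real classifiers would have to infer, and the combination rule (any block wins, then any escalation, otherwise permit) is an assumption chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Verdict(Enum):
    PERMIT = "permit"
    BLOCK = "block"
    ESCALATE = "escalate"


@dataclass
class Action:
    """Hypothetical proposed action; fields are pre-labeled for illustration."""
    tool: str
    reversible: bool
    cited_sources: List[str]
    domain: str
    ambiguous: bool


AGENT_DOMAIN = "billing"  # hypothetical assigned scope


def check_reversibility(a: Action) -> Verdict:
    # Assumed policy: irreversible actions go to a reviewer.
    return Verdict.PERMIT if a.reversible else Verdict.ESCALATE


def check_grounding(a: Action) -> Verdict:
    # Assumed policy: output with no traceable source is blocked.
    return Verdict.PERMIT if a.cited_sources else Verdict.BLOCK


def check_scope(a: Action) -> Verdict:
    return Verdict.PERMIT if a.domain == AGENT_DOMAIN else Verdict.BLOCK


def check_ambiguity(a: Action) -> Verdict:
    return Verdict.ESCALATE if a.ambiguous else Verdict.PERMIT


VALIDATORS: Dict[str, Callable[[Action], Verdict]] = {
    "reversibility": check_reversibility,
    "grounding": check_grounding,
    "scope": check_scope,
    "ambiguity": check_ambiguity,
}


def classify(action: Action) -> Verdict:
    """Run all validators concurrently and combine their verdicts."""
    with ThreadPoolExecutor(max_workers=len(VALIDATORS)) as pool:
        verdicts = list(pool.map(lambda check: check(action),
                                 VALIDATORS.values()))
    if Verdict.BLOCK in verdicts:
        return Verdict.BLOCK
    if Verdict.ESCALATE in verdicts:
        return Verdict.ESCALATE
    return Verdict.PERMIT
```

The thread pool is only there to show the parallel shape. For checks this cheap, calling them sequentially would likely be just as fast in Python.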
Speed and reliability requirements
Runtime governance that adds meaningful latency to an agent loop defeats its own purpose, since agents are deployed precisely for their speed. Reshimu's classifiers are deterministic where they have to be: the critical path makes no LLM calls, classification is rule-based or pattern-based, and each call completes in under 5 ms. Determinism also matters for reliability: an LLM-based classifier shares failure modes with the model it evaluates; a deterministic classifier does not.
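A pattern-based classifier of the kind described might look like the sketch below. The rules, labels, and function names are hypothetical and are not taken from Reshimu. The sketch only illustrates the two properties at stake: the same input always produces the same label, and the cost of a call can be measured directly against a latency budget.

```python
import re
import time
from typing import List, Pattern, Tuple

# Hypothetical rules as (compiled pattern, label) pairs. First match wins.
RULES: List[Tuple[Pattern[str], str]] = [
    (re.compile(r"\b(drop|truncate)\s+table\b", re.IGNORECASE), "block"),
    (re.compile(r"\bdelete\s+from\b", re.IGNORECASE), "escalate"),
]


def classify(command: str) -> str:
    """Deterministic classification: no model call, no randomness."""
    for pattern, label in RULES:
        if pattern.search(command):
            return label
    return "permit"


def timed_classify(command: str) -> Tuple[str, float]:
    """Return the label and the elapsed wall-clock time in milliseconds."""
    start = time.perf_counter()
    label = classify(command)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return label, elapsed_ms
```

Calling `timed_classify("DROP TABLE users")` returns the label `"block"` together with a measured duration. Actual timings depend on hardware and rule count, so any budget should be verified in the deployment environment.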