FIRST CHAYYAH · IRREVERSIBILITY SHIELD
NESHER

Intercepts agent actions that cannot be undone. Classifies irreversibility before execution. Holds for confirmation when the answer is red.

Autonomous agents take irreversible actions — deleting records, sending communications, executing transactions, running schema migrations — with no built-in pause. NESHER is the pause. It intercepts the proposed action before it executes, classifies the level of irreversibility against a deterministic action taxonomy, and returns a hold signal that the caller can enforce. No model call. No network dependency. Same input yields the same classification every time.

Install

npm install @reshimu/nesher
yarn add @reshimu/nesher
pip install reshimu-nesher

No native dependencies. No compilation step. Works in any Node.js ≥ 18 or Python ≥ 3.9 runtime.

Quick start

TypeScript / JavaScript

import { classify } from '@reshimu/nesher'

const result = classify({
  action: 'delete_records',
  target: 'user_id:44812 — all associated rows'
})

console.log(result.level)   // 'RED'
console.log(result.hold)    // true
console.log(result.reason)  // 'permanent data deletion'

Reversible action

const result = classify({
  action: 'fetch_records',
  target: 'orders where status = pending'
})

console.log(result.level)   // 'GREEN'
console.log(result.hold)    // false

Python

from reshimu_nesher import classify

result = classify(
    action="send_email",
    target="user@example.com (payment reminder)"
)

print(result.level)   # 'RED'
print(result.hold)    # True
print(result.reason)  # 'outbound communication — cannot be recalled'

Classification levels

GREEN
Action is reversible. Safe to execute without confirmation. Can be undone or retried without residual effect.
YELLOW
Partially reversible. Some side effects may not roll back cleanly. Log and flag recommended; proceed at caller's discretion.
RED
Irreversible. Action cannot be undone once it fires. Execution is held; requires explicit confirmation before proceeding.
CRITICAL
Catastrophically irreversible or high blast radius. Execution is blocked unconditionally. Requires manual review outside the agent loop.

On RED vs CRITICAL. RED means the action cannot be undone: a delete, a sent message, a committed transaction. CRITICAL means the scope of damage is unbounded or unrecoverable: bulk deletes, mass communications, schema drops. In most pipelines, RED warrants a human-in-the-loop confirmation; CRITICAL warrants a full stop pending manual review.
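
The four levels map onto four caller-side behaviors. The following is a minimal sketch of that mapping, not part of the NESHER API: the policy names (`execute`, `log_and_execute`, `await_confirmation`, `manual_review`) and the `policy_for` helper are hypothetical, and only the level strings come from this page.

```python
from enum import Enum


class Level(str, Enum):
    """The four classification levels described above."""
    GREEN = "GREEN"
    YELLOW = "YELLOW"
    RED = "RED"
    CRITICAL = "CRITICAL"


# Hypothetical policy names; the mapping follows the level descriptions.
POLICY = {
    Level.GREEN: "execute",                # reversible, no confirmation needed
    Level.YELLOW: "log_and_execute",       # flag it, proceed at caller's discretion
    Level.RED: "await_confirmation",       # hold until explicitly confirmed
    Level.CRITICAL: "manual_review",       # full stop outside the agent loop
}


def policy_for(level: str) -> str:
    """Return the caller-side policy for a level string.

    Raises ValueError for a level that is not one of the four above.
    """
    return POLICY[Level(level)]
```

Keeping the mapping in one table means a pipeline that wants stricter handling (for example, treating YELLOW like RED) changes a single entry rather than scattered conditionals.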

API reference

classify(options)

Parameters:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| action | string | yes | The action identifier the agent is proposing to execute |
| target | string | yes | Description of the target resource or scope of the action |
| context | string | no | Additional context to resolve ambiguous action classifications |

Returns:

| Field | Type | Description |
| --- | --- | --- |
| level | 'GREEN' \| 'YELLOW' \| 'RED' \| 'CRITICAL' | Irreversibility classification |
| hold | boolean | true when level is RED or CRITICAL; execution should not proceed without confirmation (RED) or manual review (CRITICAL) |
| reason | string | Human-readable explanation of the classification |
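
When unit-testing code that calls `classify`, it can help to have stand-in types that mirror the two tables above. The sketch below is an assumption-laden illustration, not the library's own types: the class names `Options` and `Classification` are hypothetical, and the non-empty-string check on `action` and `target` is an assumption about sensible input rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional

LEVELS = ("GREEN", "YELLOW", "RED", "CRITICAL")
HOLD_LEVELS = frozenset({"RED", "CRITICAL"})


@dataclass(frozen=True)
class Options:
    """Mirrors the parameter table: action and target required, context optional."""
    action: str
    target: str
    context: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: empty or non-string values are rejected up front.
        for name in ("action", "target"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{name} must be a non-empty string")


@dataclass(frozen=True)
class Classification:
    """Mirrors the return table; hold is derived so it cannot disagree with level."""
    level: str
    reason: str

    def __post_init__(self) -> None:
        if self.level not in LEVELS:
            raise ValueError(f"unknown level: {self.level!r}")

    @property
    def hold(self) -> bool:
        # True exactly when level is RED or CRITICAL, as in the table above.
        return self.level in HOLD_LEVELS
```

Deriving `hold` from `level` rather than storing it separately keeps test fixtures from drifting into states the real classifier would never return.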

Integration guide

NESHER sits between an agent's decision to act and the execution of that action. The pattern is consistent regardless of orchestration framework — intercept before the tool call fires, not after.

import { classify } from '@reshimu/nesher'

async function beforeToolCall(action: string, target: string) {
  const result = classify({ action, target })

  if (result.level === 'CRITICAL') {
    return {
      blocked: true,
      reason: result.reason,
      requiresManualReview: true
    }
  }

  if (result.level === 'RED') {
    return {
      blocked: true,
      reason: result.reason,
      awaitingConfirmation: true
    }
  }

  return { blocked: false, level: result.level }
}

With SHOR (grounding gate)

NESHER and SHOR compose. The recommended order is: classify the action's reversibility first (NESHER), then check grounding of the output that produced it (SHOR). An UNGROUNDED output that also proposes an irreversible action is caught by both gates.

import { classify as reversibility } from '@reshimu/nesher'
import { classify as grounding } from '@reshimu/shor'

async function intercept(action: string, output: string, sources: string[]) {
  const rev  = reversibility({ action, target: output })
  const gnd  = grounding({ output, sources })

  // Block on any held action (RED or CRITICAL) or on an ungrounded output.
  // The caller can read rev.level to choose between confirmation and manual review.
  if (rev.hold || gnd.level === 'UNGROUNDED') {
    return { blocked: true, rev, gnd }
  }

  return { blocked: false, rev, gnd }
}
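
The same composition can be sketched in Python as a generic gate runner. This is an illustration under stated assumptions: `GateVerdict`, `run_gates`, and the two stub gates are hypothetical names with hard-coded results, and nothing here reflects the actual SHOR or NESHER Python interfaces beyond the level strings used on this page.

```python
from typing import Callable, Dict, List, NamedTuple


class GateVerdict(NamedTuple):
    gate: str       # which gate produced the verdict
    blocked: bool   # True if this gate refuses the action
    detail: str     # level or reason reported by the gate


def run_gates(gates: List[Callable[[], GateVerdict]]) -> Dict[str, object]:
    """Run every gate (no short-circuit) so the caller sees all verdicts."""
    verdicts = [gate() for gate in gates]
    return {
        "blocked": any(v.blocked for v in verdicts),
        "blocked_by": [v.gate for v in verdicts if v.blocked],
        "verdicts": verdicts,
    }


# Hypothetical stand-ins for the two libraries, hard-coded for illustration.
def reversibility_gate() -> GateVerdict:
    return GateVerdict("NESHER", True, "RED")


def grounding_gate() -> GateVerdict:
    return GateVerdict("SHOR", True, "UNGROUNDED")


outcome = run_gates([reversibility_gate, grounding_gate])
print(outcome["blocked_by"])  # ['NESHER', 'SHOR']
```

Running every gate instead of stopping at the first refusal is a design choice: the log then records that an action was both irreversible and ungrounded, which is more useful for review than a single verdict.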

Python agent loop

from reshimu_nesher import classify

def before_tool_call(action: str, target: str) -> dict:
    result = classify(action=action, target=target)

    if result.hold:
        raise ValueError(
            f"Blocked: {result.level} action. Reason: {result.reason}"
        )

    return {"level": result.level, "hold": result.hold}
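
The loop above raises on any hold. A caller that wants RED actions to proceed after confirmation, while CRITICAL actions always stop, might wrap the tool call along the following lines. This is a sketch under assumptions: `guarded_call`, `ActionBlocked`, `Result`, and `fake_classify` are hypothetical names, and the stub classifier is hard-coded so the example runs without the package installed. In real use, `reshimu_nesher.classify` would be passed in its place.

```python
from typing import Callable, NamedTuple


class Result(NamedTuple):
    level: str
    hold: bool
    reason: str


class ActionBlocked(Exception):
    """Raised when a held action is not allowed to run."""


def guarded_call(
    classify: Callable[..., Result],
    confirm: Callable[[Result], bool],
    tool: Callable[[str, str], object],
    action: str,
    target: str,
) -> object:
    """Classify first, then run the tool only if policy allows it."""
    result = classify(action=action, target=target)
    if result.level == "CRITICAL":
        # Full stop: confirmation inside the agent loop is not enough.
        raise ActionBlocked(f"manual review required: {result.reason}")
    if result.hold and not confirm(result):
        raise ActionBlocked(f"confirmation declined: {result.reason}")
    return tool(action, target)


def fake_classify(action: str, target: str) -> Result:
    """Hypothetical stub; hard-coded, not NESHER's taxonomy."""
    table = {
        "send_email": ("RED", "outbound communication"),
        "drop_schema": ("CRITICAL", "schema drop"),
    }
    level, reason = table.get(action, ("GREEN", "read-only"))
    return Result(level, level in ("RED", "CRITICAL"), reason)
```

Passing `confirm` as a callback keeps the guard independent of how confirmation is collected, whether that is a CLI prompt, a ticket approval, or a chat message to an operator.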

Design constraints

No LLM calls. Using a language model to evaluate the reversibility of an action that another language model just proposed is a structural trap. The evaluator inherits the generator's biases: if the model wanted to take the action, it can rationalize why it is safe. NESHER uses a deterministic action taxonomy and pattern-matched classification. Same input, same result, every time, independent of the model that produced the action.

| Constraint | Reason |
| --- | --- |
| No LLM calls | The evaluator must be independent of the model it is evaluating; shared failure modes defeat the gate |
| No native dependencies | Runs in any runtime without compilation; no binding maintenance burden |
| Synchronous by default | Sub-5ms on typical payloads; sits in the execution path, not a sidecar |
| Deterministic | The irreversibility of an action does not change between runs; classifications must be stable |
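
To make "deterministic taxonomy and pattern-matched classification" concrete, here is a toy sketch of the general technique. It is not NESHER's taxonomy or implementation: the rule patterns, the `classify_sketch` name, and the choice to treat unrecognized actions as RED (fail closed) are all assumptions made for illustration.

```python
import re
from typing import NamedTuple, Tuple


class Rule(NamedTuple):
    pattern: re.Pattern
    level: str
    reason: str


# Hypothetical rules, ordered most severe first; the first match wins.
RULES = (
    Rule(re.compile(r"^(drop_|bulk_|mass_)"), "CRITICAL", "unbounded blast radius"),
    Rule(re.compile(r"^(delete_|send_)"), "RED", "cannot be undone once executed"),
    Rule(re.compile(r"^update_"), "YELLOW", "side effects may not roll back cleanly"),
    Rule(re.compile(r"^(fetch_|get_|list_|read_)"), "GREEN", "read-only"),
)


def classify_sketch(action: str) -> Tuple[str, str]:
    """Return (level, reason) from a fixed rule table: no model, no network."""
    name = action.strip().lower()
    for rule in RULES:
        if rule.pattern.search(name):
            return rule.level, rule.reason
    # Assumption: unknown actions fail closed rather than pass silently.
    return "RED", "unrecognized action; treated as irreversible"
```

Because the result depends only on the input string and a fixed, ordered rule table, calling it twice with the same action always returns the same pair, which is the property the Deterministic row describes.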

For the full reasoning behind the no-LLM constraint, see "Why we don't trust LLMs to classify the call they're about to make."

Runtime stack

NESHER is the first of four runtime integrity validators — the Chayyot — that form the Reshimu agent governance stack.

| Validator | Role | Status |
| --- | --- | --- |
| NESHER | Irreversibility classifier: stops unrecoverable actions before execution | v0.1.0 live |
| SHOR | Grounding validator: checks that outputs trace to source material | v0.1.0 live |
| ARYEH | Scope classifier: detects out-of-bounds tool calls and capability overreach | in development |
| PANIM ADAM | Gray-zone escalator: routes ambiguous actions to the right handler | in development |

Each validator is a zero-dependency library, usable independently. The Atzmut OS MCP server integrates all four into a single intercept layer.

For the architectural pattern underlying all four, see "Bearers of the Throne."