NESHER — Irreversibility Shield

FIRST CHAYYAH · IRREVERSIBILITY SHIELD

NESHER

Intercepts agent actions that cannot be undone. Classifies irreversibility before execution. Holds for confirmation when the answer is RED, and blocks outright when it is CRITICAL.

npm · @reshimu/nesher · v0.1.0
PyPI · reshimu-nesher · v0.1.0
MIT License

Autonomous agents take irreversible actions — deleting records, sending communications, executing transactions, running schema migrations — with no built-in pause. NESHER is the pause. It intercepts the proposed action before it executes, classifies the level of irreversibility against a deterministic action taxonomy, and returns a hold signal that the caller can enforce. No model call. No network dependency. Same input yields the same classification every time.

Install

npm install @reshimu/nesher

yarn add @reshimu/nesher

pip install reshimu-nesher

No native dependencies. No compilation step. Works in any Node.js ≥ 18 or Python ≥ 3.9 runtime.

Quick start

TypeScript / JavaScript

import { classify } from '@reshimu/nesher'

const result = classify({
  action: 'delete_records',
  target: 'user_id:44812 — all associated rows'
})

console.log(result.level)   // 'RED'
console.log(result.hold)    // true
console.log(result.reason)  // 'permanent data deletion'

Reversible action

const result = classify({
  action: 'fetch_records',
  target: 'orders where status = pending'
})

console.log(result.level)   // 'GREEN'
console.log(result.hold)    // false

Python

from reshimu_nesher import classify

result = classify(
    action="send_email",
    target="[email protected] — payment reminder"
)

print(result.level)   # 'RED'
print(result.hold)    # True
print(result.reason)  # 'outbound communication — cannot be recalled'

Classification levels

GREEN

Action is reversible. Safe to execute without confirmation. Can be undone or retried without residual effect.

YELLOW

Partially reversible. Some side effects may not roll back cleanly. Log and flag recommended; proceed at caller's discretion.

RED

Irreversible. Action cannot be undone once it fires. Execution is held; requires explicit confirmation before proceeding.

CRITICAL

Catastrophically irreversible or high blast radius. Execution is blocked unconditionally. Requires manual review outside the agent loop.

On RED vs CRITICAL. RED means the action cannot be undone — a delete, a sent message, a committed transaction. CRITICAL means the scope of damage is unbounded or unrecoverable: bulk deletes, mass communications, schema drops. In most pipelines, RED warrants a human-in-the-loop confirmation; CRITICAL warrants a full stop pending manual review.
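
The distinction above can be sketched as a small caller-side policy table. This is a minimal illustration, not part of the package: the `Level` enum, the `POLICY` mapping, the handling names, and the `handling_for` helper are all hypothetical, and the fail-closed fallback for unknown levels is an assumption.

```python
from enum import Enum


class Level(str, Enum):
    """The four classification levels named in this document."""
    GREEN = "GREEN"
    YELLOW = "YELLOW"
    RED = "RED"
    CRITICAL = "CRITICAL"


# Hypothetical handling policy; the step names on the right are illustrative.
POLICY = {
    Level.GREEN: "execute",
    Level.YELLOW: "execute_and_log",
    Level.RED: "hold_for_confirmation",
    Level.CRITICAL: "block_for_manual_review",
}


def handling_for(level: str) -> str:
    """Map a classification level string to a caller-side handling step."""
    try:
        return POLICY[Level(level)]
    except ValueError:
        # Assumption: an unrecognized level fails closed rather than open.
        return "block_for_manual_review"
```

Keeping the policy in one table makes it easy to audit which levels are allowed to proceed without a human.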

API reference

classify(options)

Parameters:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| action | string | yes | The action identifier the agent is proposing to execute |
| target | string | yes | Description of the target resource or scope of the action |
| context | string | no | Additional context to resolve ambiguous action classifications |

Returns:

| Field | Type | Description |
| --- | --- | --- |
| level | 'GREEN' \| 'YELLOW' \| 'RED' \| 'CRITICAL' | Irreversibility classification |
| hold | boolean | `true` when level is RED or CRITICAL; execution should not proceed until the hold is cleared (confirmation for RED, manual review for CRITICAL) |
| reason | string | Human-readable explanation of the classification |
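
When unit-testing caller code, it can help to have a stand-in with the same three fields. The sketch below is an assumption-laden test double, not the package's own type: `ResultSketch` and `HOLD_LEVELS` are hypothetical names, and the only rule encoded is the documented one that `hold` is true for RED and CRITICAL.

```python
from dataclasses import dataclass

# Levels for which the documented `hold` flag is true.
HOLD_LEVELS = frozenset({"RED", "CRITICAL"})


@dataclass(frozen=True)
class ResultSketch:
    """Hypothetical stand-in for the documented return fields."""
    level: str
    reason: str

    @property
    def hold(self) -> bool:
        # Documented rule: hold is true when level is RED or CRITICAL.
        return self.level in HOLD_LEVELS


held = ResultSketch(level="RED", reason="permanent data deletion")
free = ResultSketch(level="YELLOW", reason="partial rollback only")
```

Deriving `hold` from `level` in the double keeps the two fields from drifting apart in tests.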

Integration guide

NESHER sits between an agent's decision to act and the execution of that action. The pattern is consistent regardless of orchestration framework — intercept before the tool call fires, not after.

import { classify } from '@reshimu/nesher'

async function beforeToolCall(action: string, target: string) {
  const result = classify({ action, target })

  if (result.level === 'CRITICAL') {
    return {
      blocked: true,
      reason: result.reason,
      requiresManualReview: true
    }
  }

  if (result.level === 'RED') {
    return {
      blocked: true,
      reason: result.reason,
      awaitingConfirmation: true
    }
  }

  return { blocked: false, level: result.level }
}
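
The handler above reports that a RED action is awaiting confirmation but leaves the resumption step to the caller. One possible shape for that step, sketched in Python under stated assumptions: `HoldQueue` and its methods are hypothetical names, storage is in-memory only, and a real deployment would need persistence and authentication around `confirm`.

```python
import uuid
from typing import Callable, Dict, Tuple


class HoldQueue:
    """In-memory store for actions that are waiting on a human decision."""

    def __init__(self) -> None:
        self._pending: Dict[str, Tuple[Callable[[], object], str]] = {}

    def hold(self, run: Callable[[], object], reason: str) -> str:
        """Park a zero-argument callable and return a token for later review."""
        token = uuid.uuid4().hex
        self._pending[token] = (run, reason)
        return token

    def confirm(self, token: str) -> object:
        """Execute a held action exactly once; unknown tokens raise KeyError."""
        run, _reason = self._pending.pop(token)
        return run()

    def reject(self, token: str) -> None:
        """Discard a held action without executing it."""
        self._pending.pop(token, None)


queue = HoldQueue()
token = queue.hold(lambda: "email sent", reason="outbound communication")
outcome = queue.confirm(token)  # runs the action now that it is confirmed
```

Popping the token on confirmation means a replayed confirmation cannot fire the same irreversible action twice.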

With SHOR (grounding gate)

NESHER and SHOR compose. The recommended order is: classify the action's reversibility first (NESHER), then check grounding of the output that produced it (SHOR). An UNGROUNDED output that also proposes an irreversible action is caught by both gates.

import { classify as reversibility } from '@reshimu/nesher'
import { classify as grounding } from '@reshimu/shor'

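async function intercept(action: string, output: string, sources: string[]) {
  // NESHER first: how reversible is the proposed action?
  const rev  = reversibility({ action, target: output })
  // SHOR second: does the output trace to its sources?
  const gnd  = grounding({ output, sources })

  // rev.hold is true for RED and CRITICAL, so irreversible actions are
  // stopped here too, not only catastrophic ones.
  if (rev.hold || gnd.level === 'UNGROUNDED') {
    return { blocked: true, rev, gnd }
  }

  return { blocked: false, rev, gnd }
}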
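
The same composition can be sketched in Python with both gates injected as callables, which also records which gate tripped. Everything here is an assumption for illustration: the Python signature of SHOR is not shown in this document, so `grounding` is a hypothetical parameter, and the two `stub_*` functions are stand-ins rather than the real classifiers.

```python
from types import SimpleNamespace
from typing import Any, Callable, Sequence


def intercept(
    action: str,
    output: str,
    sources: Sequence[str],
    reversibility: Callable[..., Any],
    grounding: Callable[..., Any],
) -> dict:
    """Run the reversibility gate first, then the grounding gate."""
    rev = reversibility(action=action, target=output)
    gnd = grounding(output=output, sources=sources)

    blocked_by = []
    if rev.hold:  # true for RED and CRITICAL
        blocked_by.append("NESHER")
    if gnd.level == "UNGROUNDED":
        blocked_by.append("SHOR")
    return {"blocked": bool(blocked_by), "blocked_by": blocked_by}


# Hypothetical stubs standing in for the two classifiers.
def stub_reversibility(action, target):
    return SimpleNamespace(level="RED", hold=True, reason="stub")


def stub_grounding(output, sources):
    return SimpleNamespace(level="UNGROUNDED")


verdict = intercept("send_email", "draft text", [], stub_reversibility, stub_grounding)
```

Returning the list of gates that fired gives reviewers a direct answer to why an action was stopped.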

Python agent loop

from reshimu_nesher import classify

def before_tool_call(action: str, target: str) -> dict:
    result = classify(action=action, target=target)

    if result.hold:
        raise ValueError(
            f"Blocked: {result.level} action. Reason: {result.reason}"
        )

    return {"level": result.level, "hold": result.hold}
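
If many tools need the same check, the function above can be folded into a decorator. This is a minimal sketch under assumptions: `guarded`, `ActionHeld`, and `stub_classify` are hypothetical names, and the classifier is passed in as a parameter so the example runs without the package installed. In real use the `classify` argument would be the function imported from `reshimu_nesher`.

```python
import functools
from types import SimpleNamespace
from typing import Any, Callable


class ActionHeld(Exception):
    """Raised when the classifier asks the caller to hold an action."""


def guarded(action: str, classify: Callable[..., Any]):
    """Decorator: classify before the wrapped tool runs, raise on hold."""
    def decorate(tool):
        @functools.wraps(tool)
        def wrapper(target: str, *args, **kwargs):
            result = classify(action=action, target=target)
            if result.hold:
                raise ActionHeld(f"{result.level}: {result.reason}")
            return tool(target, *args, **kwargs)
        return wrapper
    return decorate


def stub_classify(action, target):
    # Hypothetical stand-in for the real classifier, for illustration only.
    level = "RED" if action.startswith("delete") else "GREEN"
    return SimpleNamespace(level=level, hold=level == "RED", reason="stub rule")


@guarded("delete_records", stub_classify)
def delete_records(target: str) -> str:
    return f"deleted {target}"


@guarded("fetch_records", stub_classify)
def fetch_records(target: str) -> str:
    return f"fetched {target}"
```

A dedicated exception type lets the agent loop catch holds separately from ordinary input errors.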

Design constraints

No LLM calls. Using a language model to evaluate the reversibility of an action that another language model just proposed is a structural trap. The evaluator inherits the generator's biases — if the model wanted to take the action, it can rationalize why it is safe. NESHER uses a deterministic action taxonomy and pattern-matched classification. Same input, same result, every time, independent of the model that produced the action.

| Constraint | Reason |
| --- | --- |
| No LLM calls | The evaluator must be independent of the model it is evaluating; shared failure modes defeat the gate |
| No native dependencies | Runs in any runtime without compilation; no binding maintenance burden |
| Synchronous by default | Sub-5ms on typical payloads; sits in the execution path, not a sidecar |
| Deterministic | The irreversibility of an action does not change between runs; classifications must be stable |
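
To make "deterministic taxonomy and pattern-matched classification" concrete, here is a toy version of the idea. It is not NESHER's actual taxonomy: the rules, reasons, the `toy_classify` name, and the fail-closed default for unknown actions are all invented for illustration.

```python
import re
from typing import Tuple

# Toy taxonomy, NOT the real one: ordered (pattern, level, reason) rules.
# The first matching rule wins, so the most severe rules come first.
RULES = [
    (re.compile(r"^(drop|truncate)_"), "CRITICAL", "schema or bulk destruction"),
    (re.compile(r"^(delete|send|transfer)_"), "RED", "cannot be undone"),
    (re.compile(r"^(update|write)_"), "YELLOW", "may not roll back cleanly"),
    (re.compile(r"^(fetch|read|list)_"), "GREEN", "read-only"),
]


def toy_classify(action: str) -> Tuple[str, str]:
    """Return (level, reason) for an action name using fixed rules only."""
    name = action.strip().lower()
    for pattern, level, reason in RULES:
        if pattern.search(name):
            return level, reason
    # Assumption: an action the taxonomy does not recognize fails closed.
    return "RED", "unrecognized action"
```

Because the function depends only on its input and a fixed rule list, calling it twice with the same action always yields the same pair, with no model or network in the path.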

For the full reasoning behind the no-LLM constraint, see "Why we don't trust LLMs to classify the call they're about to make."

Runtime stack

NESHER is the first of four runtime integrity validators — the Chayyot — that form the Reshimu agent governance stack.

| Validator | Role | Status |
| --- | --- | --- |
| NESHER | Irreversibility classifier: stops unrecoverable actions before execution | v0.1.0 live |
| SHOR | Grounding validator: checks that outputs trace to source material | v0.1.0 live |
| ARYEH | Scope classifier: detects out-of-bounds tool calls and capability overreach | in development |
| PANIM ADAM | Gray-zone escalator: routes ambiguous actions to the right handler | in development |

Each validator is a zero-dependency library, usable independently. The Atzmut OS MCP server integrates all four into a single intercept layer.

For the architectural pattern underlying all four, see "Bearers of the Throne."