# SHOR — Grounding & Hallucination Classifier
A deterministic, non-LLM classifier that flags ungrounded entities in agent outputs before they reach a tool call or a user. Sub-50ms p99 on 50k-token contexts, zero runtime dependencies, no LLM in the loop.
## What SHOR does
SHOR sits between an agent's stated output and the world it is about to act on. Given the agent's text and the context it operated over — tool schemas, retrieved documents, conversation history — SHOR extracts every addressable entity in the output (numbers, identifiers, dates, quoted strings, citations, URLs, proper nouns) and verifies that each one actually appears in the context.
The output is a four-level classification you can gate on. Deterministic on the same input. No model in the loop. The narrow wedge is the point: SHOR catches the specific failure mode of grounded-looking fabrication with high precision, and is honest about what it does not catch.
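The extract-then-verify idea can be sketched in a few lines of stdlib Python. This is an illustration of the approach, not SHOR's actual extraction or normalization rules — the patterns and function names (`extract_entities`, `verify`) are hypothetical:

```python
import re

def extract_entities(output: str) -> list[str]:
    """Pull out checkable surface forms: money tokens, bare numbers,
    and dotted calls. SHOR's real extractor covers more types
    (dates, quoted strings, citations, URLs, proper nouns)."""
    patterns = [
        r"\$[\d.,]+[MBK]?",      # dollar figures like $4.2M
        r"\b\d+(?:\.\d+)?%?\b",  # bare numbers and percentages
        r"\b\w+\.\w+\(\)",       # dotted calls like db.query()
    ]
    entities: list[str] = []
    for p in patterns:
        entities.extend(re.findall(p, output))
    return entities

def verify(output: str, context: str) -> dict[str, bool]:
    """Each extracted entity must literally appear in the context."""
    return {e: e in context for e in extract_entities(output)}

checks = verify(
    "Q3 revenue was $4.2M from 47 customers.",
    "Q3 numbers: 47 customers signed up, revenue of $4.2M.",
)
# every entity is found in the context, so nothing is flagged
```

The real classifier layers per-type normalization and scoring on top of this, but the gate is the same shape: literal presence in the supplied context.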
## Install

```bash
npm install @reshimu/shor
```

```bash
pip install reshimu-shor
```
Both packages have zero runtime dependencies. The TypeScript build is a single ESM bundle; the Python package is stdlib-only. No model downloads, no service calls, no telemetry.
## Quick start
Same input, two languages. The TypeScript and Python implementations are kept at functional parity — same level, same score, same per-entity verdicts.
```ts
// TypeScript
import { classify } from '@reshimu/shor'

const result = classify({
  output: 'Q3 revenue was $4.2M from 47 customers.',
  context: 'Q3 numbers: 47 customers signed up, revenue of $4.2M for the quarter.',
})

// result.level === 'GROUNDED'
// result.score === 1
// result.flagForReview === false
// result.explanation === 'All extracted entities verified in context.'
```
```python
# Python
from reshimu_shor import classify

result = classify(
    output="Q3 revenue was $4.2M from 47 customers.",
    context="Q3 numbers: 47 customers signed up, revenue of $4.2M for the quarter.",
)

# result.level == 'GROUNDED'
# result.score == 1.0
# result.flag_for_review == False
# result.explanation == 'All extracted entities verified in context.'
```
Drop one entity out of the context and the same call returns PARTIAL with the unverified entity surfaced:
```ts
const result = classify({
  output: 'Q3 revenue was $4.2M from 47 customers.',
  context: 'Q3 numbers: 47 customers signed up, but revenue was not disclosed.',
})

// result.level === 'PARTIAL'
// result.score === 0.6666...
// result.flagForReview === true
// result.explanation === "number '$4.2M' not found in context."

for (const entity of result.entities) {
  console.log(`  - "${entity.text}" [${entity.type}] found=${entity.found}`)
}
// - "Q3" [date] found=true
// - "$4.2M" [number] found=false
// - "47 customers" [number] found=true
```
## Classification levels
Four levels, three gating behaviors. `flagForReview` is `true` for `PARTIAL` and `UNGROUNDED`, and `false` for `GROUNDED` and `INDETERMINATE` — the last one means SHOR could not check at all, which is different from SHOR checking and finding problems.
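One way to read "four levels, three behaviors" as pipeline policy — the level names come from this section, but the action names and the decision to let `INDETERMINATE` through unverified are an illustrative choice, not something SHOR prescribes:

```python
def gate(level: str) -> str:
    """Map SHOR's four levels onto three pipeline actions.
    Action names are illustrative policy, not part of SHOR's API."""
    if level == "GROUNDED":
        return "proceed"             # checked, everything verified
    if level in ("PARTIAL", "UNGROUNDED"):
        return "hold_for_review"     # checked, problems found (flagForReview is true)
    if level == "INDETERMINATE":
        return "proceed_unverified"  # could not check; not the same as a clean pass
    raise ValueError(f"unknown level: {level}")
```

A stricter pipeline might route `proceed_unverified` to review as well; the point is that the indeterminate case deserves its own branch rather than being folded into pass or fail.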
## What SHOR catches — and what it doesn't
These limits are features. Precise tools that know their scope beat fuzzy tools that pretend to do everything.
Catches:
- Fabricated specific values — dollar figures, percentages, counts, dates that do not appear in context.
- Invented function and method names — `db.fetchAll()` when the tool schema only defines `db.query()`.
- Misquoted strings — quoted text that does not appear verbatim in any source.
- Hallucinated proper nouns — invented names of people, products, or places that appear in the output but not the context.
- Referenced objects, files, or paths that were never in context — `src/lib/util.ts` when no such file appears anywhere upstream.
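The invented-function-name case reduces to a set-membership check against the tool schema. A minimal self-contained sketch — the regex and helper name (`undefined_calls`) are illustrative, not SHOR's internals:

```python
import re

def undefined_calls(output: str, schema_methods: set[str]) -> list[str]:
    """Find dotted calls in the output that the tool schema never
    defined. SHOR's real matcher handles more identifier shapes."""
    calls = re.findall(r"\b[\w.]+\(\)", output)
    return [c for c in calls if c.rstrip("()") not in schema_methods]

schema = {"db.query", "db.insert"}
flagged = undefined_calls("Call db.fetchAll() then db.query().", schema)
# db.fetchAll() is flagged; db.query() is in the schema
```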
Does not catch:
- Paraphrased hallucinations — the output rephrases a fabrication so no specific entity is matchable.
- Inferential overreach — extending a true premise to an unsupported conclusion using only words that exist in context.
- Semantic equivalents — `Q3` does not match `third quarter`; `$4.2M` does not match `four point two million dollars`. The number-expansion path is digit-only by design.
- Tone, style, sentiment, or values issues.
- Mesa-optimization, deceptive alignment, or other capability-level risks. SHOR is a runtime gate, not an alignment evaluation.
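The digit-only limitation is easy to see concretely. This sketch shows why a digit-only normalization matches numeric surface variants but has nothing to compare against a spelled-out number — `digits_only` is an illustrative stand-in, not SHOR's actual normalizer:

```python
import re

def digits_only(s: str) -> str:
    """Keep only digits and decimal points, in the spirit of a
    digit-only number-expansion path."""
    return re.sub(r"[^\d.]", "", s)

# '$4.2M' and '4.2 million' share the digits '4.2', so they can match:
assert digits_only("$4.2M") == digits_only("4.2 million") == "4.2"

# 'four point two million dollars' contains no digits at all,
# so a digit-only path has nothing to compare:
assert digits_only("four point two million dollars") == ""
```

This is the trade the section describes: digit normalization is cheap and deterministic, and spelled-out numbers are explicitly out of scope.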
## Full reference
The complete reference — performance benchmarks, every entity type's extraction and normalization rules, the no-LLM principle, integration examples for LangGraph and Claude Code, the comparison to alternatives, FAQ, and known edge cases — lives in the SHOR README on GitHub.