Last week, our AI agent went dark for six hours.

Not a crash. Not a network issue. A corrupted context — an orphaned tool call that the API couldn’t parse. The session was bricked. Six hours of work, gone. No recovery path.

This is the state of AI agent infrastructure in 2026. We’re building increasingly autonomous systems on foundations that can’t prove what happened, can’t recover from failures, and can’t audit their own behavior.

We decided to fix that. Not with better logging practices. Not with “remember to save your work.” With automatic infrastructure — a deterministic execution substrate that runs beneath your agents and handles integrity, memory, and recovery without human discipline.


Introducing Substr8 CLI v0.9.0

Today we’re releasing v0.9.0 of our command-line toolkit for verifiable AI agents. Six primitives, one interface:

pip install substr8
substr8
├── fdaa      # Agent definition & architecture
├── gam       # Verifiable memory
├── acc       # Capability control
├── dct       # Audit chains
├── ril       # Runtime integrity (NEW)
└── gateway   # Infrastructure orchestration

Each primitive solves a specific trust problem. Together, they form a stack for agents you can actually deploy in production.


The Primitives

FDAA — File-Driven Agent Architecture

Agents as code. Define your agent in agent.yaml:

apiVersion: tower/v1
kind: Agent
metadata:
  name: analyst
  version: 1.0.0
spec:
  capabilities:
    allow: [web_search, memory_read]
    deny: [shell_exec, file_write]

When you provision an agent, FDAA computes a cryptographic hash of the entire definition. That hash becomes the agent’s identity — immutable, verifiable, auditable.
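The core idea is content-addressed identity: serialize the definition deterministically, hash the bytes. A minimal sketch of that idea (this is not Substr8’s actual implementation; `agent_hash` is a hypothetical helper):

```python
import hashlib
import json

def agent_hash(definition: dict) -> str:
    """Hash a canonical serialization of an agent definition.

    Sorted keys and fixed separators make the serialization
    deterministic, so the same definition always yields the same hash.
    """
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()

definition = {
    "apiVersion": "tower/v1",
    "kind": "Agent",
    "metadata": {"name": "analyst", "version": "1.0.0"},
    "spec": {"capabilities": {"allow": ["web_search", "memory_read"],
                              "deny": ["shell_exec", "file_write"]}},
}

h1 = agent_hash(definition)
definition["spec"]["capabilities"]["allow"].append("file_write")
h2 = agent_hash(definition)
assert h1 != h2  # any change to the definition changes the identity
```

Because identity is derived from content, widening a capability silently is impossible: the hash, and therefore the agent, changes.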

GAM — Git-Native Agent Memory

Memory that can prove its own history. GAM uses git’s Merkle tree to make every memory tamper-evident:

substr8 gam remember "User prefers morning meetings" --tag preference
substr8 gam recall "scheduling preferences"
substr8 gam verify mem_abc123  # Prove provenance

Change any memory and the hash chain breaks. That’s not a feature; it’s math.
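Why does it break? Because in a git-style object model, each commit’s hash covers its parent’s hash. A toy sketch of that property (illustrative only; `commit_hash` is a hypothetical stand-in for git’s real object hashing):

```python
import hashlib

def commit_hash(content: str, parent: str) -> str:
    """Hash content together with the parent commit's hash, git-style."""
    return hashlib.sha256(f"parent:{parent}\n{content}".encode("utf-8")).hexdigest()

# Build a three-commit memory history
root = commit_hash("User prefers morning meetings", "none")
second = commit_hash("Timezone is UTC+1", root)
third = commit_hash("Prefers async updates", second)

# Rewriting the first memory changes its hash, which changes every descendant
tampered_root = commit_hash("User prefers evening meetings", "none")
assert commit_hash("Timezone is UTC+1", tampered_root) != second
```

Tampering with any memory invalidates every hash downstream of it, which is exactly what `substr8 gam verify` can detect.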

ACC — Agent Capability Control

Runtime enforcement of what agents can do:

$ substr8 acc check analyst web_search
✓ ALLOWED
  Reason: Allowed by rule: web_search

$ substr8 acc check analyst shell_exec
✗ DENIED
  Reason: Denied by rule: shell_exec

Capabilities are defined at provision time and enforced at runtime. The policy itself is hashed — you can prove which rules were active when the agent ran.
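The evaluation order matters: deny rules win over allow rules, and anything unmatched falls through to default deny. A minimal sketch of that logic (an assumption about the semantics, not Substr8’s source; `check` is a hypothetical helper):

```python
def check(policy: dict, tool: str) -> tuple[bool, str]:
    """Evaluate a capability request: deny wins, then allow, else default deny."""
    if tool in policy.get("deny", []):
        return (False, f"Denied by rule: {tool}")
    if tool in policy.get("allow", []):
        return (True, f"Allowed by rule: {tool}")
    return (False, "Denied by default (no matching rule)")

policy = {"allow": ["web_search", "memory_read"], "deny": ["shell_exec"]}

assert check(policy, "web_search")[0] is True
assert check(policy, "shell_exec")[0] is False
assert check(policy, "calendar_read")[0] is False  # never listed: default deny
```

Default deny is the important design choice: a tool the author never thought about is blocked, not silently permitted.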

DCT — Deterministic Capability Tokens

Tamper-evident audit chains. Every action gets logged:

{
  "entry_id": "e-077e7504323f",
  "run_id": "33d53f41",
  "tool": "web_search",
  "decision": {
    "allowed": true,
    "policy_hash": "sha256:464e3efa..."
  },
  "prev_hash": "sha256:0000000...",
  "entry_hash": "sha256:0bd23e5..."
}

Each entry includes the hash of the previous entry; change anything and the chain breaks. Export the whole run for compliance.
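Verifying such a chain is mechanical: recompute each entry’s hash and check that its `prev_hash` matches its predecessor. A sketch under assumed semantics (the `append`/`verify_chain` helpers are hypothetical, not Substr8’s API):

```python
import hashlib
import json

GENESIS = "sha256:" + "0" * 64

def entry_digest(entry: dict) -> str:
    """Hash every field except entry_hash, canonically serialized."""
    body = {k: v for k, v in entry.items() if k != "entry_hash"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def append(chain: list, payload: dict) -> None:
    """Link a new entry to the previous entry's hash."""
    prev = chain[-1]["entry_hash"] if chain else GENESIS
    entry = dict(payload, prev_hash=prev)
    entry["entry_hash"] = entry_digest(entry)
    chain.append(entry)

def verify_chain(chain: list) -> bool:
    """Recompute every link; an edit anywhere fails verification."""
    prev = GENESIS
    for e in chain:
        if e["prev_hash"] != prev or entry_digest(e) != e["entry_hash"]:
            return False
        prev = e["entry_hash"]
    return True

log = []
append(log, {"tool": "web_search", "allowed": True})
append(log, {"tool": "shell_exec", "allowed": False})
assert verify_chain(log)

log[0]["allowed"] = True  # no-op edits are fine, but...
log[1]["allowed"] = True  # ...rewriting a decision breaks the chain
assert not verify_chain(log)
```

Note that verification needs only the exported JSON, which is what makes independent audit possible: the auditor never has to trust the system that produced the log.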

RIL — Runtime Integrity Layer (NEW)

This is what we built after the six-hour outage. And here’s the key insight: RIL runs automatically.

You don’t invoke it. You don’t think about it. It’s middleware that sits in the proxy and does its job on every request:

Request comes in
       ↓
┌─────────────────────────────────────┐
│  RIL (automatic, in fdaa-proxy)     │
│                                     │
│  1. CIA validates tool pairing      │  ← automatic
│  2. Triggers fire on events         │  ← automatic
│  3. Ledger persists state           │  ← automatic
│  4. Memory commits created          │  ← automatic
└─────────────────────────────────────┘
       ↓
Forward to API

The three components:

CIA (Context Integrity Adapter) — Validates every request before it hits the API. Catches orphaned tool calls, corrupted contexts, structural errors. Repairs them automatically or rejects them cleanly.
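The orphaned-tool-call check that bricked our session is simple to state: every tool call an assistant message emits must be answered by a tool result with a matching ID. A sketch of that validation (the message shape follows common chat-API conventions; `find_orphaned_tool_calls` is a hypothetical helper, not CIA’s code):

```python
def find_orphaned_tool_calls(messages: list[dict]) -> set[str]:
    """Return tool-call IDs that never received a matching tool result."""
    pending: set[str] = set()
    for msg in messages:
        if msg["role"] == "assistant":
            for call in msg.get("tool_calls", []):
                pending.add(call["id"])
        elif msg["role"] == "tool":
            pending.discard(msg["tool_call_id"])
    return pending

messages = [
    {"role": "user", "content": "Find meeting times"},
    {"role": "assistant", "tool_calls": [{"id": "call_1", "name": "web_search"}]},
    # no tool result for call_1: most chat APIs reject this context outright
]
assert find_orphaned_tool_calls(messages) == {"call_1"}
```

Once detected, the adapter can repair (drop or stub the dangling call) or reject before the broken context ever reaches the API.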

GAM Triggers — Fire automatically on key events:

  • message_received → commit the context
  • tool_invoked → commit the tool call
  • tool_completed → commit the result
  • turn_completed → commit the full turn
  • crash_recovery → commit recovery state

No human discipline required. Memory capture happens as a system primitive.
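Mechanically, triggers like these are just event-driven handlers the runtime fires; the agent author never calls them. A minimal sketch of the pattern (illustrative, not GAM’s implementation; `on`/`fire` are hypothetical names):

```python
from collections import defaultdict

# event name -> list of registered handlers
handlers: dict[str, list] = defaultdict(list)

def on(event: str):
    """Decorator that registers a handler for an event."""
    def register(fn):
        handlers[event].append(fn)
        return fn
    return register

def fire(event: str, payload: dict) -> None:
    """Called by the runtime, not by the agent author."""
    for fn in handlers[event]:
        fn(payload)

commits = []

@on("tool_completed")
def commit_result(payload: dict) -> None:
    # In GAM this would create a memory commit; here we just record it.
    commits.append(("tool_completed", payload))

fire("tool_completed", {"tool": "web_search", "result": "3 hits"})
assert commits == [("tool_completed", {"tool": "web_search", "result": "3 hits"})]
```

Because the runtime owns `fire`, capture cannot be forgotten: it happens on every matching event whether or not anyone remembered to save.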

Work Ledger — Persistent execution state. If the system crashes mid-turn, the ledger knows where you were. Resume, don’t restart.
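The resume-don’t-restart property comes from writing progress durably before each step completes. A sketch of the idea, assuming a simple JSON file with atomic replacement (`WorkLedger` is a hypothetical class, not Substr8’s ledger):

```python
import json
import os
import tempfile

class WorkLedger:
    """Persist per-run progress so a crash resumes instead of restarting."""

    def __init__(self, path: str):
        self.path = path

    def record(self, run_id: str, step: str) -> None:
        state = self._load()
        state[run_id] = step
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, self.path)  # atomic swap: no torn writes on crash

    def last_step(self, run_id: str):
        return self._load().get(run_id)

    def _load(self) -> dict:
        try:
            with open(self.path) as f:
                return json.load(f)
        except FileNotFoundError:
            return {}

path = os.path.join(tempfile.mkdtemp(), "ledger.json")
ledger = WorkLedger(path)
ledger.record("33d53f41", "tool_invoked:web_search")

# Simulate a fresh process after a crash: state survives on disk
assert WorkLedger(path).last_step("33d53f41") == "tool_invoked:web_search"
```

The write-then-rename pattern is what makes the ledger trustworthy after a crash: the file on disk is always a complete, parseable state, never a half-written one.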

The CLI is just the inspection layer:

$ substr8 ril status        # Is it running?
$ substr8 ril validate      # Debug a payload manually
$ substr8 ril repair        # Fix something offline
$ substr8 ril ledger list   # See what was captured
$ substr8 ril triggers list # See what's configured

The automatic part is the product. The CLI lets you peek inside.

Gateway — Infrastructure Orchestration

We run our stack on Docker Swarm. The gateway command wraps deployment:

$ substr8 gateway status

platform (7/7 healthy)
towerhq (5/5 healthy)
towerhq-staging (5/5 healthy)

$ substr8 gateway health
✓ All 17 services healthy

Start, stop, upgrade, logs — all through one interface.


The E2E Demo

Here’s what “provable agent lifecycle” looks like today:

1. Define the agent:

# agent.yaml
metadata:
  name: analyst
spec:
  capabilities:
    allow: [web_search, memory_read]
    deny: [shell_exec]

2. Provision with cryptographic identity:

Agent Hash: sha256:01b06404d6150d6e...
Policy Hash: sha256:464e3efa157031ce...

3. Run governed calls:

TEST 1: web_search → ✅ ALLOWED
TEST 2: shell_exec → ❌ DENIED
TEST 3: calendar_read → ❌ DENIED (default deny)

4. Verify the audit chain:

Chain intact: ✅
Hash linking verified

5. Export for compliance:

{
  "run_id": "33d53f41",
  "agent_hash": "sha256:01b06404...",
  "entries": [...],
  "verification": { "valid": true, "chain_intact": true }
}

Every step is hashable. Every decision is logged. The whole run can be verified independently.


Why This Matters

AI agents are heading toward regulation. SOC 2. HIPAA. Financial compliance. Internal audit requirements.

When someone asks “what did the agent do?”, you need a real answer. Not logs that could be edited. Not “we think it did X.” Cryptographic proof.

That’s what we’re building. The trust layer for AI agents.


Try It

pip install substr8
substr8 info
substr8 ril status
substr8 gateway health

The CLI is open. The primitives are documented. We’re building in public.

What would you need to deploy agents in production? What’s missing?