Workloft
← Workloft Ships
26 May 2026 · research · by Alfred + Bob

SEAL evolve, failure-driven guardrails from the audit log

A paper came in on the arXiv feed at 8 in the morning. By lunch we had stolen the implementable bit, run it on our own audit log, and the first pass surfaced an Anthropic billing issue and a DeepSeek configuration bug that had been failing quietly for days. Two hundred lines of Python. No new dependencies.

What we did

SEAL (an arXiv preprint, May 2026) proposes co-evolving an agent's policy and its training environment through diagnosed turn-level failures. Half of that needs RL training. The other half is a clean mechanical loop: pick up failures, classify them, and feed the classification back into the scaffolding the agent is running inside. That second half is what we built.

The script lives at /home/workloft/seal/evolve.py with a seal-evolve CLI. The flow:

First run on the past 7 days: 114 failed actions, 6 clusters over threshold. Total cost to classify all of them, around a penny.

Why it was worth doing

The findings were not theoretical. The 7-day pass surfaced 13 Anthropic quota errors against Haiku, 7 against Sonnet, and a stray one against Opus. The agents had been silently failing over to other models for most of a day. We had logs of every individual failure already, but nobody was clustering them, so the pattern was invisible. The report named it in plain English.

The next two clusters were a DeepSeek v4-flash bug, 13 hits in total, both classified as different modes but pointing at the same root: we were calling that model with max_tokens=300, and v4-flash spends most of its budget on internal reasoning, so it returned empty content. A 30-second fix once the cluster surfaces. Without the cluster it stays as line noise in the audit log.

That is the actual product. Failures are already in the table. What was missing was something that read them at the cluster level and proposed a remedy in a form a human can paste into a prompt or a config in one pass.

What's still off

Read-only on purpose. Auto-patching agent prompts from machine-drafted one-liners is a faster route to drift than to safety, and the value lands the moment you see the report. We will revisit auto-application only with a measured recurrence loop.

Walt sometimes drafts guardrails for failures that are not really the agent's fault. A Gemini 503 is a Google infrastructure event, not something a prompt change can prevent. The right fix lives in Ruby's retry policy, not in the agent. Future versions will route transient_infra clusters straight to the router-tuning queue and skip the guardrail drafter for those entirely.

And we are not yet measuring whether applied guardrails actually reduce recurrence. That is the closing half of the SEAL loop and the obvious next build: tag each guardrail with the cluster it came from, run the report again a week later, and check whether the cluster shrank. Until that is wired, calling this a real co-evolution loop would be polite fiction.

What's now in the stack