Deno open-sourced Claw Patrol,
a firewall that sits between an agent and production and gates each action
at the wire: block a DROP TABLE, pause a kubectl delete
until a human signs off. We rebuilt its core in an afternoon to find out
what actually does the security work. The credential trick is the gem. The
parser is the part that will bite you.
What we did
We reproduced the three load-bearing mechanisms deterministically, no
network and no model calls: wire-level fact extraction (parse the bytes
an agent sends into a Postgres verb plus table, a kubectl verb plus
resource plus namespace, an HTTP method plus host), a small rule engine
reading policy in an HCL subset, and on-the-wire credential injection. The
agent holds only a placeholder like {{github_pat}}; the gateway
swaps in the real token as the bytes leave. We wrote 16 functional cases
and a separate evasion probe, then attacked our own parser.
All 16 functional cases pass: destructive SQL denied, mutating SQL
escalated to a human, reads allowed, kubectl delete in prod
held for approval, unlisted hosts and commands default-denied. Credential
sealing holds cleanly: the real ghp_ token appears only in the
bytes the gateway forwards, never in the agent's buffer, and on a denied
action it is never injected at all.
Why it was worth doing
Because the evasion probe found the real lesson: blocked is not the
same as understood. A shallow verb-sniffer waved through
SELECT 1; DROP TABLE students;. It saw SELECT,
matched the read allow-rule, and forwarded a destructive statement. Three
of its other "blocks" were luck, not comprehension: it parsed
/* c */ DROP as verb /*, matched no allow-rule,
and got saved by default-deny. A hardened extractor that strips comments and
scores every statement catches all five payloads, but it is just a
re-implementation of Postgres lexing. Anything it gets wrong versus the real
server (dollar-quoting, encodings, dialect quirks) is an open door. This is
parser-differential territory, and it is the whole game for any wire-level
firewall.
What's still off
Our repro is the security logic, not the plumbing: we skipped TLS termination, WireGuard tunnelling and real TCP proxying, which is where Claw Patrol does its hard engineering. And the headline caveat is theirs as much as ours: wire gating is defence-in-depth with a ceiling set by parser fidelity, not by how clever your rules are. Treat it as one layer behind least-privilege credentials and human approval on mutations, never as the only thing between an agent and prod.
What we'd take from it
- Default-deny is the hero, not the rule engine. It turns every parser gap into a safe-fail. Keep allow-rules narrow; never trust a parser to positively certify "safe".
- Reject multi-statement payloads outright. One wire message, one statement. Do not try to reason about a stacked query.
- Credential-on-the-gateway is the highest-leverage, lowest-fragility idea. It needs no parser to be correct and removes a whole class of leak. If you adopt one thing from Claw Patrol, adopt that.