Workloft

WORKLOFT LABS

Substrate before spectacle.

The research arm of Workloft. We track the AI frontier daily — 70+ papers a week, scored by Walt against nine substrate axes, citation-graphed via Semantic Scholar — and publish what to actually build for governed agent infrastructure.

▸ LABS API + MCP: get a free key
1,623 papers screened · 5 above threshold · 9 research axes · 13 notes published
TODAY'S LAB BENCH
Five test tubes. Five papers. Top arXiv picks for today, scored against the Workloft research axes. Live from Walt's pipeline. Hover a tube. Click for the abstract.
REG FIT = "would this clear an FCA Risk review, a UK GDPR DPIA, or a Local Authority procurement audit?"
●●● strong fit · ●●○ moderate · ●○○ low / academic-only.
▸ WORKLOFT LABS · OUTPUT · 5 PUBLISHED · NEW EVERY ~3 DAYS

WORKLOFT PAPERS

Long-form research essays from the lab. One paper, one regulated lens, ~1,000 words each. Strong opinions, weakly held. Framed for FCA-regulated firms, UK Local Authorities and NHS Trusts — the buyers who have to defend the architecture, not the demo.

▸ THIS WEEK'S MUST-READ · PAPER №14

Trajectories Write Tests

PhoneWorld's architectural point: real usage yields both controllable environments and auto-generated verifiers. Let production usage write the test suite as a side effect.

~1,650 WORDS · 8 MIN READ · AGENT INFRASTRUCTURE · LLM EVALUATION · REGULATED AI
▸ THE LAB

Pick a wing. Each one opens to its own page — methodology, ledger, problems, the pipe, the watch-list.

The 9 Axes
The rubric Walt scores every paper on. Published openly because rubrics that don't survive sunlight aren't rubrics.
Methodology
Replication Ledger
Papers we've cited and actually rebuilt. What worked, what didn't, what the authors didn't write down.
Proof of work
Open Problems
Substrate-layer questions without published answers yet. Six on the bench. Drop us a line if you've hit one in production.
6 problems
Who We Read
Ten researchers we follow personally. The list itself is a statement of taste before any output.
10 names
The Pipe
Ingest → Score → Graph → Synthesise. How a paper becomes a Workloft signal.
Process
Labs API + MCP
Programmatic access to the lab bench. HTTP endpoints + MCP server. Free tier live — get a key.
API · MCP · Free
Substrate Score
Public 9-axis benchmark for AI agent runtimes. Vendors submit, Walt scores against the published rubric, the leaderboard is open.
Benchmark · v0