We converted our entire ships library, 95 documents, to Google's Open Knowledge Format (OKF), verified that the bundle conforms with zero violations, then spot-checked an agent answering questions over the bundle versus the raw directory. Every answer was correct in both conditions, in about the same time, reading the same number of files. On one question the raw directory even found one more relevant document than the bundle did. The interesting part is why nothing changed.
What OKF is
OKF is Google Cloud's June 2026 attempt to standardise the LLM-wiki
pattern: a knowledge base as a directory of markdown files with YAML
frontmatter, readable by humans without tooling and by agents without
SDKs. The spec is deliberately tiny. One field is required
(type); title, description, resource, tags and timestamp
are recommended; an optional index.md per directory gives
an agent progressive disclosure, a listing to read before opening
documents. That is essentially the whole standard, and the spec says
so proudly: minimally opinionated, no schema registry, no required
tooling.
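To make the field list concrete, here is a minimal sketch of a concept document and a dependency-free check of its frontmatter. The field names come from the description above; everything else is an assumption for illustration: the `---` fences, the flat `key: value` parsing, the example values and the helper names `parse_frontmatter` and `field_report` are ours, not the spec's. Real YAML allows nesting and quoting that this toy parser ignores.

```python
# Field names as described above; the split into two tuples is our own framing.
REQUIRED = ("type",)
RECOMMENDED = ("title", "description", "resource", "tags", "timestamp")

# Hypothetical concept document. Every value here is made up for illustration.
EXAMPLE = """\
---
type: Ship
title: Example ship
description: One-line summary of what this ship does.
resource: https://example.com/ships/example-ship
tags: [example, okf]
timestamp: 2026-06-30
---

# Example ship

Body text in ordinary markdown.
"""


def parse_frontmatter(text):
    """Return flat key/value pairs found between the leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Split on the first colon only, so URLs in values survive intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return {}  # no closing fence: treat the file as having no frontmatter


def field_report(text):
    """List which required and recommended fields are absent or empty."""
    fields = parse_frontmatter(text)
    return {
        "missing_required": [k for k in REQUIRED if not fields.get(k)],
        "missing_recommended": [k for k in RECOMMENDED if not fields.get(k)],
    }


if __name__ == "__main__":
    print(field_report(EXAMPLE))
    print(field_report("# A file with no frontmatter\n"))
```

A file with only `type` set would pass the required check and simply show up with five missing recommended fields, which matches the spirit of a deliberately small spec.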
What we built
A converter that turns the machine-readable markdown siblings of every
ship on this site into a conformant bundle: one concept per ship,
type: Ship, frontmatter filled from what the pages
already carry, a month-grouped hierarchy, generated indexes at every
level and the okf_version declaration at the root. Plus a
conformance checker for the spec's three rules. 95 concepts, 415 KB,
zero violations. The whole bundle is on GitHub; point your agent at it
and ask questions about two months of our builds.
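The shape of such a converter can be sketched in a few dozen lines. This is not our build_bundle.py, just an illustration under stated assumptions: the input is a list of dicts with hypothetical keys (`slug`, `title`, `description`, `url`, `date`, `body`), month folders are named `YYYY-MM`, index files carry a made-up `type: Index`, and `okf_version` is placed in the root index frontmatter. The spec may want any of these done differently.

```python
import json
from collections import defaultdict
from pathlib import Path


def render_concept(ship):
    """Render one ship as a markdown document with YAML frontmatter."""
    front = [
        "---",
        "type: Ship",
        # json.dumps gives double-quoted strings, which YAML also accepts,
        # so titles containing colons do not break the frontmatter.
        f"title: {json.dumps(ship['title'], ensure_ascii=False)}",
        f"description: {json.dumps(ship['description'], ensure_ascii=False)}",
        f"resource: {ship['url']}",
        f"timestamp: {ship['date']}",
        "---",
    ]
    return "\n".join(front) + "\n\n" + ship["body"].rstrip() + "\n"


def build_bundle(ships, out_dir, okf_version="0.1"):
    """Write a month-grouped bundle with an index.md at every level.

    Returns the number of concept files written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    by_month = defaultdict(list)
    for ship in ships:
        by_month[ship["date"][:7]].append(ship)  # "2026-06-10" -> "2026-06"

    # Assumption: the version declaration lives in the root index frontmatter.
    root = ["---", "type: Index", f"okf_version: {json.dumps(okf_version)}",
            "---", "", "# Ships", ""]

    for month in sorted(by_month):
        month_dir = out / month
        month_dir.mkdir(exist_ok=True)
        index = ["---", "type: Index", "---", "", f"# Ships from {month}", ""]
        for ship in sorted(by_month[month], key=lambda s: s["slug"]):
            path = month_dir / f"{ship['slug']}.md"
            path.write_text(render_concept(ship), encoding="utf-8")
            index.append(
                f"- [{ship['title']}]({ship['slug']}.md): {ship['description']}"
            )
        (month_dir / "index.md").write_text("\n".join(index) + "\n", encoding="utf-8")
        root.append(f"- [{month}]({month}/index.md): {len(by_month[month])} ships")

    (out / "index.md").write_text("\n".join(root) + "\n", encoding="utf-8")
    return sum(len(group) for group in by_month.values())
```

Because the function rewrites every file from the input list, rerunning it after adding ships regenerates the indexes along with the new concepts.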
The spot check
Three questions with known answers, each asked twice in a fresh headless Claude session told to answer only from local files: once with the working directory set to the raw ships folder (95 markdown files mixed in with 95 HTML pages and assets), once inside the OKF bundle. All six runs answered correctly. Total wall-clock: 109 seconds over the raw directory, 103 over the bundle. File-read counts were identical per question. On the one question that required finding every relevant document (our Vera evaluation agent, which spans eight ships), the raw directory found all eight and the bundle surfaced seven. Three questions and one run per condition is a spot check, not a benchmark, but the result is hard to miss: no measurable advantage for the bundle.
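For scale, the arithmetic behind those figures can be worked through as below. The inputs are the numbers reported above; the function name `summarise` is ours, and treating eight as the full set of relevant documents is an assumption based on the Vera question spanning eight ships.

```python
def summarise(raw_seconds=109, bundle_seconds=103,
              raw_found=8, bundle_found=7, relevant=8):
    """Turn the reported spot-check totals into differences and ratios."""
    saved = raw_seconds - bundle_seconds
    return {
        "time_saved_seconds": saved,
        "time_saved_fraction": saved / raw_seconds,
        "raw_recall": raw_found / relevant,
        "bundle_recall": bundle_found / relevant,
    }


if __name__ == "__main__":
    for name, value in summarise().items():
        print(f"{name}: {value:.3f}")
```

Six seconds across three questions is about 5.5% of the raw total, and with one run per condition that is well within the variation you would expect between any two sessions.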
Why nothing changed
Because our corpus was already OKF-shaped before we started. The machine-readable siblings we publish for answer engines already carry YAML frontmatter with a title, a one-line description and a canonical URL, under descriptive kebab-case filenames. The agent's actual retrieval strategy in both conditions was the same: grep. A standard whose job is to make knowledge bases self-describing adds nothing to a corpus that already describes itself. That is not a criticism of OKF. It is evidence the spec codified a convention that tidy corpora were converging on anyway, which is what good minimal standards do.
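The grep strategy is easy to approximate, which is part of why the directory layout made so little difference. The sketch below is an illustration, not a record of what the agent actually ran: `grep_markdown` is a hypothetical helper that does a case-insensitive substring search over every `.md` file under a root, whether that root is a flat folder or a month-grouped bundle.

```python
from pathlib import Path


def grep_markdown(root, term):
    """Return {relative_path: [matching lines]} for a case-insensitive term."""
    needle = term.lower()
    hits = {}
    # rglob walks nested folders too, so flat and hierarchical layouts
    # are searched the same way; non-markdown files are skipped.
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        matches = [
            line.strip()
            for line in text.splitlines()
            if needle in line.lower()
        ]
        if matches:
            hits[path.relative_to(root).as_posix()] = matches
    return hits
```

When titles and descriptions already sit in frontmatter under descriptive filenames, a search like this lands on the same lines in either layout; only the relative paths differ.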
Where the value actually is
Interoperability, not retrieval. The moment a knowledge base crosses a boundary (handed to a teammate's agent, a client's, a stranger's), the consuming side either knows your conventions or wastes context discovering them. A standard removes that negotiation. Our bundle is now something anyone's agent can ingest without being told how we organise things, and that is a property our raw directory never had, whatever the grep results say. The analogy that holds: MCP did not make any single tool call better; it made every tool reachable. OKF does not make retrieval better; it makes corpora exchangeable.
What's now in the stack
- build_bundle.py: ships library → conformant OKF bundle, rerunnable as the library grows.
- check_conformance.py: validates any tree against the spec's three conformance rules.
- The full 95-ship bundle plus the eval harness and raw results on GitHub, with the write-up in examples/096. Steal the bundle; it is the point of the format.
What's still off
The spot check is three questions and one run per condition; treat it as direction, not measurement. Our corpus being tidy makes this a lower bound: a messy pile of untitled documents would likely gain more from conversion, and we did not test that. And the interoperability argument is currently a promise, because OKF consumers barely exist yet. The spec is three weeks old and version 0.1. We shipped the bundle anyway, since being early to a format costs a script and an afternoon; being late to one costs a migration.
FAQ
What is Google's Open Knowledge Format (OKF)?
An open specification from Google Cloud (June 2026, Apache 2.0) that
standardises the LLM-wiki pattern: a knowledge base as a directory of
markdown files with YAML frontmatter. Exactly one field is required
(type); title, description, resource, tags and timestamp
are recommended; per-directory index.md files support
progressive disclosure. It standardises structure only, not tooling,
storage or taxonomy.
Does converting a knowledge base to OKF improve agent retrieval?
Not measurably, if the corpus is already tidy. In our spot check over 95 documents that already had descriptive filenames, frontmatter and one-line descriptions, an agent answered every question correctly over both the bundle and the raw directory, in comparable time. OKF's value is interoperability between agents and organisations, not retrieval performance within one.
When is adopting OKF worth it?
When a knowledge base has to cross a boundary: shared with a team, handed to a client's agent, or published for anyone's second brain to ingest. Within a single system that already keeps consistent frontmatter and filenames, conversion is cheap but buys little today.