Google OKF: Open Knowledge Standard for AI Agents

On June 12, 2026, Google Cloud released version 0.1 of the Open Knowledge Format (OKF) and open-sourced the full specification, reference implementations, and sample datasets on GitHub under the Apache 2.0 license.

Unlike model releases, which tend to draw immediate public attention, OKF is a lightweight, file-based standard whose value is infrastructural. It targets a common but under-discussed problem in the AI agent ecosystem: enterprise knowledge is fragmented, hard to reuse, and difficult to deliver consistently to agents.

In many organizations, internal knowledge is scattered across database metadata, operation manuals, shared documents, repository notes, and individual engineers’ personal experience. As a result, each team often builds its own context delivery pipeline from scratch. These pipelines are rarely reusable and usually become isolated engineering assets.

OKF aims to solve this problem by providing a simple, portable, vendor-neutral format for organizing AI-agent knowledge. This article covers OKF’s definition and design principles; its relationship with MCP, RAG, and AGENTS.md; the official reference tools; its vendor-neutral architecture; and its broader industry significance.

For enterprises managing multi-model and multi-agent request traffic, 4sapi can act as an API gateway layer. It helps standardize access routing across heterogeneous agent services and reduces repetitive configuration work when teams reuse OKF knowledge bundles across systems.

1. Core Problems OKF Tries to Solve

In the current AI agent stack, model inference has improved rapidly. However, knowledge delivery remains fragmented.

Most enterprise knowledge is not stored in one clean, agent-readable system. It may exist in database schemas, internal wiki pages, shared spreadsheets, runbooks, API documents, data dictionaries, and informal team knowledge.

When an agent needs to answer a business question, this fragmentation becomes a bottleneck.

Without standardized knowledge input, agents may receive incomplete context. This increases hallucination risk and forces teams to build custom parsers, converters, and retrieval rules repeatedly.

Andrej Karpathy’s widely discussed LLM Wiki concept provides important background for OKF. The core idea is that LLMs are well suited to the knowledge-maintenance tasks that humans find repetitive, such as updating indexes, keeping cross-references current, and organizing concept pages.

Over the past year, developers have built many similar practices on their own. Examples include Obsidian vaults for coding agents, AGENTS.md, CLAUDE.md, and metadata-as-code systems for data warehouses.

These practices are useful, but they lack a shared interoperability standard. Each team creates its own “dialect” for knowledge representation. OKF standardizes these emerging patterns into a common specification that can work across teams, repositories, and organizations.

2. Basic Definition and Minimal Specification of OKF

OKF was co-authored by Sam McVeety, Google Cloud Data Cloud Tech Lead, and Amir Hormati, BigQuery Tech Lead.

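Its core design can be summarized in three points: concept files are plain Markdown, metadata lives in YAML frontmatter, and a bundle is an ordinary directory of such files.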

This makes OKF easy to read, edit, store, and version. Any text editor can open it. GitHub can render it. Git can track its history. A bundle can also be packaged into a tarball and moved across systems.

2.1 Knowledge Bundle Structure

The basic deployment unit of OKF is called a Knowledge Bundle.

A Knowledge Bundle is an independent directory that contains multiple Markdown concept files. Each .md file represents one business concept.

A concept file may describe a database table, a metric, an API document, a runbook, or a report.

Within a bundle, the file path acts as the unique ID of the concept. This makes the knowledge structure portable and easy to reference.
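As a minimal sketch of the path-as-ID rule, the function below walks a bundle directory and lists one ID per Markdown file. The function name is hypothetical, and the choice to use the POSIX-style relative path including the `.md` suffix is an assumption; the text above only says that the file path acts as the ID.

```python
from pathlib import Path


def concept_ids(bundle_root: str) -> list[str]:
    """Return the bundle-relative path of every Markdown file, sorted.

    Assumption: the ID is the POSIX-style relative path, ".md" included.
    """
    root = Path(bundle_root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.md"))
```

Because the IDs are relative to the bundle root, they stay the same when the directory is copied or cloned to another location.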

2.2 YAML Frontmatter Rules

OKF requires only one mandatory metadata field:

```yaml
type:
```

The type field identifies the category of the concept. For example, it may describe whether the file represents a database table, metric, API document, runbook, or report.

Other fields are optional. Recommended metadata fields include:

```yaml
title:
description:
resource:
tags:
timestamp:
```

This minimal requirement is intentional. It gives teams enough structure for interoperability without forcing every organization to use the same internal document model.
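The single-required-field rule is easy to check mechanically. The sketch below assumes the frontmatter is delimited by `---` lines, which is the usual Markdown convention but is not confirmed by the text above, and it uses a deliberately tiny stand-in for a YAML parser that only understands flat `key: value` lines. Both function names are hypothetical.

```python
def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse a leading '---' delimited block of flat 'key: value' lines.

    Stand-in for a real YAML parser: nested mappings, lists, and
    multi-line values are not handled.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as no frontmatter


def validate_concept(text: str) -> list[str]:
    """Return a list of problems; an empty list means the check passed."""
    meta = parse_frontmatter(text)
    if not meta.get("type"):
        return ["missing required field: type"]
    return []
```

A production validator would more likely use a full YAML library so that list-valued fields such as `tags` parse correctly.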

Cross-concept relationships are represented through standard Markdown links. This allows OKF bundles to form navigable knowledge graphs without requiring proprietary indexing systems.
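To illustrate how plain links can yield a graph, the sketch below extracts inline Markdown links from each concept and builds an adjacency map. It assumes that relative links resolve against the linking file's directory and that only `.md` targets count as concepts; neither detail is stated above, and the function names are hypothetical.

```python
import posixpath
import re

# Matches inline Markdown links: [label](target). Image links
# (![alt](src)) are excluded by the negative lookbehind.
LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)\)")


def outgoing_links(concept_id: str, markdown: str) -> set[str]:
    """Return bundle-relative IDs of the .md files this concept links to."""
    base = posixpath.dirname(concept_id)
    targets = set()
    for raw in LINK_RE.findall(markdown):
        target = raw.split("#", 1)[0]  # drop any fragment
        if "://" in target or not target.endswith(".md"):
            continue  # skip external URLs and non-concept files
        targets.add(posixpath.normpath(posixpath.join(base, target)))
    return targets


def build_graph(concepts: dict[str, str]) -> dict[str, set[str]]:
    """Map each concept ID to the set of concept IDs it links to."""
    return {cid: outgoing_links(cid, body) for cid, body in concepts.items()}
```

The input is a mapping from concept ID to file body, so the same function works whether the bundle was read from disk, a tarball, or a Git checkout.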

2.3 Minimally Opinionated Design

OKF does not try to define every detail of knowledge modeling.

It does not force teams to use a fixed section layout. It does not require a specific metadata taxonomy beyond the type field. It also does not bind the format to any single database, cloud provider, model framework, or commercial platform.

This is a practical design choice.

Different organizations describe knowledge in different ways. A data team, platform team, legal team, and AI engineering team may all need different metadata fields. OKF provides a common interoperability surface while leaving internal document structure flexible.

The goal is standardization without excessive rigidity.

3. OKF’s Positioning: Complementary to MCP, RAG, and Repository Convention Files

OKF is not designed to replace existing AI infrastructure standards. It works as a complementary knowledge layer.

Its role becomes clearer when compared with MCP, RAG, and repository convention files such as AGENTS.md and CLAUDE.md.

3.1 OKF vs MCP

MCP, or Model Context Protocol, is mainly a dynamic communication protocol. It allows agents to call tools, query live data, and interact with external systems during execution.

OKF is different. It stores static, curated, persistent knowledge.

A simple way to understand the relationship is:

```text
MCP handles dynamic tool communication.
OKF provides stable knowledge content.
```

An MCP server may deliver OKF knowledge to an agent. But OKF itself is the content format, not the real-time communication channel.
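As a rough sketch of that split, the function below is the kind of handler an MCP server might wrap as a tool: it returns the text of one concept file and refuses any path outside the bundle. No MCP SDK is used here, and the function name and error message are hypothetical.

```python
from pathlib import Path


def read_concept(bundle_root: str, concept_id: str) -> str:
    """Return the text of one concept file, refusing paths outside the bundle."""
    root = Path(bundle_root).resolve()
    path = (root / concept_id).resolve()
    # Reject "../" escapes, absolute paths, and non-Markdown files.
    if not path.is_relative_to(root) or path.suffix != ".md":
        raise ValueError(f"not a concept in this bundle: {concept_id}")
    return path.read_text(encoding="utf-8")
```

The transport layer can change without touching the bundle, which is the point of keeping the content format separate from the communication channel.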

3.2 OKF vs RAG

RAG, or Retrieval-Augmented Generation, usually focuses on semantic retrieval from large volumes of raw documents.

RAG is useful when teams need to search across broad, unstructured content. However, RAG pipelines may include noisy documents, duplicate content, inconsistent terminology, and outdated information.

OKF targets a different layer. It focuses on manually reviewed, structured, and authoritative knowledge bundles.

A practical enterprise setup can use both:

```text
RAG retrieves raw or temporary information.
OKF provides curated baseline business knowledge.
```

For example, a RAG system may search customer support tickets. OKF may define the official meaning of each support category, database field, and escalation rule.
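One way to combine the two layers, sketched under assumptions: curated definitions are placed ahead of retrieved snippets so the agent reads the authoritative meaning first. The function name, the section labels, and the ordering are illustrative choices, not part of OKF.

```python
def build_context(definitions: dict[str, str], retrieved: list[str]) -> str:
    """Put curated definitions ahead of retrieved snippets in one context string."""
    parts = ["## Curated definitions (OKF)"]
    parts += [f"- {name}: {text}" for name, text in definitions.items()]
    parts.append("## Retrieved snippets (RAG)")
    parts += [f"- {snippet}" for snippet in retrieved]
    return "\n".join(parts)
```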

3.3 OKF vs AGENTS.md and CLAUDE.md

AGENTS.md, CLAUDE.md, and similar files are repository-local conventions. They describe how an AI coding agent should behave inside a specific codebase.

OKF generalizes this idea into a cross-team and cross-system format.

A single OKF bundle can be used by many repositories, agents, dashboards, and data tools. It is not tied to one codebase.

This makes OKF more suitable for shared enterprise knowledge, especially when multiple teams need to reference the same metrics, tables, APIs, or operational rules.

4. Official Reference Implementations Released with OKF v0.1

Google Cloud released three reference implementations with OKF v0.1. These tools demonstrate how OKF can support the full workflow from knowledge generation to visualization and reuse.

4.1 Knowledge Enrichment Agent

The Knowledge Enrichment Agent is an automated tool for generating OKF knowledge files from BigQuery assets.

It can scan tables and views, then create draft Markdown concept files for each data asset.

The workflow does not stop at basic metadata extraction. A secondary LLM step can enrich the draft files with details drawn from authoritative documents.

This reduces manual documentation work and helps teams turn existing data assets into agent-readable knowledge bundles.
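The drafting step can be pictured with a small sketch. It is not the reference agent: the input is a hypothetical plain dict rather than a BigQuery API object, the output layout is an assumption, and values are written into the frontmatter without YAML escaping.

```python
def draft_concept(table: dict) -> str:
    """Render a draft concept file for one table.

    `table` is a hypothetical dict with "name", "description", and
    "columns" (a list of {"name", "type"} dicts). Values are not
    YAML-escaped; a real generator should use a YAML serializer.
    """
    lines = [
        "---",
        "type: table",
        f"title: {table['name']}",
        f"description: {table.get('description', '')}",
        "---",
        "",
        f"# {table['name']}",
        "",
        "## Columns",
        "",
    ]
    lines += [f"- `{col['name']}` ({col['type']})" for col in table["columns"]]
    return "\n".join(lines) + "\n"
```

An enrichment pass would then edit the Markdown body of each draft, leaving the frontmatter as the stable, machine-checked part.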

4.2 Static HTML Visualization Tool

The second reference tool is a static HTML visualization tool.

It can parse an OKF bundle and render an interactive knowledge graph. The tool has no backend dependency. All parsing and rendering happen locally in the browser.

This design is useful for enterprise privacy and compliance.

Knowledge data does not need to be uploaded to an external server. Teams can inspect internal knowledge graphs locally, which is important for sensitive business data, internal schemas, and private documentation.

4.3 Open-Source Sample Knowledge Bundles

Google Cloud also released three sample OKF bundles on GitHub.

The examples are drawn from public datasets.

These samples help developers understand how OKF concept files are written in practice. They also show metadata conventions, Markdown linking patterns, and knowledge graph structures.

For new users, these examples are often more useful than the specification alone.

5. Three Core Design Principles of OKF

OKF follows three consistent design principles: minimal constraints, separation of producer and consumer, and format-first architecture.

5.1 Minimally Opinionated Standard

OKF makes only the type field mandatory.

Everything else is flexible. Teams can define their own metadata, document sections, naming rules, and classification logic.

This makes OKF easier to adopt. Teams do not need to rebuild their entire knowledge system before using it. They only need to add enough structure for cross-system recognition.

The standard defines the minimum shared layer, not the full content model.

5.2 Separation of Producer and Consumer

OKF separates knowledge creation from knowledge consumption.

Human-edited bundles can be consumed directly by AI agents. Machine-generated metadata exports can also be opened and revised by humans in ordinary text editors.

This creates a bidirectional workflow:

```text
Humans can write OKF.
Machines can generate OKF.
Agents can consume OKF.
Humans can review and improve OKF.
```

Knowledge producers and consuming agents can change independently. The file format remains stable.

This is valuable for enterprise environments where data teams, engineering teams, and AI teams often work with different tools.

5.3 Format Rather Than Platform

OKF is a file specification, not a hosted platform.

It does not require a Google Cloud account, proprietary SDK, paid runtime, or specific LLM framework. It can be stored in Git, rendered as static documentation, packaged into archives, and processed by any tool that can read Markdown and YAML.
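Packaging needs nothing beyond a standard library. The sketch below writes a bundle's Markdown files to a gzip tarball using Python's `tarfile` module; the function name is hypothetical, and including only `.md` files is an assumption about what belongs in a bundle.

```python
import tarfile
from pathlib import Path


def pack_bundle(bundle_root: str, archive_path: str) -> list[str]:
    """Write the bundle's Markdown files to a gzip tarball; return archived IDs."""
    root = Path(bundle_root)
    members = sorted(root.rglob("*.md"))
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in members:
            # Store bundle-relative names so the archive unpacks anywhere.
            tar.add(path, arcname=path.relative_to(root).as_posix())
    return [p.relative_to(root).as_posix() for p in members]
```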

This greatly reduces vendor lock-in risk.

Even if a vendor stops supporting OKF in a product, existing OKF bundles remain readable, editable, and portable.

This is one of the strongest design advantages of OKF.

6. Google’s Open-Source Ecosystem Strategy and Vendor Lock-In Risk

At the same time as the OKF v0.1 release, Google Cloud updated its native Knowledge Catalog product to support direct ingestion of OKF bundles.

This follows a familiar strategy: publish an open standard, then provide native product support inside the cloud ecosystem.

However, OKF’s structure keeps vendor lock-in relatively low.

Because OKF depends only on Markdown and YAML, teams can use ordinary text editors, Git, and any off-the-shelf Markdown or YAML tooling.

No proprietary parser or paid runtime is required.

Multi-vendor enterprises can store internal business knowledge as OKF bundles and distribute knowledge access across different agent services. A routing layer such as 4sapi can help standardize requests across heterogeneous endpoints without changing the underlying knowledge format.

7. Industry Trend Reflected by OKF

The release of OKF reflects a deeper shift in the AI agent ecosystem.

The main bottleneck is no longer only base model reasoning capability. For many enterprise use cases, the harder problem is reliable access to authoritative business context.

Even advanced models can produce poor results when internal context is scattered, outdated, or incomplete.

Before standards like OKF, enterprise knowledge engineering often relied on ad-hoc scripts, manual document cleanup, and team-specific conventions. These approaches worked locally but did not scale well across organizations.

OKF turns this work into reusable infrastructure.

From a full-stack AI architecture perspective, OKF can become a middleware layer between enterprise data systems and AI agents.

A typical enterprise stack may look like this:

```text
Data warehouse
→ OKF knowledge bundles
→ Agent context layer
→ MCP / tools / RAG
→ AI agent applications
```

This structure helps reduce duplicated custom development. It also lowers hallucination risk by giving agents clearer and more authoritative context.

8. Conclusion

Google Cloud’s OKF v0.1 is a lightweight but important release for the AI agent ecosystem.

It formalizes the community-proven LLM Wiki pattern into a vendor-neutral knowledge format. Built on Markdown and YAML, it is easy to read, easy to edit, and easy to move across systems.

Its design is intentionally minimal. Only the type field is mandatory. This gives teams a shared interoperability layer without forcing them into rigid documentation templates.

OKF does not replace MCP, RAG, or repository convention files. Instead, it complements them. MCP handles dynamic tool communication. RAG retrieves from large pools of raw documents. Repository files describe local project behavior. OKF provides curated, reusable knowledge bundles that can move across teams and systems.

The three reference implementations also make the standard more practical. Teams can generate knowledge from BigQuery assets, visualize bundles locally, and study real examples from public datasets.

The larger signal is clear: AI agent competition is moving beyond model performance alone. Standardized knowledge infrastructure is becoming a key part of production-grade agent systems.

For enterprises building multi-agent and multi-model architectures, OKF can reduce knowledge silos, improve context quality, and lower repeated engineering cost.

As more data platforms, agent frameworks, and developer tools add OKF support, it may become a common format for structured knowledge circulation in the AI agent ecosystem.

Tags: Google Cloud, OKF, AI Agents, MCP, RAG