CBOR-Web Standard

Machine-Readable Binary Web Content for Autonomous Agents v3.0 Draft

Based on CBOR (RFC 8949, STD 94) — IETF Internet Standard

Abstract

CBOR-Web defines a method for websites to expose a machine-native copy of their content as a parallel channel alongside existing HTML. A single file — index.cbor — placed at the root of a domain contains the entire site's content in structured binary format. AI agents fetch this file in one request and obtain every page, product, and data point without parsing HTML, CSS, or JavaScript.

Status of This Document: This is a draft specification published for community review. It is not yet an IETF Internet-Draft. The specification is stable and implemented in production on multiple sites.

1. Protocol

An AI agent discovers and reads CBOR-Web content through a single HTTP request:

GET /index.cbor HTTP/1.1
Host: example.com
Accept: application/cbor

The response is a CBOR-encoded document (Content-Type: application/cbor) beginning with the self-described CBOR tag 0xD9D9F7 (tag 55799, RFC 8949 §3.4.6).
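
To make the byte-level shape concrete, here is a minimal sketch of a decoder for the subset of CBOR that the tables in this document rely on (integers, byte and text strings, arrays, maps, tags, and the simple values false, true and null). It is an illustration rather than a reference implementation: the function names are hypothetical, floats and indefinite-length items are deliberately not handled, and tags are dropped so only the tagged value is returned.

```python
SELF_DESCRIBED_PREFIX = bytes.fromhex("d9d9f7")  # tag 55799, as stated in the text


def _read_head(data, pos):
    """Return (major type, additional info, argument, next position)."""
    if pos >= len(data):
        raise ValueError("truncated input")
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:
        return major, info, info, pos
    if info <= 27:
        size = 1 << (info - 24)  # 1, 2, 4 or 8 argument bytes
        if pos + size > len(data):
            raise ValueError("truncated input")
        return major, info, int.from_bytes(data[pos:pos + size], "big"), pos + size
    raise ValueError("indefinite lengths and reserved values are not handled")


def decode_item(data, pos=0):
    """Decode one data item starting at pos; return (value, next position)."""
    major, info, arg, pos = _read_head(data, pos)
    if major == 0:  # unsigned integer
        return arg, pos
    if major == 1:  # negative integer
        return -1 - arg, pos
    if major in (2, 3):  # byte string or text string
        if pos + arg > len(data):
            raise ValueError("truncated input")
        raw = bytes(data[pos:pos + arg])
        return (raw if major == 2 else raw.decode("utf-8")), pos + arg
    if major == 4:  # array
        items = []
        for _ in range(arg):
            item, pos = decode_item(data, pos)
            items.append(item)
        return items, pos
    if major == 5:  # map
        result = {}
        for _ in range(arg):
            key, pos = decode_item(data, pos)
            value, pos = decode_item(data, pos)
            result[key] = value
        return result, pos
    if major == 6:  # tag: this sketch keeps only the tagged value
        return decode_item(data, pos)
    simple = {20: False, 21: True, 22: None}
    if info in simple:
        return simple[info], pos
    raise ValueError("floats and other simple values are not handled")


def decode_document(data):
    """Check the self-described prefix, then decode the single top-level item."""
    if not data.startswith(SELF_DESCRIBED_PREFIX):
        raise ValueError("missing self-described CBOR tag 55799")
    value, end = decode_item(data, 0)
    if end != len(data):
        raise ValueError("trailing bytes after the top-level item")
    return value
```

For instance, the 16 bytes d9d9f7 a2 00 68 63626f722d776562 01 03 decode to a map whose key 0 is the text "cbor-web" and whose key 1 is the integer 3. A production agent would more likely use an established CBOR library; the sketch only shows what such a library does with the first bytes of the file.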

Response | Meaning
200 OK + application/cbor | CBOR-Web supported. The body contains the entire site.
404 Not Found | CBOR-Web not available. Fall back to HTML.
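
The request and the two outcomes in the table can be sketched as follows. The function names are hypothetical, and treating every response other than a well-formed 200 as "fall back to HTML" is an assumption of this sketch; the table itself only covers 200 and 404.

```python
import urllib.error
import urllib.request

SELF_DESCRIBED_PREFIX = bytes.fromhex("d9d9f7")  # tag 55799


def classify_response(status, content_type, body):
    """Map an HTTP response onto the two outcomes in the table."""
    media_type = (content_type or "").split(";")[0].strip().lower()
    if (
        status == 200
        and media_type == "application/cbor"
        and body.startswith(SELF_DESCRIBED_PREFIX)
    ):
        return "cbor-web"
    # Assumption: anything else (404, other statuses, wrong type) means HTML.
    return "fallback-html"


def fetch_index(origin, timeout=10.0):
    """Request /index.cbor from an origin such as "https://example.com"."""
    request = urllib.request.Request(
        origin.rstrip("/") + "/index.cbor",
        headers={"Accept": "application/cbor"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read()
            content_type = response.headers.get("Content-Type")
            return classify_response(response.status, content_type, body), body
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx; a 404 lands here.
        return classify_response(error.code, None, b""), b""
```

Keeping the classification separate from the network call makes the decision rule easy to test without contacting a server.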

2. Document Structure

An index.cbor file contains a CBOR map with the following top-level keys:

Key | Name | Type | Required | Description
0 | @type | text | Yes | "cbor-web"
1 | @version | uint | Yes | 3 for this version
2 | site | map | Yes | Domain, name, languages, contact, geo
3 | security | map | Yes | Authentication, access tiers (T0/T1/T2), rate limits
4 | navigation | map | Recommended | Menus, hierarchy, breadcrumbs
5 | pages | array | Yes | All pages with structured content blocks
6 | meta | map | Yes | Timestamp, generator, signature
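
As a sketch of how an agent might check these top-level keys, the function below works on an already decoded document, assumed here to be a Python dict with integer keys. The function name and the wording of the messages are hypothetical; the contents of the nested maps are not checked.

```python
# key -> (name, expected Python type after decoding), taken from the table above
REQUIRED_KEYS = {
    0: ("@type", str),
    1: ("@version", int),
    2: ("site", dict),
    3: ("security", dict),
    5: ("pages", list),
    6: ("meta", dict),
}
RECOMMENDED_KEYS = {4: ("navigation", dict)}


def validate_index(doc):
    """Return a list of problems; an empty list means the top level looks valid."""
    if not isinstance(doc, dict):
        return ["top-level item is not a map"]
    problems = []
    for key, (name, expected) in REQUIRED_KEYS.items():
        if key not in doc:
            problems.append(f"missing required key {key} ({name})")
        elif not isinstance(doc[key], expected):
            problems.append(f"key {key} ({name}) has the wrong type")
    for key, (name, expected) in RECOMMENDED_KEYS.items():
        if key in doc and not isinstance(doc[key], expected):
            problems.append(f"key {key} ({name}) has the wrong type")
    # Value checks only run when the key is present with the right type.
    if isinstance(doc.get(0), str) and doc[0] != "cbor-web":
        problems.append('@type is not "cbor-web"')
    if isinstance(doc.get(1), int) and doc[1] != 3:
        problems.append("@version is not 3")
    return problems
```

Returning a list of problems rather than raising on the first one lets a publisher see every structural issue in a single pass.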

2.1 Page Entry

Each element in the pages array (key 5) contains:

Field | Type | Required | Description
"path" | text | Yes | URL path ("/", "/products")
"title" | text | Yes | Page title
"lang" | text | Yes | BCP 47 language code
"access" | text | Yes | "T0", "T1", or "T2"
"content" | array | Yes | Ordered content blocks
"hash" | bstr | Recommended | SHA-256 of serialised content
"updated" | tag 1 | Recommended | Last modification timestamp
"_describe" | text | Optional | Publisher guidance for AI navigation
"_l" | uint | Optional | Depth level (0-4) for progressive reading

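A page entry from section 2.1 can be checked along these lines. This is a sketch under assumptions: the entry is taken to be an already decoded Python dict, the function names are hypothetical, and because the exact serialisation that "hash" covers is not spelled out here, the caller is assumed to supply the serialised bytes of the "content" array. The "updated" field is skipped, since its decoded form depends on the CBOR library in use.

```python
import hashlib

REQUIRED_FIELDS = {
    "path": str,
    "title": str,
    "lang": str,
    "access": str,
    "content": list,
}
ACCESS_TIERS = {"T0", "T1", "T2"}


def validate_page(page):
    """Return a list of problems found in one decoded page entry."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if not isinstance(page.get(name), expected):
            problems.append(f'"{name}" is missing or has the wrong type')
    if isinstance(page.get("access"), str) and page["access"] not in ACCESS_TIERS:
        problems.append('"access" is not one of T0, T1, T2')
    if "hash" in page:
        digest = page["hash"]
        if not (isinstance(digest, bytes) and len(digest) == 32):
            problems.append('"hash" is not a 32-byte string')
    if "_l" in page:
        level = page["_l"]
        if not (isinstance(level, int) and 0 <= level <= 4):
            problems.append('"_l" is outside 0-4')
    return problems


def hash_matches(page, serialised_content):
    """Compare the page's "hash" with SHA-256 of the bytes supplied by the caller."""
    return hashlib.sha256(serialised_content).digest() == page.get("hash")
```

An agent that caches pages could use hash_matches to skip re-processing entries whose content has not changed.
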
2.2 Content Blocks

Content blocks are the atomic units of a page. Each block is a CBOR map with a required "t" (type) field:

Code | Type | Keys | Description
h | Heading | l (1-6), v | Section heading
p | Paragraph | v | Body text
ul | Unordered list | v (array) | Bullet list
ol | Ordered list | v (array) | Numbered list
table | Table | headers, rows | Data table
cta | Call to action | v, href | Button or link
q | Quote | v, attr | Citation with source
img | Image | src, alt | Image reference
code | Code | v, lang | Source code
dl | Definitions | v (array of maps) | Term/definition pairs
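
One way an agent might turn content blocks into plain text is sketched below. The block codes and keys come from the table; the function names and the output conventions (a "#" prefix for headings, " | " between table cells) are arbitrary choices made for illustration. The key names inside each "dl" map are not given in the table, so the sketch prints whatever keys it finds.

```python
def render_block(block):
    """Render one decoded content block as plain text ('' for unknown codes)."""
    kind = block.get("t")
    if kind == "h":
        return "#" * int(block["l"]) + " " + block["v"]
    if kind in ("p", "code"):
        return block["v"]
    if kind == "ul":
        return "\n".join("- " + str(item) for item in block["v"])
    if kind == "ol":
        return "\n".join(f"{n}. {item}" for n, item in enumerate(block["v"], 1))
    if kind == "table":
        lines = [" | ".join(str(cell) for cell in block["headers"])]
        lines += [" | ".join(str(cell) for cell in row) for row in block["rows"]]
        return "\n".join(lines)
    if kind == "cta":
        return f'{block["v"]} -> {block["href"]}'
    if kind == "q":
        source = block.get("attr")
        return f'"{block["v"]}"' + (f" ({source})" if source else "")
    if kind == "img":
        return f'[image: {block["alt"]}] ({block["src"]})'
    if kind == "dl":
        # Inner key names are unspecified here, so print every key/value pair.
        return "\n".join(
            "; ".join(f"{key}: {value}" for key, value in entry.items())
            for entry in block["v"]
        )
    return ""  # unknown block code: skipped in this sketch


def render_page(blocks):
    """Join the rendered blocks of a page, separated by blank lines."""
    rendered = (render_block(block) for block in blocks)
    return "\n\n".join(text for text in rendered if text)
```

Because every block is a small map with a type code, the renderer is a flat dispatch on "t" with no HTML parsing involved.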

3. Discovery Methods

An AI agent uses the following methods to discover CBOR-Web support, in order of priority:

Priority | Method | Example
1 (highest) | Direct access | GET /index.cbor
2 | DNS TXT record | _cbor-web.example.com TXT "v=cbor-web; url=..."
3 | HTML link element | <link rel="alternate" type="application/cbor" href="...">
4 | cbor.txt | Plain text file at root with index URL
5 | robots.txt | CBOR-Web: /index.cbor
6 | llms.txt | Section referencing index.cbor
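
The sketch below shows how three of these hints could be parsed and ordered by priority. It assumes the TXT value, HTML and robots.txt text have already been fetched by other means, and the function names are hypothetical. The cbor.txt and llms.txt methods are left out because their exact layout is not described here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


def parse_txt_record(txt):
    """Extract the url field from a 'v=cbor-web; url=...' TXT value, else None."""
    fields = {}
    for part in txt.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep:
            fields[name.strip().lower()] = value.strip()
    return fields.get("url") if fields.get("v") == "cbor-web" else None


class _LinkFinder(HTMLParser):
    """Remember the href of the first matching <link> element."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        is_cbor = attributes.get("type") == "application/cbor"
        if tag == "link" and "alternate" in rel and is_cbor and self.href is None:
            self.href = attributes.get("href")


def find_html_link(html, base_url):
    finder = _LinkFinder()
    finder.feed(html)
    return urljoin(base_url, finder.href) if finder.href else None


def parse_robots_txt(robots, base_url):
    """Find a 'CBOR-Web: <path>' line in robots.txt text."""
    for line in robots.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "cbor-web" and value.strip():
            return urljoin(base_url, value.strip())
    return None


def candidate_urls(origin, txt_record=None, html=None, robots=None):
    """List candidate index URLs in the priority order of the table."""
    found = [urljoin(origin, "/index.cbor")]  # priority 1: direct access
    if txt_record:
        found.append(parse_txt_record(txt_record))  # priority 2
    if html:
        found.append(find_html_link(html, origin))  # priority 3
    if robots:
        found.append(parse_robots_txt(robots, origin))  # priority 5
    ordered = []
    for url in found:
        if url and url not in ordered:
            ordered.append(url)
    return ordered
```

An agent would then try each candidate in turn and stop at the first one that returns a valid CBOR-Web document.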

4. Access Tiers

Tier | Name | Authentication | Content
T2 | Open | None | Public pages, metadata
T1 | Authenticated | CBORW Token / API key | Premium content, full data
T0 | Institutional | eIDAS 2.0 / X.509 EV | Government, verified identity

Trust Chain: A signed index.cbor creates a verifiable chain: signature → DNS public key → CBORW token → Ethereum wallet → legal entity. AI agents assess source trustworthiness through this chain.
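
As an illustration of how the "access" field of a page could be used with the tiers in the table, the sketch below filters pages by the tier an agent holds. It rests on an assumption that is not stated in this document: that tiers are cumulative, so a T1 agent may also read T2 pages and a T0 agent may read everything. The names are hypothetical, and no authentication is performed here.

```python
# Assumption: higher rank means more privileged, and access is cumulative.
TIER_RANK = {"T2": 0, "T1": 1, "T0": 2}


def readable_pages(pages, agent_tier="T2"):
    """Return the pages whose "access" tier is within the agent's tier."""
    limit = TIER_RANK[agent_tier]
    selected = []
    for page in pages:
        rank = TIER_RANK.get(page.get("access"))
        if rank is not None and rank <= limit:  # unknown tiers are excluded
            selected.append(page)
    return selected
```

Under that assumption, an unauthenticated agent calls readable_pages(pages) and sees only the T2 entries.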

5. Comparison with Existing Standards

Standard | Format | Content | Relation to CBOR-Web
robots.txt | Text | Crawl permissions | Complementary
sitemap.xml | XML | URL list | Replaced by index.cbor
llms.txt | Markdown | Text summary for LLMs | Complementary
index.html | HTML | Human-readable page | Parallel (index.cbor is for machines)
JSON-LD | JSON | Structured entities | Complementary (entities vs full content)

6. Efficiency

Measured on a production website (deltopide.fr, 25 March 2026):

Metric | HTML | CBOR-Web | Improvement
File size | 41,537 bytes | 2,487 bytes | ×17
LLM tokens | ~10,400 | ~620 | ×17
HTTP requests | 9+ | 1 | ×9
Useful signal | 12% | 100% | ×8
Formats to parse | 4 | 1 |
Accent encodings | 3 | 1 (UTF-8) |
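
The improvement factors in the table are rounded ratios of the two measured columns. A short sketch of the arithmetic, using only the figures quoted above (the function name is hypothetical):

```python
def improvement(before, after):
    """Ratio between the HTML figure and the CBOR-Web figure."""
    return before / after


file_size = improvement(41_537, 2_487)   # bytes, about 16.7
llm_tokens = improvement(10_400, 620)    # approximate token counts, about 16.8
http_requests = improvement(9, 1)        # "9+" taken as 9, so at least 9
useful_signal = improvement(100, 12)     # share of useful signal, about 8.3

for label, ratio in [
    ("File size", file_size),
    ("LLM tokens", llm_tokens),
    ("HTTP requests", http_requests),
    ("Useful signal", useful_signal),
]:
    print(f"{label}: x{ratio:.1f}")
```

Rounded to whole numbers, these give the ×17, ×17, ×9 and ×8 shown in the table.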

7. History

Chapter 1 — Two researchers, one idea (2013)

In 2013, Carsten Bormann (University of Bremen) and Paul Hoffman (ICANN) invented CBOR, the Concise Binary Object Representation. It extends the JSON data model (adding byte strings and tags, among other things) and encodes it in binary. Published as RFC 7049 by the IETF. Designed for constrained devices: thermometers, door locks, electricity meters.

Chapter 2 — Internet Standard (2020)

In December 2020, RFC 8949 replaced the original specification. CBOR became an Internet Standard (STD 94) — the highest level of validation at the IETF. Adopted by WebAuthn, COSE, CoAP. Millions of devices spoke CBOR without anyone noticing.

Chapter 3 — AI agents arrive (2024-2026)

AI agents started browsing the web — and found it absurd. 2.86 MB average per page. 86 requests to display a single page. 95% noise for 5% useful content. 50 billion AI crawler requests per day (Cloudflare 2025).

Chapter 4 — One file, the entire site (2026)

index.cbor — placed next to index.html. The HTML for humans, the CBOR for machines. A 160 KB Shopify page becomes 8 KB of CBOR. A complete 84-page site fits in 700 KB.

Chapter 5 — The ecological imperative

Data centres consumed 415 TWh of electricity in 2024, projected to reach 945 TWh by 2030 (IEA). Every unnecessary byte transferred is energy wasted. CBOR-Web doesn't just make the web faster for AI — it makes it lighter for the planet.

Chapter 6 — A standard that belongs to everyone

The CBOR-Web read protocol is published under CC0 — Public Domain. No licence to pay, no permission to ask. Anyone can read an index.cbor. The reading protocol belongs to humanity.

8. Implementations

Live Sites

Site | Method | Pages
deltopide.fr | Direct /index.cbor + DNS TXT + cbor.txt + robots.txt + llms.txt + HTML link | 1
laforetnousregale.fr | DNS TXT | 5
pacific-planet.com | DNS TXT | 6
verdetao.com | DNS TXT | 15
eloiseplot-dieteticienne.com | HTML link | 6
crm.laforetnousregale.fr | Direct /index.cbor | 1

Tools

Tool | Language | Purpose
text2cbor | Rust | Convert HTML websites to index.cbor
cbor-crawl | Rust | AI-side crawler for index.cbor files

9. References

10. Licence

The read protocol is CC0 — Public Domain. The full specification is CC BY-ND 4.0. Tools are MIT.

The reading protocol belongs to everyone. The creation tools are where value lives.