---
title: "The Agent-Ready Web Standard"
type: standard
id: "agent-ready-web-standard"
version: "0.1"
description: "Technical standard for agent-ready websites: raw content, metadata schemas, llms.txt, JSON APIs, discovery, trust signals, hashes, and change feeds."
last_updated: "2026-04-24"
status: "draft"
tags:
- standard
- specification
- agent-ready
- architecture
---
# The Agent-Ready Web Standard
**v0.1 — Draft — April 2026**
This is the technical companion to the [Agent-Ready Website Checklist](/checklist). The checklist tells you *what* to build and *why*. This page specifies *how* — formats, schemas, and protocols.
Start with the [checklist](/checklist) if you haven't read it.
---
## Content Format
Store content in a structured, parseable format with embedded metadata. The recommended approach is markdown with YAML frontmatter:
```markdown
---
title: "Page Title"
type: product
id: "unique-stable-id"
description: "One-line summary."
last_updated: "2026-04-12"
---
# Page Title
Body content in markdown.
```
**Requirements:**
- Every content item of the same type has the same metadata fields
- Fields use consistent types: ISO 8601 dates, standard units, predictable value sets
- Content is self-contained — no external rendering dependencies
- The metadata schema is documented per content type
**Acceptable formats:** Markdown + YAML frontmatter, JSON files, structured XML. The format matters less than consistency and machine-readability.
---
## Metadata Schema
### Required fields (all content types)
| Field | Type | Description |
|-------|------|-------------|
| `title` | string | Human-readable title |
| `type` | string | Content category: `product`, `article`, `doc`, `page`, etc. |
| `id` | string | Permanent unique identifier (not the URL slug) |
| `description` | string | One-line summary |
| `last_updated` | date | ISO 8601 date of last substantive change |
### Recommended fields
| Field | Type | Description |
|-------|------|-------------|
| `created` or `date` | date | First publication date |
| `author` | string | Creator or maintaining organization |
| `tags` | list | Consistent taxonomy tags |
| `last_verified` | date | When factual accuracy was last confirmed |
### Type-specific fields
Each content type should define additional fields. Document the schema so agents know what to expect. Examples:
**Product/model type:**
```yaml
provider: "Company Name"
pricing:
input: "$5.00 / 1M tokens"
output: "$25.00 / 1M tokens"
context_window: 1048576
benchmarks:
mmlu: 93.1
best_for:
- "Complex reasoning"
```
**Comparison type:**
```yaml
models_compared:
- "model-a"
- "model-b"
comparison_type: "head-to-head"
```
---
## Access Protocols
### Raw content
Serve source content at predictable URLs alongside rendered HTML pages:
```
/content/[type]/[slug].md → markdown with YAML frontmatter
/content/[type]/_index.md → type index
/content/_index.md → site index
```
### JSON API
Provide a structured API at a known base URL:
```
/api/v1/index.json → all content types and counts
/api/v1/[type].json → all items of a type with full metadata
/api/v1/[type]/[slug].json → single item (optional)
```
API responses must return typed fields — not HTML fragments.
### Bulk access
Provide a way to fetch all content efficiently:
```
/llms-full.txt → all content in one file
```
Or: paginated API endpoints, a downloadable archive, or equivalent.
---
## Discovery
### llms.txt
A machine-readable index at `/llms.txt` following the [llms-txt.org](https://llms-txt.org) format:
```
# Site Name
> One-line description.
## Content Type
- [Item Title](/path/to/item): Description
```
### well-known discovery
An endpoint at `/.well-known/ai.json`:
```json
{
"name": "Site Name",
"description": "What this site contains.",
"llms_txt": "/llms.txt",
"llms_full": "/llms-full.txt",
"api": "/api/v1/",
"raw_content": "/content/",
"search": "/search-index.json",
"sitemap": "/sitemap.xml"
}
```
### Search index
A JSON file at a known path with structured metadata for all content:
```json
[
{
"title": "Page Title",
"type": "model",
"id": "page-id",
"description": "Summary.",
"url": "/models/page-id",
"tags": ["tag1", "tag2"]
}
]
```
### Sitemap
An XML sitemap that includes both:
- Human-readable page URLs: `https://example.com/models/page-id`
- Machine-readable content URLs: `https://example.com/content/models/page-id.md`
### robots.txt
Explicitly allow agent access:
```
User-agent: *
Allow: /content/
Allow: /api/
Allow: /llms.txt
Allow: /search-index.json
```
---
## Trust Signals
### Timestamps
Use three distinct dates where applicable:
| Field | Meaning |
|-------|---------|
| `last_updated` | Last substantive content change |
| `created` / `date` | First publication |
| `last_verified` | Last factual accuracy check (distinct from edits) |
### Confidence markers
For volatile fields, add recency metadata:
```yaml
pricing:
input: "$5.00 / 1M tokens"
_as_of: "2026-04-01"
_confidence: "check-provider"
```
### Content integrity
Include a hash of the body content so agents can check cache freshness:
```yaml
content_hash: "sha256:a3f2b8c..."
```
---
## Relationship Metadata
### Typed relationships
Link related content in metadata with explicit relationship types:
```yaml
related:
- id: "other-page-id"
type: comparison
relationship: "compared_in"
- id: "tool-id"
type: agent
relationship: "used_by"
```
### Change feed
Provide a JSON or RSS feed at a known URL:
```
/feed.json → JSON Feed format
/feed.xml → RSS/Atom
```
Entries must include: title, URL, date, type, and a summary.
---
## Compliance Levels
See the [checklist](/checklist) for the full maturity model. Summary:
| Level | Name | Key Requirements |
|-------|------|-----------------|
| 0 | Scrape-Only | HTML only, no structured access |
| 1 | Readable | Semantic HTML, basic meta, sitemap |
| 2 | Structured | Raw content, consistent metadata, llms.txt |
| 3 | Agent-Ready | JSON API, canonical IDs, provenance, search index |
| 4 | Agent-Native | Relationships, feeds, hashes, confidence signals, MCP |
Most of the agent-readiness benefit comes from reaching Level 2.
---
## Reference Implementation
This site — [ai-future-ready.com](https://ai-future-ready.com) — implements this standard at Level 3. Every feature described above is live and inspectable. See the [checklist](/checklist) for our honest self-assessment and the specific gaps we're working on.
The Agent-Ready Web Standard
v0.1 — Draft — April 2026
This is the technical companion to the Agent-Ready Website Checklist. The checklist tells you what to build and why. This page specifies how — formats, schemas, and protocols.
Start with the checklist if you haven't read it.
Content Format
Store content in a structured, parseable format with embedded metadata. The recommended approach is markdown with YAML frontmatter:
---
title: "Page Title"
type: product
id: "unique-stable-id"
description: "One-line summary."
last_updated: "2026-04-12"
---
# Page Title
Body content in markdown.
Requirements:
- Every content item of the same type has the same metadata fields
- Fields use consistent types: ISO 8601 dates, standard units, predictable value sets
- Content is self-contained — no external rendering dependencies
- The metadata schema is documented per content type
Acceptable formats: Markdown + YAML frontmatter, JSON files, structured XML. The format matters less than consistency and machine-readability.
Metadata Schema
Required fields (all content types)
| Field |
Type |
Description |
title |
string |
Human-readable title |
type |
string |
Content category: product, article, doc, page, etc. |
id |
string |
Permanent unique identifier (not the URL slug) |
description |
string |
One-line summary |
last_updated |
date |
ISO 8601 date of last substantive change |
Recommended fields
| Field |
Type |
Description |
created or date |
date |
First publication date |
author |
string |
Creator or maintaining organization |
tags |
list |
Consistent taxonomy tags |
last_verified |
date |
When factual accuracy was last confirmed |
Type-specific fields
Each content type should define additional fields. Document the schema so agents know what to expect. Examples:
Product/model type:
provider: "Company Name"
pricing:
input: "$5.00 / 1M tokens"
output: "$25.00 / 1M tokens"
context_window: 1048576
benchmarks:
mmlu: 93.1
best_for:
- "Complex reasoning"
Comparison type:
models_compared:
- "model-a"
- "model-b"
comparison_type: "head-to-head"
Access Protocols
Raw content
Serve source content at predictable URLs alongside rendered HTML pages:
/content/[type]/[slug].md → markdown with YAML frontmatter
/content/[type]/_index.md → type index
/content/_index.md → site index
JSON API
Provide a structured API at a known base URL:
/api/v1/index.json → all content types and counts
/api/v1/[type].json → all items of a type with full metadata
/api/v1/[type]/[slug].json → single item (optional)
API responses must return typed fields — not HTML fragments.
Bulk access
Provide a way to fetch all content efficiently:
/llms-full.txt → all content in one file
Or: paginated API endpoints, a downloadable archive, or equivalent.
Discovery
llms.txt
A machine-readable index at /llms.txt following the llms-txt.org format:
# Site Name
> One-line description.
## Content Type
- [Item Title](/path/to/item): Description
well-known discovery
An endpoint at /.well-known/ai.json:
{
"name": "Site Name",
"description": "What this site contains.",
"llms_txt": "/llms.txt",
"llms_full": "/llms-full.txt",
"api": "/api/v1/",
"raw_content": "/content/",
"search": "/search-index.json",
"sitemap": "/sitemap.xml"
}
Search index
A JSON file at a known path with structured metadata for all content:
[
{
"title": "Page Title",
"type": "model",
"id": "page-id",
"description": "Summary.",
"url": "/models/page-id",
"tags": ["tag1", "tag2"]
}
]
Sitemap
An XML sitemap that includes both:
- Human-readable page URLs:
https://example.com/models/page-id
- Machine-readable content URLs:
https://example.com/content/models/page-id.md
robots.txt
Explicitly allow agent access:
User-agent: *
Allow: /content/
Allow: /api/
Allow: /llms.txt
Allow: /search-index.json
Trust Signals
Timestamps
Use three distinct dates where applicable:
| Field |
Meaning |
last_updated |
Last substantive content change |
created / date |
First publication |
last_verified |
Last factual accuracy check (distinct from edits) |
Confidence markers
For volatile fields, add recency metadata:
pricing:
input: "$5.00 / 1M tokens"
_as_of: "2026-04-01"
_confidence: "check-provider"
Content integrity
Include a hash of the body content so agents can check cache freshness:
content_hash: "sha256:a3f2b8c..."
Relationship Metadata
Typed relationships
Link related content in metadata with explicit relationship types:
related:
- id: "other-page-id"
type: comparison
relationship: "compared_in"
- id: "tool-id"
type: agent
relationship: "used_by"
Change feed
Provide a JSON or RSS feed at a known URL:
/feed.json → JSON Feed format
/feed.xml → RSS/Atom
Entries must include: title, URL, date, type, and a summary.
Compliance Levels
See the checklist for the full maturity model. Summary:
| Level |
Name |
Key Requirements |
| 0 |
Scrape-Only |
HTML only, no structured access |
| 1 |
Readable |
Semantic HTML, basic meta, sitemap |
| 2 |
Structured |
Raw content, consistent metadata, llms.txt |
| 3 |
Agent-Ready |
JSON API, canonical IDs, provenance, search index |
| 4 |
Agent-Native |
Relationships, feeds, hashes, confidence signals, MCP |
Most of the agent-readiness benefit comes from reaching Level 2.
Reference Implementation
This site — ai-future-ready.com — implements this standard at Level 3. Every feature described above is live and inspectable. See the checklist for our honest self-assessment and the specific gaps we're working on.