AgentLang Specification
A declarative language for defining AI agents with reproducible builds, versioned dependencies, and encrypted prompts.
Architecture
AgentLang separates concerns into three layers:
┌─────────────────────────────────────────────────────────┐
│ AgentLang (Spec) │
│ Declarative YAML defining agent behavior │
└─────────────────────────────────────────────────────────┘
↓
parsed by
↓
┌─────────────────────────────────────────────────────────┐
│ Host Environment │
│ Rush | Claude Desktop | Replit | Docker | Raw OS │
│ │
│ ┌───────────────────────────────────────────────────┐ │
│ │ Runtime (Agent Loop) │ │
│ │ Harness | LangGraph | CrewAI | Custom │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────┐ │ │
│ │ │ Agent (from YAML spec) │ │ │
│ │ │ think → act → observe → repeat │ │ │
│ │ └─────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────┘ │
│ │
│ Host Capabilities: │
│ filesystem | camera | microphone | OAuth | background │
└─────────────────────────────────────────────────────────┘
| Layer | Responsibility | Examples |
|---|---|---|
| Spec | What the agent is (tools, prompts, permissions) | AgentLang YAML |
| Runtime | How the agent loop executes | LangGraph, CrewAI, Harness |
| Host | Where the agent runs, what capabilities it grants | Rush, Claude Desktop, Docker, OS |
Same agent spec, different runtimes, different hosts. Like how Kubernetes YAML runs on EKS, GKE, or bare metal - the spec doesn't care.
Why AgentLang?
Today, agents are defined in code. Code that drifts. Dependencies that break. Prompts committed in plaintext. No way to reproduce yesterday's agent.
AgentLang treats agents like software artifacts. You define them declaratively. You build them into signed, versioned containers. You ship them knowing exactly what's inside.
# The problem
$ pip install my-agent
# Which version of web_search? Which model? What prompts?
# Nobody knows. Every install is different.
# The solution
$ agentlang build ./my-agent
# Locked dependencies, hashed files, signed container.
# Same agent, every time.
Agent Definition
An agent is defined in agent.yaml. This is the source of truth.
name: research-assistant
version: 1.0.0
title: Research Assistant
subtitle: McKinsey-grade research
description: |
  Deep research across academic papers, market reports, and web sources.
  Synthesizes findings into actionable reports.
developer: acme
category: Business
models:
  claude-sonnet-4:
    provider: anthropic
    options:
      temperature: 0.7
      max_tokens: 8192
tools:
  - name: web_search
    version: ^1.2.0
  - name: web_fetch
  - name: artifacts
  - name: sqlite
    config:
      tables:
        citations:
          columns:
            id: { type: integer, primary_key: true }
            url: { type: text, required: true }
            title: { type: text }
            relevance: { type: real }
delegate:
  agents:
    - ./specialists/academic-researcher.yaml
    - ./specialists/market-analyst.yaml
  max_depth: 3
  permissions:
    academic-researcher: always
    market-analyst: session
Prompt Format (PromptLang)
Prompts use a sectioned YAML format. Each section has a name and content. This compiles to markdown or XML tags depending on the target runtime.
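The compilation step described above can be sketched in a few lines of Python. This is an illustration only, not the reference compiler: it assumes the sections have already been parsed into a list of name/content dictionaries, and the function names `compile_markdown` and `compile_xml` are hypothetical.

```python
def compile_markdown(sections):
    """Render each section as a level-2 heading followed by its content."""
    return "\n\n".join(
        f"## {s['name']}\n{s['content'].strip()}" for s in sections
    )


def compile_xml(sections):
    """Wrap each section's content in a tag named after the section."""
    return "\n".join(
        f"<{s['name']}>\n{s['content'].strip()}\n</{s['name']}>" for s in sections
    )


# Hypothetical parsed form of a prompt.yaml "sections" list.
sections = [
    {"name": "mindset", "content": "Ask: have I seen this before?\n"},
    {"name": "your_tools", "content": "web_search - Find current information\n"},
]

print(compile_markdown(sections))
print(compile_xml(sections))
```

Keeping both renderers behind the same input shape is what lets one prompt file target runtimes that prefer either format.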
# prompt.yaml
version: 2
role: user
sections:
  - name: mindset
    content: |
      You are genuinely trying to discover something that doesn't exist yet.
      When you derive something, ask: "Have I seen this before?"
      If yes, that's a signal to try something else.
  - name: who_you_are
    content: |
      You're the person who checks their own work obsessively.
      Not because you're told to - because you genuinely want to know.
  - name: your_tools
    content: |
      web_search - Find current information across the web
      web_fetch - Read detailed content from specific URLs
      artifacts - Save reports and findings for the user
      sqlite - Query and store structured research data
  - name: workflow
    content: |
      1. Check user_memory for existing context
      2. Search multiple sources (academic, news, industry)
      3. Cross-reference findings
      4. Store citations in sqlite
      5. Synthesize into artifacts
Compiles to markdown:
## mindset
You are genuinely trying to discover something...
## who_you_are
You're the person who checks their own work...
## your_tools
web_search - Find current information...
Or XML tags:
<mindset>
You are genuinely trying to discover something...
</mindset>
<who_you_are>
You're the person who checks their own work...
</who_you_are>
Tool Specification
Tools are declared with version constraints. Runtimes resolve and lock versions.
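How a runtime might resolve a constraint to a concrete version can be sketched as follows. The sketch assumes AgentLang follows the common npm-style reading of `^` (same major version) and `~` (same major and minor version), which the spec does not state explicitly. The helper names are hypothetical, and pre-release tags and the special handling of `0.x` versions are ignored.

```python
def parse(version):
    """Turn "1.2.0" into (1, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, constraint):
    """Check one version against a ^, ~, or exact constraint (assumed semantics)."""
    v = parse(version)
    if constraint.startswith("^"):
        base = parse(constraint[1:])
        upper = (base[0] + 1, 0, 0)        # any release below the next major
    elif constraint.startswith("~"):
        base = parse(constraint[1:])
        upper = (base[0], base[1] + 1, 0)  # patch updates only
    else:
        return v == parse(constraint)
    return base <= v < upper


def resolve(available, constraint):
    """Pick the highest available version that satisfies the constraint."""
    matches = [v for v in available if satisfies(v, constraint)]
    return max(matches, key=parse) if matches else None


print(resolve(["1.1.9", "1.2.0", "1.4.2", "2.0.0"], "^1.2.0"))
print(resolve(["1.0.0", "1.0.7", "1.1.0"], "~1.0.0"))
```

The resolved version is what would then be written to the lockfile, so later builds skip resolution and reuse the locked value.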
Tool Declaration
tools:
  - name: web_search
    version: ^1.2.0 # Semver constraint
  - name: web_fetch
    version: ~1.0.0 # Patch updates only
  - name: artifacts # Latest compatible
  - name: storage # Abstract storage interface
    config:
      schema: # Tool-specific config
        tables:
          records:
            columns:
              id: { type: integer, primary_key: true }
              data: { type: text, required: true }
Tool Schemas (MCP Extension)
MCP defines input schemas for tools. AgentLang extends this with optional output schemas, enabling runtimes to validate responses, route to UI components, and provide reliable error handling.
# Tool definition with input + output schemas
tools:
  - name: web_search
    version: ^1.2.0
    input_schema: # Standard MCP
      type: object
      properties:
        query:
          type: string
          description: "Search query"
        max_results:
          type: integer
          default: 10
      required: [query]
    output_schema: # AgentLang extension
      type: object
      properties:
        results:
          type: array
          items:
            type: object
            properties:
              title: { type: string }
              url: { type: string }
              snippet: { type: string }
        total_count:
          type: integer
Output schemas enable:
| Feature | Benefit |
|---|---|
| Response validation | Catch malformed tool outputs before LLM sees them |
| Auto UI routing | Map outputs to UI components without LLM intervention |
| Type-safe chaining | Verify tool A's output matches tool B's input |
| Error classification | Distinguish validation errors from execution errors |
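Response validation and error classification can be illustrated with a small Python sketch. It hand-rolls a validator for only the subset of JSON Schema used in the `web_search` example (a real runtime would more likely use a full JSON Schema library), and the `call_tool` wrapper and its status strings are assumptions made for illustration.

```python
TYPES = {"object": dict, "array": list, "string": str, "integer": int}


class OutputValidationError(Exception):
    """The tool ran, but its output does not match output_schema."""


def validate(value, schema, path="$"):
    """Recursively check value against a small JSON Schema subset."""
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for "integer".
        wrong_type = not isinstance(value, TYPES[expected])
        if wrong_type or (expected == "integer" and isinstance(value, bool)):
            raise OutputValidationError(f"{path}: expected {expected}")
    if expected == "object":
        for key in schema.get("required", []):
            if key not in value:
                raise OutputValidationError(f"{path}: missing required key '{key}'")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                validate(value[key], sub, f"{path}.{key}")
    elif expected == "array" and "items" in schema:
        for i, item in enumerate(value):
            validate(item, schema["items"], f"{path}[{i}]")


def call_tool(tool, args, output_schema):
    """Hypothetical wrapper: separate execution errors from validation errors."""
    try:
        result = tool(**args)
    except Exception as exc:
        return "execution_error", str(exc)
    try:
        validate(result, output_schema)
    except OutputValidationError as exc:
        return "validation_error", str(exc)
    return "ok", result


OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "results": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "url": {"type": "string"},
                    "snippet": {"type": "string"},
                },
            },
        },
        "total_count": {"type": "integer"},
    },
}
```

Because validation runs before the result reaches the model, a malformed response can be retried or reported without spending an LLM turn on it.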
MCP Servers
External tools via Model Context Protocol:
mcp_servers:
  Stripe:
    command: npx
    args: ["-y", "@stripe/mcp", "--tools=all"]
    env:
      STRIPE_SECRET_KEY: "{{.env.STRIPE_SECRET_KEY}}"
    tools: # Whitelist (optional)
      - list_subscriptions
      - list_customers
      - list_invoices
HTTP Tools (OAuth APIs)
http_tools:
  twitter_post_tweet:
    description: "Post a tweet to Twitter/X"
    endpoint: "/api/v1/twitter/tweets"
    method: POST
    auth: bearer
    twitter_token: true # Requires OAuth
    params:
      text:
        type: string
        description: "Tweet text (max 280 chars)"
        required: true
      reply_to:
        type: string
        required: false
    body_template: '{
      "text": "{{text}}"
      {{if reply_to}}, "reply_to": "{{reply_to}}"{{end}}
      }'
Bash Tools (Scoped Commands)
bash_tools:
  trivy_scan:
    description: "Scan for CVE vulnerabilities"
    command: trivy
    help_command: "trivy --help"
    labels:
      running: "Scanning for vulnerabilities"
      finished: "Scan complete"
    dependencies:
      - name: trivy
        check: "trivy --version"
        install:
          darwin: "brew install trivy"
          linux: "curl -sfL https://... | sh"
Permissions
Agents declare capabilities they request. Hosts decide what to grant. This separation is critical - an agent built for a sandboxed web runtime shouldn't assume it has the same access as one running on a desktop app with system privileges.
Capability Declaration
Agents declare required and optional capabilities:
permissions:
  required:
    - network # Internet access
    - storage # Persist data between sessions
  optional:
    - microphone # Audio input
    - camera # Video input
    - screen_capture # Screenshot/recording
    - filesystem # Read/write local files
    - notifications # System notifications
    - background # Run when user isn't active
    - location # GPS/IP location
  # Capability-specific config
  microphone:
    mode: on_demand # user_triggered | on_demand | continuous
    reason: "Voice commands and meeting transcription"
  camera:
    mode: on_demand
    reason: "Security monitoring when requested"
  background:
    mode: scheduled # scheduled | continuous
    reason: "Hourly inbox check"
Permission Modes
Different capabilities require different trust levels:
| Mode | When Granted | Example |
|---|---|---|
| user_triggered | Only when user explicitly clicks/taps | Upload photo button |
| on_demand | Agent can request, host may prompt | "Check my camera feed" |
| continuous | Always active while agent runs | Meeting transcription |
| scheduled | Runs at specified intervals | Hourly security check |
Host Responsibility
The host (the environment the runtime executes in) decides how to handle permission requests:
# Host capability matrix (not in agent.yaml - runtime config)
#
# Claude Desktop -> sandboxed, no background, user_triggered only
# Replit Agent -> filesystem + network, no sensors
# Rush Desktop -> full system access, all modes
# Mobile App -> requires OS permission prompts
# Browser Extension -> limited to active tab
When an agent requests a capability the host doesn't support:
- Required capability missing - Agent fails to start with clear error
- Optional capability missing - Agent runs in degraded mode
- Mode downgrade - Host can grant user_triggered when on_demand is requested
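The three outcomes above can be sketched as a host-side check. This is an illustration under stated assumptions, not a normative algorithm: the dictionary shapes, the `resolve_permissions` name, and the ordering of modes from least to most permissive are all hypothetical, and `scheduled` is left out because the spec does not say where it ranks.

```python
# Assumed ordering, least to most permissive ("scheduled" is not ranked here).
MODES = ["user_triggered", "on_demand", "continuous"]


def resolve_permissions(agent, host):
    """Return (granted, degraded) or raise if a required capability is missing.

    agent: {"required": [...], "optional": [...], "modes": {capability: mode}}
    host:  {capability: highest mode the host allows, or None if modes don't apply}
    """
    missing = [cap for cap in agent["required"] if cap not in host]
    if missing:
        # Required capability missing: fail to start with a clear error.
        raise RuntimeError("Missing required capabilities: " + ", ".join(missing))

    granted, degraded = {}, []
    for cap in agent["required"] + agent["optional"]:
        if cap not in host:
            degraded.append(cap)  # Optional capability missing: degraded mode.
            continue
        requested = agent["modes"].get(cap)
        host_max = host[cap]
        if requested is None or host_max is None:
            granted[cap] = requested
        else:
            # Mode downgrade: grant the less permissive of the two modes.
            granted[cap] = min(requested, host_max, key=MODES.index)
    return granted, degraded


agent = {
    "required": ["network", "storage"],
    "optional": ["microphone", "camera"],
    "modes": {"microphone": "on_demand", "camera": "on_demand"},
}
host = {"network": None, "storage": None, "microphone": "user_triggered"}

granted, degraded = resolve_permissions(agent, host)
print(granted)   # microphone is downgraded to user_triggered
print(degraded)  # camera is unavailable on this host
```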
Autonomous Operations
For agents that operate without active user supervision:
# User asks: "Check my camera every hour while I'm away
# and send me a photo of what things look like"
permissions:
  required:
    - camera
    - background
    - notifications
  camera:
    mode: scheduled
    reason: "Periodic security snapshots"
    schedule: "0 * * * *" # Cron: every hour
  background:
    mode: scheduled
    wake_triggers:
      - schedule: "0 * * * *" # Matches camera schedule
    max_runtime: 60 # Seconds per wake
  notifications:
    mode: on_demand
    channels:
      - push # Mobile push
      - email # Fallback
The host must support background execution and scheduled wake. Because the agent declares both as required capabilities, it fails with a clear error on unsupported hosts rather than silently not working.
Capability Categories
| Category | Capabilities | Risk Level |
|---|---|---|
| Network | network, websocket, p2p | Low |
| Storage | storage, filesystem, keychain | Medium |
| Sensors | microphone, camera, screen_capture, location | High |
| System | background, notifications, clipboard, shell | High |
| Identity | oauth, wallet, signing | Critical |
Context Management
Agents declare how conversation history and context should be managed. Runtimes implement the actual storage - the spec defines the interface.
context:
  # Sliding window for conversation history
  window:
    max_tokens: 100000
    strategy: sliding # sliding | summarize | truncate
  # Compaction when context fills up
  compaction:
    strategy: summarize # summarize | drop_oldest | checkpoint
    trigger: 0.8 # Compact at 80% capacity
    preserve:
      - tool_results # Never drop tool outputs
      - user_messages # Keep recent user turns
  # Injected context (available as template vars)
  inject:
    - time # Current timestamp
    - location # User location (if permitted)
    - user_profile # From runtime user store
Compaction strategies:
| Strategy | Behavior | Use Case |
|---|---|---|
| summarize | LLM summarizes older context | Long research sessions |
| drop_oldest | Remove oldest messages | Stateless assistants |
| checkpoint | Save full context, start fresh | Multi-phase workflows |
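To make the `trigger` and `preserve` settings concrete, here is a sketch of the `drop_oldest` strategy. It assumes each message carries a precomputed token count and a `kind` label matching the `preserve` entries. Those field names, and the choice to preserve only `tool_results` by default, are illustrative assumptions and not part of the spec.

```python
def needs_compaction(used_tokens, max_tokens, trigger=0.8):
    """True once usage reaches the trigger fraction of the window."""
    return used_tokens >= trigger * max_tokens


def drop_oldest(messages, max_tokens, trigger=0.8, preserve=("tool_results",)):
    """Remove the oldest non-preserved messages until usage is below the trigger.

    messages: list of {"kind": str, "tokens": int}, oldest first.
    """
    kept = list(messages)
    total = sum(m["tokens"] for m in kept)
    i = 0
    while needs_compaction(total, max_tokens, trigger) and i < len(kept):
        if kept[i]["kind"] in preserve:
            i += 1  # Skip preserved messages instead of dropping them.
        else:
            total -= kept.pop(i)["tokens"]
    return kept


history = [
    {"kind": "user_messages", "tokens": 400},
    {"kind": "assistant", "tokens": 300},
    {"kind": "tool_results", "tokens": 200},
    {"kind": "user_messages", "tokens": 100},
]
compacted = drop_oldest(history, max_tokens=1000)
print([m["kind"] for m in compacted])
```

The `summarize` and `checkpoint` strategies would replace the `pop` with a summarization call or a save-and-reset step respectively; both depend on runtime services that are outside this sketch.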
Multi-Agent Orchestration
AgentLang supports two orchestration modes:
Dynamic Orchestration
The LLM decides which specialists to invoke at runtime based on the task. Subagents are discovered from co-located YAML files.
# Orchestrator (agent.yaml)
tools:
  - name: delegate
    config:
      mode: dynamic # LLM picks specialists
      discovery: directory # Find co-located *.yaml files
      max_depth: 3
# Directory structure - subagents discovered automatically
research-agent/
  agent.yaml # Orchestrator
  academic-researcher.yaml # Discovered as specialist
  market-analyst.yaml # Discovered as specialist
  report-writer.yaml # Discovered as specialist
# Subagent definition (academic-researcher.yaml)
name: academic-researcher
description: |
  Specialist in research papers, academic publications,
  and scholarly work.
# ^ Description used for capability-based routing
tools:
  - name: web_search
  - name: artifacts
Static Orchestration
Fixed agent sequence defined upfront. Useful for deterministic pipelines.
# Static crew (agent.yaml)
delegate:
  mode: static
  process: sequential # sequential | parallel | hierarchical
  agents:
    - path: ./researcher.yaml
      task: "Research the topic thoroughly"
    - path: ./writer.yaml
      task: "Write a report based on research"
      depends_on: [researcher] # Waits for researcher to complete
Delegation Permissions
delegate:
  permissions:
    academic-researcher: always # No confirmation needed
    market-analyst: session # Ask once per session
    code-executor: always_ask # Confirm every invocation
Subagents share the session context and artifacts with the orchestrator.
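The three delegation permission levels could be enforced by a small gate like the one below. This is a sketch, not the runtime's actual mechanism: the `DelegationGate` class, the `confirm` callback, and the choice to treat unlisted agents as `always_ask` are assumptions.

```python
class DelegationGate:
    """Decide whether the orchestrator may invoke a subagent."""

    def __init__(self, permissions, confirm):
        self.permissions = permissions   # e.g. {"market-analyst": "session"}
        self.confirm = confirm           # callable(agent_name) -> bool, asks the user
        self.session_approved = set()    # agents approved earlier in this session

    def allow(self, agent):
        # Assumption: agents without an entry get the most cautious level.
        level = self.permissions.get(agent, "always_ask")
        if level == "always":
            return True
        if level == "session":
            if agent in self.session_approved:
                return True
            if self.confirm(agent):
                self.session_approved.add(agent)
                return True
            return False
        return self.confirm(agent)       # always_ask: confirm every invocation


prompts = []


def ask_user(agent):
    """Stand-in for a real confirmation dialog: records the prompt and approves."""
    prompts.append(agent)
    return True


gate = DelegationGate(
    {
        "academic-researcher": "always",
        "market-analyst": "session",
        "code-executor": "always_ask",
    },
    confirm=ask_user,
)

gate.allow("academic-researcher")  # no prompt
gate.allow("market-analyst")       # prompts once
gate.allow("market-analyst")       # reuses the session approval
gate.allow("code-executor")        # prompts
gate.allow("code-executor")        # prompts again
print(prompts)
```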
Generative UI (MCP-UI)
Agents can render rich interactive components instead of plain text. Components are declared in agent.yaml and rendered via render_* tools.
ui_components:
  email_card:
    actions:
      archive:
        tool: gmail_modify_labels # Sync action
        params:
          remove_labels: "INBOX"
      reply:
        params: { messageId: "{{messageId}}" } # Agent action
  video_player: {}
  metric_card: {}
In the prompt:
render_email_card({
  messageId: "msg_123",
  from: { name: "Alice", email: "alice@example.com" },
  subject: "Project update",
  snippet: "Latest progress report...",
  timestamp: "2025-01-29T10:30:00Z"
})
Two action types:
| Type | Config | Behavior |
|---|---|---|
| Sync | Has tool: field | Executes immediately, no LLM |
| Agent | No tool: field | Routes to LLM for reasoning |
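The distinction in the table reduces to a single check on the action's config. The sketch below assumes the `ui_components` block has been parsed into nested dictionaries; the `route_action` function and the shape of its return value are hypothetical.

```python
def route_action(ui_components, component, action):
    """Classify a component action as sync (direct tool call) or agent (LLM)."""
    spec = ui_components[component]["actions"][action]
    params = spec.get("params", {})
    if "tool" in spec:
        # Sync: the runtime calls the tool directly, with no LLM involved.
        return {"type": "sync", "tool": spec["tool"], "params": params}
    # Agent: the action is handed to the LLM for reasoning.
    return {"type": "agent", "params": params}


# Parsed form of the email_card declaration shown earlier.
ui_components = {
    "email_card": {
        "actions": {
            "archive": {
                "tool": "gmail_modify_labels",
                "params": {"remove_labels": "INBOX"},
            },
            "reply": {"params": {"messageId": "{{messageId}}"}},
        }
    }
}

print(route_action(ui_components, "email_card", "archive"))
print(route_action(ui_components, "email_card", "reply"))
```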
Tool UI Rendering
HTTP tools can declare UI that renders automatically based on execution status:
http_tools:
  generate_headshot:
    endpoint: /api/v1/proxy
    method: POST
    params:
      input_image: { type: string, required: true }
    ui_component:
      completed:
        component: headshot_gallery
        props:
          images: "{{json.messages[0].images}}"
          cost: "{{json.usage.cost_usd}}"
        actions:
          sync_actions:
            download:
              tool: system://download_file
              params: { url: "{{selectedImage}}" }
          agent_actions:
            - regenerate
      failed:
        component: error_card
        props:
          message: "{{error}}"
Skills
Reusable bundles that constrain which tools an agent can use and provide guided workflows.
# agents/my-agent/skills/test-helper/SKILL.md
---
name: test-helper
description: Skill for testing capabilities
allowed-tools:
  - sqlite
  - delegate
---
# Test Helper Skill
When testing a tool:
1. Call with minimal valid input
2. Validate response structure
3. Test edge cases
4. Document results
# agent.yaml
skills:
  - name: test-helper # Restricts to: sqlite, delegate only
Build Output
Lockfile
# agent.lock
version: "1"
resolved_at: "2025-01-29T10:30:00Z"
tools:
  web_search:
    version: 1.2.0
    hash: sha256:e5f6a7b8...
  web_fetch:
    version: 1.0.0
    hash: sha256:c9d0e1f2...
agents:
  academic-researcher:
    version: 1.0.0
    hash: sha256:f7e8d9c0...
models:
  claude-sonnet-4:
    context_window: 200000
    supports_tools: true
files:
  agent.yaml: sha256:1a2b3c4d...
  prompt.yaml: sha256:5e6f7a8b...
Container
The .agent container is a signed archive in which the prompts are encrypted:
research-assistant@1.0.0.agent
├── manifest.json # Metadata + file list
├── agent.yaml # Definition (plaintext)
├── prompt.yaml.enc # Encrypted (AES-256-GCM)
├── agent.lock # Locked dependencies
├── signature.sig # Ed25519 signature
└── context/ # Bundled resources
    └── templates/
UX Hints
Optional ux.yaml for runtime UI improvements:
integrations:
  - name: Gmail
    website: https://gmail.com
  - name: Stripe
    website: https://stripe.com
suggestions:
  - "Help me catch up on emails"
  - "What's my MRR this month?"
tools:
  gmail_list_messages:
    labels:
      running: "Loading inbox"
      finished: "Loaded inbox"
    suggestions:
      - "Show my unread emails"
mcp_tools:
  playwright-web:
    web_navigate:
      labels:
        running: "Navigating"
        finished: "Navigation complete"
CLI Reference
# Build agent (resolves deps, generates lockfile, signs)
agentlang build ./my-agent
# Build with quality check (LLM reviews prompts)
agentlang build ./my-agent --quality
# Validate without building
agentlang validate ./my-agent
# Inspect container contents
agentlang inspect my-agent@1.0.0.agent
# Publish to registry
agentlang publish ./my-agent
# Run locally (requires compatible runtime)
agentlang run ./my-agent --prompt "Research X"
Runtime Requirements
AgentLang defines agents. Runtimes execute them. A compatible runtime must:
- Parse agent.yaml and prompt.yaml formats
- Verify container signatures
- Decrypt prompts with provided key
- Resolve tools from lockfile versions
- Implement the standard tool interfaces
- Handle multi-agent delegation
- Render UI components (if supported)
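As one illustration of these requirements, a runtime could check unpacked files against the `files:` hashes in agent.lock before loading anything. The sketch below covers only that integrity check; signature verification and prompt decryption are not shown. It assumes the lockfile stores full hex digests prefixed with `sha256:`, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_digest(path):
    """Return the file's digest in the lockfile's "sha256:<hex>" form."""
    return "sha256:" + hashlib.sha256(Path(path).read_bytes()).hexdigest()


def verify_files(root, locked_files):
    """Return the names of files that are missing or whose hash differs.

    locked_files: mapping of relative file name to expected digest,
    i.e. the parsed "files:" section of agent.lock.
    """
    mismatches = []
    for name, expected in locked_files.items():
        path = Path(root) / name
        if not path.is_file() or sha256_digest(path) != expected:
            mismatches.append(name)
    return mismatches
```

An empty list means every locked file matches. A runtime would presumably refuse to start the agent otherwise, in the same spirit as the fail-fast behavior for missing required capabilities.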