AgentLang v0.1.0

AgentLang Specification

A declarative language for defining AI agents with reproducible builds, versioned dependencies, and encrypted prompts.

Architecture

AgentLang separates concerns into three layers:

┌─────────────────────────────────────────────────────────┐
│                    AgentLang (Spec)                     │
│         Declarative YAML defining agent behavior        │
└─────────────────────────────────────────────────────────┘
                           ↓
                      parsed by
                           ↓
┌─────────────────────────────────────────────────────────┐
│                    Host Environment                     │
│   Rush | Claude Desktop | Replit | Docker | Raw OS      │
│                                                         │
│  ┌───────────────────────────────────────────────────┐  │
│  │              Runtime (Agent Loop)                 │  │
│  │      Harness | LangGraph | CrewAI | Custom        │  │
│  │                                                   │  │
│  │  ┌─────────────────────────────────────────────┐  │  │
│  │  │          Agent (from YAML spec)             │  │  │
│  │  │     think → act → observe → repeat          │  │  │
│  │  └─────────────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────┘  │
│                                                         │
│  Host Capabilities:                                     │
│  filesystem | camera | microphone | OAuth | background  │
└─────────────────────────────────────────────────────────┘

| Layer | Responsibility | Examples |
|-------|----------------|----------|
| Spec | What the agent is (tools, prompts, permissions) | AgentLang YAML |
| Runtime | How the agent loop executes | LangGraph, CrewAI, Harness |
| Host | Where the agent runs, what capabilities it grants | Rush, Claude Desktop, Docker, OS |

Same agent spec, different runtimes, different hosts. Like how Kubernetes YAML runs on EKS, GKE, or bare metal - the spec doesn't care.

Why AgentLang?

Today, agents are defined in code. Code that drifts. Dependencies that break. Prompts committed in plaintext. No way to reproduce yesterday's agent.

AgentLang treats agents like software artifacts. You define them declaratively. You build them into signed, versioned containers. You ship them knowing exactly what's inside.

# The problem
$ pip install my-agent
# Which version of web_search? Which model? What prompts?
# Nobody knows. Every install is different.

# The solution
$ agentlang build ./my-agent
# Locked dependencies, hashed files, signed container.
# Same agent, every time.

Agent Definition

An agent is defined in agent.yaml. This is the source of truth.

name: research-assistant
version: 1.0.0
title: Research Assistant
subtitle: McKinsey-grade research
description: |
  Deep research across academic papers, market reports, and web sources.
  Synthesizes findings into actionable reports.

developer: acme
category: Business

models:
  claude-sonnet-4:
    provider: anthropic
    options:
      temperature: 0.7
      max_tokens: 8192

tools:
  - name: web_search
    version: ^1.2.0
  - name: web_fetch
  - name: artifacts
  - name: sqlite
    config:
      tables:
        citations:
          columns:
            id: { type: integer, primary_key: true }
            url: { type: text, required: true }
            title: { type: text }
            relevance: { type: real }

delegate:
  agents:
    - ./specialists/academic-researcher.yaml
    - ./specialists/market-analyst.yaml
  max_depth: 3
  permissions:
    academic-researcher: always
    market-analyst: session

Prompt Format (PromptLang)

Prompts use a sectioned YAML format. Each section has a name and content. This compiles to markdown or XML tags depending on the target runtime.

# prompt.yaml
version: 2
role: user
sections:
  - name: mindset
    content: |
      You are genuinely trying to discover something that doesn't exist yet.
      When you derive something, ask: "Have I seen this before?"
      If yes, that's a signal to try something else.

  - name: who_you_are
    content: |
      You're the person who checks their own work obsessively.
      Not because you're told to - because you genuinely want to know.

  - name: your_tools
    content: |
      web_search - Find current information across the web
      web_fetch - Read detailed content from specific URLs
      artifacts - Save reports and findings for the user
      sqlite - Query and store structured research data

  - name: workflow
    content: |
      1. Check user_memory for existing context
      2. Search multiple sources (academic, news, industry)
      3. Cross-reference findings
      4. Store citations in sqlite
      5. Synthesize into artifacts

Compiles to markdown:

## mindset
You are genuinely trying to discover something...

## who_you_are
You're the person who checks their own work...

## your_tools
web_search - Find current information...

Or XML tags:

<mindset>
You are genuinely trying to discover something...
</mindset>

<who_you_are>
You're the person who checks their own work...
</who_you_are>
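
The compilation step can be sketched in a few lines of Python. This is a minimal illustration, not the reference compiler: the function name `compile_prompt` and the `target` argument are assumptions, and the sections are passed as plain dictionaries rather than parsed from prompt.yaml.

```python
def compile_prompt(sections, target="markdown"):
    """Render PromptLang sections as markdown headings or XML-style tags."""
    blocks = []
    for section in sections:
        name = section["name"]
        content = section["content"].strip()
        if target == "markdown":
            blocks.append(f"## {name}\n{content}")
        elif target == "xml":
            blocks.append(f"<{name}>\n{content}\n</{name}>")
        else:
            raise ValueError(f"unknown target: {target}")
    # Sections are separated by one blank line in both output formats.
    return "\n\n".join(blocks)


sections = [
    {"name": "mindset", "content": "Ask: have I seen this before?\n"},
    {"name": "your_tools", "content": "web_search - Find current information\n"},
]
print(compile_prompt(sections, "markdown"))
print(compile_prompt(sections, "xml"))
```

The sketch does not escape section content, on the assumption that it is passed to the model as-is in both formats.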

Tool Specification

Tools are declared with version constraints. Runtimes resolve and lock versions.

Tool Declaration

tools:
  - name: web_search
    version: ^1.2.0          # Semver constraint
  - name: web_fetch
    version: ~1.0.0          # Patch updates only
  - name: artifacts          # Latest compatible
  - name: storage            # Abstract storage interface
    config:
      schema:                # Tool-specific config
        tables:
          records:
            columns:
              id: { type: integer, primary_key: true }
              data: { type: text, required: true }
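
To make the constraint syntax concrete, here is a small Python sketch of how a runtime might resolve a constraint against the versions a registry offers. It assumes npm-style semantics for `^` and `~`, which the spec does not state explicitly; the helper names `satisfies` and `resolve` are hypothetical, and pre-release tags are not handled.

```python
def parse(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def satisfies(version, constraint):
    """Check one version against a ^, ~, exact, or absent constraint."""
    if constraint is None:          # no constraint: any version is acceptable
        return True
    v = parse(version)
    if constraint.startswith("^"):  # compatible: leftmost non-zero part is fixed
        low = parse(constraint[1:])
        if low[0] > 0:
            high = (low[0] + 1, 0, 0)
        elif low[1] > 0:
            high = (0, low[1] + 1, 0)
        else:
            high = (0, 0, low[2] + 1)
    elif constraint.startswith("~"):  # patch updates only
        low = parse(constraint[1:])
        high = (low[0], low[1] + 1, 0)
    else:                             # exact pin
        return v == parse(constraint)
    return low <= v < high


def resolve(available, constraint=None):
    """Pick the highest available version that satisfies the constraint."""
    matches = [v for v in available if satisfies(v, constraint)]
    return max(matches, key=parse) if matches else None


available = ["1.0.0", "1.0.4", "1.1.0", "1.2.0", "1.9.3", "2.0.0"]
print(resolve(available, "^1.2.0"))  # 1.9.3
print(resolve(available, "~1.0.0"))  # 1.0.4
```

The resolved version is what a build would then write into the lockfile, so later builds do not re-resolve.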

Tool Schemas (MCP Extension)

MCP defines input schemas for tools. AgentLang extends this with optional output schemas, enabling runtimes to validate responses, route to UI components, and provide reliable error handling.

# Tool definition with input + output schemas
tools:
  - name: web_search
    version: ^1.2.0
    input_schema:            # Standard MCP
      type: object
      properties:
        query:
          type: string
          description: "Search query"
        max_results:
          type: integer
          default: 10
      required: [query]

    output_schema:           # AgentLang extension
      type: object
      properties:
        results:
          type: array
          items:
            type: object
            properties:
              title: { type: string }
              url: { type: string }
              snippet: { type: string }
        total_count:
          type: integer

Output schemas enable:

| Feature | Benefit |
|---------|---------|
| Response validation | Catch malformed tool outputs before LLM sees them |
| Auto UI routing | Map outputs to UI components without LLM intervention |
| Type-safe chaining | Verify tool A's output matches tool B's input |
| Error classification | Distinguish validation errors from execution errors |
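
Response validation might look like the following Python sketch. It is a deliberately tiny validator covering only the schema keywords used in the web_search example (`type`, `properties`, `items`, `required`); a real runtime would more likely use a full JSON Schema implementation. The function name `validate` and the error message format are assumptions.

```python
TYPES = {"object": dict, "array": list, "string": str, "integer": int}


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value conforms."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        wrong_type = not isinstance(value, TYPES[expected])
        # bool is a subclass of int in Python, so reject it for integers.
        if wrong_type or (expected == "integer" and isinstance(value, bool)):
            return [f"{path}: expected {expected}"]
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required field")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors += validate(value[key], sub_schema, f"{path}.{key}")
    if isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            errors += validate(item, schema["items"], f"{path}[{index}]")
    return errors


output_schema = {
    "type": "object",
    "properties": {
        "results": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "url": {"type": "string"},
                    "snippet": {"type": "string"},
                },
            },
        },
        "total_count": {"type": "integer"},
    },
}

good = {"results": [{"title": "A", "url": "https://example.com"}], "total_count": 1}
bad = {"results": [{"title": 42}], "total_count": "1"}
print(validate(good, output_schema))  # []
print(validate(bad, output_schema))
```

A runtime could report a non-empty error list as a validation error, keeping it distinct from a failure raised while the tool was executing.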

MCP Servers

External tools via Model Context Protocol:

mcp_servers:
  Stripe:
    command: npx
    args: ["-y", "@stripe/mcp", "--tools=all"]
    env:
      STRIPE_SECRET_KEY: "{{.env.STRIPE_SECRET_KEY}}"
    tools:                   # Whitelist (optional)
      - list_subscriptions
      - list_customers
      - list_invoices
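
As an illustration of how a runtime could process this block, the Python sketch below expands `{{.env.NAME}}` references and applies the optional tool whitelist. Both helper names are hypothetical, and the behavior on a missing variable (raising an error) is an assumption; the spec does not say what should happen.

```python
import re

# Matches references of the form {{.env.NAME}} as used in the env block.
ENV_REF = re.compile(r"\{\{\s*\.env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def resolve_env(template, environ):
    """Replace each {{.env.NAME}} with the value from the given environment."""
    def replace(match):
        name = match.group(1)
        if name not in environ:
            raise KeyError(f"missing environment variable: {name}")
        return environ[name]
    return ENV_REF.sub(replace, template)


def visible_tools(advertised, whitelist=None):
    """Expose every advertised tool, or only the whitelisted ones if a list is given."""
    if whitelist is None:
        return list(advertised)
    return [tool for tool in advertised if tool in whitelist]


environ = {"STRIPE_SECRET_KEY": "example-key"}
print(resolve_env("{{.env.STRIPE_SECRET_KEY}}", environ))
print(visible_tools(
    ["list_customers", "create_refund", "list_invoices"],
    whitelist=["list_subscriptions", "list_customers", "list_invoices"],
))
```

Passing the environment in as a dictionary, rather than reading `os.environ` directly, keeps the sketch testable and makes it explicit which variables the server process receives.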

HTTP Tools (OAuth APIs)

http_tools:
  twitter_post_tweet:
    description: "Post a tweet to Twitter/X"
    endpoint: "/api/v1/twitter/tweets"
    method: POST
    auth: bearer
    twitter_token: true      # Requires OAuth
    params:
      text:
        type: string
        description: "Tweet text (max 280 chars)"
        required: true
      reply_to:
        type: string
        required: false
    body_template: '{
      "text": "{{text}}"
      {{if reply_to}}, "reply_to": "{{reply_to}}"{{end}}
    }'

Bash Tools (Scoped Commands)

bash_tools:
  trivy_scan:
    description: "Scan for CVE vulnerabilities"
    command: trivy
    help_command: "trivy --help"
    labels:
      running: "Scanning for vulnerabilities"
      finished: "Scan complete"

dependencies:
  - name: trivy
    check: "trivy --version"
    install:
      darwin: "brew install trivy"
      linux: "curl -sfL https://... | sh"

Permissions

Agents declare capabilities they request. Hosts decide what to grant. This separation is critical - an agent built for a sandboxed web runtime shouldn't assume it has the same access as one running on a desktop app with system privileges.

Capability Declaration

Agents declare required and optional capabilities:

permissions:
  required:
    - network              # Internet access
    - storage              # Persist data between sessions

  optional:
    - microphone           # Audio input
    - camera               # Video input
    - screen_capture       # Screenshot/recording
    - filesystem           # Read/write local files
    - notifications        # System notifications
    - background           # Run when user isn't active
    - location             # GPS/IP location

  # Capability-specific config
  microphone:
    mode: on_demand        # user_triggered | on_demand | continuous
    reason: "Voice commands and meeting transcription"

  camera:
    mode: on_demand
    reason: "Security monitoring when requested"

  background:
    mode: scheduled        # scheduled | continuous
    reason: "Hourly inbox check"

Permission Modes

Different capabilities require different trust levels:

| Mode | When Granted | Example |
|------|--------------|---------|
| user_triggered | Only when user explicitly clicks/taps | Upload photo button |
| on_demand | Agent can request, host may prompt | "Check my camera feed" |
| continuous | Always active while agent runs | Meeting transcription |
| scheduled | Runs at specified intervals | Hourly security check |

Host Responsibility

The host, not the agent, decides how to handle permission requests:

# Host capability matrix (not in agent.yaml - host-side config)
#
# Claude Desktop    -> sandboxed, no background, user_triggered only
# Replit Agent      -> filesystem + network, no sensors
# Rush Desktop      -> full system access, all modes
# Mobile App        -> requires OS permission prompts
# Browser Extension -> limited to active tab

When an agent requests a capability the host doesn't support:

Autonomous Operations

For agents that operate without active user supervision:

# User asks: "Check my camera every hour while I'm away
#              and send me a photo of what things look like"

permissions:
  required:
    - camera
    - background
    - notifications

  camera:
    mode: scheduled
    reason: "Periodic security snapshots"
    schedule: "0 * * * *"        # Cron: every hour

  background:
    mode: scheduled
    wake_triggers:
      - schedule: "0 * * * *"    # Matches camera schedule
    max_runtime: 60              # Seconds per wake

  notifications:
    mode: on_demand
    channels:
      - push                     # Mobile push
      - email                    # Fallback

The host must support background execution and scheduled wake. Because the agent declares these as required capabilities, it fails gracefully on hosts that lack them rather than silently not working.
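
The fail-fast check on required capabilities can be sketched in Python as follows. This is an illustration under stated assumptions: the function `negotiate` and the exception `UnsupportedHostError` are hypothetical names, and the host is modeled simply as a set of capability names it supports. A real host may also deny a capability it supports, for example after a user prompt.

```python
class UnsupportedHostError(Exception):
    """Raised when a host cannot provide a capability the agent requires."""


def negotiate(permissions, host_capabilities):
    """Return the granted capabilities, or raise if a required one is missing."""
    required = permissions.get("required", [])
    optional = permissions.get("optional", [])
    missing = [cap for cap in required if cap not in host_capabilities]
    if missing:
        raise UnsupportedHostError(
            "host lacks required capabilities: " + ", ".join(missing)
        )
    # Optional capabilities are granted only where the host supports them.
    return list(required) + [cap for cap in optional if cap in host_capabilities]


permissions = {
    "required": ["camera", "background", "notifications"],
    "optional": ["location"],
}
desktop_host = {"camera", "background", "notifications", "filesystem"}
sandboxed_host = {"network", "storage"}

print(negotiate(permissions, desktop_host))
try:
    negotiate(permissions, sandboxed_host)
except UnsupportedHostError as error:
    print(error)
```

Running the check before the agent loop starts means an unsupported host produces one clear error up front instead of a scheduled job that never fires.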

Capability Categories

| Category | Capabilities | Risk Level |
|----------|--------------|------------|
| Network | network, websocket, p2p | Low |
| Storage | storage, filesystem, keychain | Medium |
| Sensors | microphone, camera, screen_capture, location | High |
| System | background, notifications, clipboard, shell | High |
| Identity | oauth, wallet, signing | Critical |

Context Management

Agents declare how conversation history and context should be managed. Runtimes implement the actual storage - the spec defines the interface.

context:
  # Sliding window for conversation history
  window:
    max_tokens: 100000
    strategy: sliding        # sliding | summarize | truncate

  # Compaction when context fills up
  compaction:
    strategy: summarize      # summarize | drop_oldest | checkpoint
    trigger: 0.8             # Compact at 80% capacity
    preserve:
      - tool_results         # Never drop tool outputs
      - user_messages        # Keep recent user turns

  # Injected context (available as template vars)
  inject:
    - time                   # Current timestamp
    - location               # User location (if permitted)
    - user_profile           # From runtime user store

Compaction strategies:

| Strategy | Behavior | Use Case |
|----------|----------|----------|
| summarize | LLM summarizes older context | Long research sessions |
| drop_oldest | Remove oldest messages | Stateless assistants |
| checkpoint | Save full context, start fresh | Multi-phase workflows |
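
As a rough illustration of the drop_oldest strategy with a trigger and a preserve list, consider the Python sketch below. Several details are assumptions: messages are dictionaries with hypothetical `kind` and `text` fields, tokens are approximated by counting words, and compaction stops as soon as usage falls back below the trigger.

```python
def count_tokens(message):
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(message["text"].split())


def compact(messages, max_tokens, trigger=0.8, preserve=("tool_results",)):
    """Drop the oldest non-preserved messages once usage reaches the trigger."""
    threshold = trigger * max_tokens
    total = sum(count_tokens(m) for m in messages)
    dropped = set()
    for index, message in enumerate(messages):  # oldest first
        if total < threshold:
            break
        if message["kind"] in preserve:
            continue  # preserved kinds are never dropped
        dropped.add(index)
        total -= count_tokens(message)
    return [m for i, m in enumerate(messages) if i not in dropped]


history = [
    {"kind": "assistant", "text": "one two three"},
    {"kind": "tool_results", "text": "a b c"},
    {"kind": "assistant", "text": "four five"},
    {"kind": "user", "text": "six"},
]
# 9 tokens against a 10-token window exceeds the 80% trigger.
print([m["text"] for m in compact(history, max_tokens=10)])
```

The summarize and checkpoint strategies would replace the dropping step with an LLM call or a save-and-reset, but the trigger check would stay the same.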

Multi-Agent Orchestration

AgentLang supports two orchestration modes:

Dynamic Orchestration

The LLM decides which specialists to invoke at runtime based on the task. Subagents are discovered from co-located YAML files.

# Orchestrator (agent.yaml)
tools:
  - name: delegate
    config:
      mode: dynamic            # LLM picks specialists
      discovery: directory     # Find co-located *.yaml files
      max_depth: 3

# Directory structure - subagents discovered automatically
research-agent/
  agent.yaml                   # Orchestrator
  academic-researcher.yaml     # Discovered as specialist
  market-analyst.yaml          # Discovered as specialist
  report-writer.yaml           # Discovered as specialist

# Subagent definition (academic-researcher.yaml)
name: academic-researcher
description: |
  Specialist in research papers, academic publications,
  and scholarly work.
  # ^ Description used for capability-based routing

tools:
  - name: web_search
  - name: artifacts

Static Orchestration

Fixed agent sequence defined upfront. Useful for deterministic pipelines.

# Static crew (agent.yaml)
delegate:
  mode: static
  process: sequential          # sequential | parallel | hierarchical
  agents:
    - path: ./researcher.yaml
      task: "Research the topic thoroughly"
    - path: ./writer.yaml
      task: "Write a report based on research"
      depends_on: [researcher]  # Waits for researcher to complete
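
One way a runtime might turn `depends_on` into an execution order is a topological sort. The Python sketch below uses the standard library's `graphlib` (Python 3.9+). It assumes that an agent's name is the stem of its YAML path, which is how `depends_on: [researcher]` appears to refer to ./researcher.yaml; the function name `execution_order` is hypothetical.

```python
from graphlib import TopologicalSorter
from pathlib import PurePosixPath


def execution_order(agents):
    """Order agents so that each one runs after everything it depends on."""
    graph = {}
    for agent in agents:
        # Assumption: the agent name is the file stem of its path.
        name = PurePosixPath(agent["path"]).stem
        graph[name] = set(agent.get("depends_on", []))
    # static_order() raises graphlib.CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(graph).static_order())


agents = [
    {
        "path": "./writer.yaml",
        "task": "Write a report based on research",
        "depends_on": ["researcher"],
    },
    {"path": "./researcher.yaml", "task": "Research the topic thoroughly"},
]
print(execution_order(agents))  # ['researcher', 'writer']
```

The order comes from the dependencies, not from the position in the list, so the writer runs second even though it is declared first here.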

Delegation Permissions

delegate:
  permissions:
    academic-researcher: always    # No confirmation needed
    market-analyst: session        # Ask once per session
    code-executor: always_ask      # Confirm every invocation

Subagents share the session context and artifacts with the orchestrator.
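
The three delegation permission levels could be enforced with a check like the Python sketch below. The function name `needs_confirmation` is hypothetical, and treating an unlisted agent as always_ask is an assumption chosen as the most conservative default; the spec does not define one.

```python
def needs_confirmation(agent_name, permissions, approved_this_session):
    """Decide whether the user must confirm before delegating to an agent."""
    # Assumption: agents without an entry get the strictest mode.
    mode = permissions.get(agent_name, "always_ask")
    if mode == "always":
        return False
    if mode == "session":
        return agent_name not in approved_this_session
    if mode == "always_ask":
        return True
    raise ValueError(f"unknown permission mode: {mode}")


permissions = {
    "academic-researcher": "always",
    "market-analyst": "session",
    "code-executor": "always_ask",
}
approved = set()

print(needs_confirmation("market-analyst", permissions, approved))  # True
approved.add("market-analyst")  # user approved once for this session
print(needs_confirmation("market-analyst", permissions, approved))  # False
print(needs_confirmation("code-executor", permissions, approved))   # True
```

The set of session approvals is held by the caller, so starting a new session with an empty set brings the prompt back for session-level agents.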

Generative UI (MCP-UI)

Agents can render rich interactive components instead of plain text. Components are declared in agent.yaml and rendered via render_* tools.

ui_components:
  email_card:
    actions:
      archive:
        tool: gmail_modify_labels    # Sync action
        params:
          remove_labels: "INBOX"
      reply:
        params: { messageId: "{{messageId}}" }  # Agent action

  video_player: {}
  metric_card: {}

In the prompt:

render_email_card({
  messageId: "msg_123",
  from: { name: "Alice", email: "alice@example.com" },
  subject: "Project update",
  snippet: "Latest progress report...",
  timestamp: "2025-01-29T10:30:00Z"
})

Two action types:

| Type | Config | Behavior |
|------|--------|----------|
| Sync | Has tool: field | Executes immediately, no LLM |
| Agent | No tool: field | Routes to LLM for reasoning |
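
The distinction between the two action types comes down to whether the action config carries a `tool` field. A minimal Python sketch of that dispatch follows; the function name `route_action` and the shape of its return value are assumptions for illustration.

```python
def route_action(component, action_name, ui_components):
    """Classify a UI action as 'sync' (direct tool call) or 'agent' (sent to the LLM)."""
    action = ui_components[component]["actions"][action_name]
    params = action.get("params", {})
    if "tool" in action:
        return ("sync", action["tool"], params)
    return ("agent", None, params)


ui_components = {
    "email_card": {
        "actions": {
            "archive": {
                "tool": "gmail_modify_labels",
                "params": {"remove_labels": "INBOX"},
            },
            "reply": {"params": {"messageId": "{{messageId}}"}},
        }
    }
}

print(route_action("email_card", "archive", ui_components))
print(route_action("email_card", "reply", ui_components))
```

A runtime would execute the sync result immediately with the named tool, and forward the agent result to the model as a new turn.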

Tool UI Rendering

HTTP tools can declare UI that renders automatically based on execution status:

http_tools:
  generate_headshot:
    endpoint: /api/v1/proxy
    method: POST
    params:
      input_image: { type: string, required: true }
    ui_component:
      completed:
        component: headshot_gallery
        props:
          images: "{{json.messages[0].images}}"
          cost: "{{json.usage.cost_usd}}"
        actions:
          sync_actions:
            download:
              tool: system://download_file
              params: { url: "{{selectedImage}}" }
          agent_actions:
            - regenerate
      failed:
        component: error_card
        props:
          message: "{{error}}"

Skills

Reusable bundles that constrain which tools an agent can use and provide guided workflows.

# agents/my-agent/skills/test-helper/SKILL.md
---
name: test-helper
description: Skill for testing capabilities
allowed-tools:
  - sqlite
  - delegate
---

# Test Helper Skill

When testing a tool:
1. Call with minimal valid input
2. Validate response structure
3. Test edge cases
4. Document results

# agent.yaml
skills:
  - name: test-helper    # Restricts to: sqlite, delegate only
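
The restriction a skill imposes can be pictured as a filter over the agent's declared tools. The Python sketch below is one possible reading: the function name `effective_tools` is hypothetical, and taking the union of `allowed-tools` when several skills are active is an assumption, since the spec only shows a single skill.

```python
def effective_tools(agent_tools, active_skills):
    """Return the agent's tools that the active skills allow, in declared order."""
    if not active_skills:
        return list(agent_tools)  # no skill active: nothing is restricted
    allowed = set()
    for skill in active_skills:
        # Assumption: several active skills widen the allowed set (union).
        allowed.update(skill["allowed-tools"])
    return [tool for tool in agent_tools if tool in allowed]


agent_tools = ["web_search", "web_fetch", "artifacts", "sqlite", "delegate"]
test_helper = {"name": "test-helper", "allowed-tools": ["sqlite", "delegate"]}

print(effective_tools(agent_tools, [test_helper]))  # ['sqlite', 'delegate']
```

Because the result is filtered from the agent's own tool list, a skill can narrow what the agent uses but cannot grant a tool the agent never declared.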

Build Output

Lockfile

# agent.lock
version: "1"
resolved_at: "2025-01-29T10:30:00Z"

tools:
  web_search:
    version: 1.2.0
    hash: sha256:e5f6a7b8...
  web_fetch:
    version: 1.0.0
    hash: sha256:c9d0e1f2...

agents:
  academic-researcher:
    version: 1.0.0
    hash: sha256:f7e8d9c0...

models:
  claude-sonnet-4:
    context_window: 200000
    supports_tools: true

files:
  agent.yaml: sha256:1a2b3c4d...
  prompt.yaml: sha256:5e6f7a8b...
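
The `files` section suggests a simple integrity check: hash each file and compare against the lockfile. The Python sketch below shows the idea with in-memory bytes. The helper names are hypothetical, and hashing the raw file bytes with SHA-256 is an assumption based on the `sha256:` prefix.

```python
import hashlib


def hash_bytes(data):
    """Format a SHA-256 digest the way the lockfile writes it."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def lock_files(files):
    """Map each file name to its hash; `files` maps names to raw bytes."""
    return {name: hash_bytes(content) for name, content in sorted(files.items())}


def changed_files(lock, files):
    """List locked files whose current content no longer matches the lock."""
    current = lock_files(files)
    return sorted(name for name in lock if current.get(name) != lock[name])


original = {"agent.yaml": b"name: a\n", "prompt.yaml": b"version: 2\n"}
lock = lock_files(original)

edited = dict(original, **{"agent.yaml": b"name: b\n"})
print(changed_files(lock, original))  # []
print(changed_files(lock, edited))    # ['agent.yaml']
```

A build or run step could refuse to proceed when `changed_files` is non-empty, which is what makes the locked agent reproducible.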

Container

The .agent container is a signed, encrypted archive:

research-assistant@1.0.0.agent
├── manifest.json        # Metadata + file list
├── agent.yaml           # Definition (plaintext)
├── prompt.yaml.enc      # Encrypted (AES-256-GCM)
├── agent.lock           # Locked dependencies
├── signature.sig        # Ed25519 signature
└── context/             # Bundled resources
    └── templates/

UX Hints

Optional ux.yaml for runtime UI improvements:

integrations:
  - name: Gmail
    website: https://gmail.com
  - name: Stripe
    website: https://stripe.com

suggestions:
  - "Help me catch up on emails"
  - "What's my MRR this month?"

tools:
  gmail_list_messages:
    labels:
      running: "Loading inbox"
      finished: "Loaded inbox"
    suggestions:
      - "Show my unread emails"

mcp_tools:
  playwright-web:
    web_navigate:
      labels:
        running: "Navigating"
        finished: "Navigation complete"

CLI Reference

# Build agent (resolves deps, generates lockfile, signs)
agentlang build ./my-agent

# Build with quality check (LLM reviews prompts)
agentlang build ./my-agent --quality

# Validate without building
agentlang validate ./my-agent

# Inspect container contents
agentlang inspect my-agent@1.0.0.agent

# Publish to registry
agentlang publish ./my-agent

# Run locally (requires compatible runtime)
agentlang run ./my-agent --prompt "Research X"

Runtime Requirements

AgentLang defines agents. Runtimes execute them. A compatible runtime must: