Agent Gateway API Reference

Last updated: 2026-04-03 · Epic: MVP-2453 · GitHub

The Agent Gateway is a standalone REST API service wrapping the Claude Agent SDK. It exposes Claude Code's agentic capabilities (Bash, Read, Write, Edit, Glob, Grep, WebSearch, WebFetch) over HTTP with NDJSON streaming, session management, workspace file CRUD, and automatic retry with exponential backoff.

Contents

  1. Authentication
  2. API Overview
  3. Health Check
  4. Query (Run Agent)
  5. Query Events (Replay)
  6. Sessions
  7. Settings
  8. Logging
  9. SSH Keys
  10. Git Clone
  11. Git Pull
  12. Git Status
  13. Anthropic Auth: Status
  14. Anthropic Auth: Login
  15. Anthropic Auth: Submit Code
  16. Workspace (Memory, Agents, Skills)
  17. Tool Registry
  18. NDJSON Event Reference
  19. Session Management
  20. Deployment Guide
  21. Architecture

Authentication

All endpoints except /health require a Bearer token. API keys are configured via the API_KEYS environment variable as comma-separated label:secret pairs.

# .env
API_KEYS=myapp:sk-gw-abc123,cicd:sk-gw-def456

Send the secret as a Bearer token in the Authorization header:

Authorization: Bearer sk-gw-abc123

The label (e.g., myapp) appears in server logs for audit purposes. On invalid or missing credentials, the API returns 401:

{"error": "Missing or malformed Authorization header"}
{"error": "Invalid API key"}
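
The key format and the two 401 responses above can be illustrated with a small sketch. This is not the gateway's actual middleware; the function names parse_api_keys and authenticate are hypothetical, and the mapping of each error message to a specific failure case is an assumption based on the wording of the messages.

```python
import hmac


def parse_api_keys(raw):
    """Parse 'label:secret,label:secret' into a {secret: label} dict."""
    keys = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue
        label, sep, secret = pair.partition(":")
        if not sep or not label or not secret:
            raise ValueError(f"malformed API_KEYS entry: {pair!r}")
        keys[secret] = label
    return keys


def authenticate(header, keys):
    """Return (status, payload) for an Authorization header value."""
    if not header or not header.startswith("Bearer "):
        return 401, {"error": "Missing or malformed Authorization header"}
    token = header[len("Bearer "):].strip()
    for secret, label in keys.items():
        # Constant-time comparison avoids leaking how much of a key matched.
        if hmac.compare_digest(token.encode(), secret.encode()):
            return 200, {"label": label}
    return 401, {"error": "Invalid API key"}


keys = parse_api_keys("myapp:sk-gw-abc123,cicd:sk-gw-def456")
print(authenticate("Bearer sk-gw-abc123", keys))
```

The label returned on success is the value that would appear in the audit log.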

API Overview

Method  Path                         Auth  Description
GET     /health                      No    Health check
POST    /v1/query                    Yes   Run agent query (NDJSON stream)
GET     /v1/query/:queryId/events    Yes   Replay/resume event stream
GET     /v1/sessions                 Yes   List active sessions
DELETE  /v1/sessions/:id             Yes   Delete a session
GET     /v1/settings                 Yes   Get session settings
PUT     /v1/settings                 Yes   Update session settings
GET     /v1/logging                  Yes   Get log level
PUT     /v1/logging                  Yes   Set log level
POST    /v1/ssh-keys                 Yes   Upload SSH keys
POST    /v1/workspace/git/clone      Yes   Clone a git repo (or pull if exists)
POST    /v1/workspace/git/pull       Yes   Pull latest changes
GET     /v1/workspace/git/status     Yes   Get repo status (branch, commit, dirty)
GET     /v1/auth/status              Yes   Check Anthropic login status
POST    /v1/auth/login               Yes   Start Anthropic OAuth flow
POST    /v1/auth/submit-code         Yes   Submit OAuth authorization code
GET     /v1/memory                   Yes   List memory files
GET     /v1/memory/*                 Yes   Read memory file
PUT     /v1/memory/*                 Yes   Write memory file
DELETE  /v1/memory/*                 Yes   Delete memory file
GET     /v1/agents                   Yes   List agent files
GET     /v1/agents/*                 Yes   Read agent file
PUT     /v1/agents/*                 Yes   Write agent file
DELETE  /v1/agents/*                 Yes   Delete agent file
GET     /v1/skills                   Yes   List skill files
GET     /v1/skills/*                 Yes   Read skill file
PUT     /v1/skills/*                 Yes   Write skill file
DELETE  /v1/skills/*                 Yes   Delete skill file
PUT     /v1/tools/:name              Yes   Register/update a webhook tool
GET     /v1/tools                    Yes   List all registered tools
GET     /v1/tools/:name              Yes   Get a single tool
DELETE  /v1/tools/:name              Yes   Delete a tool

Health Check

GET /health No Auth

Returns server status, version, uptime, and active session count.

Response

{
  "status": "ok",
  "version": "0.1.0",
  "uptime": 3600,
  "sessions": 2
}

Example

curl http://localhost:3001/health

Query (Run Agent)

POST /v1/query Auth Required

Runs a Claude agent query. Returns an NDJSON stream (application/x-ndjson) of events as the agent works. The connection stays open until the query completes or the client disconnects.

Request Body

Field            Type      Required  Description
queryId          string    Yes       Unique ID for this query (client-generated UUID)
prompt           string    Yes       The prompt to send to the agent
sessionId        string    No        Session ID for conversation continuity
systemPrompt     string    No        System prompt to configure agent behavior
model            string    No        Claude model to use (e.g., claude-sonnet-4-20250514)
allowedTools     string[]  No        Restrict tools (default: all 8 tools)
useSession       boolean   No        Set to false to skip session management
sshTarget        string    No        SSH target for context (informational)
user_id          string    No        User identifier passed to tool webhook context
conversation_id  string    No        Conversation identifier passed to tool webhook context

Response (NDJSON stream)

Each line is a JSON object with a seq (sequence number) and type. See NDJSON Event Reference for all event types.

{"seq":0,"type":"text","content":"Let me check the disk usage..."}
{"seq":1,"type":"tool_use","toolName":"Bash","toolUseId":"tu_abc","input":"df -h","startedAt":1711461600000}
{"seq":2,"type":"tool_result","toolName":"Bash","toolUseId":"tu_abc","output":"/dev/sda1  50G  35G  15G  70% /","durationMs":245}
{"seq":3,"type":"text","content":"The disk is at 70% usage."}
{"seq":4,"type":"done","inputTokens":1250,"outputTokens":340,"costUsd":0.0087,"sessionId":"abc-123","context":{"usedTokens":1590,"contextWindow":200000,"percentUsed":0.8,"cacheReadTokens":0,"cacheCreationTokens":0}}

Example

curl -N -X POST http://localhost:3001/v1/query \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "queryId": "q-001",
    "prompt": "What files are in the current directory?",
    "sessionId": "session-1",
    "systemPrompt": "You are a helpful assistant."
  }'
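
A client typically consumes this stream one line at a time. The sketch below shows one way to do that in Python, assuming the lines have already been read from the HTTP response; iter_events and summarize are hypothetical helper names, not part of the gateway.

```python
import json


def iter_events(lines):
    """Yield one dict per non-empty NDJSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def summarize(lines):
    """Collect assistant text and return it with the terminal event."""
    text_parts, final = [], None
    for event in iter_events(lines):
        if event["type"] == "text":
            text_parts.append(event["content"])
        elif event["type"] in ("done", "error"):
            # done and error are always the last event of a stream.
            final = event
            break
    return "".join(text_parts), final


stream = [
    '{"seq":0,"type":"text","content":"Let me check the disk usage..."}',
    '{"seq":1,"type":"tool_use","toolName":"Bash","toolUseId":"tu_abc","input":"df -h","startedAt":1711461600000}',
    '{"seq":2,"type":"done","inputTokens":1250,"outputTokens":340,"costUsd":0.0087,"sessionId":"abc-123"}',
]
text, final = summarize(stream)
print(text, final["type"])
```

Events of other types (tool_use, tool_result, rate_limited, and so on) are skipped here; a real client would usually render them as progress indicators.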

Query Events (Replay)

GET /v1/query/:queryId/events Auth Required

Replays cached events for a query. If the query is still running, the connection stays open and streams new events in real time. Supports resuming from a specific sequence number.

Query Parameters

Param  Type     Default  Description
after  integer  -1       Only return events with seq > after

Response

NDJSON stream identical to POST /v1/query. Returns 404 if the queryId is not found or has expired from the cache.

Example

# Replay all events
curl -N http://localhost:3001/v1/query/q-001/events \
  -H "Authorization: Bearer sk-gw-abc123"

# Resume from sequence 5
curl -N "http://localhost:3001/v1/query/q-001/events?after=5" \
  -H "Authorization: Bearer sk-gw-abc123"
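
To resume after a dropped connection, a client only needs to remember the highest seq it has processed. The following is a minimal sketch of that bookkeeping; events_url and StreamCursor are hypothetical names introduced for illustration.

```python
from urllib.parse import quote, urlencode


def events_url(base_url, query_id, last_seq=None):
    """Build the replay URL, optionally resuming after last_seq."""
    url = f"{base_url.rstrip('/')}/v1/query/{quote(query_id, safe='')}/events"
    if last_seq is not None:
        url += "?" + urlencode({"after": last_seq})
    return url


class StreamCursor:
    """Tracks the highest seq seen so a reconnect requests only newer events."""

    def __init__(self):
        self.last_seq = -1  # matches the documented default for `after`

    def accept(self, event):
        """Return True for a new event, False for one already processed."""
        if event["seq"] <= self.last_seq:
            return False
        self.last_seq = event["seq"]
        return True


cursor = StreamCursor()
for event in ({"seq": 0}, {"seq": 1}, {"seq": 1}):
    cursor.accept(event)
print(events_url("http://localhost:3001", "q-001", cursor.last_seq))
```

Because the server returns only events with seq greater than after, passing the cursor's last_seq avoids reprocessing events the client has already handled.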

Sessions

GET /v1/sessions Auth Required

Lists all active sessions with their model and last-used timestamp.

Response

{
  "sessions": [
    {
      "id": "session-1",
      "model": "claude-sonnet-4-20250514",
      "lastUsed": 1711461600000
    }
  ],
  "count": 1
}

Example

curl http://localhost:3001/v1/sessions \
  -H "Authorization: Bearer sk-gw-abc123"

DELETE /v1/sessions/:id Auth Required

Deletes a session by ID. Returns 404 if the session does not exist.

Response

{"deleted": true}

Example

curl -X DELETE http://localhost:3001/v1/sessions/session-1 \
  -H "Authorization: Bearer sk-gw-abc123"

Settings

GET /v1/settings Auth Required

Returns current session settings.

Response

{"sessionIdleTimeoutMs": 0}

Example

curl http://localhost:3001/v1/settings \
  -H "Authorization: Bearer sk-gw-abc123"

PUT /v1/settings Auth Required

Updates session settings. Currently supports sessionIdleTimeoutMs.

Sessions

A session is a mapping from a client-provided sessionId to a Claude Agent SDK conversation session. Sessions maintain conversation context — follow-up queries sent with the same sessionId have access to all previous messages in that conversation. Sessions are persisted to SESSION_PERSIST_PATH and survive server restarts.

sessionIdleTimeoutMs

Default: 0 (disabled; sessions persist indefinitely)
Cleanup granularity: 5 minutes. A background timer runs every 5 minutes and evicts sessions that have been idle longer than the configured timeout, so a session may exist for up to 5 minutes past its timeout.

When set to 0 (the default), no cleanup runs and every unique sessionId stays in memory and on disk forever. This is convenient for long-lived conversational agents but means memory usage grows unboundedly if many distinct session IDs are used.

When set to a value greater than 0, the cleanup timer is activated. Any session that has not received a query within the configured duration is removed from memory and from disk. The next query that arrives with that sessionId starts a fresh session — all previous conversation context is lost.

Request Body

{"sessionIdleTimeoutMs": 3600000}

Response

{"sessionIdleTimeoutMs": 3600000}

Example

# Set 1-hour idle timeout (sessions idle for >1 h are evicted; check runs every 5 min)
curl -X PUT http://localhost:3001/v1/settings \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{"sessionIdleTimeoutMs": 3600000}'

# Disable cleanup — sessions live forever (default)
curl -X PUT http://localhost:3001/v1/settings \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{"sessionIdleTimeoutMs": 0}'
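
The eviction rule described for sessionIdleTimeoutMs can be sketched as follows. This is an illustration under stated assumptions, not the gateway's code: the function names are hypothetical, sessions are modeled as a dict keyed by session ID with a lastUsed timestamp in milliseconds, and "idle longer than" is read as a strict comparison.

```python
CLEANUP_INTERVAL_MS = 5 * 60 * 1000  # the background timer runs every 5 minutes


def evict_idle(sessions, now_ms, timeout_ms):
    """Return the IDs of sessions idle longer than timeout_ms (0 disables cleanup)."""
    if timeout_ms <= 0:
        return []
    return [
        sid for sid, session in sessions.items()
        if now_ms - session["lastUsed"] > timeout_ms
    ]


def worst_case_lifetime_ms(timeout_ms):
    """Longest an idle session can survive: its timeout plus one timer interval."""
    return timeout_ms + CLEANUP_INTERVAL_MS


sessions = {"a": {"lastUsed": 0}, "b": {"lastUsed": 3_000_000}}
print(evict_idle(sessions, now_ms=3_600_001, timeout_ms=3_600_000))
print(worst_case_lifetime_ms(3_600_000))
```

With a 1-hour timeout, an idle session is therefore removed somewhere between 60 and 65 minutes after its last query.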

Logging

GET /v1/logging Auth Required

Returns the current log level.

Response

{"level": "info"}

Example

curl http://localhost:3001/v1/logging \
  -H "Authorization: Bearer sk-gw-abc123"

PUT /v1/logging Auth Required

Changes the log level at runtime. Valid levels: off, info, debug.

Request Body

{"level": "debug"}

Response

{"level": "debug", "previous": "info"}

Example

curl -X PUT http://localhost:3001/v1/logging \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{"level": "debug"}'

SSH Keys

POST /v1/ssh-keys Auth Required

Uploads an SSH key pair to the gateway container. The agent can then use SSH to connect to remote servers via Bash tool calls. Keys are written to ~/.ssh/ with proper permissions (600 for private, 644 for public). An SSH config is auto-generated with StrictHostKeyChecking accept-new.

Request Body

Field       Type    Required  Description
privateKey  string  Yes       PEM-encoded private key
publicKey   string  No        Public key (derived from private if omitted)
filename    string  No        Key filename (default: id_ed25519, alphanumeric + _- only)

Response

{
  "status": "ok",
  "message": "SSH keys updated",
  "publicKey": "ssh-ed25519 AAAA... user@host"
}

Example

curl -X POST http://localhost:3001/v1/ssh-keys \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "privateKey": "-----BEGIN OPENSSH PRIVATE KEY-----\n...\n-----END OPENSSH PRIVATE KEY-----",
    "filename": "id_ed25519"
  }'
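
The filename restriction (alphanumeric characters plus _ and -) keeps a request from writing outside ~/.ssh/. A client can apply the same check before uploading; the sketch below is one way to express it, with key_filename as a hypothetical helper name and the exact pattern assumed from the field description.

```python
import re

# Assumed pattern: one or more ASCII letters, digits, underscores, or hyphens.
_KEY_FILENAME = re.compile(r"[A-Za-z0-9_-]+")

# Permissions documented for files written to ~/.ssh/.
KEY_MODES = {"private": 0o600, "public": 0o644}


def key_filename(requested=None):
    """Return the filename to use, falling back to the documented default."""
    name = requested or "id_ed25519"
    if not _KEY_FILENAME.fullmatch(name):
        raise ValueError(f"invalid key filename: {name!r}")
    return name


print(key_filename())
print(key_filename("deploy_key-1"))
```

Names containing a slash or a dot, such as ../evil, are rejected because those characters fall outside the allowed set.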

Git Clone

POST /v1/workspace/git/clone Auth Required

Clone a git repository into the workspace projects directory. If the target path already exists with a .git directory, performs a git pull instead.

Request Body

{
  "url": "git@github.com:org/repo.git",   // Required: repository URL
  "path": "my-project",                    // Required: relative path under projects/
  "branch": "main",                        // Optional: branch to clone
  "sshKey": "-----BEGIN OPENSSH..."        // Optional: SSH private key for auth
}

Response (200)

{
  "status": "cloned",    // or "pulled" if target already existed
  "path": "my-project",
  "branch": "main",
  "commit": "a1b2c3d"
}

Security

SSH keys are written to a temp file with 0600 permissions and deleted immediately after the git operation, so a key supplied in the request is not retained on disk once the request completes.

Example

curl -X POST http://localhost:3001/v1/workspace/git/clone \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "git@github.com:org/repo.git",
    "path": "my-project",
    "branch": "main"
  }'

Git Pull

POST /v1/workspace/git/pull Auth Required

Pull latest changes for an existing repository.

Request Body

{
  "path": "my-project",                    // Required: relative path under projects/
  "sshKey": "-----BEGIN OPENSSH..."        // Optional: SSH private key for auth
}

Response (200)

{
  "status": "updated",   // or "up-to-date"
  "branch": "main",
  "commit": "a1b2c3d"
}

Example

curl -X POST http://localhost:3001/v1/workspace/git/pull \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{"path": "my-project"}'

Git Status

GET /v1/workspace/git/status?path=my-project Auth Required

Get the git status of a project directory.

Query Parameters

path (required): relative path under projects/

Response (200) — repo exists

{
  "exists": true,
  "branch": "main",
  "commit": "a1b2c3d",
  "dirty": false,
  "lastCommitDate": "2026-03-29T10:00:00+02:00"
}

Response (200) — repo not found

{"exists": false}

Example

curl "http://localhost:3001/v1/workspace/git/status?path=my-project" \
  -H "Authorization: Bearer sk-gw-abc123"

Anthropic Auth: Status

GET /v1/auth/status Auth Required

Checks whether the Claude CLI inside the container has a valid Anthropic login. This is separate from the gateway's API key auth: it determines whether the SDK can make API calls to Anthropic.

Response (logged in)

{"loggedIn": true, "email": "user@example.com"}

Response (not logged in)

{"loggedIn": false}

Example

curl http://localhost:3001/v1/auth/status \
  -H "Authorization: Bearer sk-gw-abc123"

Anthropic Auth: Login

POST /v1/auth/login Auth Required

Initiates the Anthropic OAuth login flow. Starts a Claude CLI session inside tmux, navigates through the login prompts, and returns the OAuth authorization URL. The user must open this URL in a browser, authorize, and copy the authorization code.

Response

{"url": "https://console.anthropic.com/oauth/authorize?..."}

Error (timeout)

{"error": "Timeout waiting for auth URL"}

Example

curl -X POST http://localhost:3001/v1/auth/login \
  -H "Authorization: Bearer sk-gw-abc123"

Anthropic Auth: Submit Code

POST /v1/auth/submit-code Auth Required

Submits the OAuth authorization code obtained from the browser. The gateway sends it to the running Claude CLI tmux session and polls for login success. Requires a prior call to POST /v1/auth/login.

Request Body

Field  Type    Required  Description
code   string  Yes       OAuth authorization code from the browser

Response (success)

{"success": true, "loggedIn": true, "email": "user@example.com"}

Error

{"error": "No auth login session running. Call POST /v1/auth/login first."}

Example

curl -X POST http://localhost:3001/v1/auth/submit-code \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{"code": "ey..."}'

Workspace (Memory, Agents, Skills)

The gateway provides CRUD endpoints for three workspace sections: memory, agents, and skills. Files are stored under $WORKSPACE_ROOT (default: $HOME/.claude). All paths are protected against directory traversal attacks.
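
Directory traversal protection generally means resolving the requested path and confirming it still lies under the section directory. The sketch below shows that technique in Python; it is not the gateway's implementation, and resolve_workspace_path is a hypothetical name.

```python
from pathlib import Path

SECTIONS = {"memory", "agents", "skills"}


def resolve_workspace_path(root, section, rel_path):
    """Resolve rel_path inside root/section, rejecting anything that escapes it."""
    if section not in SECTIONS:
        raise ValueError(f"unknown section: {section!r}")
    base = (Path(root) / section).resolve()
    # resolve() collapses '..' segments; an absolute rel_path replaces base entirely.
    target = (base / rel_path).resolve()
    if target == base or not target.is_relative_to(base):
        raise PermissionError(f"path escapes workspace: {rel_path!r}")
    return target


print(resolve_workspace_path("/tmp/ws", "memory", "project/notes.md").name)
```

Path.is_relative_to requires Python 3.9 or later. A request such as ../../etc/passwd resolves outside the section directory and is rejected before any file access happens.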

The following operations are available for each section (memory, agents, skills):

GET /v1/{section} Auth Required

Lists all files in the section. Returns a recursive file listing with path, size, and modification time.

Response

{
  "files": [
    {
      "path": "MEMORY.md",
      "size": 1234,
      "modified": "2026-03-26T10:00:00.000Z"
    },
    {
      "path": "project/notes.md",
      "size": 567,
      "modified": "2026-03-25T15:30:00.000Z"
    }
  ]
}

Example

curl http://localhost:3001/v1/memory \
  -H "Authorization: Bearer sk-gw-abc123"

GET /v1/{section}/{path} Auth Required

Reads a file. Returns application/json for .json files, text/plain for all others.

Example

curl http://localhost:3001/v1/memory/MEMORY.md \
  -H "Authorization: Bearer sk-gw-abc123"

PUT /v1/{section}/{path} Auth Required

Creates or overwrites a file. Parent directories are created automatically. Send the file content as the request body (JSON or plain text).

Response

{"status": "ok", "path": "memory/MEMORY.md"}

Example

# Write plain text
curl -X PUT http://localhost:3001/v1/memory/MEMORY.md \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: text/plain" \
  -d '# Agent Memory

This server manages web infrastructure.'

# Write JSON
curl -X PUT http://localhost:3001/v1/agents/config.json \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{"name": "server-admin", "model": "claude-sonnet-4-20250514"}'

DELETE /v1/{section}/{path} Auth Required

Deletes a file. Returns 404 if the file does not exist.

Response

{"status": "ok"}

Example

curl -X DELETE http://localhost:3001/v1/memory/old-notes.md \
  -H "Authorization: Bearer sk-gw-abc123"

Tool Registry

The Tool Registry allows external tools to be registered and made available to the Claude agent. Each tool defines a webhook_url that is called when the agent invokes the tool. Registered tools are wrapped as in-process MCP servers and injected into the Claude Agent SDK alongside the built-in tools (Bash, Read, Write, etc.).

Tools persist to TOOLS_PERSIST_PATH (default: /home/node/.claude/tools.json) and survive server restarts.

PUT /v1/tools/:name Auth Required

Registers a new tool or updates an existing one. Returns 201 for new tools, 200 for updates.

Request Body

Field         Type    Required  Description
description   string  Yes       Human-readable description of what the tool does
input_schema  object  Yes       JSON Schema describing the tool's input parameters (must have a type field)
webhook_url   string  Yes       URL to POST to when the agent invokes this tool
timeout_ms    number  No        Webhook timeout in milliseconds (default: 30000)

Response (201 Created / 200 OK)

{
  "name": "lookup-customer",
  "description": "Look up a customer by email address",
  "input_schema": {
    "type": "object",
    "properties": {
      "email": { "type": "string", "description": "Customer email" }
    },
    "required": ["email"]
  },
  "webhook_url": "https://api.example.com/tools/lookup-customer",
  "timeout_ms": 30000
}

Example

curl -X PUT http://localhost:3001/v1/tools/lookup-customer \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Look up a customer by email address",
    "input_schema": {
      "type": "object",
      "properties": {
        "email": { "type": "string", "description": "Customer email" }
      },
      "required": ["email"]
    },
    "webhook_url": "https://api.example.com/tools/lookup-customer",
    "timeout_ms": 10000
  }'
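
Before sending a registration request, a client can check the body against the documented field rules. The following sketch does that; validate_tool and normalize_tool are hypothetical names, and the error strings are illustrative rather than the gateway's actual messages.

```python
DEFAULT_TIMEOUT_MS = 30000  # documented default for timeout_ms


def validate_tool(body):
    """Return a list of problems with a tool registration body (empty if valid)."""
    errors = []
    if not isinstance(body.get("description"), str) or not body["description"]:
        errors.append("description is required")
    schema = body.get("input_schema")
    if not isinstance(schema, dict) or "type" not in schema:
        errors.append("input_schema must be an object with a type field")
    if not isinstance(body.get("webhook_url"), str) or not body["webhook_url"]:
        errors.append("webhook_url is required")
    return errors


def normalize_tool(name, body):
    """Return the stored form of a tool, applying the default timeout."""
    errors = validate_tool(body)
    if errors:
        raise ValueError("; ".join(errors))
    return {
        "name": name,
        "description": body["description"],
        "input_schema": body["input_schema"],
        "webhook_url": body["webhook_url"],
        "timeout_ms": body.get("timeout_ms", DEFAULT_TIMEOUT_MS),
    }


tool = normalize_tool("lookup-customer", {
    "description": "Look up a customer by email address",
    "input_schema": {"type": "object", "properties": {"email": {"type": "string"}}},
    "webhook_url": "https://api.example.com/tools/lookup-customer",
})
print(tool["timeout_ms"])
```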

GET /v1/tools Auth Required

Lists all registered tools.

Response

{
  "tools": [
    {
      "name": "lookup-customer",
      "description": "Look up a customer by email address",
      "input_schema": { "type": "object", "properties": { "email": { "type": "string" } } },
      "webhook_url": "https://api.example.com/tools/lookup-customer",
      "timeout_ms": 10000
    }
  ]
}

Example

curl http://localhost:3001/v1/tools \
  -H "Authorization: Bearer sk-gw-abc123"

GET /v1/tools/:name Auth Required

Returns a single tool definition. Returns 404 if the tool does not exist.

Example

curl http://localhost:3001/v1/tools/lookup-customer \
  -H "Authorization: Bearer sk-gw-abc123"

DELETE /v1/tools/:name Auth Required

Deletes a tool. Returns 204 on success, 404 if the tool does not exist.

Example

curl -X DELETE http://localhost:3001/v1/tools/lookup-customer \
  -H "Authorization: Bearer sk-gw-abc123"

Webhook Execution Flow

When the Claude agent invokes a registered tool during a query, the gateway POSTs to the tool's webhook_url:

1. Agent invokes tool

The Claude agent decides to use a registered tool (e.g., lookup-customer) based on its description and input schema.

2. Gateway POSTs to webhook

The MCP server handler sends a POST request to the tool's webhook_url with the following body:

{
  "tool_use_id": "tu_abc123",
  "tool_name": "lookup-customer",
  "input": { "email": "user@example.com" },
  "context": {
    "user_id": null,
    "conversation_id": null,
    "session_id": "session-1",
    "api_key_label": "myapp"
  }
}

The client's Bearer token is forwarded in the Authorization header.

3. Webhook responds

The external service processes the request and returns:

{
  "output": "Customer found: Jane Doe, Plan: Enterprise",
  "metadata": { "customer_id": 42 }
}

On error (timeout, HTTP error, network failure), an error result is returned to the agent, which can retry or use an alternative approach.

4. Agent continues

The tool result is passed back to the Claude agent as a tool_result event, and the agent continues its work.
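
On the receiving side, a webhook has to accept the request body shown in step 2 and return the shape shown in step 3. The sketch below implements that contract as a plain function, leaving the HTTP server wiring out; handle_tool_call and the CUSTOMERS table are hypothetical and exist only for illustration.

```python
# Hypothetical in-memory data standing in for a real customer database.
CUSTOMERS = {
    "user@example.com": {"id": 42, "name": "Jane Doe", "plan": "Enterprise"},
}


def handle_tool_call(body):
    """Map a gateway webhook request body to a webhook response body."""
    if body.get("tool_name") != "lookup-customer":
        return {"output": f"Unknown tool: {body.get('tool_name')}"}
    email = body.get("input", {}).get("email")
    customer = CUSTOMERS.get(email)
    if customer is None:
        # The agent sees this text and decides how to proceed.
        return {"output": f"No customer found for {email}"}
    return {
        "output": f"Customer found: {customer['name']}, Plan: {customer['plan']}",
        "metadata": {"customer_id": customer["id"]},
    }


request_body = {
    "tool_use_id": "tu_abc123",
    "tool_name": "lookup-customer",
    "input": {"email": "user@example.com"},
    "context": {"user_id": None, "conversation_id": None,
                "session_id": "session-1", "api_key_label": "myapp"},
}
print(handle_tool_call(request_body))
```

A real handler would also want to check the forwarded Authorization header and respond within the tool's timeout_ms.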

NDJSON Event Reference

All events include a seq field (incrementing integer starting at 0) and a type field. Events are streamed as newline-delimited JSON (application/x-ndjson).

text

Assistant text output (streamed incrementally).

{"seq": 0, "type": "text", "content": "Let me check..."}

tool_use

Agent is invoking a tool. input is a human-readable summary (command for Bash, file path for Read/Write/Edit, pattern for Glob/Grep).

{"seq": 1, "type": "tool_use", "toolName": "Bash", "toolUseId": "tu_abc123", "input": "df -h /var", "startedAt": 1711461600000}

tool_result

Tool execution completed. Output is truncated to 3000 characters. durationMs is null if timing data is unavailable.

Note: There is no success field. The gateway reports tool completion, not tool success/failure. The AI agent internally decides whether a tool result represents success or an error and acts accordingly. Consumers should treat a tool_result event as "tool completed" (e.g., show a checkmark), regardless of the output content. If the output contains an error message, the AI will handle the retry or fallback logic itself.

{"seq": 2, "type": "tool_result", "toolName": "Bash", "toolUseId": "tu_abc123", "output": "/dev/sda1  50G  35G  15G  70% /var", "durationMs": 245}
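
Since tool_use and tool_result share a toolUseId, a consumer can pair them to show which calls are still running and which have completed. A minimal sketch, with pair_tool_calls as a hypothetical helper name:

```python
def pair_tool_calls(events):
    """Group tool_use and tool_result events by toolUseId."""
    calls = {}
    for event in events:
        if event["type"] == "tool_use":
            calls[event["toolUseId"]] = {
                "toolName": event["toolName"],
                "input": event["input"],
                "status": "running",
                "output": None,
                "durationMs": None,
            }
        elif event["type"] == "tool_result":
            call = calls.setdefault(
                event["toolUseId"],
                {"toolName": event["toolName"], "input": None},
            )
            # "completed" means the tool finished, not that it succeeded.
            call.update(
                status="completed",
                output=event["output"],
                durationMs=event.get("durationMs"),
            )
    return calls


events = [
    {"seq": 1, "type": "tool_use", "toolName": "Bash", "toolUseId": "tu_abc123",
     "input": "df -h /var", "startedAt": 1711461600000},
    {"seq": 2, "type": "tool_result", "toolName": "Bash", "toolUseId": "tu_abc123",
     "output": "/dev/sda1  50G  35G  15G  70% /var", "durationMs": 245},
]
print(pair_tool_calls(events)["tu_abc123"]["status"])
```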

rate_limited

Rate limit detected. The gateway will retry automatically. Emitted both when the retry layer detects an error and when the SDK stream reports a rate limit event.

// Retry layer (error-based)
{"seq": 3, "type": "rate_limited", "status": "retrying", "attempt": 1, "waitMs": 1000}

// SDK stream event
{"seq": 3, "type": "rate_limited", "status": "waiting", "retryAfterMs": 5000}

sdk_status

Claude SDK status change. Currently emitted when context compaction starts ("compacting") and ends (null).

{"seq": 4, "type": "sdk_status", "status": "compacting"}
{"seq": 5, "type": "sdk_status", "status": null}

sdk_compact_complete

Context compaction completed. Reports the trigger reason and pre-compaction token count.

{"seq": 6, "type": "sdk_compact_complete", "trigger": "auto", "preTokens": 180000}

done

Query completed successfully. Includes token usage, cost, and context window statistics. Always the last event on success.

{
  "seq": 7,
  "type": "done",
  "inputTokens": 1250,
  "outputTokens": 340,
  "costUsd": 0.0087,
  "sessionId": "abc-123-def",
  "context": {
    "usedTokens": 1590,
    "contextWindow": 200000,
    "percentUsed": 0.8,
    "cacheReadTokens": 500,
    "cacheCreationTokens": 200
  }
}

error

Query failed with an error. Always the last event on failure.

{"seq": 3, "type": "error", "content": "Connection refused: ssh root@example.com"}

Session Management

Sessions enable multi-turn conversations with the Claude agent. A client provides a sessionId with each query, and the gateway maps it to an internal Claude SDK session UUID.

How Sessions Work

1. Client sends query with sessionId

Client includes a stable sessionId (e.g., "chat-user-42") in the POST /v1/query request body.

2. Gateway resolves session

If a session exists with the same ID, systemPrompt, and model, the gateway resumes the existing Claude session using the SDK's resume option with the stored SDK session ID. If the systemPrompt or model has changed, it creates a new Claude session under the same client ID using the sessionId option. After each query completes, the gateway syncs the stored SDK session ID with the actual session_id returned by the SDK, so that subsequent queries resume the correct conversation.

3. Session persists across restarts

Sessions are persisted to disk (debounced write to SESSION_PERSIST_PATH). On restart, sessions are restored and expired ones filtered out.

4. Idle cleanup (optional)

If SESSION_IDLE_TIMEOUT_MS > 0, a background timer runs every 5 minutes and removes sessions that have been idle longer than the timeout.
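
The resolve-then-sync behavior in step 2 can be sketched with a plain dict as the session store. This is an illustration only: resolve_session, sync_after_query, and the store layout are hypothetical, and persistence to SESSION_PERSIST_PATH is left out.

```python
def resolve_session(store, session_id, system_prompt, model):
    """Decide whether to resume the stored SDK session or start a new one."""
    entry = store.get(session_id)
    if (
        entry
        and entry["sdkSessionId"]
        and entry["systemPrompt"] == system_prompt
        and entry["model"] == model
    ):
        return {"action": "resume", "sdkSessionId": entry["sdkSessionId"]}
    # First query, or systemPrompt/model changed: start fresh under the same client ID.
    store[session_id] = {
        "systemPrompt": system_prompt,
        "model": model,
        "sdkSessionId": None,
        "lastUsed": None,
    }
    return {"action": "new", "sdkSessionId": None}


def sync_after_query(store, session_id, sdk_session_id, now_ms):
    """Record the session_id the SDK actually returned, for the next resume."""
    store[session_id]["sdkSessionId"] = sdk_session_id
    store[session_id]["lastUsed"] = now_ms


store = {}
print(resolve_session(store, "chat-user-42", "You are helpful.", "claude-sonnet-4-20250514"))
sync_after_query(store, "chat-user-42", "sdk-1", 1711461600000)
print(resolve_session(store, "chat-user-42", "You are helpful.", "claude-sonnet-4-20250514"))
```

Changing the model or systemPrompt on a follow-up query therefore starts a new conversation even though the client-facing sessionId stays the same.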

Stateless Queries

Set "useSession": false in the request body to skip session management entirely. A random session ID is generated for the query, and no session state is stored.

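Managing Sessions via API

Use GET /v1/sessions to list active sessions and DELETE /v1/sessions/:id to remove one (see Sessions). The idle timeout can be changed at runtime with PUT /v1/settings (see Settings).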

Deployment Guide

Standalone Docker Compose

The simplest way to run the Agent Gateway:

# Clone and configure
git clone https://github.com/diem2001/agent-gateway.git
cd agent-gateway
cp .env.example .env

# Edit .env — set at minimum:
#   ANTHROPIC_API_KEY=sk-ant-...
#   API_KEYS=myapp:your-secret-key

# Build and start
docker compose up -d --build

# Verify
curl http://localhost:3001/health

Environment Variables

Variable                 Default                           Description
ANTHROPIC_API_KEY        --                                Anthropic API key (or use OAuth via POST /v1/auth/login)
API_KEYS                 default:changeme                  Client auth keys (label:secret,...)
LOG_LEVEL                info                              Logging level (off, info, debug)
SESSION_IDLE_TIMEOUT_MS  0                                 Auto-expire idle sessions (0 = disabled)
PORT                     3001                              HTTP listen port
HOST                     0.0.0.0                           Bind address
SESSION_PERSIST_PATH     /home/node/.claude/sessions.json  Session persistence file (inside agent_home bind-mount)
EVENT_CACHE_TTL_MS       1800000                           Query event cache TTL (30 min)
WORKSPACE_ROOT           $HOME/.claude                     Root dir for memory/agents/skills
TOOLS_PERSIST_PATH       /home/node/.claude/tools.json     Tool registry persistence file

Docker Architecture

[Diagram: Container Architecture. docker compose publishes external port 3001 to internal port 3001 of the agent-gateway container (Node.js 22, Express 5 + Claude Agent SDK), which provides POST /v1/query (NDJSON stream), session management, workspace CRUD, the Tool Registry, and an SSH client plus tmux for the auth flow.]

What the Container Includes

The Docker image is based on node:22-bookworm and includes the system tools required by the Claude Agent SDK and the gateway, such as an SSH client and tmux (used by the auth flow).

Entrypoint Behavior

The container starts as root and performs setup before dropping to the node user:

  1. Creates workspace directories (~/.claude/memory, ~/.claude/agents, ~/.claude/skills, ~/.ssh, ~/.local/bin)
  2. Writes default Claude settings with broad tool permissions (if not already present)
  3. Installs Claude Code CLI via curl install script (if not already installed)
  4. Ensures PATH includes ~/.local/bin in .bashrc
  5. Restores SSH key config from bind-mount if keys exist in ~/.ssh/
  6. Drops to node user via gosu and starts the Express server

Persistent Data

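The agent_home directory is bind-mounted at /home/node, so everything under the node user's home directory persists across container restarts. This includes ~/.claude (sessions.json, tools.json, and the memory, agents, and skills directories), ~/.ssh, and ~/.local/bin.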

Embedding in an Existing Stack

To add the Agent Gateway as a service in an existing docker-compose.yml:

services:
  agent-gateway:
    build: ./path/to/agent-gateway
    container_name: agent-gateway
    ports:
      - "127.0.0.1:3001:3001"
    volumes:
      - ./agent_home:/home/node
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - API_KEYS=${API_KEYS:-default:changeme}
      - LOG_LEVEL=${LOG_LEVEL:-info}
      - SESSION_PERSIST_PATH=/home/node/.claude/sessions.json
      - TOOLS_PERSIST_PATH=/home/node/.claude/tools.json
      - SESSION_IDLE_TIMEOUT_MS=${SESSION_IDLE_TIMEOUT_MS:-0}
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:3001/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s
    restart: unless-stopped

Other services can reach the gateway at http://agent-gateway:3001 on the Docker network. Binding the published port to 127.0.0.1:3001 keeps it off public interfaces; use a reverse proxy (nginx, Traefik) for external access with SSL.

Reverse Proxy (nginx)

location /agent-gateway/ {
    proxy_pass http://127.0.0.1:3001/;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_buffering off;           # Required for NDJSON streaming
    proxy_read_timeout 300s;       # Agent queries can be long-running
}

Important: disable proxy buffering

NDJSON streaming requires proxy_buffering off in nginx. Without it, events are buffered and delivered in batches instead of real-time. The gateway also sets X-Accel-Buffering: no on streaming responses as a fallback.

Architecture

Request Flow

1. HTTP Request

Client sends POST /v1/query with Bearer token, queryId, prompt, and optional sessionId.

2. Auth Middleware

Validates API key against API_KEYS env var. Attaches client label for audit logging. Rejects with 401 if invalid.

3. Session Resolution

Looks up or creates a Claude SDK session. Reuses the existing session if systemPrompt and model match; creates a new one on the first query or when either has changed.

4. Agent SDK Execution

Runs query() from Claude Agent SDK with tools (Bash, Read, Write, Edit, Glob, Grep, WebSearch, WebFetch). Streams events via NDJSON as the agent works.

5. Retry Layer

On rate limits or empty responses: exponential backoff (1s, 2s, 4s), up to 3 retries, 60s total budget. Emits rate_limited events to the client.

6. Event Cache + Done

All events are cached by queryId for replay via GET /v1/query/:queryId/events. Cache entries expire after 30 minutes (configurable via EVENT_CACHE_TTL_MS). The final done event includes token usage and cost.
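
The retry policy from step 5 (1s, 2s, 4s, up to 3 retries, 60s total budget) can be sketched as a delay schedule. The names backoff_schedule and rate_limited_event are hypothetical, and counting only wait time against the budget is an assumption; the gateway may also count request time.

```python
def backoff_schedule(base_ms=1000, max_retries=3, budget_ms=60000):
    """Return the wait before each retry, doubling each time within the budget."""
    delays, total = [], 0
    for attempt in range(max_retries):
        delay = base_ms * 2 ** attempt
        if total + delay > budget_ms:
            break  # the next wait would exceed the total budget
        delays.append(delay)
        total += delay
    return delays


def rate_limited_event(seq, attempt, wait_ms):
    """Build the retry-layer event emitted to the client before each wait."""
    return {"seq": seq, "type": "rate_limited", "status": "retrying",
            "attempt": attempt, "waitMs": wait_ms}


for attempt, wait_ms in enumerate(backoff_schedule(), start=1):
    print(rate_limited_event(seq=attempt, attempt=attempt, wait_ms=wait_ms))
```

With the default values the three waits total 7 seconds, well inside the 60-second budget, so the budget only becomes the limiting factor if the base delay or retry count is raised.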

Component Diagram

Query Flow

  Client -> Auth: POST /v1/query
  Auth -> Sessions: validate Bearer
  Sessions -> Agent SDK: resolve/create
  Agent SDK -> System: built-in tool calls (Bash, Read, ...)
  System -> Agent SDK: tool output
  Agent SDK -> Webhook: registered tool call (via MCP server)
  Webhook -> Agent SDK: webhook response

Response Stream

  Agent SDK -> Client: NDJSON events (text, tool_use, tool_result, ...)
  Agent SDK -> Event Cache: cache events by queryId
