Agent Gateway API Reference

Last updated: 2026-04-03 · Epic: MVP-2453 · GitHub

The Agent Gateway is a standalone REST API service wrapping the Claude Agent SDK. It exposes Claude Code's agentic capabilities (Bash, Read, Write, Edit, Glob, Grep, WebSearch, WebFetch) over HTTP with NDJSON streaming, session management, workspace file CRUD, and automatic retry with exponential backoff.

Authentication

All endpoints except /health require a Bearer token. API keys are configured via the API_KEYS environment variable as comma-separated label:secret pairs.

# .env
API_KEYS=myapp:sk-gw-abc123,cicd:sk-gw-def456

Send the secret as a Bearer token in the Authorization header:

Authorization: Bearer sk-gw-abc123

The label (e.g., myapp) appears in server logs for audit purposes. On invalid or missing credentials, the API returns 401:

{"error": "Missing or malformed Authorization header"}
{"error": "Invalid API key"}
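The key format and the two 401 responses above can be sketched as a small validation routine. This is only an illustration of the documented behavior, not the gateway's actual middleware; the function names `parse_api_keys` and `authenticate` are hypothetical, and the constant-time comparison is an assumption about good practice rather than something the source states.

```python
import hmac


def parse_api_keys(raw: str) -> dict:
    """Parse 'label:secret,label:secret' into a {secret: label} mapping."""
    keys = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue
        label, sep, secret = pair.partition(":")
        if not sep or not label or not secret:
            # Do not echo the entry itself: it may contain a secret.
            raise ValueError("malformed API_KEYS entry")
        keys[secret] = label
    return keys


def authenticate(header, keys):
    """Return (status, body). On success the body carries the audit label."""
    if not header or not header.startswith("Bearer "):
        return 401, {"error": "Missing or malformed Authorization header"}
    token = header[len("Bearer "):].strip()
    for secret, label in keys.items():
        # Assumption: compare in constant time to avoid timing leaks.
        if hmac.compare_digest(token.encode(), secret.encode()):
            return 200, {"label": label}
    return 401, {"error": "Invalid API key"}
```

With the .env value shown earlier, `authenticate("Bearer sk-gw-abc123", keys)` resolves to the label myapp, which is the value that would appear in the audit log.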

API Overview

Method	Path	Auth	Description
GET	`/health`	No	Health check
POST	`/v1/query`	Yes	Run agent query (NDJSON stream)
GET	`/v1/query/:queryId/events`	Yes	Replay/resume event stream
GET	`/v1/sessions`	Yes	List active sessions
DELETE	`/v1/sessions/:id`	Yes	Delete a session
GET	`/v1/settings`	Yes	Get session settings
PUT	`/v1/settings`	Yes	Update session settings
GET	`/v1/logging`	Yes	Get log level
PUT	`/v1/logging`	Yes	Set log level
POST	`/v1/ssh-keys`	Yes	Upload SSH keys
POST	`/v1/workspace/git/clone`	Yes	Clone a git repo (or pull if exists)
POST	`/v1/workspace/git/pull`	Yes	Pull latest changes
GET	`/v1/workspace/git/status`	Yes	Get repo status (branch, commit, dirty)
GET	`/v1/auth/status`	Yes	Check Anthropic login status
POST	`/v1/auth/login`	Yes	Start Anthropic OAuth flow
POST	`/v1/auth/submit-code`	Yes	Submit OAuth authorization code
GET	`/v1/memory`	Yes	List memory files
GET	`/v1/memory/*`	Yes	Read memory file
PUT	`/v1/memory/*`	Yes	Write memory file
DELETE	`/v1/memory/*`	Yes	Delete memory file
GET	`/v1/agents`	Yes	List agent files
GET	`/v1/agents/*`	Yes	Read agent file
PUT	`/v1/agents/*`	Yes	Write agent file
DELETE	`/v1/agents/*`	Yes	Delete agent file
GET	`/v1/skills`	Yes	List skill files
GET	`/v1/skills/*`	Yes	Read skill file
PUT	`/v1/skills/*`	Yes	Write skill file
DELETE	`/v1/skills/*`	Yes	Delete skill file
PUT	`/v1/tools/:name`	Yes	Register/update a webhook tool
GET	`/v1/tools`	Yes	List all registered tools
GET	`/v1/tools/:name`	Yes	Get a single tool
DELETE	`/v1/tools/:name`	Yes	Delete a tool

Health Check

GET /health No Auth

Returns server status, version, uptime, and active session count.

Response

{
  "status": "ok",
  "version": "0.1.0",
  "uptime": 3600,
  "sessions": 2
}

Example

curl http://localhost:3001/health

Query (Run Agent)

POST /v1/query Auth Required

Runs a Claude agent query. Returns an NDJSON stream (application/x-ndjson) of events as the agent works. The connection stays open until the query completes or the client disconnects.

Request Body

Field	Type	Required	Description
`queryId`	string	Yes	Unique ID for this query (client-generated UUID)
`prompt`	string	Yes	The prompt to send to the agent
`sessionId`	string	No	Session ID for conversation continuity
`systemPrompt`	string	No	System prompt to configure agent behavior
`model`	string	No	Claude model to use (e.g., `claude-sonnet-4-20250514`)
`allowedTools`	string[]	No	Restrict tools (default: all 8 tools)
`useSession`	boolean	No	Set to `false` to skip session management
`sshTarget`	string	No	SSH target for context (informational)
`user_id`	string	No	User identifier passed to tool webhook context
`conversation_id`	string	No	Conversation identifier passed to tool webhook context

Response (NDJSON stream)

Each line is a JSON object with a seq (sequence number) and type. See NDJSON Event Reference for all event types.

{"seq":0,"type":"text","content":"Let me check the disk usage..."}
{"seq":1,"type":"tool_use","toolName":"Bash","toolUseId":"tu_abc","input":"df -h","startedAt":1711461600000}
{"seq":2,"type":"tool_result","toolName":"Bash","toolUseId":"tu_abc","output":"/dev/sda1  50G  35G  15G  70% /","durationMs":245}
{"seq":3,"type":"text","content":"The disk is at 70% usage."}
{"seq":4,"type":"done","inputTokens":1250,"outputTokens":340,"costUsd":0.0087,"sessionId":"abc-123","context":{"usedTokens":1590,"contextWindow":200000,"percentUsed":0.8,"cacheReadTokens":0,"cacheCreationTokens":0}}

Example

curl -N -X POST http://localhost:3001/v1/query \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "queryId": "q-001",
    "prompt": "What files are in the current directory?",
    "sessionId": "session-1",
    "systemPrompt": "You are a helpful assistant."
  }'
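For clients that are not using curl, consuming the stream comes down to parsing one JSON object per line until a terminal event arrives. The following is a minimal sketch under that assumption; `iter_ndjson` and `collect_text` are hypothetical helper names, and the input can be any iterable of lines.

```python
import json
from typing import Iterable, Iterator, Union


def iter_ndjson(lines: Iterable[Union[bytes, str]]) -> Iterator[dict]:
    """Yield one parsed event per non-empty line of an NDJSON stream."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        yield json.loads(line)


def collect_text(events: Iterable[dict]) -> str:
    """Concatenate assistant text until the terminal done/error event."""
    parts = []
    for event in events:
        if event["type"] == "text":
            parts.append(event["content"])
        elif event["type"] in ("done", "error"):
            break
    return "".join(parts)
```

If the HTTP client is the `requests` library, the iterator returned by `Response.iter_lines()` on a request made with `stream=True` can be passed straight to `iter_ndjson`.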

Query Events (Replay)

GET /v1/query/:queryId/events Auth Required

Replays cached events for a query. If the query is still running, the connection stays open and streams new events in real time. Supports resuming from a specific sequence number.

Query Parameters

Param	Type	Default	Description
`after`	integer	-1	Only return events with `seq > after`

Response

NDJSON stream identical to POST /v1/query. Returns 404 if the queryId is not found or has expired from the cache.

Example

# Replay all events
curl -N http://localhost:3001/v1/query/q-001/events \
  -H "Authorization: Bearer sk-gw-abc123"

# Resume after sequence 5 (returns only events with seq > 5)
curl -N "http://localhost:3001/v1/query/q-001/events?after=5" \
  -H "Authorization: Bearer sk-gw-abc123"
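A client that reconnects after a dropped connection needs to remember the last seq it processed and pass it as `after`. The sketch below shows one way to do that bookkeeping; `events_url` and `SeqTracker` are hypothetical names, and the duplicate check is a defensive assumption, not a documented requirement.

```python
from urllib.parse import quote, urlencode


def events_url(base_url: str, query_id: str, last_seq: int = -1) -> str:
    """Build the replay URL; omit 'after' when nothing has been seen yet."""
    path = f"{base_url.rstrip('/')}/v1/query/{quote(query_id, safe='')}/events"
    if last_seq < 0:
        return path
    return f"{path}?{urlencode({'after': last_seq})}"


class SeqTracker:
    """Remember the highest seq seen so a reconnect can resume after it."""

    def __init__(self) -> None:
        self.last_seq = -1

    def accept(self, event: dict) -> bool:
        """Return False for events that were already processed."""
        seq = event["seq"]
        if seq <= self.last_seq:
            return False
        self.last_seq = seq
        return True
```

On reconnect, `events_url(base, query_id, tracker.last_seq)` yields the URL to request next.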

Sessions

GET /v1/sessions Auth Required

Lists all active sessions with their model and last-used timestamp.

Response

{
  "sessions": [
    {
      "id": "session-1",
      "model": "claude-sonnet-4-20250514",
      "lastUsed": 1711461600000
    }
  ],
  "count": 1
}

Example

curl http://localhost:3001/v1/sessions \
  -H "Authorization: Bearer sk-gw-abc123"

DELETE /v1/sessions/:id Auth Required

Deletes a session by ID. Returns 404 if the session does not exist.

Response

{"deleted": true}

Example

curl -X DELETE http://localhost:3001/v1/sessions/session-1 \
  -H "Authorization: Bearer sk-gw-abc123"

Settings

GET /v1/settings Auth Required

Returns current session settings.

Response

{"sessionIdleTimeoutMs": 0}

Example

curl http://localhost:3001/v1/settings \
  -H "Authorization: Bearer sk-gw-abc123"

PUT /v1/settings Auth Required

Updates session settings. Currently supports sessionIdleTimeoutMs.

Sessions

A session is a mapping from a client-provided sessionId to a Claude Agent SDK conversation session. Sessions maintain conversation context — follow-up queries sent with the same sessionId have access to all previous messages in that conversation. Sessions are persisted to SESSION_PERSIST_PATH and survive server restarts.

sessionIdleTimeoutMs

Default	`0` — disabled, sessions persist indefinitely
Cleanup granularity	5 minutes — a background timer runs every 5 minutes and evicts sessions that have been idle longer than the configured timeout. Sessions may therefore exist up to 5 minutes past their timeout.

When set to 0 (the default), no cleanup runs and every unique sessionId stays in memory and on disk forever. This is convenient for long-lived conversational agents but means memory usage grows unboundedly if many distinct session IDs are used.

When set to a value greater than 0, the cleanup timer is activated. Any session that has not received a query within the configured duration is removed from memory and from disk. The next query that arrives with that sessionId starts a fresh session — all previous conversation context is lost.

Request Body

{"sessionIdleTimeoutMs": 3600000}

Response

{"sessionIdleTimeoutMs": 3600000}

Example

# Set 1-hour idle timeout (sessions idle for >1 h are evicted; check runs every 5 min)
curl -X PUT http://localhost:3001/v1/settings \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{"sessionIdleTimeoutMs": 3600000}'

# Disable cleanup — sessions live forever (default)
curl -X PUT http://localhost:3001/v1/settings \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{"sessionIdleTimeoutMs": 0}'
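The eviction rule described above can be illustrated with a small function. This is a sketch of the documented semantics only (idle longer than the timeout, checked every 5 minutes), not the gateway's code; `evict_idle` and `latest_eviction_ms` are hypothetical names, and sessions are modeled as a plain mapping from sessionId to lastUsed in milliseconds.

```python
CLEANUP_INTERVAL_MS = 5 * 60 * 1000  # the cleanup timer period stated above


def evict_idle(sessions: dict, now_ms: int, idle_timeout_ms: int) -> list:
    """Remove sessions idle longer than the timeout; return the evicted IDs."""
    if idle_timeout_ms <= 0:
        return []  # 0 disables cleanup entirely
    expired = [sid for sid, last_used in sessions.items()
               if now_ms - last_used > idle_timeout_ms]
    for sid in expired:
        del sessions[sid]
    return expired


def latest_eviction_ms(last_used_ms: int, idle_timeout_ms: int) -> int:
    """Worst case: the timer fires just before the session expires."""
    return last_used_ms + idle_timeout_ms + CLEANUP_INTERVAL_MS
```

With a 1-hour timeout, a session last used at time 0 is therefore gone no later than 65 minutes afterwards.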

Logging

GET /v1/logging Auth Required

Returns the current log level.

Response

{"level": "info"}

Example

curl http://localhost:3001/v1/logging \
  -H "Authorization: Bearer sk-gw-abc123"

PUT /v1/logging Auth Required

Changes the log level at runtime. Valid levels: off, info, debug.

Request Body

{"level": "debug"}

Response

{"level": "debug", "previous": "info"}

Example

curl -X PUT http://localhost:3001/v1/logging \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{"level": "debug"}'

SSH Keys

POST /v1/ssh-keys Auth Required

Uploads an SSH key pair to the gateway container. The agent can then use SSH to connect to remote servers via Bash tool calls. Keys are written to ~/.ssh/ with proper permissions (600 for private, 644 for public). An SSH config is auto-generated with StrictHostKeyChecking accept-new.

Request Body

Field	Type	Required	Description
`privateKey`	string	Yes	PEM-encoded private key
`publicKey`	string	No	Public key (derived from private if omitted)
`filename`	string	No	Key filename (default: `id_ed25519`, alphanumeric + `_-` only)

Response

{
  "status": "ok",
  "message": "SSH keys updated",
  "publicKey": "ssh-ed25519 AAAA... user@host"
}

Example

curl -X POST http://localhost:3001/v1/ssh-keys \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "privateKey": "-----BEGIN OPENSSH PRIVATE KEY-----\n...\n-----END OPENSSH PRIVATE KEY-----",
    "filename": "id_ed25519"
  }'

Git Clone

POST /v1/workspace/git/clone Auth Required

Clone a git repository into the workspace projects directory. If the target path already exists with a .git directory, performs a git pull instead.

Request Body

{
  "url": "git@github.com:org/repo.git",   // Required: repository URL
  "path": "my-project",                    // Required: relative path under projects/
  "branch": "main",                        // Optional: branch to clone
  "sshKey": "-----BEGIN OPENSSH..."        // Optional: SSH private key for auth
}

Response (200)

{
  "status": "cloned",    // or "pulled" if target already existed
  "path": "my-project",
  "branch": "main",
  "commit": "a1b2c3d"
}

Security

SSH keys are written to a temporary file with 0600 permissions and deleted immediately after the git operation, so they are never retained on disk beyond the request.

Example

curl -X POST http://localhost:3001/v1/workspace/git/clone \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "git@github.com:org/repo.git",
    "path": "my-project",
    "branch": "main"
  }'

Git Pull

POST /v1/workspace/git/pull Auth Required

Pull latest changes for an existing repository.

Request Body

{
  "path": "my-project",                    // Required: relative path under projects/
  "sshKey": "-----BEGIN OPENSSH..."        // Optional: SSH private key for auth
}

Response (200)

{
  "status": "updated",   // or "up-to-date"
  "branch": "main",
  "commit": "a1b2c3d"
}

Example

curl -X POST http://localhost:3001/v1/workspace/git/pull \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{"path": "my-project"}'

Git Status

GET /v1/workspace/git/status Auth Required

Get the git status of a project directory.

Query Parameters

path (required): relative path under projects/

Response (200) — repo exists

{
  "exists": true,
  "branch": "main",
  "commit": "a1b2c3d",
  "dirty": false,
  "lastCommitDate": "2026-03-29T10:00:00+02:00"
}

Response (200) — repo not found

{"exists": false}

Example

curl "http://localhost:3001/v1/workspace/git/status?path=my-project" \
  -H "Authorization: Bearer sk-gw-abc123"

Anthropic Auth: Status

GET /v1/auth/status Auth Required

Checks whether the Claude CLI inside the container has a valid Anthropic login. This is separate from the gateway's API key auth -- it determines whether the SDK can make API calls to Anthropic.

Response (logged in)

{"loggedIn": true, "email": "user@example.com"}

Response (not logged in)

{"loggedIn": false}

Example

curl http://localhost:3001/v1/auth/status \
  -H "Authorization: Bearer sk-gw-abc123"

Anthropic Auth: Login

POST /v1/auth/login Auth Required

Initiates the Anthropic OAuth login flow. Starts a Claude CLI session inside tmux, navigates through the login prompts, and returns the OAuth authorization URL. The user must open this URL in a browser, authorize, and copy the authorization code.

Response

{"url": "https://console.anthropic.com/oauth/authorize?..."}

Error (timeout)

{"error": "Timeout waiting for auth URL"}

Example

curl -X POST http://localhost:3001/v1/auth/login \
  -H "Authorization: Bearer sk-gw-abc123"

Anthropic Auth: Submit Code

POST /v1/auth/submit-code Auth Required

Submits the OAuth authorization code obtained from the browser. The gateway sends it to the running Claude CLI tmux session and polls for login success. Requires a prior call to POST /v1/auth/login.

Request Body

Field	Type	Required	Description
`code`	string	Yes	OAuth authorization code from the browser

Response (success)

{"success": true, "loggedIn": true, "email": "user@example.com"}

Error

{"error": "No auth login session running. Call POST /v1/auth/login first."}

Example

curl -X POST http://localhost:3001/v1/auth/submit-code \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{"code": "ey..."}'
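Taken together, the three auth endpoints form a status, login, submit-code sequence. The sketch below strings them together with the HTTP transport injected as plain callables, so nothing here depends on a particular HTTP library. `ensure_logged_in`, `get_json`, `post_json`, and `read_code` are all hypothetical names, and treating any response containing an error field as a failure is an assumption.

```python
def ensure_logged_in(get_json, post_json, read_code):
    """Run the OAuth flow only if the container is not already logged in.

    get_json(path) and post_json(path, body) perform authenticated gateway
    calls and return the decoded JSON body; read_code(url) asks the user to
    open the URL in a browser and returns the authorization code.
    """
    status = get_json("/v1/auth/status")
    if status.get("loggedIn"):
        return status

    login = post_json("/v1/auth/login", None)
    if "error" in login:
        raise RuntimeError(login["error"])  # e.g. timeout waiting for auth URL

    code = read_code(login["url"])
    result = post_json("/v1/auth/submit-code", {"code": code})
    if not result.get("success"):
        raise RuntimeError(result.get("error", "login failed"))
    return result
```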

Workspace (Memory, Agents, Skills)

The gateway provides CRUD endpoints for three workspace sections: memory, agents, and skills. Files are stored under $WORKSPACE_ROOT (default: $HOME/.claude). All paths are protected against directory traversal attacks.
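The source does not describe how the traversal protection is implemented. A common approach, sketched here purely as an assumption, is to resolve the requested path and reject anything that does not end up inside the section directory. `resolve_workspace_path` is a hypothetical name.

```python
from pathlib import Path

SECTIONS = ("memory", "agents", "skills")


def resolve_workspace_path(root: str, section: str, relative: str) -> Path:
    """Map a request path to a file strictly inside root/section."""
    if section not in SECTIONS:
        raise ValueError(f"unknown section: {section}")
    base = (Path(root) / section).resolve()
    # resolve() collapses '..' segments; an absolute 'relative' replaces base.
    target = (base / relative).resolve()
    if base not in target.parents:
        raise PermissionError("path escapes the workspace section")
    return target
```

A request for project/notes.md in the memory section resolves normally, while ../agents/x, /etc/passwd, and the empty path are all rejected.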

The following operations are available for each section (memory, agents, skills):

GET /v1/{section} Auth Required

Lists all files in the section. Returns a recursive file listing with path, size, and modification time.

Response

{
  "files": [
    {
      "path": "MEMORY.md",
      "size": 1234,
      "modified": "2026-03-26T10:00:00.000Z"
    },
    {
      "path": "project/notes.md",
      "size": 567,
      "modified": "2026-03-25T15:30:00.000Z"
    }
  ]
}

Example

curl http://localhost:3001/v1/memory \
  -H "Authorization: Bearer sk-gw-abc123"

GET /v1/{section}/{path} Auth Required

Reads a file. Returns application/json for .json files, text/plain for all others.

Example

curl http://localhost:3001/v1/memory/MEMORY.md \
  -H "Authorization: Bearer sk-gw-abc123"

PUT /v1/{section}/{path} Auth Required

Creates or overwrites a file. Parent directories are created automatically. Send the file content as the request body (JSON or plain text).

Response

{"status": "ok", "path": "memory/MEMORY.md"}

Example

# Write plain text
curl -X PUT http://localhost:3001/v1/memory/MEMORY.md \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: text/plain" \
  -d '# Agent Memory

This server manages web infrastructure.'

# Write JSON
curl -X PUT http://localhost:3001/v1/agents/config.json \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{"name": "server-admin", "model": "claude-sonnet-4-20250514"}'

DELETE /v1/{section}/{path} Auth Required

Deletes a file. Returns 404 if the file does not exist.

Response

{"status": "ok"}

Example

curl -X DELETE http://localhost:3001/v1/memory/old-notes.md \
  -H "Authorization: Bearer sk-gw-abc123"

Tool Registry

The Tool Registry allows external tools to be registered and made available to the Claude agent. Each tool defines a webhook_url that is called when the agent invokes the tool. Registered tools are wrapped as in-process MCP servers and injected into the Claude Agent SDK alongside the built-in tools (Bash, Read, Write, etc.).

Tools persist to TOOLS_PERSIST_PATH (default: /home/node/.claude/tools.json) and survive server restarts.

PUT /v1/tools/:name Auth Required

Registers a new tool or updates an existing one. Returns 201 for new tools, 200 for updates.

Request Body

Field	Type	Required	Description
`description`	string	Yes	Human-readable description of what the tool does
`input_schema`	object	Yes	JSON Schema describing the tool's input parameters (must have a `type` field)
`webhook_url`	string	Yes	URL to POST to when the agent invokes this tool
`timeout_ms`	number	No	Webhook timeout in milliseconds (default: `30000`)

Response (201 Created / 200 OK)

{
  "name": "lookup-customer",
  "description": "Look up a customer by email address",
  "input_schema": {
    "type": "object",
    "properties": {
      "email": { "type": "string", "description": "Customer email" }
    },
    "required": ["email"]
  },
  "webhook_url": "https://api.example.com/tools/lookup-customer",
  "timeout_ms": 30000
}

Example

curl -X PUT http://localhost:3001/v1/tools/lookup-customer \
  -H "Authorization: Bearer sk-gw-abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Look up a customer by email address",
    "input_schema": {
      "type": "object",
      "properties": {
        "email": { "type": "string", "description": "Customer email" }
      },
      "required": ["email"]
    },
    "webhook_url": "https://api.example.com/tools/lookup-customer",
    "timeout_ms": 10000
  }'
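The registration rules in the table (three required fields, an input_schema with a type, a 30000 ms default timeout, and 201 versus 200) can be sketched as follows. This models the documented contract with an in-memory dictionary; `normalize_tool` and `upsert_tool` are hypothetical names, and the error messages are illustrative rather than the gateway's actual responses.

```python
DEFAULT_TIMEOUT_MS = 30000

REQUIRED_FIELDS = (("description", str), ("input_schema", dict), ("webhook_url", str))


def normalize_tool(name: str, body: dict) -> dict:
    """Validate a registration body and fill in the default timeout."""
    for field, kind in REQUIRED_FIELDS:
        if not isinstance(body.get(field), kind):
            raise ValueError(f"{field} is required and must be a {kind.__name__}")
    if "type" not in body["input_schema"]:
        raise ValueError("input_schema must have a type field")
    return {
        "name": name,
        "description": body["description"],
        "input_schema": body["input_schema"],
        "webhook_url": body["webhook_url"],
        "timeout_ms": body.get("timeout_ms", DEFAULT_TIMEOUT_MS),
    }


def upsert_tool(registry: dict, name: str, body: dict):
    """Return (status, tool): 201 for a new tool, 200 for an update."""
    tool = normalize_tool(name, body)
    status = 200 if name in registry else 201
    registry[name] = tool
    return status, tool
```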

GET /v1/tools Auth Required

Lists all registered tools.

Response

{
  "tools": [
    {
      "name": "lookup-customer",
      "description": "Look up a customer by email address",
      "input_schema": { "type": "object", "properties": { "email": { "type": "string" } } },
      "webhook_url": "https://api.example.com/tools/lookup-customer",
      "timeout_ms": 10000
    }
  ]
}

Example

curl http://localhost:3001/v1/tools \
  -H "Authorization: Bearer sk-gw-abc123"

GET /v1/tools/:name Auth Required

Returns a single tool definition. Returns 404 if the tool does not exist.

Example

curl http://localhost:3001/v1/tools/lookup-customer \
  -H "Authorization: Bearer sk-gw-abc123"

DELETE /v1/tools/:name Auth Required

Deletes a tool. Returns 204 on success, 404 if the tool does not exist.

Example

curl -X DELETE http://localhost:3001/v1/tools/lookup-customer \
  -H "Authorization: Bearer sk-gw-abc123"

Webhook Execution Flow

When the Claude agent invokes a registered tool during a query, the gateway POSTs to the tool's webhook_url:

Agent invokes tool

The Claude agent decides to use a registered tool (e.g., lookup-customer) based on its description and input schema.

Gateway POSTs to webhook

The MCP server handler sends a POST request to the tool's webhook_url with the following body:

{
  "tool_use_id": "tu_abc123",
  "tool_name": "lookup-customer",
  "input": { "email": "user@example.com" },
  "context": {
    "user_id": null,
    "conversation_id": null,
    "session_id": "session-1",
    "api_key_label": "myapp"
  }
}

The client's Bearer token is forwarded in the Authorization header.

Webhook responds

The external service processes the request and returns:

{
  "output": "Customer found: Jane Doe, Plan: Enterprise",
  "metadata": { "customer_id": 42 }
}

On error (timeout, HTTP error, network failure), an error result is returned to the agent, which can retry or use an alternative approach.

Agent continues

The tool result is passed back to the Claude agent as a tool_result event, and the agent continues its work.
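On the receiving side, a webhook only has to turn the request body shown above into an object with an output field. The following is a minimal sketch of such a handler for the lookup-customer example; the `CUSTOMERS` table and the `handle_tool_call` name are hypothetical, and whether metadata may be omitted from the response is an assumption.

```python
import json

# Hypothetical in-memory data standing in for a real customer database.
CUSTOMERS = {
    "user@example.com": {"id": 42, "name": "Jane Doe", "plan": "Enterprise"},
}


def handle_tool_call(raw_body: bytes) -> dict:
    """Turn a gateway webhook request body into a webhook response body."""
    payload = json.loads(raw_body)
    if payload["tool_name"] != "lookup-customer":
        return {"output": f"Unknown tool: {payload['tool_name']}"}
    email = payload["input"]["email"]
    customer = CUSTOMERS.get(email)
    if customer is None:
        # Report the miss as plain output; the agent decides what to do next.
        return {"output": f"No customer found for {email}"}
    return {
        "output": f"Customer found: {customer['name']}, Plan: {customer['plan']}",
        "metadata": {"customer_id": customer["id"]},
    }
```

The returned dictionary would be serialized as the JSON response body by whatever HTTP framework hosts the webhook.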

NDJSON Event Reference

All events include a seq field (incrementing integer starting at 0) and a type field. Events are streamed as newline-delimited JSON (application/x-ndjson).

text

Assistant text output (streamed incrementally).

{"seq": 0, "type": "text", "content": "Let me check..."}

tool_use

Agent is invoking a tool. input is a human-readable summary (command for Bash, file path for Read/Write/Edit, pattern for Glob/Grep).

{"seq": 1, "type": "tool_use", "toolName": "Bash", "toolUseId": "tu_abc123", "input": "df -h /var", "startedAt": 1711461600000}

tool_result

Tool execution completed. Output is truncated to 3000 characters. durationMs is null if timing data is unavailable.

Note: There is no success field. The gateway reports tool completion, not tool success/failure. The AI agent internally decides whether a tool result represents success or an error and acts accordingly. Consumers should treat a tool_result event as "tool completed" (e.g., show a checkmark), regardless of the output content. If the output contains an error message, the AI will handle the retry or fallback logic itself.

{"seq": 2, "type": "tool_result", "toolName": "Bash", "toolUseId": "tu_abc123", "output": "/dev/sda1  50G  35G  15G  70% /var", "durationMs": 245}

rate_limited

Rate limit detected. The gateway will retry automatically. Emitted both when the retry layer detects an error and when the SDK stream reports a rate limit event.

// Retry layer (error-based)
{"seq": 3, "type": "rate_limited", "status": "retrying", "attempt": 1, "waitMs": 1000}

// SDK stream event
{"seq": 3, "type": "rate_limited", "status": "waiting", "retryAfterMs": 5000}

sdk_status

Claude SDK status change. Currently emitted when context compaction starts ("compacting") and ends (null).

{"seq": 4, "type": "sdk_status", "status": "compacting"}
{"seq": 5, "type": "sdk_status", "status": null}

sdk_compact_complete

Context compaction completed. Reports the trigger reason and pre-compaction token count.

{"seq": 6, "type": "sdk_compact_complete", "trigger": "auto", "preTokens": 180000}

done

Query completed successfully. Includes token usage, cost, and context window statistics. Always the last event on success.

{
  "seq": 7,
  "type": "done",
  "inputTokens": 1250,
  "outputTokens": 340,
  "costUsd": 0.0087,
  "sessionId": "abc-123-def",
  "context": {
    "usedTokens": 1590,
    "contextWindow": 200000,
    "percentUsed": 0.8,
    "cacheReadTokens": 500,
    "cacheCreationTokens": 200
  }
}

error

Query failed with an error. Always the last event on failure.

{"seq": 3, "type": "error", "content": "Connection refused: ssh root@example.com"}
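A consumer typically folds these events into UI state: accumulated text, tools still running, and a final status. The sketch below is one possible reducer over already-parsed events; `summarize` and the shape of the returned dictionary are hypothetical, and silently ignoring rate_limited, sdk_status, and unknown event types is a design assumption.

```python
def summarize(events) -> dict:
    """Fold a sequence of parsed events into a single status summary."""
    summary = {
        "text": "",
        "pending_tools": {},    # toolUseId -> toolName, started but not finished
        "completed_tools": [],  # tool names in completion order
        "status": "running",
        "error": None,
        "cost_usd": None,
    }
    for event in events:
        kind = event["type"]
        if kind == "text":
            summary["text"] += event["content"]
        elif kind == "tool_use":
            summary["pending_tools"][event["toolUseId"]] = event["toolName"]
        elif kind == "tool_result":
            # Completion only: the event carries no success flag.
            summary["pending_tools"].pop(event["toolUseId"], None)
            summary["completed_tools"].append(event["toolName"])
        elif kind == "done":
            summary["status"] = "done"
            summary["cost_usd"] = event.get("costUsd")
            break  # done is always the last event on success
        elif kind == "error":
            summary["status"] = "error"
            summary["error"] = event.get("content")
            break  # error is always the last event on failure
        # Other event types are informational and are skipped here.
    return summary
```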

Session Management

Sessions enable multi-turn conversations with the Claude agent. A client provides a sessionId with each query, and the gateway maps it to an internal Claude SDK session UUID.

How Sessions Work

Client sends query with sessionId

Client includes a stable sessionId (e.g., "chat-user-42") in the POST /v1/query request body.

Gateway resolves session

If a session exists with the same ID, systemPrompt, and model, the existing Claude session is resumed using the SDK's resume option with the stored SDK session ID. If the systemPrompt or model has changed, a new Claude session is created under the same client sessionId using the sessionId option. After each query completes, the gateway updates the stored SDK session ID to the session_id actually returned by the SDK, so that subsequent queries resume the correct conversation.

Session persists across restarts

Sessions are persisted to disk (debounced write to SESSION_PERSIST_PATH). On restart, sessions are restored and expired ones filtered out.

Idle cleanup (optional)

If SESSION_IDLE_TIMEOUT_MS > 0, a background timer runs every 5 minutes and removes sessions that have been idle longer than the timeout.
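The resume-or-create decision described in the steps above can be sketched with an in-memory store. This models only the documented rule (resume when ID, system prompt, and model all match) and the post-query sync; `StoredSession`, `resolve_session`, and `sync_session_id` are hypothetical names, and generating the new SDK session ID with a UUID is an assumption.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredSession:
    sdk_session_id: str
    system_prompt: Optional[str]
    model: Optional[str]


def resolve_session(store: dict, session_id: str,
                    system_prompt: Optional[str], model: Optional[str]):
    """Return ('resume', sdk_id) or ('new', sdk_id) for a client sessionId."""
    existing = store.get(session_id)
    if (existing is not None
            and existing.system_prompt == system_prompt
            and existing.model == model):
        return "resume", existing.sdk_session_id
    # First query, or the configuration changed: start a fresh conversation.
    new_id = str(uuid.uuid4())
    store[session_id] = StoredSession(new_id, system_prompt, model)
    return "new", new_id


def sync_session_id(store: dict, session_id: str, returned_sdk_id: str) -> None:
    """After a query, record the session ID the SDK actually reported."""
    store[session_id].sdk_session_id = returned_sdk_id
```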

Stateless Queries

Set "useSession": false in the request body to skip session management entirely. A random session ID is generated for the query, and no session state is stored.

Managing Sessions via API

GET /v1/sessions -- list all active sessions with model and lastUsed timestamp
DELETE /v1/sessions/:id -- remove a specific session
PUT /v1/settings -- configure sessionIdleTimeoutMs

Deployment Guide

Standalone Docker Compose

The simplest way to run the Agent Gateway:

# Clone and configure
git clone https://github.com/diem2001/agent-gateway.git
cd agent-gateway
cp .env.example .env

# Edit .env — set at minimum:
#   ANTHROPIC_API_KEY=sk-ant-...
#   API_KEYS=myapp:your-secret-key

# Build and start
docker compose up -d --build

# Verify
curl http://localhost:3001/health

Environment Variables

Variable	Default	Description
`ANTHROPIC_API_KEY`	--	Anthropic API key (or use OAuth via `POST /v1/auth/login`)
`API_KEYS`	`default:changeme`	Client auth keys (`label:secret,...`)
`LOG_LEVEL`	`info`	Logging level (`off`, `info`, `debug`)
`SESSION_IDLE_TIMEOUT_MS`	`0`	Auto-expire idle sessions (0 = disabled)
`PORT`	`3001`	HTTP listen port
`HOST`	`0.0.0.0`	Bind address
`SESSION_PERSIST_PATH`	`/home/node/.claude/sessions.json`	Session persistence file (inside agent_home bind-mount)
`EVENT_CACHE_TTL_MS`	`1800000`	Query event cache TTL (30 min)
`WORKSPACE_ROOT`	`$HOME/.claude`	Root dir for memory/agents/skills
`TOOLS_PERSIST_PATH`	`/home/node/.claude/tools.json`	Tool registry persistence file
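Before deploying, it can help to check the effective configuration against the defaults in the table, in particular to catch an unchanged default:changeme key. The sketch below is a hypothetical preflight helper, not part of the gateway; `load_config` covers only a subset of the variables, and the uses_default_api_key flag is an invented convenience.

```python
import os

# Defaults copied from the table above (subset).
DEFAULTS = {
    "API_KEYS": "default:changeme",
    "LOG_LEVEL": "info",
    "SESSION_IDLE_TIMEOUT_MS": "0",
    "PORT": "3001",
    "HOST": "0.0.0.0",
    "EVENT_CACHE_TTL_MS": "1800000",
}
INT_KEYS = ("SESSION_IDLE_TIMEOUT_MS", "PORT", "EVENT_CACHE_TTL_MS")
LOG_LEVELS = ("off", "info", "debug")


def load_config(env=None) -> dict:
    """Merge environment values over the defaults and validate them."""
    env = os.environ if env is None else env
    cfg = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    for key in INT_KEYS:
        cfg[key] = int(cfg[key])
    if cfg["LOG_LEVEL"] not in LOG_LEVELS:
        raise ValueError(f"LOG_LEVEL must be one of {LOG_LEVELS}")
    # Flag the placeholder key so a deploy script can refuse to continue.
    cfg["uses_default_api_key"] = cfg["API_KEYS"] == DEFAULTS["API_KEYS"]
    return cfg
```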

Docker Architecture

Container Architecture

docker compose publishes host port 3001 (ext) to container port 3001 (int).

The agent-gateway container runs:

Express 5 + Claude Agent SDK
POST /v1/query (NDJSON stream)
Session mgmt + Workspace CRUD + Tool Registry
SSH client + tmux (for auth flow)
Node.js 22

What the Container Includes

The Docker image is based on node:22-bookworm and includes system tools required by the Claude Agent SDK:

bash, coreutils, findutils, grep, procps -- for Bash, Glob, Grep tools
git -- for repository operations
openssh-client -- for SSH connectivity to remote servers
tmux -- for the Anthropic OAuth login flow
rsync, wget, curl, jq, mc, python3 -- general utilities
gosu -- privilege de-escalation (root to node user)
Claude Code CLI -- installed on first startup via curl -fsSL https://claude.ai/install.sh | bash

Entrypoint Behavior

The container starts as root and performs setup before dropping to the node user:

Creates workspace directories (~/.claude/memory, ~/.claude/agents, ~/.claude/skills, ~/.ssh, ~/.local/bin)
Writes default Claude settings with broad tool permissions (if not already present)
Installs Claude Code CLI via curl install script (if not already installed)
Ensures PATH includes ~/.local/bin in .bashrc
Restores SSH key config from bind-mount if keys exist in ~/.ssh/
Drops to node user via gosu and starts the Express server

Persistent Data

The agent_home directory is bind-mounted at /home/node and stores:

/home/node/.claude/sessions.json -- session state (survives container restarts)
/home/node/.claude/tools.json -- registered tool definitions (survives container restarts)
/home/node/.claude/skills/ -- published skills
/home/node/.claude/agents/ -- agent configurations
/home/node/.claude/memory/ -- workspace memory
/home/node/.claude/projects/ -- Claude session data
/home/node/.ssh/ -- SSH keys (uploaded via POST /v1/ssh-keys)

Embedding in an Existing Stack

To add the Agent Gateway as a service in an existing docker-compose.yml:

services:
  agent-gateway:
    build: ./path/to/agent-gateway
    container_name: agent-gateway
    ports:
      - "127.0.0.1:3001:3001"
    volumes:
      - ./agent_home:/home/node
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - API_KEYS=${API_KEYS:-default:changeme}
      - LOG_LEVEL=${LOG_LEVEL:-info}
      - SESSION_PERSIST_PATH=/home/node/.claude/sessions.json
      - TOOLS_PERSIST_PATH=/home/node/.claude/tools.json
      - SESSION_IDLE_TIMEOUT_MS=${SESSION_IDLE_TIMEOUT_MS:-0}
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:3001/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s
    restart: unless-stopped

Other services can reach the gateway at http://agent-gateway:3001 on the Docker network. Binding the published port to 127.0.0.1:3001 ensures it is not exposed publicly -- use a reverse proxy (nginx, Traefik) for external access with SSL.

Reverse Proxy (nginx)

location /agent-gateway/ {
    proxy_pass http://127.0.0.1:3001/;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_buffering off;           # Required for NDJSON streaming
    proxy_read_timeout 300s;       # Agent queries can be long-running
}

Important: disable proxy buffering

NDJSON streaming requires proxy_buffering off in nginx. Without it, events are buffered and delivered in batches instead of real-time. The gateway also sets X-Accel-Buffering: no on streaming responses as a fallback.

Architecture

Request Flow

HTTP Request

Client sends POST /v1/query with Bearer token, queryId, prompt, and optional sessionId.

Auth Middleware

Validates API key against API_KEYS env var. Attaches client label for audit logging. Rejects with 401 if invalid.

Session Resolution

Looks up or creates a Claude SDK session. Reuses the existing session if the system prompt and model match; creates a new one on the first query or when either has changed.

Agent SDK Execution

Runs query() from Claude Agent SDK with tools (Bash, Read, Write, Edit, Glob, Grep, WebSearch, WebFetch). Streams events via NDJSON as the agent works.

Retry Layer

On rate limits or empty responses: exponential backoff (1s, 2s, 4s), up to 3 retries, 60s total budget. Emits rate_limited events to the client.

Event Cache + Done

All events are cached by queryId for replay via GET /v1/query/:queryId/events. Cache entries expire after 30 minutes (configurable via EVENT_CACHE_TTL_MS). The final done event includes token usage and cost.
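The retry schedule named in the Retry Layer step (1s, 2s, 4s, at most 3 retries, 60s budget) follows from a plain doubling rule. The sketch below computes that schedule; `backoff_schedule` is a hypothetical name, and applying the budget to the sum of the waits alone (rather than to waits plus execution time) is an assumption the source does not settle.

```python
def backoff_schedule(base_s: float = 1.0, factor: float = 2.0,
                     max_retries: int = 3, budget_s: float = 60.0) -> list:
    """Return the wait before each retry, stopping when the budget is spent."""
    delays = []
    total = 0.0
    for attempt in range(max_retries):
        delay = base_s * factor ** attempt
        if total + delay > budget_s:
            break  # the next wait would exceed the total budget
        delays.append(delay)
        total += delay
    return delays
```

With the defaults this gives three waits totaling 7 seconds, well inside the budget; the budget only starts to bind with a much larger base delay.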

Component Diagram

Query Flow

Client (POST /v1/query) → Auth (validate Bearer) → Sessions (resolve/create) → Agent SDK (built-in tool calls: Bash, Read, ...) → System (tool output) → Agent SDK (registered tool call via MCP server) → Webhook (webhook response) → Agent SDK

Response Stream

Agent SDK (NDJSON events: text, tool_use, tool_result, ...) → Client
Agent SDK (cache events by queryId) → Event Cache
