Agent Gateway API Reference
The Agent Gateway is a standalone REST API service wrapping the Claude Agent SDK. It exposes Claude Code's agentic capabilities (Bash, Read, Write, Edit, Glob, Grep, WebSearch, WebFetch) over HTTP with NDJSON streaming, session management, workspace file CRUD, and automatic retry with exponential backoff.
Contents
- Authentication
- API Overview
- Health Check
- Query (Run Agent)
- Query Events (Replay)
- Sessions
- Settings
- Logging
- SSH Keys
- Git Clone
- Git Pull
- Git Status
- Anthropic Auth: Status
- Anthropic Auth: Login
- Anthropic Auth: Submit Code
- Workspace (Memory, Agents, Skills)
- Tool Registry
- NDJSON Event Reference
- Session Management
- Deployment Guide
- Architecture
Authentication
All endpoints except /health require a Bearer token. API keys are configured via the API_KEYS environment variable as comma-separated label:secret pairs.
# .env
API_KEYS=myapp:sk-gw-abc123,cicd:sk-gw-def456
Send the secret as a Bearer token in the Authorization header:
Authorization: Bearer sk-gw-abc123
The label (e.g., myapp) appears in server logs for audit purposes. On invalid or missing credentials, the API returns 401:
{"error": "Missing or malformed Authorization header"}
{"error": "Invalid API key"}
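As a rough illustration of the scheme above, the following Python sketch parses an API_KEYS value and checks an Authorization header against it. The names parse_api_keys and authenticate are hypothetical and this is not the gateway's own code; only the label:secret format and the two 401 bodies are taken from this reference.

```python
import hmac
from typing import Dict, Optional, Tuple


def parse_api_keys(raw: str) -> Dict[str, str]:
    """Parse 'label:secret,label:secret' into a {secret: label} mapping."""
    keys: Dict[str, str] = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue
        # Split on the first colon only, so a secret may itself contain ':'.
        label, sep, secret = pair.partition(":")
        if not sep or not label or not secret:
            raise ValueError(f"malformed API_KEYS entry: {pair!r}")
        keys[secret] = label
    return keys


def authenticate(header: Optional[str], keys: Dict[str, str]) -> Tuple[int, dict]:
    """Return (status, body); on success the body carries the audit label."""
    if not header or not header.startswith("Bearer "):
        return 401, {"error": "Missing or malformed Authorization header"}
    token = header[len("Bearer "):].strip()
    for secret, label in keys.items():
        # Constant-time comparison (an assumption about good practice,
        # not a documented gateway behavior).
        if hmac.compare_digest(token.encode(), secret.encode()):
            return 200, {"label": label}
    return 401, {"error": "Invalid API key"}
```

With the .env value shown above, authenticate("Bearer sk-gw-abc123", keys) would resolve to the label myapp.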
API Overview
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | /health | No | Health check |
| POST | /v1/query | Yes | Run agent query (NDJSON stream) |
| GET | /v1/query/:queryId/events | Yes | Replay/resume event stream |
| GET | /v1/sessions | Yes | List active sessions |
| DELETE | /v1/sessions/:id | Yes | Delete a session |
| GET | /v1/settings | Yes | Get session settings |
| PUT | /v1/settings | Yes | Update session settings |
| GET | /v1/logging | Yes | Get log level |
| PUT | /v1/logging | Yes | Set log level |
| POST | /v1/ssh-keys | Yes | Upload SSH keys |
| POST | /v1/workspace/git/clone | Yes | Clone a git repo (or pull if exists) |
| POST | /v1/workspace/git/pull | Yes | Pull latest changes |
| GET | /v1/workspace/git/status | Yes | Get repo status (branch, commit, dirty) |
| GET | /v1/auth/status | Yes | Check Anthropic login status |
| POST | /v1/auth/login | Yes | Start Anthropic OAuth flow |
| POST | /v1/auth/submit-code | Yes | Submit OAuth authorization code |
| GET | /v1/memory | Yes | List memory files |
| GET | /v1/memory/* | Yes | Read memory file |
| PUT | /v1/memory/* | Yes | Write memory file |
| DELETE | /v1/memory/* | Yes | Delete memory file |
| GET | /v1/agents | Yes | List agent files |
| GET | /v1/agents/* | Yes | Read agent file |
| PUT | /v1/agents/* | Yes | Write agent file |
| DELETE | /v1/agents/* | Yes | Delete agent file |
| GET | /v1/skills | Yes | List skill files |
| GET | /v1/skills/* | Yes | Read skill file |
| PUT | /v1/skills/* | Yes | Write skill file |
| DELETE | /v1/skills/* | Yes | Delete skill file |
| PUT | /v1/tools/:name | Yes | Register/update a webhook tool |
| GET | /v1/tools | Yes | List all registered tools |
| GET | /v1/tools/:name | Yes | Get a single tool |
| DELETE | /v1/tools/:name | Yes | Delete a tool |
Health Check
GET /health No Auth
Returns server status, version, uptime, and active session count.
Response
{
"status": "ok",
"version": "0.1.0",
"uptime": 3600,
"sessions": 2
}
Example
curl http://localhost:3001/health
Query (Run Agent)
POST /v1/query Auth Required
Runs a Claude agent query. Returns an NDJSON stream (application/x-ndjson) of events as the agent works. The connection stays open until the query completes or the client disconnects.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| queryId | string | Yes | Unique ID for this query (client-generated UUID) |
| prompt | string | Yes | The prompt to send to the agent |
| sessionId | string | No | Session ID for conversation continuity |
| systemPrompt | string | No | System prompt to configure agent behavior |
| model | string | No | Claude model to use (e.g., claude-sonnet-4-20250514) |
| allowedTools | string[] | No | Restrict tools (default: all 8 tools) |
| useSession | boolean | No | Set to false to skip session management |
| sshTarget | string | No | SSH target for context (informational) |
| user_id | string | No | User identifier passed to tool webhook context |
| conversation_id | string | No | Conversation identifier passed to tool webhook context |
Response (NDJSON stream)
Each line is a JSON object with a seq (sequence number) and type. See NDJSON Event Reference for all event types.
{"seq":0,"type":"text","content":"Let me check the disk usage..."}
{"seq":1,"type":"tool_use","toolName":"Bash","toolUseId":"tu_abc","input":"df -h","startedAt":1711461600000}
{"seq":2,"type":"tool_result","toolName":"Bash","toolUseId":"tu_abc","output":"/dev/sda1 50G 35G 15G 70% /","durationMs":245}
{"seq":3,"type":"text","content":"The disk is at 70% usage."}
{"seq":4,"type":"done","inputTokens":1250,"outputTokens":340,"costUsd":0.0087,"sessionId":"abc-123","context":{"usedTokens":1590,"contextWindow":200000,"percentUsed":0.8,"cacheReadTokens":0,"cacheCreationTokens":0}}
Example
curl -N -X POST http://localhost:3001/v1/query \
-H "Authorization: Bearer sk-gw-abc123" \
-H "Content-Type: application/json" \
-d '{
"queryId": "q-001",
"prompt": "What files are in the current directory?",
"sessionId": "session-1",
"systemPrompt": "You are a helpful assistant."
}'
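A client typically reads the stream line by line and stops at the terminal event. The sketch below shows one possible way to do that in Python, assuming the lines have already been read from the HTTP response body by a streaming HTTP client of your choice. The names iter_events and summarize are illustrative only.

```python
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Decode one JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def summarize(lines: Iterable[str]) -> dict:
    """Collect text chunks and stop at the terminal done/error event."""
    text_chunks = []
    last_seq = -1
    final = None
    for event in iter_events(lines):
        last_seq = event["seq"]  # remember this for a later replay
        if event["type"] == "text":
            text_chunks.append(event["content"])
        elif event["type"] in ("done", "error"):
            final = event
            break
    return {"text_chunks": text_chunks, "last_seq": last_seq, "final": final}
```

Keeping last_seq around is useful because it is the value a client would pass as the after parameter when resuming through the replay endpoint.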
Query Events (Replay)
GET /v1/query/:queryId/events Auth Required
Replays cached events for a query. If the query is still running, the connection stays open and streams new events in real time. Supports resuming from a specific sequence number.
Query Parameters
| Param | Type | Default | Description |
|---|---|---|---|
| after | integer | -1 | Only return events with seq > after |
Response
NDJSON stream identical to POST /v1/query. Returns 404 if the queryId is not found or has expired from the cache.
Example
# Replay all events
curl -N http://localhost:3001/v1/query/q-001/events \
-H "Authorization: Bearer sk-gw-abc123"
# Resume from sequence 5
curl -N "http://localhost:3001/v1/query/q-001/events?after=5" \
-H "Authorization: Bearer sk-gw-abc123"
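To make the resume semantics concrete, here is a small Python sketch of how a client might build the replay URL from the last sequence number it saw, together with the filter the after parameter implies. Both helpers (events_url, replay) are hypothetical names, and the filter only models the documented "seq > after" rule rather than the gateway's actual cache code.

```python
from typing import List
from urllib.parse import quote, urlencode


def events_url(base_url: str, query_id: str, last_seq: int = -1) -> str:
    """Build the replay URL; omit 'after' when nothing has been seen yet."""
    url = f"{base_url.rstrip('/')}/v1/query/{quote(query_id, safe='')}/events"
    if last_seq >= 0:
        url += "?" + urlencode({"after": last_seq})
    return url


def replay(cached_events: List[dict], after: int = -1) -> List[dict]:
    """Model of the documented rule: only events with seq > after."""
    return [event for event in cached_events if event["seq"] > after]
```

A client that disconnected after receiving seq 5 would request events_url(base, "q-001", 5) and expect the stream to continue from seq 6.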
Sessions
GET /v1/sessions Auth Required
Lists all active sessions with their model and last-used timestamp.
Response
{
"sessions": [
{
"id": "session-1",
"model": "claude-sonnet-4-20250514",
"lastUsed": 1711461600000
}
],
"count": 1
}
Example
curl http://localhost:3001/v1/sessions \
-H "Authorization: Bearer sk-gw-abc123"
DELETE /v1/sessions/:id Auth Required
Deletes a session by ID. Returns 404 if the session does not exist.
Response
{"deleted": true}
Example
curl -X DELETE http://localhost:3001/v1/sessions/session-1 \
-H "Authorization: Bearer sk-gw-abc123"
Settings
GET /v1/settings Auth Required
Returns current session settings.
Response
{"sessionIdleTimeoutMs": 0}
Example
curl http://localhost:3001/v1/settings \
-H "Authorization: Bearer sk-gw-abc123"
PUT /v1/settings Auth Required
Updates session settings. Currently supports sessionIdleTimeoutMs.
Sessions
A session is a mapping from a client-provided sessionId to a Claude Agent SDK
conversation session. Sessions maintain conversation context — follow-up queries sent with the same
sessionId have access to all previous messages in that conversation.
Sessions are persisted to SESSION_PERSIST_PATH and survive server restarts.
sessionIdleTimeoutMs
| Property | Value |
|---|---|
| Default | 0 (disabled; sessions persist indefinitely) |
| Cleanup granularity | 5 minutes. A background timer runs every 5 minutes and evicts sessions that have been idle longer than the configured timeout, so a session may exist for up to 5 minutes past its timeout. |
When set to 0 (the default), no cleanup runs and every unique sessionId stays
in memory and on disk indefinitely. This is convenient for long-lived conversational agents, but memory
and disk usage grow without bound if many distinct session IDs are used.
When set to a value greater than 0, the cleanup timer is activated. Any session that has
not received a query within the configured duration is removed from memory and from disk.
The next query that arrives with that sessionId starts a fresh session, and
all previous conversation context is lost.
Request Body
{"sessionIdleTimeoutMs": 3600000}
Response
{"sessionIdleTimeoutMs": 3600000}
Example
# Set 1-hour idle timeout (sessions idle for >1 h are evicted; check runs every 5 min)
curl -X PUT http://localhost:3001/v1/settings \
-H "Authorization: Bearer sk-gw-abc123" \
-H "Content-Type: application/json" \
-d '{"sessionIdleTimeoutMs": 3600000}'
# Disable cleanup — sessions live forever (default)
curl -X PUT http://localhost:3001/v1/settings \
-H "Authorization: Bearer sk-gw-abc123" \
-H "Content-Type: application/json" \
-d '{"sessionIdleTimeoutMs": 0}'
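The eviction rule described in this section can be sketched in a few lines of Python. This is only a model of the documented behavior: the function name evict_idle and the in-memory dict layout are assumptions, while the 5-minute interval and the lastUsed field come from this reference.

```python
from typing import Dict, List

CLEANUP_INTERVAL_MS = 5 * 60 * 1000  # the timer period stated above


def evict_idle(sessions: Dict[str, dict], timeout_ms: int, now_ms: int) -> List[str]:
    """Remove sessions idle for longer than timeout_ms; return their IDs."""
    if timeout_ms <= 0:
        return []  # 0 disables cleanup entirely
    expired = [
        session_id
        for session_id, session in sessions.items()
        if now_ms - session["lastUsed"] > timeout_ms
    ]
    for session_id in expired:
        del sessions[session_id]
    return expired


def worst_case_lifetime_ms(timeout_ms: int) -> int:
    """An idle session can outlive its timeout by up to one timer period."""
    return timeout_ms + CLEANUP_INTERVAL_MS
```

With a 1-hour timeout, an idle session can therefore survive for up to 65 minutes before the next timer tick removes it.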
Logging
GET /v1/logging Auth Required
Returns the current log level.
Response
{"level": "info"}
Example
curl http://localhost:3001/v1/logging \
-H "Authorization: Bearer sk-gw-abc123"
PUT /v1/logging Auth Required
Changes the log level at runtime. Valid levels: off, info, debug.
Request Body
{"level": "debug"}
Response
{"level": "debug", "previous": "info"}
Example
curl -X PUT http://localhost:3001/v1/logging \
-H "Authorization: Bearer sk-gw-abc123" \
-H "Content-Type: application/json" \
-d '{"level": "debug"}'
SSH Keys
POST /v1/ssh-keys Auth Required
Uploads an SSH key pair to the gateway container. The agent can then use SSH to connect to remote servers via Bash tool calls. Keys are written to ~/.ssh/ with proper permissions (600 for private, 644 for public). An SSH config is auto-generated with StrictHostKeyChecking accept-new.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| privateKey | string | Yes | PEM-encoded private key |
| publicKey | string | No | Public key (derived from private if omitted) |
| filename | string | No | Key filename (default: id_ed25519, alphanumeric + _- only) |
Response
{
"status": "ok",
"message": "SSH keys updated",
"publicKey": "ssh-ed25519 AAAA... user@host"
}
Example
curl -X POST http://localhost:3001/v1/ssh-keys \
-H "Authorization: Bearer sk-gw-abc123" \
-H "Content-Type: application/json" \
-d '{
"privateKey": "-----BEGIN OPENSSH PRIVATE KEY-----\n...\n-----END OPENSSH PRIVATE KEY-----",
"filename": "id_ed25519"
}'
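Clients may want to check the filename before uploading. The following Python sketch applies the rule from the table above (letters, digits, underscore and hyphen only); the helper name is_valid_key_filename is hypothetical, and the gateway's own validation may differ in detail.

```python
import re

# Letters, digits, underscore and hyphen only, per the filename field above.
_KEY_FILENAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_key_filename(filename: str = "id_ed25519") -> bool:
    """Return True if the name contains only the allowed characters."""
    return _KEY_FILENAME_RE.fullmatch(filename) is not None
```

Rejecting anything outside this character set also rules out path separators and dots, so a name such as ../authorized_keys would not pass.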
Git Clone
POST /v1/workspace/git/clone Auth Required
Clone a git repository into the workspace projects directory. If the target path already exists with a .git directory, performs a git pull instead.
Request Body
{
"url": "git@github.com:org/repo.git", // Required: repository URL
"path": "my-project", // Required: relative path under projects/
"branch": "main", // Optional: branch to clone
"sshKey": "-----BEGIN OPENSSH..." // Optional: SSH private key for auth
}
Response (200)
{
"status": "cloned", // or "pulled" if target already existed
"path": "my-project",
"branch": "main",
"commit": "a1b2c3d"
}
Security
When sshKey is supplied, the key is written to a temporary file with 0600 permissions and deleted immediately after the git operation. It is not retained on disk beyond the request.
Example
curl -X POST http://localhost:3001/v1/workspace/git/clone \
-H "Authorization: Bearer sk-gw-abc123" \
-H "Content-Type: application/json" \
-d '{
"url": "git@github.com:org/repo.git",
"path": "my-project",
"branch": "main"
}'
Git Pull
POST /v1/workspace/git/pull Auth Required
Pull latest changes for an existing repository.
Request Body
{
"path": "my-project", // Required: relative path under projects/
"sshKey": "-----BEGIN OPENSSH..." // Optional: SSH private key for auth
}
Response (200)
{
"status": "updated", // or "up-to-date"
"branch": "main",
"commit": "a1b2c3d"
}
Example
curl -X POST http://localhost:3001/v1/workspace/git/pull \
-H "Authorization: Bearer sk-gw-abc123" \
-H "Content-Type: application/json" \
-d '{"path": "my-project"}'
Git Status
GET /v1/workspace/git/status Auth Required
Get the git status of a project directory.
Query Parameters
path (required): relative path under projects/
Response (200) — repo exists
{
"exists": true,
"branch": "main",
"commit": "a1b2c3d",
"dirty": false,
"lastCommitDate": "2026-03-29T10:00:00+02:00"
}
Response (200) — repo not found
{"exists": false}
Example
curl "http://localhost:3001/v1/workspace/git/status?path=my-project" \
-H "Authorization: Bearer sk-gw-abc123"
Anthropic Auth: Status
GET /v1/auth/status Auth Required
Checks whether the Claude CLI inside the container has a valid Anthropic login. This is separate from the gateway's API key auth -- it determines whether the SDK can make API calls to Anthropic.
Response (logged in)
{"loggedIn": true, "email": "user@example.com"}
Response (not logged in)
{"loggedIn": false}
Example
curl http://localhost:3001/v1/auth/status \
-H "Authorization: Bearer sk-gw-abc123"
Anthropic Auth: Login
POST /v1/auth/login Auth Required
Initiates the Anthropic OAuth login flow. Starts a Claude CLI session inside tmux, navigates through the login prompts, and returns the OAuth authorization URL. The user must open this URL in a browser, authorize, and copy the authorization code.
Response
{"url": "https://console.anthropic.com/oauth/authorize?..."}
Error (timeout)
{"error": "Timeout waiting for auth URL"}
Example
curl -X POST http://localhost:3001/v1/auth/login \
-H "Authorization: Bearer sk-gw-abc123"
Anthropic Auth: Submit Code
POST /v1/auth/submit-code Auth Required
Submits the OAuth authorization code obtained from the browser. The gateway sends it to the running Claude CLI tmux session and polls for login success. Requires a prior call to POST /v1/auth/login.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| code | string | Yes | OAuth authorization code from the browser |
Response (success)
{"success": true, "loggedIn": true, "email": "user@example.com"}
Error
{"error": "No auth login session running. Call POST /v1/auth/login first."}
Example
curl -X POST http://localhost:3001/v1/auth/submit-code \
-H "Authorization: Bearer sk-gw-abc123" \
-H "Content-Type: application/json" \
-d '{"code": "ey..."}'
Workspace (Memory, Agents, Skills)
The gateway provides CRUD endpoints for three workspace sections: memory, agents, and skills. Files are stored under $WORKSPACE_ROOT (default: $HOME/.claude). All paths are protected against directory traversal attacks.
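The traversal protection mentioned above is commonly implemented by resolving the requested path and checking that it still lies inside the section directory. The Python sketch below illustrates that general technique; it is not the gateway's actual implementation, and the names resolve_workspace_path and is_allowed are assumptions.

```python
from pathlib import Path

SECTIONS = {"memory", "agents", "skills"}


def resolve_workspace_path(root: str, section: str, relative: str) -> Path:
    """Resolve a request path and refuse anything outside the section dir."""
    if section not in SECTIONS:
        raise ValueError(f"unknown section: {section!r}")
    base = (Path(root) / section).resolve()
    # resolve() collapses '..' segments; an absolute 'relative' replaces base.
    target = (base / relative).resolve()
    if target != base and base not in target.parents:
        raise PermissionError(f"path escapes {section}/: {relative!r}")
    return target


def is_allowed(root: str, section: str, relative: str) -> bool:
    """Convenience wrapper that reports the outcome as a boolean."""
    try:
        resolve_workspace_path(root, section, relative)
        return True
    except (ValueError, PermissionError):
        return False
```

Checking the resolved path rather than searching the raw string for ".." also catches absolute paths and requests that hop from one section into another.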
The following operations are available for each section (memory, agents, skills):
GET /v1/{section} Auth Required
Lists all files in the section. Returns a recursive file listing with path, size, and modification time.
Response
{
"files": [
{
"path": "MEMORY.md",
"size": 1234,
"modified": "2026-03-26T10:00:00.000Z"
},
{
"path": "project/notes.md",
"size": 567,
"modified": "2026-03-25T15:30:00.000Z"
}
]
}
Example
curl http://localhost:3001/v1/memory \
-H "Authorization: Bearer sk-gw-abc123"
GET /v1/{section}/{path} Auth Required
Reads a file. Returns application/json for .json files, text/plain for all others.
Example
curl http://localhost:3001/v1/memory/MEMORY.md \
-H "Authorization: Bearer sk-gw-abc123"
PUT /v1/{section}/{path} Auth Required
Creates or overwrites a file. Parent directories are created automatically. Send the file content as the request body (JSON or plain text).
Response
{"status": "ok", "path": "memory/MEMORY.md"}
Example
# Write plain text
curl -X PUT http://localhost:3001/v1/memory/MEMORY.md \
-H "Authorization: Bearer sk-gw-abc123" \
-H "Content-Type: text/plain" \
-d '# Agent Memory
This server manages web infrastructure.'
# Write JSON
curl -X PUT http://localhost:3001/v1/agents/config.json \
-H "Authorization: Bearer sk-gw-abc123" \
-H "Content-Type: application/json" \
-d '{"name": "server-admin", "model": "claude-sonnet-4-20250514"}'
DELETE /v1/{section}/{path} Auth Required
Deletes a file. Returns 404 if the file does not exist.
Response
{"status": "ok"}
Example
curl -X DELETE http://localhost:3001/v1/memory/old-notes.md \
-H "Authorization: Bearer sk-gw-abc123"
Tool Registry
The Tool Registry allows external tools to be registered and made available to the Claude agent. Each tool defines a webhook_url that is called when the agent invokes the tool. Registered tools are wrapped as in-process MCP servers and injected into the Claude Agent SDK alongside the built-in tools (Bash, Read, Write, etc.).
Tools persist to TOOLS_PERSIST_PATH (default: /home/node/.claude/tools.json) and survive server restarts.
PUT /v1/tools/:name Auth Required
Registers a new tool or updates an existing one. Returns 201 for new tools, 200 for updates.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| description | string | Yes | Human-readable description of what the tool does |
| input_schema | object | Yes | JSON Schema describing the tool's input parameters (must have a type field) |
| webhook_url | string | Yes | URL to POST to when the agent invokes this tool |
| timeout_ms | number | No | Webhook timeout in milliseconds (default: 30000) |
Response (201 Created / 200 OK)
{
"name": "lookup-customer",
"description": "Look up a customer by email address",
"input_schema": {
"type": "object",
"properties": {
"email": { "type": "string", "description": "Customer email" }
},
"required": ["email"]
},
"webhook_url": "https://api.example.com/tools/lookup-customer",
"timeout_ms": 30000
}
Example
curl -X PUT http://localhost:3001/v1/tools/lookup-customer \
-H "Authorization: Bearer sk-gw-abc123" \
-H "Content-Type: application/json" \
-d '{
"description": "Look up a customer by email address",
"input_schema": {
"type": "object",
"properties": {
"email": { "type": "string", "description": "Customer email" }
},
"required": ["email"]
},
"webhook_url": "https://api.example.com/tools/lookup-customer",
"timeout_ms": 10000
}'
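Before sending a registration request, a client can check the body against the field rules listed above. The Python sketch below is one hypothetical way to do so (normalize_tool is not part of the gateway); it encodes only the constraints stated in the request body table, including the 30000 ms default.

```python
DEFAULT_TIMEOUT_MS = 30000  # default from the request body table


def normalize_tool(name: str, body: dict) -> dict:
    """Validate a tool definition and fill in the documented default."""
    errors = []
    description = body.get("description")
    if not isinstance(description, str) or not description:
        errors.append("description is required")
    schema = body.get("input_schema")
    if not isinstance(schema, dict) or "type" not in schema:
        errors.append("input_schema must be an object with a type field")
    url = body.get("webhook_url")
    if not isinstance(url, str) or not url:
        errors.append("webhook_url is required")
    timeout_ms = body.get("timeout_ms", DEFAULT_TIMEOUT_MS)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(timeout_ms, bool) or not isinstance(timeout_ms, (int, float)):
        errors.append("timeout_ms must be a number")
    if errors:
        raise ValueError("; ".join(errors))
    return {
        "name": name,
        "description": description,
        "input_schema": schema,
        "webhook_url": url,
        "timeout_ms": timeout_ms,
    }
```

The returned dictionary has the same shape as the response body shown above.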
GET /v1/tools Auth Required
Lists all registered tools.
Response
{
"tools": [
{
"name": "lookup-customer",
"description": "Look up a customer by email address",
"input_schema": { "type": "object", "properties": { "email": { "type": "string" } } },
"webhook_url": "https://api.example.com/tools/lookup-customer",
"timeout_ms": 10000
}
]
}
Example
curl http://localhost:3001/v1/tools \
-H "Authorization: Bearer sk-gw-abc123"
GET /v1/tools/:name Auth Required
Returns a single tool definition. Returns 404 if the tool does not exist.
Example
curl http://localhost:3001/v1/tools/lookup-customer \
-H "Authorization: Bearer sk-gw-abc123"
DELETE /v1/tools/:name Auth Required
Deletes a tool. Returns 204 on success, 404 if the tool does not exist.
Example
curl -X DELETE http://localhost:3001/v1/tools/lookup-customer \
-H "Authorization: Bearer sk-gw-abc123"
Webhook Execution Flow
When the Claude agent invokes a registered tool during a query, the gateway POSTs to the tool's webhook_url:
Agent invokes tool
The Claude agent decides to use a registered tool (e.g., lookup-customer) based on its description and input schema.
Gateway POSTs to webhook
The MCP server handler sends a POST request to the tool's webhook_url with the following body:
{
"tool_use_id": "tu_abc123",
"tool_name": "lookup-customer",
"input": { "email": "user@example.com" },
"context": {
"user_id": null,
"conversation_id": null,
"session_id": "session-1",
"api_key_label": "myapp"
}
}
The client's Bearer token is forwarded in the Authorization header.
Webhook responds
The external service processes the request and returns:
{
"output": "Customer found: Jane Doe, Plan: Enterprise",
"metadata": { "customer_id": 42 }
}
On error (timeout, HTTP error, network failure), an error result is returned to the agent, which can retry or use an alternative approach.
Agent continues
The tool result is passed back to the Claude agent as a tool_result event, and the agent continues its work.
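On the receiving side, a webhook only needs to map the request body to an output string. The following Python sketch shows the core of such a handler for the lookup-customer example, independent of any web framework. The in-memory CUSTOMERS table and the function name handle_tool_call are made up for illustration, and whether metadata may be omitted is not stated in this reference, so the sketch always returns it.

```python
# Hypothetical data source standing in for a real customer database.
CUSTOMERS = {
    "user@example.com": {"id": 42, "name": "Jane Doe", "plan": "Enterprise"},
}


def handle_tool_call(payload: dict) -> dict:
    """Turn a gateway webhook payload into an {output, metadata} response."""
    email = payload.get("input", {}).get("email", "")
    record = CUSTOMERS.get(email)
    if record is None:
        # Returned as normal output; the agent decides how to proceed.
        return {"output": f"No customer found for {email}", "metadata": {}}
    return {
        "output": f"Customer found: {record['name']}, Plan: {record['plan']}",
        "metadata": {"customer_id": record["id"]},
    }
```

A real service would also want to verify the forwarded Authorization header and inspect context.api_key_label before answering.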
NDJSON Event Reference
All events include a seq field (incrementing integer starting at 0) and a type field. Events are streamed as newline-delimited JSON (application/x-ndjson).
text
Assistant text output (streamed incrementally).
{"seq": 0, "type": "text", "content": "Let me check..."}
tool_use
Agent is invoking a tool. input is a human-readable summary (command for Bash, file path for Read/Write/Edit, pattern for Glob/Grep).
{"seq": 1, "type": "tool_use", "toolName": "Bash", "toolUseId": "tu_abc123", "input": "df -h /var", "startedAt": 1711461600000}
tool_result
Tool execution completed. Output is truncated to 3000 characters. durationMs is null if timing data is unavailable.
Note: There is no success field. The gateway reports tool completion, not tool success/failure. The AI agent internally decides whether a tool result represents success or an error and acts accordingly. Consumers should treat a tool_result event as "tool completed" (e.g., show a checkmark), regardless of the output content. If the output contains an error message, the AI will handle the retry or fallback logic itself.
{"seq": 2, "type": "tool_result", "toolName": "Bash", "toolUseId": "tu_abc123", "output": "/dev/sda1 50G 35G 15G 70% /var", "durationMs": 245}
rate_limited
Rate limit detected. The gateway will retry automatically. Emitted both when the retry layer detects an error and when the SDK stream reports a rate limit event.
// Retry layer (error-based)
{"seq": 3, "type": "rate_limited", "status": "retrying", "attempt": 1, "waitMs": 1000}
// SDK stream event
{"seq": 3, "type": "rate_limited", "status": "waiting", "retryAfterMs": 5000}
sdk_status
Claude SDK status change. Currently emitted when context compaction starts ("compacting") and ends (null).
{"seq": 4, "type": "sdk_status", "status": "compacting"}
{"seq": 5, "type": "sdk_status", "status": null}
sdk_compact_complete
Context compaction completed. Reports the trigger reason and pre-compaction token count.
{"seq": 6, "type": "sdk_compact_complete", "trigger": "auto", "preTokens": 180000}
done
Query completed successfully. Includes token usage, cost, and context window statistics. Always the last event on success.
{
"seq": 7,
"type": "done",
"inputTokens": 1250,
"outputTokens": 340,
"costUsd": 0.0087,
"sessionId": "abc-123-def",
"context": {
"usedTokens": 1590,
"contextWindow": 200000,
"percentUsed": 0.8,
"cacheReadTokens": 500,
"cacheCreationTokens": 200
}
}
error
Query failed with an error. Always the last event on failure.
{"seq": 3, "type": "error", "content": "Connection refused: ssh root@example.com"}
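Because tool_use and tool_result share a toolUseId, a consumer can pair them to drive a progress display. The Python sketch below shows one way to do this over already-decoded events; track_tools is a hypothetical helper, and it follows the note above in treating every tool_result as "completed" rather than as success or failure.

```python
from typing import Iterable, List, Optional, Tuple


def track_tools(events: Iterable[dict]) -> Tuple[List[dict], List[str], Optional[str]]:
    """Return (completed tool calls, still-pending toolUseIds, outcome)."""
    pending = {}
    completed = []
    outcome = None
    for event in events:
        kind = event["type"]
        if kind == "tool_use":
            pending[event["toolUseId"]] = event
        elif kind == "tool_result":
            started = pending.pop(event["toolUseId"], None)
            completed.append({
                "toolName": event["toolName"],
                "input": started["input"] if started else None,
                "durationMs": event.get("durationMs"),  # may be null
            })
        elif kind in ("done", "error"):
            outcome = kind  # always the last event of a query
            break
        # text, rate_limited, sdk_status and similar events are ignored here
    return completed, list(pending), outcome
```

An outcome of None means the stream ended without a terminal event, which a client could treat as a cue to reconnect through the replay endpoint.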
Session Management
Sessions enable multi-turn conversations with the Claude agent. A client provides a sessionId with each query, and the gateway maps it to an internal Claude SDK session UUID.
How Sessions Work
Client sends query with sessionId
Client includes a stable sessionId (e.g., "chat-user-42") in the POST /v1/query request body.
Gateway resolves session
If a session exists with the same ID, systemPrompt, and model, the existing Claude session is resumed using the SDK's resume option with the stored SDK session ID. If the systemPrompt or model has changed, a new Claude session is created under the same client sessionId using the SDK's sessionId option. After each query completes, the gateway syncs the stored SDK session ID with the actual session_id returned by the SDK, so that subsequent queries resume the correct conversation.
Session persists across restarts
Sessions are persisted to disk (debounced write to SESSION_PERSIST_PATH). On restart, sessions are restored and expired ones are filtered out.
Idle cleanup (optional)
If SESSION_IDLE_TIMEOUT_MS > 0, a background timer runs every 5 minutes and removes sessions that have been idle longer than the timeout.
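The resolution step above can be modeled as a small lookup. The Python sketch below is an illustration under stated assumptions, not the gateway's source: the store layout and the names resolve_session and record_done are invented, while the resume and sessionId option names and the post-query sync are taken from the description.

```python
import uuid
from typing import Dict, Optional


def resolve_session(store: Dict[str, dict], session_id: str,
                    system_prompt: Optional[str], model: Optional[str]) -> dict:
    """Return the SDK options implied by the stored state for session_id."""
    existing = store.get(session_id)
    if (existing is not None
            and existing["systemPrompt"] == system_prompt
            and existing["model"] == model):
        # Same client ID and same configuration: continue the conversation.
        return {"resume": existing["sdkSessionId"]}
    # First query, or the configuration changed: start a new SDK session
    # under the same client ID.
    sdk_session_id = str(uuid.uuid4())
    store[session_id] = {
        "sdkSessionId": sdk_session_id,
        "systemPrompt": system_prompt,
        "model": model,
    }
    return {"sessionId": sdk_session_id}


def record_done(store: Dict[str, dict], session_id: str, done_event: dict) -> None:
    """Sync the stored SDK session ID with the one reported after the query."""
    store[session_id]["sdkSessionId"] = done_event["sessionId"]
```

Storing the system prompt and model next to the SDK session ID is what lets the comparison detect a configuration change on the next query.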
Stateless Queries
Set "useSession": false in the request body to skip session management entirely. A random session ID is generated for the query, and no session state is stored.
Managing Sessions via API
- GET /v1/sessions -- list all active sessions with model and lastUsed timestamp
- DELETE /v1/sessions/:id -- remove a specific session
- PUT /v1/settings -- configure sessionIdleTimeoutMs
Deployment Guide
Standalone Docker Compose
The simplest way to run the Agent Gateway:
# Clone and configure
git clone https://github.com/diem2001/agent-gateway.git
cd agent-gateway
cp .env.example .env
# Edit .env — set at minimum:
# ANTHROPIC_API_KEY=sk-ant-...
# API_KEYS=myapp:your-secret-key
# Build and start
docker compose up -d --build
# Verify
curl http://localhost:3001/health
Environment Variables
| Variable | Default | Description |
|---|---|---|
| ANTHROPIC_API_KEY | -- | Anthropic API key (or use OAuth via POST /v1/auth/login) |
| API_KEYS | default:changeme | Client auth keys (label:secret,...) |
| LOG_LEVEL | info | Logging level (off, info, debug) |
| SESSION_IDLE_TIMEOUT_MS | 0 | Auto-expire idle sessions (0 = disabled) |
| PORT | 3001 | HTTP listen port |
| HOST | 0.0.0.0 | Bind address |
| SESSION_PERSIST_PATH | /home/node/.claude/sessions.json | Session persistence file (inside agent_home bind-mount) |
| EVENT_CACHE_TTL_MS | 1800000 | Query event cache TTL (30 min) |
| WORKSPACE_ROOT | $HOME/.claude | Root dir for memory/agents/skills |
| TOOLS_PERSIST_PATH | /home/node/.claude/tools.json | Tool registry persistence file |
Docker Architecture
What the Container Includes
The Docker image is based on node:22-bookworm and includes system tools required by the Claude Agent SDK:
- bash, coreutils, findutils, grep, procps -- for Bash, Glob, Grep tools
- git -- for repository operations
- openssh-client -- for SSH connectivity to remote servers
- tmux -- for the Anthropic OAuth login flow
- rsync, wget, curl, jq, mc, python3 -- general utilities
- gosu -- privilege de-escalation (root to node user)
- Claude Code CLI -- installed on first startup via curl -fsSL https://claude.ai/install.sh | bash
Entrypoint Behavior
The container starts as root and performs setup before dropping to the node user:
- Creates workspace directories (~/.claude/memory, ~/.claude/agents, ~/.claude/skills, ~/.ssh, ~/.local/bin)
- Writes default Claude settings with broad tool permissions (if not already present)
- Installs Claude Code CLI via curl install script (if not already installed)
- Ensures PATH includes ~/.local/bin in .bashrc
- Restores SSH key config from bind-mount if keys exist in ~/.ssh/
- Drops to node user via gosu and starts the Express server
Persistent Data
The agent_home directory is bind-mounted at /home/node and stores:
- /home/node/.claude/sessions.json -- session state (survives container restarts)
- /home/node/.claude/tools.json -- registered tool definitions (survives container restarts)
- /home/node/.claude/skills/ -- published skills
- /home/node/.claude/agents/ -- agent configurations
- /home/node/.claude/memory/ -- workspace memory
- /home/node/.claude/projects/ -- Claude session data
- /home/node/.ssh/ -- SSH keys (uploaded via POST /v1/ssh-keys)
Embedding in an Existing Stack
To add the Agent Gateway as a service in an existing docker-compose.yml:
services:
agent-gateway:
build: ./path/to/agent-gateway
container_name: agent-gateway
ports:
- "127.0.0.1:3001:3001"
volumes:
- ./agent_home:/home/node
environment:
- ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
- API_KEYS=${API_KEYS:-default:changeme}
- LOG_LEVEL=${LOG_LEVEL:-info}
- SESSION_PERSIST_PATH=/home/node/.claude/sessions.json
- TOOLS_PERSIST_PATH=/home/node/.claude/tools.json
- SESSION_IDLE_TIMEOUT_MS=${SESSION_IDLE_TIMEOUT_MS:-0}
healthcheck:
test: ["CMD", "curl", "-sf", "http://localhost:3001/health"]
interval: 30s
timeout: 5s
retries: 3
start_period: 10s
restart: unless-stopped
Other services can reach the gateway at http://agent-gateway:3001 on the Docker network. Binding the port to 127.0.0.1:3001 keeps it off public interfaces -- use a reverse proxy (nginx, Traefik) for external access with TLS.
Reverse Proxy (nginx)
location /agent-gateway/ {
proxy_pass http://127.0.0.1:3001/;
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_buffering off; # Required for NDJSON streaming
proxy_read_timeout 300s; # Agent queries can be long-running
}
Important: disable proxy buffering
NDJSON streaming requires proxy_buffering off in nginx. Without it, events are buffered and delivered in batches instead of in real time. The gateway also sets X-Accel-Buffering: no on streaming responses as a fallback.
Architecture
Request Flow
HTTP Request
Client sends POST /v1/query with Bearer token, queryId, prompt, and optional sessionId.
Auth Middleware
Validates API key against API_KEYS env var. Attaches client label for audit logging. Rejects with 401 if invalid.
Session Resolution
Looks up or creates a Claude SDK session. Reuses the existing session if systemPrompt and model match. Creates a new one on the first query or when either has changed.
Agent SDK Execution
Runs query() from Claude Agent SDK with tools (Bash, Read, Write, Edit, Glob, Grep, WebSearch, WebFetch). Streams events via NDJSON as the agent works.
Retry Layer
On rate limits or empty responses: exponential backoff (1s, 2s, 4s), up to 3 retries, 60s total budget. Emits rate_limited events to the client.
Event Cache + Done
All events are cached by queryId for replay via GET /v1/query/:queryId/events. Cache entries expire after 30 minutes (configurable via EVENT_CACHE_TTL_MS). The final done event includes token usage and cost.
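The retry schedule named in the Retry Layer step (1s, 2s, 4s, at most 3 retries, 60s budget) can be reproduced with a short calculation. The Python sketch below is a model only: backoff_schedule is a hypothetical name, and it assumes the budget is checked against accumulated wait time, which this reference does not spell out.

```python
from typing import List


def backoff_schedule(base_ms: int = 1000, max_retries: int = 3,
                     budget_ms: int = 60_000) -> List[int]:
    """Return the wait before each retry, doubling every attempt."""
    waits = []
    total = 0
    for attempt in range(max_retries):
        wait = base_ms * 2 ** attempt
        if total + wait > budget_ms:
            break  # the next wait would exceed the total budget
        waits.append(wait)
        total += wait
    return waits
```

With the defaults, the three waits add up to 7 seconds, so the 60-second budget would only come into play if it also counts time spent in the failed attempts themselves.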