Documentation

Everything you need to deploy and configure SpendProxy.

Quick start

1. Start SpendProxy

Using Docker:

 docker run -d -p 4100:4100 spendproxy/proxy:latest 

Or using npx:

 npx @cloudexpat/spendproxy 

2. Point your AI SDK at SpendProxy

Change the base URL in your client. Your API keys pass through untouched.

OpenAI
 import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'http://localhost:4100/v1',  // was: https://api.openai.com/v1
}); 
Anthropic
 import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  baseURL: 'http://localhost:4100/anthropic',  // was: https://api.anthropic.com
}); 
Google Gemini
 // Change the URL prefix:
// was:  https://generativelanguage.googleapis.com/v1beta/models/...
// now:  http://localhost:4100/google/v1beta/models/... 
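The rewrite above is a plain prefix swap, so a small helper can apply it anywhere a Gemini URL is built. This is an illustrative sketch assuming the default proxy port from the quick start; rewriteGeminiUrl is not part of SpendProxy:

```typescript
// Illustrative helper: route Gemini API URLs through the local proxy.
// The proxy address assumes the default port from the quick start.
const GEMINI_UPSTREAM = 'https://generativelanguage.googleapis.com';
const PROXY_BASE = 'http://localhost:4100/google';

function rewriteGeminiUrl(url: string): string {
  // Leave non-Gemini URLs untouched.
  if (!url.startsWith(GEMINI_UPSTREAM)) return url;
  return PROXY_BASE + url.slice(GEMINI_UPSTREAM.length);
}
```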

3. Open the dashboard

Navigate to localhost:4100/dashboard. Cost data appears in real time. The dashboard auto-refreshes every 10 seconds.

Configuration

SpendProxy is configured via environment variables.

Variable        Default                    Description
CE_PROXY_PORT   4100                       Port to listen on
CE_DATA_DIR     ~/.spendproxy              Data directory for SQLite database and config
CE_PROXY_DB     ~/.spendproxy/ce-proxy.db  SQLite database path

Example with custom port and data directory:

 docker run -d \
  -p 8080:8080 \
  -e CE_PROXY_PORT=8080 \
  -e CE_DATA_DIR=/data \
  -v /host/data:/data \
  spendproxy/proxy:latest 

Supported providers and models

SpendProxy uses fuzzy matching for model IDs. Versioned identifiers like gpt-4o-2024-08-06 automatically resolve to the base model.
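As an illustration of how a versioned ID might resolve, here is a minimal sketch; the date-suffix rule and the abbreviated model list are assumptions for this example, not SpendProxy's actual matching logic:

```typescript
// Sketch of versioned-model resolution, assuming version suffixes are
// date-stamped (e.g. gpt-4o-2024-08-06). Not SpendProxy's real algorithm.
const KNOWN_MODELS = ['gpt-4.1', 'gpt-4.1-mini', 'gpt-4o', 'gpt-4o-mini', 'o3', 'o3-mini'];

function resolveModel(id: string): string | undefined {
  // Exact match first, then strip a trailing -YYYY-MM-DD suffix and retry.
  if (KNOWN_MODELS.includes(id)) return id;
  const base = id.replace(/-\d{4}-\d{2}-\d{2}$/, '');
  return KNOWN_MODELS.includes(base) ? base : undefined;
}
```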

OpenAI

gpt-4.1, gpt-4.1-mini, gpt-4.1-nano
gpt-4o, gpt-4o-mini
o3, o3-mini, o4-mini
Endpoint: /v1/*

Anthropic

claude-opus-4, claude-sonnet-4
claude-haiku-4
claude-3.5-sonnet, claude-3.5-haiku
Endpoint: /anthropic/*

Google

gemini-2.5-pro
gemini-2.5-flash
gemini-2.0-flash
Endpoint: /google/*

Attribution headers

SpendProxy auto-attributes costs using system prompt fingerprinting, toolset hashing, and SDK detection. You can also use optional headers for explicit control — they take priority over auto-detection.
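To make "fingerprints, not content" concrete, here is a minimal sketch of what a system-prompt fingerprint could look like; SHA-256 and the digest truncation length are assumptions of this sketch, not SpendProxy internals:

```typescript
import { createHash } from 'node:crypto';

// Illustrative fingerprint: hash the system prompt so only a short digest is
// ever kept, never the prompt text itself. The hash choice and 16-character
// truncation are assumptions for this sketch.
function fingerprintPrompt(systemPrompt: string): string {
  return createHash('sha256').update(systemPrompt, 'utf8').digest('hex').slice(0, 16);
}
```

The same system prompt always produces the same digest, which is enough to group requests by feature without retaining any prompt content.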

Request headers (optional)

Header        Purpose
X-CE-Route    Feature or endpoint name
X-CE-Tag      Custom tag (team, environment)
X-CE-Project  Project name
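One way to attach these headers from application code is to build them once and merge them into each request; attributionHeaders below is a hypothetical helper, not part of SpendProxy or any SDK:

```typescript
// Hypothetical helper that builds the optional attribution headers.
// Header names come from the table above; only set keys are emitted,
// so omitted fields fall back to SpendProxy's auto-detection.
type Attribution = { route?: string; tag?: string; project?: string };

function attributionHeaders(meta: Attribution): Record<string, string> {
  const headers: Record<string, string> = {};
  if (meta.route) headers['X-CE-Route'] = meta.route;
  if (meta.tag) headers['X-CE-Tag'] = meta.tag;
  if (meta.project) headers['X-CE-Project'] = meta.project;
  return headers;
}
```

The resulting object can be spread into the headers of any request that goes through the proxy, for example alongside the usual Authorization and Content-Type headers.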

Response headers (always included)

Header              Value
X-CE-Request-Id     Unique request ID
X-CE-Cost           Total cost in USD
X-CE-Input-Tokens   Input token count
X-CE-Output-Tokens  Output token count
X-CE-Cached-Tokens  Cached input tokens
X-CE-Model          Resolved model name
X-CE-Latency-Ms     Proxy overhead latency

For streaming responses, cost data is sent as a final SSE comment line of the form : ce-cost {"cost": 0.0023, ...}
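A client can recover that trailer once the stream completes. The sketch below assumes the comment arrives as a standard SSE comment line (leading colon) and that the payload is JSON; field names beyond cost are not specified here:

```typescript
// Sketch: pull the final cost comment out of a completed SSE stream body.
// Assumes a standard SSE comment line (": ce-cost {...}"); fields other
// than "cost" are whatever SpendProxy includes in the payload.
function parseCostComment(sseBody: string): { cost: number } | undefined {
  for (const line of sseBody.split('\n')) {
    const m = line.match(/^:\s*ce-cost\s+(\{.*\})\s*$/);
    if (m) return JSON.parse(m[1]);
  }
  return undefined;
}
```

Because SSE comment lines are ignored by spec-compliant event parsers, clients that do not look for the trailer are unaffected by it.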

Security model

Data residency

SpendProxy runs entirely in your infrastructure. The SQLite database, request logs, cost data, and attribution data all stay in your VPC. Nothing is transmitted to SpendProxy, CloudExpat, or any third-party service.

API key handling

API keys are passed through to the provider in the original request headers. SpendProxy never logs or stores your API keys; they are forwarded to the provider unmodified.

Network access

SpendProxy makes outbound HTTPS requests only to the AI provider APIs (api.openai.com, api.anthropic.com, generativelanguage.googleapis.com). It does not phone home, send telemetry, check for updates, or communicate with any other external service.

What SpendProxy stores locally

  • Request metadata: model, tokens, cost, latency, timestamps
  • Attribution data: system prompt fingerprints (hashes, not content), toolset hashes, SDK identifiers
  • Optimization state: cache entries, dedup signatures, budget counters
  • Never stored: prompt content, response content, or API keys