Documentation
Everything you need to deploy and configure SpendProxy.
Quick start
1. Start SpendProxy
Using Docker:
docker run -d -p 4100:4100 spendproxy/proxy:latest
Or using npx:
npx @cloudexpat/spendproxy
2. Point your AI SDK at SpendProxy
Change the base URL in your client. Your API keys pass through untouched.
import OpenAI from 'openai';
const client = new OpenAI({
baseURL: 'http://localhost:4100/v1', // was: https://api.openai.com/v1
});
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic({
baseURL: 'http://localhost:4100/anthropic', // was: https://api.anthropic.com
});
// Change the URL prefix:
// was: https://generativelanguage.googleapis.com/v1beta/models/...
// now: http://localhost:4100/google/v1beta/models/...
3. Open the dashboard
Navigate to http://localhost:4100/dashboard. Cost data appears in real time, and the dashboard auto-refreshes every 10 seconds.
Configuration
SpendProxy is configured via environment variables.
| Variable | Default | Description |
|---|---|---|
| CE_PROXY_PORT | 4100 | Port to listen on |
| CE_DATA_DIR | ~/.spendproxy | Data directory for SQLite database and config |
| CE_PROXY_DB | ~/.spendproxy/ce-proxy.db | SQLite database path |
Example with custom port and data directory:
docker run -d \
-p 8080:8080 \
-e CE_PROXY_PORT=8080 \
-e CE_DATA_DIR=/data \
-v /host/data:/data \
spendproxy/proxy:latest
Supported providers and models
SpendProxy uses fuzzy matching for model IDs. Versioned identifiers like gpt-4o-2024-08-06 automatically resolve to the base model.
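The resolution step can be pictured with a minimal sketch. This is illustrative only, not SpendProxy's actual matching algorithm, and the model list here is an assumed sample:

```typescript
// Illustrative sketch of versioned-ID resolution -- SpendProxy's real
// matching logic may differ. Model list is a placeholder sample.
const KNOWN_BASE_MODELS = new Set(["gpt-4o", "claude-3-5-sonnet", "gemini-1.5-pro"]);

function resolveBaseModel(modelId: string): string {
  if (KNOWN_BASE_MODELS.has(modelId)) return modelId;
  // Strip a trailing date-style version suffix like -2024-08-06 or -20240806
  const stripped = modelId.replace(/-\d{4}-?\d{2}-?\d{2}$/, "");
  if (KNOWN_BASE_MODELS.has(stripped)) return stripped;
  return modelId; // unknown IDs fall through unchanged
}

console.log(resolveBaseModel("gpt-4o-2024-08-06")); // "gpt-4o"
```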
| Provider | Route prefix |
|---|---|
| OpenAI | /v1/* |
| Anthropic | /anthropic/* |
| Google | /google/* |
Attribution headers
SpendProxy auto-attributes costs using system prompt fingerprinting, toolset hashing, and SDK detection. You can also use optional headers for explicit control — they take priority over auto-detection.
Request headers (optional)
| Header | Purpose |
|---|---|
| X-CE-Route | Feature or endpoint name |
| X-CE-Tag | Custom tag (team, environment) |
| X-CE-Project | Project name |
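As a sketch of how these headers might be attached, a small helper that merges the optional attribution fields into an existing header map (the example values are assumptions, not required names):

```typescript
// Sketch: merge optional attribution headers into a request's header map.
// Header names come from the table above; the values below are examples.
function withAttribution(
  headers: Record<string, string>,
  attr: { route?: string; tag?: string; project?: string }
): Record<string, string> {
  const out = { ...headers };
  if (attr.route) out["X-CE-Route"] = attr.route;
  if (attr.tag) out["X-CE-Tag"] = attr.tag;
  if (attr.project) out["X-CE-Project"] = attr.project;
  return out;
}

const headers = withAttribution(
  { Authorization: "Bearer sk-..." },
  { route: "summarize-article", tag: "team-growth", project: "newsletter" }
);
```

If you use the official OpenAI Node SDK, the same headers can also be supplied once via its `defaultHeaders` client option.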
Response headers (always included)
| Header | Value |
|---|---|
| X-CE-Request-Id | Unique request ID |
| X-CE-Cost | Total cost in USD |
| X-CE-Input-Tokens | Input token count |
| X-CE-Output-Tokens | Output token count |
| X-CE-Cached-Tokens | Cached input tokens |
| X-CE-Model | Resolved model name |
| X-CE-Latency-Ms | Proxy overhead latency |
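Since these headers are always included, per-request cost can be read straight off any response. A minimal sketch, using a standalone `Headers` object for illustration (in practice it would come from a `fetch` through the proxy):

```typescript
// Sketch: read cost metadata from SpendProxy response headers.
function readCost(headers: Headers): { requestId: string | null; costUsd: number } {
  return {
    requestId: headers.get("X-CE-Request-Id"),
    costUsd: Number(headers.get("X-CE-Cost") ?? 0), // cost in USD
  };
}
```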
For streaming responses, cost data is sent as a final SSE comment: ce-cost {"cost": 0.0023, ...}
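A sketch of extracting that payload from a raw stream line. It assumes the comment arrives as an SSE comment line prefixed with `:`; check your raw stream output, since the exact framing may differ:

```typescript
// Sketch: pull the cost payload out of the final SSE comment line.
// Assumed input shape: `: ce-cost {"cost": 0.0023, ...}`
function parseCeCost(sseLine: string): { cost: number } | null {
  const m = sseLine.match(/ce-cost\s+(\{.*\})/);
  if (!m) return null;
  try {
    return JSON.parse(m[1]);
  } catch {
    return null; // malformed JSON -- ignore the line
  }
}

console.log(parseCeCost(': ce-cost {"cost": 0.0023}')); // { cost: 0.0023 }
```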
Security model
Data residency
SpendProxy runs entirely in your infrastructure. The SQLite database, request logs, cost data, and attribution data all stay in your VPC. Nothing is transmitted to SpendProxy, CloudExpat, or any third-party service.
API key handling
API keys are passed through to the provider in the original request headers. SpendProxy never reads, logs, or stores your API keys. They're forwarded as-is.
Network access
SpendProxy makes outbound HTTPS requests only to the AI provider APIs (api.openai.com, api.anthropic.com, generativelanguage.googleapis.com). It does not phone home, send telemetry, check for updates, or communicate with any other external service.
What SpendProxy stores locally
- ✓ Request metadata: model, tokens, cost, latency, timestamps
- ✓ Attribution data: system prompt fingerprints (hashes, not content), toolset hashes, SDK identifiers
- ✓ Optimization state: cache entries, dedup signatures, budget counters
- × Prompt content, response content, or API keys are never stored