TokenPak API Reference

This reference is the source of truth: it matches the endpoints implemented in proxy_v4.py. The API is served on the proxy port (default 8766).

Base URL: http://localhost:8766


Authentication

Most endpoints require no authentication for local use.

The dashboard can be protected with an admin token (set dashboard.require_token in the config, or set TOKENPAK_DASHBOARD_AUTH=true). When protection is enabled, send the token in the X-Admin-Token header.
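
As a rough illustration of the header described above, the following Python sketch builds (but does not send) a dashboard request carrying X-Admin-Token. The helper name dashboard_request and the token value are hypothetical; only the header name, path, and port come from this reference.

```python
import urllib.request

BASE_URL = "http://localhost:8766"  # default proxy port from this reference


def dashboard_request(admin_token=None):
    """Build a GET request for /dashboard (hypothetical helper).

    The X-Admin-Token header is attached only when a token is supplied,
    mirroring the "when required" wording above.
    """
    req = urllib.request.Request(f"{BASE_URL}/dashboard", method="GET")
    if admin_token:
        req.add_header("X-Admin-Token", admin_token)
    return req


# Placeholder token, for illustration only. Nothing is sent here; pass the
# request to urllib.request.urlopen(req) once the proxy is running.
req = dashboard_request(admin_token="example-token")
```

Building the request separately from sending it makes it easy to inspect the headers before the proxy is running.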


Core LLM Proxy

POST /v1/*

Route an LLM request through TokenPak. Supports OpenAI-compatible and Anthropic-compatible clients.

curl -X POST http://localhost:8766/v1/messages \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ANTHROPIC_API_KEY" \
  -d '{"model":"claude-haiku-4-5","messages":[{"role":"user","content":"Hello"}],"max_tokens":100}'

TokenPak auto-detects the provider (Anthropic, OpenAI, Google) from the request path, headers, and body. The response is the provider's standard response, with no schema changes.
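
The curl call above could be reproduced from Python roughly as follows. This is a minimal sketch using only the standard library; the helper names build_messages_request and send are assumptions made for illustration, and the payload simply mirrors the curl example.

```python
import json
import os
import urllib.request

BASE_URL = "http://localhost:8766"  # default proxy port from this reference


def build_messages_request(prompt, model="claude-haiku-4-5", max_tokens=100):
    """Build a POST /v1/messages request equivalent to the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    # The key is read from the same environment variable the curl example uses.
    api_key = os.environ.get("ANTHROPIC_API_KEY", "")
    return urllib.request.Request(
        f"{BASE_URL}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(req, timeout=60):
    """Send the request and decode the JSON body (requires a running proxy)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_messages_request("Hello")  # built only; call send(request) to execute
```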


Health & Status

GET /health

Liveness check. Returns system status, module availability, and session stats.

curl http://localhost:8766/health

Response:

{
  "status": "ok",
  "compilation_mode": "full",
  "vault_index": { "available": true, "blocks": 42, "path": "/..." },
  "router": { "enabled": true },
  "capsule_available": true,
  "canon": { "enabled": true, "session_hits": 12 },
  "skeleton": { "enabled": false },
  "budget": { "enabled": true, "total_tokens": 200000 },
  "strict_validation": true,
  "upstream_timeout_seconds": 60,
  "circuit_breakers": {
    "anthropic": { "open": false, "failures": 0 }
  },
  "stats": { "requests": 47, "tokens_saved": 12400 }
}
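
A monitoring script might reduce this payload to a single pass/fail signal. The sketch below assumes that "healthy" means status is "ok" and no circuit breaker is open; that rule and the function names are assumptions, not something this reference specifies.

```python
def open_breakers(health):
    """Return the sorted names of providers whose circuit breaker is open."""
    breakers = health.get("circuit_breakers", {})
    return sorted(name for name, state in breakers.items() if state.get("open"))


def is_healthy(health):
    """Assumed rule: status must be "ok" and no breaker may be open."""
    return health.get("status") == "ok" and not open_breakers(health)


# Trimmed copy of the example response above.
sample_health = {
    "status": "ok",
    "circuit_breakers": {"anthropic": {"open": False, "failures": 0}},
}
healthy = is_healthy(sample_health)
```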

GET /stats

Session statistics and cost tracking.

curl http://localhost:8766/stats

Response:

{
  "session": { "requests": 47, "tokens_saved": 12400, "cost_usd": 0.08 },
  "compilation_mode": "full",
  "vault_index": { "available": true, "blocks": 42 },
  "router": { "enabled": true },
  "today": { "requests": 47, "cost_usd": 0.08 },
  "by_model": { "claude-haiku-4-5": { "requests": 30, "cost": 0.04 } },
  "recent": [ ... ]
}
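
One way to use the by_model block is to derive an average cost per request for each model. The following sketch assumes the field names shown in the example response (requests, cost); the function name is hypothetical.

```python
def cost_per_request(stats):
    """Map each model in by_model to its average cost (USD) per request."""
    result = {}
    for model, entry in stats.get("by_model", {}).items():
        requests = entry.get("requests", 0)
        cost = entry.get("cost", 0.0)
        # Guard against division by zero for models with no recorded requests.
        result[model] = cost / requests if requests else 0.0
    return result


# Trimmed copy of the example response above.
sample_stats = {
    "by_model": {"claude-haiku-4-5": {"requests": 30, "cost": 0.04}},
}
averages = cost_per_request(sample_stats)
```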

GET /stats/last

Per-request stats for the most recent request.

curl http://localhost:8766/stats/last

GET /stats/session

Current session-level stats (tokens in, tokens out, savings).

curl http://localhost:8766/stats/session

GET /cache-stats

Cache performance metrics (hit rate, miss count, TTL info).

curl http://localhost:8766/cache-stats

GET /recent

Last 50 requests with timing, model, token counts, and cost.

curl http://localhost:8766/recent

Data Ingest

POST /ingest

Ingest a single request record into TokenPak’s telemetry.

curl -X POST http://localhost:8766/ingest \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-haiku-4-5","tokens":1500,"cost":0.002}'

Required fields:

| Field  | Type      | Description                          |
|--------|-----------|--------------------------------------|
| model  | string    | Model name (e.g. claude-haiku-4-5)   |
| tokens | int ≥ 0   | Token count for this request         |
| cost   | float ≥ 0 | Cost in USD                          |

Optional fields: timestamp (ISO 8601), agent, session_id, tokens_saved, compression_rate
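
A client could check a record against the required fields before posting it. The sketch below is a client-side check only; how the server itself validates or rejects records is not described here, and the function name validate_record is hypothetical.

```python
def validate_record(record):
    """Return a list of problems with the required fields (empty if none)."""
    errors = []

    if not isinstance(record.get("model"), str):
        errors.append("model must be a string")

    tokens = record.get("tokens")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(tokens, bool) or not isinstance(tokens, int) or tokens < 0:
        errors.append("tokens must be an int >= 0")

    cost = record.get("cost")
    # Integers are accepted for cost because JSON does not distinguish 1 from 1.0.
    if isinstance(cost, bool) or not isinstance(cost, (int, float)) or cost < 0:
        errors.append("cost must be a number >= 0")

    return errors


# Same record as the curl example above.
problems = validate_record({"model": "claude-haiku-4-5", "tokens": 1500, "cost": 0.002})
```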


POST /ingest/batch

Ingest multiple records at once.

curl -X POST http://localhost:8766/ingest/batch \
  -H "Content-Type: application/json" \
  -d '[{"model":"claude-haiku-4-5","tokens":1500,"cost":0.002}, ...]'
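
When many records are queued, a client might split them into smaller batches before posting. This reference does not state a maximum batch size, so the size parameter in this sketch is purely an assumption, as are the helper names.

```python
import json


def chunk(records, size):
    """Split records into consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [records[i:i + size] for i in range(0, len(records), size)]


def batch_body(records):
    """Encode records as the JSON array body sent to /ingest/batch."""
    return json.dumps(records).encode("utf-8")


# Three copies of the example record, split into batches of two (hypothetical size).
records = [{"model": "claude-haiku-4-5", "tokens": 1500, "cost": 0.002}] * 3
batches = chunk(records, 2)
bodies = [batch_body(batch) for batch in batches]
```

Each entry in bodies can then be posted to /ingest/batch with a Content-Type of application/json, as in the curl example.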

Dashboard

GET /dashboard

Serves the TokenPak web dashboard (HTML UI). Open it in your browser at:

http://localhost:8766/dashboard

Shows token savings, cost trends, model usage, and session history.


Notes