TokenPak — local LLM proxy
TokenPak is a local proxy that packs your LLM context before it hits the API — fewer repeated tokens on the wire, more reuse opportunities, and savings you can measure per-request.
It adds the missing logistics layer: packing, routing, reusable context, guardrails, orchestration, and per-request records.
The logistics layer, as a capability grid.
Deterministically pack context before it goes on the wire — stop re-sending the same boilerplate, file contents, and system prompts.
Send each request to the right model, with fallback rules you control.
Recall and reuse context from local PAKs instead of rebuilding it every session.
Spend and safety guardrails run as a side-channel gatehouse across every request.
Coordinate scoped, multi-step and multi-agent work. Preview (alpha) — not yet in a published release.
Every request leaves a local record — what was sent, reused, observed, and charged.
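To make the packing idea concrete, here is a minimal sketch of one way repeated context could be detected deterministically: hash each context block and replace blocks already seen with a short reference. This is an illustration only. The `pack_blocks` function, the `seen` set, and the `ref`/`id` record shape are hypothetical names invented for this example, not TokenPak's actual format or algorithm.

```python
import hashlib

def pack_blocks(blocks, seen):
    """Replace blocks whose content was already seen with short hash references.

    `seen` is a set of digests carried across requests; it is updated in place.
    The same inputs always produce the same output, so packing is deterministic.
    """
    packed = []
    for text in blocks:
        # Content-addressed ID: identical text always maps to the same digest.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        if digest in seen:
            packed.append({"ref": digest})  # repeated block: send a reference only
        else:
            seen.add(digest)
            packed.append({"id": digest, "text": text})  # first sighting: send in full
    return packed

seen = set()
first = pack_blocks(["You are a helpful assistant.", "contents of a.py"], seen)
second = pack_blocks(["You are a helpful assistant.", "What does a.py do?"], seen)
```

In the second call the system prompt collapses to a reference while the new question is sent in full, which is the kind of per-request difference a local record could attribute.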
One linear path, packed and recorded locally before it leaves your machine.
Reuse feed: PAKs feed into the Packing Station — recall and reuse context instead of rebuilding it.
Gatehouse (Guard): spend and safety guardrails run as a side-channel overlay before Send.
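A spend guardrail that runs before Send can be pictured as a small budget check wrapped around the outgoing call. The sketch below assumes a single running budget in US dollars and a caller-supplied cost estimate; `SpendGuard`, `send_with_guard`, and their parameters are hypothetical names for illustration, not TokenPak's API.

```python
from dataclasses import dataclass

@dataclass
class SpendGuard:
    budget_usd: float          # total spend allowed (assumed unit: US dollars)
    spent_usd: float = 0.0     # running total of recorded spend

    def allows(self, estimated_cost_usd: float) -> bool:
        """Return True if the request fits within the remaining budget."""
        return self.spent_usd + estimated_cost_usd <= self.budget_usd

    def record(self, cost_usd: float) -> None:
        """Add a completed request's cost to the running total."""
        self.spent_usd += cost_usd

def send_with_guard(guard, estimated_cost_usd, send):
    """Check the guard first; call `send` only if the budget allows it."""
    if not guard.allows(estimated_cost_usd):
        raise RuntimeError("spend guard: request would exceed budget")
    result = send()
    guard.record(estimated_cost_usd)  # sketch: records the estimate as the cost
    return result

guard = SpendGuard(budget_usd=1.0)
reply = send_with_guard(guard, 0.75, lambda: "ok")
```

Keeping the check outside the `send` callable mirrors the side-channel idea: the request path itself does not need to know about budgets.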
TokenPak overlaps with tools you may already run — and adds a layer they don't. Run them together.
Flow diagram: Your agent (Claude Code, Cursor, Cline) → TokenPak, local (pack · measure · guard · record) → Your gateway (LiteLLM / OpenRouter / …) → Model provider (Anthropic, OpenAI, Google Gemini)
Dispatch Center dashboard (preview) — orchestration for multi-step and multi-agent work, around the flow.
| Tool type | Overlaps on | TokenPak adds | Together? |
|---|---|---|---|
| Gateways / routers | routing | deterministic context packing + reusable PAKs | Run both. |
| Observability tools | measurement | pre-send packing + per-request context-level attribution | Run both. |
| MCP-based workflows | ecosystem coordination | a semantic contract (TIP) for packing, routing, cost, telemetry | Composes. |
Reduce the repeated context your agents re-send. Improve cache reuse. Receipt-backed measurements from benchmarked workloads are forthcoming.
TokenPak's core is open source (Apache-2.0). Pro adds team dashboards, advanced routing, and enterprise controls.
Pro is delivered as the tokenpak-paid package via a separate index.
Refreshed automatically on every release and on a daily safety-net schedule.
### Added
- **TokenPak Dispatch graduates from preview to a released feature.** The `tokenpak dispatch` command (intake/routing, Decision Inbox, run-ledger lifecycle, observability) now ships in the released `pip install tokenpak` package: the Dispatch engine, its registry/schema data, and the user guide are included in the wheel. Delivery/receipt remain an explicit post-alpha preview (no live station execution wired yet).
Curated entry points sourced from tokenpak/docs, all tagged "starter":
- Install TokenPak and see savings in one command.
- pip install options, OS notes, troubleshooting install failures.
- Route Claude Code through TokenPak with one environment variable.
- Point the OpenAI Python SDK at TokenPak with OPENAI_BASE_URL.
- Point the Anthropic Python SDK at TokenPak with ANTHROPIC_BASE_URL.
- Route Cline through TokenPak with a custom compatible base URL.
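The SDK entries above rely on the `OPENAI_BASE_URL` and `ANTHROPIC_BASE_URL` environment variables, which the official OpenAI and Anthropic Python SDKs read when no base URL is passed explicitly. A minimal sketch of preparing such an environment follows. The `proxied_env` helper is hypothetical, and the address shown is a placeholder: the real host, port, and any path suffix depend on your local TokenPak setup and are not stated here.

```python
import os

def proxied_env(proxy_url, base_env=None):
    """Return a copy of the environment with both SDK base URLs set to the proxy."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENAI_BASE_URL"] = proxy_url     # read by the OpenAI Python SDK
    env["ANTHROPIC_BASE_URL"] = proxy_url  # read by the Anthropic Python SDK
    return env

# Placeholder address (assumption): substitute the one your local proxy reports.
env = proxied_env("http://127.0.0.1:8000", base_env={"PATH": "/usr/bin"})
```

The resulting mapping can be passed as `env=` to `subprocess.run` to launch an agent or script through the proxy without changing its code. Whether one URL suits both SDKs is an assumption of this sketch.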
pip install tokenpak && tokenpak setup. No cloud component; credentials stay in your environment and provider flow.