Thomas
Open source · v0.1

Get Thomas on the wire.

The orchestrator's instrument panel for AI agents.

The 2026 founder's job is to orchestrate agents — Claude Code shipping code, Claude Cowork running ops, Hermes answering support, a dozen agents working in parallel while you sleep. You cannot orchestrate what you cannot see.

Thomas is the local-first flight recorder. Every model call, every tool call, every MCP message, every file change, every dollar — captured on your own machine, decoded into an execution graph you can actually read. The paid tiers add the control plane on top: block risky actions, set budgets, route models, govern across a team.

$ npm install -g @openthomas/thomas

What you'll see

$ thomas list
ID            STARTED              AGENT        STATUS  DUR     ACT  COST     IN/OUT  CACHE R/W
────────────  ───────────────────  ───────────  ──────  ──────  ───  ───────  ──────  ─────────
ru_aBc1xYz9   2026-05-16 14:23:11  claude-code  done    1.4s    1    $0.0024  120/45  0/0
ru_fOj6Ce1H   2026-05-16 14:22:18  hermes       done    7.8s    1    $0.97    1/396   565K/5K
$ thomas tail
14:42:49  claude-code   mcp_call    → filesystem  tools/list
14:42:49  claude-code   mcp_call    ← filesystem  tools/list
14:43:21  claude-code   model_call  claude-opus-4-7  200  6/8  $0.0007
!  14:44:02  hermes        model_call  Xiangxin-2XL-Chat  200  1/1547  $1.367
$ thomas absurd --since 24h --limit 5

Absurd cost-vs-output ratios · last 24h

    $1.37   io   510→    9   cache  840Kr/ 5.2Kw   claude-opus-4-7   claude-code   ru_a8f3c1d2   2h ago
    $1.16   io   510→    7   cache  740Kr/   2Kw   claude-opus-4-7   claude-code   ru_b7e5a094   4h ago
    $1.12   io   510→    5   cache  725Kr/ 1.4Kw   claude-opus-4-7   claude-code   ru_c641d8b3   7h ago
     $0.80   io     6→    7   cache     0r/  43Kw   claude-opus-4-7   claude-code   ru_d9f2e103  11h ago
     $0.53   io   510→   10   cache  330Kr/ 1.5Kw   claude-opus-4-7   claude-code   ru_e1a8c764  14h ago

Biggest spends · last 24h

   $15.55   io     6→  12K   cache     0r/ 782Kw   claude-opus-4-7   claude-code   ru_f4b1d2e8   3h ago
   $14.86   io     1→ 6.4K   cache     0r/ 767Kw   claude-opus-4-7   claude-code   ru_g7c9a0f1   6h ago

Period total: $163.42   absurd subset: $4.98 (3%)

The cache column is where the hidden $/turn lives. A single "yes/no" exchange that replays 725K tokens of cached context costs $1.40 — and there's no way to see it in a chat log. thomas absurd ranks runs by cost-per-output-token and biggest absolute spend, so the leaks surface themselves.

Model calls

Anthropic, OpenAI, Gemini, OpenRouter, vLLM, any OpenAI-compatible endpoint. Streaming and non-streaming. Full prompt / response / tool_use blocks.

MCP messages

JSON-RPC over stdio. Per-frame capture (request / response / notification) with method, params, result, error.

Tool calls

tool_use blocks (bash, file edit, web fetch, …) inline with the model response that produced them.

Cost & tokens

Including cache_read / cache_write attribution. See exactly when a prompt cache charged you and when it didn't.

Risk signals

Destructive shell (rm -rf, curl | sh), possible secret leak, cost spikes, retry storms. Read-only labels in the trace.

Replay

Re-execute any run deterministically against stored traces — debug without re-paying.

How Thomas fits the founder journey

Anthropic's Founder's Playbook names four stages every AI-native startup moves through. AI compresses each — and introduces new failure modes the playbook explicitly warns about. Thomas is the instrument set that lets the orchestrator see and control what AI is doing on their behalf, at every stage.

StageWhere you areThomas (free)Natural upgrade
Idea Ten hypotheses, ten Claude bills running in parallel Cost X-ray per agent, per project, per day Personal: budget caps so rabbit holes can't drain credits
MVP Agentic coding, unattended runs.
"Code that works is not code that is secure." — playbook
Full action trace, risk flags, mock replay Personal: approval gates, budget caps, model routing
Launch Real users on the line.
"The observability layer that makes SLAs actually enforceable." — playbook
Audit trail per run, shareable post-mortem reports Personal: live blocking. Solo: multi-project, CI replay, eval suites from real traces
Scale First hires, multiple machines, enterprise procurement wants proof you're a dependable infrastructure partner. (free keeps recording) Team: RBAC, SSO, shared policies, fleet management, audit log

Same job, new instruments. The playbook closes with "Same job, new rules" — the founder's work hasn't changed, only the path. Thomas is one of the instruments that makes the new path navigable.

Thomas (free) → Personal / Solo / Team

The free package is the see layer — capture, decode, timeline, cost X-ray, risk flags, mock replay. MIT licensed, fully local, complete on its own. The paid tiers are the control layer that builds on the same daemon.

 Thomas
(free)
PersonalSoloTeam
Local capture, decode, timeline
Cost X-ray, risk flags
Shareable reports, mock replay
Active blocking & policy
Approval gates (live YES/NO)
Budget caps & model routing
Encrypted cloud sync
Multi-project workspaces
CI replay (live, deterministic)
Eval suites from real traces
RBAC, SSO, shared policies
Fleet management & audit log

Paid features ship in the same binary, gated by a cloud-issued license token — no plugins, no separate CLI. The free package never calls cloud, never phones home, and is a complete flight recorder forever. Today (v0.1): free only. Personal launches in v0.4; Solo and Team follow.

Privacy is structural, not a promise

Local-first means local-first. Thomas writes to ~/.thomas/ on your machine. Nothing else.

No telemetry. No phone-home. No auto-update pings. No license-check calls (no license system exists in v0.1). Decoder selection inspects hostnames as string comparisons, not network calls.

The only outbound traffic is your own agent's request being forwarded to the model provider it was already calling — unchanged bytes, unchanged auth headers. Read the full contract: every file Thomas writes, every URL it can reach, two greps that audit the claim against the source.

Why not Langfuse / Helicone / LangSmith?

Those tools watch model calls. Thomas watches agent runtimes — the full execution graph including tool calls, MCP frames, file ops, shell exec, and inter-agent traffic.

If you've ever asked what did my agent actually do?, that's the gap.

Get started

$ npm install -g @openthomas/thomas

$ thomas wire                  # detect agents, install taps, start daemon
# …run your agent as usual…
$ thomas                       # open the UI at http://localhost:9877

thomas wire is byte-exact reversible: thomas unwire restores every file it touched. Auto-wires Claude Code, OpenClaw, OpenCode, Hermes; manual instructions for Codex, Cursor, Gemini CLI. macOS & Linux. Node.js 22+.