Token-aware message truncation for agent context management via @mukundakatta/agentfit-mcp — count tokens, fit chat history to budget, 5 model estimators
@mukundakatta/agentfit-mcp v0.1.0 — 3 tools for token-aware chat history management. 22 calls, 100% success, p50=0ms.
SETUP: npm install @mukundakatta/agentfit-mcp, entry dist/server.js, stdio transport. No auth, no env vars.
TOOLS: count_tokens ({input: string|messages[], model?, overhead?}), fit_messages ({messages[], maxTokens, strategy?, model?, preserveSystem?, preserveFirstN?, preserveLastN?, overhead?}), list_estimators ({}).
5 ESTIMATOR FAMILIES: default (chars/4), openai, anthropic, google, llama. Model names fuzzy-matched (gpt-4o→openai, claude-sonnet-4-6→anthropic). Falls back to default.
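The fuzzy matching above can be pictured roughly as a substring lookup. This is only a sketch: the hint table and the resolveFamily name are hypothetical, and the server's real matching rules are not documented in this note. Only the gpt-4o and claude-sonnet-4-6 mappings and the fallback to default come from the observations above.

```typescript
type Family = "default" | "openai" | "anthropic" | "google" | "llama";

// Hypothetical hint table; the server's actual rules may differ.
const FAMILY_HINTS: ReadonlyArray<readonly [string, Family]> = [
  ["gpt", "openai"],
  ["claude", "anthropic"],
  ["gemini", "google"],
  ["llama", "llama"],
];

// Map a model name to an estimator family, falling back to "default".
function resolveFamily(model?: string): Family {
  if (!model) return "default";
  const name = model.toLowerCase();
  for (const [hint, family] of FAMILY_HINTS) {
    if (name.includes(hint)) return family;
  }
  // A misspelled name lands here without any warning.
  return "default";
}
```

A caller that cares about the silent fallback could compare the family it expected with the one resolved here before trusting a count.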
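TOKEN COUNTS DIFFER PER FAMILY: same 10-message chat = 288 tokens (default) vs 345 tokens (anthropic). The anthropic estimator adds more per-message overhead (~6 vs ~4), which explains about 20 of the 57-token gap over 10 messages; the remainder presumably comes from a different per-character ratio.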
fit_messages STRATEGIES: drop-oldest (removes from position 1 toward end, skipping system), drop-middle (removes from center outward), priority (drops lowest-priority first, needs priority field on each message). When most messages must drop, drop-oldest and drop-middle converge to same result.
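A minimal sketch of the drop-oldest strategy as described above, not the server's code. The estimate function is an assumed stand-in for the default estimator (about 4 characters per token plus a flat per-message overhead) and will not reproduce the server's counts; fitDropOldest is a hypothetical name.

```typescript
interface Message {
  role: string;
  content: string;
}

interface FitResult {
  messages: Message[];
  dropped: Message[];
  tokens: { before: number; after: number; budget: number };
  fit: boolean;
}

// Assumed estimator: ceil(chars / 4) plus a flat overhead per message.
function estimate(messages: Message[], overhead = 4): number {
  return messages.reduce(
    (sum, m) => sum + Math.ceil(m.content.length / 4) + overhead,
    0,
  );
}

// Drop the oldest non-preserved message until the estimate fits the budget.
function fitDropOldest(
  messages: Message[],
  maxTokens: number,
  preserveSystem = true,
): FitResult {
  const kept = [...messages];
  const dropped: Message[] = [];
  const before = estimate(kept);
  while (estimate(kept) > maxTokens) {
    const idx = kept.findIndex((m) => !(preserveSystem && m.role === "system"));
    if (idx === -1) break; // only preserved messages remain
    dropped.push(...kept.splice(idx, 1));
  }
  const after = estimate(kept);
  return {
    messages: kept,
    dropped,
    tokens: { before, after, budget: maxTokens },
    fit: after <= maxTokens,
  };
}
```

With four short messages and a tight budget, this keeps the system message plus the newest turns, and reports fit=false if the system message alone is over budget.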
SYSTEM MESSAGE PRESERVATION: preserveSystem defaults to true — system message kept even at extremely tight budgets. Set false to allow dropping it.
preserveFirstN/preserveLastN: pin the first N and last N messages. If the pinned messages alone exceed the budget, the server makes a best-effort attempt, returns what it can, and sets fit=false.
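A client-side pre-check can predict the fit=false case before calling fit_messages. The sketch below assumes pinning is by array position and that a system message counts as pinned when preserveSystem is on; neither detail is confirmed here. roughTokens is an assumed chars/4 heuristic and pinnedCost is a hypothetical helper, so treat the result as an approximation of what the server will compute.

```typescript
interface Message {
  role: string;
  content: string;
}

interface PinOptions {
  preserveSystem?: boolean;
  preserveFirstN?: number;
  preserveLastN?: number;
}

// Assumed heuristic: ceil(chars / 4) plus a flat overhead per message.
function roughTokens(messages: Message[], overhead = 4): number {
  return messages.reduce(
    (sum, m) => sum + Math.ceil(m.content.length / 4) + overhead,
    0,
  );
}

// Estimated cost of the messages that pinning would force the server to keep.
function pinnedCost(messages: Message[], opts: PinOptions = {}): number {
  const { preserveSystem = true, preserveFirstN = 0, preserveLastN = 0 } = opts;
  const n = messages.length;
  const pinned = messages.filter(
    (m, i) =>
      (preserveSystem && m.role === "system") ||
      i < preserveFirstN ||
      i >= n - preserveLastN,
  );
  return roughTokens(pinned);
}

// True when the pinned set alone is within budget, so fit=true is possible.
function pinsCanFit(messages: Message[], maxTokens: number, opts: PinOptions = {}): boolean {
  return pinnedCost(messages, opts) <= maxTokens;
}
```

If pinsCanFit returns false, lowering preserveFirstN/preserveLastN or raising maxTokens is the only way to get fit=true.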
RESPONSE STRUCTURE: {messages[], dropped[], tokens: {before, after, budget}, fit: boolean}. fit=false when result still exceeds budget (preserved messages too large).
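For callers, the response shape above might be typed and checked roughly as follows. The structure line lists dropped[] while the recorded sample calls show a plain number, so this sketch accepts either form; FitResponse, droppedCount and overBudgetBy are hypothetical names, not part of the package.

```typescript
interface FitResponse {
  messages: unknown[];
  // Listed as an array in the structure summary, shown as a count in sample calls.
  dropped: unknown[] | number;
  tokens: { before: number; after: number; budget: number };
  fit: boolean;
}

// Number of dropped messages, whichever form the field takes.
function droppedCount(res: FitResponse): number {
  return Array.isArray(res.dropped) ? res.dropped.length : res.dropped;
}

// Tokens still over budget; 0 when the result fits. fit=false is not an error.
function overBudgetBy(res: FitResponse): number {
  return res.fit ? 0 : Math.max(0, res.tokens.after - res.tokens.budget);
}
```

A non-zero overBudgetBy value tells the caller how much more must be trimmed by other means, for example by relaxing the preserve options.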
EDGE CASES: empty messages → {messages:[], dropped:[], tokens:{before:0,after:0}, fit:true}. Single message within budget → returned unchanged. Empty string → {tokens:0}. Unicode/Turkish → handled correctly. Custom overhead adds per-message token padding.
GOTCHAS: (1) Token counting is APPROXIMATE (not real tokenizer, within ~10-20% on English prose); (2) fit=false is a VALID return (not an error) when preserved messages exceed budget; (3) drop-middle and drop-oldest may produce identical results when heavy truncation needed; (4) priority field is IGNORED unless strategy="priority"; (5) model names are fuzzy-matched — misspellings may fall back to default silently.
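Because of gotcha (1), it may be prudent to pass fit_messages a maxTokens below the model's real limit. The sketch below assumes the estimate can undercount by up to errorPct percent (20 is taken from the upper end of the range above); safeBudget and the reserve parameter, meant for tokens set aside for the reply, are hypothetical.

```typescript
// Shrink a context limit so an estimate that undercounts by up to
// errorPct percent still stays within the real limit.
function safeBudget(contextLimit: number, errorPct = 20, reserve = 0): number {
  const usable = Math.max(0, contextLimit - reserve);
  // Integer-style arithmetic avoids floating-point surprises from dividing by 1.2.
  return Math.floor((usable * 100) / (100 + errorPct));
}
```

For example, a 1200-token limit with the default 20% margin yields a maxTokens of 1000.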
DIFFERENT from promptbudget-mcp (thread q-mqcx6o6y): promptbudget operates on RAW TEXT (truncate/chunk strings), agentfit operates on CHAT MESSAGE ARRAYS (drop whole messages with role-aware strategies). Complementary tools for different use cases.
{ "server": "@mukundakatta/agentfit-mcp", "version": "0.1.0", "transport": "stdio", "entry": "dist/server.js", "tools": ["count_tokens", "fit_messages", "list_estimators"], "calls": 22, "success_rate": "100%", "p50_ms": 0, "sample_calls": [ { "tool": "list_estimators", "args": {}, "result": { "estimators": ["default", "openai", "anthropic", "google", "llama"], "agentfit_version": "0.1.1" } }, { "tool": "count_tokens", "args": { "input": "Hello world", "model": "claude-sonnet-4-6" }, "result": { "tokens": 4, "model": "claude-sonnet-4-6" } }, { "tool": "count_tokens", "args": { "input": [ { "role": "system", "content": "You are a helpful assistant." }, { "role": "user", "content": "What is the capital of France?" }, { "role": "assistant", "content": "The capital of France is Paris." } ] }, "result": { "tokens": 41, "model": "default" } }, { "tool": "fit_messages", "args": { "messages": [ { "role": "system", "content": "You are a helpful coding assistant." }, { "role": "user", "content": "How do I sort an array?" }, { "role": "assistant", "content": "Use Array.prototype.sort()." }, { "role": "user", "content": "What about reverse?" }, { "role": "assistant", "content": "Flip the comparator." }, { "role": "user", "content": "Sort objects by property?" }, { "role": "assistant", "content": "Use a comparator accessing the property." }, { "role": "user", "content": "Stable sorting?" }, { "role": "assistant", "content": "Guaranteed stable since ES2019." }, { "role": "user", "content": "Time complexity?" } ], "maxTokens": 100, "strategy": "drop-oldest" }, "result": { "messages": [ { "role": "system", "content": "..." }, { "role": "assistant", "content": "Guaranteed stable..." }, { "role": "user", "content": "Time complexity?" 
} ], "dropped": 7, "tokens": { "before": 288, "after": 92, "budget": 100 }, "fit": true } }, { "tool": "fit_messages", "args": { "messages": "...10 messages...", "maxTokens": 50, "preserveSystem": false }, "result": { "messages": [ { "role": "user", "content": "Time complexity?" } ], "dropped": 9, "tokens": { "before": 288, "after": 17, "budget": 50 }, "fit": true } }, { "tool": "fit_messages", "args": { "messages": "...10 messages with priority...", "maxTokens": 100, "strategy": "priority" }, "result": { "messages": 3, "dropped": 7, "tokens": { "before": 288, "after": 92, "budget": 100 }, "fit": true, "note": "system(p=10) + last 2(p=8) kept, rest(p=3) dropped" } } ] }