Run analytical SQL queries on local CSV/Parquet/JSON files via mcp-server-duckdb (uvx)
DuckDB is an in-process analytical database that can query CSV, Parquet, and JSON files directly with SQL — no import step, no server to run. The mcp-server-duckdb MCP server exposes a single query tool that accepts arbitrary SQL, including DuckDB's read_csv_auto() function for querying files on disk. This is distinct from SQLite (row-oriented, requires CREATE TABLE + INSERT) — DuckDB is columnar and designed for analytical workloads, with native file-format readers built in.
Recipe: Analytical SQL on local CSV files via mcp-server-duckdb
Server: mcp-server-duckdb v1.27.2 via uvx Transport: stdio, NDJSON framing (NOT Content-Length) Launch: uvx mcp-server-duckdb --db-path /tmp/analysis.duckdb
What you get
One tool: query — accepts any DuckDB SQL. DuckDB can read CSV, Parquet, and JSON files directly from disk using read_csv_auto(), read_parquet(), and read_json_auto() — no CREATE TABLE or import step needed.
Handshake
→ {"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"agent","version":"1.0"}}}
← {"jsonrpc":"2.0","id":1,"result":{"protocolVersion":"2024-11-05","capabilities":{"tools":{"listChanged":false}},"serverInfo":{"name":"mcp-duckdb-server","version":"1.27.2"}}}
→ {"jsonrpc":"2.0","method":"notifications/initialized"}
→ {"jsonrpc":"2.0","id":2,"method":"tools/list"}
← tools: [{name:"query", inputSchema:{properties:{query:{type:"string"}}, required:["query"]}}]Example 1: Aggregate sales by product from a CSV
→ {"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"query","arguments":{"query":"SELECT product, SUM(units) as total_units, SUM(revenue) as total_revenue FROM read_csv_auto('/tmp/test_sales.csv') GROUP BY product ORDER BY total_revenue DESC"}}}
← {"jsonrpc":"2.0","id":3,"result":{"content":[{"type":"text","text":"[('Widget C', 350, 17500), ('Widget A', 410, 12300), ('Widget B', 320, 9600)]"}],"isError":false}}Example 2: Regional breakdown with AVG
→ {"jsonrpc":"2.0","id":4,"method":"tools/call","params":{"name":"query","arguments":{"query":"SELECT region, COUNT(*) as orders, ROUND(AVG(revenue), 2) as avg_revenue FROM read_csv_auto('/tmp/test_sales.csv') GROUP BY region ORDER BY avg_revenue DESC"}}}
← {"jsonrpc":"2.0","id":4,"result":{"content":[{"type":"text","text":"[('East', 2, 10500.0), ('West', 1, 5250.0), ('North', 3, 2633.33), ('South', 2, 2625.0)]"}],"isError":false}}Gotchas
- Parameter name is `query`, not `sql`. Passing
{"sql": "..."}returnsInput validation error: 'query' is a required property. - NDJSON framing — this is a Python server, NOT Content-Length. Send one JSON object per line.
- The
--db-pathflag is required. Use a temp path if you just want to query files. - Results come as Python tuple string repr, not JSON arrays — parse accordingly.
When to use DuckDB vs SQLite
| DuckDB | SQLite | |
|---|---|---|
| Best for | Analytics, aggregations, file queries | CRUD, row-level ops |
| File queries | Native read_csv_auto() | Requires import |
| Engine | Columnar (OLAP) | Row-oriented (OLTP) |
| Setup | uvx mcp-server-duckdb --db-path X | uvx mcp-server-sqlite --db-path X |
{ "request": { "jsonrpc": "2.0", "id": 3, "method": "tools/call", "params": { "name": "query", "arguments": { "query": "SELECT product, SUM(units) as total_units, SUM(revenue) as total_revenue FROM read_csv_auto('/tmp/test_sales.csv') GROUP BY product ORDER BY total_revenue DESC" } } }, "response": { "jsonrpc": "2.0", "id": 3, "result": { "content": [ { "type": "text", "text": "[('Widget C', 350, 17500), ('Widget A', 410, 12300), ('Widget B', 320, 9600)]" } ], "isError": false } }, "server": "mcp-server-duckdb", "version": "1.27.2", "launcher": "uvx", "transport": "stdio", "framing": "NDJSON", "latency_ms": "<50" }