Run analytical SQL on local CSV/JSON/Parquet files via mcp-server-duckdb (uvx)
DuckDB via MCP lets an agent run full SQL analytics on local files (CSV, JSON, Parquet) without standing up a database server. The server exposes a single query tool — you pass raw SQL and get results as tuples. DuckDB's read_csv_auto() auto-detects column names, types, and delimiters from the CSV file. Supports CREATE TABLE for in-memory work, plus the full DuckDB SQL dialect (window functions, CTEs, date functions, JSON extraction, etc.).
Recipe: Run SQL analytics on local CSV files via mcp-server-duckdb (uvx)
Spawn
uvx --from mcp-server-duckdb mcp-server-duckdb --db-path /tmp/my-analysis.duckdbTransport: stdio, NDJSON framing (newline-delimited JSON — NOT Content-Length).
Server info
- Name:
mcp-duckdb-server - Version:
1.27.2 - Tools:
query(single tool — pass any DuckDB SQL) - Parameter:
{"query": "SELECT ..."}(note: the param isquery, notsql)
Tool schema
{
"name": "query",
"inputSchema": {
"type": "object",
"properties": {
"query": { "type": "string", "description": "SQL query to execute" }
},
"required": ["query"]
}
}Use case 1: In-memory tables
Create tables, insert rows, run aggregations — all in a single session:
CREATE TABLE cities (name TEXT, country TEXT, population INTEGER);
INSERT INTO cities VALUES ('Tokyo','Japan',13960000), ('Istanbul','Turkey',15460000), ('London','UK',8982000);→ Result: [(3,)] (3 rows inserted)
SELECT name, country, population FROM cities ORDER BY population DESC;→ [('Istanbul', 'Turkey', 15460000), ('Tokyo', 'Japan', 13960000), ('London', 'UK', 8982000)]
Use case 2: Query CSV files directly (the killer feature)
No import step needed — DuckDB reads CSV files with auto-detected schemas:
DESCRIBE SELECT * FROM read_csv_auto('/tmp/sample-sales.csv');→ Auto-detected schema: date DATE, product VARCHAR, region VARCHAR, units BIGINT, revenue DOUBLE
SELECT product, SUM(units) as total_units, ROUND(SUM(revenue),2) as total_revenue, ROUND(AVG(revenue/units),2) as avg_price_per_unit
FROM read_csv_auto('/tmp/sample-sales.csv')
GROUP BY product ORDER BY total_revenue DESC;→ [('Widget C', 600, 18000.0, 30.0), ('Widget B', 420, 16800.0, 40.0), ('Widget A', 545, 10900.0, 20.0)]
SELECT strftime(date::DATE, '%Y-%m') as month, COUNT(*) as transactions, SUM(units) as total_units, ROUND(SUM(revenue),2) as total_revenue
FROM read_csv_auto('/tmp/sample-sales.csv')
GROUP BY month ORDER BY month;→ [('2026-01', 4, 500, 13700.0), ('2026-02', 4, 500, 17000.0), ('2026-03', 4, 565, 15000.0)]
Gotchas
- NDJSON only: this Python MCP server uses newline-delimited JSON, NOT Content-Length framing. Sending Content-Length headers causes
pydantic ValidationError: Invalid JSON. - Param name: the tool parameter is
query, notsql. Sending{"sql": "..."}returnsInput validation error: 'query' is a required property. - Cold start: uvx first-install ~3s, subsequent runs ~1s.
- DuckDB also supports
read_parquet()andread_json_auto()for other file formats.
{ "server": "mcp-server-duckdb", "version": "1.27.2", "spawn": "uvx --from mcp-server-duckdb mcp-server-duckdb --db-path /tmp/pathfinder-test3.duckdb", "transport": "stdio", "framing": "NDJSON", "tools": ["query"], "trace": [ { "step": "initialize", "request": { "jsonrpc": "2.0", "id": 1, "method": "initialize", "params": { "protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": { "name": "pathfinder", "version": "1.0" } } }, "response": { "serverInfo": { "name": "mcp-duckdb-server", "version": "1.27.2" }, "protocolVersion": "2024-11-05" } }, { "step": "tools/list", "request": { "jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {} }, "response": { "tools": [ { "name": "query", "inputSchema": { "type": "object", "properties": { "query": { "type": "string" } }, "required": ["query"] } } ] } }, { "step": "query — CREATE+INSERT (in-memory)", "request": { "jsonrpc": "2.0", "id": 3, "method": "tools/call", "params": { "name": "query", "arguments": { "query": "CREATE TABLE cities (name TEXT, country TEXT, population INTEGER); INSERT INTO cities VALUES ('Tokyo','Japan',13960000),('Istanbul','Turkey',15460000),('London','UK',8982000),('Berlin','Germany',3645000),('Paris','France',2161000);" } } }, "response": { "content": [ { "type": "text", "text": "[(5,)]" } ], "isError": false } }, { "step": "query — SELECT ORDER BY (in-memory)", "request": { "jsonrpc": "2.0", "id": 4, "method": "tools/call", "params": { "name": "query", "arguments": { "query": "SELECT name, country, population FROM cities ORDER BY population DESC;" } } }, "response": { "content": [ { "type": "text", "text": "[('Istanbul', 'Turkey', 15460000), ('Tokyo', 'Japan', 13960000), ('London', 'UK', 8982000), ('Berlin', 'Germany', 3645000), ('Paris', 'France', 2161000)]" } ], "isError": false } }, { "step": "query — DESCRIBE read_csv_auto (file analytics)", "request": { "jsonrpc": "2.0", "id": 3, "method": "tools/call", "params": { "name": "query", "arguments": { "query": "DESCRIBE SELECT * FROM read_csv_auto('/tmp/sample-sales.csv');" } } }, "response": { "content": [ { "type": "text", "text": "[('date', 'DATE', 'YES', None, None, None), ('product', 'VARCHAR', 'YES', None, None, None), ('region', 'VARCHAR', 'YES', None, None, None), ('units', 'BIGINT', 'YES', None, None, None), ('revenue', 'DOUBLE', 'YES', None, None, None)]" } ], "isError": false } }, { "step": "query — GROUP BY aggregation on CSV", "request": { "jsonrpc": "2.0", "id": 4, "method": "tools/call", "params": { "name": "query", "arguments": { "query": "SELECT product, SUM(units) as total_units, ROUND(SUM(revenue),2) as total_revenue FROM read_csv_auto('/tmp/sample-sales.csv') GROUP BY product ORDER BY total_revenue DESC;" } } }, "response": { "content": [ { "type": "text", "text": "[('Widget C', 600, 18000.0), ('Widget B', 420, 16800.0), ('Widget A', 545, 10900.0)]" } ], "isError": false } }, { "step": "query — monthly trend with strftime", "request": { "jsonrpc": "2.0", "id": 5, "method": "tools/call", "params": { "name": "query", "arguments": { "query": "SELECT strftime(date::DATE, '%Y-%m') as month, COUNT(*) as transactions, SUM(units) as total_units, ROUND(SUM(revenue),2) as total_revenue FROM read_csv_auto('/tmp/sample-sales.csv') GROUP BY month ORDER BY month;" } } }, "response": { "content": [ { "type": "text", "text": "[('2026-01', 4, 500, 13700.0), ('2026-02', 4, 500, 17000.0), ('2026-03', 4, 565, 15000.0)]" } ], "isError": false } } ], "failure_report": { "ndjson_required": "Python MCP SDK uses NDJSON, not Content-Length. Content-Length framing causes pydantic ValidationError.", "wrong_param_name": "Tool param is 'query', not 'sql'. Using 'sql' returns Input validation error." }, "executed_at": "2026-06-13T12:15:00Z" }