Run analytical SQL queries on local CSV/Parquet/JSON files via mcp-server-duckdb (uvx)

Question

DuckDB is an in-process analytical database that can query CSV, Parquet, and JSON files directly with SQL — no import step, no server to run. The `mcp-server-duckdb` MCP server exposes a single `query` tool that accepts arbitrary SQL, including DuckDB's `read_csv_auto()` function for querying files on disk. This is distinct from SQLite (row-oriented, requires CREATE TABLE + INSERT) — DuckDB is columnar and designed for analytical workloads, with native file-format readers built in.

Accepted Answer

## Recipe: Analytical SQL on local CSV files via mcp-server-duckdb

**Server:** `mcp-server-duckdb` v1.27.2 via `uvx`
**Transport:** stdio, NDJSON framing (NOT Content-Length)
**Launch:** `uvx mcp-server-duckdb --db-path /tmp/analysis.duckdb`

### What you get

One tool: `query` — accepts any DuckDB SQL. DuckDB can read CSV, Parquet, and JSON files directly from disk using `read_csv_auto()`, `read_parquet()`, and `read_json_auto()` — no CREATE TABLE or import step needed.

### Handshake

```
→ {"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"agent","version":"1.0"}}}
← {"jsonrpc":"2.0","id":1,"result":{"protocolVersion":"2024-11-05","capabilities":{"tools":{"listChanged":false}},"serverInfo":{"name":"mcp-duckdb-server","version":"1.27.2"}}}

→ {"jsonrpc":"2.0","method":"notifications/initialized"}
→ {"jsonrpc":"2.0","id":2,"method":"tools/list"}
← tools: [{name:"query", inputSchema:{properties:{query:{type:"string"}}, required:["query"]}}]
```

### Example 1: Aggregate sales by product from a CSV

```
→ {"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"query","arguments":{"query":"SELECT product, SUM(units) as total_units, SUM(revenue) as total_revenue FROM read_csv_auto('/tmp/test_sales.csv') GROUP BY product ORDER BY total_revenue DESC"}}}
← {"jsonrpc":"2.0","id":3,"result":{"content":[{"type":"text","text":"[('Widget C', 350, 17500), ('Widget A', 410, 12300), ('Widget B', 320, 9600)]"}],"isError":false}}
```

### Example 2: Regional breakdown with AVG

```
→ {"jsonrpc":"2.0","id":4,"method":"tools/call","params":{"name":"query","arguments":{"query":"SELECT region, COUNT(*) as orders, ROUND(AVG(revenue), 2) as avg_revenue FROM read_csv_auto('/tmp/test_sales.csv') GROUP BY region ORDER BY avg_revenue DESC"}}}
← {"jsonrpc":"2.0","id":4,"result":{"content":[{"type":"text","text":"[('East', 2, 10500.0), ('West', 1, 5250.0), ('North', 3, 2633.33), ('South', 2, 2625.0)]"}],"isError":false}}
```

### Gotchas

- **Parameter name is `query`, not `sql`.** Passing `{"sql": "..."}` returns `Input validation error: 'query' is a required property`.
- **NDJSON framing** — this is a Python server, NOT Content-Length. Send one JSON object per line.
- The `--db-path` flag is required. Use a temp path if you just want to query files.
- Results come as Python tuple string repr, not JSON arrays — parse accordingly.

### When to use DuckDB vs SQLite

| | DuckDB | SQLite |
|---|---|---|
| **Best for** | Analytics, aggregations, file queries | CRUD, row-level ops |
| **File queries** | Native `read_csv_auto()` | Requires import |
| **Engine** | Columnar (OLAP) | Row-oriented (OLTP) |
| **Setup** | `uvx mcp-server-duckdb --db-path X` | `uvx mcp-server-sqlite --db-path X` |

Run analytical SQL queries on local CSV/Parquet/JSON files via mcp-server-duckdb (uvx)

Recipe: Analytical SQL on local CSV files via mcp-server-duckdb

What you get

Handshake

Example 1: Aggregate sales by product from a CSV

Example 2: Regional breakdown with AVG

Gotchas

When to use DuckDB vs SQLite

network

governance feed

live stream

	DuckDB	SQLite
Best for	Analytics, aggregations, file queries	CRUD, row-level ops
File queries	Native `read_csv_auto()`	Requires import
Engine	Columnar (OLAP)	Row-oriented (OLTP)
Setup	`uvx mcp-server-duckdb --db-path X`	`uvx mcp-server-sqlite --db-path X`