Convert URLs, local files, or inline HTML to markdown via markitdown-mcp (uvx)
Microsoft's MarkItDown is a well-maintained document-to-markdown library. The markitdown-mcp wrapper exposes it as a single MCP tool: convert_to_markdown(uri). Unlike the existing html-to-markdown-mcp (built on Turndown.js) or magicconvert (also MarkItDown-based, but through a different wrapper), this is the official MarkItDown MCP server, with clean support for four URI schemes: http, https, file, and data. Tables, lists, bold/italic, headings, and links all convert cleanly.
Verified recipe: markitdown-mcp, converting content to markdown, probed with three URI schemes (https, file, data)
- Server: markitdown v1.8.1 (PyPI: markitdown-mcp v0.0.1a4)
- Launch: uvx markitdown-mcp (stdio transport, NDJSON framing, zero config, no auth)
- Tool: convert_to_markdown(uri: string), a single tool that accepts http/https/file/data URIs
- Backend: Microsoft MarkItDown library
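The request lines used in the probes can be produced with a few lines of Python. This is a minimal sketch, assuming the server reads one JSON object per line on stdin, as the NDJSON framing noted above implies. The helper names build_tool_call, frame, and extract_text are hypothetical, and the response shape (result.content entries with type "text") follows the general MCP tool-result convention rather than anything these probes confirm. A real session would also need the MCP initialize handshake before the first tools/call.

```python
import json


def build_tool_call(request_id: int, uri: str) -> dict:
    """Build a JSON-RPC 2.0 tools/call request for convert_to_markdown."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "convert_to_markdown", "arguments": {"uri": uri}},
    }


def frame(message: dict) -> bytes:
    """Encode one message as an NDJSON line: compact JSON plus a newline."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def extract_text(response_line: bytes) -> str:
    """Join the text parts of a tools/call response (assumed MCP result shape)."""
    response = json.loads(response_line)
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "unknown error"))
    parts = response["result"].get("content", [])
    return "".join(p.get("text", "") for p in parts if p.get("type") == "text")


# Reproduces the request line sent in Probe 1.
line = frame(build_tool_call(3, "https://httpbin.org/html"))
print(line.decode("utf-8"), end="")
```

Writing `line` to the stdin of a `uvx markitdown-mcp` process and passing the matching stdout line to extract_text would be the natural next step; that part is left out here because it depends on the handshake and on how the client manages the subprocess.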
Probe 1 — Remote URL (https)
→ {"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"convert_to_markdown","arguments":{"uri":"https://httpbin.org/html"}}}
← result: "# Herman Melville - Moby-Dick\n\nAvast, ye landlubbers!..."
Latency: 398 ms. Fetched remote HTML, converted headings and paragraphs cleanly.
Probe 2 — Local file (file://)
Created /tmp/markitdown_test.html:
<h1>Test</h1><table><tr><th>Name</th><th>Score</th></tr><tr><td>Alice</td><td>95</td></tr></table><ul><li><b>Bold item</b></li><li><a href="https://example.com">Link</a></li></ul>
→ {"jsonrpc":"2.0","id":4,"method":"tools/call","params":{"name":"convert_to_markdown","arguments":{"uri":"file:///tmp/markitdown_test.html"}}}
← result: "# Test\n\n| Name | Score |\n| --- | --- |\n| Alice | 95 |\n\n- **Bold item**\n- [Link](https://example.com)"
Latency: 14 ms. Tables render as GFM pipe tables. Bold, links, and lists are all preserved.
Probe 3 — Inline data URI (data:)
→ {"jsonrpc":"2.0","id":5,"method":"tools/call","params":{"name":"convert_to_markdown","arguments":{"uri":"data:text/html;base64,PGgxPlRlc3Q8L2gxPjxwPkhlbGxvIHdvcmxkPC9wPg=="}}}
← result: "# Test\n\nHello world"
Latency: 11 ms. Base64-encoded HTML decoded and converted correctly.
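The data: URI in Probe 3 is plain base64 over the HTML bytes, so it can be built with the standard library. A minimal sketch follows; the helper name html_to_data_uri is hypothetical, and UTF-8 encoding of the input is an assumption.

```python
import base64


def html_to_data_uri(html: str) -> str:
    """Wrap an HTML string in a base64 data: URI with a text/html MIME type."""
    encoded = base64.b64encode(html.encode("utf-8")).decode("ascii")
    return f"data:text/html;base64,{encoded}"


uri = html_to_data_uri("<h1>Test</h1><p>Hello world</p>")
print(uri)
# → data:text/html;base64,PGgxPlRlc3Q8L2gxPjxwPkhlbGxvIHdvcmxkPC9wPg==
```

The printed value is the same URI that Probe 3 passes as the uri argument.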
Key differences from html-to-markdown-mcp (Turndown.js)
- Backend: Microsoft MarkItDown (Python) vs Turndown.js (Node.js)
- URI schemes: Supports data: URIs natively (Turndown wrapper does not)
- Launch: uvx (Python) vs npx (Node.js)
- Scope: MarkItDown also handles PDFs, DOCX, XLSX, PPTX, images (OCR), not just HTML
Failure modes
- Bare file paths without the file:// prefix → "Unsupported URI scheme" error
- Very large remote pages may time out (no configurable timeout exposed)
- data: URIs must include a proper MIME type prefix (for example, data:text/html;base64,)
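Two of these failure modes can be caught on the client side before the request is sent. The sketch below is one possible guard, not part of markitdown-mcp: the function name to_supported_uri and the exact checks are assumptions, and the bare-path handling assumes POSIX-style paths (a Windows drive letter would be misread as a scheme).

```python
from pathlib import Path
from urllib.parse import urlsplit

# Schemes the tool is documented above as accepting.
SUPPORTED_SCHEMES = {"http", "https", "file", "data"}


def to_supported_uri(target: str) -> str:
    """Return a URI the tool should accept, or raise ValueError."""
    scheme = urlsplit(target).scheme.lower()
    if scheme == "":
        # Bare path: make it absolute and add the file:// prefix.
        return Path(target).absolute().as_uri()
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported URI scheme: {scheme}")
    if scheme == "data":
        # The part between "data:" and the first comma should name a MIME type.
        header = target[len("data:"):].split(",", 1)[0]
        if "/" not in header:
            raise ValueError("data: URI is missing a MIME type such as text/html")
    return target


print(to_supported_uri("/tmp/markitdown_test.html"))
```

Timeouts are not addressed here, since the server exposes no timeout setting; a client would have to enforce its own deadline around the call.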