
PDF Reader MCP
Extract text, images, and metadata from local or URL PDFs as structured output, with file reads confined to allowed directories.
Add to your client
Copy the config for your MCP client and paste it into its config file.
npx -y @sylphx/pdf-reader-mcp
Paste into ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "pdf-reader-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "@sylphx/pdf-reader-mcp"
      ]
    }
  }
}
Before you start
- Node.js 22.13.0 or higher (required by pdfjs-dist v6)
- npx (bundled with Node.js) to run @sylphx/pdf-reader-mcp without a global install
- No credentials or API key needed; the server requires no authentication
- Optional: a directory to confine reads to (passed via --allow-dir), and network access if reading PDFs from URLs
About PDF Reader MCP
PDF Reader MCP is a Model Context Protocol server that lets AI agents extract text, images, metadata, and page counts from PDF files. It accepts local files (absolute or relative paths) as well as remote PDFs over HTTP/HTTPS, and returns structured JSON rather than raw blobs, so agents get clean, ordered content back.
The server exposes a single unified read_pdf tool driven by boolean flags (include_full_text, include_images, include_metadata, include_page_count) and a per-source pages selector that supports ranges like "1-5,10-15,20". It uses Y-coordinate-based layout reconstruction to preserve natural reading order and can process multiple PDFs concurrently for speed.
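To make the idea of Y-coordinate-based ordering concrete, here is a minimal Python sketch. It is not the server's implementation: the `TextItem` type, the `tolerance` parameter, and the grouping rule are all assumptions chosen for illustration. It groups text fragments whose Y positions are close into one line, then reads lines top to bottom and fragments left to right.

```python
from dataclasses import dataclass


@dataclass
class TextItem:
    """Hypothetical text fragment with a PDF-space position (origin at bottom-left)."""
    text: str
    x: float
    y: float


def reading_order(items: list[TextItem], tolerance: float = 2.0) -> list[str]:
    """Group fragments into lines by Y, then read top-to-bottom, left-to-right."""
    lines: list[list[TextItem]] = []
    # Larger Y is higher on the page, so sort descending to visit the top first.
    for item in sorted(items, key=lambda i: -i.y):
        if lines and abs(lines[-1][0].y - item.y) <= tolerance:
            lines[-1].append(item)  # close enough in Y: same visual line
        else:
            lines.append([item])  # otherwise start a new line
    # Within each line, order fragments by X so text reads left to right.
    return [" ".join(i.text for i in sorted(line, key=lambda i: i.x)) for line in lines]


items = [
    TextItem("world", 60, 700.5),
    TextItem("Second line", 10, 680),
    TextItem("Hello", 10, 700),
]
print(reading_order(items))
```

Fragments often arrive in drawing order rather than reading order, which is why a positional sort like this can matter for multi-column or heavily styled pages.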
It is built on pdfjs-dist and ships with directory-confinement and host-allowlist controls, making it safer to point at agent workspaces. It runs over stdio by default and can also be deployed as a remote HTTP server. Compatible with Claude Desktop, Claude Code, Cursor, Windsurf, Cline, VS Code, and Warp.
Tools & capabilities (1)
read_pdf: Unified tool to extract text, images, metadata, and page count from one or more PDF sources (local paths or URLs), with optional per-source page-range selection.
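As a rough sketch of what a `read_pdf` call might carry, the Python below assembles an arguments object. The `include_*` flag names and the per-source `url` and `pages` fields come from this page; the top-level `sources` key, the `path` key, and the default flag values are assumptions, so check the tool's published input schema before relying on them.

```python
import json


def build_read_pdf_args(sources, *, full_text=False, images=False,
                        metadata=True, page_count=True):
    """Assemble an arguments object for the read_pdf tool.

    The include_* names are taken from the tool description; the "sources"
    key and the default values are assumptions made for illustration.
    """
    return {
        "sources": sources,
        "include_full_text": full_text,
        "include_images": images,
        "include_metadata": metadata,
        "include_page_count": page_count,
    }


args = build_read_pdf_args(
    [
        # "path" is an assumed key name for local files.
        {"path": "reports/q3.pdf", "pages": "1-5,10-15,20"},
        {"url": "https://example.com/spec.pdf"},
    ],
    full_text=True,
)
print(json.dumps(args, indent=2))
```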
When to use it
- Use it when an agent needs to read and summarize a local PDF report, contract, or invoice as structured text.
- Use it when you want to pull specific page ranges out of a large PDF instead of the whole document.
- Use it when you need to fetch and parse a PDF directly from a URL without downloading it yourself first.
- Use it when you need PDF metadata (title, author, page count) for cataloging or routing logic.
- Use it when you want to extract embedded images (as Base64 with dimensions) from a PDF for downstream processing.
- Use it when you need to confine PDF access to a specific working directory for safety in an agent loop.
Quick setup
1. Ensure Node.js 22.13.0+ is installed.
2. Add the server to your MCP client config with command `npx` and args `["-y", "@sylphx/pdf-reader-mcp"]`.
3. Optionally pass `--allow-dir=/path/to/pdfs` (repeatable) to confine filesystem reads, and `--allow-host=domain` or `--no-http` to control URL access.
4. Restart your MCP client (e.g. Claude Desktop) so it picks up the new server.
5. Verify by asking the agent to read a known local PDF and confirm it returns text or metadata.
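The optional flags from step 3 go in the same `args` array as the package name. The following Python sketch generates such a config entry. The helper name `build_server_entry` is hypothetical, and it assumes the flags are appended after the package name so that npx forwards them to the server.

```python
import json


def build_server_entry(allow_dirs=(), allow_host=None, no_http=False):
    """Build one mcpServers entry, appending the optional flags from step 3."""
    args = ["-y", "@sylphx/pdf-reader-mcp"]
    # --allow-dir is repeatable, so emit one flag per directory.
    args += [f"--allow-dir={d}" for d in allow_dirs]
    if no_http:
        args.append("--no-http")  # disable URL sources entirely
    elif allow_host:
        args.append(f"--allow-host={allow_host}")  # restrict URL sources to one domain
    return {"command": "npx", "args": args}


config = {
    "mcpServers": {
        "pdf-reader-mcp": build_server_entry(
            allow_dirs=["/path/to/pdfs"], allow_host="example.com"
        )
    }
}
print(json.dumps(config, indent=2))
```

The printed JSON can be pasted into the client config file in place of the minimal entry.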
Security notes
File access is confined to the working directory the host sets, so run it with the cwd scoped to the intended project folder. Parsing untrusted PDFs always carries some parser-exploitation risk; only process documents you trust. Note: use the current @sylphx/ package, not the older @sylphlab/ name.
PDF Reader MCP FAQ
Does it need an API key or authentication?
No. It runs locally over stdio with no auth, reading PDFs from the filesystem or from HTTP/HTTPS URLs you allow.
Can it read PDFs from a URL, not just local files?
Yes. Each source can specify a `url` field for HTTP/HTTPS PDFs. You can disable this with `--no-http` or restrict it to specific domains with `--allow-host`.
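For intuition, a host allowlist check could look like the Python sketch below. This illustrates the general technique, not the server's code: the function name, the exact-match rule, and the treatment of a missing allowlist as "allow any host" are assumptions.

```python
from urllib.parse import urlparse


def url_allowed(url, allowed_hosts=None, http_enabled=True):
    """Return True if a PDF URL passes the (assumed) HTTP and host checks."""
    if not http_enabled:
        return False  # analogous to running with --no-http
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False  # only HTTP/HTTPS sources are considered
    if allowed_hosts is None:
        return True  # no allowlist configured: assume any host is accepted
    # urlparse lowercases the hostname; compare by exact match.
    return parsed.hostname in allowed_hosts


print(url_allowed("https://example.com/a.pdf", {"example.com"}))
print(url_allowed("https://other.test/a.pdf", {"example.com"}))
```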
How do I restrict which directories it can read?
Pass `--allow-dir=/path` (repeatable) or set the `MCP_PDF_ALLOWED_DIRS` environment variable. Reads outside allowed directories fail fast with an Access denied error.
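Directory confinement of this kind is usually done by resolving the requested path and checking that it still sits under an allowed root. Here is a minimal Python sketch of that technique, assuming Python 3.9+ for `Path.is_relative_to`. The function name and error text are illustrative, not the server's actual code.

```python
from pathlib import Path


def check_allowed(path, allowed_dirs):
    """Resolve a path and fail fast unless it lies inside an allowed directory."""
    # resolve() collapses ".." segments and follows symlinks before the check.
    resolved = Path(path).resolve()
    for allowed in allowed_dirs:
        if resolved.is_relative_to(Path(allowed).resolve()):
            return resolved
    raise PermissionError(f"Access denied: {resolved}")


print(check_allowed("/srv/pdfs/a/../b.pdf", ["/srv/pdfs"]))
```

Resolving before comparing is the important design choice: a plain string-prefix test would let a path such as `/srv/pdfs/../secret.pdf` slip through.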
Can I extract only certain pages?
Yes. Each source accepts a `pages` parameter using ranges like "1-5,10-15,20" or an explicit array like [1,2,3].
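To show how a selector such as "1-5,10-15,20" expands, here is a small Python sketch. It assumes page numbers are 1-based and ranges are inclusive; the function name `expand_pages` is hypothetical, and the server's own parsing may differ in edge cases.

```python
def expand_pages(spec):
    """Expand "1-5,10-15,20" or [1, 2, 3] into a sorted list of unique pages."""
    if isinstance(spec, list):
        pages = set(spec)
    else:
        pages = set()
        for part in spec.split(","):
            part = part.strip()
            if "-" in part:
                start, end = (int(n) for n in part.split("-", 1))
                if start > end:
                    raise ValueError(f"descending range: {part}")
                pages.update(range(start, end + 1))  # ranges are inclusive
            else:
                pages.add(int(part))
    if any(p < 1 for p in pages):
        raise ValueError("page numbers are assumed to be 1-based")
    return sorted(pages)


print(expand_pages("1-5,10-15,20"))
# prints [1, 2, 3, 4, 5, 10, 11, 12, 13, 14, 15, 20]
```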
Why is it returning an error about Node version?
pdfjs-dist v6 requires Node.js 22.13.0 or higher. Upgrade Node if you see engine or module errors on startup.
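If you want to check the requirement from a script, the comparison is a simple tuple test. This Python sketch takes the string that `node --version` prints (for example `v22.13.0`). The helper name is hypothetical, and it assumes a plain `major.minor.patch` version without a pre-release suffix.

```python
def node_meets_minimum(version, minimum=(22, 13, 0)):
    """Compare a `node --version` string such as "v22.13.0" against a minimum."""
    # Strip the leading "v", then compare numerically, not as strings.
    parts = tuple(int(p) for p in version.strip().lstrip("v").split("."))
    return parts >= minimum


for v in ("v20.11.1", "v22.12.0", "v22.13.0"):
    print(v, node_meets_minimum(v))
```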
Alternatives to PDF Reader MCP
Official MCP reference server for secure local filesystem read/write within allowed directories.
Official MCP server for reading, searching, and manipulating a local Git repository's files and history.
Official AWS Labs MCP server to manage and query S3 Tables (table buckets, namespaces, tables).