# PDFParser — Full LLM Context

PDFParser is a PDF parsing service for developers and teams. It extracts text, extracts images, converts PDF documents to Markdown (PDF to MD), and returns clean structured data as JSON.

## What PDFParser does

PDFParser turns any PDF into structured data. It is built around four capabilities:

1. **Text extraction** — Plain text from every page, in reading order, ready for indexing or RAG (retrieval-augmented generation) pipelines.
2. **Image extraction** — Every image embedded in a PDF, returned as base64-encoded PNG. Useful for "image from PDF" workflows.
3. **PDF to MD (Markdown)** — Convert PDF to Markdown with headings, lists, and inline image references preserved.
4. **Data structuring** — Output is a JSON array, one element per page, with raw text, text with image markers, and images.

## API contract

`POST /v1/parse` (mocked in the current frontend) accepts a PDF file and returns the following JSON shape:

```json
[
  {
    "pageNumber": 0,
    "rawText": "string",
    "textWithImage": "string with [image:1] markers",
    "images": [
      { "imageId": 1, "imageBase64": "base64-png-data" }
    ]
  }
]
```

## How it works (3 steps)

1. **Upload PDF** — Drop a PDF file or use the API.
2. **We parse it** — The engine extracts text, images, and structured fields.
3. **Get structured data** — Receive clean JSON ready to use in your app.

## Result views in the web app

After parsing, three view modes are available:

- `raw_text` — Plain extracted text.
- `text_with_image` — Text with inline image references.
- `json` — Full JSON payload, formatted for copy/paste.

## Plans

- **Free** — $0 / month. 100 pages / month. Raw text, text + image extraction, and JSON output. Available now.
- **Pro** — $29 / month. Up to 10,000 pages / month, tables & forms detection, higher API limits, priority support. Coming soon.
- **Enterprise** — Custom pricing. Unlimited pages, SLA, dedicated support, SSO, audit logs, on-premise deployment. Coming soon.

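
## Example: consuming the response (sketch)

As an illustration of the JSON shape in the API contract, the following Python sketch shows one way a client might decode the `images` array and turn `[image:N]` markers in `textWithImage` into Markdown image references. It is not an official client. It works on a hand-written sample payload instead of calling the endpoint, since `POST /v1/parse` is currently mocked. The helper names (`decode_images`, `page_to_markdown`), the `images/page{P}-img{N}.png` file naming, and the assumption that `N` in a marker equals an `imageId` on the same page are all hypothetical.

```python
import base64
import json
import re

# Sample payload following the documented response shape. The values are
# invented for illustration; "aGVsbG8=" is base64 for b"hello", not a real PNG.
SAMPLE_RESPONSE = json.dumps([
    {
        "pageNumber": 0,
        "rawText": "Quarterly report",
        "textWithImage": "Quarterly report [image:1]",
        "images": [{"imageId": 1, "imageBase64": "aGVsbG8="}],
    }
])

# Matches markers such as [image:1] and captures the number.
MARKER = re.compile(r"\[image:(\d+)\]")


def decode_images(page: dict) -> dict[int, bytes]:
    """Map each imageId on a page to its base64-decoded bytes."""
    return {
        img["imageId"]: base64.b64decode(img["imageBase64"])
        for img in page.get("images", [])
    }


def page_to_markdown(page: dict, image_dir: str = "images") -> str:
    """Replace [image:N] markers with Markdown image references.

    Assumption: N in a marker matches an imageId on the same page.
    Markers with no matching image are left untouched.
    """
    known_ids = {img["imageId"] for img in page.get("images", [])}
    page_no = page["pageNumber"]

    def replace_marker(match: re.Match) -> str:
        image_id = int(match.group(1))
        if image_id not in known_ids:
            return match.group(0)
        # Hypothetical file naming scheme, chosen only for this example.
        return f"![image {image_id}]({image_dir}/page{page_no}-img{image_id}.png)"

    return MARKER.sub(replace_marker, page["textWithImage"])


pages = json.loads(SAMPLE_RESPONSE)
for page in pages:
    print(page_to_markdown(page))
    print({image_id: len(data) for image_id, data in decode_images(page).items()})
```

To persist the images, a caller could write each value returned by `decode_images` to the path used in the generated Markdown reference, so the links resolve when the Markdown is rendered.
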
## Common questions

**Can PDFParser extract images from a PDF?**
Yes. Every image on every page is returned as a base64 PNG in the `images` array of the response.

**Can PDFParser convert PDF to Markdown?**
Yes. PDF to MD conversion preserves headings, lists, and inline image references.

**Does PDFParser return structured data?**
Yes. Output is a JSON array with one entry per page. The shape is documented above.

**Is there a free tier?**
Yes. The Free plan allows 100 pages per month at no cost.

**What file formats are supported?**
PDF only at this time.

## Keywords

PDF parsing, PDF parser, PDF to MD, PDF to Markdown, text extract, text extraction, image extract, image extraction, image from PDF, images from PDF, data structuring, structured data, PDF API, parse PDF online, extract data from PDF, RAG, document AI.