# PDFParser

> PDFParser is a PDF parsing service that turns PDF documents into structured
> data. It performs text extraction, image extraction (image from PDF),
> PDF-to-Markdown (PDF to MD) conversion, and data structuring, exposed as a
> simple HTTP API and a web app.

PDFParser accepts a PDF upload and returns a JSON array. Each element represents a page with the following fields:

- `pageNumber`: zero-based index of the page.
- `rawText`: plain text extracted from the page.
- `textWithImage`: text with inline image references (e.g. `[image:1]`).
- `images`: array of objects `{ imageId, imageBase64 }` for every image on the page.

## Core capabilities

- **Text extraction**: clean, ordered text from any PDF.
- **Image extraction**: every image on the page returned as base64 PNG.
- **PDF to MD**: convert a PDF to Markdown with headings, lists, and inline image references preserved.
- **Data structuring**: turn unstructured PDFs into structured JSON ready for indexing, search, RAG pipelines, or analytics.

## Pricing

- **Free**: 100 pages / month. Available now.
- **Pro**: Coming soon.
- **Enterprise**: Coming soon.

## Pages

- [Home](/): Upload a PDF and parse it in the browser.
- [Pricing](/pricing): Plans and limits.
- [API](/api): Programmatic access (coming soon).
- [About](/about): Background on PDFParser (coming soon).
- [Sign in](/login): Existing user login.
- [Sign up](/register): Create a free account.

## Resources

- [Full LLM context](/llms-full.txt)
- [Sitemap](/sitemap.xml)
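
## Example: reading the response

The per-page JSON described above (`pageNumber`, `rawText`, `textWithImage`, `images`) could be consumed roughly as follows. This is a minimal sketch, not PDFParser's own client code: the sample payload is made up, the helper names `decode_images` and `to_markdown` are hypothetical, and it assumes that the `N` in an inline `[image:N]` reference matches an `imageId` on the same page, which the description implies but does not state. The output file naming is likewise only an illustration.

```python
import base64
import json
import re

# Matches inline references such as "[image:1]" inside textWithImage.
IMAGE_REF = re.compile(r"\[image:([^\]]+)\]")


def decode_images(page):
    """Map each imageId (as a string) to its decoded image bytes."""
    return {
        str(img["imageId"]): base64.b64decode(img["imageBase64"])
        for img in page.get("images", [])
    }


def to_markdown(page, image_dir="images"):
    """Replace [image:N] references with Markdown image links.

    Assumption: N corresponds to an imageId on the same page.
    The file naming scheme is hypothetical.
    """
    def link(match):
        image_id = match.group(1)
        path = f"{image_dir}/page{page['pageNumber']}-{image_id}.png"
        return f"![image {image_id}]({path})"

    return IMAGE_REF.sub(link, page["textWithImage"])


# Made-up response body in the documented shape (one page, one image).
response_body = json.dumps([
    {
        "pageNumber": 0,
        "rawText": "Intro end",
        "textWithImage": "Intro [image:1] end",
        "images": [
            {
                "imageId": 1,
                "imageBase64": base64.b64encode(b"fake-png-bytes").decode("ascii"),
            }
        ],
    }
])

pages = json.loads(response_body)
for page in pages:
    images = decode_images(page)   # bytes ready to be written to disk
    markdown = to_markdown(page)   # text with image links in place
    print(page["pageNumber"], len(images), markdown)
```

Keeping the decoding and the reference rewriting in separate functions means either step can be used alone, for example indexing `rawText` for search while skipping images entirely.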