PDF Extract Text

Pulls the written text out of PDF files into plain .txt. Works in batch, with the option of one .txt per PDF or all merged into a single file.

What it does

Use this when you need raw text out of reports, book PDFs, or contracts, so you can paste it into Word, search across the files, or feed them to NLP/analysis tools. For scanned (image-only) PDFs, use the OCR tool instead.

How to use

  1. Drag PDFs into the list.
  2. Pick an Output Mode: one .txt per PDF, or all merged into a single results.txt.
  3. Optionally type a Page Range (e.g. 1-5, 10). Leave blank to extract all pages.
  4. Click Run.

Options

  • Output Mode: "Separate file per PDF" is the default and most common. "Merge into one file" writes the text of every PDF into a single results.txt.
  • Page Range: Examples: 1-5 or 1-5, 10, 15-20. Blank means every page.
  • Preserve Physical Layout: Leave on to keep columns and alignment. Turn off for flat linear text, which is better for NLP/search.
  • Add Page Separators: Leave on to get --- Page 2 --- markers in the output. In merge mode, this also tells you which text came from which file.
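
The Page Range syntax above can be illustrated with a small Python sketch. This is not the tool's own parser; the function name parse_page_range, the total_pages parameter, and the choice to silently clip pages past the end of the document are assumptions made for illustration.

```python
def parse_page_range(spec, total_pages):
    """Expand a range string such as "1-5, 10" into sorted 1-based page numbers.

    A blank string means every page, matching the Page Range option.
    Clipping to total_pages is an assumption, not documented tool behaviour.
    """
    spec = spec.strip()
    if not spec:
        return list(range(1, total_pages + 1))

    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing or doubled comma
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        # Pages beyond the end of the document are dropped.
        pages.update(range(start, min(end, total_pages) + 1))
    return sorted(pages)


if __name__ == "__main__":
    print(parse_page_range("1-5, 10", 20))
    print(parse_page_range("", 3))
```

Because the same range is applied to every file, a sketch like this would be evaluated once per PDF against that PDF's own page count.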

Examples

Search across 12 monthly reports: Add all reports, run with "Separate file" mode. You get 12 .txt files.

Pull one chapter out of a book: Add book.pdf, type 45-120 in Page Range. Only that chapter is extracted.

Analyse 50 contracts as one corpus: Add all of them, "Merge" mode, "Preserve layout" off, "Separators" on. You get one results.txt.

Find which PDFs are scanned: Add 20 PDFs and run. Any .txt that comes out empty corresponds to a scanned PDF. Run those through OCR.
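
With many output files, the empty-file check in the last example could be scripted rather than done by hand. The Python sketch below is one possible approach, run outside the tool. The function name find_empty_outputs is hypothetical, and the exact separator format is assumed from the "--- Page 2 ---" example under Options; a file that holds nothing but such markers is treated as empty.

```python
import re
from pathlib import Path

# Assumed marker format, based on the "--- Page 2 ---" example.
PAGE_MARKER = re.compile(r"^--- Page \d+ ---$")


def find_empty_outputs(output_dir):
    """Return names of .txt files that hold no text besides page markers."""
    empty = []
    for path in sorted(Path(output_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        lines = [line.strip() for line in text.splitlines()]
        # A file counts as empty when every line is blank or a page marker.
        if not any(line and not PAGE_MARKER.match(line) for line in lines):
            empty.append(path.name)
    return empty


if __name__ == "__main__":
    for name in find_empty_outputs("."):
        print(name)
```

Each name printed points to a PDF that is probably image-only and is a candidate for the OCR tool.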

Watch out

  • Scanned (image-only) PDFs produce empty output. Use OCR for those.
  • Encrypted PDFs cannot be extracted. Unlock them with PDF Encrypt first.
  • The page range applies to every file. You cannot set per-file ranges.
  • Complex tables and footnotes may not retain their layout perfectly. Convert to Word for those.
  • In merge mode the output filename is fixed (results.txt).

License

Free tier has a monthly extraction cap. Office plan removes it.