
Extracting text from a .PDF scanned book

Mar 15, 2026 · 5146 views

Problem

I have a scanned book in PDF format, but the quality is rather poor. (The language is Romanian and it's a medical physiology book, in case you were wondering.) I want to extract the text from the book (1500 pages) but keep the images the way they are. I really don't think I have any chance to find a s…



Fix for: Extracting text from a .PDF scanned book


I have earlier posted an answer detailing how to use Cuneiform (open source software) to do OCR on PDF files and how to create a PDF file with the recognized text in a hidden text layer "behind" the original image. As far as I know, Cuneiform actual…
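
The answer text is cut off here, so the exact steps it recommends are not visible. As a rough sketch of the general idea (OCR each page, then put the recognized text in a hidden layer behind the original scan), one page could be processed as below. Several things are assumptions and not taken from the answer: the use of `pdftoppm` (Poppler) to render pages and `hocr2pdf` (ExactImage) to merge image and text, a Cuneiform build that can read PNG input, the language code `rum` for Romanian, and the helper names `page_commands` and `ocr_page`.

```python
import subprocess
from pathlib import Path


def page_commands(pdf, workdir, page, lang="rum", dpi=300):
    """Build the command lines for one page; nothing is executed here.

    Returns (render, ocr, merge, hocr_path). All tool choices are
    assumptions for illustration, not the answer's confirmed method.
    """
    stem = Path(workdir) / f"page-{page:04d}"
    image = f"{stem}.png"
    hocr = f"{stem}.hocr"
    out_pdf = f"{stem}.pdf"

    # Render exactly one page of the scan to PNG (pdftoppm appends ".png").
    render = ["pdftoppm", "-f", str(page), "-l", str(page), "-r", str(dpi),
              "-png", "-singlefile", str(pdf), str(stem)]
    # Recognize the page and write hOCR, which keeps word positions.
    ocr = ["cuneiform", "-l", lang, "-f", "hocr", "-o", hocr, image]
    # hocr2pdf reads the hOCR from stdin and places the text behind the image.
    merge = ["hocr2pdf", "-i", image, "-o", out_pdf]
    return render, ocr, merge, hocr


def ocr_page(pdf, workdir, page, lang="rum"):
    """Run the three steps for one page; requires the tools to be installed."""
    render, ocr, merge, hocr = page_commands(pdf, workdir, page, lang)
    Path(workdir).mkdir(parents=True, exist_ok=True)
    subprocess.run(render, check=True)
    subprocess.run(ocr, check=True)
    with open(hocr, "rb") as hocr_file:
        subprocess.run(merge, stdin=hocr_file, check=True)
```

For a 1500-page book, `ocr_page` would be called once per page and the resulting single-page PDFs joined afterwards with a PDF tool of your choice. Processing page by page keeps memory use low and lets a failed page be retried on its own.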
