Py-ocrmypdf

Jul 20, 2023

Adds an OCR text layer to scanned PDF files

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted.

Main features

Generates a searchable PDF/A file from a regular PDF
Places OCR text accurately below the image to ease copy/paste
Keeps the exact resolution of the original embedded images
When possible, inserts OCR information as a “lossless” operation without disrupting any other content
Optimizes PDF images, often producing files smaller than the input file
If requested, deskews and/or cleans the image before performing OCR
Validates input and output files
Distributes work across all available CPU cores
Uses the Tesseract OCR engine to recognize more than 100 languages
Scales properly to handle files with thousands of pages
Battle-tested on millions of PDFs
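
As a rough illustration of how the features above map onto the `ocrmypdf` command line, the sketch below assembles an invocation from Python. The helper names `build_ocrmypdf_command` and `run_ocr` and the file names are hypothetical and not part of OCRmyPDF; the flags used (`--output-type`, `--language`, `--deskew`, `--clean`, `--jobs`) are existing OCRmyPDF options, though which ones are appropriate depends on your documents and installed tools (for example, `--clean` relies on the external `unpaper` program).

```python
import shutil
import subprocess
from typing import List, Optional


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    languages: Optional[List[str]] = None,
    deskew: bool = False,
    clean: bool = False,
    jobs: Optional[int] = None,
) -> List[str]:
    """Assemble an ocrmypdf command line (hypothetical helper, for illustration)."""
    # Ask for PDF/A output explicitly, matching the "searchable PDF/A" feature.
    cmd = ["ocrmypdf", "--output-type", "pdfa"]
    if languages:
        # Tesseract language codes are joined with '+', e.g. "eng+fra".
        cmd += ["--language", "+".join(languages)]
    if deskew:
        cmd.append("--deskew")  # straighten skewed pages before OCR
    if clean:
        cmd.append("--clean")  # clean pages before OCR
    if jobs is not None:
        cmd += ["--jobs", str(jobs)]  # limit the number of parallel workers
    cmd += [input_pdf, output_pdf]
    return cmd


def run_ocr(cmd: List[str]) -> int:
    """Run the assembled command and return its exit code."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("ocrmypdf was not found on PATH")
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Placeholder file names; replace them with real paths before calling run_ocr().
    command = build_ocrmypdf_command(
        "scan.pdf", "scan_ocr.pdf", languages=["eng"], deskew=True
    )
    print(" ".join(command))
```

Building the argument list separately from running it makes the chosen options easy to inspect or log before any PDF is processed.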

Check out these related ports:

Zxing-cpp - ZXing C++ Library for QR code recognition
Zu-hunspell - Zulu hunspell dictionaries
Zu-aspell - Aspell Zulu dictionary
Zq - Easier and faster alternative to jq
Zorba - General purpose C++ XQuery processor
Zenxml - Simple C++ XML Processing
Zed - Command-line tool to manage and query Zed data lakes
Yq - Command-line YAML and XML processor, jq wrapper for YAML/XML documents
Yould - Pronounceable word generator
Yodl - Easy to use but powerful document formatting/preparation language
Yi-hunspell - Yiddish hunspell dictionaries
Yi-aspell - Aspell Yiddish dictionary
Yelp-xsl - DocBook XSLT stylesheets for yelp
Yelp-tools - Utilities to help manage documentation for Yelp and the web
Ydiff - Diff readability enhancer for color terminals
