Package: ocrmypdf (8.0.1+dfsg-1)

Links for ocrmypdf

Maintainer:

Sean Whitton (QA Page)

External Resources:

Homepage [github.com]

add an OCR text layer to PDF files

OCRmyPDF takes a regular PDF containing only images and generates a PDF/A file with an OCR text layer, making the document searchable.

It uses the Tesseract OCR engine and so supports all the languages that Tesseract does.

Some other main features:

  * Places OCR text accurately below the image to ease copy / paste
  * Keeps the exact resolution of the original embedded images
  * When possible, inserts OCR information as a lossless operation
    without rendering vector information
  * Keeps file size about the same
  * If requested, deskews and/or cleans the image before performing OCR
  * Validates input and output files
  * Provides debug mode to enable easy verification of the OCR results
  * Processes pages in parallel when more than one CPU core is
    available
  * Battle-tested on thousands of PDFs, with a test suite and continuous
    integration
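
The optional features listed above (language selection, deskewing, cleaning, parallel processing) are exposed as options of the `ocrmypdf` command. The following is a minimal sketch of driving that command from Python, not part of the package itself: `build_command` and `run_ocr` are hypothetical helper names, and the sketch assumes `ocrmypdf` is on `PATH` and that any requested Tesseract language packs are installed.

```python
import subprocess
from typing import List, Optional, Sequence


def build_command(
    input_pdf: str,
    output_pdf: str,
    languages: Sequence[str] = ("eng",),
    deskew: bool = False,
    clean: bool = False,
    jobs: Optional[int] = None,
) -> List[str]:
    """Assemble an ocrmypdf argument list (hypothetical helper)."""
    # Several Tesseract languages are joined with "+", e.g. "eng+deu".
    cmd = ["ocrmypdf", "--language", "+".join(languages)]
    if deskew:
        cmd.append("--deskew")  # straighten skewed pages before OCR
    if clean:
        cmd.append("--clean")  # clean pages before OCR; relies on unpaper
    if jobs is not None:
        cmd += ["--jobs", str(jobs)]  # upper bound on parallel workers
    cmd += [input_pdf, output_pdf]
    return cmd


def run_ocr(input_pdf: str, output_pdf: str, **options) -> None:
    """Run ocrmypdf and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_command(input_pdf, output_pdf, **options), check=True)
```

For example, `run_ocr("scan.pdf", "scan_ocr.pdf", languages=["eng", "deu"], deskew=True, jobs=4)` would invoke `ocrmypdf --language eng+deu --deskew --jobs 4 scan.pdf scan_ocr.pdf`. Keeping command construction separate from execution makes the argument list easy to inspect without running OCR.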

Other Packages Related to ocrmypdf

depends

recommends

suggests

enhances

dep: ghostscript (>= 9.18~dfsg~)

interpreter for the PostScript language and for PDF
dep: icc-profiles-free

ICC color profiles for use with color profile aware software
dep: liblept5

image processing library
dep: python3

interactive high-level object-oriented language (default python3 version)
dep: python3-cffi-backend-api-max (>= 9729)

Package not available
dep: python3-cffi-backend-api-min (<= 9729)

Package not available
dep: python3-chardet

universal character encoding detector for Python3
dep: python3-img2pdf (>= 0.3.0)

Lossless conversion of raster images to PDF (library)
dep: python3-pdfminer (>= 20181108+dfsg-3)

PDF parser and analyser (Python3)
dep: python3-pikepdf

Python library to read and write PDFs with QPDF
dep: python3-pil

Python Imaging Library (Python3)
dep: python3-pkg-resources

Package Discovery and Resource Access using pkg_resources
dep: python3-reportlab

ReportLab library to create PDF documents using Python3
dep: python3-ruffus (>= 2.8)

Python3 computation pipeline library widely used in bioinformatics
dep: qpdf (>= 8.0.2)

tools for transforming and inspecting PDF files
dep: tesseract-ocr (>= 4.0.0)

Tesseract command line OCR tool
dep: zlib1g

compression library - runtime

rec: pngquant

PNG (Portable Network Graphics) image optimising utility
rec: unpaper

post-processing tool for scanned pages

sug: img2pdf

Lossless conversion of raster images to PDF
sug: ocrmypdf-doc

add an OCR text layer to PDF files - documentation
sug: python-watchdog

Python API and shell utilities to monitor file system events - Python 2.x
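
Several of the packages listed above provide external programs rather than Python modules. As a rough sketch, the check below looks for their executables on `PATH`. The executable names (`gs` for ghostscript, `qpdf`, `tesseract`, `pngquant`, `unpaper`) and the helper names `find_tools` and `missing_tools` are assumptions made for illustration, and the check does not verify the minimum versions given in the dependency list.

```python
import shutil
from typing import Dict, Iterable, List, Optional

# Executables assumed to be provided by the "dep:" packages above.
REQUIRED_TOOLS = ("gs", "qpdf", "tesseract")
# Executables assumed to be provided by the "rec:" packages above.
RECOMMENDED_TOOLS = ("pngquant", "unpaper")


def find_tools(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the executable names that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]
```

Calling `missing_tools(REQUIRED_TOOLS)` returns an empty list when all three programs are found. A non-empty result from `missing_tools(RECOMMENDED_TOOLS)` only means that optional steps such as page cleaning may be unavailable.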

Download ocrmypdf

Download for all available architectures
Architecture	Package Size	Installed Size	Files
all	109.5 kB	431.0 kB	[list of files]