PDF Tools

Find Text
Merge
Compare
Save Web Page as PDF
Change/View PDF Info
Convert Scanned PDF to Searchable PDF (OCR)
Reduce Size of PDF File
Editors

Find Text

Search PDF files recursively for a text string (for example, "Important") and show the page number of each match:

$ pdfgrep -n -r --include "*.pdf" "Important"
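For repeated use, the search above can be wrapped in a small shell function. This is only a sketch, assuming pdfgrep is installed; the function name `pdf_find` and its argument order are illustrative and not part of pdfgrep. It adds `-i` (ignore case) to the options already shown.

```shell
# pdf_find PATTERN [DIR]
# Hypothetical helper: search all PDF files under DIR (default: current
# directory) for PATTERN, ignoring case, and print the page number of
# each match.
pdf_find() {
    if [ "$#" -lt 1 ]; then
        echo "usage: pdf_find PATTERN [DIR]" >&2
        return 2
    fi
    # -n: show page number, -r: recurse into directories, -i: ignore case
    pdfgrep -n -r -i --include "*.pdf" "$1" "${2:-.}"
}
```

For example, `pdf_find important ~/Documents` would search every PDF below ~/Documents.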

Merge

Merge three pdf files together:

$ pdftk file1.pdf file2.pdf file3.pdf cat output newfile.pdf

Combine selected pages of two pdf files into a new document:

$ pdftk A=file1.pdf B=file2.pdf cat A1-3 B1-5 A4 output newfile.pdf
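When a whole directory of files has to be merged, listing each file by hand becomes tedious. The sketch below, assuming pdftk is installed, passes every *.pdf file in a directory to the same `cat output` operation; the function name `merge_dir` is hypothetical.

```shell
# merge_dir DIR OUTPUT
# Hypothetical helper: concatenate every *.pdf file in DIR, in the
# shell's glob (name) order, into OUTPUT.
# The body runs in a subshell, so its variables stay local and
# "exit" only leaves the function.
merge_dir() (
    if [ "$#" -ne 2 ]; then
        echo "usage: merge_dir DIR OUTPUT" >&2
        exit 2
    fi
    dir=$1
    out=$2
    # Expand the glob into the positional parameters; if nothing
    # matches, the pattern stays literal and the -e test fails.
    set -- "$dir"/*.pdf
    if [ ! -e "$1" ]; then
        echo "merge_dir: no PDF files in $dir" >&2
        exit 1
    fi
    pdftk "$@" cat output "$out"
)
```

For example, `merge_dir scans merged.pdf`. The output file should be written outside DIR, otherwise a later run would pick it up as an input.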

Compare

diffpdf compares two PDF files textually or visually.

$ diffpdf file1.pdf file2.pdf

Save Web Page as PDF

wkhtmltopdf saves a web page as a PDF file, preserving formatting and hyperlinks.

Some features (headers, margins, etc.) require a patched Qt. Most Linux distributions package wkhtmltopdf without those features [FAQ]. The developers provide binaries with all features for major Linux distributions, Windows and macOS.

An example that renders a page on A3 paper with 25 mm left and right margins and a default header:

$ wkhtmltopdf -s A3 -L 25mm -R 25mm --default-header --header-font-size 10 --header-spacing 5 http://example.com/ example.pdf

For comparison, see Firefox bug #454059 ("Creating PDF of web page: hyperlinks are lost"), opened on 2008-09-07.

Change/View PDF Info

exiftool can list and edit meta information. To view the tags:

$ exiftool file.pdf

To change the title of file.pdf:

$ exiftool -overwrite_original -Title="Title of PDF Document" file.pdf

The writable tags are Author, Creator, Keywords ("keyword1;keyword2"), Producer, Subject and Title [PDF Tags - exiftool.org].
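Several of these tags can be written in one exiftool call by repeating the `-Tag=value` form. The sketch below assumes exiftool is installed; the function name `set_pdf_info` and its argument order are hypothetical.

```shell
# set_pdf_info FILE TITLE AUTHOR [KEYWORDS]
# Hypothetical helper: write Title and Author (and optionally Keywords,
# given as "keyword1;keyword2") to FILE in a single exiftool call.
set_pdf_info() {
    if [ "$#" -lt 3 ]; then
        echo "usage: set_pdf_info FILE TITLE AUTHOR [KEYWORDS]" >&2
        return 2
    fi
    if [ -n "${4:-}" ]; then
        exiftool -overwrite_original \
            -Title="$2" -Author="$3" -Keywords="$4" "$1"
    else
        exiftool -overwrite_original -Title="$2" -Author="$3" "$1"
    fi
}
```

For example, `set_pdf_info file.pdf "Annual Report" "Jane Doe" "finance;2023"`.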

Convert Scanned PDF to Searchable PDF (OCR)

Convert each page of the scanned PDF to a PNG image with ImageMagick:

$ convert -density 300 scanned-pdf.pdf converted-png.png

If the PDF document has 2 pages, 2 PNG files are created: converted-png-0.png and converted-png-1.png.

Run OCR on each PNG file with tesseract, setting the proper language ("-l deu" for German text):

$ tesseract converted-png-0.png new-pdf-page1 -l deu pdf
$ tesseract converted-png-1.png new-pdf-page2 -l deu pdf

Merge the pages:

$ pdftk new-pdf-page1.pdf new-pdf-page2.pdf cat output new-pdf.pdf
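For documents with more than a few pages, the three steps above can be chained in a loop. The following is a sketch under stated assumptions, not a definitive implementation: it assumes ImageMagick's convert, tesseract and pdftk are installed, and the function name `ocr_pdf`, the temporary file names and the default language `eng` are illustrative choices. It uses a zero-padded page index (`%03d`) in the PNG names so that the pages sort correctly when there are ten or more.

```shell
# ocr_pdf INPUT.pdf OUTPUT.pdf [LANG]
# Hypothetical helper: render each page to PNG, OCR each page into a
# one-page searchable PDF, then merge the pages in order.
# The body runs in a subshell, so its variables stay local.
ocr_pdf() (
    if [ "$#" -lt 2 ]; then
        echo "usage: ocr_pdf INPUT.pdf OUTPUT.pdf [LANG]" >&2
        exit 2
    fi
    src=$1
    out=$2
    lang=${3:-eng}
    tmp=$(mktemp -d) || exit 1
    status=0

    # %03d zero-pads the page index so the globs below list the pages
    # in numeric order (page-002 before page-010).
    convert -density 300 "$src" "$tmp/page-%03d.png" || status=1

    if [ "$status" -eq 0 ]; then
        for png in "$tmp"/page-*.png; do
            # tesseract appends ".pdf" to the output base name itself.
            tesseract "$png" "${png%.png}" -l "$lang" pdf || { status=1; break; }
        done
    fi

    if [ "$status" -eq 0 ]; then
        pdftk "$tmp"/page-*.pdf cat output "$out" || status=1
    fi

    rm -rf "$tmp"    # remove the intermediate PNG and PDF pages
    exit "$status"
)
```

For example, `ocr_pdf scanned-pdf.pdf new-pdf.pdf deu` would process a German document.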

Reduce Size of PDF File

The size of a PDF file can be reduced with Ghostscript by downsampling its images:

$ gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dColorImageDownsampleType=/Bicubic -dColorImageResolution=300 -dPDFSETTINGS=/screen -sOutputFile=fileout.pdf filein.pdf

-dPDFSETTINGS options [Milan Kupcevic]:

/screen: screen-view-only quality, 72 dpi images;
/ebook: low quality, 150 dpi images;
/printer: high quality, 300 dpi images;
/prepress: high quality, color preserving, 300 dpi images.
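To try the presets listed above against the same file, the Ghostscript call can be wrapped in a function that takes the preset as an argument. This is a minimal sketch, assuming gs is installed; the function name `shrink_pdf` and the default preset `ebook` are illustrative choices. It leaves out the explicit image resolution options so that the chosen preset alone determines the downsampling, and it adds `-dQUIET` to suppress routine messages.

```shell
# shrink_pdf INPUT.pdf OUTPUT.pdf [PRESET]
# Hypothetical helper: rewrite INPUT.pdf with one of the -dPDFSETTINGS
# presets listed above (screen, ebook, printer, prepress).
shrink_pdf() {
    if [ "$#" -lt 2 ]; then
        echo "usage: shrink_pdf INPUT.pdf OUTPUT.pdf [PRESET]" >&2
        return 2
    fi
    case "${3:-ebook}" in
        screen|ebook|printer|prepress) ;;
        *)
            echo "shrink_pdf: unknown preset '$3'" >&2
            return 2
            ;;
    esac
    gs -dNOPAUSE -dBATCH -dQUIET -sDEVICE=pdfwrite \
        -dCompatibilityLevel=1.4 \
        -dPDFSETTINGS="/${3:-ebook}" \
        -sOutputFile="$2" "$1"
}
```

For example, `shrink_pdf filein.pdf fileout.pdf screen` gives the smallest file, and comparing the result with `ls -l filein.pdf fileout.pdf` shows how much was saved.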

Editors

«Master PDF Editor is a proprietary application to edit PDF documents on Linux, Windows and macOS. It can create, edit (insert text or images), annotate, view, encrypt, and sign PDF documents. With version 5, Master PDF Editor has removed some features from its free to use version, like editing or adding text, inserting images, and more - when using such tools, the application adds a big watermark to the PDF document unless users buy the full version.» [linuxuprising.com]