Using this short and easy tutorial you will learn how to convert entire PDF documents, or just certain pages, to various image formats and resolutions.
1. Install tool
If you are already logged in as the superuser (root), drop ‘sudo’ from the commands below.
You can install this tool on Ubuntu, Debian, Mint, and other related distributions using:
$ sudo apt update
$ sudo apt install poppler-utils
For Redhat, CentOS, Fedora, and other related distributions, you need to use yum or dnf:
$ sudo dnf install poppler-utils -y
$ sudo yum install poppler-utils -y
Arch Linux uses pacman:
$ sudo pacman -S poppler
In OpenSUSE you need to use zypper:
$ sudo zypper install poppler-tools
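If you set up several machines, the per-distribution commands above can be folded into one helper. The following is a minimal sketch, not part of poppler: the function name poppler_install_cmd is made up for illustration, and it only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print the install command for a given package manager.
# The package names are the ones listed in this section.
poppler_install_cmd() {
    case "$1" in
        apt)    echo "sudo apt install poppler-utils" ;;
        dnf)    echo "sudo dnf install poppler-utils -y" ;;
        yum)    echo "sudo yum install poppler-utils -y" ;;
        pacman) echo "sudo pacman -S poppler" ;;
        zypper) echo "sudo zypper install poppler-tools" ;;
        *)      echo "unsupported package manager: $1" >&2; return 1 ;;
    esac
}

# Print the command for the first package manager found on this machine.
for pm in apt dnf yum pacman zypper; do
    if command -v "$pm" >/dev/null 2>&1; then
        poppler_install_cmd "$pm"
        break
    fi
done
```

After installing, running pdftoppm -v prints the version, which confirms the tool is on your PATH.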
I will be using CentOS 8 for this tutorial.
If you are here, you already know what a PDF is, but just to be clear, PDF stands for “portable document format”. This format is used when you need to save files that cannot be modified but still need to be easily shared and printed.
The tool we are using to convert PDF files to images can create PPM files (color image files), PGM files (grayscale image files), PBM files (monochrome image files), or the popular format of PNG files.
I will use a 78-page PDF called Origami.pdf to demonstrate a few operations.
2.1 Convert the entire PDF file to image(s)
Please keep in mind that the pdftoppm tool creates an image file for each PDF page. This operation will create 78 image files.
The simplest way to use the tool is “sudo pdftoppm PDF_name IMAGE_name”. I should expect 78 files, right?
$ sudo pdftoppm Origami.pdf Origami
Each file on a single line:
$ ls -lh
$ ls -lh | grep ppm | wc -l # this command lists the files in the current folder, keeps the lines that contain ‘ppm’, and counts them
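To check the result without counting by eye, you can compare the number of generated files with the page count reported by pdfinfo, which ships in the same poppler package. This is a small sketch, assuming pdfinfo prints a "Pages:" line; the function names are illustrative.

```shell
#!/bin/sh
# Extract the number after "Pages:" from pdfinfo output read on stdin.
pages_from_pdfinfo() {
    awk '/^Pages:/ { print $2 }'
}

# Count files named PREFIX-*.EXT in the current directory.
# $1 = prefix, $2 = extension
count_images() {
    set -- "$1"-*."$2"
    # If nothing matched, the pattern is left unexpanded and does not exist.
    if [ -e "$1" ]; then echo "$#"; else echo 0; fi
}

# Compare both numbers when the tool and the sample PDF are available.
if command -v pdfinfo >/dev/null 2>&1 && [ -f Origami.pdf ]; then
    expected=$(pdfinfo Origami.pdf | pages_from_pdfinfo)
    actual=$(count_images Origami ppm)
    echo "expected $expected pages, found $actual images"
fi
```

Counting with a glob instead of ls | grep avoids matching unrelated files that merely have "ppm" somewhere in their listing line.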
2.2 Modify the resolution of the images
By default, without any parameters, the pdftoppm tool creates 150 DPI (Dots Per Inch) images, which is a good choice for black and white text documents because you use less storage space, but not such a good idea for pictures. The above ppm files each occupy 6.1 MB.
To modify the resolution, you need to add two arguments, -rx and -ry, which set the resolution on the X and Y axes. Let’s try 300 DPI:
$ sudo pdftoppm -rx 300 -ry 300 Origami.pdf Origami
At 300 DPI, the size of each file is 25 MB.
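The jump from 6.1 MB to 25 MB follows from how PPM stores pixels: uncompressed, at 3 bytes per pixel, so doubling the DPI quadruples the pixel count. Here is a rough estimate, assuming US Letter pages (612 x 792 points; the page size of Origami.pdf is not stated in this tutorial, so treat it as an assumption) and ignoring the small file header. The function name is illustrative.

```shell
#!/bin/sh
# Estimate the size in bytes of one color PPM page.
# $1 = page width in points, $2 = page height in points, $3 = DPI
# 1 point = 1/72 inch; a color PPM uses 3 bytes per pixel (8-bit RGB).
ppm_bytes() {
    w=$(( $1 * $3 / 72 ))   # width in pixels
    h=$(( $2 * $3 / 72 ))   # height in pixels
    echo $(( w * h * 3 ))
}

for dpi in 150 300 600; do
    echo "$dpi DPI: $(ppm_bytes 612 792 "$dpi") bytes"
done
```

Under that assumption the estimate is about 6.3 million bytes at 150 DPI and 25.2 million bytes at 300 DPI, close to the sizes reported by ls -lh. When both axes use the same value, pdftoppm also accepts a single -r option, so pdftoppm -r 300 Origami.pdf Origami is a shorter way to write the command above.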
2.3 Select a range of pages to be converted to images
Sometimes you don’t need the entire PDF file. Maybe you need just the first page, or just a range of pages, say from page 5 to page 10. The pdftoppm tool makes this easy for you. You can specify the range using the -f (first) and -l (last) arguments.
Just the first page:
$ sudo pdftoppm -f 1 -l 1 Origami.pdf Origami
Pages 5 to 10:
$ sudo pdftoppm -f 5 -l 10 Origami.pdf Origami
Of course, you can combine the resolution arguments with the page range ones. I’ll extract pages 12 to 15 with a 600 DPI resolution:
$ sudo pdftoppm -rx 600 -ry 600 -f 12 -l 15 Origami.pdf Origami
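If you run conversions like these often, a tiny wrapper can catch a reversed page range before pdftoppm starts. This is only a sketch: pdftoppm_cmd is a hypothetical name, and it prints the command instead of executing it so you can inspect it first.

```shell
#!/bin/sh
# Build a pdftoppm command line for a page range at a given DPI.
# $1 = first page, $2 = last page, $3 = DPI, $4 = PDF file, $5 = image prefix
pdftoppm_cmd() {
    if [ "$1" -lt 1 ] || [ "$1" -gt "$2" ]; then
        echo "invalid page range: $1-$2" >&2
        return 1
    fi
    echo "pdftoppm -rx $3 -ry $3 -f $1 -l $2 $4 $5"
}

# Same conversion as above: pages 12 to 15 at 600 DPI.
pdftoppm_cmd 12 15 600 Origami.pdf Origami
```

To run the result, pipe it to sh, for example pdftoppm_cmd 12 15 600 Origami.pdf Origami | sh. File names containing spaces would need quoting that this sketch does not handle.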
As you can see, the images are huge, 97 MB, but the quality is excellent.
2.4 Change the image file format
As I mentioned at the beginning of the article, multiple image file formats are available when using this tool. A popular choice is PNG (Portable Network Graphics). To convert the PDF to PNG, just use the -png argument. I’ll convert pages 18 to 20 to 300 DPI PNG files, which combines everything shown in this tutorial so far:
$ sudo pdftoppm -png -rx 300 -ry 300 -f 18 -l 20 Origami.pdf Origami
Notice that a 300 DPI PNG file is smaller than a 150 DPI PPM file.
If you are curious about how the metadata looks for the created images, you can check How to Get Image Metadata in Linux.
Let’s compare the two file formats at the same resolution. I will create 150 DPI images from the page range 5-7. Remember, 150 DPI is the default option, so you do not need to specify the -rx and -ry arguments:
$ sudo pdftoppm -png -f 5 -l 7 Origami.pdf Origami
At the same resolution, 150 DPI, the PNG file is around 4 times smaller than the PPM file. That is because compression is used for the PNG file format.
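To measure the difference on your own files, you can divide one file size by the other. This is a sketch using only wc; the function name is illustrative, the result is an integer ratio (rounded down), and the Origami-05 file names are assumed from the two-digit numbering of a 78-page document.

```shell
#!/bin/sh
# Print how many times larger file $1 is than file $2 (integer division).
size_ratio() {
    # $(( ... )) also strips any padding that wc may add around the number.
    big=$(( $(wc -c < "$1") ))
    small=$(( $(wc -c < "$2") ))
    if [ "$small" -eq 0 ]; then
        echo "second file is empty" >&2
        return 1
    fi
    echo $(( big / small ))
}

# Example with the page-5 images, if both exist in the current directory.
if [ -f Origami-05.ppm ] && [ -f Origami-05.png ]; then
    size_ratio Origami-05.ppm Origami-05.png
fi
```

The ratio depends heavily on page content: flat colors and text compress far better than photographs, so expect different numbers for other PDFs.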
Other file formats which you can use are:
- PBM – monochrome files; use the -mono argument instead of -png:
$ sudo pdftoppm -mono Origami.pdf Origami
- PGM – grayscale files; use the -gray argument instead of -png:
$ sudo pdftoppm -gray Origami.pdf Origami
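The format options covered in this tutorial can be summarised as a lookup from the desired file extension to the pdftoppm flag. This is a sketch with a made-up function name, covering only the four formats discussed here.

```shell
#!/bin/sh
# Map an output extension to the matching pdftoppm flag.
# PPM is the default format, so it needs no flag (empty output).
format_flag() {
    case "$1" in
        ppm) echo "" ;;
        png) echo "-png" ;;
        pbm) echo "-mono" ;;
        pgm) echo "-gray" ;;
        *)   echo "unknown format: $1" >&2; return 1 ;;
    esac
}

# Convert page 5 to every format, if the tool and the sample PDF are present.
if command -v pdftoppm >/dev/null 2>&1 && [ -f Origami.pdf ]; then
    for ext in ppm png pbm pgm; do
        # The substitution is left unquoted on purpose so an empty flag vanishes.
        pdftoppm $(format_flag "$ext") -f 5 -l 5 Origami.pdf Origami
    done
fi
```

Because each format gets its own extension, the four output files for page 5 share the Origami prefix without overwriting each other.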
I’ve generated mono and gray files from page 5 of the PDF:
If you need to archive all the images into a single file, you can use the zip command. If you are getting “zip: command not found”, you need to install it:
-bash: zip: command not found
$ sudo dnf install zip -y
The syntax is ‘zip archive_name file(s)_name’. The following command archives all Origami-XX images in an archive called origami.zip:
$ zip origami.zip Origami-*
Note that PNG files are already compressed and do not decrease in size when archived.
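Since PNG data is already compressed, you can tell zip to store those files as they are and spend effort only on the uncompressed formats. In this sketch, -0 (store) and -9 (best compression) are standard zip options, while the helper name and the choice of levels are assumptions made for illustration.

```shell
#!/bin/sh
# Pick a zip compression level from the file extension.
zip_level_for() {
    case "$1" in
        *.png) echo "-0" ;;   # already compressed: store only
        *)     echo "-9" ;;   # PPM/PGM/PBM are uncompressed: compress fully
    esac
}

# Add each image to origami.zip with the level chosen above.
if command -v zip >/dev/null 2>&1; then
    for f in Origami-*; do
        [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
        zip -q "$(zip_level_for "$f")" origami.zip "$f"
    done
fi
```

Running unzip -l origami.zip afterwards lists the archive contents with their sizes.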
In conclusion, the developers of the pdftoppm tool provide this description:
“pdftoppm converts Portable Document Format (PDF) files to color image files in Portable Pixmap (PPM) format, grayscale image files in Portable Graymap (PGM) format, or monochrome image files in Portable Bitmap (PBM) format. Pdftoppm reads the PDF file, PDF-file, and writes one PPM file for each page, PPM-nnnnnn.ppm, where nnnnnn is the page number.”