
How do I save an image PDF file as an image?

I have a PDF that contains a scanned image of a document. I want to save the contents of this PDF as an image so that I can then run it through an OCR program that only accepts .jpg, .png, and .gif files.

Asked by: Guest | Views: 259
Total answers/comments: 5
bert [Entry]

"Please pay close attention to pooryorick's answer, in which he points out how sleske's answer is actually a much better answer for this particular problem.

Use GhostScript. This command works for me:

gs -dBATCH -dNOPAUSE -sDEVICE=png16m -dGraphicsAlphaBits=4 -dTextAlphaBits=4 -r150 -sOutputFile=output%d.png input.pdf

There are multiple png pseudo-devices, which differ in color depth: pngmono, pnggray, png16, png256, png16m, and pngalpha. Choose whichever one suits you best.

You can also use the jpeg device, but unless disk space is an issue, you want the highest quality you can manage for your OCR, and JPEG's lossy compression works against that.

GhostScript no longer has support for gif, but I can't imagine why you'd need that, what with png256 support."
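For anyone scripting this, the Ghostscript command above can be wrapped in a small shell function. This is only a sketch: the function name render_pdf, its argument order, and the zero-padded output naming are assumptions made for illustration, not anything Ghostscript defines. The flags are the ones from the command above, with the resolution and device made adjustable.

```shell
#!/bin/sh
# Sketch only: render_pdf and its defaults are illustrative, not part of Ghostscript.
render_pdf() {
    input=$1                 # PDF to render
    dpi=${2:-150}            # value passed to -r (150, as in the command above)
    device=${3:-png16m}      # one of the png devices listed above
    prefix=${input%.pdf}     # "scan.pdf" -> "scan"

    # The anti-aliasing flags are copied from the command above; they may not
    # apply to 1-bit devices such as pngmono.
    gs -dBATCH -dNOPAUSE -sDEVICE="$device" \
       -dGraphicsAlphaBits=4 -dTextAlphaBits=4 -r"$dpi" \
       -sOutputFile="${prefix}-%03d.png" "$input"
}

# Example: render_pdf scan.pdf 300
```

Called as in the example, this should write one file per page (scan-001.png, scan-002.png, and so on). A higher -r value generally gives an OCR program more pixels to work with, at the cost of larger files.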
bert [Entry]

"There's also pdfimages from the Xpdf tools (available from the site of XpdfReader). It does not convert a whole PDF page to an image; instead, it extracts the embedded images from a PDF.

This is useful if the PDF contains text and images, and you want only the images. Also, it will extract the images in their original format, so no loss of quality is involved (unlike programs which render the whole page and then convert it to e.g. JPEG). Depending on your needs this might be useful.

Simple usage:

pdfimages -j mydocument.pdf mydocument-images

This will read the input file mydocument.pdf, extract all images and write them to individual files named mydocument-images-0000.jpg, mydocument-images-0001.jpg etc.

Option -j makes it write embedded JPEG-compressed images as JPEG files, not as PBM/PGM/PPM files (which are uncompressed and huge). Note that images may still be written as PBM/PGM/PPM files, if that's how they were stored in the PDF input file."
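Since pdfimages can still emit PBM/PGM/PPM files, which an OCR program limited to .jpg, .png, and .gif would not accept, it can help to sort the output afterwards. A minimal sketch, assuming a helper named extract_images and the "ready" / "needs conversion" labels, all of which are made up for illustration:

```shell
#!/bin/sh
# Sketch only: extract_images and its output labels are illustrative.
extract_images() {
    pdf=$1     # input PDF
    root=$2    # image root, e.g. "mydocument-images"

    # -j writes embedded JPEG-compressed images as .jpg files
    pdfimages -j "$pdf" "$root" || return 1

    # Report which extracted files can go straight to the OCR program
    for f in "$root"-*; do
        [ -e "$f" ] || continue    # the glob matched nothing
        case $f in
            *.jpg|*.png|*.gif) echo "ready: $f" ;;
            *.pbm|*.pgm|*.ppm) echo "needs conversion: $f" ;;
            *)                 echo "unknown: $f" ;;
        esac
    done
}

# Example: extract_images mydocument.pdf mydocument-images
```

Anything reported as "needs conversion" would have to be converted to PNG with an image tool of your choice before OCR.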
bert [Entry]

"PDFill PDF Tools is probably the easiest way to convert your PDFs to images on Windows. It'll let you export all the pages in the PDF to separate images in one shot. It also has a lot of other features available for free, which are only available in other PDF viewers if you purchase the commercial or "Pro" version.

Use the "Convert PDF to Images" button (button #10).

If you need to concatenate the images into one very tall image so you only have to feed one file to your OCR program, you can use IrfanView."
bert [Entry]

"Since you didn't include an OS tag, I'll include an OS X answer:

PDFs open in Preview.app by default, and its File -> Save As dialog lets you save in any of these formats:

GIF
ICNS
JPEG
JPEG-2000
BMP
OpenEXR
Photoshop
PNG
TGA
TIFF"
bert [Entry]

"PDF-XChange Viewer (free) will also export to image files: File → Export → Export to image.

Not only that, but I think it's the best free PDF viewer for Windows, and it has some nice markup capabilities. I have a license for Adobe Acrobat and I still prefer this unless I'm doing extensive editing, which is rare."