Apache PDFBox将pdf转换为图像

Apache PDFBox convert pdf to images

有人可以举例说明如何使用Apache PDFBox在不同的图像中转换pdf(pdf的每一页都有一个)。 提前致谢


1.8。*版本的解决方案:

1
2
3
4
5
6
7
8
9
10
PDDocument document = PDDocument.loadNonSeq(new File(pdfFilename), null);
List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
int page = 0;
for (PDPage pdPage : pdPages)
{
    ++page;
    BufferedImage bim = pdPage.convertToImage(BufferedImage.TYPE_INT_RGB, 300);
    ImageIOUtil.writeImage(bim, pdfFilename +"-" + page +".png", 300);
}
document.close();

在构建之前,不要忘记阅读1.8依赖项页面。

2.0版本的解决方案:

1
2
3
4
5
6
7
8
9
10
PDDocument document = PDDocument.load(new File(pdfFilename));
PDFRenderer pdfRenderer = new PDFRenderer(document);
for (int page = 0; page < document.getNumberOfPages(); ++page)
{
    BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);

    // suffix in filename will be used as the file format
    ImageIOUtil.writeImage(bim, pdfFilename +"-" + (page+1) +".png", 300);
}
document.close();

ImageIOUtil类位于单独的下载/工件(pdf-tools)中。在进行构建之前,请阅读2.0依赖关系页面,您将需要额外的带有jbig2图像的PDF文件,用于保存到tiff图像以及读取加密文件。

确保使用您正在使用的任何JDK版本的最新版本,即如果您使用的是jdk8,则不要使用版本1.8.0_5,请使用1.8.0_191或您阅读时的最新版本。早期版本非常慢。


我今天用PdfBox 2.0.15试了一下。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
import org.apache.pdfbox.pdmodel.*;
import org.apache.pdfbox.rendering.*;
import java.awt.image.*;
import java.io.*;
import javax.imageio.*;


public static void PDFtoJPG (String in, String out) throws Exception
{
    PDDocument pd = PDDocument.load (new File (in));
    PDFRenderer pr = new PDFRenderer (pd);
    BufferedImage bi = pr.renderImageWithDPI (0, 300);
    ImageIO.write (bi,"JPEG", new File (out));
}


没有任何额外的依赖项,您可以使用PDFBox中已包含的PDFToImage类。

科特林:

PDFToImage.main(arrayOf("-outputPrefix","newImgFilenamePrefix", existingPdfFilename))

其他配置选择:https://pdfbox.apache.org/docs/2.0.8/javadocs/org/apache/pdfbox/tools/PDFToImage.html


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
public class PDFtoJPGConverter {

    public List<File> convertPdfToImage(File file, String destination) throws Exception {

    File destinationFile = new File(destination);

    if (!destinationFile.exists()) {
        destinationFile.mkdir();
        System.out.println("DESTINATION FOLDER CREATED ->" + destinationFile.getAbsolutePath());
    }else if(destinationFile.exists()){
        System.out.println("DESTINATION FOLDER ALLREADY CREATED!!!");
    }else{
        System.out.println("DESTINATION FOLDER NOT CREATED!!!");
    }

    if (file.exists()) {
        PDDocument doc = PDDocument.load(file);
        PDFRenderer renderer = new PDFRenderer(doc);
        List<File> fileList = new ArrayList<File>();

        String fileName = file.getName().replace(".pdf","");
        System.out.println("CONVERTER START.....");

        for (int i = 0; i < doc.getNumberOfPages(); i++) {
        // default image files path: original file path
        // if necessary, file.getParent() +"/" => another path
        File fileTemp = new File(destination + fileName +"_" + i +".jpg"); // jpg or png
        BufferedImage image = renderer.renderImageWithDPI(i, 200);
        // 200 is sample dots per inch.
        // if necessary, change 200 into another integer.
        ImageIO.write(image,"JPEG", fileTemp); // JPEG or PNG
        fileList.add(fileTemp);
        }
        doc.close();
        System.out.println("CONVERTER STOPTED.....");
        System.out.println("IMAGE SAVED AT ->" + destinationFile.getAbsolutePath());
        return fileList;
    } else {
        System.err.println(file.getName() +" FILE DOES NOT EXIST");
    }
    return null;
    }

    public static void main(String[] args) {

    try {
        PDFtoJPGConverter converter = new PDFtoJPGConverter();
        Scanner sc = new Scanner(System.in);
        System.out.print("Enter your destination folder where save image
");
        // Destination = D:/PPL/;
        String destination = sc.nextLine();

        System.out.print("Enter your selected pdf files name with source folder
");
        String sourcePathWithFileName = sc.nextLine();
        // Source Path = D:/PDF/ant.pdf,D:/PDF/abc.pdf,D:/PDF/xyz.pdf
        if (sourcePathWithFileName != null || sourcePathWithFileName !="") {
        String[] files = sourcePathWithFileName.split(",");
        for (String file : files) {
            File pdf = new File(file);
            System.out.print("FILE:>>"+ pdf);
            converter.convertPdfToImage(pdf, destination);
        }
        }

    } catch (Exception ex) {
        ex.printStackTrace();
    }
    }
}

====================================

在这里,我使用Apache pdfbox-2.0.8,commons-logging-1.2和fontbox-2.0.8 Library

快乐编码:)