Get Image from the document using Apache POI
I am using Apache POI to read images from a docx file.
Here is my code:
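```java
import java.awt.Image;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.List;
import javax.imageio.ImageIO;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;

public Image ReadImg(int imageid) throws IOException {
    XWPFDocument doc = new XWPFDocument(new FileInputStream("import.docx"));
    BufferedImage jpg = null;
    List<XWPFPictureData> pic = doc.getAllPictures();
    XWPFPictureData pict = pic.get(imageid);
    String extract = pict.suggestFileExtension(); // not used below
    byte[] data = pict.getData();
    // try to read image data using javax.imageio.* (JDK 1.4+)
    jpg = ImageIO.read(new ByteArrayInputStream(data));
    return jpg;
}
```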
It reads the images correctly, but not in the right order.
For example, if the document contains
image1.jpeg
image2.jpeg
image3.jpeg
image4.jpeg
image5.jpeg
they are displayed as
image4
image3
image1
image5
image2
Can you help me solve this?
I want to read the images in the order they appear in the document.
Thanks,
西西克
```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.util.List;
import javax.imageio.ImageIO;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;

public static void extractImages(XWPFDocument docx) {
    try {
        List<XWPFPictureData> piclist = docx.getAllPictures();
        // traverse through the list and write each image to a file
        for (XWPFPictureData pic : piclist) {
            byte[] bytepic = pic.getData();
            BufferedImage imag = ImageIO.read(new ByteArrayInputStream(bytepic));
            // note: every image is re-encoded as JPEG, whatever its original format
            ImageIO.write(imag, "jpg", new File("D:/imagefromword/" + pic.getFileName()));
        }
    } catch (Exception e) {
        // report the failure before exiting instead of swallowing it
        e.printStackTrace();
        System.exit(-1);
    }
}
```
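As the question reports, `getAllPictures()` does not necessarily return pictures in the order they appear in the text. One possible way around this is to walk the document body and collect the pictures embedded in each run. The following is a minimal sketch, not a definitive fix: the class name `OrderedImageReader` and the method name `picturesInDocumentOrder` are made up for illustration, and it assumes the images are inline pictures inside top-level body paragraphs (pictures in tables, headers, or footers would not be visited).

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFPicture;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;
import org.apache.poi.xwpf.usermodel.XWPFRun;

// Hypothetical helper class, named here only for illustration.
class OrderedImageReader {

    /**
     * Collects picture data by walking body paragraphs and their runs,
     * so the result follows the order of appearance in the text.
     * Assumption: only top-level body paragraphs are visited.
     */
    static List<XWPFPictureData> picturesInDocumentOrder(XWPFDocument doc) {
        List<XWPFPictureData> ordered = new ArrayList<>();
        for (XWPFParagraph paragraph : doc.getParagraphs()) {
            for (XWPFRun run : paragraph.getRuns()) {
                for (XWPFPicture picture : run.getEmbeddedPictures()) {
                    XWPFPictureData data = picture.getPictureData();
                    // Skip pictures whose data cannot be resolved
                    if (data != null) {
                        ordered.add(data);
                    }
                }
            }
        }
        return ordered;
    }
}
```

In `ReadImg`, replacing `doc.getAllPictures()` with `OrderedImageReader.picturesInDocumentOrder(doc)` should then make `imageid` refer to the position of the picture in the text. If the same image is used twice, its picture data appears twice in the returned list.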