PDFBox 3: rendering pages of a large PDF file to images throws IllegalArgumentException: capacity < 0


I am using PDFBox 3.0.3 to render pages from very large PDF files (500 MB to 1 GB) to images.

This is the code I use to render a page to an image:

// Use a temp-file-only stream cache, and close the document when rendering is done.
try (PDDocument document = Loader.loadPDF(pdfFile, IOUtils.createTempFileOnlyStreamCache())) {
    PDFRenderer pdfRenderer = new PDFRenderer(document);
    pdfRenderer.setSubsamplingAllowed(true); // allow the renderer to subsample large images
    BufferedImage image = pdfRenderer.renderImage(0, scale, ImageType.RGB);
    ImageIO.write(image, "png", imageFile);
}

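While debugging, I noticed that the problem comes from the decode method of Filter. When the length is roughly between 524,288,000 and 1,048,576,000, the value passed to the RandomAccessReadWriteBuffer constructor is negative: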

randomAccessWriteBuffer = new RandomAccessReadWriteBuffer(
    Math.min(length << 2, RandomAccessReadBuffer.DEFAULT_CHUNK_SIZE_4KB));
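
The negative capacity in the log below is consistent with 32-bit overflow of `length << 2`: for an `int`, the shift wraps negative once `length` reaches 2^29 (536,870,912). The following is a minimal standalone sketch, not PDFBox's code. It assumes `DEFAULT_CHUNK_SIZE_4KB` is 4096, and `capacityWithLongShift` is a hypothetical guard for illustration, not necessarily the change made in PDFBOX-5908.

```java
public class ShiftOverflowDemo {
    // Assumption: same value as RandomAccessReadBuffer.DEFAULT_CHUNK_SIZE_4KB (4 KB).
    static final int CHUNK_SIZE_4KB = 1 << 12;

    // Mirrors the expression from the snippet above: the shift happens in int arithmetic,
    // so the result wraps around for lengths of 2^29 and above.
    static int capacityWithIntShift(int length) {
        return Math.min(length << 2, CHUNK_SIZE_4KB);
    }

    // Hypothetical guard: widen to long before shifting so the value cannot wrap.
    static int capacityWithLongShift(int length) {
        return (int) Math.min((long) length << 2, CHUNK_SIZE_4KB);
    }

    public static void main(String[] args) {
        int length = 1_054_873_019; // a decoded stream length slightly above 1 GB
        System.out.println(capacityWithIntShift(length));  // prints -75475220
        System.out.println(capacityWithLongShift(length)); // prints 4096
    }
}
```

With this particular length the wrapped value is exactly the -75475220 reported in the exception, which suggests the failing image stream was about 1.05 GB long.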

The resulting log output and stack trace:

o.a.p.contentstream.PDFStreamEngine - java.lang.IllegalArgumentException: capacity < 0: (-75475220 < 0)
java.io.IOException: java.lang.IllegalArgumentException: capacity < 0: (-75475220 < 0)
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.getRGBImage(SampledImageReader.java:223)
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.getImage(PDImageXObject.java:477)
    at org.apache.pdfbox.rendering.PageDrawer.drawImage(PageDrawer.java:1103)
    at org.apache.pdfbox.contentstream.operator.graphics.DrawObject.process(DrawObject.java:74)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:893)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:531)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:506)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:153)
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:286)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:330)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:247)
Caused by: java.lang.IllegalArgumentException: capacity < 0: (-75475220 < 0)
    at java.base/java.nio.Buffer.createCapacityException(Buffer.java:290)
    at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:390)
    at org.apache.pdfbox.io.RandomAccessReadBuffer.<init>(RandomAccessReadBuffer.java:70)
    at org.apache.pdfbox.io.RandomAccessReadWriteBuffer.<init>(RandomAccessReadWriteBuffer.java:40)
    at org.apache.pdfbox.filter.Filter.decode(Filter.java:250)
    at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:73)
    at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:172)
    at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:193)
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.createInputStream(PDImageXObject.java:895)
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.from8bit(SampledImageReader.java:469)
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.getRGBImage(SampledImageReader.java:217)

Is this a bug in the library? Is there any way to work around it? Thanks.

java pdfbox
1 Answer

This was fixed in PDFBOX-5908, and the fix will be included in PDFBox 3.0.4. A snapshot build is available; please test it to confirm the fix. Thanks for reporting this.
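
After upgrading, it can be useful to confirm at startup that the PDFBox version on the classpath is at least 3.0.4. The helper below is a hypothetical, self-contained sketch: `hasFix` is not a PDFBox API, and it assumes that a snapshot such as `3.0.4-SNAPSHOT` already contains the fix, as the answer indicates. In a real application the version string could presumably be read from `org.apache.pdfbox.util.Version.getVersion()`.

```java
public class PdfBoxVersionCheck {
    // Hypothetical helper: true if the version string is 3.0.4 or later.
    static boolean hasFix(String version) {
        // Drop a qualifier such as "-SNAPSHOT", then split into numeric components.
        String[] parts = version.split("-", 2)[0].split("\\.");
        int[] minimum = {3, 0, 4};
        for (int i = 0; i < minimum.length; i++) {
            // Missing components (e.g. "3.1") are treated as 0.
            int part = i < parts.length ? Integer.parseInt(parts[i]) : 0;
            if (part != minimum[i]) {
                return part > minimum[i];
            }
        }
        return true; // exactly 3.0.4
    }

    public static void main(String[] args) {
        System.out.println(hasFix("3.0.3"));          // prints false
        System.out.println(hasFix("3.0.4-SNAPSHOT")); // prints true
    }
}
```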
