"java.lang.StringIndexOutOfBoundsException: begin 0, end -1, length 2" when converting DOCX to PDF with docx4j

I have the following code, which uses docx4j to convert a DOCX file to PDF:

```java
// Load the DOCX and decide where the PDF will be written.
final WordprocessingMLPackage wordMlPackage = Docx4J.load(inputStream);
final var outputPath = Paths.get(tmpDir.toAbsolutePath().toString(), "%s.%s".formatted("output", "pdf"));
final var pdfFile = outputPath.toFile();

// Map the fonts used in the document to fonts available on this machine.
final var mapper = new BestMatchingMapper();
final var fonts = wordMlPackage.getMainDocumentPart().getFontTablePart().getJaxbElement();
final var fontsInUse = wordMlPackage.getMainDocumentPart().fontsInUse();
mapper.populateFontMappings(fontsInUse, fonts);
wordMlPackage.setFontMapper(mapper);

// Convert to PDF (via XSL-FO); the stream is closed automatically.
try (FileOutputStream fos = new FileOutputStream(pdfFile)) {
  Docx4J.toPDF(wordMlPackage, fos);
}
```

When I run it, I get the following error:

2024-06-25 21:35:09.742 ERROR [-] [-] [o.d.c.o.f.XsltFOFunctions] [true] >>> begin 0, end -1, length 2

java.lang.StringIndexOutOfBoundsException: begin 0, end -1, length 2
    at java.base/java.lang.String.checkBoundsBeginEnd(String.java:4606)
    at java.base/java.lang.String.substring(String.java:2709)
    at org.docx4j.model.properties.run.Lang.setXslFO(Lang.java:71)
    at org.docx4j.convert.out.fo.XsltFOFunctions.createFoAttributes(XsltFOFunctions.java:934)
    at org.docx4j.convert.out.fo.XsltFOFunctions.createBlockForRPr(XsltFOFunctions.java:906)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.docx4j.org.apache.xalan.extensions.ExtensionHandlerJavaPackage.callFunction(ExtensionHandlerJavaPackage.java:300)
    at org.docx4j.org.apache.xalan.extensions.ExtensionHandlerJavaPackage.callFunction(ExtensionHandlerJavaPackage.java:440)
    at org.docx4j.org.apache.xalan.extensions.ExtensionsTable.extFunction(ExtensionsTable.java:226)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.extFunction(TransformerImpl.java:491)
    at org.docx4j.org.apache.xpath.functions.FuncExtFunction.execute(FuncExtFunction.java:208)
    at org.docx4j.org.apache.xpath.XPath.execute(XPath.java:342)
    at org.docx4j.org.apache.xalan.templates.ElemCopyOf.execute(ElemCopyOf.java:134)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2418)
    at org.docx4j.org.apache.xalan.templates.ElemChoose.execute(ElemChoose.java:128)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2418)
    at org.docx4j.org.apache.xalan.templates.ElemChoose.execute(ElemChoose.java:141)
    at org.docx4j.org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:395)
    at org.docx4j.org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:178)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2418)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.transformToRTF(TransformerImpl.java:2006)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.transformToRTF(TransformerImpl.java:1928)
    at org.docx4j.org.apache.xalan.templates.ElemVariable.getValue(ElemVariable.java:312)
    at org.docx4j.org.apache.xalan.templates.ElemVariable.execute(ElemVariable.java:248)
    at org.docx4j.org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:395)
    at org.docx4j.org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:178)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2418)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.transformToRTF(TransformerImpl.java:2006)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.transformToRTF(TransformerImpl.java:1928)
    at org.docx4j.org.apache.xalan.templates.ElemVariable.getValue(ElemVariable.java:312)
    at org.docx4j.org.apache.xalan.templates.ElemVariable.execute(ElemVariable.java:248)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2418)
    at org.docx4j.org.apache.xalan.templates.ElemChoose.execute(ElemChoose.java:128)
    at org.docx4j.org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:395)
    at org.docx4j.org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:178)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2418)
    at org.docx4j.org.apache.xalan.templates.ElemLiteralResult.execute(ElemLiteralResult.java:1376)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2418)
    at org.docx4j.org.apache.xalan.templates.ElemLiteralResult.execute(ElemLiteralResult.java:1376)
    at org.docx4j.org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:395)
    at org.docx4j.org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:178)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2418)
    at org.docx4j.org.apache.xalan.templates.ElemLiteralResult.execute(ElemLiteralResult.java:1376)
    at org.docx4j.org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:395)
    at org.docx4j.org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:178)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2418)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.applyTemplateToNode(TransformerImpl.java:2288)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.transformNode(TransformerImpl.java:1374)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:727)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1291)
    at org.docx4j.org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1269)
    at org.docx4j.XmlUtils.transform(XmlUtils.java:1518)
    at org.docx4j.XmlUtils.transform(XmlUtils.java:1337)
    at org.docx4j.convert.out.common.AbstractXsltExporterDelegate.process(AbstractXsltExporterDelegate.java:66)
    at org.docx4j.convert.out.common.AbstractWmlExporter.process(AbstractWmlExporter.java:82)
    at org.docx4j.convert.out.common.AbstractWmlExporter.process(AbstractWmlExporter.java:32)
    at org.docx4j.convert.out.common.AbstractExporter.export(AbstractExporter.java:80)
    at org.docx4j.Docx4J.toFO(Docx4J.java:711)
    at org.docx4j.Docx4J.toPDF(Docx4J.java:744)
    at org.coursera.coursebuilder.managers.utils.ImageConverterImpl.convertDocxToImage(ImageConverterImpl.java:90)
    at org.coursera.coursebuilder.managers.utils.ImageConverterImpl.convertToImage(ImageConverterImpl.java:65)
    at org.coursera.coursebuilder.jobs.TextExtractJob.invoke(TextExtractJob.java:66)
    at org.coursera.coursebuilder.jobs.TextExtractJob.invoke(TextExtractJob.java:10)
    at org.coursera.common.job.managers.JobManager.invokeJob(JobManager.java:122)
    at org.coursera.common.job.managers.JobManager.lambda$invokeJobByName$1(JobManager.java:110)
    at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
    at org.springframework.security.concurrent.DelegatingSecurityContextRunnable.run(DelegatingSecurityContextRunnable.java:82)
    at io.grpc.Context$1.run(Context.java:566)
    at org.coursera.common.executors.ContextualizingThreadPoolExecutor$MdcContextTaskDecorator.lambda$decorate$0(ContextualizingThreadPoolExecutor.java:286)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)

The DOCX file that triggers the error is https://docs.google.com/document/d/1UbPOp_fYlCjp6E_TcU-Ujst6YOf0E4YR/edit?usp=sharing&ouid=116887955152906357976&rtpof=true&sd=true

I suspected a font-mapping problem, so I added code to handle the mapping explicitly (the BestMatchingMapper block in the snippet above), but the conversion still fails with the same exception.

docx4j docx-to-pdf-conversion
1 Answer

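Your document contains

<w:lang w:val="en"/>

but the code in org.docx4j.model.properties.run.Lang.setXslFO expects an ISO 639-1 language code, then a hyphen, then an ISO 3166-1 alpha-2 country code (for example, en-US). The bare value "en" has no hyphen to split on, which is consistent with the substring call at Lang.java:71 receiving an end index of -1 on a string of length 2. So the failure comes from the language attribute, not from the font mapping.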
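To make the mechanism concrete, here is a minimal plain-JDK sketch. It does not reproduce the actual docx4j source: `languagePartUnsafe` only mimics the kind of split the stack trace suggests (taking everything before the first hyphen), and `normalize` together with its `DEFAULT_REGION` table is a hypothetical helper showing one way a bare code such as "en" could be expanded to "en-US" before conversion.

```java
import java.util.Locale;
import java.util.Map;

final class LangTagSketch {

    // Hypothetical default regions for bare language codes; extend as needed.
    private static final Map<String, String> DEFAULT_REGION = Map.of(
            "en", "US",
            "de", "DE",
            "fr", "FR");

    /**
     * Mimics the suspected failure (an assumption, not the docx4j code):
     * indexOf returns -1 when the tag has no hyphen, so substring(0, -1) throws
     * StringIndexOutOfBoundsException.
     */
    static String languagePartUnsafe(String tag) {
        return tag.substring(0, tag.indexOf('-'));
    }

    /**
     * Returns a language-REGION tag. Tags that already contain a hyphen, null,
     * and codes without a known default region are returned unchanged.
     */
    static String normalize(String tag) {
        if (tag == null || tag.indexOf('-') >= 0) {
            return tag;
        }
        String region = DEFAULT_REGION.get(tag.toLowerCase(Locale.ROOT));
        return region == null ? tag : tag + "-" + region;
    }

    public static void main(String[] args) {
        try {
            languagePartUnsafe("en");
        } catch (StringIndexOutOfBoundsException e) {
            // The exact message text depends on the JDK version.
            System.out.println("Failed as expected: " + e.getMessage());
        }
        String fixed = normalize("en");
        System.out.println(fixed);                     // en-US
        System.out.println(languagePartUnsafe(fixed)); // en
    }
}
```

In practice the same normalization would have to be applied to every w:lang value in the package (run properties in the main document and in the styles part) before calling Docx4J.toPDF. Alternatively, the attribute can be corrected in the source document itself.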
