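I'm trying to extract images from a PDF and let the user pick images from a grid. Essentially I'm building a WYSIWYG SOP editor. Users create an SOP (Standard Operating Procedure) from some reference material and can insert images from the source material into the SOP.
I can extract the images in C# (back end) or JS (front end, Nuxt 3) without any problem; however, in some PDFs the images are essentially composites. Say I have an image of a car. I extract the image, and it turns out to actually be 5 sub-images stitched together. I've tried stitching the images back together by detecting overlapping lines, but sometimes they are broken up along a diagonal or into some odd shape, which makes things super complicated.
I've attached my initial attempt at reading images from the PDF on the front end using tato30/vue-pdf. As you can see from the sandbox below, it pulls the first 2 images fine, but then there are a large number of partial images that may overlap? Oddly, the sandbox also includes a URL for a PDF containing just the single page rather than the whole PDF, and if you feed that into the VuePDF component, you only get two images...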
https://codesandbox.io/p/devbox/ptmy44
Edit: If you can't view the code sandbox, here is the code.
<script setup lang="ts">
import { ref } from 'vue';
import { VuePDF, usePDF } from '@tato30/vue-pdf';
import * as PDFJS from 'pdfjs-dist';
const singlePagePdf = ref('https://stackoverflowrandomfile.blob.core.windows.net/randomfile/split-image-example.pdf');
const multiPagePdf = ref('https://stackoverflowrandomfile.blob.core.windows.net/randomfile/Walmart Supply Chain Standards 2023.pdf');
const currentPage = ref(48);
let { pdf, pages } = usePDF(multiPagePdf.value);
const loaded = ref(false);
const switchPdf = () => {
loaded.value = false;
images.value = [];
let returnObj;
if (currentPage.value === 48) {
returnObj = usePDF(singlePagePdf.value);
currentPage.value = 1
}
else {
returnObj = usePDF(multiPagePdf.value);
currentPage.value = 48
}
pdf = returnObj.pdf;
}
const images = ref<Array<{
url: string;
transform: any | null;
page: number;
width: number;
height: number;
}>>([]);
const getPageImages = (page: number) => { // the unused documentUrl parameter is dropped; the template passes only the page
loaded.value = true;
pdf.value.promise.then(async (document: any) => { // Replace 'any' with appropriate type if available
const pageProxy = await document.getPage(page);
const ops = await pageProxy.getOperatorList();
const objs: Array<{ type: string; transform: any | null; imageId: string }> = [];
for (let i = 0; i < ops.fnArray.length; i++) {
if (
ops.fnArray[i] === PDFJS.OPS.paintImageXObject ||
ops.fnArray[i] === PDFJS.OPS.transform
) {
const argsVals = ops.argsArray[i];
objs.push({
type: ops.fnArray[i] === PDFJS.OPS.paintImageXObject ? 'image' : 'transform',
transform: ops.fnArray[i] === PDFJS.OPS.transform ? argsVals : null,
imageId: argsVals[0],
});
}
}
objs.map(async (val, i) => {
if (val.type === 'image') {
const imageKey = val.imageId;
pageProxy.objs.get(imageKey, async (obj: any) => { // Replace 'any' with appropriate type if available
const bitmap = await createImageBitmap(obj.bitmap);
const ocanvas = new OffscreenCanvas(bitmap.width, bitmap.height);
const ctx = ocanvas.getContext("bitmaprenderer");
if (ctx) {
ctx.transferFromImageBitmap(bitmap);
const blob = await ocanvas.convertToBlob({ type: "image/png" });
const blobUrl = URL.createObjectURL(blob);
const transform = objs.length > 1 && i > 0 ? objs[i - 1] : null;
images.value.push({
url: blobUrl,
transform: transform ? transform.transform : null,
page,
width: ocanvas.width,
height: ocanvas.height,
});
}
});
}
});
})
};
</script>
<template>
<button v-if="loaded" @click="switchPdf">Switch Pdf</button>
<VuePDF :key="currentPage" @loaded="getPageImages(currentPage)" :pdf="pdf" :page="currentPage"></VuePDF>
<ul v-if="loaded">
<li v-for="image in images">
<img :src="image.url"/>
</li>
</ul>
<div v-else>Loading large pdf...</div>
</template>
<style scoped>
.read-the-docs {
color: #888;
}
</style>
Is there any known algorithm for joining images together using their shared edges?
As viewed in a PDF reader. For a single-page copy, see https://easyupload.io/8lk59v
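We can use a C command-line tool (mutool, shown below) to first extract all the image object numbers by querying the images embedded on page 48. However, it would be simpler and better to snapshot those regions as a single image type in a single RGB colour space.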
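To make the snapshot idea concrete, here is a minimal sketch of the arithmetic only: converting a region given in page points into a pixel crop rectangle on a page rendered at a chosen DPI. The names `Rect`, `PixelRect` and `toPixelCrop` are hypothetical, and the sketch assumes the region uses a top-left origin with y increasing downwards, the same orientation as the rendered bitmap.

```typescript
// Hypothetical types for illustration; they are not part of any library.
interface Rect { x0: number; y0: number; x1: number; y1: number }
interface PixelRect { x: number; y: number; width: number; height: number }

// Convert a region in PDF points (assumed top-left origin, y down) into a
// pixel crop rectangle on a page bitmap rendered at `dpi`.
function toPixelCrop(
  region: Rect,
  pageWidthPt: number,
  pageHeightPt: number,
  dpi: number
): PixelRect {
  const scale = dpi / 72; // one PDF point is 1/72 inch
  const pageW = Math.ceil(pageWidthPt * scale);
  const pageH = Math.ceil(pageHeightPt * scale);

  // Round outwards so no edge pixels of the region are lost,
  // then clamp to the rendered page.
  const x = Math.max(0, Math.floor(region.x0 * scale));
  const y = Math.max(0, Math.floor(region.y0 * scale));
  const right = Math.min(pageW, Math.ceil(region.x1 * scale));
  const bottom = Math.min(pageH, Math.ceil(region.y1 * scale));

  return {
    x,
    y,
    width: Math.max(0, right - x),
    height: Math.max(0, bottom - y),
  };
}
```

On the front end, one way to use the result would be to render the whole page to a canvas (for example with pdf.js `page.render` at a viewport scale of `dpi / 72`) and then copy the crop rectangle to a second canvas with `drawImage`. The browser canvas then holds RGB pixels regardless of the CMYK or indexed source images. That rendering step is not shown here and would need checking against the page's actual orientation.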
The details below show why such images will not combine well unless the PDF is rendered to a bitmap format.
mutool info -IM ..\..\..\..\downloads\page48.pdf
..\..\..\..\downloads\page48.pdf:
PDF-1.6
Info object (2491 0 R):
<</CreationDate(D:20230405132102-05'00')/Creator(Adobe InDesign 18.2 \(Macintosh\)\(FlexiPDF\))/ICNAppName(FlexiPDF)/ICNAppPlatform(Win)/ICNAppVersion(3.10.0)/ModDate(D:20250101133430)/Producer(Acrobat Distiller 23.0 \(Macintosh\))/Title(Supply Chain Packaging Guide)>>
Pages: 1
Retrieving info from pages 1-1...
Mediaboxes (1):
1 (307 0 R): [ 0 0 792 612 ]
Images (17):
1 (307 0 R): [ DCT ] 395x410 8bpc DevCMYK (311 0 R)
1 (307 0 R): [ DCT ] 317x247 8bpc DevCMYK (312 0 R)
1 (307 0 R): [ DCT ] 97x97 8bpc DevCMYK (313 0 R)
1 (307 0 R): [ DCT ] 79x48 8bpc DevCMYK (314 0 R)
1 (307 0 R): [ DCT ] 66x50 8bpc DevCMYK (315 0 R)
1 (307 0 R): [ DCT ] 97x97 8bpc DevCMYK (316 0 R)
1 (307 0 R): [ DCT ] 50x82 8bpc DevCMYK (317 0 R)
1 (307 0 R): [ DCT ] 50x66 8bpc DevCMYK (318 0 R)
1 (307 0 R): [ Flate ] 18x18 8bpc Idx (319 0 R)
1 (307 0 R): [ DCT ] 125x125 8bpc DevCMYK (320 0 R)
1 (307 0 R): [ DCT ] 63x75 8bpc DevCMYK (321 0 R)
1 (307 0 R): [ Flate ] 13x15 8bpc DevCMYK (322 0 R)
1 (307 0 R): [ DCT ] 125x125 8bpc DevCMYK (323 0 R)
1 (307 0 R): [ DCT ] 63x75 8bpc DevCMYK (324 0 R)
1 (307 0 R): [ Flate ] 13x13 8bpc DevCMYK (325 0 R)
1 (307 0 R): [ DCT ] 29x104 8bpc DevCMYK (326 0 R)
1 (307 0 R): [ Flate ] 6x6 8bpc DevCMYK (327 0 R)
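Now we know they are a mix of sizes, colour spaces, and file types, since some are JPEG (DCT) and some are not (Flate).
To get their coordinates, we can run a trace and keep only the image-drawing lines.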
mutool trace ..\..\..\..\downloads\page48.pdf |find "image"
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="227.439 0 -0 236.304 48.888 297.648" width="395" height="410"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="182.552 0 -0 141.984 308.088 289.72804" width="317" height="247"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="55.891 0 -0 55.584 349.848 300.528" width="97" height="97"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="45.519 0 -0 27.504 395.214 355.24803" width="79" height="48"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="38.029 0 -0 28.224 421.854 381.888" width="66" height="50"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="55.891 0 -0 55.584 403.848 300.528" width="97" height="97"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="28.795 0 -0 46.908 377.208 345.19804" width="50" height="82"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="28.795 0 -0 37.555 349.848 372.557" width="50" height="66"/>
<fill_image alpha="1" colorspace="Indexed(167,DeviceCMYK)" ri="1" bp="1" op="0" opm="1" transform="10.366 0 -0 10.198 403.848 355.248" width="18" height="18"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="71.965 0 -0 71.424 386.568 301.968" width="125" height="125"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="36.276 0 -0 42.623 351.288 366.049" width="63" height="75"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="7.4855 0 -0 8.1187 386.568 372.5283" width="13" height="15"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="71.965 0 -0 71.424 351.288 301.968" width="125" height="125"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="36.276 0 -0 42.623 421.848 366.049" width="63" height="75"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="7.4855 0 -0 7.4162 415.296 372.5278" width="13" height="13"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="16.704 0 -0 59.904 455.688 353.088" width="29" height="104"/>
<fill_image alpha="1" colorspace="DeviceCMYK" ri="1" bp="1" op="0" opm="1" transform="3.4577 0 -0 3.024 453.528 352.368" width="6" height="6"/>
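The transform values in the trace are enough to work out which fragments belong together without comparing any pixels. The sketch below is one possible approach, not an established algorithm from MuPDF or pdf.js. It reads each `transform="a b c d e f"` as the matrix that maps the image's unit square onto the page, takes the bounding box of the four transformed corners, and merges boxes that overlap or touch (within a tolerance) using union-find. The names `parseTransforms`, `bboxOf`, `groupRects` and the 0.5 pt tolerance are assumptions for illustration.

```typescript
interface Rect { x0: number; y0: number; x1: number; y1: number }

// Pull the six numbers out of every transform="a b c d e f" attribute.
function parseTransforms(trace: string): number[][] {
  const out: number[][] = [];
  const re = /transform="([^"]+)"/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(trace)) !== null) {
    const nums = m[1].trim().split(/\s+/).map(Number);
    if (nums.length === 6 && nums.every((n) => Number.isFinite(n))) out.push(nums);
  }
  return out;
}

// Bounding box of the unit square after applying the matrix [a b c d e f]:
// x' = a*x + c*y + e, y' = b*x + d*y + f. Using all four corners keeps the
// box valid for rotated or skewed images too.
function bboxOf([a, b, c, d, e, f]: number[]): Rect {
  const xs = [e, a + e, c + e, a + c + e];
  const ys = [f, b + f, d + f, b + d + f];
  return {
    x0: Math.min(...xs), y0: Math.min(...ys),
    x1: Math.max(...xs), y1: Math.max(...ys),
  };
}

// True when two boxes overlap or lie within `tol` points of each other,
// so tiles that merely share an edge (or a corner) are still joined.
function touches(p: Rect, q: Rect, tol: number): boolean {
  return p.x0 <= q.x1 + tol && q.x0 <= p.x1 + tol &&
         p.y0 <= q.y1 + tol && q.y0 <= p.y1 + tol;
}

// Union-find over all pairs; returns one merged bounding box per group.
function groupRects(rects: Rect[], tol = 0.5): Rect[] {
  const parent = rects.map((_, i) => i);
  const find = (i: number): number => {
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]]; // path halving
      i = parent[i];
    }
    return i;
  };
  for (let i = 0; i < rects.length; i++) {
    for (let j = i + 1; j < rects.length; j++) {
      if (touches(rects[i], rects[j], tol)) parent[find(i)] = find(j);
    }
  }
  const merged = new Map<number, Rect>();
  rects.forEach((r, i) => {
    const root = find(i);
    const g = merged.get(root);
    merged.set(root, g ? {
      x0: Math.min(g.x0, r.x0), y0: Math.min(g.y0, r.y0),
      x1: Math.max(g.x1, r.x1), y1: Math.max(g.y1, r.y1),
    } : { ...r });
  });
  return Array.from(merged.values());
}
```

Fed the seventeen transforms above via `groupRects(parseTransforms(traceText).map(bboxOf))`, this should give two regions: the 395x410 image on its own, and a second region where the other fifteen fragments all sit inside the 317x247 image's rectangle. Each region is then a candidate for a single rendered snapshot rather than a pixel-level stitch. The pairwise loop is quadratic, which is fine for the handful of images on one page.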