I want to convert multiple files from .docx to PDF using Python. My code does not run:
import re
import os
from pathlib import Path
import sys
from docx2pdf import convert
# The location where the files are located
input_path = r'c:\Folder7\input'
# The location where we will write the PDF files
output_path = r'c:\Folder7\output'
# Create the folder structure if it does not exist
os.makedirs(output_path, exist_ok=True)
# Check that the folder exists
directory_path = Path(input_path)
if directory_path.exists() and directory_path.is_dir():
    print(directory_path, "exists")
else:
    print(directory_path, "is invalid")
    sys.exit(1)
for file_path in directory_path.glob("*"):
    # file_path is a Path object
    print("Processing file:", file_path)
    document = Document()
    # file_path.name is the name of the file as str without the Path
    document.add_heading(file_path.name, 0)
    file_content = file_path.read_text(encoding='UTF-8')
    document.add_paragraph(file_content)
    # build the new path where we store the files
    output_file_path = os.path.join(output_path, file_path.name + ".pdf")
    document.save(output_file_path)
    print("Converted file:", file_path, "to:", output_file_path)
I get this error:
Traceback (most recent call last):
  File "D:\Convert docx to pdf.py", line 26, in <module>
    document = Document()
NameError: name 'Document' is not defined
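How can I make this code work?
This will work. It drops the Document calls and passes each file to convert from docx2pdf: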
import os
from pathlib import Path
import sys
from docx2pdf import convert
# The location where the files are located
input_path = r'c:\Folder7\input'
# The location where we will write the PDF files
output_path = r'c:\Folder7\output'
# Create the folder structure if it does not exist
os.makedirs(output_path, exist_ok=True)
# Check that the folder exists
directory_path = Path(input_path)
if directory_path.exists() and directory_path.is_dir():
    print(directory_path, "exists")
else:
    print(directory_path, "is invalid")
    sys.exit(1)
# Only pick up .docx files, so other files in the folder are not passed to convert
for file_path in directory_path.glob("*.docx"):
    print("Processing file:", file_path)
    # build the new path where we store the files
    output_file_path = os.path.join(output_path, file_path.stem + ".pdf")
    input_file_path = os.path.join(input_path, file_path.name)
    convert(input_file_path, output_file_path)
    print("Converted file:", file_path, "to:", output_file_path)
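As a possible refinement, the path handling can be separated from the conversion itself so that it is easy to check which files will be converted before Word is launched. The sketch below uses only pathlib; plan_conversions is a hypothetical helper name (it is not part of docx2pdf), and skipping files that start with "~$" assumes you want to ignore the temporary lock files Word creates while a document is open.

```python
from pathlib import Path


def plan_conversions(input_dir, output_dir):
    """Pair each .docx file in input_dir with its target .pdf path in output_dir.

    plan_conversions is a hypothetical helper name, not part of docx2pdf.
    """
    input_dir = Path(input_dir)
    output_dir = Path(output_dir)
    pairs = []
    for source in sorted(input_dir.glob("*.docx")):
        # Word leaves temporary lock files such as "~$report.docx"; skip them.
        if source.name.startswith("~$"):
            continue
        # file.stem drops the ".docx" suffix, so "report.docx" maps to "report.pdf".
        pairs.append((source, output_dir / (source.stem + ".pdf")))
    return pairs
```

Each (source, target) pair could then be handed to convert(str(source), str(target)) in the loop, in the same way the code above calls it.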
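You need to import Document before you can use it. The message says there is no object named Document in your "toolbox", and you cannot use a name that has not been imported or defined. Look at your imports: you import convert from docx2pdf, but nothing provides Document. That class comes from python-docx (from docx import Document), not from docx2pdf. Keep in mind that python-docx's save() writes .docx files, so importing Document removes the NameError but does not by itself produce PDFs.
Update: this may help you https://python-docx.readthedocs.io/en/latest/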
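To illustrate where the traceback comes from, here is a minimal sketch that reproduces the same NameError using only the standard library. The variable names are illustrative, and the snippet deliberately omits any import that would define Document.

```python
# Calling a name that was never imported or defined raises NameError.
try:
    document = Document()  # nothing above provides this name
except NameError as exc:
    message = str(exc)

print(message)
```

With python-docx installed, adding from docx import Document at the top would define the name, assuming that is the class the original script meant to use.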
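Try using
rocketpdf
It is an open-source CLI application I created for processing and converting PDF files.
pip install rocketpdf
To convert all the files in a directory:
rocketpdf parseall ./your/directory
If you want to know how to use it, see the documentation in the rocketpdf repository.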