Convert or print all sections of a rendered Quarto document as HTML at once

Question

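I want to convert the Shiny for Python documentation to PDF. I can go to each section and print it to PDF, but I would like to know whether there is a more compact way to print all sections at once.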

Answer 1

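I can suggest a solution based on wkhtmltopdf, a utility that converts HTML to PDF, and Python: scrape the links to the HTML files of the different sections of the documentation and pass them to pdfkit, a Python library that wraps wkhtmltopdf.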

So, first download wkhtmltopdf and install it on your system (see its installation guide for help with the process; if you are a Windows user, remember to add wkhtmltopdf to PATH).

Then check that it is available from cmd/shell:

$ wkhtmltopdf --version

# wkhtmltopdf 0.12.6 (with patched qt)
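The same availability check can also be done from Python before running the conversion. This is a minimal sketch using the standard library's shutil.which; find_tool is a hypothetical helper name, not part of pdfkit or wkhtmltopdf.

```python
import shutil


def find_tool(name="wkhtmltopdf"):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    if path is None:
        print("wkhtmltopdf not found on PATH; install it or add it to PATH")
    else:
        print(f"wkhtmltopdf found at {path}")
```

If the check prints that the tool was not found, the PATH step above is the likely culprit, especially on Windows.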

Now install these Python libraries (assuming Python is already installed):

pip install requests beautifulsoup4 pdfkit
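To confirm the installation worked, something like the following may help. It is a small sketch, not part of the original answer: missing_packages and REQUIRED are hypothetical names, and the one detail worth noting is that the beautifulsoup4 package is imported as bs4.

```python
from importlib.util import find_spec

# Maps the pip package name to the module name used in `import`
REQUIRED = {"requests": "requests", "beautifulsoup4": "bs4", "pdfkit": "pdfkit"}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: pip install " + " ".join(missing))
    else:
        print("All required packages are installed")
```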

Then run this Python script:

$ python html2pdf.py

html2pdf.py


import re
from urllib.parse import urljoin

import pdfkit
import requests
from bs4 import BeautifulSoup

# Page whose sidebar lists every section of the docs
start_url = 'https://shiny.rstudio.com/py/docs/get-started.html'

# Make a GET request and fail early on an HTTP error
r = requests.get(start_url)
r.raise_for_status()

# Parse the HTML and collect the sidebar anchors
soup = BeautifulSoup(r.content, 'html.parser')
a = soup.find_all('a', class_='sidebar-link')

# Resolve each relative href against the start page
links = [link.get('href') for link in a if link.get('href') is not None]
full_links = [urljoin(start_url, link) for link in links]

# Use the file name without ".html" as the PDF name
names = [re.findall(r"(?:.+/)(.+)(?:\.html)", link)[0] for link in full_links]

# Convert each HTML page to its own PDF
for name, link in zip(names, full_links):
    pdfkit.from_url(link, f"{name}.pdf")

This converts every HTML page linked in the sidebar of https://shiny.rstudio.com/py/docs/ to a PDF file in one go.

$ ls

get-started.pdf            reactive-programming.pdf  ui-navigation.pdf
html2pdf.py                reactive-values.pdf       ui-page-layouts.pdf
overview.pdf               running-debugging.pdf     ui-static.pdf
putting-it-together.pdf    server.pdf                user-interface.pdf
reactive-calculations.pdf  ui-dynamic.pdf            workflow-modules.pdf
reactive-events.pdf        ui-feedback.pdf           workflow-server.pdf
reactive-mutable.pdf       ui-html.pdf
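If a single combined PDF is preferred over one file per section, pdfkit.from_url also accepts a list of URLs and writes them into one output file. The sketch below shows that variant along with standard-library versions of the link resolution and file naming steps. The helper names (resolve_links, pdf_name, convert_to_single_pdf), the output name shiny-docs.pdf, and the example hrefs are assumptions for illustration, not taken from the answer.

```python
import posixpath
from urllib.parse import urljoin, urlparse


def resolve_links(base_url, hrefs):
    """Resolve relative hrefs against base_url, dropping duplicates in order."""
    seen = set()
    resolved = []
    for href in hrefs:
        full = urljoin(base_url, href)
        if full not in seen:
            seen.add(full)
            resolved.append(full)
    return resolved


def pdf_name(url):
    """Derive a name such as 'overview.pdf' from '.../docs/overview.html'."""
    stem, _ext = posixpath.splitext(posixpath.basename(urlparse(url).path))
    return f"{stem}.pdf"


def convert_to_single_pdf(urls, output_path="shiny-docs.pdf"):
    """Write all pages into one PDF (needs pdfkit and wkhtmltopdf installed)."""
    # Imported lazily so the helpers above work without pdfkit installed
    import pdfkit
    # pdfkit.from_url accepts a list of URLs and produces a single PDF
    pdfkit.from_url(urls, output_path)


if __name__ == "__main__":
    base = "https://shiny.rstudio.com/py/docs/get-started.html"
    # Hypothetical hrefs, shaped like relative sidebar links
    example = ["../docs/overview.html", "./server.html", "../docs/overview.html"]
    for url in resolve_links(base, example):
        print(url, "->", pdf_name(url))
```

In the script above, passing full_links to convert_to_single_pdf in place of the final loop would yield one document with the sections in sidebar order.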
