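Disclaimer: I am the author of borb. I am posting this question because it has already been raised on the project's GitHub issues page, and I hope that posting it on Stack Overflow will let a wider audience see the answer.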
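I am using the borb library, which has classes that represent text content, such as ChunkOfText, LineOfText, HeterogeneousParagraph and Paragraph.
Although these classes let me choose the font, font size, color, etc. in their constructors, I have not found an attribute or method that lets me underline text.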
from borb.pdf import Document
from borb.pdf import Page
from borb.pdf import PageLayout
from borb.pdf import SingleColumnLayout
from borb.pdf import Paragraph
from borb.pdf import PDF
# empty document
doc: Document = Document()
# empty page
page: Page = Page()
doc.add_page(page)
# layout
layout: PageLayout = SingleColumnLayout(page)
# paragraph
layout.add(Paragraph("Lorem ipsum dolor sit amet"))
# store
with open("output.pdf", "wb") as pdf_file_handle:
    PDF.dumps(pdf_file_handle, doc)
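The trick is to use ChunkOfText, so that you can get the layout information (the previous paint box) of each chunk, and then draw a line for each underline (using DisconnectedShape).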
import typing
from decimal import Decimal
from borb.pdf import Document, DisconnectedShape, PDF, Alignment, HexColor
from borb.pdf import Page
from borb.pdf import PageLayout
from borb.pdf import SingleColumnLayout
from borb.pdf import ChunkOfText
from borb.pdf import HeterogeneousParagraph
from borb.pdf.canvas.geometry.rectangle import Rectangle
def main():
    # create document
    doc: Document = Document()
    # create Page
    page: Page = Page()
    doc.add_page(page)
    # create PageLayout
    layout: PageLayout = SingleColumnLayout(page)
    # create one ChunkOfText per word, separated by single-space chunks
    txt: str = "Lorem ipsum dolor sit amet consectetur nunc."
    chunksOfText: typing.List[ChunkOfText] = []
    for w in txt.split(" "):
        chunksOfText += [ChunkOfText(w)]
        chunksOfText += [ChunkOfText(" ")]
    # drop the trailing space chunk
    chunksOfText = chunksOfText[:-1]
    # create HeterogeneousParagraph
    p: HeterogeneousParagraph = HeterogeneousParagraph(chunksOfText)
    layout.add(p)
    # underline some of the ChunkOfText objects
    # (the paragraph must already be laid out, so that each chunk has a previous paint box)
    for l in _lines_to_underline(p, [2, 3, 4, 8]):
        # width of the underline: difference between the x-coordinates of its end points
        w: Decimal = abs(l[1][0] - l[0][0])
        DisconnectedShape(
            lines=[l],
            stroke_color=HexColor("f1cd2e"),
            vertical_alignment=Alignment.BOTTOM,
        ).paint(page, Rectangle(l[0][0], l[0][1], w, Decimal(2)))
    # build PDF
    with open("output.pdf", "wb") as pdf_file_handle:
        PDF.dumps(pdf_file_handle, doc)

if __name__ == "__main__":
    main()
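Note that the indices passed to _lines_to_underline are positions in the chunk list, which interleaves words with single-space chunks, so they are not word numbers. The following is a minimal sketch using only the standard library; the helper name interleave_with_spaces is hypothetical and simply mirrors, as plain strings, the list that main() builds. It shows which pieces of text the indices 2, 3, 4 and 8 select.

```python
import typing


def interleave_with_spaces(text: str) -> typing.List[str]:
    """Hypothetical helper: build the same word/space sequence as main(), as plain strings."""
    parts: typing.List[str] = []
    for word in text.split(" "):
        parts += [word, " "]
    # drop the trailing space, as main() does with chunksOfText[:-1]
    return parts[:-1]


parts = interleave_with_spaces("Lorem ipsum dolor sit amet consectetur nunc.")
selected = [parts[i] for i in [2, 3, 4, 8]]
print(selected)  # ['ipsum', ' ', 'dolor', 'amet']
```

In other words, indices 2 to 4 cover "ipsum dolor" including the space between the two words, and index 8 covers "amet".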
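Now let's look at _lines_to_underline. This method takes a HeterogeneousParagraph, looks at its individual pieces of content (LayoutElement objects) and tries to build lines (each represented as a tuple of two points, where a point is a tuple of two Decimal values). It also takes an argument idx, the list of chunk indices you want underlined.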
def _lines_to_underline(
    p: HeterogeneousParagraph, idx: typing.List[int]
) -> typing.List[
    typing.Tuple[typing.Tuple[Decimal, Decimal], typing.Tuple[Decimal, Decimal]]
]:
    if len(idx) == 0:
        return []
    idx: typing.List[int] = sorted(idx)
    lines: typing.List[
        typing.Tuple[typing.Tuple[Decimal, Decimal], typing.Tuple[Decimal, Decimal]]
    ] = []
    for chunk_index in idx:
        c: ChunkOfText = p._chunks_of_text[chunk_index]
        r: Rectangle = c.get_previous_paint_box()
        # IF we are processing the first ChunkOfText
        # THEN we do not need to be concerned about potentially merging its coordinates with a previous ChunkOfText
        if len(lines) == 0:
            lines += [((r.get_x(), r.get_y()), (r.get_x() + r.get_width(), r.get_y()))]
            continue
        else:
            # IF the previously calculated line is drawn at (roughly) the same y-coordinate
            # and ends (roughly) where this ChunkOfText starts
            # THEN simply elongate that line
            y_delta: Decimal = abs(lines[-1][1][1] - r.get_y())
            x_delta: Decimal = abs(lines[-1][1][0] - r.get_x())
            if y_delta >= 5 or x_delta >= 5:
                lines += [
                    ((r.get_x(), r.get_y()), (r.get_x() + r.get_width(), r.get_y()))
                ]
            else:
                x0: Decimal = lines[-1][0][0]
                y0: Decimal = lines[-1][0][1]
                x1: Decimal = lines[-1][1][0]
                y1: Decimal = lines[-1][1][1]
                lines[-1] = (x0, y0), (r.get_x() + r.get_width(), y1)
    return lines
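This method also makes a first attempt at merging lines, so that if two consecutive chunks are underlined, only one line is returned for them.
When we run this code, we get the following output: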
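The merging rule inside _lines_to_underline can also be tried without borb. The sketch below is an illustration under stated assumptions rather than library code: each paint box is reduced to a hypothetical (x, y, width) tuple, merge_underlines is a made-up name, and the tolerance of 5 is taken from the thresholds in the function above.

```python
import typing
from decimal import Decimal

Point = typing.Tuple[Decimal, Decimal]
Line = typing.Tuple[Point, Point]
# hypothetical stand-in for a paint box: (x, y, width)
Box = typing.Tuple[Decimal, Decimal, Decimal]


def merge_underlines(
    boxes: typing.List[Box], tolerance: Decimal = Decimal(5)
) -> typing.List[Line]:
    """Build one underline per run of boxes that touch on (roughly) the same baseline."""
    lines: typing.List[Line] = []
    for x, y, width in boxes:
        if lines:
            (x0, y0), (x1, y1) = lines[-1]
            # extend the previous line if this box starts where it ends, at about the same y
            if abs(y1 - y) < tolerance and abs(x1 - x) < tolerance:
                lines[-1] = ((x0, y0), (x + width, y1))
                continue
        # otherwise start a new line under this box
        lines.append(((x, y), (x + width, y)))
    return lines


D = Decimal
# three adjacent boxes, then a gap of 20 units before the fourth box
boxes = [
    (D(59), D(700), D(30)),
    (D(89), D(700), D(3)),
    (D(92), D(700), D(28)),
    (D(140), D(700), D(25)),
]
merged = merge_underlines(boxes)
print(len(merged))  # 2
```

The first three boxes collapse into a single line from x = 59 to x = 120, and the fourth box, which starts 20 units further right, gets its own line.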