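I'm trying to create a basic Manim animation that shows the autoregressive generation process of a language model, i.e. how it produces text by predicting tokens from left to right.
Right now I'm using a VGroup that grows with each newly "predicted" token.
However, the baseline alignment of the tokens looks "jagged".
How can I make sure the tokens are positioned correctly so that the text lines up?
Here is a minimal example that demonstrates the problem: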
from manim import *


class textConcat(Scene):
    def construct(self):
        # List of tokens to predict
        tokens = "The cat is hungry".split(" ")

        # Initial token setup
        text = Text(tokens.pop(0), font="Helvetica", font_size=20).set_color(BLUE)
        text_group = (
            VGroup(text).arrange(RIGHT, aligned_edge=DOWN).to_corner(UL, buff=1.0)
        )
        self.play(Write(text_group))

        for token in tokens:
            # machine outputs the next token to the right
            prediction = Text(token, font="Helvetica", font_size=20).set_color(BLUE)

            # Instantiate the prediction behind the machine
            self.play(FadeIn(prediction), run_time=0.01)

            # # Slide the prediction to the right of the machine
            # self.play(prediction.animate.next_to(machine, RIGHT * 4), run_time=0.5)

            # Create an arc path from the right of the machine to the left
            start_point = prediction.get_left()
            end_point = text_group.get_right() + RIGHT * 0.3
            arc_path = ArcBetweenPoints(
                start_point, end_point, angle=-PI
            )  # Negative for an arc underneath

            # Animate the prediction following the arc path
            self.play(MoveAlongPath(prediction, arc_path), run_time=1.0)

            # Update the VGroup with the new prediction token # TODO: Fix alignment of baseline
            text_group.add(prediction)
            self.play(
                prediction.animate.next_to(text_group[-2], RIGHT),
                run_time=0.5,
            )

        # Finish animation
        self.wait(2)


if __name__ == "__main__":
    from manim import *

    textConcat().render()
Result:
I tried capturing the y-coordinate of the initial token and forcing the alignment accordingly, e.g.:
text_group.shift(UP * (text_group.get_y() - text_group[0].get_y()))
But that didn't help.
Any pointers would be much appreciated!
I'm using ManimCommunity v0.18.1.
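Text is indeed hard to align in Manim, because Manim has no fixed height for text of a given size. Instead, it uses the height of the underlying SVG image (I assume). So in your case, "The" has a different height and center than "hungry", because the "g" and "y" extend lower.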
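To make that reasoning concrete, here is a minimal sketch in plain Python (no Manim required). It assumes each word can be reduced to a bounding box described by an `ascent` and a `descent` around its baseline. The `Word` class, the helper names, and all the numbers are made up for illustration; none of them are Manim API.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical stand-in for the bounding box of a rendered word."""
    text: str
    ascent: float   # extent of the box above the baseline
    descent: float  # extent of the box below the baseline (0 without descenders)


def baseline_if_centers_aligned(word: Word, center_y: float = 0.0) -> float:
    """Baseline y when the center of the box is placed at center_y."""
    # The box spans [baseline - descent, baseline + ascent], so its center
    # sits (ascent - descent) / 2 above the baseline.
    return center_y - (word.ascent - word.descent) / 2


def baseline_if_bottoms_aligned(word: Word, bottom_y: float = 0.0) -> float:
    """Baseline y when the bottom edge of the box is placed at bottom_y."""
    return bottom_y + word.descent


# Made-up metrics: "hungry" has descenders (g, y), the other words do not.
words = [
    Word("The", ascent=0.75, descent=0.0),
    Word("cat", ascent=0.5, descent=0.0),
    Word("is", ascent=0.75, descent=0.0),
    Word("hungry", ascent=0.75, descent=0.25),
]

by_center = {w.text: baseline_if_centers_aligned(w) for w in words}
by_bottom = {w.text: baseline_if_bottoms_aligned(w) for w in words}

for w in words:
    print(
        f"{w.text:>6}: center-aligned baseline {by_center[w.text]:+.3f}, "
        f"bottom-aligned baseline {by_bottom[w.text]:+.3f}"
    )
```

Under these assumed metrics, neither center alignment (what `next_to(..., RIGHT)` does by default) nor bottom-edge alignment (`aligned_edge=DOWN`) puts every word on the same baseline, which matches the jagged look described in the question.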
The only solution I can think of is to use Tex() objects and to create a copy of the full text at its final position (in the variable token_copy):
from manim import *


class textConcat(Scene):
    def construct(self):
        # List of tokens to predict
        tokens = "The cat is hungry".split(" ")

        # Initial token setup
        # Reference copy of the full sentence, typeset in one piece. It is never
        # added to the scene; it only supplies the target position of each word.
        token_copy = Tex(*tokens, arg_separator=" ", font_size=30).set_color(RED).to_corner(UL, buff=1.0)
        text = Tex(tokens.pop(0), font_size=30).set_color(BLUE)
        text_group = (
            text.move_to(token_copy[0].get_center())
        )
        self.play(Write(text_group))

        for i in range(1, 4):
            # machine outputs the next token to the right
            prediction = Tex(tokens[i - 1], font_size=30).set_color(BLUE)

            # Instantiate the prediction behind the machine
            self.play(FadeIn(prediction), run_time=0.01)

            # # Slide the prediction to the right of the machine
            # self.play(prediction.animate.next_to(machine, RIGHT * 4), run_time=0.5)

            # Create an arc path from the right of the machine to the left
            start_point = prediction.get_left()
            end_point = token_copy[i].get_center()  # text_group.get_right() + RIGHT * 0.3
            arc_path = ArcBetweenPoints(
                start_point, end_point, angle=-PI
            )  # Negative for an arc underneath

            # Animate the prediction following the arc path
            self.play(MoveAlongPath(prediction, arc_path), run_time=1.0)

            # Update the VGroup with the new prediction token # TODO: Fix alignment of baseline
            text_group.add(prediction)
            self.wait(0.5)

        # Finish animation
        self.wait(2)
The code (e.g. the for loop) could obviously be optimized further, but I just wanted to show the general idea.
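Why the token_copy trick works can also be sketched without Manim: lay out the whole sentence once on a shared baseline, record where the center of each word's bounding box ends up, and later move each word's center to its recorded spot. The `Box` class, its `width`/`ascent`/`descent` fields, the `gap` value, and all numbers below are assumptions for illustration, not Manim API.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical bounding box of a rendered word."""
    text: str
    width: float
    ascent: float   # extent above the baseline
    descent: float  # extent below the baseline


def reference_centers(boxes, baseline_y=0.0, gap=0.25):
    """Lay out all boxes left to right on one baseline and return their centers."""
    centers = []
    x = 0.0  # left edge of the next box
    for b in boxes:
        cx = x + b.width / 2
        cy = baseline_y + (b.ascent - b.descent) / 2
        centers.append((cx, cy))
        x += b.width + gap
    return centers


def baseline_when_centered_at(box, center_y):
    """Baseline y of a box whose center has been moved to center_y."""
    return center_y - (box.ascent - box.descent) / 2


# Made-up metrics for the four tokens of the example sentence.
boxes = [
    Box("The", width=1.0, ascent=0.75, descent=0.0),
    Box("cat", width=0.75, ascent=0.5, descent=0.0),
    Box("is", width=0.5, ascent=0.75, descent=0.0),
    Box("hungry", width=1.5, ascent=0.75, descent=0.25),
]

# Step 1: compute the target centers once, from the full reference layout.
targets = reference_centers(boxes)

# Step 2: "move" each word's center to its target and check its baseline.
baselines = [
    baseline_when_centered_at(b, cy) for b, (_, cy) in zip(boxes, targets)
]
print(baselines)
```

In this model every word lands on the same baseline even though the target centers sit at different heights. It only holds when the moving word has the same shape as its counterpart in the reference layout, which is why the code above builds both from the same strings with the same font_size.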