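I am struggling to downsample a video with PyTorch. I have a video tensor of shape
(1, frame_number, channels, h, w)
I want to downsample this video tensor frame by frame (no 3D interpolation) to get a new tensor of shape (1, frame_number, channels, new_h, new_w).
I want this tensor to stay compatible with the backward pass.
I have tried resize and interpolate, but nothing has worked so far. Does anyone have an idea?
Thanks a lot!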
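There is an example in the torch documentation where video frames are resized. I also built an example from it that handles batches. This may be what you are looking for: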
from torchvision import transforms as t
import torch
batch_size, frame_number, channels, height, width = 1, 8, 3, 64, 64
video = torch.ones((batch_size, frame_number, channels, height, width))
video.shape # 1, 8, 3, 64, 64
transforms = [t.Resize((8, 8))] # resize each frame to 8x8 pixels
frame_transform = t.Compose(transforms)
batch = [] # batch buffer
for batch_id in range(batch_size):
    video_frames = [] # frame buffer
    for frame in video[batch_id]:
        video_frames.append(frame_transform(frame))
    batch.append(torch.stack(video_frames, 0))
video = torch.stack(batch, 0)
video.shape # 1, 8, 3, 8, 8
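
Since you asked about the backward pass, here is a possible alternative without the Python loops. It is only a sketch, assuming bilinear interpolation is acceptable for your use case: fold the frame dimension into the batch dimension so that torch.nn.functional.interpolate sees ordinary 4D image batches, then unfold it again. Each frame is resized on its own, so no interpolation happens across time. The helper name downsample_frames is made up for this example.

```python
import torch
import torch.nn.functional as F

def downsample_frames(video, new_h, new_w):
    # Hypothetical helper: video has shape (batch, frames, channels, h, w).
    b, t, c, h, w = video.shape
    # Merge batch and frame dims so every frame is treated as a separate 2D image.
    frames = video.reshape(b * t, c, h, w)
    # 2D bilinear interpolation per frame (assumed mode; others such as "area" also exist).
    frames = F.interpolate(frames, size=(new_h, new_w), mode="bilinear", align_corners=False)
    # Restore the (batch, frames, ...) layout.
    return frames.reshape(b, t, c, new_h, new_w)

video = torch.rand(1, 8, 3, 64, 64, requires_grad=True)
small = downsample_frames(video, 8, 8)
small.sum().backward()  # gradients flow back to the input video
```

After this runs, small has the shape (1, 8, 3, 8, 8) and video.grad has the same shape as video, which is a quick way to check that the operation is differentiable.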