At the start, the time was only 40 minutes.
After a while, the time needed went up to 3 hours.
I wonder if my model code is wrong?
if self.spatial_frozen:
    self.spatial_backbone.eval()
    with torch.no_grad():
        # spatial_backbone doesn't need grad
        x = self.spatial_backbone(x)
spatial_x = rearrange(x, '(b f) d -> b f d', b=batch_size)
x = spatial_x
# temporal_transformer needs grad
x = self.temporal_transformer(x)
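
One way to check whether the frozen/trainable split in the snippet above behaves as intended is to run a single backward pass on a tiny stand-in model and see which parameters receive gradients. This is only a minimal sketch under assumptions: `FrozenBackboneModel`, `in_dim`, `feat_dim`, and the `nn.Linear` placeholder backbone are hypothetical and not from the original code, and `reshape` is used in place of einops `rearrange` so the example runs with PyTorch alone.

```python
import torch
import torch.nn as nn


class FrozenBackboneModel(nn.Module):
    """Hypothetical stand-in: frozen per-frame backbone + trainable temporal transformer."""

    def __init__(self, in_dim=8, feat_dim=16):
        super().__init__()
        self.spatial_frozen = True
        # Placeholder backbone (assumption); the real one is not shown in the post.
        self.spatial_backbone = nn.Linear(in_dim, feat_dim)
        layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=4, dim_feedforward=32, batch_first=True
        )
        self.temporal_transformer = nn.TransformerEncoder(layer, num_layers=1)
        if self.spatial_frozen:
            # Mark the backbone weights as not trainable once, at construction time.
            for p in self.spatial_backbone.parameters():
                p.requires_grad_(False)

    def forward(self, x, batch_size):
        # x has shape (batch * frames, in_dim)
        if self.spatial_frozen:
            self.spatial_backbone.eval()
            with torch.no_grad():
                x = self.spatial_backbone(x)
        else:
            x = self.spatial_backbone(x)
        # Same layout change as rearrange(x, '(b f) d -> b f d', b=batch_size)
        x = x.reshape(batch_size, -1, x.shape[-1])
        return self.temporal_transformer(x)


model = FrozenBackboneModel()
model.train()
frames = torch.randn(2 * 5, 8)  # 2 clips, 5 frames each
out = model(frames, batch_size=2)
loss = (out - torch.randn_like(out)).pow(2).mean()
loss.backward()

# The backbone should receive no gradients; every temporal parameter should.
backbone_has_grad = any(
    p.grad is not None for p in model.spatial_backbone.parameters()
)
temporal_has_grad = all(
    p.grad is not None for p in model.temporal_transformer.parameters()
)
print(out.shape, backbone_has_grad, temporal_has_grad)
```

If the same check on the real model shows no gradients on the backbone and gradients on every temporal parameter, the gradient wiring is probably not what is wrong. The check says nothing about speed by itself, so it only helps rule this part out.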