How to average all the windows containing a frame?


I have the following situation: I have an array of size (3, 128, n) (where n is large); this array represents a picture. I have a super-resolution deep learning model that takes a (3, 128, 128) image as input and returns it with better quality. I want to apply my model to the whole picture.

My existing solution

My first solution to this problem was to split my array into arrays of size (3, 128, 128). That gives me a list of square images; I can apply my model to each square and then concatenate all the results to obtain a new (3, 128, n) image. The problem with this method is that the model does not perform well on the edges of the images.
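For reference, a minimal sketch of this first approach, assuming a placeholder super_resolution_model (the real model is not part of the question) and assuming n is a multiple of 128 so the image splits evenly:

import numpy as np

def super_resolution_model(frame):
    # Placeholder: the real model takes a (3, 128, 128) array and
    # returns an enhanced array of the same shape.
    return frame

n = 640  # Assumed to be a multiple of 128 for this naive version.
image = np.random.randint(0, 256, size=(3, 128, n))

# Split into non-overlapping (3, 128, 128) squares, enhance each one,
# then concatenate the results along the last dimension.
squares = np.split(image, n // 128, axis=-1)
restored = np.concatenate([super_resolution_model(s) for s in squares], axis=-1)
print(restored.shape)  # (3, 128, 640)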

My desired solution

To get around this problem, I thought of another solution. Instead of considering non-overlapping square images, I can consider all the square images that can be extracted from the original image and pass all of them to my model. Then, to reconstruct the point at coordinates (a, b, c), I consider all the reconstructed square pictures that contain c and take their average. I want this average to give more weight to the squares in which c is close to the center. More specifically:

  • I start with the 3 * 128 * n array (let's call it A). I pad it on the left and on the right, which gives me a new array of size 3 * 128 * (n + 2 * 127) (let's call it A_pad).
  • For i in range(0, n + 127), let A_i = A_pad[:, :, i:i+128]. A_i has size (3 * 128 * 128) and can be fed to my model, which creates a new array B_i of the same size.
  • Now I want a new array B of the same size as A, defined as follows: for each (x, y, z), B[x, y, z] is the average of the 128 values B_i[x, y, z+127-i] for z <= i < z + 128, with weights 1 + min(z + 127 - i, i - z). This corresponds to taking the average over all the windows that contain z, with a weight proportional to the distance to the closest edge.

My problem is with the computation of B. Given what I have described, I could write several for loops that would produce the correct result, but I am afraid it would be slow. I am looking for a solution using numpy that is as fast as possible.
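For concreteness, a slow but direct loop version of the definition of B above could look like the following sketch (the helper name compute_B and the identity model stand-in are made up for illustration; edge padding is assumed, as in the answer below):

import numpy as np

def compute_B(A, model=lambda frame: frame):
    # A has shape (3, 128, n); `model` stands in for the super-resolution model.
    n = A.shape[-1]
    A_pad = np.pad(A, ((0, 0), (0, 0), (127, 127)), mode='edge')
    B = np.zeros(A.shape, dtype=float)
    norm = np.zeros(n)
    for i in range(n + 127):
        B_i = model(A_pad[:, :, i:i+128])
        # Window i covers the original columns z with z <= i < z + 128.
        for z in range(max(0, i - 127), min(n, i + 1)):
            w = 1 + min(z + 127 - i, i - z)
            B[:, :, z] += w * B_i[:, :, z + 127 - i]
            norm[z] += w
    return B / norm  # Weighted average over all windows containing each column.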

python numpy
1 Answer

Here is an example implementation that follows the steps you outlined in the section "My desired solution". It makes extensive use of np.lib.stride_tricks.as_strided, which at first glance might not seem obvious at all; I added detailed comments to each usage for clarification. Also note that in your description you use z to denote the column position in the image, while in the comments I use the term n-position to comply with the shape specification via n.

Regarding efficiency, it is not obvious whether this is a winner. The computation happens in numpy, but the expression sliding_128 * weights builds a large array (128 times the size of the original image) before reducing it along the frame dimension. This definitely comes at a cost, and memory might even be an issue. A loop might come in handy at that position (see the sketch after the code below).

Lines that contain a comment with the prefix # [TEST] have been added for testing purposes. Concretely, this means that we overwrite the weights for the final sum of frames with 1 / 128 in order to recover the original image in the end (since no ML model transformation is applied either).

import numpy as np

n = 640  # For example.
image = np.random.randint(0, 256, size=(3, 128, n))
print('image.shape: ', image.shape)  # (3, 128, 640)

padded = np.pad(image, ((0, 0), (0, 0), (127, 127)), mode='edge')
print('padded.shape: ', padded.shape)  # (3, 128, 894)

sliding = np.lib.stride_tricks.as_strided(
    padded,
    # Frames stored along first dimension; sliding across last dimension of `padded`.
    shape=(padded.shape[-1]-128+1, 3, 128, 128),
    # First dimension: Moving one frame ahead -> move across last dimension of `padded`.
    # Remaining three dimensions: Move as within `padded`.
    strides=(padded.strides[-1:] + padded.strides)
)
print('sliding.shape: ', sliding.shape)  # (767, 3, 128, 128)

# Now at this part we would feed the frames `sliding` to the ML model,
# where the first dimension is the batch size.
# Assume the output is assigned to `sliding` again.
# Since we're not using an ML model here, we create a copy instead
# in order to update the strides of `sliding` with its actual shape (as defined above).
sliding = sliding.copy()

sliding_128 = np.lib.stride_tricks.as_strided(
    # Reverse last dimension since we want the last column from the first frame.
    # Need to copy again because `[::-1]` creates a view with negative stride,
    # but we want actual reversal to work with the strides below.
    # (There's perhaps a smart way of adjusting the strides below in order to not make a copy here.)
    sliding[:, :, :, ::-1].copy(),
    # Second dimension corresponds to the 128 consecutive frames.
    # Previous last dimension is dropped since we're selecting the
    # column that corresponds to the current n-position.
    shape=(128, n, 3, 128),
    # First dimension (frame position): Move one frame and one column ahead
    #     (actually want to move one column less in `sliding` but since we reverted order of columns
    #      we need to move one ahead now) -> move across first dimension of `sliding` + last dimension of `sliding`.
    # Second dimension (n-position): Moving one frame ahead -> move across first dimension of `sliding`.
    # Remaining two dimensions: Move within frames (channel and row dimensions).
    strides=((sliding.strides[0] + sliding.strides[-1],) + sliding.strides[:1] + sliding.strides[1:3])
)
print('sliding_128.shape: ', sliding_128.shape)  # (128, 640, 3, 128)

# Weights are independent of the n-position -> we can precompute.
weights = 1 + np.concatenate([np.arange(64), np.arange(64)[::-1]])
weights = np.ones(shape=128)  # [TEST] Assign weights for testing -> want to obtain the original image back.
weights = weights.astype(float) / weights.sum()  # Normalize?
weights = weights[:, None, None, None]  # Prepare for broadcasting.

weighted_image = np.moveaxis(np.sum(sliding_128 * weights, axis=0), 0, 2)
print('weighted_image.shape: ', weighted_image.shape)  # (3, 128, 640)

assert np.array_equal(image, weighted_image.astype(int))  # [TEST]
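
As mentioned above, the product sliding_128 * weights materializes an array 128 times the size of the image. One possible sketch of the loop alternative hinted at earlier (not benchmarked here) accumulates the weighted columns offset by offset directly from sliding; it avoids the large intermediate and should reproduce weighted_image:

# [Sketch] Loop over the 128 frame offsets instead of building `sliding_128 * weights`.
accum = np.zeros((3, 128, n))
for a in range(128):
    # For offset `a`, frame i = z + a contributes its column 127 - a to output column z.
    accum += weights[a, 0, 0, 0] * sliding[a:a+n, :, :, 127-a].transpose(1, 2, 0)
print(np.allclose(accum, weighted_image))  # Expected: True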