How to rotate and pad a huge numpy array without consuming too much memory?

Question

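I am working with large WSI images. While building the WSI, I use OpenCV for the computation and append frames to one huge numpy array (a 50k x 50k x 3 array takes 7.5 GB of memory). To save it I used OpenCV's imwrite function, but when I try to open the saved TIFF files with openslide, it throws an "Unsupported or missing image file" error.

I found that OpenCV does not save images as pyramids, so I tried saving the image as a pyramid with pyvips. Before saving I need to rotate the image, because the camera is rotated to speed up scanning, but when I rotate with OpenCV it doubles the memory the program consumes. Is there another way to rotate without using so much memory? The same thing happens when I reach the edge of the image and need to pad it so it doesn't overflow: memory use doubles at first and then drops back after the padding. Is there an optimal way to do the padding and rotation? This is where I pad the array: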

if np.shape(test_matrix)[1] - (cm2 + int(y_edge_on_wsi) + y_in) <= 2000:
    test_matrix = np.pad(test_matrix, ((0, 0), (0, 10000), (0, 0)), mode='constant', constant_values=255)

if np.shape(test_matrix)[0] - (cm1 + int(x_edge_on_wsi) + x_in) <= 2000:
    test_matrix = np.pad(test_matrix, ((0, 10000), (0, 0), (0, 0)), mode='constant', constant_values=255)

# Add rows or columns to the beginning of the test_matrix
if cm2 + int(y_edge_on_wsi) <= 2000:
    test_matrix = np.pad(test_matrix, ((0, 0), (10000, 0), (0, 0)), mode='constant', constant_values=255)
    cm2 = cm2 + 10000
if cm1 + int(x_edge_on_wsi) <= 2000:
    test_matrix = np.pad(test_matrix, ((10000, 0), (0, 0), (0, 0)), mode='constant', constant_values=255)
    cm1 = cm1 + 10000

This is where I rotate the image before saving it:

self.image = cv2.rotate(self.image, cv2.ROTATE_90_CLOCKWISE)
cv2.imwrite(self.fullSavePathTiff, self.image)

Memory usage before rotating: [screenshot not preserved]

Memory usage after rotating: [screenshot not preserved]

Answer 1

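You can try cropping the image into smaller chunks to pinpoint the region of interest. There is a library called OpenSeadragon that helps locate the exact position of a region of interest.

Processing the image in smaller chunks can help manage memory usage more efficiently. You can use this code snippet as a guide: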

import cv2
import numpy as np
from tifffile import imwrite

# Example parameters; adjust them to the real image
chunk_size = 1000
image_height, image_width = 50000, 50000  # Example dimensions
pad = image_height // 100  # Border width in pixels; an example value, check your image size

def rotate_chunk(chunk):
    # Rotate a single chunk 90 degrees clockwise
    return cv2.rotate(chunk, cv2.ROTATE_90_CLOCKWISE)

# Height and width swap after a 90-degree rotation, and the border is added on every side.
# Note: this output buffer is itself as large as the full image.
output_image = np.zeros((image_width + 2 * pad, image_height + 2 * pad, 3), dtype=np.uint8)

for y in range(0, image_height, chunk_size):
    for x in range(0, image_width, chunk_size):
        h = min(chunk_size, image_height - y)
        w = min(chunk_size, image_width - x)
        chunk = np.zeros((h, w, 3), dtype=np.uint8)  # Replace with actual chunk loading logic
        rotated = rotate_chunk(chunk)  # Shape is now (w, h, 3)
        # A source block at (y, x) lands at row x, column image_height - y - h
        out_y = pad + x
        out_x = pad + image_height - y - h
        output_image[out_y:out_y + w, out_x:out_x + h, :] = rotated

# Save the processed image
imwrite('output_image.tiff', output_image)
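
The same idea can be applied to the padding step. np.pad always allocates a new array and copies the old one into it, which is where the temporary doubling comes from. A minimal sketch of one alternative, assuming you can estimate an upper bound for the final size in advance: allocate the canvas once, margin included, backed by a file through np.memmap, and write each frame into it at an offset. The names make_canvas and place_frame, the file path, and the sizes are hypothetical and only for illustration.

```python
import os
import tempfile

import numpy as np


def make_canvas(path, height, width, margin, fill=255):
    """Create a file-backed canvas that already includes a margin on every side."""
    shape = (height + 2 * margin, width + 2 * margin, 3)
    canvas = np.memmap(path, dtype=np.uint8, mode="w+", shape=shape)
    canvas[:] = fill  # white background, like constant_values=255 in np.pad
    return canvas


def place_frame(canvas, frame, y, x, margin):
    """Write one frame at (y, x), given in unpadded coordinates."""
    h, w = frame.shape[:2]
    canvas[margin + y:margin + y + h, margin + x:margin + x + w] = frame


# Small sizes for illustration; a real scan would use the expected upper bound.
path = os.path.join(tempfile.mkdtemp(), "canvas.dat")
canvas = make_canvas(path, height=400, width=600, margin=100)
frame = np.zeros((50, 80, 3), dtype=np.uint8)  # stand-in for a camera frame
place_frame(canvas, frame, y=0, x=0, margin=100)
canvas.flush()  # push pending writes to the backing file
```

Because the canvas lives in a file, the operating system can page parts of it out, so resident memory does not have to hold the whole array at once. The trade-off is disk I/O and the need to pick the size up front.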

Answer 2

You can do the rotation in pyvips, perhaps:

import pyvips

image = pyvips.Image.new_from_array(numpy_array)
image.rot90().write_to_file("xxx.tif",
    pyramid=True,
    tile=True,
    compression="jpeg",
    Q=85)

pyvips will rotate and save directly from the numpy array's memory, so it will only use a small amount of extra RAM. It should be fast too.
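
For comparison, if you stay in plain NumPy, np.rot90 returns a view of the original array rather than a copy, so the rotation itself costs no extra memory. A copy only happens when something downstream needs a contiguous buffer. The sketch below uses a tiny stand-in array and a hypothetical helper, iter_strips, that makes one strip contiguous at a time; how a given writer handles a non-contiguous view is not covered here and would need checking.

```python
import numpy as np


def iter_strips(view, strip_rows):
    """Yield contiguous copies of a view, strip_rows rows at a time."""
    for r in range(0, view.shape[0], strip_rows):
        yield np.ascontiguousarray(view[r:r + strip_rows])


# Small stand-in for the scan; the real array would be about 50000 x 50000 x 3.
image = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# k=-1 rotates 90 degrees clockwise in the plane of the first two axes.
rotated = np.rot90(image, k=-1, axes=(0, 1))

# The result is a view: it shares memory with the original array.
is_view = np.shares_memory(image, rotated)

# Only one strip is materialized at a time while iterating.
rows_seen = sum(strip.shape[0] for strip in iter_strips(rotated, strip_rows=1))
```

With this approach the peak extra memory is one strip, not a second full-size image.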
