NumPy memmap corrupts the array

Question

I am using numpy-2.1.2-cp313-cp313-win_amd64. When I try to load an array via memmap, the array's shape and data come out corrupted. A minimal reproducible example:

>>> import numpy as np
>>> a = np.arange(65536)
>>> a
array([    0,     1,     2, ..., 65533, 65534, 65535])
>>> np.save('f.npy', a)

# Loading the array via np.load works fine.
>>> b = np.load('f.npy')
>>> b
array([    0,     1,     2, ..., 65533, 65534, 65535])
>>> a.dtype
dtype('int64')

# With np.memmap, the shape is wrong and extra elements appear at the beginning.
>>> c = np.memmap('f.npy', dtype=np.int64, mode='r')
>>> c
memmap([    379676406402707, 7166182912910098550, 4064846277420656498,
        ...,               65533,               65534,
                      65535])
>>> c.shape
(65552,)

# With an explicit shape, the same extra elements appear at the beginning and the tail is cut off.
>>> c = np.memmap('f.npy', dtype=np.int64, mode='r', shape=a.shape)
>>> c
memmap([    379676406402707, 7166182912910098550, 4064846277420656498,
        ...,               65517,               65518,
                      65519])

Using np.load('f.npy', mmap_mode='r') works in this example, but with my real data it raises ValueError: Cannot load file containing pickled data when allow_pickle=False. If I switch to allow_pickle=True, I get a different error: UnpicklingError: Failed to interpret file 'f.npy' as a pickle.

So I would like to use memmap. How can I do this correctly?

Answer 1

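Files created by np.save and by np.memmap are not compatible. np.save writes a .npy header (magic string, dtype, shape) in front of the array data, while np.memmap treats the whole file as raw bytes, so the header bytes get reinterpreted as int64 values. That is where the 16 extra elements (65552 - 65536) at the start of your array come from. There are two consistent ways to save and load a NumPy array: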

Save and load via the NumPy save/load functions:

>>> a = np.arange(65536)
>>> np.save('f.npy', a)
>>> b = np.load('f.npy')
>>> b
array([    0,     1,     2, ..., 65533, 65534, 65535])
>>> b.shape
(65536,)
>>> c = np.load('f.npy', mmap_mode='r')
>>> c
memmap([    0,     1,     2, ..., 65533, 65534, 65535])
>>> c.shape
(65536,)
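
A possible explanation for the pickle errors mentioned in the question, offered as an assumption since the real file is not shown: np.load only treats a file as .npy when it starts with the .npy magic string, and otherwise falls back to pickle. A file that was written as raw bytes (for example via tofile or np.memmap) would therefore fail in np.load regardless of allow_pickle. A minimal sketch for checking this; the helper looks_like_npy and the file names are hypothetical:

```python
import os
import tempfile

import numpy as np


def looks_like_npy(path):
    """Return True if the file starts with the .npy magic string."""
    with open(path, "rb") as f:
        return f.read(6) == b"\x93NUMPY"


tmpdir = tempfile.mkdtemp()
npy_path = os.path.join(tmpdir, "saved.npy")  # hypothetical file names
raw_path = os.path.join(tmpdir, "raw.bin")

a = np.arange(16, dtype=np.int64)
np.save(npy_path, a)  # .npy container: header followed by data
a.tofile(raw_path)    # raw bytes only, no header

print(looks_like_npy(npy_path))  # True
print(looks_like_npy(raw_path))  # False

# np.load cannot parse the raw file as .npy and refuses the pickle fallback.
load_failed = False
try:
    np.load(raw_path, mmap_mode="r")
except ValueError:
    load_failed = True
print(load_failed)  # True
```

If the check returns False for your real file, it is not a .npy file and np.memmap with the original dtype and shape is the tool to use.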

Save and load via memmap:

>>> a = np.arange(65536)
>>> outFile = np.memmap('f.npy', dtype=a.dtype, mode='w+', shape=a.shape)
>>> outFile[:] = a
>>> outFile.flush()
>>> inFile = np.memmap('f.npy', dtype=np.int64, mode='r')
>>> inFile
memmap([    0,     1,     2, ..., 65533, 65534, 65535])
>>> inFile.shape
(65536,)
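
Note that a raw memmap file stores no metadata, so dtype and shape have to be supplied again when reopening it. The 1-D example above hides this; the sketch below uses a small 2-D array (the file name and array size are made up for illustration):

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "raw.dat")  # hypothetical file name
a = np.arange(12, dtype=np.int64).reshape(3, 4)

# Write the array through a writable memmap.
out = np.memmap(path, dtype=a.dtype, mode="w+", shape=a.shape)
out[:] = a
out.flush()
del out  # release the mapping before reopening

# Without shape, the file is mapped as a flat 1-D array.
flat = np.memmap(path, dtype=np.int64, mode="r")
print(flat.shape)  # (12,)

# With the original shape, the 2-D layout is restored.
shaped = np.memmap(path, dtype=np.int64, mode="r", shape=(3, 4))
print(shaped.shape)  # (3, 4)
```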

If you mix the two approaches, the result will be corrupted.
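
If you do need np.memmap on a file written by np.save, one option is to skip the header by passing its length as offset. The sketch below reads the header with the helpers in np.lib.format and only handles format versions 1.0 and 2.0; the temporary path is an assumption for illustration. np.lib.format.open_memmap does the same thing in a single call.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "f.npy")  # hypothetical location
a = np.arange(65536)
np.save(path, a)

# Parse the .npy header to learn dtype, shape, and where the data starts.
with open(path, "rb") as f:
    version = np.lib.format.read_magic(f)
    if version == (1, 0):
        shape, fortran_order, dtype = np.lib.format.read_array_header_1_0(f)
    elif version == (2, 0):
        shape, fortran_order, dtype = np.lib.format.read_array_header_2_0(f)
    else:
        raise ValueError(f"unsupported .npy version in this sketch: {version}")
    offset = f.tell()  # number of header bytes before the array data

# Map only the data region, skipping the header.
c = np.memmap(path, dtype=dtype, mode="r", offset=offset, shape=shape,
              order="F" if fortran_order else "C")
print(c.shape)  # (65536,)

# Equivalent shortcut that parses the header internally.
d = np.lib.format.open_memmap(path, mode="r")
print(d.shape)  # (65536,)
```

Both c and d are np.memmap objects backed by the same file, so nothing is copied into memory until elements are accessed.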
