A simple way to store a SciPy sparse symmetric matrix in a local file, load it into shared memory (shm), and rebuild it from shm?

Question

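Is there a simple way to store the upper triangle (including the diagonal) of a SciPy sparse matrix in a local file, load that file into shared memory (shm), and rebuild the sparse matrix from shm?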

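ChatGPT 4o offered a solution that seems overly complicated. It splits the matrix into three parts (data, indices, and indptr) and puts each one in its own shared memory block. Is that necessary?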

import numpy as np
import scipy.sparse as sp
from multiprocessing import shared_memory

# Example matrix creation
matrix = sp.random(100, 100, density=0.1, format='csr')

# Step 1: Extract and save the upper triangular part (including the diagonal)
upper_triangular_matrix = sp.triu(matrix)
sp.save_npz('upper_triangular_matrix.npz', upper_triangular_matrix)

# Step 2: Load the matrix from the file
loaded_matrix = sp.load_npz('upper_triangular_matrix.npz')

# Convert the matrix to shared memory
# Flatten the data, indices, and indptr arrays for shared memory storage
data = loaded_matrix.data
indices = loaded_matrix.indices
indptr = loaded_matrix.indptr

# Create shared memory blocks for each array
shm_data = shared_memory.SharedMemory(create=True, size=data.nbytes)
shm_indices = shared_memory.SharedMemory(create=True, size=indices.nbytes)
shm_indptr = shared_memory.SharedMemory(create=True, size=indptr.nbytes)

# Copy the arrays into the shared memory blocks
np.copyto(np.ndarray(data.shape, dtype=data.dtype, buffer=shm_data.buf), data)
np.copyto(np.ndarray(indices.shape, dtype=indices.dtype, buffer=shm_indices.buf), indices)
np.copyto(np.ndarray(indptr.shape, dtype=indptr.dtype, buffer=shm_indptr.buf), indptr)

# Access the shared memory blocks
shared_data = np.ndarray(data.shape, dtype=data.dtype, buffer=shm_data.buf)
shared_indices = np.ndarray(indices.shape, dtype=indices.dtype, buffer=shm_indices.buf)
shared_indptr = np.ndarray(indptr.shape, dtype=indptr.dtype, buffer=shm_indptr.buf)

# Create a shared memory CSR matrix
shared_csr_matrix = sp.csr_matrix((shared_data, shared_indices, shared_indptr), shape=loaded_matrix.shape)

# Accessing the shared memory matrix
print(shared_csr_matrix)

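# Clean up shared memory (first drop every array that still views the buffers)
del shared_csr_matrix, shared_data, shared_indices, shared_indptr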
shm_data.close()
shm_data.unlink()
shm_indices.close()
shm_indices.unlink()
shm_indptr.close()
shm_indptr.unlink()
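
A note on the three arrays: a CSR matrix is defined by its data, indices, and indptr arrays, so each of them has to live in shared memory in some form. The code above also stops at the upper triangle. The following is a minimal sketch, not part of the original post, of how the full symmetric matrix could be rebuilt from a stored upper triangle. The example matrix and the variable names are illustrative assumptions.

```python
import numpy as np
import scipy.sparse as sp

# Build a symmetric sparse example matrix (illustrative data only).
base = sp.random(100, 100, density=0.05, format="csr", random_state=0)
symmetric = (base + base.T).tocsr()

# Keep only the upper triangle, diagonal included, as in the question.
upper = sp.triu(symmetric, format="csr")

# Mirror the strictly upper part (k=1 skips the diagonal, so it is not
# counted twice) to restore the lower half.
rebuilt = (upper + sp.triu(upper, k=1).T).tocsr()

# Largest absolute difference between the rebuilt and the original matrix.
print(abs(rebuilt - symmetric).max())
```

The same expression should work when `upper` is the CSR matrix assembled from the shared memory arrays, although the sum allocates a new matrix in ordinary process memory.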

Tags: python scipy sparse-matrix shared-memory symmetric

1 Answer

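In the end my matrix turned out not to be truly sparse, so SciPy's sparse matrix formats did not actually help save memory. The main property of my matrix is that it is symmetric.

ChatGPT 4o gave the basic steps below for handling the symmetric matrix as if it were a full (dense) matrix, with the aim of saving memory; this is essentially what I was asking for: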

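Step 1: Save the upper-triangular matrix to a file

First, generate and save the upper-triangular part of the symmetric matrix (if it has not been saved already).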

import numpy as np
import os
from multiprocessing import shared_memory

filename = "upper_diag.npy"

# Function to create a symmetric matrix and save the upper-diagonal part
def create_and_save_upper_diag(filename, size=5):
    A = np.random.rand(size, size)
    A = (A + A.T) / 2  # Make A symmetric
    upper_diag_A = np.triu(A)
    np.save(filename, upper_diag_A)  # Save to a file

# Check if the file exists, if not create and save the matrix
if not os.path.exists(filename):
    create_and_save_upper_diag(filename)
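
Note that np.triu returns an array with the same n x n shape and the lower triangle set to zero, so the saved file is as large as the full matrix. If the goal is to store only the n(n+1)/2 distinct values, one option is a packed 1-D layout. The sketch below is an assumption about how that might look, not part of the original answer; `pack_upper` and `unpack_symmetric` are hypothetical helper names.

```python
import numpy as np

def pack_upper(A):
    """Return the upper triangle of square matrix A (diagonal included) as a 1-D array."""
    rows, cols = np.triu_indices(A.shape[0])
    return A[rows, cols]

def unpack_symmetric(packed, n):
    """Rebuild the full n x n symmetric matrix from its packed upper triangle."""
    full = np.empty((n, n), dtype=packed.dtype)
    rows, cols = np.triu_indices(n)
    full[rows, cols] = packed
    full[cols, rows] = packed  # mirror across the diagonal
    return full

n = 5
A = np.random.rand(n, n)
A = (A + A.T) / 2  # make A symmetric

packed = pack_upper(A)          # holds n * (n + 1) // 2 values
restored = unpack_symmetric(packed, n)
print(packed.size, np.allclose(restored, A))
```

The packed array can be saved with np.save and copied into shared memory the same way as a 2-D array; the trade-off is that element access then needs an index calculation or an unpacking step.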

Step 2: Load from the file into shared memory

Once you have the file, you can load it into shared memory.

# Function to load a matrix from a .npy file into shared memory
def load_to_shm(filename):
    # Load matrix from file
    upper_diag_A = np.load(filename)

    # Create shared memory block
    shm = shared_memory.SharedMemory(create=True, size=upper_diag_A.nbytes)

    # Create a NumPy array backed by shared memory
    upper_diag_A_shm = np.ndarray(upper_diag_A.shape, dtype=upper_diag_A.dtype, buffer=shm.buf)
    upper_diag_A_shm[:] = upper_diag_A[:]  # Copy data to shared memory

    return upper_diag_A_shm, shm

# Load the matrix into shared memory
upper_diag_A_shm, shm = load_to_shm(filename)
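
The code above only covers the process that creates the block. Another process would attach to it by name and needs to know the array's shape and dtype, which have to be passed along separately. A minimal sketch of that step follows, run in a single process for simplicity; `attach_to_shm` is a hypothetical helper, not part of the original answer.

```python
import numpy as np
from multiprocessing import shared_memory

def attach_to_shm(name, shape, dtype):
    """Attach to an existing block and wrap it in an array; no data is copied."""
    shm = shared_memory.SharedMemory(name=name)
    return np.ndarray(shape, dtype=dtype, buffer=shm.buf), shm

# Producer side: create a small block for demonstration.
source = np.triu(np.arange(16, dtype=np.float64).reshape(4, 4))
owner = shared_memory.SharedMemory(create=True, size=source.nbytes)
owner_view = np.ndarray(source.shape, dtype=source.dtype, buffer=owner.buf)
owner_view[:] = source

# Consumer side: normally another process that was given name, shape and dtype.
view, handle = attach_to_shm(owner.name, source.shape, source.dtype)
matches = bool(np.array_equal(view, source))
print(matches)

# Drop array references before closing; only the owner unlinks the block.
del view, owner_view
handle.close()
owner.close()
owner.unlink()
```

Every process that attaches should call close() on its own handle, while unlink() should be called once, by the process that owns the block.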

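Example of accessing the matrix

Because the matrix is symmetric, any element can be read from the stored upper triangle by swapping the indices when the row index is larger than the column index.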

def get_element(matrix, i, j):
    if i <= j:
        return matrix[i, j]
    else:
        return matrix[j, i]

# Example of accessing elements
print(get_element(upper_diag_A_shm, 1, 4))  # Access an element
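
When many elements are needed at once, the same index swap can be applied to whole index arrays. This is a sketch under the assumption that the matrix is stored as an upper-triangular 2-D array as above; `get_elements` is a hypothetical name, not part of the original answer.

```python
import numpy as np

def get_elements(matrix, rows, cols):
    """Look up many (row, col) pairs in a symmetric matrix stored as its upper triangle."""
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    # The smaller index always selects the row, so every lookup lands on or
    # above the diagonal, where the values are stored.
    return matrix[np.minimum(rows, cols), np.maximum(rows, cols)]

upper = np.triu(np.array([[1.0, 2.0, 3.0],
                          [2.0, 4.0, 5.0],
                          [3.0, 5.0, 6.0]]))

values = get_elements(upper, [2, 0, 1], [0, 2, 1])
print(values)
```

Fancy indexing returns a copy of the selected values, so the result stays valid after the shared memory block is closed.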

Cleanup

After you are done with the shared memory, it is important to clean it up properly.

# Cleanup
upper_diag_A_shm = None
shm.close()
shm.unlink()  # Remove the shared memory block
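
If the code between creating the block and cleaning it up can raise, the block may be left behind. One possible pattern, shown here as a sketch rather than as part of the original answer, is to wrap the work in try/finally so that close() and unlink() always run. The variable names and the example data are illustrative.

```python
import numpy as np
from multiprocessing import shared_memory

source = np.triu(np.ones((3, 3)))

shm = shared_memory.SharedMemory(create=True, size=source.nbytes)
view = None
try:
    view = np.ndarray(source.shape, dtype=source.dtype, buffer=shm.buf)
    view[:] = source
    # Copy results out as plain Python values so nothing refers to the buffer later.
    total = float(view.sum())
finally:
    view = None   # drop the array reference before closing the block
    shm.close()
    shm.unlink()  # remove the block; do this only in the owning process

print(total)
```

Arrays backed by the block must not be used after close(), which is why the sketch copies the result into an ordinary float first.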

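This approach combines file operations with Python's shared memory handling, which makes it suitable for applications that reuse the matrix across runs or processes, saving time and memory by avoiding recomputation.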
