How do I preallocate a numpy array elegantly?

Question

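This question is mainly about NumPy arrays, but I think it is a general design question. In scientific computing I often run into the following problem: I have to read a dataset made up of several time instances of some vector field, for example a time-varying velocity field at many points. It could be a 3D array of size (Npoints, 3, Ntimesteps), where the 3 is there because the vector has three components. I usually have one file per time step, so I have to read Ntimesteps files, but I do not know the size of each field in advance (that is, I do not know Npoints beforehand, although it is the same for every file). If I want to preallocate a NumPy array to store this data, I usually do something along these lines: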

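import glob
import numpy as np

tsteps = list(glob.glob('time*.csv'))
Nsteps = len(tsteps)
with open(tsteps[0], 'r') as f:
    # do my work to get Npoints and the data of the first file
    data_first_file = np.loadtxt(f)  # stand-in for the real parsing
    Npoints = data_first_file.shape[0]
    dataset = np.zeros((Npoints, 3, Nsteps))  # the shape must be passed as a tuple
    dataset[:, :, 0] = data_first_file
for i, fname in enumerate(tsteps[1:], start=1):  # start=1: slot 0 is already filled
    with open(fname, 'r') as f:
        # do my work again
        dataset[:, :, i] = np.loadtxt(f)  # stand-in for the real parsing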

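However, this doesn't look very "elegant", because I have to write the code that processes a time-step file twice: once to preallocate the array and once inside the loop. Is there a more elegant pattern for this?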

Answer 1

I think you're looking for something like this:

import glob
import numpy as np

tsteps = glob.glob('time*.csv')  # glob already returns a list
# Npoints must already be known at this point
dataset = np.zeros((Npoints, 3, len(tsteps)))  # shape goes in as a tuple; no separate Nsteps needed
for i, filename in enumerate(tsteps):  # one loop handles every file, including the first
    with open(filename) as f:
        dataset[:, :, i] = np.loadtxt(f)  # pass delimiter=',' for comma-separated files
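
If Npoints really is unknown until a file has been read, one possible way to keep a single loop is to allocate the array lazily on the first iteration, or to collect the per-file arrays and let np.stack build the result. The following is only a sketch under assumptions that are not in the question: the files are comma-separated, each one parses to an (Npoints, 3) array, and the file names sort into time order (for example zero-padded step numbers). The function names load_timesteps and load_timesteps_stacked are made up for illustration.

```python
import glob

import numpy as np


def load_timesteps(pattern, delimiter=","):
    """Read every file once; allocate when the first file reveals Npoints."""
    # Assumption: lexicographic order of the file names is the time order.
    filenames = sorted(glob.glob(pattern))
    if not filenames:
        raise FileNotFoundError(f"no files match {pattern!r}")
    dataset = None
    for i, filename in enumerate(filenames):
        # ndmin=2 keeps a one-row file as a (1, 3) array instead of (3,).
        data = np.loadtxt(filename, delimiter=delimiter, ndmin=2)
        if dataset is None:
            # First file: its shape fixes the size of the whole array.
            dataset = np.empty(data.shape + (len(filenames),), dtype=data.dtype)
        dataset[:, :, i] = data
    return dataset


def load_timesteps_stacked(pattern, delimiter=","):
    """Alternative: read all files into a list, then stack along a new last axis."""
    filenames = sorted(glob.glob(pattern))
    frames = [np.loadtxt(name, delimiter=delimiter, ndmin=2) for name in filenames]
    return np.stack(frames, axis=-1)
```

Calling load_timesteps('time*.csv') would then return an array of shape (Npoints, 3, Ntimesteps). The lazy version writes each file straight into the final array, while the stacked version is shorter but temporarily holds both the list of per-file arrays and the stacked copy in memory.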
