Function to sum weights over repeated indices

Question

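We are looking for ways to improve one small step in a large pipeline we are developing. The problem is:

Given: several np.ndarray objects of integer dtype that together describe indices, and one np.ndarray of float dtype that describes weights. All of these arrays have the same shape/size.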

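Problem: whenever a set of indices is repeated, sum the weights of the repeated tuples.

Return: the unique indices (now unique by construction) and the summed weights.

We have developed a fairly robust method, but it is slow. Consider: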

import numpy as np

x = np.array([1, 1, 2, 2, 3, 3, 1, 1, 3, 3, 4], dtype=np.uint16)
y = np.array([1, 1, 2, 2, 2, 2, 1, 1, 3, 4, 5], dtype=np.uint16)
l = np.array([1, 2, 2, 2, 3, 2, 1, 1, 3, 3, 6], dtype=np.uint16)
v = np.array([2, 4, 6, 8, 7, 5, 3, 1, 8, 6, 4], dtype=np.float64)

indices = (x, y, l)
# Extent of each index dimension
dims = [np.amax(index) + 1 for index in indices]
# Collapse each (x, y, l) tuple into a single flat key
idx = np.ravel_multi_index(indices, dims, order='F')
# First-occurrence positions and group labels of the unique keys
out, uind, cinv = np.unique(idx, return_index=True, return_inverse=True)
# Sum the weights within each group
vv = np.bincount(cinv, weights=v)
out = tuple(index[uind] for index in indices)
ret = (vv, *out)

print(vv, out)

This works and returns the expected result:

array([ 6.,  4., 14.,  5.,  7.,  8.,  6.,  4.])
array([1, 1, 2, 3, 3, 3, 3, 4], dtype=uint16)
array([1, 1, 2, 2, 2, 3, 4, 5], dtype=uint16)
array([1, 2, 2, 2, 3, 3, 3, 6], dtype=uint16)

This is an MWE with small arrays, but in practice the arrays will have more than a million elements. That raises the problem: calling np.unique with these settings is very slow. We tried several things, using np.unique or csr_matrix, but for different reasons none of them fared well.

What is a more efficient way to perform this kind of calculation?
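One direction that might be worth benchmarking, sketched here under assumptions rather than offered as a tested answer: sort the index tuples once with np.lexsort, mark where the tuple changes, and sum each run with np.add.reduceat. This skips the flat key from np.ravel_multi_index and the return_index/return_inverse bookkeeping of np.unique. The function name sum_duplicate_indices is hypothetical, and whether this is actually faster on million-element arrays has not been measured here.

```python
import numpy as np


def sum_duplicate_indices(indices, weights):
    """Sum weights over repeated index tuples (hypothetical helper).

    indices: tuple of 1-D integer arrays of equal length
    weights: 1-D float array of the same length
    """
    n = weights.shape[0]
    if n == 0:
        return weights.copy(), tuple(col.copy() for col in indices)

    # lexsort treats the LAST key as the primary one, so the result is ordered
    # like the flat keys from ravel_multi_index(..., order='F').
    order = np.lexsort(indices)
    sorted_cols = [col[order] for col in indices]

    # A new group starts wherever any column differs from the previous row.
    is_start = np.zeros(n, dtype=bool)
    is_start[0] = True
    for col in sorted_cols:
        is_start[1:] |= col[1:] != col[:-1]
    starts = np.flatnonzero(is_start)

    # Sum the weights of each contiguous run of identical tuples.
    sums = np.add.reduceat(weights[order], starts)
    unique_cols = tuple(col[starts] for col in sorted_cols)
    return sums, unique_cols


x = np.array([1, 1, 2, 2, 3, 3, 1, 1, 3, 3, 4], dtype=np.uint16)
y = np.array([1, 1, 2, 2, 2, 2, 1, 1, 3, 4, 5], dtype=np.uint16)
l = np.array([1, 2, 2, 2, 3, 2, 1, 1, 3, 3, 6], dtype=np.uint16)
v = np.array([2, 4, 6, 8, 7, 5, 3, 1, 8, 6, 4], dtype=np.float64)

vv, out = sum_duplicate_indices((x, y, l), v)
print(vv, out)
```

On the MWE data this yields the same sums and unique indices, in the same order, as the np.unique version in the question.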

Try C or Fortran, or another compiled language.
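A way to try the compiled-loop idea without leaving Python might be a JIT compiler such as Numba. Numba is not C or Fortran; it is swapped in here only as an illustration, and nothing in the question says it is installed. The sketch below assumes the flat keys (for example idx from np.ravel_multi_index in the question) are already computed. The names sum_sorted_runs and sum_duplicates_compiled are hypothetical. If Numba is missing, the fallback decorator runs the same loop as plain, slow Python.

```python
import numpy as np

try:
    from numba import njit  # optional dependency, assumed for the speed-up
except ImportError:
    def njit(func):
        # Fallback: leave the function uncompiled so the sketch still runs.
        return func


@njit
def sum_sorted_runs(sorted_keys, sorted_weights):
    """Sum weights over runs of equal keys in an already-sorted key array."""
    n = sorted_keys.shape[0]
    starts = np.empty(n, dtype=np.int64)
    sums = np.empty(n, dtype=np.float64)
    count = 0
    for i in range(n):
        if i == 0 or sorted_keys[i] != sorted_keys[i - 1]:
            # First element of a new run.
            starts[count] = i
            sums[count] = sorted_weights[i]
            count += 1
        else:
            sums[count - 1] += sorted_weights[i]
    return starts[:count], sums[:count]


def sum_duplicates_compiled(keys, weights):
    """Return (first-occurrence positions, summed weights) per unique key."""
    order = np.argsort(keys, kind="stable")
    starts, sums = sum_sorted_runs(keys[order], weights[order])
    # With a stable sort, order[starts] is the first occurrence of each key.
    return order[starts], sums


keys = np.array([5, 3, 5, 7, 3], dtype=np.int64)
weights = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
first, sums = sum_duplicates_compiled(keys, weights)
print(keys[first], sums)
```

The returned positions can be used to pick the unique indices from each original array, as uind does in the question: tuple(index[first] for index in indices).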
