具有广泛变化元素的 np.array 的运行总和

Question

我需要计算 np.array 的运行总和。假设

是给定的长度为

的 np.array，并且

是一个整数步长，那么我需要一个长度为

的数组

l - n + 1

，其中

b= [a[0]+a[1]+...+a[n-1], a[1]+a[2]+...+a[n], ..., a[ l-n]+a[l-n+1]+...+a[l-1]]

我尝试了以下方法：

def run_sum(a, n):     
    a_cumsum = np.r_[0, np.cumsum(a)]     
    b = a_cumsum[n:] - a_cumsum[:-n]     
    return b

它工作得很好，但是如果数组

有非常不同的元素（例如其中一些是

1e8

，其中一些是

1e-8

），就像我的情况一样，那么这个方法会给出太多的错误，这些错误会累积在

中np.cumsum

。您对如何直接执行此操作而不使用

np.cumsum

有什么想法吗？

由于

有长度

int(1e8)

我不想使用

for

循环。想要类似的东西：

ind = np.arange(a.size - n + 1)
b[ind] = np.sum(a[ind:ind+n])

但它不起作用（

更新，精度损失的示例：

a = np.array([-9999.999966666666, -5.7847991744735696e-08, -2.983181169098924e-08, -2.093995193995538e-08, -1.6894298480290635e-08, -1.4869270858640853e-08, -1.3961300790320832e-08, -1.3834651535607362e-08, -1.4395718046705324e-08, -1.570696320519037e-08])

run_sum(a, 3) = 
np.array([-9999.999966754345, -1.0861913324333727e-07, -6.766640581190586e-08, -5.2703398978337646e-08, -4.5723936636932194e-08, -4.2664396460168064e-08, -4.219145921524614e-08, -4.393768904265016e-08])

a[:-2] + a[1:1] + a[2:] = 
np.array([-9999.999966754345, -1.0861975537568031e-07, -6.766606211123524e-08, -5.2703521278886865e-08, -4.572487012925232e-08, -4.266522318456905e-08, -4.2191670372633516e-08, -4.393733278750305e-08])

第 5 个数字的差异对我来说很重要！

更新2：认为这会起作用。至少这个玩具示例处理得很好。这个想法是将初始数组拆分为具有大元素和小元素的数组，并独立计算 cumsum。

def run_sum_new(a, n, max_exp_sep=3, exp_sep_lim=4):
    """
    Finds the running sum of n consecutive elements of array a. If exponents of array elements
    vary greatly, divides them into different arrays to minimize the final error.

    Receives:
    a - the original array
    n - number of elements to sum
    max_exp_sep - the maximum number of arrays the original array can be divided into (regardless for positive and negative elements).
                  In total, the maximum number of arrays is 2*max_exp_sep+1
    exp_sep_lim - the minimum of the difference between the exponents of two elements at which they must be separated.

    Returns:
    run_sum - array of running sum of the original array
    """

    #finds maximal exponents of pos and neg elements and minimal exponent of elements of a
    a_pos = a[a>0]
    a_neg = a[a<0]
    if a_pos.size == 0:
        a_pos_max = np.nan
    else:
        a_pos_max = np.max(a_pos)
    if a_neg.size == 0:
        a_neg_max = np.nan
    else:
        a_neg_max = np.max(np.abs(a_neg))
    a_min = np.min(np.abs(a))
    exp_pos_max = np.ceil(np.log10(a_pos_max))
    exp_neg_max = np.ceil(np.log10(a_neg_max))
    exp_min = np.floor(np.log10(a_min))
    del(a_pos)
    del(a_neg)

    #finds separation limits list
    d_exp_pos = exp_pos_max - exp_min
    if np.isnan(d_exp_pos):
        exp_pos_sep_list = []
    elif d_exp_pos <= exp_sep_lim * (max_exp_sep + 1):
        exp_pos_sep_list = [10**(exp_min + exp_sep_lim * i) for i in range(1, max_exp_sep+1) if 10**(exp_min + exp_sep_lim * i) < a_pos_max]
    else:
        new_exp_sep_lim = np.ceil(d_exp_pos / (max_exp_sep + 1))
        print(f"Warning: positive elements of array have too large exponent spread {d_exp_pos} for the given max_exp_sep={max_exp_sep} and exp_sep_lim={exp_sep_lim}.",
              f"New exp_sep_lim={new_exp_sep_lim} has been taken.", sep='\n')
        exp_pos_sep_list = [10**(exp_min + new_exp_sep_lim * i) for i in range(1, max_exp_sep+1)]
    d_exp_neg = exp_neg_max - exp_min
    if np.isnan(d_exp_neg):
        exp_neg_sep_list = []
    elif d_exp_neg <= exp_sep_lim * (max_exp_sep + 1):
        exp_neg_sep_list = [-10**(exp_min + exp_sep_lim * i) for i in range(1, max_exp_sep+1) if 10**(exp_min + exp_sep_lim * i) < a_neg_max]
    else:
        new_exp_sep_lim = np.ceil(d_exp_neg / (max_exp_sep + 1))
        print(f"Warning: negative elements of array have too large exponent spread {d_exp_neg} for the given max_exp_sep={max_exp_sep} and exp_sep_lim={exp_sep_lim}.",
              f"New exp_sep_lim={new_exp_sep_lim} has been taken.", sep='\n')
        exp_neg_sep_list = [10**(exp_min + new_exp_sep_lim * i) for i in range(1, max_exp_sep+1)]
    exp_neg_sep_list.reverse()
    exp_sep_list = [-np.inf, ] + exp_neg_sep_list + exp_pos_sep_list + [np.inf,]

    #separates a
    a_sep_list = [np.where((i <= a) & (a < j), a, 0) for (i, j) in zip(exp_sep_list[:-1], exp_sep_list[1:])]
    a_sep_arr = np.array(a_sep_list)

    #finds run sum
    a_sep_cumsum = np.cumsum(a_sep_arr, axis=-1)
    a_sep_cumsum = np.hstack(([[0]]*a_sep_cumsum.shape[0], a_sep_cumsum))
    run_sum = np.sum(a_sep_cumsum[:,n:] - a_sep_cumsum[:,:-n], axis=0)

    return run_sum

run_sum_new(a, 3) = np.array([-9999.999966754345, -1.0861975537568031e-07, -6.766606211123526e-08, -5.270352127888686e-08, -4.5724870129252317e-08, -4.266522318456904e-08, -4.2191670372633516e-08, -4.393733278750304e-08])

a[:-2] + a[1:1] + a[2:] = 
np.array([-9999.999966754345, -1.0861975537568031e-07, -6.766606211123524e-08, -5.2703521278886865e-08, -4.572487012925232e-08, -4.266522318456905e-08, -4.2191670372633516e-08, -4.393733278750305e-08])

timeit.timeit('run_sum_new(a, 3)', number=1000, globals=globals()) = 0.12470354699999997

Answer 1

如果你想做类似 for 循环的事情，你可以使用

sliding_window_view

:

from numpy.lib.stride_tricks import sliding_window_view as swv

b = swv(a, n).sum(axis=1)

这给出了与你的函数非常相似的结果，它似乎并不缺乏精度：

# generate an input with a large variability
np.random.seed(0)
a = 1/np.random.random(size=100_000_000)*1e-8

a.min(), a.max()
# (1.0000000074212765e-08, 0.9650719974971247)

b1 = run_sum(a, n)
b2 = swv(a, n).sum(axis=1)

# are the two outputs almost equal?
np.allclose(b1, b2)
# True

# what is the greatest difference?
np.abs(b2-b1).max()
# 7.059628683189656e-15

但速度相当慢：

# run_sum(a, n)
895 ms ± 11.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# swv(a, n).sum(axis=1)
2.29 s ± 93.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Answer 2

这是我建议的想法：有一条快速路径和一条准确路径，并根据快速路径是否会产生正确的答案在它们之间进行选择。

实现这样的求和的一种简单快速的方法是获取上一步中使用的值，减去将从窗口中删除的元素，然后添加正在进入窗口的元素。

但是，数组中的元素相差多个数量级，这将导致精度损失，因为从运行总和中减去大元素并不能完全删除它。

因此，我建议跟踪结果相对于窗口中的数字求和可能有多少误差，并在这些情况下计算总和。（我称之为“重新启动”运行总和，因为它消除了之前步骤中积累的错误。）

我还使用了 Numba，这样我就可以编写一个高效的循环。

示例：

import numpy as np
import numba as nb


a = np.array([-9999.999966666666, -5.7847991744735696e-08, -2.983181169098924e-08, -2.093995193995538e-08, -1.6894298480290635e-08, -1.4869270858640853e-08, -1.3961300790320832e-08, -1.3834651535607362e-08, -1.4395718046705324e-08, -1.570696320519037e-08])

@nb.njit()
def run_sum_restart(a, n, rtol):
    eps = np.finfo(a.dtype).eps
    sum_ = a[:n].sum()
    out = np.zeros(len(a) - n + 1)
    out[0] = sum_
    addptr = n
    subptr = 0
    err = 0
    for i in range(1, len(out)):
        sum_ += a[addptr] - a[subptr]
        # Calculate max error for a[addptr] - a[subptr]
        err += (max(abs(a[addptr]), abs(a[subptr]))) * eps
        # Calculate max error for adding that to sum
        err += abs(sum_) * eps
        addptr += 1
        subptr += 1
        if err > rtol * abs(sum_):
            # Restart sum
            new_sum = a[subptr:addptr].sum()
            # print("Max relative error at ", err / sum_)
            # print("Restarted, found error of", (new_sum - sum_) / sum_)
            sum_ = new_sum
            err = 0
        out[i] = sum_
    return out


print(run_sum_restart(a, 3, rtol=1e-5))

输出：

[-9.99999997e+03 -1.08619755e-07 -6.76660621e-08 -5.27035213e-08
 -4.57248701e-08 -4.26652232e-08 -4.21916704e-08 -4.39373328e-08]

@mozway 基准测试的基准时间：

Original
634 ms ± 5.66 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Restart sum version
582 ms ± 9.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

所以这既更准确又更快。

具有广泛变化元素的 np.array 的运行总和

问题描述投票：0回答：2

2个回答

最新问题

具有广泛变化元素的 np.array 的运行总和

问题描述 投票：0回答：2

2个回答

最新问题

问题描述投票：0回答：2