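I am using np.digitize on data distributed around zero (both negative and positive values). I want the bin edges to behave like right=False for positive values but like right=True for negative values (i.e. in terms of absolute value, the lower limit is always included in the bin).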
>>> x = np.array([-10, -4, -1.2, -0.3, 3, 4, 7])
>>> bins = np.array([-8, -4, 0, 4, 8])
>>> np.digitize(x,bins,right=????)
array([0, 1, 2, 2, 3, 4, 4])
Is there an alternative way to handle this, other than a chain of conditionals like the following?
def bin_index(x):
    # right-closed bins at or below zero
    if x <= -8:
        return 0
    elif -8 < x <= -4:
        return 1
    elif -4 < x <= 0:
        return 2
    # left-closed bins above zero
    elif 0 < x < 4:
        return 3
    elif 4 <= x < 8:
        return 4
    elif 8 <= x:
        return 5
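You can use numpy.nextafter to shift some of the boundaries by the smallest possible amount. Here every edge at or below zero is moved up to the next representable float, so the default right=False treats those edges as if they were right-closed: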
>>> bins = bins.astype(x.dtype)
>>> bins = np.nextafter(bins, bins + (bins <= 0))
# apply
>>> np.digitize(x, bins)
array([0, 1, 2, 2, 3, 4, 4])
# zero also lands in the intended bin
>>> np.digitize(0, bins)
array(2)
On inspection:
>>> bins
array([-8.e+000, -4.e+000, 5.e-324, 4.e+000, 8.e+000])
# ndarray.__str__ rounds, but casting to list reveals
>>> bins.tolist()
[-7.999999999999999, -3.9999999999999996, 5e-324, 4.0, 8.0]
We see that zero was shifted to what looks like a denormal number, which may or may not cause problems on some platforms.
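To check that claim concretely, here is a minimal sketch that compares the shifted zero with the smallest positive normal float64. It relies only on `np.nextafter` and `np.finfo`; the variable names are illustrative and do not come from the session above.

```python
import numpy as np

# The value that replaces the zero edge: the next float64 after 0.0 toward +1.
shifted_zero = np.nextafter(np.float64(0.0), np.float64(1.0))

# finfo.tiny is the smallest positive *normal* float64 (about 2.2e-308).
smallest_normal = np.finfo(np.float64).tiny

# Any value strictly between 0 and smallest_normal is a denormal (subnormal).
is_denormal = bool(0.0 < shifted_zero < smallest_normal)
print(shifted_zero, smallest_normal, is_denormal)
```

If `is_denormal` prints as `True`, the shifted edge sits in the denormal range, which is where flush-to-zero settings on some hardware could interfere.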
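To be on the safe side, we can avoid this problem by going the other way: leave the edges at or below zero untouched, move the positive edges one step toward zero, and use right=True: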
>>> bins = np.array([-8, -4, 0, 4, 8])
>>> bins = bins.astype(x.dtype)
>>> bins = np.nextafter(bins, np.minimum(bins, 0))
>>> np.digitize(x, bins, right=True)
array([0, 1, 2, 2, 3, 4, 4])
>>> np.digitize(0, bins, right=True)
array(2)
>>> bins.tolist()
[-8.0, -4.0, 0.0, 3.9999999999999996, 7.999999999999999]
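As a rough sketch, this second variant can be wrapped in a helper and cross-checked against a more explicit formulation. The names `digitize_symmetric` and `digitize_reference` are hypothetical. The `np.where` version is not part of the answer above: it is a different technique (two plain `np.digitize` calls selected by sign), included here only as a sanity check. Both assume `bins` is increasing and the data is float64.

```python
import numpy as np

def digitize_symmetric(x, bins):
    """Right-closed bins at or below zero, left-closed bins above zero."""
    x = np.asarray(x, dtype=float)
    bins = np.asarray(bins, dtype=float)
    # Nudge only the positive edges one float toward zero; edges <= 0 stay put.
    shifted = np.nextafter(bins, np.minimum(bins, 0))
    return np.digitize(x, shifted, right=True)

def digitize_reference(x, bins):
    """Same binning via two unmodified digitize calls, chosen by sign of x."""
    x = np.asarray(x, dtype=float)
    bins = np.asarray(bins, dtype=float)
    return np.where(x <= 0,
                    np.digitize(x, bins, right=True),
                    np.digitize(x, bins, right=False))

# Sample data from the question, plus the edge values 0 and 8.
x = np.array([-10, -4, -1.2, -0.3, 0, 3, 4, 7, 8])
bins = np.array([-8, -4, 0, 4, 8])
print(digitize_symmetric(x, bins))
print(digitize_reference(x, bins))
```

The `nextafter` version needs a single `np.digitize` call, while the `np.where` version avoids modifying the edges at the cost of digitizing twice.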