Fast way to generate samples with np.random.choice()? [duplicate]

Question

This question already has answers here:

How to create 2d array with numpy random.choice for every rows? (4 answers)

I want to draw a random sample without replacement, N times over, like this:

import numpy as np

sample = np.zeros([100000, 4], int)
for i in range(100000):
    sample[i] = np.random.choice(128, 4, replace=False)

When the number of iterations gets very large, the sampling as a whole becomes time-consuming. Is there a way to speed it up?

Answer 1

Your approach:

In [16]: sample = np.zeros([100000, 4], int)

In [17]: %timeit for i in range(100000):sample[i] = np.random.choice(128, 4, replace=False)
1 loop, best of 3: 2.5 s per loop

You could write this instead:

In [149]: %timeit d=np.random.choice(128,100000);sample1=np.array([(d+x)%128 for x in np.random.choice(128,4)])
The slowest run took 4.63 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 4.11 ms per loop

This is much faster on my machine.
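Written out as a function, the offset idea looks roughly like the sketch below. It is not the exact one-liner that was timed: it assumes the offsets should be distinct (`replace=False`), returns one sample per row, and uses the newer `np.random.default_rng()` generator. The name `offset_samples` and its parameters are made up for illustration.

```python
import numpy as np

def offset_samples(n_rows, population=128, k=4, rng=None):
    """Sketch of the offset trick: one random base per row plus k shared offsets."""
    rng = np.random.default_rng() if rng is None else rng
    # One random starting value per row, shape (n_rows, 1).
    base = rng.integers(population, size=(n_rows, 1))
    # k distinct offsets shared by every row; distinct offsets keep each row duplicate-free.
    offsets = rng.choice(population, size=k, replace=False)
    # Broadcasting gives shape (n_rows, k); the modulo wraps values into [0, population).
    return (base + offsets) % population

sample1 = offset_samples(100000)
```

Every row here is the same set of offsets shifted by a random amount, so the rows are strongly correlated with each other.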

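The offset approach may be less random, since every row is the same four offsets shifted by a random amount, but whether that matters depends on your application. Note also that np.random.choice(128, 4) draws the offsets with replacement, so two offsets can coincide (pass replace=False to rule that out), and that the result has shape (4, 100000), so it needs a transpose to match the original layout. After all, for loops are very slow in plain Python. You may also be interested in Cython or Numba.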
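If the reduced randomness is a concern, one fully vectorized alternative (not taken from this answer) is to give every candidate value in every row a random key and sort the keys. A minimal sketch, assuming the temporary `(n_rows, population)` arrays fit in memory; the function name and parameters are hypothetical:

```python
import numpy as np

def sample_rows_without_replacement(n_rows, population=128, k=4, rng=None):
    """Draw k distinct values from range(population), independently for each row."""
    rng = np.random.default_rng() if rng is None else rng
    # One independent random key per (row, candidate value).
    keys = rng.random((n_rows, population))
    # Sorting a row's keys yields a random permutation of 0..population-1;
    # the first k columns are therefore k distinct values.
    return np.argsort(keys, axis=1)[:, :k]

sample = sample_rows_without_replacement(100000)
```

This trades memory for speed: it sorts all 128 candidates per row only to keep 4 of them, which is wasteful when the population is much larger than the sample.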

Answer 2

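This will give you an array of random integers in the range [0, 128) with shape (100000, 4). Note that np.random.randint draws each value independently, so a row can contain repeated values, unlike replace=False: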

np.random.randint(128, size=(100000,4))
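To see how often this matters, the sketch below counts the rows that contain a repeated value and compares that with the probability that four independent draws from 128 values are not all distinct. The variable names are illustrative only.

```python
import numpy as np

draws = np.random.randint(128, size=(100000, 4))

# Sort each row so equal values become neighbours, then look for zero gaps.
sorted_rows = np.sort(draws, axis=1)
has_repeat = (np.diff(sorted_rows, axis=1) == 0).any(axis=1)

# Probability that 4 independent draws from 128 values are not all distinct.
expected = 1 - (127 * 126 * 125) / 128**3
print(f"rows with a repeat: {has_repeat.mean():.4f} (expected about {expected:.4f})")
```

If a few percent of rows with repeats is unacceptable, this approach is not a drop-in replacement for sampling without replacement.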

Answer 3

Use random.sample instead of np.random.choice:

In [16]: import random
    ...: import time
    ...: start_time = time.time()
    ...: sample = np.zeros([100000, 4], int)
    ...: for i in range(100000):
    ...:     sample[i] = random.sample(range(128), 4)
    ...: print("--- %s seconds ---" % (time.time() - start_time))
    ...: 
--- 0.7096474170684814 seconds ---

In [17]: import time
    ...: start_time = time.time()
    ...: sample = np.zeros([100000, 4], int)
    ...: for i in range(100000):
    ...:     sample[i] = np.random.choice(128, 4, replace=False)
    ...: print("--- %s seconds ---" % (time.time() - start_time))
    ...: 
--- 5.2036824226379395 seconds ---
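The same idea can be packaged as a small function that builds the array in one step instead of filling a preallocated one. This is only a sketch: `sample_with_stdlib` is a made-up name, and the timing it prints will differ from machine to machine.

```python
import random
import time

import numpy as np

def sample_with_stdlib(n_rows, population=128, k=4):
    """Draw k distinct values per row with the standard library's random.sample."""
    values = range(population)
    # random.sample returns k distinct values per call; the array is built once at the end.
    return np.array([random.sample(values, k) for _ in range(n_rows)])

start = time.perf_counter()
sample = sample_with_stdlib(100000)
elapsed = time.perf_counter() - start
print(f"--- {elapsed:.3f} seconds ---")
```

`time.perf_counter()` is used here rather than `time.time()` because it is intended for measuring short intervals.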
