Fast way to generate samples using np.random.choice()? [duplicate]

Question (votes: 2, answers: 3)

I want to draw random samples without replacement, N times, like this:

import numpy as np

sample = np.zeros([100000, 4], int)
for i in range(100000):
    sample[i] = np.random.choice(128, 4, replace=False)

When the number of iterations gets very large, the sampling as a whole becomes time-consuming. Is there a way to speed it up?

python numpy random
3 Answers
0 votes

Your approach:

In [16]: sample = np.zeros([100000, 4], int)

In [17]: %timeit for i in range(100000): sample[i] = np.random.choice(128, 4, replace=False)
1 loop, best of 3: 2.5 s per loop

Alternatively, you could write:

In [149]: %timeit d=np.random.choice(128,100000);sample1=np.array([(d+x)%128 for x in np.random.choice(128,4)])
The slowest run took 4.63 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 4.11 ms per loop

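This is much faster on my machine.

It is less random, though, and whether that matters depends on your application: every position shares the same four offsets, the offsets themselves are drawn with replacement (so two of them can coincide and produce repeated values), and the result has shape (4, 100000) rather than (100000, 4). After all, for loops are very slow in vanilla Python. You may also be interested in Cython or Numba.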
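
If every row has to be an independent sample without replacement, one loop-free option is to sort a matrix of random keys. The following is only a minimal sketch, not the approach timed above; the function name `sample_rows_without_replacement` and its parameters are made up for illustration, and it assumes a NumPy version that provides `np.random.default_rng`.

```python
import numpy as np

def sample_rows_without_replacement(n_rows, population, k, seed=None):
    """Return an (n_rows, k) int array; each row holds k distinct values from range(population)."""
    rng = np.random.default_rng(seed)
    # One uniform random key per (row, candidate value).
    keys = rng.random((n_rows, population))
    # Sorting the keys of a row gives a random permutation of 0..population-1;
    # its first k entries are a sample without replacement.
    return np.argsort(keys, axis=1)[:, :k]

# Small illustrative call; the question uses n_rows=100000.
sample = sample_rows_without_replacement(1000, 128, 4, seed=0)
```

The trade-off is memory: the key matrix has n_rows * population entries, so for very large inputs it may be worth generating the rows in chunks.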


0 votes

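This gives you an array of random ints in range(0, 128) with shape (100000, 4). Note that randint draws with replacement, so a row can contain repeated values: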

np.random.randint(128, size=(100000,4))
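
Because only 4 values are drawn out of 128, rows with a repeated value are fairly rare, so one way to get rows without replacement from this idea is to redraw just the offending rows until none are left. This is a hedged sketch rather than part of the answer above: the name `sample_by_rejection` is hypothetical, and it uses `rng.integers`, the `Generator` counterpart of `np.random.randint`.

```python
import numpy as np

def sample_by_rejection(n_rows, population, k, seed=None):
    """Draw rows with replacement, then redraw any row that contains a repeated value."""
    rng = np.random.default_rng(seed)
    out = rng.integers(population, size=(n_rows, k))
    while True:
        # After sorting a row, a repeated value shows up as two equal neighbours.
        s = np.sort(out, axis=1)
        bad = (s[:, 1:] == s[:, :-1]).any(axis=1)
        n_bad = int(bad.sum())
        if n_bad == 0:
            return out
        # Redraw only the rows that failed the uniqueness check.
        out[bad] = rng.integers(population, size=(n_bad, k))

# Small illustrative call; the question uses n_rows=100000.
sample = sample_by_rejection(1000, 128, 4, seed=0)
```

This works well when k is small relative to the population; as k approaches the population size, almost every row gets rejected and the loop becomes slow.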

0 votes

Use random.sample instead of np.random.choice:

In [16]: import random
    ...: import time
    ...: import numpy as np
    ...: start_time = time.time()
    ...: sample = np.zeros([100000, 4], int)
    ...: for i in range(100000):
    ...:     sample[i] = random.sample(range(128), 4)
    ...: print("--- %s seconds ---" % (time.time() - start_time))
    ...: 
--- 0.7096474170684814 seconds ---

In [17]: import time
    ...: start_time = time.time()
    ...: sample = np.zeros([100000, 4], int)
    ...: for i in range(100000):
    ...:     sample[i] = np.random.choice(128, 4, replace=False)
    ...: print("--- %s seconds ---" % (time.time() - start_time))
    ...: 
--- 5.2036824226379395 seconds ---
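
A single `time.time()` measurement can be noisy, so the comparison may be easier to reproduce with `timeit`. The sketch below is only an illustration: the function names and the reduced `N_ROWS` are assumptions chosen so that it runs quickly, and the timings will differ from machine to machine.

```python
import random
import timeit

import numpy as np

N_ROWS = 2000  # much smaller than the 100000 used above, to keep the run short

def with_np_choice(n_rows=N_ROWS):
    sample = np.zeros([n_rows, 4], int)
    for i in range(n_rows):
        sample[i] = np.random.choice(128, 4, replace=False)
    return sample

def with_random_sample(n_rows=N_ROWS):
    sample = np.zeros([n_rows, 4], int)
    population = range(128)
    for i in range(n_rows):
        sample[i] = random.sample(population, 4)
    return sample

# Take the best of three runs for each variant.
t_choice = min(timeit.repeat(with_np_choice, number=1, repeat=3))
t_sample = min(timeit.repeat(with_random_sample, number=1, repeat=3))
print(f"np.random.choice: {t_choice:.4f} s, random.sample: {t_sample:.4f} s")
```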