我有一本字典,例如:
d = {'a':'a-val', 'b':'b-val', 'c':'c-long-val'*1000}
我需要重复访问
d['c']
,如下所示:
print('value of c is', d['c'])
x_queue.put(d['c'])
some_function(d['c'])
但是我想知道将
d['c']
分配给变量并每次使用它是否会更快:
c_value = d['c']` # is this assignment "worth it"?
print('value of c is', c_value)
x_queue.put(c_value)
some_function(c_value)
我的预感是这可能取决于
d
中的元素数量(d
越大,查找键的成本越高)d['c']
的大小(
d['c']
越大,分配成本越高)
更快还是更Pythonic。
O(1),但在产生计算键的哈希值的开销后,最坏情况的时间复杂度为O(n)。另一方面,变量赋值及其值的检索需要非常短的恒定时间,因此尽可能避免通过键重新评估字典值。
注意: 在您花费时间(也就是您雇主的钱)来缩短程序的纳秒或毫秒之前,请确保您确实需要对其进行优化。
TL;博士深入
散列 + 读取 > 读取这就引出了一个问题“什么时候值得在内存中分配一个变量而不重复使用字典?”或“两种操作之间的差异有多大?”
快速而肮脏的基准测试
如果我们信任timeit
,那么缓存该值总是会更快,即使我们只调用该值两次。在 JupyterLab 中测试:初始化测试字典:
asdf = dict(a=1, b=2, c=3)
读取同一个字典值两次:
%timeit (asdf['a'], asdf['a'])
82.5 ns ± 7.6 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
从字典中读取一次,然后缓存它:
%timeit (u := asdf['a'], u)
62.2 ns ± 2.54 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
如果您必须调用某个值一百零一次,您可以将调用速度提高 5 倍:
%%timeit [asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a']]
2.85 µs ± 269 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
这是更快的变体:
%timeit [u := asdf['a'], u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u]
528 ns ± 29.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)