更快地将字典值分配给变量,或连续访问?

问题描述 投票:0回答:3

我有一本字典,例如:

d = {'a':'a-val', 'b':'b-val', 'c':'c-long-val'*1000}

我需要重复访问

d['c']
,如下所示:

print('value of c is', d['c'])
x_queue.put(d['c'])
some_function(d['c'])

但是我想知道将

d['c']
分配给变量并每次使用它是否会更快:

c_value = d['c']` # is this assignment "worth it"?
print('value of c is', c_value)
x_queue.put(c_value)
some_function(c_value)

我的预感是这可能取决于

  • d
    中的元素数量(
    d
    越大,查找键的成本越高)
  • d['c']
    的大小(
    d['c']
    越大,分配成本越高)
但我真的不确定这些选项之一(或另一个?)是

更快还是更Pythonic

python python-3.x performance dictionary
3个回答
2
投票
虽然通过键访问字典值的平均时间复杂度为

O(1),但在产生计算键的哈希值的开销后,最坏情况的时间复杂度为O(n)。另一方面,变量赋值及其值的检索需要非常短的恒定时间,因此尽可能避免通过键重新评估字典值。


0
投票
密钥在字典中进行哈希处理。因此,当您请求 d['c'] 时,首先计算哈希,然后找到相应的值。比简单的分配更多的步骤。


0
投票

注意: 在您花费时间(也就是您雇主的钱)来缩短程序的纳秒或毫秒之前,请确保您确实需要对其进行优化。

阅读:过早的优化是万恶之源——DonaldKnuth

TL;博士

是的,直接从变量读取总是比从字典读取更快,后者需要计算哈希值。

深入

两种算法的复杂度都是 O(1),但是字典的基本操作是计算哈希,然后读取内存中的地址。这与变量相比,变量只是从内存中读取。

散列 + 读取 > 读取

这就引出了一个问题“什么时候值得在内存中分配一个变量而不重复使用字典?”或“两种操作之间的差异有多大?”

快速而肮脏的基准测试

如果我们信任timeit

,那么缓存该值总是会更快,即使我们只调用该值两次。在 JupyterLab 中测试:

初始化测试字典:

asdf = dict(a=1, b=2, c=3)
读取同一个字典值两次:

%timeit (asdf['a'], asdf['a']) 82.5 ns ± 7.6 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
从字典中读取一次,然后缓存它:

%timeit (u := asdf['a'], u) 62.2 ns ± 2.54 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
如果您必须调用某个值一百零一次,您可以将调用速度提高 5 倍:

%%timeit [asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a'], asdf['a']] 2.85 µs ± 269 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
这是更快的变体:

%timeit [u := asdf['a'], u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u, u] 528 ns ± 29.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
    
© www.soinside.com 2019 - 2024. All rights reserved.