Why does setting flags on an NDArray view cause allocations? Are they guaranteed to be bounded?

Consider this code:

import numpy as np
import itertools


def get_view(arr):
    view = arr.view()
    view.flags.writeable = False  # this line causes memory to leak?
    return view


def main():
    for _ in itertools.count():
        get_view(np.zeros(1000))


if __name__ == "__main__":
    main()

The line that sets the view to non-writeable appears to cause a memory leak, though I don't know whether it is bounded.

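  1. Why does this happen?
  2. Is it guaranteed to be bounded? Or is this a numpy bug? Or perhaps the objects are reference counted, but for some reason manually invoking the garbage collector doesn't collect them?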

Here is the same program instrumented with tracemalloc logic that prints the new allocations once every 100k calls to get_view.

import numpy as np
import tracemalloc
import itertools
import gc


def log_diff(snapshot, prev_snapshot):
    diff = snapshot.compare_to(prev_snapshot, "lineno")
    reported = 0
    for stat in diff:
        if "tracemalloc.py" in stat.traceback[0].filename:
            continue
        if stat.size_diff <= 0:
            continue
        print(f"#{reported}: {stat}")
        reported += 1
    print("---")


def get_view(arr):
    view = arr.view()
    view.flags.writeable = False  # this line causes memory to leak?
    return view


def main():
    tracemalloc.start()
    prev_snapshot = None
    for i in itertools.count():
        get_view(np.zeros(1000))
        if i % 100000 == 0:
            gc.collect(generation=2)
            snapshot = tracemalloc.take_snapshot()
            if prev_snapshot is not None:
                log_diff(snapshot, prev_snapshot)
            prev_snapshot = snapshot


if __name__ == "__main__":
    main()

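On Python 3.11.6 and numpy 1.26.4 on Linux, the number of allocations appears to be nondeterministic, but the largest count I've seen it grow to is about 250. It grows quickly at the start and then much more slowly.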

If I comment out the line that assigns view.flags.writeable, memory usage does not increase.

#0: /home/sami/bug.py:22: size=3534 B (+3477 B), count=62 (+61), average=57 B
#1: /home/sami/bug.py:29: size=84 B (+28 B), count=2 (+1), average=42 B
---
#0: /home/sami/bug.py:22: size=5871 B (+2337 B), count=103 (+41), average=57 B
#1: /home/sami/bug.py:15: size=72 B (+72 B), count=1 (+1), average=72 B
---
---
#0: /home/sami/bug.py:22: size=6270 B (+399 B), count=110 (+7), average=57 B
---
#0: /home/sami/bug.py:22: size=6327 B (+57 B), count=111 (+1), average=57 B
---
#0: /home/sami/bug.py:22: size=7638 B (+1311 B), count=134 (+23), average=57 B
---
#0: /home/sami/bug.py:22: size=7809 B (+171 B), count=137 (+3), average=57 B
---
---
#0: /home/sami/bug.py:22: size=8436 B (+627 B), count=148 (+11), average=57 B
---
#0: /home/sami/bug.py:22: size=8664 B (+228 B), count=152 (+4), average=57 B
---
#0: /home/sami/bug.py:22: size=8892 B (+228 B), count=156 (+4), average=57 B
---
---
#0: /home/sami/bug.py:22: size=9120 B (+228 B), count=160 (+4), average=57 B
---
---
#0: /home/sami/bug.py:22: size=9177 B (+114 B), count=161 (+2), average=57 B
---
...
python numpy memory-leaks numpy-ndarray
1 Answer

I'm not sure whether this is a memory leak, but I can give you an equivalent approach that doesn't allocate memory:

view.setflags(write=False)

Running it under tracemalloc shows that no memory is allocated at this line.
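
To try this locally, here is a minimal sketch that puts the two spellings side by side. The helper names view_via_attribute, view_via_setflags and traced_growth are made up for illustration and are not part of numpy; traced_growth is only a rough measure of the net traced bytes over a fixed number of calls, and the numbers it prints may vary between runs and numpy versions.

```python
import gc
import tracemalloc

import numpy as np


def view_via_attribute(arr):
    """Read-only view made by assigning to the flags attribute (as in the question)."""
    view = arr.view()
    view.flags.writeable = False
    return view


def view_via_setflags(arr):
    """Read-only view made with ndarray.setflags (as in this answer)."""
    view = arr.view()
    view.setflags(write=False)
    return view


def traced_growth(make_view, iterations=10_000):
    """Hypothetical helper: net change in traced bytes across `iterations` calls."""
    gc.collect()
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        make_view(np.zeros(1000))
    gc.collect()
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


if __name__ == "__main__":
    base = np.zeros(1000)
    for make_view in (view_via_attribute, view_via_setflags):
        view = make_view(base)
        # Both variants yield a read-only view and leave the base array writeable.
        print(make_view.__name__, view.flags.writeable, base.flags.writeable)
        print("  net traced bytes:", traced_growth(make_view))
```

Either way the view rejects writes with a ValueError while the original array stays writeable, so swapping one spelling for the other should not change the behavior of get_view.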