Multiprocessing stalls in Python

Question

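I've run into some strange behaviour with the multiprocessing module. Can anyone explain what is going on?

The following MWE stalls (it runs forever without raising an error):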

#!/usr/bin/env python3

import multiprocessing

import numpy as np
from skimage import io
from sklearn.cluster import KMeans

def create_model():
    sampled_pixels = np.random.randint(0, 255, (800,3))
    kmeans_model = KMeans(n_clusters=8, random_state=0).fit(sampled_pixels)

def process_image(test, test2):
    image  = np.random.randint(0, 255, (800,3))
    kmeans_model = KMeans(n_clusters=8, random_state=0).fit(image)
    image = kmeans_model.predict(image)

def main():

    create_model()

    with multiprocessing.Pool(1) as pool:

        pool.apply_async(process_image, args=('test', 'test'))

        pool.close()
        pool.join()

if __name__ == "__main__":
    main()

However, if I remove the line

create_model()

or change

def process_image(test, test2)
# as well as
pool.apply_async(process_image, args=('test', 'test'))

to

def process_image(test)
# and
pool.apply_async(process_image, args=('test'))

the code runs successfully, as it should, since both the extra argument and the call to create_model() are completely redundant.

Appendix

> pip list
Package       Version
------------- ---------
imageio       2.34.0
joblib        1.4.0
lazy_loader   0.4
networkx      3.3
numpy         1.26.4
packaging     24.0
pillow        10.3.0
pip           23.2.1
scikit-image  0.23.1
scikit-learn  1.4.2
scipy         1.13.0
threadpoolctl 3.4.0
tifffile      2024.2.12

> python --version
Python 3.12.2

Answer 1

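I think the strange behaviour you are seeing with the multiprocessing module comes from a subtle issue in how Python handles object references in the child processes created by multiprocessing.Pool.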
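
A side note on the one-argument variant from the question: args=('test') is a plain string, not a one-element tuple, so the pool calls process_image(*'test'), i.e. with four positional arguments. The resulting TypeError is stored on the AsyncResult and only re-raised by .get(), which the MWE never calls. That variant therefore probably "succeeds" only because the worker fails before it reaches KMeans. Here is a minimal sketch of that behaviour. It uses multiprocessing.pool.ThreadPool, which has the same apply_async API as Pool, so it runs without starting processes; process_image and run below are hypothetical stand-ins with no scikit-learn call.

```python
from multiprocessing.pool import ThreadPool


def process_image(test):
    # Hypothetical stand-in for the question's worker; takes exactly one argument.
    return f"processed {test}"


def run(args):
    """Submit process_image with the given args and report what happened."""
    with ThreadPool(1) as pool:
        result = pool.apply_async(process_image, args=args)
        try:
            # .get() re-raises any exception that was raised inside the worker.
            return result.get(timeout=10)
        except TypeError as exc:
            return f"worker failed: {type(exc).__name__}"


if __name__ == "__main__":
    print(run(("test",)))  # one-element tuple: the call goes through
    print(run("test"))     # bare string: unpacked into 't', 'e', 's', 't'
```

Writing args=('test',) and calling .get() on the result is what makes a failure inside the worker visible.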

Modify create_model() to return the fitted kmeans_model:

def create_model():
    sampled_pixels = np.random.randint(0, 255, (800,3))
    kmeans_model = KMeans(n_clusters=8, random_state=0).fit(sampled_pixels)
    return kmeans_model

Then, in main(), pass the returned model to process_image:

kmeans_model = create_model()

with multiprocessing.Pool(1) as pool:
    pool.apply_async(process_image, args=(kmeans_model,))
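
If the hang persists, one cause worth ruling out (an assumption on my part, not something I have verified for this MWE) is the process start method. The scikit-learn FAQ describes freezes when a process is forked after a library such as GCC's OpenMP runtime has already set up its own thread pool in the parent, which is what calling create_model() before creating the pool might do. Below is a rough sketch of requesting a non-fork start method through multiprocessing.get_context. The worker is a hypothetical stand-in without scikit-learn, and START_METHOD is a name made up for illustration.

```python
import multiprocessing

# Assumption: avoid fork(); "forkserver" is another non-fork option on Unix.
START_METHOD = "spawn"


def process_image(name):
    # Hypothetical stand-in for the question's worker.
    return name.upper()


def main():
    # A context object scopes the start method to this pool only.
    ctx = multiprocessing.get_context(START_METHOD)
    with ctx.Pool(1) as pool:
        result = pool.apply_async(process_image, args=("test",))
        # A timeout turns a silent stall into a visible TimeoutError.
        print(result.get(timeout=60))


if __name__ == "__main__":
    main()
```

With spawn, the worker function has to live at module level in an importable file, and the if __name__ == "__main__" guard is required, because each child re-imports the main module.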
