Multiprocessing hangs in Python

Question

I have run into some strange behavior with the multiprocessing module. Can anyone explain what is going on?

The following MWE hangs (it runs forever without raising an error):

#!/usr/bin/env python3

import multiprocessing

import numpy as np
from skimage import io
from sklearn.cluster import KMeans

def create_model():
    sampled_pixels = np.random.randint(0, 255, (800,3))
    kmeans_model = KMeans(n_clusters=8, random_state=0).fit(sampled_pixels)

def process_image(test, test2):
    image  = np.random.randint(0, 255, (800,3))
    kmeans_model = KMeans(n_clusters=8, random_state=0).fit(image)
    image = kmeans_model.predict(image)

def main():

    create_model()

    with multiprocessing.Pool(1) as pool:

        pool.apply_async(process_image, args=('test', 'test'))

        pool.close()
        pool.join()

if __name__ == "__main__":
    main()

However, if I remove the line

create_model()

or change

def process_image(test, test2)
# as well as
pool.apply_async(process_image, args=('test', 'test'))

to

def process_image(test)
# and
pool.apply_async(process_image, args=('test'))

the code runs successfully, as it should, since both the parameters and the call to create_model() are completely redundant.
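A side note on the second variant, based on general Python semantics rather than anything stated in the post: `('test')` is a plain string, not a one-element tuple, so `apply_async` unpacks it into four separate arguments. The resulting `TypeError` stays hidden unless `.get()` is called on the result, which can make a failed task look like a successful run. The sketch below uses `ThreadPool` as a stand-in for `multiprocessing.Pool` (same `apply_async` interface) so it runs anywhere; `run` is a hypothetical helper name.

```python
from multiprocessing.pool import ThreadPool


def process_image(test):
    # Stand-in for the one-argument variant from the question.
    return f"processed {test}"


def run(args):
    """Submit process_image with the given args and report what happened."""
    with ThreadPool(1) as pool:
        result = pool.apply_async(process_image, args=args)
        try:
            # get() waits for the task and re-raises any worker exception.
            return result.get(timeout=10)
        except TypeError as exc:
            return f"TypeError: {exc}"


print(type(('test')).__name__)   # str
print(type(('test',)).__name__)  # tuple

outcome_str = run(('test'))      # string is unpacked into 't', 'e', 's', 't'
outcome_tuple = run(('test',))   # one positional argument, as intended

print(outcome_str)
print(outcome_tuple)
```

Without the call to `get()`, the first submission would fail silently and the script would still exit normally.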


Appendix

> pip list
Package       Version
------------- ---------
imageio       2.34.0
joblib        1.4.0
lazy_loader   0.4
networkx      3.3
numpy         1.26.4
packaging     24.0
pillow        10.3.0
pip           23.2.1
scikit-image  0.23.1
scikit-learn  1.4.2
scipy         1.13.0
threadpoolctl 3.4.0
tifffile      2024.2.12

> python --version
Python 3.12.2
python scikit-learn multiprocessing python-multiprocessing
1 Answer

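I think the strange behavior you are seeing with the multiprocessing module is caused by a subtle issue in how Python handles object references in the child processes created by multiprocessing.Pool.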

Modify create_model() so that it returns the kmeans_model it creates:

def create_model():
    sampled_pixels = np.random.randint(0, 255, (800,3))
    kmeans_model = KMeans(n_clusters=8, random_state=0).fit(sampled_pixels)
    return kmeans_model

Then, in main(), pass the returned model to process_image:

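kmeans_model = create_model()

with multiprocessing.Pool(1) as pool:
    # process_image must be changed to accept the model as its only argument
    result = pool.apply_async(process_image, args=(kmeans_model,))
    result.get()  # wait for the task and re-raise any worker exception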
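
A different explanation that also fits the symptoms, offered as an assumption and not established anywhere in this thread: if the script runs on Linux, Python 3.12 starts pool workers with fork by default, and scikit-learn's KMeans uses OpenMP threads. The scikit-learn FAQ warns that some OpenMP runtimes are not fork-safe, so a worker forked after the parent has already run KMeans (through create_model()) can freeze. Below is a minimal sketch of the usual workaround, the spawn start method. `process_image` here is a stdlib-only stand-in for the question's function, and `run_with_spawn` is a hypothetical helper name.

```python
import multiprocessing


def process_image(test, test2):
    # Stand-in for the question's process_image; the real body would
    # fit and apply KMeans here.
    return f"{test}:{test2}"


def run_with_spawn():
    # "spawn" starts each worker as a fresh interpreter instead of forking
    # the parent, so the worker does not inherit thread-pool state that
    # an earlier KMeans fit may have created in the parent.
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(1) as pool:
        result = pool.apply_async(process_image, args=("test", "test"))
        # get() waits for the task and re-raises worker exceptions; the
        # timeout turns a hang into a visible multiprocessing.TimeoutError.
        return result.get(timeout=60)


if __name__ == "__main__":
    # In the question's script, create_model() would be called here,
    # before the pool is created.
    print(run_with_spawn())
```

The `if __name__ == "__main__":` guard is required with spawn, because each worker re-imports the main module on startup.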