Calling the Python multiprocessing module from C/C++ keeps spawning new processes

Question (votes: 0, answers: 1)

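I'm working on a Windows project with an unusual architecture: a C++ main() calls Python multiprocessing functions through the C API. Without multiprocessing, the setup works fine. As soon as the multiprocessing module is used (even with a single process), the program keeps spawning new processes of the C++ main() executable. Run standalone, the Python script works fine with the multiprocessing module. I think this is related to how the multiprocessing module is implemented, but I searched online and found nothing relevant. Can anyone offer some hints? Thanks!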

start.py

import time
import multiprocessing as mp



def do_something():
    print('Sleeping 1 second...')
    time.sleep(1)
    print('Done Sleeping...')
    
def benchmark():
    start = time.perf_counter()

    p1 = mp.Process(target=do_something)
    p2 = mp.Process(target=do_something)
    p1.start()
    #p2.start()
    p1.join()
    #p2.join()

    finish = time.perf_counter()

    print(f'Finished in {round(finish - start,2)} second(s)')
 
if __name__ == '__main__':
    benchmark()

callPythonFromCpp.cpp

#include <Windows.h>
#include <iostream>
#include <string>
#include <Python.h>


using namespace std;

void CallPython(string PythonModuleName, string PythonFunctionName)
{
    char* funcname = new char[PythonFunctionName.length() + 1];
    strcpy_s(funcname, PythonFunctionName.length() + 1, PythonFunctionName.c_str());

    char* modname = new char[PythonModuleName.length() + 1];
    strcpy_s(modname, PythonModuleName.length() + 1, PythonModuleName.c_str());

    
    // Initialize the Python interpreter 
    Py_Initialize();

    PyObject* my_module = PyImport_ImportModule(modname);

    PyObject* my_function = PyObject_GetAttrString(my_module, funcname);

    // Call a callable Python object callable, with arguments given by the tuple args. 
    // If no arguments are needed, then args can be NULL.

    PyObject* my_result = PyObject_CallObject(my_function, NULL);

    // Undo all initializations made by Py_Initialize() and subsequent use of Python/C API functions, 
    // and destroy all sub-interpreters (see Py_NewInterpreter() below) that were created and not yet 
    // destroyed since the last call to Py_Initialize(). Ideally, this frees all memory allocated by the Python interpreter.
    Py_Finalize();

    delete[] funcname;
    delete[] modname;
}
int main()
{
    CallPython("start", "benchmark");
    system("pause");
    return 0;
}


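  1. Calling single-process Python code from C++ main() works fine;
  2. Calling multiprocessing Python code from C++ main() keeps adding new processes, and the Python code never runs;
  3. Running the multiprocessing Python script standalone works fine and runs in parallel.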

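Update: I printed sys.executable and os.getpid() in the Python code. When p1.start() is called, a new process of the C++ main() executable is launched, and this repeats recursively. I think this is related to how multiprocessing tells the main process from child processes. For example, the standalone Python script has "if __name__ == '__main__'", which is essential when using the multiprocessing module.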

Updated Python code:

import os
import sys

def benchmark():
    start = time.perf_counter()

    p1 = mp.Process(target=do_something)
    p2 = mp.Process(target=do_something)
    print('sys.executable: ' + sys.executable + '\n')
    print('os pid: ' + str(os.getpid()) + '\n')
    p1.start()
    #p2.start()
    #p1.join()
    #p2.join()

    finish = time.perf_counter()

    print(f'Finished in {round(finish - start,2)} second(s)')

Terminal output:

sys.executable: C:\Users\xxx\source\repos\callPythonFromCpp\x64\Release\callPythonFromCpp.exe

os pid: 25408

Finished in 0.08 second(s)
sys.executable: C:\Users\xxx\source\repos\callPythonFromCpp\x64\Release\callPythonFromCpp.exe

os pid: 32332

Finished in 0.01 second(s)
sys.executable: C:\Users\xxx\source\repos\callPythonFromCpp\x64\Release\callPythonFromCpp.exe

os pid: 9456

Finished in 0.01 second(s)
sys.executable: C:\Users\xxx\source\repos\callPythonFromCpp\x64\Release\callPythonFromCpp.exe

os pid: 8944
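
The output above shows sys.executable pointing at the C++ host rather than at a Python interpreter. A minimal sketch of a check for that condition might look like the following. The helper name looks_like_python_interpreter and the path C:\Python311\python.exe are hypothetical, and matching on the file name is only a heuristic, not something the multiprocessing module does itself.

```python
from pathlib import PureWindowsPath


def looks_like_python_interpreter(executable):
    """Heuristic: does the file name look like python.exe or pythonw.exe?"""
    name = PureWindowsPath(executable).name.lower()
    return name.startswith("python")


# Path reported by the embedded interpreter in the output above
host = r"C:\Users\xxx\source\repos\callPythonFromCpp\x64\Release\callPythonFromCpp.exe"
# Hypothetical path to a regular interpreter, for comparison
regular = r"C:\Python311\python.exe"

print(looks_like_python_interpreter(host))
print(looks_like_python_interpreter(regular))
```

Calling looks_like_python_interpreter(sys.executable) before p1.start() would flag the embedded case early instead of letting the host executable relaunch itself.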


Update 2023-11-23: Following the post "Embedded Python: multiprocessing not working", my program now runs. The C++ code is unchanged; the Python code is as follows:

import sys
import time
import multiprocessing as mp


def do_something():
    print('Sleeping 1 second...')
    time.sleep(1)
    print('Done Sleeping...')


def benchmark():
    # Set sys.argv explicitly, as recommended in the linked post
    sys.argv = [r'C:\path_to_this\start.py']
    # Launch child processes with python.exe instead of the C++ host executable
    mp.set_executable(r'C:\path_to_Python_install\python.exe')

    start = time.perf_counter()

    p1 = mp.Process(target=do_something)
    p2 = mp.Process(target=do_something)
    p1.start()
    p2.start()
    p1.join()
    p2.join()

    finish = time.perf_counter()

    print(f'Finished in {round(finish - start,2)} second(s)')


if __name__ == '__main__':
    benchmark()
    
python parallel-processing multiprocessing python-multiprocessing c-api
1 Answer
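Following the post "Embedded Python: multiprocessing not working", my program now runs. The C++ code stays the same; the Python code needs two extra lines, one that sets sys.argv and one that calls multiprocessing.set_executable.

I think the cause lies in how the multiprocessing module is implemented. By default it launches child processes with the executable it takes to be the interpreter, which is normally python.exe, so it starts several independent python.exe instances. When a C++ main() calls the multiprocessing code in a Python script, however, the module treats the C++ executable as the interpreter and launches multiple copies of it, each of which then recursively launches more C++ exes, on the order of the number of CPUs.

Working Python code: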

import sys
import time
import multiprocessing as mp


def do_something():
    print('Sleeping 1 second...')
    time.sleep(1)
    print('Done Sleeping...')


def benchmark():
    # Set sys.argv explicitly, as recommended in the linked post
    sys.argv = [r'C:\path_to_this\start.py']
    # Launch child processes with python.exe instead of the C++ host executable
    mp.set_executable(r'C:\path_to_Python_install\python.exe')

    start = time.perf_counter()

    p1 = mp.Process(target=do_something)
    p2 = mp.Process(target=do_something)
    p1.start()
    p2.start()
    p1.join()
    p2.join()

    finish = time.perf_counter()

    print(f'Finished in {round(finish - start,2)} second(s)')

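
The two extra lines can also be wrapped in a small helper that fails early when a path is wrong. This is only a sketch under stated assumptions: the function name configure_embedded_multiprocessing is hypothetical, and both paths must be supplied by the caller, since nothing here tries to locate python.exe automatically.

```python
import multiprocessing as mp
import os
import sys


def configure_embedded_multiprocessing(script_path, python_exe):
    """Point multiprocessing at a real interpreter when Python is embedded.

    Both arguments are caller-supplied paths; nothing is guessed here.
    """
    if not os.path.isfile(python_exe):
        raise FileNotFoundError(f"Python interpreter not found: {python_exe}")
    if not os.path.isfile(script_path):
        raise FileNotFoundError(f"Script not found: {script_path}")

    # Same two steps as in the working code: set sys.argv explicitly ...
    sys.argv = [script_path]
    # ... and tell multiprocessing which executable to launch for children.
    mp.set_executable(python_exe)
```

In benchmark(), the two hard-coded lines would then become a single call such as configure_embedded_multiprocessing(r'C:\path_to_this\start.py', r'C:\path_to_Python_install\python.exe'), which raises FileNotFoundError up front rather than failing later when a child process is started.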
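There is another potential issue: if the Python code uses multiprocessing, we cannot cythonize all of it, because a .py file must exist on disk for the multiprocessing module to parse. See this post:

Trouble converting multiprocessing Python code to Cython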
