Python multiprocessing throws an inexplicable error

Question

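I have a Python program split across three files, dielectric_functions.py, routines.py and cartmoments.py, organized as follows:

dielectric_functions.py is the entry point of the program. It more or less just provides the structure and order of the calculation by calling functions written in routines.py.
routines.py contains most of the functionality, including the function that triggers the multiprocessing problem discussed below.
cartmoments.py implements class objects and certain (parallelized) functions on those objects.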

The code in dielectric_functions.py:

import routines as routines

def initialize_cell() -> tuple[routines.pbcgto.cell.Cell, dict]:
    cell = routines.build_cell_from_input()                                    # Build cell object
    primgauss = routines.gen_all_1D_prim_gauss(cell)                           # Get all primitive gaussian objects
    primindices = routines.gen_prim_gauss_indices(primgauss)                   # Get all main indices for primitive gaussian objects.
    all_ao = routines.gen_all_atomic_orbitals(cell, primgauss)                 # Get all atomic orbitals
    G_vectors = routines.gen_G_vectors(cell)                                   # Get all relevant G vectors
    R_vectors = routines.construct_R_vectors(cell)
    dark_objects = {
        'primitive_gaussians': primgauss,
        'all_ao': all_ao,
        'G_vectors': G_vectors,
        'primindices': primindices[0],
        'atom_locs': primindices[1],
        'R_vectors': R_vectors
    }
    return cell, dark_objects

def dielectric_RPA(dark_objects: dict) -> dict:
    routines.primgauss_1D_overlaps(dark_objects)
    return dark_objects

def main():
    cell, dark_objects = initialize_cell()
    #dark_objects = electronic_structure(cell, dark_objects)
    dark_objects = dielectric_RPA(dark_objects)
    return

if __name__ == '__main__':
    main()

routines.py (relevant parts only):

import logging   # used by time_wrapper and primgauss_1D_overlaps
import time      # used by time_wrapper
from multiprocessing import Pool
import cartmoments
from functools import partial
import numpy as np

def time_wrapper(func):
    """
    Wrapper for printing execution time to logger.
    """
    def wrap(*args, **kwargs):
        logging.info('Entering function {}'.format(func.__name__))
        start = time.time()
        val = func(*args, **kwargs)
        end = time.time()
        logging.info('Exiting function {}. Time taken = {:.2f} s.\n'.format(func.__name__, end - start))
        return val

    return wrap

@time_wrapper
def primgauss_1D_overlaps(dark_objects: dict):
    """
    Store all 1D primitive gaussians in files. 
    NOTE: Currently returns one numpy array for all overlaps. This is not good and will not work for anisotropic systems.
    NOTE: Implement file stores with name parmt.store + '/1d_overlaps/{}_{}.npy'.format(d, q) 
    Inputs:
        dark_objects:   dict: equivalent to a class object, except not self-referential.
    """
    primindices = dark_objects['primindices']
    atom_locs = dark_objects['atom_locs']
    q, G = np.load(parmt.store + '/unique_q.npy'), dark_objects['G_vectors']
    Rv = dark_objects['R_vectors']
    makedir(parmt.store + '/primgauss_1d_integrals/')
    for d in range(3):
        qu, Gu = get_all_unique_nums_in_array(q[:,d], round_to=10), get_all_unique_nums_in_array(G[:,d], round_to=10)
        qG = (qu[:, None] + Gu[None, :]).reshape((-1))
        qG = get_all_unique_nums_in_array(qG, round_to=10)
        qG = qG[np.abs(qG) <= parmt.q_max]
        Ru = get_all_unique_nums_in_array(Rv[:,d], round_to=10)
        with Pool(10) as p:
            res = p.map(partial(cartmoments.primgauss_1D_overlaps_uR, primindices = primindices, q = qG, atom_locs = atom_locs[:,d]), Ru)
            p.close()
            p.join()
    logging.info("Generated overlaps of 1D primitive gaussians.")
    return

The problem is that if I run the following commands in ipython

from dielectric_functions import *
main()

it works as expected.

However, if I run

python3 dielectric_functions.py

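it corrupts my logs and tries to load routines.py dozens of times. Note that the code still produces the correct results, but there is a bug that needs fixing so that the logging survives.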

cartmoments.py does not import routines.py; all the functionality it needs lives in cartmoments.py itself.

I am using Python 3.12.4 (main, Jun  6 2024, 18:26:44) [Clang 15.0.0 (clang-1500.3.9.4)] in a virtual environment on macOS (M3 Pro, 11 cores).

I am completely lost. I find it surprising that running the code as two separate lines in ipython is not equivalent to running it as `python3 dielectric_functions.py`.

Answer 1

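I'm not yet sure exactly what is going wrong, but I can shed light on the difference. multiprocessing tries to give the impression that things work by magic, but in reality, with the spawn start method (the default on macOS), the Python interpreter has to run again for each worker and load the same modules. Normally, the `if __name__ == "__main__":` guard is the only thing that stops the code that launches multiprocessing from running again in each worker process.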

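When you run `python3 dielectric_functions.py` from the shell, `__name__ == "__main__"` is true in the first process and should be false when the module is loaded again for each worker. When you run `import dielectric_functions` from the interpreter, `__name__` is not `"__main__"` even on the first run.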
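One way to keep such re-imports from touching the log is to restrict module-level setup to the parent process. The sketch below assumes that routines.py configures logging at import time, which the question does not show; `in_parent_process`, `configure_logging` and the log file name are hypothetical names used only for illustration.

```python
import logging
import multiprocessing


def in_parent_process() -> bool:
    """Return True in the process the user started, False in a multiprocessing child."""
    # parent_process() (Python 3.8+) returns None in the main process.
    return multiprocessing.parent_process() is None


def configure_logging(logfile: str, is_parent: bool) -> bool:
    """Set up file logging only in the parent; return whether the setup ran."""
    if not is_parent:
        # A worker that re-imports this module skips the setup,
        # so it cannot reopen or truncate the parent's log file.
        return False
    logging.basicConfig(filename=logfile, filemode="a", level=logging.INFO)
    return True
```

Calling `configure_logging("run.log", in_parent_process())` at the top of the module would then leave the log alone whenever a spawned worker imports it; the file name is a placeholder.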

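What is unusual, though not incorrect, is that your code creates a multiprocessing pool several times, once per pass through the `for d in range(3):` loop. Whatever is going wrong in the interaction with the edge case around the first module's name probably has to do with this part.

Simply move your pool outside the `for` loop so that it is created once; the processes in the pool are then reused on each iteration of the loop. This should fix it for you: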

...
@time_wrapper
def primgauss_1D_overlaps(dark_objects: dict):
    ...
    with Pool(10) as p:
        for d in range(3):
            qu, Gu = get_all_unique_nums_in_array(q[:,d], round_to=10), get_all_unique_nums_in_array(G[:,d], round_to=10)
            qG = (qu[:, None] + Gu[None, :]).reshape((-1))
            qG = get_all_unique_nums_in_array(qG, round_to=10)
            qG = qG[np.abs(qG) <= parmt.q_max]
            Ru = get_all_unique_nums_in_array(Rv[:,d], round_to=10)
            res = p.map(partial(cartmoments.primgauss_1D_overlaps_uR, primindices=primindices, q=qG, atom_locs=atom_locs[:,d]), Ru)

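If you use a `with` block, you don't need to call the `close` and `join` methods on the pool: leaving the block shuts the pool down for you (it calls `terminate()`), and `map` has already returned all of its results by then.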

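Now, some more background and another approach: `fork`, the other start method for worker processes, was the default on macOS until Python 3.8 (and is still the default on Linux as of Python 3.12). It was changed because fork can cause problems under certain conditions, which are more common on macOS. You will almost certainly not run into your problem if you set the worker start method to `fork`. Just do it once inside the main function: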

dielectric_functions.py:

import multiprocessing
...

def main():
    multiprocessing.set_start_method('fork') 
    ...

if __name__ == "__main__":
    ...

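Furthermore, if calling `set_start_method` this way fails when the code is invoked from an interactive shell (because the start method has already been set), you will have to use an explicit context to change it. In the code that creates the `Pool`, use a multiprocessing context instead: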

import multiprocessing

...


@time_wrapper
def primgauss_1D_overlaps(dark_objects: dict):
    ...
    # A context bound to "fork"; the global start method is left untouched.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(10) as p:
        for d in range(3):
            ...

The "fork" start method duplicates the running process in memory as it is, so no Python modules are imported again: each worker reuses the modules already loaded in the parent process.
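Since "fork" is not available on every platform, the explicit-context approach can be wrapped in a small helper that falls back to the platform default. This is only a sketch; `pick_context` is a hypothetical name, not part of the original code.

```python
import multiprocessing


def pick_context(preferred: str = "fork"):
    """Return a context for `preferred` if the platform supports it, else the default context."""
    if preferred in multiprocessing.get_all_start_methods():
        # get_context(name) does not change the global start method.
        return multiprocessing.get_context(preferred)
    # For example on Windows, where "fork" is unavailable.
    return multiprocessing.get_context()
```

With this helper, `with pick_context().Pool(10) as p:` could take the place of `with ctx.Pool(10) as p:` in the function above.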
