I am trying to parallelize a small part of my python code in Fortran90. So, as a first step, I am trying to understand how the spawning functionality works.
First, I tried spawning a python child process from a python parent, using the dynamic process management example from the mpi4py tutorial. Everything worked fine. In this case, as far as I understand, only the intercommunicator between the parent and child processes is used.
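For reference, the python-to-python case that worked looks roughly like the sketch below, modelled on the mpi4py tutorial's dynamic process management example (the file names parent.py and worker.py and the sum reduction are just illustrative choices of mine):
# parent.py: spawn two python workers and collect a value from each
from mpi4py import MPI
import numpy
import sys
# sub_comm is an intercommunicator; its remote group holds the two workers
sub_comm = MPI.COMM_SELF.Spawn(sys.executable, args=['worker.py'], maxprocs=2)
result = numpy.zeros(1, dtype='int32')
# root=MPI.ROOT marks this parent as the root of the intercommunicator reduction
sub_comm.Reduce(None, [result, MPI.INT], op=MPI.SUM, root=MPI.ROOT)
print('parent received sum:', result[0])
sub_comm.Disconnect()

# worker.py: each spawned worker contributes its rank + 1
from mpi4py import MPI
import numpy
parent = MPI.Comm.Get_parent()  # intercommunicator back to the parent
value = numpy.array([parent.Get_rank() + 1], dtype='int32')
parent.Reduce([value, MPI.INT], None, op=MPI.SUM, root=0)
parent.Disconnect()
Only the intercommunicator returned by Spawn / Get_parent is used here; no Merge is involved.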
Then, I moved on to an example that spawns a fortran90 child process from a python parent. For this, I used an example from one of the previous posts on stackoverflow. The python code (master.py) that spawns the fortran child is as follows:
from mpi4py import MPI
import numpy
'''
slavef90 is an executable built starting from slave.f90
'''
# Spawning a process running an executable
# sub_comm is an MPI intercommunicator
sub_comm = MPI.COMM_SELF.Spawn('slavef90', args=[], maxprocs=1)
# common_comm is an intracommunicator across the python process and the spawned process.
# All kinds of collective communication (Bcast...) are now possible between the python process and the fortran process
common_comm=sub_comm.Merge(False)
print('parent in common_comm ', common_comm.Get_rank(), ' of ', common_comm.Get_size())
data = numpy.arange(1, dtype='int32')
data[0]=42
print("Python sending message to fortran: {}".format(data))
common_comm.Send([data, MPI.INT], dest=1, tag=0)
print("Python over")
# disconnecting the shared communicators is required to finalize the spawned process.
sub_comm.Disconnect()
common_comm.Disconnect()
The corresponding fortran90 code (slave.f90) for the spawned child process is as follows:
program test
!
implicit none
!
include 'mpif.h'
!
integer :: ierr,s(1),stat(MPI_STATUS_SIZE)
integer :: parentcomm,intracomm
!
call MPI_INIT(ierr)
call MPI_COMM_GET_PARENT(parentcomm, ierr)
call MPI_INTERCOMM_MERGE(parentcomm, 1, intracomm, ierr)
call MPI_RECV(s, 1, MPI_INTEGER, 0, 0, intracomm,stat, ierr)
print*, 'fortran program received: ', s
call MPI_COMM_DISCONNECT(intracomm, ierr)
call MPI_COMM_DISCONNECT(parentcomm, ierr)
call MPI_FINALIZE(ierr)
endprogram test
I compiled the fortran90 code with mpif90 slave.f90 -o slavef90 -Wall and run the python code with python master.py. I am able to get the desired output, but the spawned process does not disconnect, i.e., any statements after the Disconnect commands (call MPI_COMM_DISCONNECT(intracomm, ierr) and call MPI_COMM_DISCONNECT(parentcomm, ierr)) are not executed in the fortran code (and consequently the disconnect commands in the python code are not executed either), and the program never terminates in the terminal.
It seems to me that in this case the intercommunicator and the intracommunicator are merged, so the child and parent processes no longer form two distinct groups, and there appears to be some problem when disconnecting them. However, I have not been able to figure out a solution. I also tried reproducing the fortran90 code with the child processes spawned from C++ and from python, and ran into the same issue. Any help is appreciated. Thanks.
Note that your python script first disconnects the intercommunicator and then the intracommunicator, whereas your Fortran program first disconnects the intracommunicator and then the intercommunicator.
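Since MPI_COMM_DISCONNECT is collective over the communicator being disconnected, each side can end up blocked waiting for the other when the order differs. Just as a sketch of what matching the order on the python side would mean (the fix I actually use below instead frees the merged intracommunicator and only disconnects the intercommunicator):
# disconnect the merged intracommunicator first, matching the Fortran side
common_comm.Disconnect()
# then disconnect the intercommunicator returned by Spawn
sub_comm.Disconnect()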
After fixing the order and freeing the intracommunicator, I was able to run this test on my mac (Open MPI and mpi4py installed via brew).
Here is my master.py
#!/usr/local/Cellar/python@3.8/3.8.2/bin/python3
from mpi4py import MPI
import numpy
'''
slavef90 is an executable built starting from slave.f90
'''
# Spawning a process running an executable
# sub_comm is an MPI intercommunicator
sub_comm = MPI.COMM_SELF.Spawn('slavef90', args=[], maxprocs=1)
# common_comm is an intracommunicator across the python process and the spawned process.
# All kinds of collective communication (Bcast...) are now possible between the python process and the fortran process
common_comm=sub_comm.Merge(False)
print('parent in common_comm ', common_comm.Get_rank(), ' of ', common_comm.Get_size())
data = numpy.arange(1, dtype='int32')
data[0]=42
print("Python sending message to fortran: {}".format(data))
common_comm.Send([data, MPI.INT], dest=1, tag=0)
print("Python over")
# free the (merged) intra communicator
common_comm.Free()
# disconnecting the inter communicator is required to finalize the spawned process.
sub_comm.Disconnect()
and here is my slave.f90
program test
!
implicit none
!
include 'mpif.h'
!
integer :: ierr,s(1),stat(MPI_STATUS_SIZE)
integer :: parentcomm,intracomm
integer :: rank, size
!
call MPI_INIT(ierr)
call MPI_COMM_GET_PARENT(parentcomm, ierr)
call MPI_INTERCOMM_MERGE(parentcomm, .true., intracomm, ierr)
call MPI_COMM_RANK(intracomm, rank, ierr)
call MPI_COMM_SIZE(intracomm, size, ierr)
call MPI_RECV(s, 1, MPI_INTEGER, 0, 0, intracomm,stat, ierr)
print*, 'fortran program', rank, ' / ', size, ' received: ', s
print*, 'Slave frees intracomm'
call MPI_COMM_FREE(intracomm, ierr)
print*, 'Slave disconnect intercomm'
call MPI_COMM_DISCONNECT(parentcomm, ierr)
print*, 'Slave finalize'
call MPI_FINALIZE(ierr)
endprogram test