Getting the MPI communicator in Julia embedded in C

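I am trying to embed Julia with MPI.jl in C code, as shown below. MPI appears to work fine on the C side, but the program crashes whenever I try to get the rank in Julia, complaining that the communicator is invalid. Can anyone help? I am using Open MPI 4.1.3.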

Edit:

Below is a minimal example that reproduces the problem on my machine. It cannot even get the size or rank of MPI.COMM_WORLD in Julia.

#include <mpi.h>
#include <stdio.h>
#include <julia.h>

int main(int argc, char *argv[]) {

    MPI_Init(&argc, &argv);

    jl_init();

    (void) jl_eval_string("println(\"Loading MPI...\")");
    (void) jl_eval_string("using MPI");
    (void) jl_eval_string("println(\"Done.\")");

    (void) jl_eval_string("if MPI.Initialized() ;  println(\"MPI is initialized.\") ; else ; println(\"Warning: MPI is not initialized.\") ; end ");

    (void) jl_eval_string("comm = MPI.COMM_WORLD");
    (void) jl_eval_string("println(comm)");
    (void) jl_eval_string("println(MPI.Comm_size(comm))");

    jl_atexit_hook(0);

    MPI_Finalize();

    return 0;
}

Compile with:

mpicc main.c -I$JULIA_INC -L$JULIA_LIB -ljulia  -o run.exe

and run it with:

mpirun -np 2 ./run.exe

My output:

Loading MPI...
Done.
MPI is initialized.
MPI.Comm(1140850688)

[4188745] signal (11.1): Segmentation fault
in expression starting at none:1
PMPI_Comm_size at /home/t2hsu/miniconda3/envs/mpi/lib/libmpi.so.40 (unknown line)
MPI_Comm_size at /home/t2hsu/.julia/packages/MPI/TKXAj/src/api/generated_api.jl:999 [inlined]
Comm_size at /home/t2hsu/.julia/packages/MPI/TKXAj/src/comm.jl:78
jfptr_Comm_size_591 at /home/t2hsu/.julia/compiled/v1.9/MPI/nO0XF_FB87d.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
do_call at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:126
eval_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:226
eval_stmt_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:177 [inlined]
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:624
jl_interpret_toplevel_thunk at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:762
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:912
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:856
ijl_toplevel_eval_in at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:971
ijl_eval_string at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:113
main at ./run_c.exe (unknown line)
__libc_start_main at /lib64/libc.so.6 (unknown line)
_start at ./run_c.exe (unknown line)
Allocations: 2997 (Pool: 2985; Big: 12); GC: 0
Segmentation fault (core dumped)

The original post follows.

main.c

#include <mpi.h>
#include <stdio.h>
#include <julia.h>

int main(int argc, char *argv[]) {
    int rank, size;
    char cmd[1024];

    // Initialize the MPI environment
    MPI_Init(&argc, &argv);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from process %d of %d\n", rank, size);

    int comm_id = MPI_Comm_c2f(MPI_COMM_WORLD);

    jl_init();


    (void) jl_eval_string("using MPI");

    (void) jl_eval_string("if MPI.Initialized() ;  println(\"MPI is initialized.\") ; else ; println(\"Warning: MPI is not initialized.\") ; end ");

    snprintf(cmd, sizeof cmd, "comm = MPI.Comm(%d)", comm_id);
    printf("Goinig to evaluate:\n");
    printf("%s\n", cmd);
    (void) jl_eval_string(cmd);


    (void) jl_eval_string("println(comm)");
    (void) jl_eval_string("println(MPI.Comm_rank(comm))");

    jl_atexit_hook(0);

    MPI_Finalize();

    return 0;
}
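
The listing above builds a Julia command in a fixed-size buffer and then prints it. A slightly more defensive way to do that step is sketched below. The helper name build_comm_cmd is hypothetical, and this only hardens the string handling; it does not address the invalid communicator itself.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: formats "comm = MPI.Comm(<handle>)" into buf.
 * Returns 0 on success, -1 if the text did not fit or formatting failed. */
int build_comm_cmd(char *buf, size_t buflen, int comm_handle)
{
    /* snprintf never writes more than buflen bytes and reports the
     * length the full string would have needed. */
    int n = snprintf(buf, buflen, "comm = MPI.Comm(%d)", comm_handle);

    if (n < 0 || (size_t)n >= buflen) {
        return -1; /* encoding error or truncated output */
    }
    return 0;
}
```

In main this could be used as: if build_comm_cmd(cmd, sizeof cmd, comm_id) returns 0, print the command with printf("%s\n", cmd) and pass it to jl_eval_string(cmd). Printing through a "%s" format keeps any '%' in the command from being interpreted as a conversion.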

使用以下代码进行编译:

mpicc main.c -I$JULIA_INC -L$JULIA_LIB -ljulia  -o run.exe

并与

一起运行
mpirun -np 2 ./run.exe

However, I get this error output:

Hello from process 0 of 2
Hello from process 1 of 2
MPI is initialized.
Goinig to evaluate:
comm = MPI.Comm(0)
MPI is initialized.
Goinig to evaluate:
comm = MPI.Comm(0)
MPI.Comm(0)
[exp-18-53:1727695] *** An error occurred in MPI_Comm_rank
[exp-18-53:1727695] *** reported by process [1988952065,1]
[exp-18-53:1727695] *** on communicator MPI_COMM_WORLD
[exp-18-53:1727695] *** MPI_ERR_COMM: invalid communicator
[exp-18-53:1727695] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[exp-18-53:1727695] ***    and potentially your MPI job)

c julia mpi
1 Answer

I found the problem myself. It happened because MPI.jl on the Julia side was not using the same MPI library as the C program.
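
One hint of such a mismatch is already visible in the minimal example's output: println(comm) shows MPI.Comm(1140850688), which is 0x44000000 in hexadecimal. The sketch below assumes that MPICH-style headers define MPI_COMM_WORLD as that integer handle, whereas Open MPI represents communicators as pointers. The macro and function names are hypothetical, and a match is only a hint, not proof.

```c
#include <stdio.h>

/* Assumption: MPICH-family mpi.h defines MPI_COMM_WORLD as the integer
 * handle 0x44000000; Open MPI uses a pointer to an internal object. */
#define MPICH_COMM_WORLD_HANDLE 0x44000000L

/* Hypothetical helper: returns 1 if the value shown by println(comm)
 * equals the MPICH-style MPI_COMM_WORLD handle, 0 otherwise. */
int looks_like_mpich_comm_world(long handle)
{
    return handle == MPICH_COMM_WORLD_HANDLE;
}

/* Hypothetical helper: prints a one-line diagnosis for a handle value. */
void describe_handle(long handle)
{
    if (looks_like_mpich_comm_world(handle)) {
        printf("%ld (0x%lx): MPICH-style MPI_COMM_WORLD handle\n",
               handle, (unsigned long)handle);
    } else {
        printf("%ld (0x%lx): not the MPICH-style MPI_COMM_WORLD handle\n",
               handle, (unsigned long)handle);
    }
}
```

Calling describe_handle(1140850688L) reports the MPICH-style case. That would be consistent with MPI.jl using MPICH-style constants while the process is linked against Open MPI.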

I fixed it by using MPIPreferences to point MPI.jl at the target MPI library, as recommended in the MPI.jl documentation:

(void) jl_eval_string("using MPIPreferences");
(void) jl_eval_string("MPIPreferences.use_system_binary(; library_names=[\"/home/t2hsu/miniconda3/envs/mpi/lib/libmpi\"]);");
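
The library path above is hard-coded. If the path has to come from elsewhere, for example an environment variable, the Julia command can be assembled at run time. The sketch below does this with a hypothetical helper, build_prefs_cmd, which escapes the characters that are special inside a Julia string literal. The path used in the usage note is a placeholder. As far as the MPIPreferences documentation describes it, the setting is stored in LocalPreferences.toml and takes effect after Julia is restarted, so this is likely a one-time configuration step and not something to run on every launch.

```c
#include <stdio.h>
#include <stddef.h>

/* Appends one character, always leaving room for the terminating NUL.
 * Returns 0 on success, -1 if the buffer is full. */
static int put_char(char *buf, size_t buflen, size_t *pos, char c)
{
    if (*pos + 1 >= buflen) {
        return -1;
    }
    buf[(*pos)++] = c;
    return 0;
}

/* Hypothetical helper: writes
 *   MPIPreferences.use_system_binary(; library_names=["<libpath>"])
 * into buf, backslash-escaping ", \ and $ inside the Julia string.
 * Returns 0 on success, -1 if buf is too small. */
int build_prefs_cmd(char *buf, size_t buflen, const char *libpath)
{
    static const char prefix[] =
        "MPIPreferences.use_system_binary(; library_names=[\"";
    static const char suffix[] = "\"])";
    size_t pos = 0;
    size_t i;

    for (i = 0; prefix[i] != '\0'; i++) {
        if (put_char(buf, buflen, &pos, prefix[i]) != 0) return -1;
    }
    for (i = 0; libpath[i] != '\0'; i++) {
        char c = libpath[i];
        if (c == '"' || c == '\\' || c == '$') {
            if (put_char(buf, buflen, &pos, '\\') != 0) return -1;
        }
        if (put_char(buf, buflen, &pos, c) != 0) return -1;
    }
    for (i = 0; suffix[i] != '\0'; i++) {
        if (put_char(buf, buflen, &pos, suffix[i]) != 0) return -1;
    }
    buf[pos] = '\0';
    return 0;
}
```

For example, with a placeholder path: build_prefs_cmd(cmd, sizeof cmd, "/opt/mpi/lib/libmpi") fills cmd, which can then be passed to jl_eval_string after "using MPIPreferences" has been evaluated.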
