在并行循环中,我使用子例程处理共享数组,将数组和当前私有索引作为参数传递给该子例程,但程序因数组越界错误而崩溃。如何正确调用子程序来处理共享数组并将并行循环索引传递给它?
main中的代码.F:
PROGRAM TESTER
USE OMP_LIB
USE PRINTER
INTEGER, PARAMETER:: N = 5
REAL*4,DIMENSION(:),ALLOCATABLE, SAVE :: ARG_1, ARG_2
REAL*4,DIMENSION(:),ALLOCATABLE:: RES
C=======================================================================
C$OMP THREADPRIVATE(ARG_1, ARG_2)
C=======================================================================
ALLOCATE(RES(N))
PRINT *,'MAIN: "RES" IS ALLOCATED = ',
> ALLOCATED(RES)
C$OMP PARALLEL PRIVATE(I) SHARED(RES) NUM_THREADS(2)
ALLOCATE(ARG_1(N))
PRINT *,'MAIN: "ARG_1" IS ALLOCATED = ',
> ALLOCATED(ARG_1)
ALLOCATE(ARG_2(N))
PRINT *,'MAIN: "ARG_2" IS ALLOCATED = ',
> ALLOCATED(ARG_2)
C Step 1:Initialize working arrays:
CALL WORK1(ARG_1,N, ARG_2,N)
CALL WORK2(ARG_1,N, ARG_2,N)
C Step 2: Print working arrays:
CALL PRINT_ARR(ARG_1,N)
CALL PRINT_ARR(ARG_2,N)
PRINT *,'===================================='
C Step 3: Parallel Loop:
c-----------------------------------------------------------------------
C$OMP DO
DO I=1,N
CALL WORK3(RES,I,ARG_1(I),ARG_2(I))
ENDDO
C$OMP END DO
CALL PRINT_ARR(RES,N)
c-----------------------------------------------------------------------
C$OMP END PARALLEL
DEALLOCATE(ARG_1,ARG_2)
DEALLOCATE(RES)
END PROGRAM TESTER
工作代码.F文件:
SUBROUTINE WORK1(ARG_ARR_1,DIM_1,ARG_ARR_2,DIM_2)
INTEGER DIM_1, DIM_2,I,J
REAL*4 ARG_ARR_2(DIM_2)
REAL*4 ARG_ARR_1(DIM_1)
REAL*4 ARG1, ARG2
REAL*4,DIMENSION(:),ALLOCATABLE:: ARG_ARR_3
SAVE
c-----------------------------------------------------------------------
C$OMP THREADPRIVATE (I)
c-----------------------------------------------------------------------
DO I=1,DIM_1
ARG_ARR_1(I)= 1.0
ENDDO
RETURN
ENTRY WORK2 (ARG_ARR_1,DIM_1,ARG_ARR_2,DIM_2)
DO I=1,DIM_2
ARG_ARR_2(I)= 2.0
ENDDO
RETURN
ENTRY WORK3 (ARG_ARR_3,J,ARG1,ARG2)
ARG_ARR_3(J)= ARG1+ARG2
RETURN
END SUBROUTINE WORK1
和 module.f 代码:
MODULE PRINTER
CONTAINS
SUBROUTINE PRINT_ARR(ARR_VAR,SIZE)
REAL*4,DIMENSION(:),ALLOCATABLE:: ARR_VAR
INTEGER SIZE
INTEGER,SAVE:: J
c-----------------------------------------------------------------------
C$OMP THREADPRIVATE(J)
c-----------------------------------------------------------------------
DO J=1,SIZE
PRINT *,'ARR_VAR(',J,')=',ARR_VAR(J)
ENDDO
FLUSH(6)
END SUBROUTINE PRINT_ARR
END MODULE PRINTER
我的编译和运行命令:
gfortran -fopenmp -O0 -g -fcheck=all -fbacktrace -c module1.f work.F main.F
gfortran -fopenmp *.o -o a.x
./a.x
我的输出:
MAIN: "RES" IS ALLOCATED = T
MAIN: "ARG_1" IS ALLOCATED = T
MAIN: "ARG_2" IS ALLOCATED = T
ARR_VAR( 1 )= 1.00000000
ARR_VAR( 2 )= 1.00000000
ARR_VAR( 3 )= 1.00000000
ARR_VAR( 4 )= 1.00000000
ARR_VAR( 5 )= 1.00000000
ARR_VAR( 1 )= 2.00000000
ARR_VAR( 2 )= 2.00000000
ARR_VAR( 3 )= 2.00000000
ARR_VAR( 4 )= 2.00000000
ARR_VAR( 5 )= 2.00000000
====================================
MAIN: "ARG_1" IS ALLOCATED = T
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
MAIN: "ARG_2" IS ALLOCATED = T
ARR_VAR( 1 )= 1.00000000
ARR_VAR( 2 )= 1.00000000
ARR_VAR( 3 )= 1.00000000
ARR_VAR( 4 )= 1.00000000
ARR_VAR( 5 )= 1.00000000
ARR_VAR( 1 )= 2.00000000
ARR_VAR( 2 )= 2.00000000
ARR_VAR( 3 )= 2.00000000
ARR_VAR( 4 )= 2.00000000
ARR_VAR( 5 )= 2.00000000
====================================
At line 21 of file work.F
Fortran runtime error: Index '4' of dimension 1 of array 'arg_arr_3' above upper bound of 2
Error termination. Backtrace:
#0 0x7f1e90ed3ad0 in ???
#1 0x7f1e90ed2c35 in ???
#2 0x7f1e90c8051f in ???
at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
#3 0x55d38d47e43d in master.0.work1
at .../work.F:21
#4 0x55d38d47e04f in work3_
at .../work.F:20
#5 0x55d38d47dae6 in MAIN__._omp_fn.0
at .../main.F:44
#6 0x7f1e90e7aa15 in ???
#7 0x55d38d47d45b in tester
at .../main.F:18
#8 0x55d38d47d58d in main
at .../main.F:2
Segmentation fault (core dumped)
我使用gfortran:gcc版本11.4.0(Ubuntu 11.4.0-1ubuntu1~22.04)
我没有在这里带来完整的解决方案,但很难在评论中详细说明。
SAVE
中的WORK1()
语句,它是多线程的潜在杀手。I
了。ARG_ARR_3
参数上的 allocate 属性(并且无论如何它都不会工作,除非您将例程放入模块中): SUBROUTINE WORK1(ARG_ARR_1,DIM_1,ARG_ARR_2,DIM_2)
INTEGER DIM_1, DIM_2,I,J
REAL*4 ARG_ARR_2(DIM_2)
REAL*4 ARG_ARR_1(DIM_1)
REAL*4 ARG1, ARG2
REAL*4 ARG_ARR_3(*)
DO I=1,DIM_1
ARG_ARR_1(I)= 1.0
ENDDO
RETURN
ENTRY WORK2 (ARG_ARR_1,DIM_1,ARG_ARR_2,DIM_2)
DO I=1,DIM_2
ARG_ARR_2(I)= 2.0
ENDDO
RETURN
ENTRY WORK3 (ARG_ARR_3,J,ARG1,ARG2)
ARG_ARR_3(J)= ARG1+ARG2
RETURN
END SUBROUTINE WORK1
此外,在你的主程序中,
THREADPRIVATE(ARG_1, ARG_2)
是多余的:threadprivate
旨在获取并行区域之间的持久私有变量。我看不出这里有什么必要。保持简单并声明:
C$OMP PARALLEL PRIVATE(I,ARG_1,ARG_2) SHARED(RES) NUM_THREADS(2)
最后,
DEALLOCATE(ARG_1,ARG_2)
应放置在平行区域末尾之前。
尝试一下...但这绝对是一个糟糕的设计(
ENTRY
是过去的复兴,也是固定形式的源)。