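I am trying to use OpenMP to parallelize a for loop inside a while loop, and the program hangs intermittently, particularly when the condition variable approaches 1. Here is a simplified code snippet: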
#include <omp.h>
#include <stdio.h>

void task(int thread_id, int condition) {
    printf("Hello from thread %d with condition %d\n", thread_id, condition);
}

int main() {
    int condition = 5;
    #pragma omp parallel num_threads(4) shared(condition)
    {
        while(condition) {
            #pragma omp master
            {
                // Update condition in master thread
                condition--;
                #pragma omp flush(condition)
            }
            #pragma omp barrier // Wait for master to update condition
            #pragma omp for
            for(int i = 0; i < 4; ++i) {
                task(omp_get_thread_num(), condition);
            }
        }
    }
    return 0;
}
Example output (the program hangs when condition is 1; ideally it should stop when condition equals 0):
Hello from thread 0 with condition 4
Hello from thread 1 with condition 4
Hello from thread 2 with condition 4
Hello from thread 3 with condition 4
Hello from thread 3 with condition 3
Hello from thread 1 with condition 3
Hello from thread 0 with condition 3
Hello from thread 2 with condition 3
Hello from thread 2 with condition 2
Hello from thread 3 with condition 2
Hello from thread 1 with condition 2
Hello from thread 0 with condition 2
Hello from thread 1 with condition 1
Hello from thread 3 with condition 1
Hello from thread 0 with condition 1
Hello from thread 2 with condition 1
...
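Despite using #pragma omp flush(condition) to ensure visibility of the condition variable and #pragma omp barrier to synchronize the threads, the program sometimes stalls in what looks like a deadlock, particularly when condition reaches 1. The while loop does not exit as expected.
I also tried adding an extra barrier at the end of the while loop, but the problem persists. Could this be caused by a deadlock or some other synchronization issue? Any insights or solutions would be greatly appreciated. Thanks in advance!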
Compiling the code with ThreadSanitizer as follows and running it reports a data race in your code:
$ clang -O3 -g -fopenmp -fsanitize=thread so-78260058.c
$ ./a.out
WARNING: ThreadSanitizer: data race (pid=80729)
Read of size 4 at 0x7ffee9bf9638 by thread T2:
#0 main.omp_outlined_debug__ so-78260058.c:12:15 (a.out+0xe60b8)
#1 main.omp_outlined so-78260058.c:10:5 (a.out+0xe6345)
#2 __kmp_invoke_microtask <null> (libomp.so+0xbcbd2)
#3 main so-78260058.c:10:5 (a.out+0xe6058)
Previous write of size 4 at 0x7ffee9bf9638 by main thread:
#0 main.omp_outlined_debug__ so-78260058.c:16:26 (a.out+0xe6101)
#1 main.omp_outlined so-78260058.c:10:5 (a.out+0xe6345)
#2 __kmp_invoke_microtask <null> (libomp.so+0xbcbd2)
#3 main so-78260058.c:10:5 (a.out+0xe6058)
Location is stack of main thread.
Location is global '??' at 0x7ffee9bdb000 ([stack]+0x1e638)
Thread T2 (tid=80732, running) created by main thread at:
#0 pthread_create llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1020:234 (a.out+0x5f1bb)
#1 __kmp_create_worker <null> (libomp.so+0x9c096)
SUMMARY: ThreadSanitizer: data race so-78260058.c:12:15 in main.omp_outlined_debug__
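The report is about the read of condition in the while(condition) test on line 12 and the write condition--; on line 16. The column numbers in the report also point directly at the condition variable. As discussed in the comments, the minimal change that fixes the data race is to add a barrier before the master construct. The flush is redundant, because a barrier already implies a flush, and the barrier after the master construct is also unnecessary because of the implicit barrier at the end of the for region.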
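The barrier-based change described above can be sketched as follows. This is a minimal sketch, not the exact code from the comments: run_rounds and its call counter are hypothetical names added so the result can be checked, and the #ifdef _OPENMP fallback is an assumption so the file also builds without -fopenmp. The sketch drops the flush and adds a barrier before the master construct. It conservatively keeps the barrier after the master construct, because task() reads condition inside the for loop; that choice is an assumption of this sketch.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback (assumption) so the sketch also builds without -fopenmp. */
static int omp_get_thread_num(void) { return 0; }
#endif

void task(int thread_id, int condition) {
    printf("Hello from thread %d with condition %d\n", thread_id, condition);
}

/* Hypothetical wrapper around the question's loop.
   Returns the total number of task() calls made by all threads. */
int run_rounds(int condition) {
    int calls = 0;
    #pragma omp parallel num_threads(4) shared(condition) reduction(+:calls)
    {
        while (condition) {
            /* Added barrier: every thread has evaluated while(condition)
               before the master thread is allowed to change it. */
            #pragma omp barrier
            #pragma omp master
            {
                condition--;
            }
            /* Kept in this sketch: task() below reads condition,
               so the update must be complete before the loop starts. */
            #pragma omp barrier
            #pragma omp for
            for (int i = 0; i < 4; ++i) {
                task(omp_get_thread_num(), condition);
                calls++;
            }
            /* Implicit barrier at the end of the for region. */
        }
    }
    return calls;
}
```

Calling run_rounds(5) from main should print 20 lines, four for each value of condition from 4 down to 0, and return 20. Rebuilding with -fsanitize=thread, as in the command above, is a reasonable way to confirm that the report on condition no longer appears.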
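The following version, which keeps the while loop serial and opens a parallel for region in each iteration, is easier to read: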
#include <omp.h>
#include <stdio.h>

void task(int thread_id, int condition) {
    printf("Hello from thread %d with condition %d\n", thread_id, condition);
}

int main() {
    int condition = 5;
    while (condition) {
        condition--;
        #pragma omp parallel for num_threads(4)
        for (int i = 0; i < 4; ++i) {
            task(omp_get_thread_num(), condition);
        }
    }
    return 0;
}
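Since most OpenMP implementations maintain a thread pool, the overhead of spawning a parallel region is not significantly higher than that of the two necessary barriers. (If you use nested parallelism, check the KMP_HOT_TEAMS_MAX_LEVEL environment variable of the LLVM/Intel OpenMP runtime.)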
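To check the overhead claim on a particular machine and runtime, one could time both structures. The sketch below is a rough micro-benchmark under stated assumptions: time_separate_regions and time_single_region are hypothetical helper names, the loop body is a trivial sum rather than the question's task(), and the #ifdef _OPENMP fallback timer exists only so the file builds without -fopenmp. Timings depend on the runtime, thread count, and system load, so no expected numbers are given.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
#include <time.h>
/* Serial fallback timer (assumption) so the sketch builds without -fopenmp. */
static double omp_get_wtime(void) { return (double)clock() / CLOCKS_PER_SEC; }
#endif

/* Hypothetical helper: opens a new parallel for region in every round.
   Returns elapsed seconds and stores the accumulated sum in *sum_out. */
double time_separate_regions(int rounds, long *sum_out) {
    long sum = 0;
    double start = omp_get_wtime();
    for (int r = 0; r < rounds; ++r) {
        #pragma omp parallel for num_threads(4) reduction(+:sum)
        for (int i = 0; i < 4; ++i) {
            sum += i;
        }
    }
    *sum_out = sum;
    return omp_get_wtime() - start;
}

/* Hypothetical helper: one parallel region, two barriers per round
   (one explicit, one implicit at the end of the for region). */
double time_single_region(int rounds, long *sum_out) {
    long sum = 0;
    double start = omp_get_wtime();
    #pragma omp parallel num_threads(4) reduction(+:sum)
    {
        for (int r = 0; r < rounds; ++r) {
            #pragma omp barrier
            #pragma omp for
            for (int i = 0; i < 4; ++i) {
                sum += i;
            }
        }
    }
    *sum_out = sum;
    return omp_get_wtime() - start;
}
```

Both helpers do the same work, so the two sums should match; only the elapsed times are expected to differ. Running each with a large rounds value several times and comparing the results gives a rough idea of whether the extra region startup cost matters for a given workload.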