How do I compare timings between sequential and parallel code using OpenMP?

Question

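I am trying to measure the speedup of a parallel code for different numbers of threads, where speedup is the ratio of the sequential algorithm's run time to the parallel algorithm's run time. I am using OpenMP and FFTW in C++, with omp_get_wtime() to measure the parallel time and clock() to measure the sequential time.

Initially, I computed the speedup by dividing the parallel time with 1 thread by the parallel time with each other thread count, on the assumption that the 1-thread parallel time equals the sequential time. However, I noticed that the sequential time reported by clock() changes with the number of threads, and now I am not sure how to actually compute my speedup.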
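The ratio described above can be written down as a minimal sketch. The function names `speedup` and `efficiency` are hypothetical helpers introduced only for illustration, and the sketch assumes both times were taken with the same wall-clock timer.

```cpp
#include <stdexcept>

// Speedup S(p) = T(1) / T(p).
// Assumption: both arguments are wall-clock times in seconds,
// measured with the same timer (for example omp_get_wtime()).
double speedup(double t_one_thread, double t_p_threads) {
    if (t_one_thread <= 0.0 || t_p_threads <= 0.0) {
        throw std::invalid_argument("times must be positive");
    }
    return t_one_thread / t_p_threads;
}

// Parallel efficiency E(p) = S(p) / p, where p is the thread count.
double efficiency(double t_one_thread, double t_p_threads, int p) {
    if (p <= 0) {
        throw std::invalid_argument("thread count must be positive");
    }
    return speedup(t_one_thread, t_p_threads) / p;
}
```

With the omp_get_wtime() values reported in this question (0.0231161 s for 1 thread, 0.0132717 s for 2 threads), `speedup(0.0231161, 0.0132717)` gives roughly 1.74, and `efficiency(0.0231161, 0.0132717, 2)` roughly 0.87.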

Example:

static const int nx = 128;
static const int ny = 128;
static const int nz = 128;

double start_time, run_time;
int nThreads = 1; // number of FFTW threads used for this run

// Allocate the input array and copy the source data (Re) into it
fftw_complex *input_array;
input_array = (fftw_complex*) fftw_malloc((nx*ny*nz) * sizeof(fftw_complex));
memcpy(input_array, Re.data(), (nx*ny*nz) * sizeof(fftw_complex));

fftw_complex *output_array;
output_array = (fftw_complex*) fftw_malloc((nx*ny*nz) * sizeof(fftw_complex));

// Start both timers: omp_get_wtime() and clock()
start_time = omp_get_wtime();
clock_t start_time1 = clock();

// Timed region: thread setup, planning, execution and cleanup
fftw_init_threads();
fftw_plan_with_nthreads(nThreads); //omp_get_max_threads()
fftw_plan forward = fftw_plan_dft_3d(nx, ny, nz, input_array, output_array, FFTW_FORWARD, FFTW_ESTIMATE);
fftw_execute(forward);
fftw_destroy_plan(forward);
fftw_cleanup();

// Stop both timers
run_time = omp_get_wtime() - start_time;
clock_t end1 = clock();

cout << " Parallel Time in s: " <<  run_time << "s\n";
cout << "Serial Time in s: " <<  (double)(end1-start_time1) / CLOCKS_PER_SEC << "s\n";

// Copy the transform result into Im and release the buffers
memcpy(Im.data(), output_array, (nx*ny*nz) * sizeof(fftw_complex));

fftw_free(input_array);
fftw_free(output_array);

The code above produces the following results.

For 1 thread:

Parallel Time in s: 0.0231161s
Serial Time in s: 0.023115s

Speedup = 1, which makes sense.

For 2 threads (roughly a 2x speedup):

Parallel Time in s: 0.0132717s
Serial Time in s: 0.025434s

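And so on. So my question is: why does the serial time increase with the number of threads? Or should I use only omp_get_wtime() to measure the speedup and treat the 1-thread run as my sequential time? I am quite confused about the performance of the code above: it is either 5 to 6 times faster (which matches the number of cores on my machine) or only about twice as fast, depending on how I compute the sequential time.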
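To compare what the two kinds of timer report, here is a minimal sketch that records wall-clock time and `std::clock()` processor time around the same region. `Timing`, `time_region` and `busy_sum` are hypothetical names introduced for illustration, and `std::chrono::steady_clock` stands in for omp_get_wtime() so that the sketch also builds without OpenMP (the pragma is then ignored and the loop runs serially). On POSIX systems `clock()` typically reports CPU time summed over all threads of the process, which would be consistent with the clock() value above staying flat or growing as threads are added; some platforms behave differently, so treat this as an assumption to verify on your own system.

```cpp
#include <chrono>
#include <ctime>

// Wall-clock and CPU time (both in seconds) for one measured region.
struct Timing {
    double wall_s;
    double cpu_s;
};

// Runs work() once and records both timers around it.
template <typename F>
Timing time_region(F&& work) {
    const std::clock_t c0 = std::clock();
    const auto w0 = std::chrono::steady_clock::now();
    work();
    const auto w1 = std::chrono::steady_clock::now();
    const std::clock_t c1 = std::clock();
    return { std::chrono::duration<double>(w1 - w0).count(),
             static_cast<double>(c1 - c0) / CLOCKS_PER_SEC };
}

// Hypothetical CPU-bound workload; parallel only when compiled with OpenMP.
double busy_sum(long n) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += 1.0 / (1.0 + static_cast<double>(i));
    }
    return sum;
}
```

Calling `time_region([]{ busy_sum(100000000L); })` with different values of OMP_NUM_THREADS and printing `wall_s` next to `cpu_s` shows how the two numbers diverge. For a speedup figure, the wall-clock values are the ones to divide.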
