Is it allowed/possible to call an R function or Fortran code within a #pragma omp parallel for loop in Rcpp?

Question

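In an Rcpp project I would like to either call an R function (the cobs function from the cobs package, to do a concave spline fit) or call the Fortran code it relies on, from within a #pragma omp parallel for loop. The cobs function uses quantreg's rq.fit.sfnc function to fit the constrained spline model, which in turn relies on the Fortran-coded srqfnc function in the quantreg package. The rest of my code mainly requires some simple linear algebra, so that part is no problem, but unfortunately each inner loop iteration also requires a concave spline fit. I am wondering whether this is allowed or possible, since I presume such calls are not thread-safe. Is there an easy fix, such as surrounding those calls with #pragma omp critical? Does anyone have an example of this? Or is the only option in this case to first do a full Rcpp port of the cobs and rq.fit.sfnc functions using thread-safe Armadillo classes?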

Answer 1

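Quoting the manual:

Calling any of the R API from threaded code is "for experts only" and strongly discouraged. Many functions in the R API modify internal R data structures and might corrupt these data structures if called simultaneously from multiple threads. Most R API functions can signal errors, which must only happen on the R main thread. Also, external libraries (e.g. LAPACK) may not be thread-safe.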

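I have always interpreted this as "one must not call an R API function from threaded code". Calling an R function from within an omp parallel region is exactly that, irrespective of what the function uses internally. Using #pragma omp critical might work, but if it breaks, you get to keep the pieces...

It would be safer to either reimplement the code in question or look for an existing implementation in C++/C/Fortran and call that directly.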
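A minimal sketch of what that last suggestion could look like, in plain C++ without Rcpp. Everything here is an assumption for illustration: smooth_column is a hypothetical stand-in (a 3-point moving average, not a concave spline fit), and the matrix is passed as a column-major std::vector<double>. The point is only that the loop body touches no R API, so each thread can work on its own column.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a native fitting routine: a 3-point moving
// average. It reads and writes only its arguments, so concurrent calls on
// different columns do not interfere with each other.
static void smooth_column(const double* y, double* out, int n) {
    for (int j = 0; j < n; ++j) {
        const int lo = (j > 0) ? j - 1 : j;
        const int hi = (j < n - 1) ? j + 1 : j;
        double sum = 0.0;
        for (int k = lo; k <= hi; ++k) sum += y[k];
        out[j] = sum / (hi - lo + 1);
    }
}

// M holds a rows x cols matrix in column-major order (assumed layout).
// Each iteration writes to a disjoint slice of `out`, so no locking is needed.
std::vector<double> fit_columns_parallel(const std::vector<double>& M,
                                         int rows, int cols, int nthreads) {
    assert(rows > 0 && cols > 0 && nthreads > 0);
    assert(M.size() == static_cast<std::size_t>(rows) * cols);
    std::vector<double> out(M.size());
    // Without OpenMP enabled the pragma is ignored and the loop runs serially.
    #pragma omp parallel for num_threads(nthreads)
    for (int i = 0; i < cols; ++i) {
        const std::size_t offset = static_cast<std::size_t>(i) * rows;
        smooth_column(M.data() + offset, out.data() + offset, rows);
    }
    return out;
}
```

In an Rcpp setting the same idea would mean copying the data out of R objects before the parallel region and converting the result back afterwards, so that no R object is created or modified inside the loop.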

Answer 2

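So I just tried it, and it seems that calling R functions within a #pragma omp parallel for loop only works if the call is preceded by #pragma omp critical (otherwise it causes a stack imbalance and crashes R). Of course this means that part of the code is executed sequentially, but it might still be useful in some cases.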
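To illustrate why a critical section serializes the work, here is a small plain C++ sketch (no R involved; the function name and counters are made up for this illustration). It records the largest number of threads ever present inside the critical block at the same time.

```cpp
#include <cassert>

// Returns the peak number of threads observed inside the critical section.
// `active` and `max_active` are only touched inside the critical block,
// so there is no data race on them.
int max_threads_in_critical(int iterations, int nthreads) {
    assert(iterations >= 0 && nthreads > 0);
    int active = 0;      // threads currently inside the critical section
    int max_active = 0;  // largest value `active` has reached
    #pragma omp parallel for num_threads(nthreads) shared(active, max_active)
    for (int i = 0; i < iterations; ++i) {
        #pragma omp critical
        {
            ++active;
            if (active > max_active) max_active = active;
            --active;
        }
    }
    return max_active;
}
```

However many threads are requested, the peak never exceeds one, which is why wrapping the whole loop body in a critical section gives no speed-up. Note that this only shows mutual exclusion; it says nothing about whether calling into R from a non-main thread is safe.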

Example:

Rcpp part, saved as file "fitMbycol.cpp":

// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// #define RCPP_ARMADILLO_RETURN_COLVEC_AS_VECTOR
using namespace Rcpp;
using namespace arma;
using namespace std;

#include <omp.h>
// [[Rcpp::plugins(openmp)]]

// [[Rcpp::export]]
arma::mat fitMbycol(arma::mat& M, Rcpp::Function f, const int nthreads) {

  // ARGUMENTS
  // M: matrix for which we want to fit given function f over each column
  // f: fitting function to use with one single argument (vector y) that returns the fitted values as a vector
  // nthreads: number of threads to use

  // we apply fitting function over columns
  int c = M.n_cols;
  int r = M.n_rows;
  arma::mat out(r,c);
  int i;
  omp_set_num_threads(nthreads);
#pragma omp parallel for shared(out)
  for (i = 0; i < c; i++) {
      arma::vec y = M.col(i); // ith column of M
#pragma omp critical
{
      out.col(i) = as<arma::colvec>(f(NumericVector(y.begin(),y.end())));
}
  }

  return out;

}

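Then in R:

First the pure R version:

(we simulate some Gaussian peak shapes with Poisson noise and then do a log-concave spline fit on each of them using the cobs function)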

x=1:100
n=length(x)
ncols=50
gauspeak=function(x, u, w, h=1) h*exp(((x-u)^2)/(-2*(w^2)))
Y_nonoise=do.call(cbind,lapply(seq(min(x), max(x), length.out=ncols), function (u) gauspeak(x, u=u, w=10, h=u*100)))
set.seed(123)
Y=apply(Y_nonoise, 2, function (col) rpois(n,col))

# log-concave spline fit on each column of matrix Y using cobs
require(cobs)
logconcobs = function(y, tau=0.5, nknots=length(y)/10) {
  x = 1:length(y)
  offs = max(y)*1E-6
  weights = y^(1/2)
  fit.y = suppressWarnings(cobs(x=x,y=log10(y+offs), 
              constraint = "concave", lambda=0, 
              knots = seq(min(x),max(x), length.out = nknots), 
              nknots=nknots, knots.add = FALSE, repeat.delete.add = FALSE,
              keep.data = FALSE, keep.x.ps = TRUE,
              w=weights, 
              tau=tau, print.warn = F, print.mesg = F, rq.tol = 0.1, maxiter = 100)$fitted)
  return(pmax(10^fit.y - offs, 0))
}
library(microbenchmark)
microbenchmark(Y.fitted <- apply(Y, 2, function(col) logconcobs(y=col, tau=0.5)),times=5L) # 363 ms, ie 363/50=7 ms per fit
matplot(Y,type="l",lty=1)
matplot(Y_nonoise,type="l",add=TRUE, lwd=3, col=adjustcolor("blue",alpha.f=0.2),lty=1)
matplot(Y.fitted,type="l",add=TRUE, lwd=3, col=adjustcolor("red",alpha.f=0.2),lty=1)

Now using Rcpp, calling our R fitting function logconcobs within a #pragma omp parallel for loop, enclosed in a #pragma omp critical:

library(Rcpp)
library(RcppArmadillo)
Rcpp::sourceCpp('fitMbycol.cpp')
microbenchmark(Y.fitted <- fitMbycol(Y, function (y) logconcobs(y, tau=0.5, nknots=10), nthreads=8L ), times=5L) # 361 ms

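In this case OpenMP of course has no effect, because #pragma omp critical causes everything to be executed sequentially, but in more complex examples this could still be useful.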
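One way such a more complex case might be structured, sketched in plain C++ under stated assumptions: the thread-safe work runs in a parallel loop, and the callback that is not thread-safe is then invoked in an ordinary serial loop on the calling thread. The names fit_two_phase, Column and Callback are hypothetical, the std::function stands in for the R function, and the doubling step is only a placeholder for the linear algebra.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using Column = std::vector<double>;
// Stand-in for a callback that must only run on the calling thread.
using Callback = std::function<Column(const Column&)>;

std::vector<Column> fit_two_phase(const std::vector<Column>& cols,
                                  const Callback& f, int nthreads) {
    assert(nthreads > 0);
    const int n = static_cast<int>(cols.size());
    std::vector<Column> prepared(cols.size());

    // Phase 1: thread-safe preprocessing, one column per iteration.
    // Placeholder work: scale every entry by 2.
    #pragma omp parallel for num_threads(nthreads)
    for (int i = 0; i < n; ++i) {
        Column tmp(cols[i]);
        for (double& v : tmp) v *= 2.0;
        prepared[i] = std::move(tmp);
    }

    // Phase 2: serial loop outside the parallel region, so the callback
    // is only ever invoked from the thread that called this function.
    std::vector<Column> out(cols.size());
    for (int i = 0; i < n; ++i) {
        out[i] = f(prepared[i]);
    }
    return out;
}
```

Compared with a critical section inside the parallel loop, this costs one extra buffer but keeps the callback off the worker threads entirely, which is closer to what the manual quoted in Answer 1 asks for.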
