Why is crossprod slower than %*%?


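Over the past few days I have been experimenting with the computation time of an algorithm I am coding, and I wanted to measure the improvement that crossprod gives over %*%. To my surprise, my algorithm runs faster with %*%.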

So I decided to compare the two routines on general matrices using microbenchmark() (and also system.time()), and got the following results:

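library(microbenchmark)

M <- 1000
K <- 100
A <- matrix(rnorm(M*K, 10, 1), ncol = M)  # K x M matrix
b <- rnorm(K, 10, 1)
d <- rnorm(K, 10, 1)  # assumed: d is used below but was not defined in the post
B <- matrix(rnorm(M*K, 10, 1), ncol = M)  # K x M matrix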

microbenchmark(crossprod(A, A), t(A)%*%A, times = 1000, unit="ms")
Unit: milliseconds
            expr       min        lq     mean    median       uq      max neval cld
 crossprod(A, A) 112.58885 121.05406 149.8290 129.31873 147.6489 358.3164  1000   b
      t(A) %*% A  76.77698  81.68934 108.0526  89.50015 105.9617 304.7395  1000  a 

microbenchmark(crossprod(A), t(A)%*%A, times = 1000, unit="ms")
Unit: milliseconds
         expr      min       lq     mean   median       uq      max neval cld
 crossprod(A) 58.26374 61.56330 69.35781 64.65561 71.42403 314.9268  1000  a 
   t(A) %*% A 76.97771 81.80069 92.21863 85.75894 93.50332 273.5133  1000   b

microbenchmark(crossprod(A, B), t(A)%*%B, times = 1000, unit="ms")
Unit: milliseconds
            expr       min       lq      mean    median        uq      max neval cld
 crossprod(A, B) 109.27471 111.6751 118.00118 112.97533 117.55815 284.6910  1000   b
      t(A) %*% B  74.36276  77.0441  83.33582  77.89172  82.58609 258.3154  1000  a

microbenchmark(crossprod(A, b), t(A)%*%b, times = 1000, unit="ms")
Unit: milliseconds
            expr      min        lq      mean    median       uq       max neval cld
 crossprod(A, b) 0.149644 0.1553795 0.1884534 0.1577500 0.167333  6.737466  1000  a 
      t(A) %*% b 0.338180 0.6239705 0.8052485 0.6423505 0.678017 13.011479  1000   b

microbenchmark(crossprod(b, d), t(b)%*%d, times = 1000, unit="ms")
Unit: milliseconds
            expr      min        lq        mean   median        uq      max neval cld
 crossprod(b, d) 0.000814 0.0009130 0.001153643 0.000973 0.0010740 0.018912  1000  a 
      t(b) %*% d 0.002547 0.0029275 0.003554290 0.003087 0.0033005 0.057184  1000   b

microbenchmark(crossprod(b), t(b)%*%b, times = 1000, unit="ms")
Unit: milliseconds
         expr      min       lq        mean   median       uq      max neval cld
 crossprod(b) 0.000758 0.000801 0.000866091 0.000848 0.000883 0.004277  1000  a 
   t(b) %*% b 0.002546 0.002686 0.002872259 0.002785 0.002898 0.033779  1000   b

Clearly, at least on my machine, crossprod is only faster when working with vectors, or when squaring a matrix without specifying the y argument.
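
The single-argument case can be checked directly. The sketch below uses small, hypothetical dimensions (not the ones from the benchmark) and an assumed seed. It only confirms that crossprod(A) matches both crossprod(A, A) and t(A) %*% A, and that the result is symmetric. Symmetry is a property that a one-argument implementation could exploit, but whether that explains the timings above depends on the BLAS in use.

```r
set.seed(1)                                  # assumed seed, for reproducibility only
K <- 20; M <- 50                             # hypothetical small sizes
A <- matrix(rnorm(M * K, 10, 1), ncol = M)   # K x M, built as in the question

G1 <- crossprod(A)       # one-argument form
G2 <- crossprod(A, A)    # two-argument form with y = A
G3 <- t(A) %*% A         # explicit transpose, then multiply

# All three should agree up to floating-point tolerance
same_12 <- isTRUE(all.equal(G1, G2))
same_13 <- isTRUE(all.equal(G1, G3))

# A'A is symmetric by construction
sym <- isSymmetric(G1)

print(c(same_12 = same_12, same_13 = same_13, symmetric = sym))
```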

I know that, in theory, crossprod should generally be faster, so how is this possible?
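
For anyone who wants to reproduce the comparison without installing microbenchmark, here is a minimal sketch that uses only system.time() from base R. The seed, the smaller matrix sizes, and the repetition count `reps` are assumptions chosen to keep the run short; they are not the values from the post. Timings will vary by machine and BLAS.

```r
set.seed(42)                                 # assumed seed
M <- 400; K <- 100                           # smaller than the sizes in the question
A <- matrix(rnorm(M * K, 10, 1), ncol = M)   # K x M
B <- matrix(rnorm(M * K, 10, 1), ncol = M)   # K x M

reps <- 20                                   # hypothetical repetition count

# Time each variant over the same number of repetitions
time_crossprod <- system.time(for (i in seq_len(reps)) C1 <- crossprod(A, B))
time_matmul    <- system.time(for (i in seq_len(reps)) C2 <- t(A) %*% B)

# Elapsed wall-clock seconds for each variant
elapsed <- c(crossprod = time_crossprod[["elapsed"]],
             matmul    = time_matmul[["elapsed"]])
print(elapsed)

# Both variants should produce the same M x M result
results_match <- isTRUE(all.equal(C1, C2))
print(results_match)
```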

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Manjaro Linux

Matrix products: default
BLAS:   /usr/lib/libblas.so.3.8.0
LAPACK: /usr/lib/liblapack.so.3.8.0
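
Since the relative speed of the two calls may depend on which BLAS R is linked against, it can help to report that explicitly. The sketch below uses base R functions to print the BLAS and LAPACK libraries in use, plus the matprod option, which ?options documents as selecting the implementation of %*%, crossprod and tcrossprod. The variable names are illustrative only.

```r
# Path of the BLAS shared library, if known (may be an empty string)
blas_path <- extSoftVersion()[["BLAS"]]

# Path and version of the LAPACK library in use
lapack_path <- La_library()
lapack_ver  <- La_version()

# Current matrix-product implementation setting (see ?options, "matprod")
matprod_opt <- getOption("matprod")

print(list(BLAS = blas_path,
           LAPACK = lapack_path,
           LAPACK_version = lapack_ver,
           matprod = matprod_opt))
```

Rerunning the benchmark after switching to an optimized BLAS, or after changing options(matprod = ...), would show whether the gap is specific to this setup.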
1 Answer

I have run into the same problem. Did you ever find the cause?

# both matrices have elements generated out of a standard random normal, not sparse at all
r$> dim(mat1)
[1] 100 600

r$> dim(mat2)
[1] 100 350

r$> microbenchmark::microbenchmark(crossprod(mat1, mat2), t(mat1) %*% mat2)
Unit: milliseconds
                  expr     min       lq      mean   median      uq     max neval
 crossprod(mat1, mat2) 11.7534 11.82795 12.362565 11.89140 12.1628 26.4312   100
      t(mat1) %*% mat2  7.8864  8.01000  8.438531  8.09025  8.2568 20.2051   100

Running on:

r$> version
               _
platform       x86_64-w64-mingw32
arch           x86_64
os             mingw32
crt            ucrt
system         x86_64, mingw32
status
major          4
minor          4.1
year           2024
month          06
day            14
svn rev        86737
language       R
version.string R version 4.4.1 (2024-06-14 ucrt)
nickname       Race for Your Life