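Over the past few days I have been experimenting with the computation time of an algorithm I am coding, and I wanted to measure how much of an improvement crossprod gives over %*%. To my surprise, my algorithm runs faster with %*%.
So I decided to compare the two routines on generic matrices using microbenchmark() (and system.time()), and got the following results: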
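library(microbenchmark)
M <- 1000
K <- 100
A <- matrix(rnorm(M*K, 10, 1), ncol = M)  # K x M, i.e. 100 x 1000
b <- rnorm(K, 10, 1)
d <- rnorm(K, 10, 1)  # assumed: d is used below but was not defined in the original post
B <- matrix(rnorm(M*K, 10, 1), ncol = M)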
microbenchmark(crossprod(A, A), t(A)%*%A, times = 1000, unit="ms")
Unit: milliseconds
expr min lq mean median uq max neval cld
crossprod(A, A) 112.58885 121.05406 149.8290 129.31873 147.6489 358.3164 1000 b
t(A) %*% A 76.77698 81.68934 108.0526 89.50015 105.9617 304.7395 1000 a
microbenchmark(crossprod(A), t(A)%*%A, times = 1000, unit="ms")
Unit: milliseconds
expr min lq mean median uq max neval cld
crossprod(A) 58.26374 61.56330 69.35781 64.65561 71.42403 314.9268 1000 a
t(A) %*% A 76.97771 81.80069 92.21863 85.75894 93.50332 273.5133 1000 b
microbenchmark(crossprod(A, B), t(A)%*%B, times = 1000, unit="ms")
Unit: milliseconds
expr min lq mean median uq max neval cld
crossprod(A, B) 109.27471 111.6751 118.00118 112.97533 117.55815 284.6910 1000 b
t(A) %*% B 74.36276 77.0441 83.33582 77.89172 82.58609 258.3154 1000 a
microbenchmark(crossprod(A, b), t(A)%*%b, times = 1000, unit="ms")
Unit: milliseconds
expr min lq mean median uq max neval cld
crossprod(A, b) 0.149644 0.1553795 0.1884534 0.1577500 0.167333 6.737466 1000 a
t(A) %*% b 0.338180 0.6239705 0.8052485 0.6423505 0.678017 13.011479 1000 b
microbenchmark(crossprod(b, d), t(b)%*%d, times = 1000, unit="ms")
Unit: milliseconds
expr min lq mean median uq max neval cld
crossprod(b, d) 0.000814 0.0009130 0.001153643 0.000973 0.0010740 0.018912 1000 a
t(b) %*% d 0.002547 0.0029275 0.003554290 0.003087 0.0033005 0.057184 1000 b
microbenchmark(crossprod(b), t(b)%*%b, times = 1000, unit="ms")
Unit: milliseconds
expr min lq mean median uq max neval cld
crossprod(b) 0.000758 0.000801 0.000866091 0.000848 0.000883 0.004277 1000 a
t(b) %*% b 0.002546 0.002686 0.002872259 0.002785 0.002898 0.033779 1000 b
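Clearly, at least on my machine, crossprod is only faster when working with vectors, or when computing the cross-product of a matrix with itself without specifying the y argument.
I know that in theory crossprod should generally be faster, so how is this possible?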
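For anyone who wants to reproduce this without extra packages, here is a minimal sketch using only base R. It checks that crossprod() and the explicit t(x) %*% y form agree numerically, then takes a rough elapsed-time reading with system.time(). The seed, the helper time_it, the repetition count, and the definition of d are assumptions added for illustration; the timings it prints will differ by machine and BLAS.

```r
# Sketch: same setup as the question, base R only.
set.seed(1)                                  # assumption: seed added for reproducibility
M <- 1000
K <- 100
A <- matrix(rnorm(M * K, 10, 1), ncol = M)   # K x M, i.e. 100 x 1000
B <- matrix(rnorm(M * K, 10, 1), ncol = M)
b <- rnorm(K, 10, 1)
d <- rnorm(K, 10, 1)                         # assumption: a second length-K vector

# crossprod(x, y) computes t(x) %*% y, so the results should agree
# up to floating-point tolerance.
same_AB <- isTRUE(all.equal(crossprod(A, B), t(A) %*% B))
same_AA <- isTRUE(all.equal(crossprod(A), t(A) %*% A))
same_bd <- isTRUE(all.equal(crossprod(b, d), t(b) %*% d))

# Hypothetical helper: total elapsed seconds over a few repetitions.
time_it <- function(f, reps = 5) {
  system.time(for (i in seq_len(reps)) f())[["elapsed"]]
}

timings <- c(
  crossprod = time_it(function() crossprod(A, B)),
  explicit  = time_it(function() t(A) %*% B)
)
print(c(same_AB = same_AB, same_AA = same_AA, same_bd = same_bd))
print(timings)
```

This is much coarser than microbenchmark(), so it is only useful for checking that the direction of the difference holds on another machine.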
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Manjaro Linux
Matrix products: default
BLAS: /usr/lib/libblas.so.3.8.0
LAPACK: /usr/lib/liblapack.so.3.8.0
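Since matrix products in R are delegated to whatever BLAS the session is linked against, it may be worth confirming which library is actually in use when comparing timings across machines. The sketch below only reports that information; it does not establish that the BLAS is the cause of the difference. The variable names are illustrative, and the BLAS entry can be empty on some builds.

```r
# Sketch: report the BLAS/LAPACK libraries and the matrix-product
# setting for the current R session (available in R >= 3.4.0).
info <- sessionInfo()

blas_path      <- info$BLAS      # BLAS shared library path, may be NULL or ""
lapack_path    <- La_library()   # LAPACK shared library path
lapack_version <- La_version()   # LAPACK version string
matprod_mode   <- info$matprod   # the "Matrix products:" line shown by sessionInfo()

cat("BLAS:           ", blas_path, "\n")
cat("LAPACK:         ", lapack_path, "\n")
cat("LAPACK version: ", lapack_version, "\n")
cat("Matrix products:", matprod_mode, "\n")
```

Running the same benchmark on a session linked against a different BLAS would show whether the ordering of crossprod and %*% depends on it.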
I am running into the same problem. Did you ever find the cause?
# both matrices have elements generated out of a standard random normal, not sparse at all
r$> dim(mat1)
[1] 100 600
r$> dim(mat2)
[1] 100 350
r$> microbenchmark::microbenchmark(crossprod(mat1, mat2), t(mat1) %*% mat2)
Unit: milliseconds
expr min lq mean median uq max neval
crossprod(mat1, mat2) 11.7534 11.82795 12.362565 11.89140 12.1628 26.4312 100
t(mat1) %*% mat2 7.8864 8.01000 8.438531 8.09025 8.2568 20.2051 100
Running on:
r$> version
_
platform x86_64-w64-mingw32
arch x86_64
os mingw32
crt ucrt
system x86_64, mingw32
status
major 4
minor 4.1
year 2024
month 06
day 14
svn rev 86737
language R
version.string R version 4.4.1 (2024-06-14 ucrt)
nickname Race for Your Life