R matrix computation benchmarks on an M3 Mac are much slower than on a 2019 Intel Mac


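I ran some benchmarks comparing my new M3 Pro MacBook Pro with my old 2019 Intel MacBook Pro, and I got some very poor timings for matrix operations. Is there an explanation for this, either in the hardware itself or in how I set up the machine?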

For example:

  • Intel
    2,500 x 2,500 cross-product matrix (b = a' * a): 0.0902 (sec).
  • M3
    2,500 x 2,500 cross-product matrix (b = a' * a): 7.53 (sec).
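The cross-product line can be approximated outside benchmarkme with a sketch like the one below. It is not the package's exact code: the matrix construction, the seed, and the variable names are assumptions for illustration. Because crossprod() is normally handed to whichever BLAS R is linked against, its timing is a quick probe of that library.

```r
# Rough stand-in for the benchmarkme cross-product test (not its exact code).
set.seed(1)  # arbitrary seed, only for repeatability
n <- 2500
a <- matrix(rnorm(n * n), nrow = n, ncol = n)

# crossprod(a) computes t(a) %*% a; for finite input R passes this to BLAS.
timing <- system.time(b <- crossprod(a))
print(timing[["elapsed"]])  # elapsed seconds for one run
```

Running the same lines once inside RStudio and once with Rscript from a terminal shows whether the two environments behave differently.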

Intel MacBook info

2.3 GHz 8-core Intel Core i9

> devtools::session_info(info="all")
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31)
 os       macOS Sonoma 14.4.1
 system   x86_64, darwin20

 [1] /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library

 BLAS           /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
 lapack         /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib
 lapack_version 3.11.0

M3 MacBook info

version  R version 4.4.0 (2024-04-24)
 os       macOS Sonoma 14.4.1
 system   aarch64, darwin20
 

BLAS           /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
 lapack         /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib
 lapack_version 3.12.0

Intel results

> benchmarkme::benchmark_std(runs = 10)

# Programming benchmarks (5 tests):
    3,500,000 Fibonacci numbers calculation (vector calc): 0.174 (sec).
    Grand common divisors of 1,000,000 pairs (recursion): 0.577 (sec).
    Creation of a 3,500 x 3,500 Hilbert matrix (matrix calc): 0.216 (sec).
    Creation of a 3,000 x 3,000 Toeplitz matrix (loops): 1.07 (sec).
    Escoufier's method on a 60 x 60 matrix (mixed): 3.05 (sec).
# Matrix calculation benchmarks (5 tests):
    Creation, transp., deformation of a 5,000 x 5,000 matrix: 0.503 (sec).
    2,500 x 2,500 normal distributed random matrix^1,000: 0.146 (sec).
    Sorting of 7,000,000 random values: 0.627 (sec).
    2,500 x 2,500 cross-product matrix (b = a' * a): 0.0902 (sec).
    Linear regr. over a 5,000 x 500 matrix (c = a \ b'): 0.047 (sec).
# Matrix function benchmarks (5 tests):
    Cholesky decomposition of a 3,000 x 3,000 matrix: 0.413 (sec).
    Determinant of a 2,500 x 2,500 random matrix: 0.495 (sec).
    Eigenvalues of a 640 x 640 random matrix: 0.672 (sec).
    FFT over 2,500,000 random values: 0.233 (sec).
    Inverse of a 1,600 x 1,600 random matrix: 0.575 (sec).

M3 results

> benchmarkme::benchmark_std(runs = 10)

# Programming benchmarks (5 tests):
    3,500,000 Fibonacci numbers calculation (vector calc): 0.0826 (sec).
    Grand common divisors of 1,000,000 pairs (recursion): 0.192 (sec).
    Creation of a 3,500 x 3,500 Hilbert matrix (matrix calc): 0.101 (sec).
    Creation of a 3,000 x 3,000 Toeplitz matrix (loops): 0.488 (sec).
    Escoufier's method on a 60 x 60 matrix (mixed): 0.406 (sec).
# Matrix calculation benchmarks (5 tests):
    Creation, transp., deformation of a 5,000 x 5,000 matrix: 0.162 (sec).
    2,500 x 2,500 normal distributed random matrix^1,000: 0.0794 (sec).
    Sorting of 7,000,000 random values: 0.481 (sec).
    2,500 x 2,500 cross-product matrix (b = a' * a): 7.53 (sec).
    Linear regr. over a 5,000 x 500 matrix (c = a \ b'): 0.629 (sec).
# Matrix function benchmarks (5 tests):
    Cholesky decomposition of a 3,000 x 3,000 matrix: 4.13 (sec).
    Determinant of a 2,500 x 2,500 random matrix: 1.48 (sec).
    Eigenvalues of a 640 x 640 random matrix: 0.349 (sec).
    FFT over 2,500,000 random values: 0.0689 (sec).
    Inverse of a 1,600 x 1,600 random matrix: 1.16 (sec).
r macos performance
1 answer

The problem was that devtools::session_info incorrectly told me I was using the Accelerate BLAS when I was in RStudio, but then told me I was using the standard R BLAS when I ran the script outside of RStudio.
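One way to cross-check what a given R process reports is to query base R directly in each environment (RStudio, and Rscript from a terminal). The sketch below uses extSoftVersion() and La_library(), both in base R; the pattern match on the path is only a heuristic assumed from the paths shown in the question, and the output should be read as a hint rather than proof.

```r
# Path of the BLAS library as reported by this R process (R >= 3.4).
blas_path <- extSoftVersion()[["BLAS"]]
# Path of the LAPACK library in use.
lapack_path <- La_library()

cat("BLAS:  ", blas_path, "\n")
cat("LAPACK:", lapack_path, "\n")

# Heuristic (assumption): Accelerate's BLAS sits under vecLib.framework,
# as in the session_info output in the question.
uses_accelerate <- grepl("vecLib|Accelerate", blas_path)
cat("Looks like Accelerate:", uses_accelerate, "\n")
```

If the two environments print different BLAS paths, they are probably not loading the same library, which would account for the gap in timings.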

When I relinked the Accelerate BLAS using the instructions here: https://cran.r-project.org/bin/macosx/RMacOSX-FAQ.html#Which-BLAS-is-used-and-how-can-it-be-changed_003f

the poor matrix performance went away.
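To see what the relinking changed, the symlink itself can be inspected from R. This is a minimal sketch that assumes a CRAN macOS build, where the FAQ describes libRblas.dylib in R's lib directory as a symlink to the chosen BLAS implementation; on other builds the file may not exist or may not be a symlink.

```r
# Assumption: CRAN macOS layout, with libRblas.dylib under <R home>/lib.
blas_link <- file.path(R.home(), "lib", "libRblas.dylib")

if (file.exists(blas_link)) {
  target <- Sys.readlink(blas_link)  # "" when the file is not a symlink
  if (!is.na(target) && nzchar(target)) {
    cat("libRblas.dylib ->", target, "\n")
  } else {
    cat("libRblas.dylib exists but is not a symlink\n")
  }
} else {
  cat("No libRblas.dylib under", dirname(blas_link), "\n")
}
```

A target that names the vecLib variant suggests the Accelerate-backed BLAS is linked. Re-running the benchmark afterwards is the more direct confirmation.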
