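Using -mcpu/-march lets you enable a set of instruction-set extensions on x86 or AltiVec, e.g. SSE, but that is not always enough when building for the current CPU.
For example, passing -mcpu=cascadelake to clang does not imply enabling BMI or the various AVX-512 extensions likely present on a Cascade Lake CPU.
That is why gcc has an extra possibility, -mtune=native. As I understand it, this option enables every compiler flag for the extensions supported by the current host CPU. But what is the clang equivalent?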
From the man page (man clang):
man clang | grep march
-march=<cpu>
Specify that Clang should generate code for a specific processor family member and later. For example, if you specify -march=i486, the compiler is allowed to generate instructions that are valid on i486 and later processors, but which may not exist on earlier ones.
-march enables CPU features, while -mtune optimizes for a particular microarchitecture. On my 11th Gen Intel(R) Core(TM) i7-11700KF @ 3.60GHz, clang 12 seems to honor the -march=native flag.
Using this command
clang -march=native -E -v - </dev/null 2>&1 | grep cc1
(and likewise with gcc), I do get a different set of flags activated for compilation.
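Since the same probe applies to both compilers, it can be wrapped in a small helper. This is only a sketch: show_cc1 is a hypothetical name, and it assumes the compiler accepts -E -v - the way clang and gcc do.

```shell
#!/bin/sh
# show_cc1 COMPILER [FLAGS...]
# Preprocess empty input in verbose mode and keep only the lines that
# mention cc1, i.e. the internal compiler invocation with its expanded flags.
# (show_cc1 is an illustrative name, not part of clang or gcc.)
show_cc1() {
    cc=$1
    shift
    "$cc" "$@" -E -v - </dev/null 2>&1 | grep cc1
}

# Example invocations (need the respective compiler on PATH):
#   show_cc1 clang -march=native
#   show_cc1 gcc -march=native
#   show_cc1 clang            # baseline without -march=native
```

With gcc on x86, the cc1 line typically shows native expanded into an explicit -march= value plus individual -m options for the host.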
With -march=native:
$ clang -march=native -E -v - </dev/null 2>&1 | grep cc1
"/usr/lib/llvm-12/bin/clang" -cc1 -triple x86_64-pc-linux-gnu -E -disable-free -disable-llvm-verifier -discard-value-names -main-file-name - -mrelocation-model static -mframe-pointer=all -fmath-errno -fno-rounding-math -mconstructor-aliases -munwind-tables -target-cpu icelake-client -target-feature +sse2 -target-feature -tsxldtrk -target-feature +cx16 -target-feature +sahf -target-feature -tbm -target-feature +avx512ifma -target-feature +sha -target-feature +gfni -target-feature -fma4 -target-feature +vpclmulqdq -target-feature +prfchw -target-feature +bmi2 -target-feature -cldemote -target-feature +fsgsbase -target-feature -ptwrite -target-feature -amx-tile -target-feature -uintr -target-feature +popcnt -target-feature -widekl -target-feature +aes -target-feature +avx512bitalg -target-feature -movdiri -target-feature +xsaves -target-feature -avx512er -target-feature -avxvnni -target-feature +avx512vnni -target-feature -amx-bf16 -target-feature +avx512vpopcntdq -target-feature -pconfig -target-feature -clwb -target-feature +avx512f -target-feature +xsavec -target-feature -clzero -target-feature +pku -target-feature +mmx -target-feature -lwp -target-feature +rdpid -target-feature -xop -target-feature +rdseed -target-feature -waitpkg -target-feature -kl -target-feature -movdir64b -target-feature -sse4a -target-feature +avx512bw -target-feature +clflushopt -target-feature +xsave -target-feature +avx512vbmi2 -target-feature +64bit -target-feature +avx512vl -target-feature -serialize -target-feature -hreset -target-feature +invpcid -target-feature +avx512cd -target-feature +avx -target-feature +vaes -target-feature -avx512bf16 -target-feature +cx8 -target-feature +fma -target-feature -rtm -target-feature +bmi -target-feature -enqcmd -target-feature +rdrnd -target-feature -mwaitx -target-feature +sse4.1 -target-feature +sse4.2 -target-feature +avx2 -target-feature +fxsr -target-feature -wbnoinvd -target-feature +sse -target-feature +lzcnt -target-feature +pclmul 
-target-feature -prefetchwt1 -target-feature +f16c -target-feature +ssse3 -target-feature -sgx -target-feature -shstk -target-feature +cmov -target-feature +avx512vbmi -target-feature -amx-int8 -target-feature +movbe -target-feature -avx512vp2intersect -target-feature +xsaveopt -target-feature +avx512dq -target-feature +adx -target-feature -avx512pf -target-feature +sse3 -fno-split-dwarf-inlining -debugger-tuning=gdb -v -resource-dir /usr/lib/llvm-12/lib/clang/12.0.0 -internal-isystem /usr/local/include -internal-isystem /usr/lib/llvm-12/lib/clang/12.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdebug-compilation-dir /home/joel -ferror-limit 19 -fgnuc-version=4.2.1 -faddrsig -o - -x c -
$ clang -march=native -E -v - </dev/null 2>&1 | grep cc1 | wc
2 228 2870
Without -march=native:
$ clang -E -v - </dev/null 2>&1 | grep cc1
"/usr/lib/llvm-12/bin/clang" -cc1 -triple x86_64-pc-linux-gnu -E -disable-free -disable-llvm-verifier -discard-value-names -main-file-name - -mrelocation-model static -mframe-pointer=all -fmath-errno -fno-rounding-math -mconstructor-aliases -munwind-tables -target-cpu x86-64 -tune-cpu generic -fno-split-dwarf-inlining -debugger-tuning=gdb -v -resource-dir /usr/lib/llvm-12/lib/clang/12.0.0 -internal-isystem /usr/local/include -internal-isystem /usr/lib/llvm-12/lib/clang/12.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdebug-compilation-dir /home/joel -ferror-limit 19 -fgnuc-version=4.2.1 -faddrsig -o - -x c -
$ clang -E -v - </dev/null 2>&1 | grep cc1 | wc
2 58 799
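The two cc1 lines differ mainly in their -target-cpu and -target-feature arguments, which are hard to read in one long line. The following sketch pulls them out; the function names are made up for illustration, and it assumes the cc1 line uses the "-target-feature +name" layout shown in the output above.

```shell
#!/bin/sh
# Helpers that read a clang -cc1 command line on stdin.
# The function names are illustrative, not part of clang.

# Print the value that follows -target-cpu.
cc1_target_cpu() {
    awk '{ for (i = 1; i < NF; i++)
               if ($i == "-target-cpu") print $(i + 1) }'
}

# Print every enabled feature (the "+name" argument after -target-feature),
# one per line, sorted, without the leading "+".
cc1_enabled_features() {
    awk '{ for (i = 1; i < NF; i++)
               if ($i == "-target-feature" && substr($(i + 1), 1, 1) == "+")
                   print substr($(i + 1), 2) }' | sort
}

# Example (requires clang on PATH):
#   clang -march=native -E -v - </dev/null 2>&1 | grep cc1 | cc1_target_cpu
#   clang -march=native -E -v - </dev/null 2>&1 | grep cc1 | cc1_enabled_features
```

Applied to the output shown above, cc1_target_cpu would print icelake-client for the -march=native run and x86-64 for the default run, and cc1_enabled_features would print nothing for the default run because that line carries no -target-feature arguments.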