这类代码在Core i5-7500上运行速度要快得多,每次迭代只有一个时钟周期,并处于单核Turbo模式。在3.8 GHz时,预期运行时间约为3.25秒。
我编写了一个c程序,运行在Intel i5-7500(kubuntu,virtualbox运行在win10)和Intel Xeon E5-26xx v4(tenxun云)上。我认为Intel i5-7500会更快(CPU MHz:3.4GHz),但是Intel Xeon E5-26xx v4(CPU MHz:2.4GHz)实际上更快。有人可以告诉我原因吗,]?
#include <stdio.h> int main(int argc, const char *argv[]) { long long s = 0, i = 0; for (i = 0; i < 12345678900; i++) { s += i; } printf("%lld\n", s); return 0; }
我用
gcc -std=c11 a.c -O2 && time ./a.out
运行
[英特尔(R)至强(R)CPU E5-26xx v4 env:
➜ ~ gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.4.0-1ubuntu1~18.04.1' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)
➜ ~ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 1 On-line CPU(s) list: 0 Thread(s) per core: 1 Core(s) per socket: 1 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family: 6 Model: 79 Model name: Intel(R) Xeon(R) CPU E5-26xx v4 Stepping: 1 CPU MHz: 2394.454 BogoMIPS: 4788.90 Hypervisor vendor: KVM Virtualization type: full L1d cache: 32K L1i cache: 32K L2 cache: 4096K NUMA node0 CPU(s): 0 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nopl cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch pti bmi1 avx2 bmi2 rdseed adx xsaveopt
结果是
➜ ~ gcc -std=c11 a.c -O2 && time ./a.out 2420917449941559086 ./a.out 4.81s user 0.00s system 99% cpu 4.849 total
[Intel(R)Core(TM)i5-7500 CPU @ 3.40GHz env:
➜ ~ gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.4.0-1ubuntu1~18.04.1' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)
➜ ~ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 4 On-line CPU(s) list: 0-3 Thread(s) per core: 1 Core(s) per socket: 4 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family: 6 Model: 158 Model name: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz Stepping: 9 CPU MHz: 3408.002 BogoMIPS: 6816.00 Hypervisor vendor: KVM Virtualization type: full L1d cache: 32K L1i cache: 32K L2 cache: 256K L3 cache: 6144K NUMA node0 CPU(s): 0-3 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase avx2 invpcid rdseed clflushopt flush_l1d
结果是
➜ ~ gcc -std=c11 a.c -O2 && time ./a.out
2420917449941559086
./a.out 7.01s user 0.00s system 99% cpu 7.019 total
我编写了一个c程序,运行在Intel i5-7500(kubuntu,virtualbox运行在win10)和Intel Xeon E5-26xx v4(tenxun云)上。我认为Intel i5-7500会更快(CPU MHz:3.4GHz),但是Intel Xeon E5-26xx v4(...] >>
这类代码在Core i5-7500上运行速度要快得多,每次迭代只有一个时钟周期,并处于单核Turbo模式。在3.8 GHz时,预期运行时间约为3.25秒。
除非CPU有处理此类短循环的小故障(我对此表示怀疑),可能是Windows和Virtualbox开销的组合使您的情况变慢了。电源管理配置错误可能是另一个原因。
这类代码在Core i5-7500上运行速度要快得多,每次迭代只有一个时钟周期,并处于单核Turbo模式。在3.8 GHz时,预期运行时间约为3.25秒。