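I want to check my C++ application in valgrind because I get strange crashes on some machines, but not on the machine I use for development. But I'm getting nowhere: valgrind fails as soon as the application starts. I'm using Ubuntu 24.04. I run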
valgrind --leak-check=full -s test
and I get this output:
==45078== Memcheck, a memory error detector
==45078== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==45078== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==45078== Command: test
==45078==
vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x48 0x7F 0x5 0x5A 0x9B 0x3F 0x0
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==45078== valgrind: Unrecognised instruction at address 0x13e59c.
==45078== at 0x13E59C: Wrapper (opencl.hpp:2073)
==45078== by 0x13E59C: Context (opencl.hpp:3283)
==45078== by 0x13E59C: Data (clFunctions.hpp:56)
==45078== by 0x13E59C: __static_initialization_and_destruction_0() (clMain.cpp:31)
==45078== by 0x96C5303: call_init (libc-start.c:145)
==45078== by 0x96C5303: __libc_start_main@@GLIBC_2.34 (libc-start.c:347)
==45078== by 0x147244: (below main) (in /home/rainer/x/build)
==45078== Your program just tried to execute an instruction that Valgrind
==45078== did not recognise. There are two possible reasons for this.
==45078== 1. Your program has a bug and erroneously jumped to a non-code
==45078== location. If you are running Memcheck and you just saw a
==45078== warning about a bad jump, it's probably your program's fault.
==45078== 2. The instruction is legitimate but Valgrind doesn't handle it,
==45078== i.e. it's Valgrind's fault. If you think this is the case or
==45078== you are not sure, please let us know and we'll try to fix it.
==45078== Either way, Valgrind will now raise a SIGILL signal which will
==45078== probably kill your program.
==45078==
==45078== Process terminating with default action of signal 4 (SIGILL)
==45078== Illegal opcode at address 0x13E59C
==45078== at 0x13E59C: Wrapper (opencl.hpp:2073)
==45078== by 0x13E59C: Context (opencl.hpp:3283)
==45078== by 0x13E59C: Data (clFunctions.hpp:56)
==45078== by 0x13E59C: __static_initialization_and_destruction_0() (clMain.cpp:31)
==45078== by 0x96C5303: call_init (libc-start.c:145)
==45078== by 0x96C5303: __libc_start_main@@GLIBC_2.34 (libc-start.c:347)
==45078== by 0x147244: (below main) (in /home/rainer/x/build)
==45078==
==45078== HEAP SUMMARY:
==45078== in use at exit: 287,899 bytes in 2,009 blocks
==45078== total heap usage: 3,543 allocs, 1,534 frees, 422,209 bytes allocated
==45078==
==45078== 336 bytes in 7 blocks are possibly lost in loss record 292 of 330
==45078== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==45078== by 0x505B3AD: ??? (in /usr/lib/x86_64-linux-gnu/libcuda.so.555.42.06)
==45078== by 0x506524C: ??? (in /usr/lib/x86_64-linux-gnu/libcuda.so.555.42.06)
==45078== by 0x4CB9914: ??? (in /usr/lib/x86_64-linux-gnu/libcuda.so.555.42.06)
==45078== by 0x51404A1: ??? (in /usr/lib/x86_64-linux-gnu/libcuda.so.555.42.06)
==45078== by 0x4CB7012: ??? (in /usr/lib/x86_64-linux-gnu/libcuda.so.555.42.06)
==45078== by 0x4005823: call_init (dl-init.c:120)
==45078== by 0x4005823: _dl_init (dl-init.c:121)
==45078== by 0x401F59F: ??? (in /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
==45078== by 0x6: ???
==45078== by 0x1FFF000026: ???
==45078== by 0x1FFF000039: ???
==45078== by 0x1FFF00003C: ???
==45078==
==45078== LEAK SUMMARY:
==45078== definitely lost: 0 bytes in 0 blocks
==45078== indirectly lost: 0 bytes in 0 blocks
==45078== possibly lost: 336 bytes in 7 blocks
==45078== still reachable: 285,547 bytes in 1,981 blocks
==45078== suppressed: 0 bytes in 0 blocks
==45078== Reachable blocks (those to which a pointer was found) are not shown.
==45078== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==45078==
==45078== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Illegal instruction (core dumped)
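My program contains OpenCL code, so I looked at the locations in the stack trace:
clMain.cpp:31 is where I declare a struct that contains a cl::Context.
Going into opencl.hpp, the offending line is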
template <typename T>
class Wrapper
{
public:
    typedef T cl_type;
protected:
    cl_type object_;
public:
    Wrapper() : object_(nullptr) { } //<---------------- line 2073
...
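Now, how can the constructor of the Wrapper class be an unrecognised instruction? I have no idea what is going on here.
The instruction in question is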
0:62 f1 fd 48 7f 55 a9 vmovdqa64 ZMMWORD PTR [rbp-0x15c0],zmm2
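As was said in the comments, this is an AVX512 instruction, which Valgrind does not support.
AVX512 support is the most requested feature in Valgrind, but we have nobody to work on it (and it is a big project).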
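To confirm that a given binary and machine are in this situation, something like the following might help. It is only a sketch, not anything taken from Valgrind or from the asker's code: the function names builtWithAvx512f, cpuHasAvx512f and avx512Report are made up for illustration, and it assumes GCC or Clang on x86, where the __AVX512F__ macro and the __builtin_cpu_supports builtin are available.

```cpp
#include <string>

// Hypothetical helper: true if this translation unit was compiled with
// AVX-512F enabled (GCC and Clang define __AVX512F__ in that case).
constexpr bool builtWithAvx512f() {
#if defined(__AVX512F__)
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: true if the CPU we are running on reports AVX-512F.
// Uses the GCC/Clang builtin; on other compilers or architectures this
// sketch simply returns false.
bool cpuHasAvx512f() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // harmless if already initialised
    return __builtin_cpu_supports("avx512f") != 0;
#else
    return false;
#endif
}

// Hypothetical helper: one-line summary that can be logged at startup.
std::string avx512Report() {
    std::string s = "built with AVX-512F: ";
    s += builtWithAvx512f() ? "yes" : "no";
    s += ", CPU supports AVX-512F: ";
    s += cpuHasAvx512f() ? "yes" : "no";
    return s;
}
```

Printing avx512Report() from a small separate test program built with the same flags as the application would show whether the compiler was allowed to emit AVX512. If it was (for example because of a flag such as -march=native, which is an assumption here since the question does not show the build flags), rebuilding with -mno-avx512f or a more conservative -march value should let Valgrind get past this instruction. Note that in the question the instruction sits in a static initializer, so it executes before main() and a check placed inside the application itself may run too late.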