When I use gdb to debug a futex lock in a program that also prints output, the program gets stuck in a strange loop.
#include <mutex>
#include <iostream>
#include <thread>
#include <unistd.h>

volatile int counter(0); // non-atomic counter
std::mutex mtx;          // protects the increment of counter

void increases10k()
{
    for (int i = 0; i < 100000000; i++)
    {
        sleep(1);
        std::cout << "The ID of this thread is: " << std::this_thread::get_id() << std::endl;
        std::cout << counter << std::endl; // read without holding the lock
        mtx.lock();
        ++counter;
        std::cout << counter << std::endl;
        mtx.unlock();
    }
}

int main(int argc, char **argv)
{
    std::thread threads[10];
    for (int i = 0; i < 10; i++)
    {
        threads[i] = std::thread(increases10k);
    }
    for (auto& th : threads)
        th.join();
    std::cout << " successful increases of the counter " << counter << std::endl;
    return 0;
}
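I use the gdb command
catch syscall futex
to catch the futex system call.
Then I use
c
to let the program continue, but it stops printing and gets stuck in a strange loop of "call to" and "returned from" catchpoint hits.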
Thread 8 "a.out" hit Catchpoint 1 (call to syscall futex), __lll_lock_wait_private ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
95 in ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
[Switching to Thread 0x7f0db5ffb700 (LWP 8923)]
Thread 9 "a.out" hit Catchpoint 1 (call to syscall futex), __lll_lock_wait_private ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
95 in ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
[Switching to Thread 0x7f0db7fff700 (LWP 8919)]
Thread 5 "a.out" hit Catchpoint 1 (returned from syscall futex), __lll_lock_wait_private ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
95 in ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
[Switching to Thread 0x7f0dbd044700 (LWP 8917)]
Thread 3 "a.out" hit Catchpoint 1 (call to syscall futex), __lll_lock_wait_private ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
95 in ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
[Switching to Thread 0x7f0dbd845700 (LWP 8916)]
Thread 2 "a.out" hit Catchpoint 1 (returned from syscall futex), __lll_lock_wait_private ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
95 in ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
[Switching to Thread 0x7f0dbd044700 (LWP 8917)]
Thread 3 "a.out" hit Catchpoint 1 (returned from syscall futex), __lll_lock_wait_private ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
95 in ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
This goes on endlessly. I also noticed that one thread is stuck in write(), and pressing c does not let it make progress.
Id Target Id Frame
1 Thread 0x7f0dbe8ac5c0 (LWP 8915) "a.out" 0x00007f0dbe48a6dd in pthread_join (threadid=139696990869248, thread_return=0x0)
at pthread_join.c:90
2 Thread 0x7f0dbd845700 (LWP 8916) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
3 Thread 0x7f0dbd044700 (LWP 8917) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
4 Thread 0x7f0dbc843700 (LWP 8918) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
5 Thread 0x7f0db7fff700 (LWP 8919) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
6 Thread 0x7f0db77fe700 (LWP 8920) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
7 Thread 0x7f0db6ffd700 (LWP 8921) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
8 Thread 0x7f0db67fc700 (LWP 8922) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
9 Thread 0x7f0db5ffb700 (LWP 8923) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
* 10 Thread 0x7f0db57fa700 (LWP 8924) "a.out" 0x00007f0dbd92198d in write () at ../sysdeps/unix/syscall-template.S:84
11 Thread 0x7f0db4ff9700 (LWP 8925) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
When I comment out the output, the program works fine in gdb with
c
I don't understand why the catchpoint on futex and write() affect each other.
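"When I use gdb to debug a futex lock"
Note that using GDB to debug futex locks (or any other multithreading issue) is impossible: you need to make your program correct by construction.
The only debugging you can do is to find out where the program is after it has deadlocked.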
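As one possible illustration of "correct by construction" (a minimal sketch, not the asker's code): the helper below, with the hypothetical name count_with_lock_guard, keeps the counter and the mutex local and only ever touches the counter through a std::lock_guard, so the lock cannot be left held by accident. The thread and iteration counts are parameters chosen for the example.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper for illustration: starts num_threads threads that each
// increment a shared counter `iterations` times, always under a lock_guard,
// and returns the final counter value after all threads have been joined.
int count_with_lock_guard(int num_threads, int iterations)
{
    int counter = 0;  // only accessed while mtx is held, or after join()
    std::mutex mtx;

    std::vector<std::thread> threads;
    threads.reserve(num_threads);
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&counter, &mtx, iterations] {
            for (int i = 0; i < iterations; ++i) {
                // RAII: the mutex is released when `lock` goes out of scope.
                std::lock_guard<std::mutex> lock(mtx);
                ++counter;
            }
        });
    }
    for (auto& th : threads)
        th.join();
    return counter;
}
```

Called as count_with_lock_guard(10, 1000), it should return 10000, because every increment happens under the lock and the result is read only after all threads are joined.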
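"Endlessly"
What did you expect? Every call to mtx.lock() and mtx.unlock() may perform a futex system call, and you are making 2 * 10 * 100'000'000 such calls (with an additional factor of 2, because GDB stops both on entry to and on return from the system call).
Of course you will be stuck there forever.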
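The arithmetic above can be written out as a small sketch. The function name max_catchpoint_stops is made up for this example, and the result is a worst-case upper bound that assumes every lock() and unlock() really ends up in a futex system call, which an uncontended mutex normally avoids.

```cpp
#include <cstdint>

// Hypothetical helper: worst-case number of times GDB stops at the futex
// catchpoint, assuming every lock() and unlock() issues one futex syscall.
constexpr std::uint64_t max_catchpoint_stops(std::uint64_t threads,
                                             std::uint64_t iterations)
{
    constexpr std::uint64_t mutex_ops_per_iteration = 2; // lock() + unlock()
    constexpr std::uint64_t stops_per_syscall = 2;       // entry + return
    return threads * iterations * mutex_ops_per_iteration * stops_per_syscall;
}

// 10 threads, 100'000'000 iterations each, as in the question's program.
static_assert(max_catchpoint_stops(10, 100'000'000) == 4'000'000'000ULL,
              "2 * 10 * 100'000'000 futex calls, times 2 stops per call");
```

Even if only a small fraction of those calls actually reach the kernel, pressing c once per stop is not going to get through the loop.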
http://xyproblem.info seems quite relevant here.