Why does Helgrind report a "lock order violated" error?


Consider the following code:

    #include <stdio.h>
    #include <pthread.h>
    #include <assert.h>
    #include <stdlib.h>

    pthread_mutex_t g = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_t m1 = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_t m2 = PTHREAD_MUTEX_INITIALIZER;

    void* worker(void* arg) 
    {
        pthread_mutex_lock(&g);

        if ((long long) arg == 0) {
            pthread_mutex_lock(&m1);
            pthread_mutex_lock(&m2);
        } else {
            pthread_mutex_lock(&m2);
            pthread_mutex_lock(&m1);
        }
        pthread_mutex_unlock(&m1);
        pthread_mutex_unlock(&m2);

        pthread_mutex_unlock(&g);
        return NULL;
    }

    int main(int argc, char *argv[]) {
        pthread_t p1, p2;
        pthread_create(&p1, NULL, worker, (void *) (long long) 0);
        pthread_create(&p2, NULL, worker, (void *) (long long) 1);
        pthread_join(p1, NULL);
        pthread_join(p2, NULL);
        return 0;
    }

Helgrind reports the following error:

==10035== Helgrind, a thread error detector
==10035== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al.
==10035== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==10035== Command: ./Hw5
==10035== 
==10035== ---Thread-Announcement------------------------------------------
==10035== 
==10035== Thread #3 was created
==10035==    at 0x538987E: clone (clone.S:71)
==10035==    by 0x5050EC4: create_thread (createthread.c:100)
==10035==    by 0x5050EC4: pthread_create@@GLIBC_2.2.5 (pthread_create.c:797)
==10035==    by 0x4C36A27: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==10035==    by 0x1088BD: main (Hw5.c:28)
==10035== 
==10035== ----------------------------------------------------------------
==10035== 
==10035== Thread #3: lock order "0x309080 before 0x3090C0" violated
==10035== 
==10035== Observed (incorrect) order is: acquisition of lock at 0x3090C0
==10035==    at 0x4C3403C: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==10035==    by 0x10882E: worker (Hw5.c:16)
==10035==    by 0x4C36C26: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==10035==    by 0x50506DA: start_thread (pthread_create.c:463)
==10035==    by 0x538988E: clone (clone.S:95)
==10035== 
==10035==  followed by a later acquisition of lock at 0x309080
==10035==    at 0x4C3403C: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==10035==    by 0x10883A: worker (Hw5.c:17)
==10035==    by 0x4C36C26: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==10035==    by 0x50506DA: start_thread (pthread_create.c:463)
==10035==    by 0x538988E: clone (clone.S:95)
==10035== 
==10035== Required order was established by acquisition of lock at 0x309080
==10035==    at 0x4C3403C: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==10035==    by 0x108814: worker (Hw5.c:13)
==10035==    by 0x4C36C26: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==10035==    by 0x50506DA: start_thread (pthread_create.c:463)
==10035==    by 0x538988E: clone (clone.S:95)
==10035== 
==10035==  followed by a later acquisition of lock at 0x3090C0
==10035==    at 0x4C3403C: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==10035==    by 0x108820: worker (Hw5.c:14)
==10035==    by 0x4C36C26: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==10035==    by 0x50506DA: start_thread (pthread_create.c:463)
==10035==    by 0x538988E: clone (clone.S:95)
==10035== 
==10035==  Lock at 0x309080 was first observed
==10035==    at 0x4C3403C: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==10035==    by 0x108814: worker (Hw5.c:13)
==10035==    by 0x4C36C26: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==10035==    by 0x50506DA: start_thread (pthread_create.c:463)
==10035==    by 0x538988E: clone (clone.S:95)
==10035==  Address 0x309080 is 0 bytes inside data symbol "m1"
==10035== 
==10035==  Lock at 0x3090C0 was first observed
==10035==    at 0x4C3403C: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==10035==    by 0x108820: worker (Hw5.c:14)
==10035==    by 0x4C36C26: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==10035==    by 0x50506DA: start_thread (pthread_create.c:463)
==10035==    by 0x538988E: clone (clone.S:95)
==10035==  Address 0x3090c0 is 0 bytes inside data symbol "m2"
==10035== 
==10035== 
==10035== 
==10035== For counts of detected and suppressed errors, rerun with: -v
==10035== Use --history-level=approx or =none to gain increased speed, at
==10035== the cost of reduced accuracy of conflicting-access information
==10035== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 7 from 7)

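I thought the outer lock g would keep the two threads from entering the critical section at the same time, since only one thread can hold g at any given moment. So I believe a deadlock is impossible. Am I wrong? Why does Helgrind report this error? Please explain.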

c multithreading pthreads valgrind deadlock
1 answer
4 votes

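Helgrind is complaining that your threads were observed locking the mutexes m1 and m2 in different relative orders, which is also clear from reading the code. Helgrind looks for and flags such differences in acquisition order because, in general, they create a risk of deadlock.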

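"I thought the outer lock g would keep the two threads from entering the critical section at the same time, since only one thread can hold g at any given moment. So I believe a deadlock is impossible. Am I wrong?"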

You are not wrong. The particular program presented cannot deadlock, because each thread must acquire g before it acquires either of the other mutexes.

"Why does Helgrind report this error?"

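Because Helgrind's analysis is a heuristic analysis of the run-time behavior of your program during a single run. It does not assume that one run of the program demonstrates all of its possible behaviors. Your assessment, on the other hand, is based on analysis of the source code.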

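The heuristic rule you are seeing in action is that no thread should acquire a pair of mutexes in a different relative order than any other thread does. For your particular program this yields a false positive, but your program seems to have been designed specifically to produce one. There is no need for mutexes m1 and m2 in the first place if mutex g is always held whenever either of them is acquired. If, however, it is possible for any other thread to acquire m1 and m2 without holding g, then the deadlock risk is real, regardless of the order in which that other thread acquires them.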

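Either way, then, the warning points to a genuine problem with your code: either you are performing mutex operations that are not needed, or you have a real deadlock risk, now or in the future.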
