为什么基于 Valgrind 的 Helgrind 线程错误检测器和 Google ThreadSanitizer (TSan) 报告不同的数据竞争检测结果？

Question

1.背景

① 下面清单 1 中的 C++ 代码旨在为 s 函数（在

Produce

线程上运行）和

main

函数（在

Consume 上运行）的

t

 变量进行

处理数据竞争

在

main

函数开始时创建的线程）可能会遇到。

清单1：

// a.cpp file
#include <chrono>                                               //01
#include <condition_variable>                                   //02
#include <iostream>                                             //03
#include <mutex>                                                //04
#include <thread>                                               //05
                                                                //06
std::string s = "";                                             //07
std::mutex m;                                                   //08
std::condition_variable cv;                                     //09
bool stop = false;                                              //10
                                                                //11
void Consume() {                                                //12
  while (true) {                                                //13
    std::unique_lock<std::mutex> ul{m};                         //14
    cv.wait(ul, []() { return stop || s != ""; });              //15
    if (stop) break;                                            //16
    std::cout << "Consume \"" << s << "\"\n";                   //17
    s = "";                                                     //18
    m.unlock();                                                 //19
    cv.notify_all();                                            //20
    m.lock();                                                   //21
  }                                                             //22
}                                                               //23
                                                                //24
void Produce(std::string s1) {                                  //25
  std::unique_lock<std::mutex> ul{m};                           //26
  cv.wait(ul, []() { return s == ""; });                        //27 
  s = s1;                                                       //28
  m.unlock();                                                   //29
  cv.notify_all();                                              //30
  m.lock();                                                     //31
}                                                               //32
                                                                //33
void Stop() {                                                   //34
  std::unique_lock<std::mutex> ul{m};                           //35
  cv.wait(ul, []() { return s == ""; });                        //36   
  stop = true;                                                  //37
  m.unlock();                                                   //38
  cv.notify_all();                                              //39
  m.lock();                                                     //40
}                                                               //41
                                                                //42
int main() {                                                    //43
  auto t = std::thread(Consume);                                //44
  Produce("A");                                                 //45
  Produce("B");                                                 //46
  Produce("C");                                                 //47
  Stop();                                                       //48
  t.join();                                                     //49
  return 0;                                                     //50
}                                                               //51

② 使用 Ubuntu clang++ 版本 14.0.0-1ubuntu1.1 编译上述 C++ 代码时，Google ThreadSanitizer（又名 TSan）未检测到数据竞争，如下面的清单 2 所示。

清单 2：

$ clang++ -fsanitize=thread -g -O1 -Wall a.cpp          #01
$ ./a.out                                               #02
Consume "A"                                             #03
Consume "B"                                             #04
Consume "C"                                             #05 
$                                                       #06

2.关注

使用

Ubuntu clang++ 版本 14.0.0-1ubuntu1.1 编译上述 C++ 代码时，基于 Valgrind 的 Helgrind 线程错误检测器会检测到可能的数据竞争，如下面的 清单 3 所示。

清单 3：

$ clang++ -g -O1 -Wall a.cpp                                                           //01
$ valgrind --tool=helgrind ./a.out                                                     //02
==<PID>== Helgrind, a thread error detector                                            //03
==<PID>== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al.              //04
==<PID>== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info           //05
==<PID>== Command: ./a.out                                                             //06
==<PID>==                                                                              //07
==<PID>== ---Thread-Announcement------------------------------------------             //08
==<PID>==                                                                              //09
==<PID>== Thread #1 is the program's root thread                                       //10
==<PID>==                                                                              //11
==<PID>== ---Thread-Announcement------------------------------------------             //12
==<PID>==                                                                              //13
==<PID>== Thread #2 was created                                                        //14
==<PID>==  at 0x53729F3: clone                      (clone.S         : 76)             //15
==<PID>==  by 0x53738EE: __clone_internal           (clone-internal.c: 83)             //16
==<PID>==  by 0x52E16D8: create_thread              (pthread_create.c:295)             //17
==<PID>==  by 0x52E21FF: pthread_create@@GLIBC_2.34 (pthread_create.c:828)             //18
==<PID>==  by 0x4E13585: pthread_create_WRK         (hg_intercepts.c :445)             //19
==<PID>==  by 0x4E14A8C: pthread_create@*           (hg_intercepts.c :478)             //20
==<PID>==  by 0x50FD328: std::thread::_M_start_thread(                                 //21
                           std::unique_ptr<                                            //22
                             std::thread::_State,                                      //23
                             std::default_delete<std::thread::_State>                  //24
                           >,                                                          //25
                           void (*)()                                                  //26
                         )                                                             //27
                         (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)            //28
==<PID>==  by 0x10A691: thread<void (&)(), void>    (std_thread.h:143)                 //22
==<PID>==  by 0x10A691: main (???:43)                                                  //23
==<PID>==                                                                              //24
==<PID>== ----------------------------------------------------------------             //25
==<PID>==                                                                              //26
==<PID>== Possible data race during read of size 4 at 0x10D110 by thread #1            //27
==<PID>== Locks held: none                                                             //28
==<PID>==  at 0x52E4F4A: pthread_mutex_lock@@GLIBC_2.2.5 (pthread_mutex_lock.c: 94)    //29
==<PID>==  by 0x4E1009A: mutex_lock_WRK                  (hg_intercepts.c     :937)    //30
==<PID>==  by 0x4E14E73: pthread_mutex_lock              (hg_intercepts.c     :960)    //31
==<PID>==  by 0x10A4D4 : __gthread_mutex_lock            (gthr-default.h      :749)    //32
==<PID>==  by 0x10A4D4 : lock                            (std_mutex.h         :100)    //33
==<PID>==  by 0x10A4D4 : lock                            (unique_lock.h       :139)    //34
==<PID>==  by 0x10A4D4 : unique_lock                     (unique_lock.h       : 69)    //35
==<PID>==  by 0x10A4D4 : Produce(                                                      //36
                           std::__cxx11::basic_string<                                 //37
                             char,                                                     //38
                             std::char_traits<char>,                                   //39
                             std::allocator<char>                                      //40
                           >                                                           //41
                         )                                (???                  :26)   //42
==<PID>==  by 0x10A6C5 : main (a.cpp:44)                                               //43
==<PID>==                                                                              //44
==<PID>== This conflicts with a previous write of size 4 by thread #2                  //45
==<PID>== Locks held: none                                                             //46
==<PID>==  at 0x52E696C: __pthread_mutex_unlock_usercnt  (pthread_mutex_unlock.c:  62) //47
==<PID>==  by 0x52E08AD: __pthread_cond_wait_common      (pthread_cond_wait.c   : 419) //48
==<PID>==  by 0x52E08AD: pthread_cond_wait@@GLIBC_2.3.2  (pthread_cond_wait.c   : 627) //49
==<PID>==  by 0x4E1390B: pthread_cond_wait_WRK           (hg_intercepts.c       :1291) //50
==<PID>==  by 0x4E14EAA: pthread_cond_wait@*             (hg_intercepts.c       :1318) //51
==<PID>==  by 0x10A3A9 : wait<(lambda at a.cpp:15:17)>   (condition_variable    : 103) //52
==<PID>==  by 0x10A3A9 : Consume()                       (???                   :  15) //53
==<PID>==  by 0x10A833 : __invoke_impl<void, void (*)()> (invoke.h              :  61) //54
==<PID>==  by 0x10A833 : __invoke<void (*)()>            (invoke.h              :  96) //55
==<PID>==  by 0x10A833 : _M_invoke<0UL>                  (std_thread.h          : 259) //56
==<PID>==  by 0x10A833 : operator()                      (std_thread.h          : 266) //57
==<PID>==  by 0x10A833 : std::thread::_State_impl<                                     //58
                           std::thread::_Invoker<                                      //59
                             std::tuple<void (*)()>                                    //60
                           >                                                           //61
                         >::_M_run()                     (std_thread.h          : 211) //62
==<PID>==  by 0x50FD252: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)        //63
==<PID>==  by 0x4E13779: mythread_wrapper                (hg_intercepts.c       : 406) //64
==<PID>==                                                                              //65
==<PID>==  Address 0x10d110 is 8 bytes inside data symbol "m"                          //66
==<PID>==                                                                              //67
==<PID>== ----------------------------------------------------------------             //68

<-- Text Truncated -->

==<PID>== Use --history-level=approx or =none to gain increased speed, at
==<PID>== the cost of reduced accuracy of conflicting-access information
==<PID>== For lists of detected and suppressed errors, rerun with: -s
==<PID>== ERROR SUMMARY: 71 errors from 36 contexts (suppressed: 0 from 0)
$

3.问题

Google TSan 或Valgrind Helgrind 的报告哪一个是正确的，为什么？

Answer 1

在这些情况下，C++ 标准库以一种看起来像是错误的方式使用 pthread（至少在 DRD 和 Helgrind 的作者在编写它们时认为是这样）。

请参阅此问题的答案，了解 DRD 和 Helgrind 期望的解释和示例：理解 pthread_cond_wait() 和 pthread_cond_signal()。

通常情况下，要“解决”这种问题而不把孩子和洗澡水一起倒掉是很棘手的。

我能做的就是添加一个选项，比如

   --ignore-dubious=yes|no  Ignore dubious usage of pthread_cond_{signal|broadcast}
                            where the signaller is not holding the mutex [no]

同时您可以抑制此类错误。

{
   dubious DRD condition signal/broadcast
   drd:CondRaceErr
   fun:pthread_cond_*
}
{
   dubious Helgrind condition signal/broadcast
   Helgrind:Dubious
   fun:pthread_cond*
}

为什么基于 Valgrind 的 Helgrind 线程错误检测器和 Google ThreadSanitizer (TSan) 报告不同的数据竞争检测结果？

问题描述投票：0回答：1

1.背景

2.关注

3.问题

1个回答

最新问题

为什么基于 Valgrind 的 Helgrind 线程错误检测器和 Google ThreadSanitizer (TSan) 报告不同的数据竞争检测结果？

问题描述 投票：0回答：1

1.背景

2.关注

3.问题

1个回答

最新问题

问题描述投票：0回答：1