C++: Why does this code, which writes to a vector concurrently from multiple threads, not get corrupted?

Question

Note: this was compiled with GCC 11.2.0.

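The following is meant to be a simplified example of my more complex code. My actual code is too complicated to explain, so I tried to create a simple version that reproduces the problem, but it actually works. At this point, beyond my own code, I'm more interested in why this simple example works at all.

Basically, I'm creating a vector of maps. The number of maps depends on the number of threads I'm running. Each thread populates its own map in the vector. I expected that, since the vector is being written to by multiple threads at the same time, the data would get corrupted, because it's all contiguous memory that changes dynamically at runtime, right? So how does this not corrupt the data?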

#define _GLIBCXX_USE_NANOSLEEP  //add it top of c++ code
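#include <cstdlib>      // std::rand, std::srand
#include <ctime>        // time
#include <functional>   // std::ref
#include <iostream>
#include <string>       // std::stoi
#include <thread>
#include <unordered_map>
#include <vector>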

void add_map(std::unordered_map<int, int>& um, int map_size) {
  for (int i = 0; i < map_size; i++) {
    um.insert({i, i});
  }
}

void print_map(std::unordered_map<int, int>& um) {
  for (auto& u : um) { std::cout << u.first << " " << u.second << std::endl; }
}

int main(int argc, char* argv[]) {
  // Get number of threads from user input and set random seed
  if (argc < 2) {
    std::cerr << "usage: " << argv[0] << " <num_threads>" << std::endl;
    return 1;
  }
  int num_threads = std::stoi(argv[1]);
  std::srand((unsigned) time(NULL));

  // Populate a vector with num_threads maps, so each thread can write to its own map inside the vector
  std::vector<std::unordered_map<int, int>> chunks;
  for (int i = 0; i < num_threads; i++) {
    std::unordered_map<int, int> empty_map;
    chunks.push_back(empty_map);
  }

  // Come up with the sizes for each map that will be handled by each thread
  std::vector<int> chunk_sizes;
  for (int i = 0; i < num_threads; i++) {
    int chunk_size = 1 + (std::rand() % 5);
    chunk_sizes.push_back(chunk_size);
  }
  std::cout << "Chunk sizes" << std::endl;
  for (int i = 0; i < num_threads; i++) { std::cout << chunk_sizes[i] << " "; }
  std::cout << std::endl;

  // Launch one thread per map so that the elements of the vector 'chunks' get populated concurrently
  std::vector<std::thread> threads;
  for (int i = 0; i < num_threads; i++) {
    threads.push_back(std::thread(add_map, std::ref(chunks[i]), chunk_sizes[i]));
  }
  for (int i = 0; i < threads.size(); i++) { threads[i].join(); }

  // Print the maps
  for (int i = 0; i < chunks.size(); i++) {
    std::cout << "=== chunk: " << i << "; size: " << chunk_sizes[i] << " ===" << std::endl;
    print_map(chunks[i]);
  }
}

When I run it with the number of threads as its argument, I get the following output:

Chunk sizes
2 5 5 4 4
=== chunk: 0; size: 2 ===
1 1
0 0
=== chunk: 1; size: 5 ===
4 4
3 3
2 2
1 1
0 0
=== chunk: 2; size: 5 ===
4 4
3 3
2 2
1 1
0 0
=== chunk: 3; size: 4 ===
3 3
2 2
1 1
0 0
=== chunk: 4; size: 4 ===
3 3
2 2
1 1
0 0

It looks like the vector of maps is being processed as expected. But why?

Answer 1

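I don't see anything in your code that isn't permitted in multiple threads.

Now, proving this is hard, because in theory any part of the standard could add "and lines that look like this are race conditions".

But I can tell you the simple rules of thumb.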

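1. Two threads can access the same `std` container as long as the methods used are `const`.
2. A single thread accessing a `std` container in any way never generates a race condition.
3. You can access different elements of a `std` container from different threads without causing a race condition.
4. For the purposes of rule 1, "lookup" operations that do not create new elements count as `const`. These are things like `.find` on associative containers and `[]` on `std::vector`, but not `[]` on `std::map`.
5. Any kind of iterator invalidation is basically a race condition.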

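These rules are not complete. But you'll notice that you didn't break any of them.

In short, you can freely read/write one element of a vector from one thread while another thread reads/writes a different element. Just don't reallocate (which invalidates iterators), and don't have one thread write to an element while another thread reads that same element.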
