无锁队列,只有一个线程执行推入和弹出操作,其他线程执行窃取。
但是,我不明白为什么steal()
需要std::atomic_thread_fence(std::memory_order_seq_cst)
。
我认为,steal()
仅具有一个store操作,即_top.compare_exchange_strong
,并且具有memory_order_seq_cst
。那么,为什么还需要seq_cst围栏?
template <typename T>
class WorkStealingQueue {
public:
WorkStealingQueue() : _bottom(1), _top(1) { }
~WorkStealingQueue() { delete [] _buffer; }
int init(size_t capacity) {
if (capacity & (capacity - 1)) {
LOG(ERROR) << "Invalid capacity=" << capacity
<< " which must be power of 2";
return -1;
}
_buffer = new(std::nothrow) T[capacity];
_capacity = capacity;
return 0;
}
// Steal one item from the queue.
// Returns true on stolen.
// May run in parallel with push() pop() or another steal().
bool steal(T* val) {
size_t t = _top.load(std::memory_order_acquire);
size_t b = _bottom.load(std::memory_order_acquire);
if (t >= b) {
// Permit false negative for performance considerations.
return false;
}
do {
std::atomic_thread_fence(std::memory_order_seq_cst);
b = _bottom.load(std::memory_order_acquire);
if (t >= b) {
return false;
}
*val = _buffer[t & (_capacity - 1)];
} while (!_top.compare_exchange_strong(t, t + 1,
std::memory_order_seq_cst,
std::memory_order_relaxed));
return true;
}
// Pop an item from the queue.
// Returns true on popped and the item is written to `val'.
// May run in parallel with steal().
// Never run in parallel with push() or another pop().
bool pop(T* val) {
const size_t b = _bottom.load(std::memory_order_relaxed);
size_t t = _top.load(std::memory_order_relaxed);
if (t >= b) {
// fast check since we call pop() in each sched.
// Stale _top which is smaller should not enter this branch.
return false;
}
const size_t newb = b - 1;
_bottom.store(newb, std::memory_order_relaxed);
std::atomic_thread_fence(std::memory_order_seq_cst);
t = _top.load(std::memory_order_relaxed);
if (t > newb) {
_bottom.store(b, std::memory_order_relaxed);
return false;
}
*val = _buffer[newb & (_capacity - 1)];
if (t != newb) {
return true;
}
// Single last element, compete with steal()
const bool popped = _top.compare_exchange_strong(
t, t + 1, std::memory_order_seq_cst, std::memory_order_relaxed);
_bottom.store(b, std::memory_order_relaxed);
return popped;
}
// Push an item into the queue.
// Returns true on pushed.
// May run in parallel with steal().
// Never run in parallel with pop() or another push().
bool push(const T& x) {
const size_t b = _bottom.load(std::memory_order_relaxed);
const size_t t = _top.load(std::memory_order_acquire);
if (b >= t + _capacity) { // Full queue.
return false;
}
_buffer[b & (_capacity - 1)] = x;
_bottom.store(b + 1, std::memory_order_release);
return true;
}
private:
DISALLOW_COPY_AND_ASSIGN(WorkStealingQueue);
std::atomic<size_t> _bottom;
size_t _capacity;
T* _buffer;
std::atomic<size_t> BAIDU_CACHELINE_ALIGNMENT _top;
};
您不必使用seq-cst-fence,但随后必须使对_bottom
的操作顺序一致。原因是必须确保steal
中的加载操作可以看到写入pop
中的更新值。否则,您可能会遇到竞态条件,即同一物品可能被退回两次(一次从流行球退回,一次从盗窃中退回)。
作为比较,您可以看一下我对Chase-Lev-Deque的实现:https://github.com/mpoeter/xenium/blob/master/xenium/chase_work_stealing_deque.hpp