为什么已经使用seq_cst CAS的无锁队列中需要atomic_thread_fence（memory_order_seq_cst）？

Question

无锁队列，只有一个线程执行推入和弹出操作，其他线程执行窃取。

但是，我不明白为什么steal()需要std::atomic_thread_fence(std::memory_order_seq_cst)。

我认为，steal()仅具有一个store操作，即_top.compare_exchange_strong，并且具有memory_order_seq_cst。那么，为什么还需要seq_cst围栏？

template <typename T>
class WorkStealingQueue {
public:
    WorkStealingQueue() : _bottom(1), _top(1) { }
    ~WorkStealingQueue() { delete [] _buffer; }

    int init(size_t capacity) {
        if (capacity & (capacity - 1)) {
            LOG(ERROR) << "Invalid capacity=" << capacity
                       << " which must be power of 2";
            return -1;
        }

        _buffer = new(std::nothrow) T[capacity];
        _capacity = capacity;
        return 0;
    }

    // Steal one item from the queue.
    // Returns true on stolen.
    // May run in parallel with push() pop() or another steal().
    bool steal(T* val) {
        size_t t = _top.load(std::memory_order_acquire);
        size_t b = _bottom.load(std::memory_order_acquire);
        if (t >= b) {
            // Permit false negative for performance considerations.
            return false;
        }

        do {
            std::atomic_thread_fence(std::memory_order_seq_cst);
            b = _bottom.load(std::memory_order_acquire);
            if (t >= b) {
                return false;
            }
            *val = _buffer[t & (_capacity - 1)];
        } while (!_top.compare_exchange_strong(t, t + 1,
                                               std::memory_order_seq_cst,
                                               std::memory_order_relaxed));
        return true;
    }

    // Pop an item from the queue.
    // Returns true on popped and the item is written to `val'.
    // May run in parallel with steal().
    // Never run in parallel with push() or another pop().
    bool pop(T* val) {
        const size_t b = _bottom.load(std::memory_order_relaxed);
        size_t t = _top.load(std::memory_order_relaxed);
        if (t >= b) {
            // fast check since we call pop() in each sched.
            // Stale _top which is smaller should not enter this branch.
            return false;
        }

        const size_t newb = b - 1;
        _bottom.store(newb, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);

        t = _top.load(std::memory_order_relaxed);
        if (t > newb) {
            _bottom.store(b, std::memory_order_relaxed);
            return false;
        }

        *val = _buffer[newb & (_capacity - 1)];
        if (t != newb) {
            return true;
        }

        // Single last element, compete with steal()
        const bool popped = _top.compare_exchange_strong(
            t, t + 1, std::memory_order_seq_cst, std::memory_order_relaxed);
        _bottom.store(b, std::memory_order_relaxed);
        return popped;
    }

    // Push an item into the queue.
    // Returns true on pushed.
    // May run in parallel with steal().
    // Never run in parallel with pop() or another push().
    bool push(const T& x) {
        const size_t b = _bottom.load(std::memory_order_relaxed);
        const size_t t = _top.load(std::memory_order_acquire);
        if (b >= t + _capacity) { // Full queue.
            return false;
        }

        _buffer[b & (_capacity - 1)] = x;
        _bottom.store(b + 1, std::memory_order_release);
        return true;
    }

private:
    DISALLOW_COPY_AND_ASSIGN(WorkStealingQueue);

    std::atomic<size_t> _bottom;
    size_t _capacity;
    T* _buffer;
    std::atomic<size_t> BAIDU_CACHELINE_ALIGNMENT _top;
};

Answer 1

您不必使用seq-cst-fence，但随后必须使对_bottom的操作顺序一致。原因是必须确保steal中的加载操作可以看到写入pop中的更新值。否则，您可能会遇到竞态条件，即同一物品可能被退回两次（一次从流行球退回，一次从盗窃中退回）。

作为比较，您可以看一下我对Chase-Lev-Deque的实现：https://github.com/mpoeter/xenium/blob/master/xenium/chase_work_stealing_deque.hpp

为什么已经使用seq_cst CAS的无锁队列中需要atomic_thread_fence（memory_order_seq_cst）？

问题描述投票：2回答：1

1个回答

最新问题

为什么已经使用seq_cst CAS的无锁队列中需要atomic_thread_fence（memory_order_seq_cst）？

问题描述 投票：2回答：1

1个回答

最新问题

问题描述投票：2回答：1