Latches and Barriers

General

This section describes various concepts related to thread coordination, and defines the latch, barrier and flex_barrier classes.

This section uses the term 'thread' throughout. Where relevant, it should be updated to refer to execution agents when these are adopted in the standard. See N4231 and N4156.

Terminology

In this subclause, a synchronization point represents a point at which a thread may block until a given condition has been reached.

Latches

Latches are a thread coordination mechanism that allows one or more threads to block until an operation is completed. An individual latch is a single-use object; once the operation has been completed, the latch cannot be reused.

Header <experimental/latch> synopsis


namespace std {
namespace experimental {
inline namespace concurrency_v1 {
  class latch {
   public:
    explicit latch(ptrdiff_t count);
    latch(const latch&) = delete;
    latch& operator=(const latch&) = delete;
    ~latch();

    void count_down_and_wait();
    void count_down(ptrdiff_t n);

    bool is_ready() const noexcept;
    void wait() const;

   private:
    ptrdiff_t counter_; // exposition only
  };
} // namespace concurrency_v1
} // namespace experimental
} // namespace std

Class latch

A latch maintains an internal counter_ that is initialized when the latch is created. Threads may block at a synchronization point waiting for counter_ to be decremented to 0. When counter_ reaches 0, all such blocked threads are released.

Calls to count_down_and_wait(), count_down(), wait(), and is_ready() behave as atomic operations.

explicit latch(ptrdiff_t count);
  Requires: count >= 0.
  Synchronization: None.
  Postconditions: counter_ == count.

~latch();
  Requires: No threads are blocked at the synchronization point.
  Remarks: May be called even if some threads have not yet returned from wait() or count_down_and_wait(), provided that counter_ is 0.
  Note: The destructor might not return until all threads have exited wait() or count_down_and_wait().

void count_down_and_wait();
  Requires: counter_ > 0.
  Effects: Decrements counter_ by 1. Blocks at the synchronization point until counter_ reaches 0.
  Synchronization: Synchronizes with all calls that block on this latch and with all is_ready calls on this latch that return true.
  Throws: Nothing.

void count_down(ptrdiff_t n);
  Requires: counter_ >= n and n >= 0.
  Effects: Decrements counter_ by n. Does not block.
  Synchronization: Synchronizes with all calls that block on this latch and with all is_ready calls on this latch that return true.
  Throws: Nothing.

void wait() const;
  Effects: If counter_ is 0, returns immediately. Otherwise, blocks the calling thread at the synchronization point until counter_ reaches 0.
  Throws: Nothing.

Editorial note: SG1 seems to have a convention that blocking functions are never marked noexcept (e.g. future::wait) even if they never throw. LWG requests that SG1 check whether this pattern is intended, and update the noexcept clauses here accordingly.

bool is_ready() const noexcept;
  Returns: counter_ == 0.
  Remarks: Does not block.
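The counter semantics above can be illustrated with a minimal sketch built on std::mutex and std::condition_variable. This is not the implementation the specification mandates: the names sketch_latch and run_latch_demo are hypothetical, the Requires clauses are checked with assert purely for illustration, and the destructor makes no attempt to wait for threads still leaving wait().

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical illustration type; not the latch class specified above.
class sketch_latch {
 public:
  explicit sketch_latch(std::ptrdiff_t count) : counter_(count) {
    assert(count >= 0);  // Requires: count >= 0
  }
  sketch_latch(const sketch_latch&) = delete;
  sketch_latch& operator=(const sketch_latch&) = delete;

  void count_down_and_wait() {
    std::unique_lock<std::mutex> lock(m_);
    assert(counter_ > 0);  // Requires: counter_ > 0
    if (--counter_ == 0) {
      cv_.notify_all();  // last arrival releases every blocked thread
    } else {
      cv_.wait(lock, [this] { return counter_ == 0; });
    }
  }

  void count_down(std::ptrdiff_t n) {
    std::lock_guard<std::mutex> lock(m_);
    assert(n >= 0 && counter_ >= n);  // Requires: counter_ >= n and n >= 0
    counter_ -= n;
    if (counter_ == 0) cv_.notify_all();  // does not block
  }

  // Assumption of this sketch: locking a mutex here is acceptable. A real
  // implementation would more likely read an atomic counter.
  bool is_ready() const noexcept {
    std::lock_guard<std::mutex> lock(m_);
    return counter_ == 0;
  }

  void wait() const {
    std::unique_lock<std::mutex> lock(m_);
    cv_.wait(lock, [this] { return counter_ == 0; });  // returns at once if 0
  }

 private:
  mutable std::mutex m_;
  mutable std::condition_variable cv_;
  std::ptrdiff_t counter_;
};

// Hypothetical demo: three workers publish a result and count down; the
// calling thread waits on the latch before reading the results.
int run_latch_demo() {
  const int kWorkers = 3;
  std::vector<int> results(kWorkers, 0);
  sketch_latch done(kWorkers);
  std::vector<std::thread> workers;
  for (int i = 0; i < kWorkers; ++i) {
    workers.emplace_back([&results, &done, i] {
      results[i] = (i + 1) * 10;
      done.count_down(1);
    });
  }
  done.wait();  // all writes to results are visible after this returns
  int sum = 0;
  for (int r : results) sum += r;
  for (auto& t : workers) t.join();  // join before the latch is destroyed
  return sum;                        // 10 + 20 + 30
}
```

In this sketch the mutex supplies the required synchronization: every write made before a count_down call is visible to a thread that returns from wait().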

Barrier types

Barriers are a thread coordination mechanism that allows a set of participating threads to block until an operation is completed. Unlike a latch, a barrier is reusable: once the participating threads are released from a barrier's synchronization point, they can reuse the same barrier. A barrier is thus useful for managing repeated tasks, or phases of a larger task, that are handled by multiple threads.

The barrier types are the standard library types barrier and flex_barrier. They shall meet the requirements set out in this subclause. In this description, b denotes an object of a barrier type.

Each barrier type defines a completion phase as a (possibly empty) set of effects. When the member functions defined in this subclause arrive at the barrier's synchronization point, they have the following effects:

  1. The function blocks.
  2. When all threads in the barrier's set of participating threads are blocked at its synchronization point, one participating thread is unblocked and executes the barrier type's completion phase.
  3. When the completion phase is completed, all other participating threads are unblocked. The end of the completion phase synchronizes with the returns from all calls unblocked by its completion.
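The three steps above can be sketched with a phase counter guarded by a mutex. This is an illustration under stated assumptions, not one of the barrier types defined in this subclause: phase_sketch, arrive, and run_phase_demo are hypothetical names, and the sketch simplifies step 2 by letting the last thread to arrive run the completion phase directly instead of blocking and then being unblocked.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical illustration type; not a barrier type from this subclause.
class phase_sketch {
 public:
  phase_sketch(std::ptrdiff_t participants, std::function<void()> completion)
      : participants_(participants), completion_(std::move(completion)) {}

  void arrive() {
    std::unique_lock<std::mutex> lock(m_);
    const std::size_t my_phase = phase_;
    if (++blocked_ == participants_) {
      // Step 2: every participant has arrived, so exactly one thread
      // (here, the last to arrive) executes the completion phase.
      completion_();
      // Step 3: the completion phase is over; release everyone else.
      blocked_ = 0;
      ++phase_;
      cv_.notify_all();
    } else {
      // Step 1: block until the current phase has completed.
      cv_.wait(lock, [&] { return phase_ != my_phase; });
    }
  }

 private:
  std::mutex m_;
  std::condition_variable cv_;
  std::ptrdiff_t participants_;
  std::function<void()> completion_;
  std::ptrdiff_t blocked_ = 0;
  std::size_t phase_ = 0;
};

// Hypothetical demo: counts the completion phases that observed an arrival
// from every participant before running.
int run_phase_demo() {
  const int kThreads = 4;
  const int kPhases = 3;
  std::mutex arrivals_mutex;
  int arrivals = 0;     // guarded by arrivals_mutex
  int good_phases = 0;  // written only inside the completion phase
  phase_sketch sync(kThreads, [&] {
    std::lock_guard<std::mutex> guard(arrivals_mutex);
    if (arrivals == kThreads) ++good_phases;
    arrivals = 0;
  });
  std::vector<std::thread> pool;
  for (int t = 0; t < kThreads; ++t) {
    pool.emplace_back([&] {
      for (int p = 0; p < kPhases; ++p) {
        {
          std::lock_guard<std::mutex> guard(arrivals_mutex);
          ++arrivals;
        }
        sync.arrive();
      }
    });
  }
  for (auto& th : pool) th.join();
  return good_phases;
}
```

Because waiters compare against the phase number they observed on arrival, a released thread can call arrive() again immediately without disturbing threads that have not yet woken from the previous phase.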

The expression b.arrive_and_wait() shall be well-formed and have the following semantics:

void arrive_and_wait();
  Requires: The current thread is a member of the set of participating threads.
  Effects: Arrives at the barrier's synchronization point.
  Note: It is safe for a thread to call arrive_and_wait() or arrive_and_drop() again immediately. It is not necessary to ensure that all blocked threads have exited arrive_and_wait() before one thread calls it again.
  Synchronization: The call to arrive_and_wait() synchronizes with the start of the completion phase.
  Throws: Nothing.

The expression b.arrive_and_drop() shall be well-formed and have the following semantics:

void arrive_and_drop();
  Requires: The current thread is a member of the set of participating threads.
  Effects: Either arrives at the barrier's synchronization point and then removes the current thread from the set of participating threads, or just removes the current thread from the set of participating threads.
  Note: Removing the current thread from the set of participating threads can cause the completion phase to start.
  Synchronization: The call to arrive_and_drop() synchronizes with the start of the completion phase.
  Throws: Nothing.
  Note: If all participating threads call arrive_and_drop(), any further operations on the barrier are undefined, apart from calling the destructor. If a thread that has called arrive_and_drop() calls another member function on the same barrier, other than the destructor, the results are undefined.

Calls to arrive_and_wait() and arrive_and_drop() never introduce data races with themselves or each other.

Header <experimental/barrier> synopsis


namespace std {
namespace experimental {
inline namespace concurrency_v1 {
  class barrier;
  class flex_barrier;
} // namespace concurrency_v1
} // namespace experimental
} // namespace std

Class barrier

barrier is a barrier type whose completion phase has no effects. Its constructor takes a parameter representing the initial size of its set of participating threads.


class barrier {
 public:
  explicit barrier(ptrdiff_t num_threads);
  barrier(const barrier&) = delete;
  barrier& operator=(const barrier&) = delete;
  ~barrier();

  void arrive_and_wait();
  void arrive_and_drop();
};

explicit barrier(ptrdiff_t num_threads);
  Requires: num_threads >= 0.
  Note: If num_threads is zero, the barrier may only be destroyed.
  Effects: Initializes the barrier for num_threads participating threads.
  Note: The set of participating threads is the first num_threads threads to arrive at the synchronization point.

~barrier();
  Requires: No threads are blocked at the synchronization point.
  Effects: Destroys the barrier.
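As a rough illustration of how a barrier with an empty completion phase might be used for phased work, the sketch below hand-rolls a small barrier and runs a two-phase computation in which one thread drops out after the first phase. The names sketch_barrier and run_barrier_demo are hypothetical, and the sketch takes the second option permitted for arrive_and_drop(): it just removes the caller from the set of participating threads without blocking.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical illustration type; not the barrier class specified above.
class sketch_barrier {
 public:
  explicit sketch_barrier(std::ptrdiff_t num_threads) : expected_(num_threads) {
    assert(num_threads >= 0);  // Requires: num_threads >= 0
  }
  sketch_barrier(const sketch_barrier&) = delete;
  sketch_barrier& operator=(const sketch_barrier&) = delete;

  void arrive_and_wait() {
    std::unique_lock<std::mutex> lock(m_);
    const std::size_t my_phase = phase_;
    if (++arrived_ == expected_) {
      release_all();  // empty completion phase: just start the next phase
    } else {
      cv_.wait(lock, [&] { return phase_ != my_phase; });
    }
  }

  void arrive_and_drop() {
    std::lock_guard<std::mutex> lock(m_);
    --expected_;  // the caller leaves the set of participating threads
    // Dropping can complete the phase for the threads already blocked.
    if (arrived_ == expected_) release_all();
  }

 private:
  void release_all() {  // caller must hold m_
    arrived_ = 0;
    ++phase_;
    cv_.notify_all();
  }

  std::mutex m_;
  std::condition_variable cv_;
  std::ptrdiff_t expected_;
  std::ptrdiff_t arrived_ = 0;
  std::size_t phase_ = 0;
};

// Hypothetical demo: worker 0 takes part in phase 1 only, then drops.
int run_barrier_demo() {
  sketch_barrier sync(3);
  std::vector<int> data(3, 0), partial(3, 0), total(3, 0);
  auto worker = [&](int i) {
    data[i] = i + 1;
    sync.arrive_and_wait();  // phase 1: all of data is now visible
    if (i == 0) {
      sync.arrive_and_drop();  // worker 0 leaves; two participants remain
      return;
    }
    partial[i] = data[0] + data[1] + data[2];  // 1 + 2 + 3 = 6
    sync.arrive_and_wait();  // phase 2: only workers 1 and 2 take part
    total[i] = partial[1] + partial[2];  // 6 + 6 = 12
  };
  std::vector<std::thread> pool;
  for (int i = 0; i < 3; ++i) pool.emplace_back(worker, i);
  for (auto& t : pool) t.join();  // join before the barrier is destroyed
  return total[1] + total[2];     // 12 + 12 = 24
}
```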

Class flex_barrier

flex_barrier is a barrier type whose completion phase can be controlled by a function object.


class flex_barrier {
 public:
  template <class F>
    flex_barrier(ptrdiff_t num_threads, F completion);
  explicit flex_barrier(ptrdiff_t num_threads);
  flex_barrier(const flex_barrier&) = delete;
  flex_barrier& operator=(const flex_barrier&) = delete;

  ~flex_barrier();

  void arrive_and_wait();
  void arrive_and_drop();

 private:
  function<ptrdiff_t()> completion_;  // exposition only
};

The completion phase calls completion_(). If this returns -1, then the set of participating threads is unchanged. Otherwise, the set of participating threads becomes a new set with a size equal to the returned value. If completion_() returns 0 then the set of participating threads becomes empty, and this object may only be destroyed.

template <class F> flex_barrier(ptrdiff_t num_threads, F completion);
  Requires:
  • num_threads >= 0.
  • F shall be CopyConstructible.
  • completion shall be Callable (C++14 §[func.wrap.func]) with no arguments and return type ptrdiff_t.
  • Invoking completion shall return a value greater than or equal to -1 and shall not exit via an exception.
  Effects: Initializes the flex_barrier for num_threads participating threads, and initializes completion_ with std::move(completion).
  Note: The set of participating threads consists of the first num_threads threads to arrive at the synchronization point.
  Note: If num_threads is 0 the set of participating threads is empty, and this object may only be destroyed.

explicit flex_barrier(ptrdiff_t num_threads);
  Requires: num_threads >= 0.
  Effects: Has the same effect as creating a flex_barrier with num_threads and with a callable object whose invocation returns -1 and has no side effects.

~flex_barrier();
  Requires: No threads are blocked at the synchronization point.
  Effects: Destroys the barrier.
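The effect of the completion function's return value can be shown with a small sketch. The names sketch_flex_barrier and run_flex_demo are hypothetical, arrive_and_drop() is omitted to keep the example short, and the completion function runs on the last thread to arrive, which is a simplification of this sketch and not something the wording above requires.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical illustration type; not the flex_barrier class specified above.
class sketch_flex_barrier {
 public:
  template <class F>
  sketch_flex_barrier(std::ptrdiff_t num_threads, F completion)
      : expected_(num_threads), completion_(std::move(completion)) {
    assert(num_threads >= 0);  // Requires: num_threads >= 0
  }
  // Same effect as a completion function that returns -1 with no side effects.
  explicit sketch_flex_barrier(std::ptrdiff_t num_threads)
      : sketch_flex_barrier(num_threads, [] { return std::ptrdiff_t(-1); }) {}

  void arrive_and_wait() {
    std::unique_lock<std::mutex> lock(m_);
    const std::size_t my_phase = phase_;
    if (++arrived_ == expected_) {
      const std::ptrdiff_t next = completion_();  // the completion phase
      assert(next >= -1);                // Requires: result is at least -1
      if (next != -1) expected_ = next;  // -1 leaves the set unchanged
      arrived_ = 0;
      ++phase_;
      cv_.notify_all();
    } else {
      cv_.wait(lock, [&] { return phase_ != my_phase; });
    }
  }

 private:
  std::mutex m_;
  std::condition_variable cv_;
  std::ptrdiff_t expected_;
  std::function<std::ptrdiff_t()> completion_;
  std::ptrdiff_t arrived_ = 0;
  std::size_t phase_ = 0;
};

// Hypothetical demo: three threads take part in phase 1; the completion
// function then shrinks the set to two threads for phases 2 and 3.
int run_flex_demo() {
  int calls = 0;  // touched only inside the completion phase
  sketch_flex_barrier sync(3, [&calls]() -> std::ptrdiff_t {
    return ++calls == 1 ? 2 : -1;  // first call: new size 2; later: unchanged
  });
  auto worker = [&sync](int phases) {
    for (int p = 0; p < phases; ++p) sync.arrive_and_wait();
  };
  std::thread a(worker, 1), b(worker, 3), c(worker, 3);
  a.join();
  b.join();
  c.join();
  return calls;  // one completion phase per barrier phase
}
```

In the demo, thread a stops calling the barrier after the first phase, which matches the new set size of 2 returned by the first completion call; the remaining two threads complete phases 2 and 3 on their own.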