This proposal is a collection of great ideas from many people and spanning years of work and experience. Please see the Acknowledgments. And know that I probably missed some key ones (sorry in advance).
I'm writing this with the knowledge that some parts of this proposal are hotly contested. Additionally, this is a huge proposal, encompassing thread-launching, mutexes/locks and condition variables. It is hoped that the table of contents will allow you to quickly jump to your area of interest and gather all of the information important to you quickly and concisely. Not covered herein are the incredibly important topics: memory model, atomics, and thread local data.
I believe that we need flexibility in the way we launch threads (or really just units of work). Below is a top-down view of a fully prototyped API.
This part of the proposal supports the views in N2096 with only a few minor changes (differences that can certainly be worked out). Indeed, much of the work presented herein is based on the work in N2096. The important characteristics are consistent: A future:
Example use cases include:
int f(); double g(int x, const char* y); std::vector<int> h(int x); int main() { std::future<int> f1 = std::launch_in_thread(f); std::future<double> f2 = std::launch_in_thread(std::bind(g, 1, "two")); // ... int i = f1(); double d = f2(); // ... std::thread_pool launch_in_pool; std::future<std::vector<int>> f3 = launch_in_pool(std::bind(h, 100)); // ... std::vector<int> v = f3(); }
futures can be easily and cheaply copied around, and queried by multiple threads, or even the same thread multiple times. The future can also be waited on by several (simultaneous) clients. If the work-unit computing the result for the future terminates via an exception, then that information is propagated to clients who wait on the future via a standard exception (new language features will be required to propagate a copy of an arbitrary exception).
future synopsis:
template <class R> class future { private: future(const task<R>&); public: future(); ~future(); future(task<R>&& t); future(const future& t); future& operator=(const future& t); bool ready() const; bool is_normal() const; void wait(); bool timed_wait(const timespec& xt); const R& operator()() const; };
Not proposed herein is future groups. That is not because I think they're a bad idea. It is simply because I ran out of time to prototype it. Everything proposed herein has been prototyped by myself, and in some cases by several more people independently. I strongly suspect future groups (e.g. (f1 || f2) && f3) would be a valuable layer above futures and encourage an open mind towards this idea. I do not currently have the expertise/experience to propose it.
The introduction of move semantics in C++ impacts the way interfaces can be designed. In particular, moveable but non-copyable types such as fstream can be handled much more elegantly in this new environment. For example clients will now be able to return such types from functions, and put such types into containers.
std::vector<std::ifstream> find_files(some_search_criteria);
It is important that C++ programmers be able to easily execute such functions with non-copyable return types. Lack of attention to this detail will mean that C++ programmers will be able to execute such functions with ordinary (synchronous) calls, but will be confronted by a compile time error when they try to query a std::future<std::vector<std::ifstream>> resulting from an asynchronous call.
Thus a task is introduced. It functions much like a future, but does not require its type to be CopyConstructible (only MoveConstructible). The task itself is also not CopyConstructible but only MoveConstructible. You can place such objects in containers, and return them from functions, but you can not make copies of them, nor query them more than once. It is a one-shot view of a work-unit's return value.
Example use cases include:
int f(); double g(int x, const char* y); std::vector<std::unique_ptr<Foo>> h(int x); int main() { std::task<int> f1 = std::launch_in_thread(f); std::task<double> f2 = std::launch_in_thread(std::bind(g, 1, "two")); // ... int i = f1(); double d = f2(); // ... std::thread_pool launch_in_pool; std::task<std::vector<std::unique_ptr<Foo>>> f3 = launch_in_pool(std::bind(h, 100)); // ... std::vector<std::unique_ptr<Foo>> v = f3(); }
Note that the exact same launching functions and thread pools can be used with task as is used with future. The task fits seamlessly into this API, and has an ease of use equivalent to future.
Why do we need a separate class task? Can't we just have a move-from-future call?
Consider the following program with two threads calling two different functions, but with the same type of future as an argument:
Thread A Thread B void foo(future<T> f) { T t = f(); } void bar(future<T> f) { T t = f(); }
There is no problem with this code, even if both future's refer to the same result. However, consider if one thread attempted to modify that result by moving from it:
Thread A Thread B void foo(future<T> f) { T t = f(); } void bar(future<T> f) { T t = move(f()); // race condition? }
By using a separate class task which is by design incapable of sharing with future's, or even with other task's, it is clear that there can be no race condition:
Thread A Thread B void foo(future<T> f) { T t = f(); } void bar(task<T> f) { T t = f(); // Ok } void foo(task<T> f) { T t = f(); } void bar(task<T> f) { T t = f(); // Also Ok }
In a nutshell, the future API must be such that clients are incapable of modifying the work-unit's return value, and task must not share ownership. To do otherwise presents an API to the C++ programmer that is overly error-prone. This is exactly why this proposal has no set_value functionality for future as proposed in N2096. Herein, task_wrapper has the unique and protected privilege of setting the value associated with a task or future.
Not only does task provide the functionality of returning move-only types, and aids in reasoning about program correctness, it can also provide significant performance benefits for those use cases where one returns a type that is expensive to copy, but cheap to move.
Consider this benchmark:
std::vector<int> func(unsigned s, int i) { return std::vector<int>(s, i); } const unsigned Repeat = 10000; const unsigned Len = 100000; int main() { std::thread_pool launch_in_pool; for (int i = 0; i < Repeat; ++i) { std::future<std::vector<int>> f = launch_in_pool(std::bind(func, Len, i)); std::vector<int> v = f(); } }
To study the overhead of this thread launching API, the above program has been run in 5 different configurations on a single processor machine, and the timing measured. The for-loop was varied for each configuration:
Ordinary function call:
std::vector<int> v = func();
launch in pool, catch in task:
task<std::vector<int>> f = launch_in_pool(std::bind(func, Len, i)); std::vector<int> v = f();
launch in thread, catch in task:
task<std::vector<int>> f = launch_in_thread(std::bind(func, Len, i)); std::vector<int> v = f();
launch in pool, catch in future:
future<std::vector<int>> f = launch_in_pool(std::bind(func, Len, i)); std::vector<int> v = f();
launch in thread, catch in future:
future<std::vector<int>> f = launch_in_thread(std::bind(func, Len, i)); std::vector<int> v = f();
The resulting timings are:
Ordinary function call : 8.40 seconds launch in pool, catch in task : 9.54 seconds launch in thread, catch in task : 12.18 seconds launch in pool, catch in future : 28.87 seconds launch in thread, catch in future : 30.71 seconds
For the ordinary function call case, move semantics does not come into play. The vector is returned via RVO.
Executing in a pool adds the overhead of coordinating with an already running thread which executes func and then signals the main thread to get the result. If the result is caught with a task it is moved out. If the result is caught with a future, it is copied out. If func is executed in a thread, instead of a pool, then more overhead is added for OS thread startup and shutdown.
It is clear that there are cases where the expense of copying from a future is significant enough to negate the performance one might gain from the use of threads (or even thread pools) on a multi-processor architecture. The vector is not that big in this example. It consumes only 391Kb, a small amount of memory by today's standards. As the expense of copying the return value grows, so does the cost of using a future over that of a task or ordinary function call. If the free lunch is truly over, and high-level threading is the answer, then we need to pay attention to order-of-magnitude effects in performance such as this.
task synopsis:
template <class R> class task { private: task(const task&); task& operator=(const task&); public: task(); ~task(); task(task&& t); task& operator=(task&& t); bool ready() const; bool is_normal() const; void wait(); bool timed_wait(const timespec& xt); R operator()(); };
Note that this task is a different class than that proposed in N2096. This proposal has a similar class to that proposed in N2096, but it is called task_wrapper instead.
A thread pool is an object that maps M tasks onto N OS threads, where M is typically much larger than N. This can circumvent the cost of thread startup and shut down. The design laid out here is basic. One can specify the number of OS threads in the constructor, or default construct the class to let the implementation pick an "optimum" number of OS threads. A function-call syntax is used to queue tasks which will be executed as an OS thread becomes available for work. The client can optionally catch the return value of this queuing, either as a future or a task to examine at a later point (example use shown under Tasks and Futures).
thread_pool synopsis:
class thread_pool { private: thread_pool(const thread_pool&); thread_pool & operator=(const thread_pool&); public: explicit thread_pool(unsigned n = thread_util::number_of_processors()); ~thread_pool() template <class F> task<typename result_of<F()>::type> operator()(F fn); };
Some clients may want to create their own factory functions or classes similar to std::launch_in_thread(F) or std::thread_pool. This is facilitated by a lower-level class called task_wrapper. N2096 refers to this class as task.
task_wrapper synopsis:template <class R, class F> class task_wrapper { public: task_wrapper(F f, task<R>& ft); task_wrapper(F f, future<R>& ft); task_wrapper(const task_wrapper& tw); task_wrapper(task_wrapper&& tw); task_wrapper& operator=(const task_wrapper& tw); task_wrapper& operator=(task_wrapper&& tw); ~task_wrapper(); void operator()() };
task_wrapper is responsible for binding a future or task to a nullary function returning a type R, and creating a new functor which manages storage for the return value of the user's function, and provides for that return value to be constructed in the provided-for storage. It is also the task_wrapper which is responsible for detecting abnormal termination of the user's function and translating that into a suitable exception in the waiter's thread.
Glancing at the implementation of std::launch_in_thread reveals how easy it is to craft a custom launcher:
template <class F> task<typename result_of<F()>::type> launch_in_thread(F f) { typedef typename result_of<F()>::type R; task<R> t; thread(task_wrapper<R, F>(std::move(f), t)); return t; }
Upon return from this function, both task_wrapper and task share ownership of the storage for the return value of f. task_wrapper is now a function object that can be sent somewhere (anywhere) to be executed (thread in this example).
A factory function could just as easily return a future instead of a task as shown above. Returning a task has the advantage of allowing your clients to traffic in either futures or tasks as an rvalue task is implicitly convertible to a future (and with very little expense - the cost of two pointer assignments and a pointer comparison).
Here is another example: the launching operator of thread_pool:
template <class F> task<typename result_of<F()>::type> thread_pool::operator()(F f) { typedef typename result_of<F()>::type R; task<R> t; { exclusive_lock<mutex> lock(mut_); queue_.push_back(task_wrapper<R, F>(std::move(f), t)); } cond_.notify_all(); return t; }
Here again the function object task_wrapper binds the user's function with a task to be returned to the user. The task_wrapper function object is then stored in a container for later execution.
Recall that task and future merely represent views of a work-unit's result. They don't actually represent the work-unit itself, much less an OS-thread. Indeed, much of a future's (or task's) visible life can be spent waiting in a pool_thread queue as opposed to executing.
To represent an OS thread, a simple (low-level), thread class is introduced. This class is designed to be as thin a layer over the OS thread as possible. After all, task's, future's, and possibly user-defined abstractions will be built on top of thread and thus all overhead is significant.
thread synopsis:class thread { private: thread(const thread&); thread& operator=(const thread&); public: thread(); ~thread(); thread(thread&& x); thread& operator=(thread&& x); template <class F> explicit thread(F f); void operator()(); typedef implementation-defined native_handle_type; native_handle_type native_handle(); thread_util::id native_id() const; };
thread provides no return type. It is not copyable, only movable. It provides no exception notification. It models only the OS work-unit, and not any results (normal or abnormal results) of that work-unit. Typical implementations will have only one or two simple types which represent the OS thread (e.g. pthread_t). There is no need for member mutexes or condition variables at this level. There is no need to store the user's function F within this class. It can be directly transferred to the work-unit from within the thread constructor.
For accessing OS-specific functionality not covered by this API, the OS's thread handle is exposed. Implementations are allowed to provide implementation-defined constructors in addition to the constructors shown here. Such constructors might set OS thread attributes such as stack size or thread priority.
A default constructed thread refers to no thread, and spawns no thread of execution. The thread destructor detaches from the thread of execution if it exists at that time. Function operator syntax is used to allow a single thread to join. Clients needing the functionality of multiple threads joining (even if the return type is void), should use future instead.
The above is very similar to boost::threads and N1907. This is not by accident. This API has a proven track record. Yes, there are complaints and frustrations with this low-level API. This is what the higher-level task and future address.
A notable departure from N1907 is the addition of movability. This addition adds no overhead, and allows significant functionality such as putting thread into containers (which is very handy when building a thread_pool). Indeed, the ability to put thread into containers largely negates the motivation for boost::thread_group. One can just as easily create a std::vector<thread>, or std::list<thread> as the need arises. It is not difficult to imagine a std::priority_queue<some-user-defined-thread-wrapper> which merely adds some ordering (priority) to thread.
Another difference is that of thread identity. Since thread is not copyable, it is not possible for two thread objects to compare equal (refer to the same OS thread). However clients still need the ability to store, compare and copy objects which refer to thread identity. Common use cases include comparing a stored thread id with the currently executing thread. To meet this demand, thread identity, along with several utilities that operate only on the current thread of execution have been recast as free standing functions in a self-describing namespace:
thread_util synopsis:namespace std { namespace thread_util { typedef implementation-defined id; bool equal(id x, id y); id current(); id not_any_thread(); void yield(); void sleep(const timespec& rel_time); unsigned number_of_processors(); } // thread_util } // std
Using this facility, clients can easily store and compare OS thread id's:
std::thread_util::id id; ... if (!std::thread_util::equal(id, std::thread_util::current())) id = std::thread_util::current(); else id = std::thread_util::not_any_thread();
Thus this proposal encompasses all of the OS thread functionality offered by N1907 (and boost) but places some of it in namespace thread_util instead of having static thread functions.
This proposal also adds two low-level thread utilities that N1907/boost do not offer:
There are a set of four mutex concepts, each a refinement of the previous. Notably absent from all of these concepts is Movable or Copyable. Mutex concepts are introduced to allow user-defined mutexes in addition to the standard supplied mutexes. C++ programmers will be able to plug their own mutexes into the standard framework much like they can use their own containers with standard supplied algorithms today.
This is the base mutex concept which only models mutually exclusive ownership. It has five required signatures. Notably absent from this requirement is that of CopyConstructible, CopyAssignable, MoveConstructible and MoveAssignable. Implementations are encouraged to supply additional implementation defined constructors which allow setting various OS dependent mutex attributes.
The exception policy is that if a precondition is violated, the results are undefined. However the locking functions may fail due to lack of system resources, and are allowed to throw an exception in such an event. The try-locking, unlocking, and non-blocking conversion functions are specified to not throw an exception. On these signatures the empty throw specification is intended to indicate that the function will not throw a signature, and is not intended to indicate that the function may call unexpected() (it may or may not do so if preconditions are violated leading to undefined behavior).
Not blocking
This constructs a mutex with default attributes (appropriate for the environment), and leaves the mutex in an unlocked state. The constructor may throw an exception to indicate failure.
Not blocking
Preconditions: The mutex is in an unlocked state.
The mutex destructor does not throw any exceptions. It releases any resources that the mutex may have owned.
Blocking
Precondition: This thread does not currently have sharable or upgradable ownership of the mutex. It can only have exclusive ownership if the mutex supports recursive locking.
Effects: Obtains exclusive ownership of the mutex, blocking if necessary. No other thread can have exclusive ownership, upgradable ownership, or sharable ownership while this thread owns exclusive ownership. The mutex may or may not support recursive locking on the same thread. If it does, the mutex must be unlocked the same number of times it is locked.
lock() may throw an exception on failure.
Not blocking
Precondition: This thread does not currently have sharable or upgradable ownership of the mutex. It can only have exclusive ownership if the mutex supports recursive locking.
Effects: Will attempt to obtain exclusive ownership of the mutex. The intent is that the call will return immediately if it is unable to obtain ownership. This ownership can not be shared with other thread's exclusive, upgradable or sharable ownership. The mutex may or may not support recursive locking on the same thread. If it does, the mutex must be unlocked the same number of times it is locked.
Returns: true if the lock was obtained, else returns false.
Not blocking
Precondition: This thread currently owns exclusive ownership of the mutex. If the mutex is recursive, it must be unlocked the same number of times it was locked via the various routines which create exclusive ownership.
Effects: Will release exclusive ownership of the mutex. Some implementations may briefly block to gain permission to write to the mutex state. But in general this is a non-blocking operation.
The shared ownership concept is a refinement of the exclusive ownership concept. In addition to implementing all of the requirements of the exclusive ownership concept, the mutex conforming to the shared ownership concept will also implement three additional signatures:
This concept introduces the ability for multiple threads to simultaneously own a mutex. The intent is that the threads holding shared ownership will not modify the protected data, but only read it. While any thread holds a sharable lock, no thread will hold an exclusive lock, and vice-versa.
Blocking
Precondition: This thread does not currently have exclusive or upgradable ownership of the mutex. It can only have sharable ownership if the mutex supports recursive sharable locking.
Effects: Obtains sharable ownership of the mutex, blocking if necessary. No other thread can have exclusive ownership. But other threads may have upgradable ownership, or sharable ownership. The mutex may or may not support recursive sharable locking on the same thread. If it does, the mutex must be sharable-unlocked the same number of times it is sharable-locked.
lock_sharable() may throw an exception on failure.
Not blocking
Precondition: This thread does not currently have exclusive or upgradable ownership of the mutex. It can only have sharable ownership if the mutex supports recursive sharable locking.
Effects: Will attempt to obtain sharable ownership of the mutex. The intent is that the call will return immediately if it is unable to obtain ownership. This ownership can not be shared with other thread's exclusive ownership, but may be shared with other thread's upgradable or sharable ownership. The mutex may or may not support recursive sharable locking on the same thread. If it does, the mutex must be sharable-unlocked the same number of times it is sharable-locked.
Returns: true if the lock was obtained, else returns false.
Not blocking
Precondition: This thread currently owns sharable ownership of the mutex. If the mutex is recursive, it must be sharable-unlocked the same number of times it was sharable-locked via the various routines which create sharable ownership.
Effects: Will release sharable ownership of the mutex. Some implementations may briefly block to gain permission to write to the mutex state. But in general this is a non-blocking operation.
The convertible shared ownership concept is a refinement of the shared ownership concept. In addition to implementing all of the requirements of the shared ownership concept, the mutex conforming to the convertible shared ownership concept will also implement two additional signatures:
This concept introduces the ability to convert an exclusive lock to a sharable lock (in a non-blocking manner), and to try to convert a sharable lock into an exclusive lock in a non-blocking manner, which is not always possible.
In converting an exclusive lock to a sharable lock, no other thread will be able to obtain exclusive ownership of the mutex during this transition. Thus whatever changes the thread made to the data under the exclusive lock are still valid under the sharable lock.
The conversion from a sharable lock to an exclusive lock is only guaranteed to succeed if no other thread is waiting for exclusive ownership, and if no other thread currently owns sharable or upgradable ownership of the mutex. Whether or not successful, the protected data which was read under the sharable lock will still be valid after the attempted conversion.
Not blocking
Precondition: This thread currently has exclusive ownership of the mutex. If the mutex is recursive, the exclusive lock count must be 1.
Effects: Will atomically release exclusive ownership of the mutex and obtain sharable ownership. After this operation, other threads waiting on sharable or upgradable ownership may also obtain ownership. Other threads waiting on exclusive ownership will not have a chance to gain exclusive ownership during this operation. Some implementations may briefly block to gain permission to write to the mutex state. But in general this is a non-blocking operation.
Not blocking
Precondition: This thread currently has sharable ownership of the mutex. If the mutex is recursive, the sharable lock count must be 1.
Effects: Will try to atomically release sharable ownership of the mutex and obtain exclusive ownership. If ownership is obtained it is guaranteed that no other thread will obtain exclusive ownership before this thread does. Therefore the thread may assume that what it has read under the sharable ownership is still valid (if exclusive ownership was obtained).
Returns: true if exclusive ownership was obtained, else returns false. If false is returned, the thread still has sharable ownership of the mutex.
The upgradable ownership concept is a refinement of the convertible shared ownership concept. In addition to implementing all of the requirements of the convertible shared ownership concept, the mutex conforming to the upgradable ownership concept will also implement eight additional signatures:
This concept introduces the ability for a single thread which holds upgradable ownership to both share with other threads holding sharable ownership, while uniquely holding the privilege of upgrading this "shared ownership" to exclusive ownership without the danger of another thread gaining exclusive ownership during the transition (even though it is a potentially blocking transition). The protected data read under an upgradable lock will still be valid after the upgrade to an exclusive lock.
As the concept of upgradable is meaningless without also the concept of converting one type of lock to the other, the lock conversion functionality is included within this concept instead of having a separate convertible upgradable ownership concept.
Blocking
Precondition: This thread does not currently have exclusive or sharable ownership of the mutex. It can only have upgradable ownership if the mutex supports recursive upgradable locking.
Effects: Obtains upgradable ownership of the mutex, blocking if necessary. No other thread can have exclusive or upgradable ownership. But other threads may have sharable ownership. The mutex may or may not support recursive upgradable locking on the same thread. If it does, the mutex must be upgradable-unlocked the same number of times it is upgradable-locked.
lock_upgradable() may throw an exception on failure.
Not blocking
Precondition: This thread does not currently have exclusive or sharable ownership of the mutex. It can only have upgradable ownership if the mutex supports recursive upgradable locking.
Effects: Will attempt to obtain upgradable ownership of the mutex. The intent is that the call will return immediately if it is unable to obtain ownership. This ownership can not be shared with other thread's exclusive ownership or upgradable ownership, but may be shared with other thread's sharable ownership. The mutex may or may not support recursive upgradable locking on the same thread. If it does, the mutex must be upgradable-unlocked the same number of times it is upgradable-locked.
Returns: true if the lock was obtained, else returns false.
Not blocking
Precondition: This thread currently has upgradable ownership of the mutex. If the mutex is recursive, it must be upgradable-unlocked the same number of times it was upgradable-locked via the various routines which create upgradable ownership.
Effects: Will release upgradable ownership of the mutex. Some implementations may briefly block to gain permission to write to the mutex state. But in general this is a non-blocking operation.
Blocking
Precondition: This thread currently has upgradable ownership of the mutex. If the mutex is recursive, the upgradable lock count must be 1.
Effects: Will atomically release upgradable ownership of the mutex and obtain exclusive ownership. During this operation it is possible that other threads will obtain sharable or upgradable ownership. However, it is guaranteed that no other thread will obtain exclusive ownership before this thread does. Therefore the thread may assume that what it has read under the upgradable ownership is still valid.
unlock_upgradable_and_lock() may throw an exception on failure.Not blocking
Precondition: This thread currently has upgradable ownership of the mutex. If the mutex is recursive, the upgradable lock count must be 1.
Effects: Will try to atomically release upgradable ownership of the mutex and obtain exclusive ownership. If ownership is obtained it is guaranteed that no other thread will obtain exclusive ownership before this thread does. Therefore the thread may assume that what it has read under the upgradable ownership is still valid (if exclusive ownership was obtained).
Returns: true if exclusive ownership was obtained, else returns false. If false is returned, the thread still has upgradable ownership of the mutex.
Not blocking
Precondition: This thread currently has exclusive ownership of the mutex. If the mutex is recursive, the exclusive lock count must be 1.
Effects: Will atomically release exclusive ownership of the mutex and obtain upgradable ownership. After this operation, other threads waiting on sharable ownership may also obtain ownership. Other threads waiting on exclusive or upgradable ownership will not have a chance to gain ownership during this operation. Some implementations may briefly block to gain permission to write to the mutex state. But in general this is a non-blocking operation.
Not blocking
Precondition: This thread currently has upgradable ownership of the mutex. If the mutex is recursive, the upgradable lock count must be 1.
Effects: Will atomically release upgradable ownership of the mutex and obtain sharable ownership. After this operation, other threads waiting on sharable or upgradable ownership may also obtain ownership. Other threads waiting on exclusive ownership will not have a chance to gain exclusive ownership during this operation. Some implementations may briefly block to gain permission to write to the mutex state. But in general this is a non-blocking operation.
Not blocking
Precondition: This thread currently has sharable ownership of the mutex. If the mutex is recursive, the sharable lock count must be 1.
Effects: Will try to atomically release sharable ownership of the mutex and obtain upgradable ownership. If ownership is obtained it is guaranteed that no other thread will obtain exclusive ownership before this thread has an opportunity to upgrade to exclusive ownership. Therefore the thread may assume that what it has read under the sharable ownership is still valid (if exclusive ownership is eventually obtained).
Returns: true if upgradable ownership was obtained, else returns false. If false is returned, the thread still has sharable ownership of the mutex.
The implementation provides the following types which meet these concepts:
It is unspecified whether all of the above mutex types are actually different types, or simply typedef's to a smaller set of common types.
The null_mutex models the upgradable ownership concept, and simply immediately returns for each call (with a value of true for the try-locks). This mutex can be useful for types which take a "synchronization policy" as a template parameter.
There are three different types of locks:
These three locks manage an appropriate mutex with a largely generic interface. The common part of this interface is described below for an imaginary type called basic_lock which essentially represents a lock concept. The interfaces unique to the three concrete types of locks are described later.
extern detail::defer_lock_type defer_lock; extern detail::try_lock_type try_to_lock; extern detail::accept_ownership_type accept_ownership; template <class Mutex> class basic_lock { public: typedef Mutex mutex_type; private: mutex_type* m_; // exposition only bool owns_; // exposition only public: basic_lock(); explicit basic_lock(mutex_type& m); basic_lock(mutex_type& m, detail::defer_lock_type); basic_lock(mutex_type& m, detail::try_lock_type); basic_lock(mutex_type& m, detail::accept_ownership_type); ~basic_lock(); basic_lock(basic_lock&& bl); basic_lock& operator=(basic_lock&& bl); private: basic_lock(const basic_lock&); basic_lock& operator=(const basic_lock&); public: void lock(); bool try_lock(); void unlock(); bool owns() const; operator unspecified-bool-type() const; mutex_type* mutex() const; void swap(basic_lock&& bl); mutex_type* release(); };
Each lock is templated on the mutex type, which is also represented by the nested type: mutex_type. These locks manage a pointer to a mutex (which may be null), and an owns flag. The lock does not govern the lifetime of the mutex. That is, the lock's responsibility is to lock / unlock the mutex in various ways, but not to own the memory which the mutex resides as a smart pointer would.
Effects: The default constructor constructs a lock which owns no mutex, and refers to no mutex.
Throws: Nothing.
Postconditions:
mutex() == 0 owns() == false
Effects: This explicit constructor constructs a lock which refers to the mutex m, and owns it. Calls lock() which will lock the mutex in a way defined differently for each type of lock, and may block in the process.
Preconditions: The lifetime of m is greater than the lifetime of this lock and of any object which ownership of this mutex may be transferred to.
Postconditions:
mutex() == &m owns() == true
Effects: This explicit constructor constructs a lock which refers to the mutex m, but does not own it. It will not perform any operation on the mutex.
Preconditions: The lifetime of m is greater than the lifetime of this lock and of any object which ownership of this mutex may be transferred to.
Throws: Nothing.
Postconditions:
mutex() == &m owns() == false
Effects: This explicit constructor constructs a lock which refers to the mutex m. Calls try_lock() which will attempt to gain ownership of m in a non-blocking manner defined differently for each type of lock.
Preconditions: The lifetime of m is greater than the lifetime of this lock and of any object which ownership of this mutex may be transferred to.
Postconditions:
mutex() == &mowns() will report the success or failure of the attempt to gain lock ownership.
Effects: This explicit constructor constructs a lock which refers to the mutex m, and owns it, but will not perform any operation on the mutex.
Preconditions: This thread has appropriate ownership of this mutex, and no other object is managing that ownership. The lifetime of m is greater than the lifetime of this lock and of any object which ownership of this mutex may be transferred to.
Throws: Nothing.
Postconditions:
mutex() == &m owns() == true
Effects: if owns() then calls unlock() which is defined appropriately for each lock.
Throws: Nothing.
Effects: Mutex ownership (if any) is transferred from the rvalue source to this. There is no effect on the referenced mutex.
Throws: Nothing.
Postconditions:
mutex() == the value of bl.mutex() before the call.
owns() == the value of bl.owns() before the call.
bl.mutex() == 0
bl.owns() == false
Effects: Mutex ownership (if any) is transferred from the rvalue source to this. There is no effect on the referenced source mutex. If this lock owns a mutex prior to the assignment, that mutex is unlocked with unlock().
Throws: Nothing.
Postconditions:
mutex() == the value of bl.mutex() before the call.
owns() == the value of bl.owns() before the call.
bl.mutex() == 0
bl.owns() == false
Notes: With a recursive mutex it is possible that both this and bl own the same mutex before the assignment. In this case, this will own the mutex after the assignment (and bl will not), but the mutex's lock count will be decremented by one.
Private and left undefined. Lock types are not copyable.
Private and left undefined. Lock types are not copyable.
Effects: If mutex() == 0 or if owns() == true, throws a lock_error(). Otherwise the referenced mutex is locked in a manner defined appropriately for each lock (which may block execution of this thread).
Throws: lock_error. Exceptions may be thrown by the referenced mutex's locking function.
Postconditions: If an exception is thrown, there is no change of state to the lock. If an exception is not thrown, owns() == true.
Effects: If mutex() == 0 or if owns() == true, throws a lock_error(). Otherwise the referenced mutex is locked in a manner defined appropriately for each lock, but in a non-blocking manner that may not succeed.
Throws: lock_error. Exceptions may be thrown by the referenced mutex's try-locking function.
Postconditions: If an exception is thrown, there is no change of state to the lock. If an exception is not thrown, owns() reflects the success or failure of the attempted locking function.
Returns: true if the lock was obtained.
Effects: If mutex() == 0 or if owns() == false, throws a lock_error(). Otherwise the referenced mutex is unlocked in a manner defined appropriately for each lock.
Throws: lock_error.
Postconditions: owns() == false.
Effects: If mutex() == 0 returns false. Else returns true if this lock owns a lock on the mutex, else returns false.
Throws: Nothing.
Effects: Returns non-null if owns() == true, else returns null.
Throws: Nothing.
Effects: Returns a pointer to the referenced mutex, or 0 if there is no mutex to reference.
Throws: Nothing.
Effects: Swaps state with bl.
Throws: Nothing.
Postconditions: The states of *this and bl are swapped.
Effects: Returns a pointer to the referenced mutex, or 0 if there is no mutex to reference.
Throws: Nothing.
Postconditions:
mutex() == 0 owns() == false
exclusive_lock is meant to carry out the tasks for locking, unlocking and try-locking (recursive or not) for the Mutex. The Mutex must meet the mutex concept of exclusive ownership. Additionally, if the constructor accepting ownership from a sharable_lock is instantiated, the mutex must meet the convertible shared ownership requirements. If the constructor accepting ownership from an upgradable_lock is instantiated, the mutex must meet the upgradable ownership requirements.
Below the complete interface for exclusive_lock is shown. However the description of each member will rely on the description of basic_lock as much as possible to prevent repetition. Lack of description for exclusive_lock implies that the functionality is more completely described under basic_lock.
template <class Mutex> class exclusive_lock { public: typedef Mutex mutex_type; private: mutex_type* m_; // exposition only bool owns_; // exposition only public: exclusive_lock(); explicit exclusive_lock(mutex_type& m); exclusive_lock(mutex_type& m, detail::defer_lock_type); exclusive_lock(mutex_type& m, detail::try_lock_type); exclusive_lock(mutex_type& m, detail::accept_ownership_type); ~exclusive_lock(); exclusive_lock(exclusive_lock&& sl); explicit exclusive_lock(upgradable_lock<mutex_type>&& r); exclusive_lock(upgradable_lock<mutex_type>&& r, detail::try_lock_type); exclusive_lock(sharable_lock<mutex_type>&& r, detail::try_lock_type); exclusive_lock& operator= (exclusive_lock&& sl); private: exclusive_lock(exclusive_lock const&); explicit exclusive_lock(upgradable_lock<mutex_type> const&); exclusive_lock(upgradable_lock<mutex_type> const&, detail::try_lock_type); exclusive_lock(sharable_lock<mutex_type> const&, detail::try_lock_type); exclusive_lock& operator= (exclusive_lock const&); public: void lock(); bool try_lock(); void unlock(); bool owns() const; operator unspecified-bool-type() const; const mutex_type* mutex() const; void swap(exclusive_lock&& w); mutex_type* release(); }; template <class Mutex> void swap(exclusive_lock<Mutex>& x, exclusive_lock<Mutex>& y); template <class Mutex> void swap(exclusive_lock<Mutex>&& x, exclusive_lock<Mutex>& y); template <class Mutex> void swap(exclusive_lock<Mutex>& x, exclusive_lock<Mutex>&& y);
exclusive_lock(upgradable_lock<mutex_type>&& r);
Effects: If r.owns() returns true then calls unlock_upgradable_and_lock() on the referenced mutex. r.release() is called.
Postconditions:
mutex() == the value r.mutex() had before the construction.
owns() == the value r.owns() had before the construction.
r.mutex() == 0.
r.owns() == false.
Notes: If r is owned, this constructor will lock this exclusive_lock while unlocking r. If r is unlocked, then this exclusive_lock will be unlocked as well. Only rvalue upgradable_lock's will match this signature. lvalue exclusive_lock's can not transfer ownership unless "cast" to an rvalue. An lvalue upgradable_lock can be "cast" to an rvalue with the expression: move(lock); This constructor may block if other threads hold a sharable_lock on this mutex (sharable_lock's can share ownership with an upgradable_lock).
[Example:
mutex m; upgradable_lock<mutex> l1(mut); // l1 upgradable-locked exclusive_lock<mutex> l2(move(l1)); // l2 locked, l1 unlocked
-- end example]
exclusive_lock(upgradable_lock<mutex_type>&& r, detail::try_lock_type);
Effects: If r.owns() then calls try_unlock_upgradable_and_lock() on the referenced mutex.
Else r.owns() is false. mutex() obtains the value from r.release() and owns() is set to false
Notes: This construction will not block. It will try to obtain mutex ownership from r immediately, while changing the lock type from a "read lock" to a "write lock". If the "read lock" isn't held in the first place, the mutex merely changes type to an unlocked "write lock". If the "read lock" is held, then mutex transfer occurs only if it can do so in a non-blocking manner.
[Example:
mutex m; upgradable_lock<mutex> l1(mut); // l1 upgradable-locked exclusive_lock<mutex> l2(move(l1), try_to_lock); // l2 tries to obtain lock from l1
-- end example]
exclusive_lock(sharable_lock<mutex_type>&& r, detail::try_lock_type);
Effects: If r.owns() then calls try_unlock_sharable_and_lock() on the referenced mutex.
Else r.owns() is false. mutex() obtains the value from r.release() and owns() is set to false
Notes: This construction will not block. It will try to obtain mutex ownership from r immediately, while changing the lock type from a "read lock" to a "write lock". If the "read lock" isn't held in the first place, the mutex merely changes type to an unlocked "write lock". If the "read lock" is held, then mutex transfer occurs only if it can do so in a non-blocking manner.
[Example:
mutex m; sharable_lock<mutex> l1(mut); // l1 sharable-locked exclusive_lock<mutex> l2(move(l1), try_to_lock); // l2 tries to obtain lock from l1
-- end example]
void lock();
Effects: Calls lock() on the referenced mutex.
bool try_lock();
Effects: Calls try_lock() on the referenced mutex.
void unlock();
Effects: Calls unlock() on the referenced mutex.
sharable_lock is meant to carry out the tasks for sharable-locking (such as read-locking), unlocking and try-sharable-locking (recursive or not) for the Mutex. The Mutex must meet the mutex concept of shared ownership. Additionally, if the constructor or assignment operator accepting ownership from a exclusive_lock is instantiated, the mutex must meet the convertible shared ownership requirements. If the constructor or assignment accepting ownership from an upgradable_lock is instantiated, the mutex must meet the upgradable ownership requirements.
Below the complete interface for sharable_lock is shown. However the description of each member will rely on the description of basic_lock as much as possible to prevent repetition. Lack of description for sharable_lock implies that the functionality is more completely described under basic_lock.
template <class RW_Mutex> class sharable_lock { public: typedef RW_Mutex mutex_type; private: mutex_type* m_; // exposition only bool owns_; // exposition only public: sharable_lock(); explicit sharable_lock(mutex_type& m); sharable_lock(mutex_type& m, detail::defer_lock_type); sharable_lock(mutex_type& m, detail::try_lock_type); ~sharable_lock(); sharable_lock(sharable_lock&& r); explicit sharable_lock(upgradable_lock<mutex_type>&& r); explicit sharable_lock(exclusive_lock<mutex_type>&& w); sharable_lock& operator=(sharable_lock&& r); private: sharable_lock(sharable_lock const&); explicit sharable_lock(upgradable_lock<mutex_type> const&); explicit sharable_lock(exclusive_lock<mutex_type> const&); sharable_lock& operator=(sharable_lock const&); public: void lock(); bool try_lock(); void unlock(); bool owns() const; operator unspecified-bool-type() const; const mutex_type* mutex() const; void swap(sharable_lock&& r); mutex_type* release(); }; template <typename Mutex> void swap(sharable_lock<Mutex>& x, sharable_lock<Mutex>& y);
sharable_lock(upgradable_lock<mutex_type>&& r);
Effects: m_->unlock_upgradable_and_lock_sharable() if r.owns().
Postconditions:
mutex() == the value r.mutex() had before the construction.
owns() == the value r.owns() had before the construction.
r.mutex() == 0.
r.owns() == false.
Notes: If r is owned, this constructor will lock this sharable_lock while unlocking r.
[Example:
mutex m; upgradable_lock<mutex> l1(mut); // l1 upgradable-locked sharable_lock<mutex> l2(move(l1)); // l2 sharable-locked, l1 unlocked
-- end example]
sharable_lock(exclusive_lock<mutex_type>&& w);
Effects: If w.owns(), m_->unlock_and_lock_sharable().
Postconditions:
mutex() == the value w.mutex() had before the construction.
owns() == the value w.owns() had before the construction.
w.mutex() == 0.
w.owns() == false.
Notes: If w is owned, this constructor will transfer the exclusive ownership to a sharable-ownership of this sharable_lock. Only rvalue exclusive_lock's will match this signature. lvalue exclusive_lock's can not transfer ownership unless "cast" to an rvalue. An lvalue exclusive_lock can be "cast" to an rvalue with the expression: move(lock);
[Example:
mutex m; exclusive_lock<mutex> wl(mut); // wl locked sharable_lock<mutex> rl(move(wl)); // rl sharable-locked, wl unlocked
-- end example]
void lock();
Effects: Calls lock_sharable() on the referenced mutex.
bool try_lock();
Effects: Calls try_lock_sharable() on the referenced mutex.
void unlock();
Effects: Calls unlock_sharable() on the referenced mutex.
upgradable_lock is meant to carry out the tasks for read-locking, unlocking and try-read-locking (recursive or not) for the mutex. Additionally the upgradable_lock can transfer ownership to a exclusive_lock with move syntax. The mutex must meet the mutex concept of upgradable ownership.
Below the complete interface for upgradable_lock is shown. However the description of each member will rely on the description of basic_lock as much as possible to prevent repetition. Lack of description for upgradable_lock implies that the functionality is more completely described under basic_lock.
template <class RW_Mutex> class upgradable_lock { public: typedef RW_Mutex mutex_type; private: mutex_type* m_; // exposition only bool owns_; // exposition only public: upgradable_lock(); explicit upgradable_lock(mutex_type& m); upgradable_lock(mutex_type& m, detail::defer_lock_type); upgradable_lock(mutex_type& m, detail::try_lock_type); ~upgradable_lock(); upgradable_lock(upgradable_lock&& r); explicit upgradable_lock(exclusive_lock<mutex_type>&& w); upgradable_lock(sharable_lock<mutex_type>&& r, detail::try_lock_type); upgradable_lock& operator=(upgradable_lock&& r); private: upgradable_lock(upgradable_lock const&); explicit upgradable_lock(exclusive_lock<mutex_type> const&); upgradable_lock(sharable_lock<mutex_type> const& r, detail::try_lock_type); upgradable_lock& operator=(upgradable_lock const&); public: void lock(); bool try_lock(); void unlock(); bool owns() const; operator unspecified-bool-type() const; const mutex_type* mutex() const; void swap(upgradable_lock&& r); mutex_type* release(); }; template <typename Mutex> void swap(upgradable_lock<Mutex>& x, upgradable_lock<Mutex>& y);
Example code using upgradable_lock:
std::upgradable_mutex mut; void foo() { std::upgradable_lock<std::upgradable_mutex> read_lock(mut); // ... do read operation if (/* sometimes need to write */) { std::exclusive_lock<std::upgradable_mutex> write_lock(std::move(read_lock)); // ... do write operation, what was read hasn't changed } }
An upgradable_lock is most useful when a read operation sometimes decides it needs to write to the protected data, and needs to do so without allowing for the read data to change before acquiring the write lock (as could happen if the thread merely releases the read lock and blocks on a write lock).
upgradable_lock(exclusive_lock<mutex_type>&& w);
Effects: If w.owns(), m_.unlock_and_lock_upgradable().
Postconditions:
mutex() == the value w.mutex() had before the construction.
owns() == the value w.owns() had before the construction.
w.mutex() == 0.
w.owns() == false.
Notes: If w is owned, this constructor will transfer the exclusive-ownership to a upgradable-ownership of this upgradable_lock. Only rvalue exclusive_lock's will match this signature. lvalue exclusive_lock's can not transfer ownership unless "cast" to an rvalue. An lvalue exclusive_lock can be "cast" to an rvalue with the expression: move(lock);
[Example:
mutex m; exclusive_lock<mutex> wl(mut); // wl locked upgradable_lock<mutex> rl(move(wl)); // rl upgradable-locked, wl unlocked
-- end example]
upgradable_lock(sharable_lock<mutex_type>&& r, detail::try_lock_type);
Effects: If r.owns() then calls try_unlock_sharable_and_lock_upgradable() on the referenced mutex.
Else r.owns() is false. mutex() obtains the value from r.release() and owns() is set to false
Postconditions:
mutex() == the value r.mutex() had before the construction.
owns() == the value r.owns() had before the construction.
r.mutex() == 0.
r.owns() == false.
Notes: This construction will not block. It will try to obtain mutex ownership from r immediately, while changing the lock type from a "read lock" to an "upgradable lock". If the "read lock" isn't held in the first place, the mutex merely changes type to an unlocked "upgradable lock". If the "read lock" is held, then mutex transfer occurs only if it can do so in a non-blocking manner.
[Example:
mutex m; sharable_lock<mutex> l1(mut); // l1 sharable-locked upgradable_lock<mutex> l2(move(l1), try_to_lock); // l2 tries to obtain lock from l1
-- end example]
void lock();
Effects: Calls lock_upgradable() on the referenced mutex.
bool try_lock();
Effects: Calls try_lock_upgradable() on the referenced mutex.
void unlock();
Effects: Calls unlock_upgradable() on the referenced mutex.
Consider an algorithm which wants to easily switch back and forth among upgradable_lock, exclusive_lock, and sharable_lock. Such code might look like:
std::upgradable_mutex mut; void foo() { typedef std::sharable_lock<std::upgradable_mutex> ReadLock; typedef std::upgradable_lock<std::upgradable_mutex> UpgradableLock; typedef std::exclusive_lock<std::upgradable_mutex> ExclusiveLock; typedef std::temporary_lock<ExclusiveLock, UpgradableLock> WriteLock; UpgradableLock upgradable_lock(mut); // ... do read operation, sharing with other shared-locked threads if (/* sometimes need to write */) { WriteLock write_lock(upgradable_lock); // ... do write operation, what was read hasn't changed } // ... do more reading under upgradable_lock, // sharing with other shared-locked threads if (/* sometimes need to write again */) { WriteLock write_lock(upgradable_lock); // ... do write operation, what was read hasn't changed } ReadLock read_lock(std::move(upgradable_lock)); // ... do more reading under read_lock, // allowing for other shared-locked and upgradable-locked threads }
The above example uses std::temporary_lock which is a generic, non-movable lock that simply transfers ownership from an lvalue lock to itself, and only for the current scope, and then on destruction transfers ownership back to the original lock.
The example above also demonstrates transferring ownership from the upgradable_lock to a sharable_lock once it is known that no further exclusive_lock's are needed. This allows for a little more concurrency as only a single thread can have upgradable ownership of a mutex at once. Note the use of move in this transfer which signals a complete relinquishing of control by upgradable_lock.
temporary_lock is quite simple:
template <class Lock1, class Lock2> class temporary_lock { private: Lock1 lock1_; Lock2& lock2_; public: temporary_lock(Lock2& lock2) : lock1_(std::move(lock2)), lock2_(lock2) {} ~temporary_lock() {lock2_ = Lock2(std::move(lock1_));} };
temporary_lock's function is to temporarily transfer mutex ownership to itself, from another lock, and then transfer it back.
template <class Lock1, class Lock2> void lock(Lock1& l1, Lock2& l2); template <class Lock1, class Lock2, class Lock3> void lock(Lock1& l1, Lock2& l2, Lock3& l3); template <class Lock1, class Lock2, class Lock3, class Lock4> void lock(Lock1& l1, Lock2& l2, Lock3& l3, Lock4& l4);
These algorithms take a collection of locks and lock them without danger of deadlock. The algorithm used is unimportant but could be an ordering algorithm (say using the address of their contained mutex), or the try and back off algorithm. When implementing the latter, it is helpful to also have some generic try-locking algorithms (which might be generally useful as well):
template <class Lock1, class Lock2> unsigned try_lock(Lock1& l1, Lock2& l2); template <class Lock1, class Lock2, class Lock3> unsigned try_lock(Lock1& l1, Lock2& l2, Lock3& l3); template <class Lock1, class Lock2, class Lock3, class Lock4> unsigned try_lock(Lock1& l1, Lock2& l2, Lock3& l3, Lock4& l4);
The above return 0 if all locks were successfully locked, else they return with none of the locks locked. The return value in this case is the 1-based index of the lock which failed to lock. Here is how the three-lock, dead-lock safe algorithm could be coded on top of try_lock:
template <class Lock1, class Lock2, class Lock3> void lock(Lock1& l1, Lock2& l2, Lock3& l3) { unsigned i = 1; while (true) { switch (i) { case 1: l1.lock(); i = try_lock(l2, l3); if (i == 0) return; l1.unlock(); ++i; break; case 2: l2.lock(); i = try_lock(l1, l3); if (i == 0) return; l2.unlock(); if (i > 1) ++i; break; case 3: l3.lock(); i = try_lock(l1, l2); if (i == 0) return; l3.unlock(); break; } } }
Here is example code which creates a thread-safe assignment operator using exclusive_lock, sharable_lock, and lock:
class A { private: typedef std::sharable_mutex Mutex; typedef std::sharable_lock<Mutex> ReadLock; typedef std::exclusive_lock<Mutex> WriteLock; mutable Mutex mut_; // ... other data public: A& operator=(const A& a) { if (this != &a) { WriteLock write_lock( mut_, std::defer_lock); ReadLock read_lock (a.mut_, std::defer_lock); std::lock(write_lock, read_lock); // now safe to assign other data ... } return *this; } };
Without the use of std::lock in the above algorithm dead-lock becomes a possibility. All that would have to happen is for one thread to execute a1 = a2; while another thread executes a2 = a1;. Note that it doesn't matter that we're dealing with two different kinds of locks above (ReadLock and WriteLock). The generic locking algorithms don't need to know what kinds of locks they're operating on thanks to the generic interface of the locks.
The common question at this point is:
What happens if someone wants to lock more than four locks?
I have two answers to that question:
One can easily create a templated lock adaptor class Lock2, or even Lock3 or Lock4:
template <class L1, class L2> class Lock2 { private: L1& l1_; L2& l2_; bool owns_; public: Lock2(L1& l1, L2& l2) : l1_(l1), l2_(l2), owns_(false) {} void lock() { std::lock(l1_, l2_); owns_ = true; } bool try_lock() { owns_ = std::try_lock(l1_, l2_) == 0; return owns_; } void unlock() { owns_ = false; l1_.unlock(); l2_.unlock(); } // etc. };
And then one merely needs to apply this adaptor to bundle up locks together:
// Define L1 through L4 lock bundling types typedef std::exclusive_lock<std::mutex> L1; typedef Lock2<L1, L1> L2; typedef Lock2<L2, L1> L3; typedef Lock2<L2, L2> L4; // Here's 10 locks that need locking L1 l1(m1, std::defer_lock); L1 l2(m2, std::defer_lock); ... L1 l10(m10, std::defer_lock); // Bundle the 10 locks into 4 locks; L2 l12(l1, l2); L2 l34(l3, l4); L2 l56(l5, l6); L4 l1234(l12, l34); L3 l567(l56, l7); L2 l89(l8, l9); // Lock 'em std::lock(l1234, l567, l89, l10); // 10 locks safely locked!
Using such a technique, there is no limit to the number of locks one can safely lock at one time.
Synopsis:
namespace std { template <class Lock> class condition { private: condition(const condition&); condition& operator=(const condition&); public: condition(); ~condition(); void notify_one(); void notify_all(); void wait(Lock& lock); template <class Predicate> void wait(Lock& lock, Predicate pred); bool timed_wait(Lock& lock, const timespec& abs_time); template <class Predicate> bool timed_wait(Lock& lock, const timespec& abs_time, Predicate pred); }; } // namespace std
Differences between this and N1907:
This condition variable is intended to work with user-defined locks and mutexes. All that is required is that Lock supply lock() and unlock() member functions.
For typical Windows implementations, a data structure consisting a couple of mutexes and semaphores is commonly used (Alexander Terekhov's algorithm 8a). This implementation only uses lock() and unlock() on the external mutex/lock.
For a pthreads implementation a std::condition can contain an internal native mutex and native condition variable. For waiting one simply locks the internal mutex, unlocks the external mutex, and waits on the native condition variable with the native mutex. On returning from the wait, the external lock is locked, and the internal mutex is unlocked. For signaling in a pthreads implementation one locks the internal mutex, signals the native condition variable, and unlocks the internal mutex.
As the above algorithm is more expensive than need be when locking on a native mutex type, condition can be specialized on mutex and exclusive_lock<mutex>. These specializations would hold nothing but a pthread_cond_t and the member functions would inline straight into the appropriate pthread_cond_* functions.
This blend of flexibility and efficiency is not possible with the N1907 (boost) condition variable because one needs to have different data members to achieve both a general implementation and a native-efficient implementation (at least on pthreads).
Additionally, if need be, clients can specialize std::condition on their user-defined mutexes or locks, if the general implementation is too inefficient, or does not work for some reason.
With this design, clients can enjoy such flexibility as waiting with a read-locked mutex. But they only pay for such flexibility if they use it. condition<native-mutex> remains as efficient as the underlying OS allows. The design also prevents accidental mismatching of condition variable and mutex/lock types. For example given a condition<mutex>, it is a compile time error to use that object to wait on a my_special_mutex.
I'd like to start at the beginning. Alexander Terekhov showed how to create a condition variable out of semaphores and mutexes. This has been incredibly helpful. Many people attempted the same task and it is Alexander's 8a algorithm that is universally used. The importance of this foundation layer is not to be underestimated.
Much thanks to William Kempf for the original boost::threads design. This is an important fundamental design. I have personally learned a lot from William.
Most the proposed thread-launching layer is based on code written by Peter Dimov. The critical breakthrough Peter accomplished is the complete orthogonality of the higher-level future level from the low-level thread level. This higher level layer is efficient while being competely non-intrusive on the lower-level layer. futures can layer on anything, completley independently.
Ion Gaztañaga also independently implemented future (as proposed by Kevlin Henney) on top of boost::threads non-intrusively (a critical breakthrough imho). Not only that but Ion has also independently implemented the exclusive/shared/upgradable mutex and locks concepts presented herein, and for processes, not just threads. Ion has also been very influential in the basic lock concept, contributing important features such as accept_ownership and release.
Kevlin Henney's N1883 was an eye-opener for the LWG (and indeed the C++ community) regarding futures. Thank you. N1883 has been extremely influential in this proposal.
Thanks to Pete Becker for N1907 upon which much of this work is based.
Beman Dawes has proposed and explored the area of exception propagation from an executing work unit to joining thread. This will no doubt have an important impact on C++.
Herb Sutter is the mind behind the very enticing futures group idea.
Thanks to Sean Parent for helping me find the difference between a thread identity and a thread handle.
Much thanks to concurrent work which is not represented here, but is nevertheless critical to this proposal's success. Hans Boehm's and Herb Sutter's work on the C++ memory model has incalculable value. Without such a foundation, this proposal is for naught. And the atomics foundation is another critical level upon which this foundation depends. Lawrence Crowl's thread-local-data work is similarly much appreciated.
And finally a giant thanks to Clark Nelson for graciously putting up with my late delivery of this paper, requiring after hours work on his part.