ISO/IEC JTC1 SC22 WG21 N2659 = 08-0169 - 2008-06-11
Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org
This proposal is a revision of N2545 = 08-0055 - 2008-03-16. The revision consists of wording changes arising from core language subcommittee review.
In multi-threaded applications, there often arises the need to maintain data that is unique to a thread. We call this thread-local storage.
Several techniques have been used to accomplish this task.
Notable among them is the POSIX pthread_getspecific and pthread_setspecific facility. Unfortunately, this facility is clumsy and slow. In addition, the facility is not particularly helpful when converting a single-threaded application to a multi-threaded application.
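To illustrate why the POSIX facility is considered clumsy, the following sketch keeps a per-thread counter through the POSIX thread-specific data interface (pthread_key_create, pthread_getspecific, pthread_setspecific). The names counter_key, free_counter, make_counter_key, bump_counter, and bump_twice_in_new_thread are hypothetical and chosen only for this illustration; the use of std::thread to start the second thread is likewise an assumption of the example.

```cpp
#include <pthread.h>
#include <cassert>
#include <thread>

// Hypothetical example: a per-thread counter kept through a POSIX key.
static pthread_key_t counter_key;
static pthread_once_t counter_once = PTHREAD_ONCE_INIT;

// Called by the system at thread exit for each non-null value.
extern "C" void free_counter(void* p) { delete static_cast<int*>(p); }

// Creates the key exactly once for the whole process.
extern "C" void make_counter_key() {
    int rc = pthread_key_create(&counter_key, free_counter);
    assert(rc == 0);
    (void)rc;
}

// Every access needs a key lookup, a cast, and a lazy allocation.
int bump_counter() {
    pthread_once(&counter_once, make_counter_key);
    int* p = static_cast<int*>(pthread_getspecific(counter_key));
    if (p == nullptr) {
        p = new int(0);
        pthread_setspecific(counter_key, p);
    }
    return ++*p;
}

// Returns the value a fresh thread sees after two increments.
int bump_twice_in_new_thread() {
    int result = 0;
    std::thread t([&result] { bump_counter(); result = bump_counter(); });
    t.join();
    return result;
}
```

Each thread starts from its own zero-valued counter, but only at the cost of explicit key management that a plain variable declaration would not need.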
Several vendors have provided a language extension for a new storage class that indicates that a variable has thread storage duration. Use of thread variables is relatively easy and access to thread variables is relatively fast. In addition, the conversion of a single-threaded application using static-duration variables to a multi-threaded application using thread-duration variables requires less wholesale program restructuring.
Roughly equivalent extensions are available from
The C++ standard should adopt existing practice for thread-local storage. In addition, the C++ standard should extend existing practice to enable broader use.
The specification outline is as follows. We defer detailed changes to the text of the standard to the final section.
Add a new storage duration called thread storage duration. Objects with thread storage duration are unique to each thread.
Those objects that may have static storage duration may have thread storage duration instead. These objects include namespace-scope variables, function-local static variables, and class static member variables.
thread_local
Add thread_local, a new keyword and storage class specifier. The thread_local specifier indicates that the variable has thread storage duration. Variables declared with the thread_local specifier are bound as they would be without the thread_local specifier.
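A minimal sketch of the specifier in use, assuming a C++11 compiler and std::thread. The names calls_in_this_thread, next_ticket, and third_ticket_on_new_thread are hypothetical. The block-scope variable is written with static as well, following the wording proposed later in this paper.

```cpp
#include <cassert>
#include <thread>

// Namespace-scope variable with thread storage duration.
thread_local int calls_in_this_thread = 0;

// Function-local variable with thread storage duration.
int next_ticket() {
    static thread_local int ticket = 100;
    ++calls_in_this_thread;
    return ++ticket;
}

// Calls next_ticket() three times on a new thread and reports the
// last value that thread observed.
int third_ticket_on_new_thread() {
    int seen = 0;
    std::thread worker([&seen] {
        next_ticket();
        next_ticket();
        seen = next_ticket();
    });
    worker.join();
    return seen;
}
```

The calling thread and the worker each advance their own ticket and their own call count; neither sees the other's updates.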
The address-of operator (&), when applied to a thread variable, is evaluated at run time and returns the address of the current thread's variable. Therefore, the address of a thread variable is not a constant.
Thread-local storage defines lifetime and scope, not accessibility. That is, one may take the address of a thread-local variable and pass it to other threads.
The address of a thread variable is stable for the lifetime of the corresponding thread. The address of a thread variable may be freely used during the variable's lifetime by any thread in the program. When a thread terminates, all addresses of that thread's variables are invalid and may not be used.
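The address rules above can be sketched as follows, assuming std::thread. The names slot, address_differs_per_thread, and write_through_pointer are hypothetical. The comparison is done while both threads are alive, so no invalid address is ever inspected.

```cpp
#include <cassert>
#include <thread>

thread_local int slot = 0;

// Returns true if a new thread's `slot` lives at a different address
// than the calling thread's `slot`.
bool address_differs_per_thread() {
    int* mine = &slot;
    bool differs = false;
    std::thread t([mine, &differs] { differs = (&slot != mine); });
    t.join();
    return differs;
}

// Another thread writes to the caller's thread variable through a
// pointer. The caller outlives the helper, so the address stays valid.
int write_through_pointer() {
    slot = 1;
    int* mine = &slot;
    std::thread t([mine] { *mine = 42; });
    t.join();
    return slot;
}
```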
A thread variable may be statically initialized as would any other static-duration variable.
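As a small illustration of static initialization, assuming std::thread, each new thread starts from the constant initializer regardless of what other threads have stored. The names budget and spend_then_spawn are hypothetical.

```cpp
#include <cassert>
#include <thread>

thread_local int budget = 42;  // constant initializer

// Changes the caller's copy, then reports the value a new thread sees.
int spend_then_spawn() {
    budget = 7;
    int seen = -1;
    std::thread t([&seen] { seen = budget; });
    t.join();
    return seen;
}
```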
At present, no implementation of thread-local storage supports dynamic initialization (or, presumably, non-trivial destructors). There was mild consensus at the Mont Tremblant meeting to support dynamic initialization of function-local, thread-local variables. The initialization of such variables is already guarded and synchronous, so new technology is not required. On the other hand, dynamic initialization of namespace-scope variables is much more difficult to implement, and may require additional linker and operating system support. There was no consensus to support dynamic initialization of namespace-scope variables at that time. However, interviews with prospective users indicated a firm desire for full dynamic initialization of thread storage duration variables. The programmers simply did not want to partition their types this way.
Dynamic initialization and destruction can be implemented with two approaches.
.init sections: extend the .init sections to also include sections for thread-local storage. These thread-local inits will be invoked whenever the corresponding storage section is allocated. This approach requires operating-system support.
In either case, the initialization of a thread-local variable must place the destruction on a thread-local list for subsequent handling on exit from the thread (potentially with cancellation cleanup functions).
There are some other issues that deserve mention even though they are not properly part of the C++ standard because they affect real programs.
The allocation of thread-local storage for the full product of threads and dynamic libraries could result in very large storage requirements. The Sun Microsystems implementation only allocates thread-local storage for a dynamic library when the thread uses a variable from that library. That is, the Sun implementation allocates memory lazily for each thread and dynamic library pair. To avoid bloated programs, the language definition must permit this optimization.
The system may immediately deallocate the storage associated with a thread and dynamic library pair when either the thread terminates or the library is closed. The system is not required to deallocate immediately. However, the system is required to not leak storage. Thread-local storage for a thread must be reclaimed no later than a subsequent thread creation. Thread-local storage for a library within a thread must be reclaimed no later than a subsequent open of that library. (Opening another library does not require storage reclamation, though doing so would certainly reduce storage consumption.)
While storage deallocation can be deferred, variable destruction must not be deferred because destruction depends on access to thread state. In the presence of programmed closing of a dynamic library, its thread-local variables may need to be destroyed out of order with respect to thread-local variables outside of the library.
When dlsym() is used on a thread variable, the address returned will be the address of the currently executing thread's variable.
The text of the standard changes as specified in this section.
To table 3, add thread_local.
In paragraph 4, edit as follows. This change is the minimal necessary to accommodate thread-duration objects. A more robust specification of termination is needed. See 18.4 Start and termination [support.start.term].
Calling the function std::exit(int) declared in <cstdlib> (18.4) terminates the program without leaving the current block and hence without destroying any objects with automatic storage duration (12.4). If std::exit is called to end a program during the destruction of an object with static or thread storage duration, the program has undefined behavior.
Before paragraph 1, add a new paragraph
There are two broad classes of named non-local objects, those with static storage duration (3.7.1) and those with thread storage duration (3.7.2(new)). Non-local objects with static storage duration are initialized as a consequence of program initiation. Non-local objects with thread storage duration are initialized as a consequence of thread execution. Within each of these phases of initiation, initialization occurs as follows.
In paragraph 1, edit
Objects with static storage duration (3.7.1) or thread storage duration (3.7.2(new)) shall be zero-initialized (8.5) before any other initialization takes place. A reference with static or thread storage duration and an object of trivial or literal type with static or thread storage duration can be initialized with a constant expression (5.19); this is called constant initialization. ....
In paragraph 2, edit
An implementation is permitted to perform the initialization of an object of namespace scope as a static initialization even if such initialization is not required to be done statically, provided that
- the dynamic version of the initialization does not change the value of any other object of namespace scope prior to its initialization, and
- ....
- [Note: as a consequence, if the initialization of an object obj1 refers to an object obj2 of namespace scope and ....
In paragraph 3, edit
It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of namespace scope with static storage duration is done before the first statement of main. ....
After paragraph 3, add new paragraph 4.
It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of namespace scope and with thread storage duration is done before the first statement of the initial function of the thread. If the initialization is deferred to some point in time after the first statement of the initial function of the thread, it shall occur before the first use of any object with thread storage duration defined in the same translation unit as the object to be initialized.
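A sketch of what the proposed wording guarantees for a dynamically initialized namespace-scope thread variable, assuming a C++11 implementation with std::thread and std::atomic. The names ids_handed_out, take_id, thread_id, id_of_new_thread, and two_threads_get_distinct_ids are hypothetical. Whether initialization happens at thread start or is deferred is implementation-defined, so the example relies only on initialization having happened by the first use.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> ids_handed_out{0};

int take_id() { return ++ids_handed_out; }

// Dynamic initialization: performed per thread, no later than first use.
thread_local int thread_id = take_id();

// Reads thread_id on a new thread and returns the value it saw.
int id_of_new_thread() {
    int seen = 0;
    std::thread t([&seen] { seen = thread_id; });
    t.join();
    return seen;
}

// Two separate threads must each run the initializer for themselves.
bool two_threads_get_distinct_ids() {
    int a = id_of_new_thread();
    int b = id_of_new_thread();
    return a > 0 && b > 0 && a != b;
}
```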
In the existing paragraph 4, edit
If construction or destruction of a non-local static or thread duration object ends in throwing an uncaught exception, the result is a call to std::terminate (18.7.3.3).
In paragraph 1, edit
Destructors (12.4) for initialized objects of static storage duration (declared at block scope or at namespace scope) are called as a result of returning from main and as a result of calling std::exit (18.3). Destructors (12.4) for initialized objects with thread storage duration (declared at block scope or at namespace scope) within a given thread are called as a result of that thread calling std::exit or returning from the initial function of the thread. Objects with thread storage duration are destroyed before those of static storage duration. Otherwise, these objects are destroyed in the reverse order of the completion of their constructor or of the completion of their dynamic initialization. If an object is initialized statically, the object is destroyed in the same order as if the object was dynamically initialized. For an object of array or class type, all subobjects of that object are destroyed before any local object with static storage duration initialized during the construction of the subobjects is destroyed.
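The reverse-order rule for a single thread can be sketched as follows, assuming std::thread. The names destruction_log, Tracker, touch_trackers, and log_after_worker_exit are hypothetical. The log is written only by the one worker thread and read only after join(), so no further synchronization is used.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Shared log: written by the single worker thread as it exits,
// read by the caller only after join().
std::vector<int> destruction_log;

struct Tracker {
    int id;
    explicit Tracker(int i) : id(i) {}
    ~Tracker() { destruction_log.push_back(id); }
};

// Constructs two thread-duration objects, in order, on first call.
void touch_trackers() {
    static thread_local Tracker first(1);
    static thread_local Tracker second(2);
}

// Runs touch_trackers() on a worker and returns the destruction order.
std::vector<int> log_after_worker_exit() {
    destruction_log.clear();
    std::thread worker(touch_trackers);
    worker.join();
    return destruction_log;
}
```

The object whose construction completed last is destroyed first when the worker returns from its initial function.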
In paragraph 2, edit
If a function contains a local object of static or thread storage duration that has been destroyed and the function is called during the destruction of an object with static or thread storage duration, the program has undefined behavior if the flow of control passes through the definition of the previously destroyed local object.
In paragraph 4, implicitly adding thread duration, edit
Calling the function std::abort() declared in <cstdlib> terminates the program without executing any destructors and without calling the functions passed to std::atexit() or std::at_quick_exit().
To the list of storage durations in paragraph 1, between static and automatic, add
In paragraph 2, edit
Static, thread, and automatic storage durations are associated with objects introduced by declarations (3.1) and implicitly created by the implementation (12.2). The dynamic storage duration is associated with objects created with operator new (5.3.4).
In paragraph 1, edit
All objects which do not have dynamic storage duration, do not have thread storage duration, and are not local, have static storage duration. The storage for these objects shall last for the duration of the program (3.6.2, 3.6.3).
Add a new section after 3.7.1 Static storage duration [basic.stc.static] with the following contents.
All objects or references declared with the thread_local keyword have thread storage duration. The storage for these objects or references shall last for the duration of the thread in which they are created. There is a distinct object or reference per thread, and use of the declared name refers to the object or reference associated with the current thread. An object or reference with thread storage duration shall be initialized before its first use, and if constructed, shall be destroyed on thread exit.
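The lifetime rule in this new section can be illustrated with a type that counts its own constructions and destructions. This is a sketch assuming std::thread and std::atomic; the names constructed, destroyed, Session, use_session, and lifetimes_match_threads are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> constructed{0};
std::atomic<int> destroyed{0};

struct Session {
    int uses = 0;
    Session() { ++constructed; }
    ~Session() { ++destroyed; }
};

// The Session is constructed on each thread's first call.
int use_session() {
    static thread_local Session session;
    return ++session.uses;
}

// Runs n short-lived threads, one after another, and checks that each
// thread constructed exactly one Session and destroyed it on exit.
bool lifetimes_match_threads(int n) {
    int c0 = constructed;
    int d0 = destroyed;
    for (int i = 0; i < n; ++i) {
        std::thread t([] { use_session(); use_session(); });
        t.join();
    }
    return constructed - c0 == n && destroyed - d0 == n;
}
```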
In paragraph 4, edit
[Note: in particular, a global allocation function is not called to allocate storage for objects with static storage duration (3.7.1), for objects or references with thread storage duration (3.7.2(new)), for objects of type std::type_info
(5.2.8), for the copy of an object thrown by a throw expression (15.1). —end note]
This restriction says that allocation of storage for thread-duration variables does not go through the global operator new functions. This restriction is necessary to enable link-time preallocation.
In paragraph 8, edit
If a program ends the lifetime of an object of type T with static (3.7.1), thread (3.7.2(new)), or automatic (3.7.3(new)) storage duration and if T has a non-trivial destructor, ....
In footnote 35, edit
that is, an object for which a destructor will be called implicitly —
upon exit from the block for an object with automatic storage duration, upon exit from the thread for an object with thread storage duration, or upon exit from the program for an object with static storage duration.
In paragraph 9, edit
Creating a new object at the storage location that a const object with static, thread, or automatic storage duration occupies or, at the storage location that such a const object used to occupy before its lifetime ended results in undefined behavior.
Within paragraph 4, edit
The zero-initialization (8.5) of all local objects with static storage duration (3.7.1) or thread storage duration (3.7.2(new)) is performed before any other initialization takes place. A local object of trivial or literal type (3.9) with static or thread storage duration initialized with constant-expressions is initialized before its block is first entered. An implementation is permitted to perform early initialization of other local objects with static or thread storage duration under the same conditions that an implementation is permitted to statically initialize an object with static or thread storage duration in namespace scope (3.6.2).
In paragraph 5, edit
The destructor for a local object with static or thread storage duration will be executed if and only if the variable was constructed. [Note: 3.6.3 describes the order in which local objects with static or thread storage duration are destroyed. —end note]
In paragraph 1, add "thread_local" to the list of storage class specifiers.
In paragraph 1, edit
At most one storage-class-specifier shall appear in a given decl-specifier-seq, except that thread_local may appear with static or extern. If thread_local appears in any declaration of an object or reference, it shall be present in all declarations of that object.
After paragraph 3, add a new paragraph
The thread_local specifier can be applied only to the names of objects or references of block scope that also specify static or to the names of objects or references of namespace scope. It specifies that the named object or reference has thread storage duration (3.7.2(new)).
In paragraph 4, edit
A static specifier used in the declaration of an object declares the object to have static storage duration (3.7.1), unless accompanied by the thread_local specifier, which declares the object or reference to have thread storage duration (3.7.2(new)).
In paragraph 2 edit as follows.
A static, thread_local, extern, register, mutable, friend, inline, virtual, or typedef specifier applies directly to each declarator-id in an init-declarator-list; the type specified for each declarator-id depends on both the decl-specifier-seq and its declarator.
In paragraph 2, edit
Automatic, register, thread, static, and namespace-scoped external variables can be initialized by arbitrary expressions involving literals and previously declared variables and functions.
Paragraph 7 remains unchanged, which implies that thread storage duration objects may be uninitialized at program startup.
In paragraph 14, edit as follows. The expanded scope of 3.6.2 leaves this text mostly untouched.
When an aggregate with static or thread storage duration is initialized with a brace-enclosed initializer-list, if all the member initializer expressions are constant expressions, and the aggregate is a trivial type, the initialization shall be done during the static phase of initialization (3.6.2); otherwise, it is unspecified whether the initialization of members with constant expressions takes place during the static phase or during the dynamic phase of initialization.
In paragraph 6, edit
A member shall not be declared with the extern or register storage-class-specifier. Within a class definition, a member shall not be declared with the thread_local storage-class-specifier unless also declared static.
In paragraph 1, edit
A static data member is not part of the subobjects of a class. For such a member declared thread_local, there is one copy of the member per thread. For such a member not declared thread_local, there is one copy of the data member shared by all the objects of the class.
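The difference between the two kinds of static data member can be sketched as follows, assuming std::thread. The names Stats, total_objects, objects_here, make_one_here, and make_two_on_new_thread are hypothetical. Threads are run one at a time, so the shared counter needs no extra synchronization in this example.

```cpp
#include <cassert>
#include <thread>

struct Stats {
    static int total_objects;              // one copy for the program
    static thread_local int objects_here;  // one copy per thread
    Stats() { ++total_objects; ++objects_here; }
};

int Stats::total_objects = 0;
// thread_local is repeated on the definition, as on the declaration.
thread_local int Stats::objects_here = 0;

// Creates one object on the calling thread; returns its per-thread count.
int make_one_here() {
    Stats s;
    return Stats::objects_here;
}

// Creates two objects on a new thread; returns that thread's count.
int make_two_on_new_thread() {
    int seen = 0;
    std::thread t([&seen] { Stats a, b; seen = Stats::objects_here; });
    t.join();
    return seen;
}
```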
In paragraph 8, edit
Default constructors are called implicitly to create class objects of static, thread, or automatic storage duration (3.7.1, 3.7.2(new), 3.7.3(new)) defined without an initializer (8.5), ...
In paragraph 5, edit
In addition, the destruction of temporaries bound to references shall take into account the ordering of destruction of objects with static, thread, or automatic storage duration (3.7.1, 3.7.2(new), 3.7.3(new));
In paragraph 9, edit
Destructors are invoked implicitly (1) for a constructed object with static storage duration (3.7.1) at program termination (3.6.3), (2) for a constructed object with thread storage duration (3.7.2(new)) at thread exit, (3) for a constructed object with automatic storage duration (3.7.3(new)) when the block in which the object is created exits (6.7), (4) for a constructed temporary object when the lifetime of the temporary object ends (12.2), (5) for a constructed object allocated by a new-expression (5.3.4), through use of a delete-expression (5.3.5), (6) in several situations due to the handling of exceptions (15.3).
In paragraph 4, edit
[ Note: the order in which objects with static or thread storage duration are initialized is described in 3.6.2 and 6.7. —end note ]
In paragraph 13, edit
Exceptions thrown in destructors of objects with static storage duration or in constructors of static-duration namespace-scope objects are not caught by a function-try-block on main(). Exceptions thrown in destructors of objects with thread storage duration or in constructors of thread-duration namespace-scope objects are not caught by a function-try-block on the initial function of the thread.
std::terminate() function [except.terminate]
In paragraph 1, in the list of causes for termination, edit
when construction or destruction of a non-local object with static or thread storage duration exits using an exception (3.6.2), or
Another possibility is to propagate the exception to the joiner, but then there would be no distinction between the thread function exiting with an exception and one of its thread-duration objects exiting with an exception.
In paragraph 3, edit
The function abort() has additional behavior in this International Standard:
- The program is terminated without executing destructors for objects of automatic, thread, or static storage duration and without calling the functions passed to atexit() (3.6.3).
Paragraph 7 discusses the interaction of destruction and calling exit. The following edit is the minimum possible change to the standard to accommodate thread storage duration objects.
The function exit() has additional behavior in this International Standard:
- First, objects with thread storage duration and associated with the current thread are destroyed. Next, objects with static storage duration are destroyed and functions registered by calling atexit are called. Otherwise, non-local objects with static or thread storage duration are destroyed in the reverse order of the completion of their constructor. (Automatic objects are not destroyed as a result of calling exit().) Functions registered with atexit are called in the reverse order of their registration, except that a function is called after any previously registered functions that had already been called at the time it was registered. A function registered with atexit before a non-local object obj1 of static storage duration is initialized will not be called until obj1's destruction has completed. A function registered with atexit after a non-local object obj2 of static storage duration is initialized will be called before obj2's destruction starts. A local static object obj3 is destroyed at the same time it would be if a function calling the obj3 destructor were registered with atexit at the completion of the obj3 constructor.