ISO/IEC JTC1 SC22 WG21 N2280 = 07-0140 - 2007-05-02
Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org
This proposal is a revision of N2147 = 07-0007 - 2007-01-05.
The revision is to replace the keyword __thread with the keyword thread_local.
In multi-threaded applications, there often arises the need to maintain data that is unique to a thread. We call this thread-local storage.
Several techniques have been used to accomplish this task.
Notable among them is the POSIX pthread_getspecific and pthread_setspecific facility.
Unfortunately, this facility is clumsy and slow.
In addition, the facility is not particularly helpful
when converting a single-threaded application
to a multi-threaded application.
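For comparison, the following is a minimal sketch of a per-thread counter built on the POSIX thread-specific data functions (pthread_key_create, pthread_getspecific, pthread_setspecific). The names bump_counter, bump_on_new_thread, and the helper functions are hypothetical and chosen only for illustration; std::thread is used merely as a convenient way to start a second thread.

```cpp
#include <cassert>
#include <pthread.h>
#include <thread>

// Hypothetical per-thread counter built on the POSIX thread-specific data API.
static pthread_key_t counter_key;
static pthread_once_t counter_once = PTHREAD_ONCE_INIT;

// Destructor run by the system at thread exit for each non-null value.
extern "C" void free_counter(void* p) { delete static_cast<int*>(p); }

// Creates the key exactly once for the whole program.
extern "C" void make_counter_key() {
    int rc = pthread_key_create(&counter_key, free_counter);
    assert(rc == 0);
    (void)rc;
}

// Every access pays for a once-check, a key lookup, and possibly an allocation.
int bump_counter() {
    pthread_once(&counter_once, make_counter_key);
    int* p = static_cast<int*>(pthread_getspecific(counter_key));
    if (p == nullptr) {
        p = new int(0);
        pthread_setspecific(counter_key, p);
    }
    return ++*p;
}

// Calls bump_counter() n times on a fresh thread and returns the last value.
int bump_on_new_thread(int n) {
    int last = 0;
    std::thread t([&] { for (int i = 0; i < n; ++i) last = bump_counter(); });
    t.join();
    return last;
}
```

The key creation, the untyped void* value, and the explicit allocation all disappear when the counter is a thread variable.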
Several vendors have provided a language extension for a new storage class that indicates that a variable has thread storage duration. Use of thread variables is relatively easy and access to thread variables is relatively fast. In addition, the conversion of a single-threaded application using static-duration variables to a multi-threaded application using thread-duration variables requires less wholesale program restructuring.
Roughly equivalent extensions are available from
The C++ standard should adopt existing practice for thread-local storage. In addition, the C++ standard should extend existing practice to enable broader use.
The specification outline is as follows. We defer detailed changes to the text of the standard to the final section.
Add a new storage duration called thread storage duration. Objects with thread storage duration are unique to each thread.
Those objects which may have static storage duration may have thread storage duration instead. These objects include namespace-scope variables, function-local static variables, and class static member variables.
thread_local
Add thread_local, a new keyword and storage class specifier. The thread_local specifier indicates that the variable has thread storage duration. Variables declared with the thread_local specifier are bound as they would be without the thread_local specifier.
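As an illustration, a namespace-scope variable declared with the proposed keyword might be used as sketched below. The names request_count, record_request, and count_on_new_thread are hypothetical, and std::thread stands in for whatever thread-creation facility the program uses.

```cpp
#include <cassert>
#include <thread>

// Namespace-scope variable with thread storage duration: one object per thread.
thread_local int request_count = 0;

// Increments the calling thread's copy of request_count and returns it.
int record_request() { return ++request_count; }

// Runs record_request() n times on a new thread and returns that thread's
// final count. The new thread starts from its own zero-valued copy.
int count_on_new_thread(int n) {
    assert(n >= 0);
    int result = 0;
    std::thread worker([&] {
        for (int i = 0; i < n; ++i) result = record_request();
    });
    worker.join();
    return result;
}
```

Calls made on one thread do not affect the count observed on another, because the name request_count binds to the current thread's object.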
The address-of operator (&), when applied to a thread variable, is evaluated at run time and returns the address of the current thread's variable. Therefore, the address of a thread variable is not a constant.
Thread-local storage defines lifetime and scope, not accessibility. That is, one may take the address of a thread-local variable and pass it to other threads.
The address of a thread variable is stable for the lifetime of the corresponding thread. The address of a thread variable may be freely used during the variable's lifetime by any thread in the program. When a thread terminates, all addresses of that thread's variables are invalid and may not be used.
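The address rules above can be sketched as follows. The names slot, slot_address, and write_from_other_thread are hypothetical. The example relies on the owning thread staying alive (blocked in join) while the other thread uses the pointer.

```cpp
#include <cassert>
#include <thread>

thread_local int slot = 0;

// Returns the address of the calling thread's copy of slot.
// The address is computed at run time, so it differs between threads.
int* slot_address() { return &slot; }

// Starts a thread that writes through a pointer to the caller's slot and
// reports whether its own slot lives at a different address.
// The caller outlives the worker, so the pointer stays valid throughout.
bool write_from_other_thread(int value) {
    int* mine = &slot;               // address of the calling thread's variable
    assert(mine != nullptr);
    bool distinct = false;
    std::thread worker([&] {
        distinct = (&slot != mine);  // the worker's own copy is another object
        *mine = value;               // permitted: the owner thread is still alive
    });
    worker.join();
    return distinct;
}
```

Once the owning thread has terminated, a pointer such as mine must no longer be dereferenced.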
A thread variable may be statically initialized as would any other static-duration variable.
At present, no implementation of thread-local storage supports dynamic initialization (and presumably non-trivial destructors). There was mild consensus at the Mont Tremblant meeting to support dynamic initialization of function-local, thread-local variables. The initialization of such variables is already guarded and synchronous, so new technology is not required. On the other hand, the implementation for dynamic initialization of namespace-scope variables is much more difficult, and may require additional linker and operating system support. There was no consensus to support dynamic initialization of namespace-scope variables at this time. However, interviews with prospective users indicated a firm desire for full dynamic initialization of thread storage duration variables. The programmers simply did not want to partition their types this way.
Dynamic initialization and destruction can be implemented with two approaches.
.init sections
[...] .init sections to also include sections for thread-local storage. These thread-local inits will be invoked whenever the corresponding storage section is allocated. This approach requires operating-system support.
In either case, the initialization of a thread-local variable must place the destruction on a thread-local list for subsequent handling on exit from the thread (potentially with cancellation cleanup functions).
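From the programmer's side, dynamic initialization and destruction of a function-local thread variable might look like the sketch below. The type Scratch and the functions use_scratch and run_workers are hypothetical. The sketch assumes an implementation that supports dynamic initialization of such variables and that completes thread-exit destruction before join returns.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> constructed{0};
std::atomic<int> destroyed{0};

// Hypothetical type with a dynamic initializer and a non-trivial destructor.
struct Scratch {
    int uses = 0;
    Scratch() { ++constructed; }
    ~Scratch() { ++destroyed; }
};

// Function-local thread variable: constructed the first time each thread
// reaches the declaration, destroyed when that thread exits.
int use_scratch() {
    static thread_local Scratch scratch;
    return ++scratch.uses;
}

// Runs use_scratch() twice on each of n threads, one thread after another.
void run_workers(int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        std::thread worker([] { use_scratch(); use_scratch(); });
        worker.join();
    }
}
```

Each thread constructs its own Scratch once, no matter how often it calls use_scratch, and the matching destructor runs on exit from that thread.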
There are some other issues that deserve mention even though they are not properly part of the C++ standard because they affect real programs.
The allocation of thread-local storage for the full product of threads and dynamic libraries could result in very large storage requirements. The Sun Microsystems implementation only allocates thread-local storage for a dynamic library when the thread uses a variable from that library. That is, the Sun implementation allocates memory lazily for each thread and dynamic library pair. To avoid bloated programs, the language definition must permit this optimization.
The system may immediately deallocate the storage associated with a thread and dynamic library pair when either the thread terminates or the library is closed. The system is not required to deallocate immediately. However, the system is required to not leak storage. Thread-local storage for a thread must be reclaimed no later than a subsequent thread creation. Thread-local storage for a library within a thread must be reclaimed no later than a subsequent open of that library. (Opening another library does not require storage reclamation, though doing so would certainly reduce storage consumption.)
While storage deallocation can be deferred, variable destruction must not be deferred because destruction depends on access to thread state. In the presence of programmed closing of a dynamic library, its thread-local variables may need to be destructed out of order with respect to thread-local variables outside of the library.
When dlsym() is used on a thread variable, the address returned will be the address of the currently executing thread's variable.
The text of the standard changes as specified in this section.
To table 3, add thread_local.
In paragraph 4, edit as follows. This change is the minimum necessary to accommodate thread-duration objects. A more robust specification of termination is needed. See 18.4 Start and termination [support.start.term].
Calling the function std::exit(int) declared in <cstdlib> (18.4) terminates the program without leaving the current block or current thread and hence without destroying any objects with automatic storage duration (12.4) or thread storage duration (3.7.2(new)). If std::exit is called to end a program during the destruction of an object with static or thread storage duration, the program has undefined behavior.
Before paragraph 1, add a new paragraph
There are two broad classes of non-local objects, those with static storage duration (3.7.1) and those with thread storage duration (3.7.2(new)). Objects with static storage duration are initialized as a consequence of program initiation. Objects with thread storage duration are initialized as a consequence of thread initiation. Within each initiation, initialization occurs as follows.
In paragraph 1, edit
Objects with static storage duration (3.7.1) or thread storage duration (3.7.2(new)) shall be zero-initialized (8.5) before any other initialization takes place. A reference with static or thread storage duration and an object of POD type with static or thread storage duration can be initialized with a constant expression (5.19);
In paragraph 2, edit
An implementation is permitted to perform the initialization of an object of namespace scope with static storage duration as a static initialization even if such initialization is not required to be done statically, provided that
- the dynamic version of the initialization does not change the value of any other object of namespace scope with static storage duration prior to its initialization, and
- ....
- [Note: as a consequence, if the initialization of an object obj1 refers to an object obj2 of namespace scope with static storage duration ....
In paragraph 3, edit
It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of namespace scope and with static storage duration is done before the first statement of main.
....
After paragraph 3, add new paragraph 4.
It is implementation-defined whether or not the dynamic initialization (8.5, 9.4, 12.1, 12.6.1) of an object of namespace scope and with thread storage duration is done before the first statement of the initial function of the thread. If the initialization is deferred to some point in time after the first statement of the initial function of the thread, it shall occur before the first use of any object with thread storage duration defined in the same translation unit as the object to be initialized.
In existing paragraph 4, edit
If construction or destruction of a non-local static object of namespace scope ends in throwing an uncaught exception, the result is a call to std::terminate (18.7.3.3).
In paragraph 1, edit
Destructors (12.4) for initialized objects of static storage duration (declared at block scope or at namespace scope) are called as a result of returning from main and as a result of calling exit (18.3). Destructors (12.4) for initialized objects with thread storage duration (declared at block scope or at namespace scope) are called as a result of returning from the initial function of a thread. When the initial function of a thread is the main function, the objects are destructed before those of static storage duration. These objects are destroyed in the reverse order of the completion of their constructor or of the completion of their dynamic initialization. If an object is initialized statically, the object is destroyed in the same order as if the object was dynamically initialized. For an object of array or class type, all subobjects of that object are destroyed before any local object with static storage duration initialized during the construction of the subobjects is destroyed.
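The reverse-order rule for thread storage duration can be observed with a sketch such as the following. The type Tracer and the functions touch_first, touch_second, and trace_thread_exit are hypothetical. The sketch assumes an implementation that supports non-trivial destructors for thread variables and finishes thread-exit destruction before join returns.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Log written only by the worker thread and read after join().
std::string event_log;

// Hypothetical type that records its construction (lowercase tag)
// and its destruction (uppercase tag).
struct Tracer {
    char tag;
    explicit Tracer(char t) : tag(t) { event_log += tag; }
    ~Tracer() { event_log += static_cast<char>(tag - 'a' + 'A'); }
};

void touch_first()  { static thread_local Tracer a('a'); (void)a; }
void touch_second() { static thread_local Tracer b('b'); (void)b; }

// Constructs b, then a, on a new thread and returns the recorded order.
std::string trace_thread_exit() {
    event_log.clear();
    std::thread worker([] { touch_second(); touch_first(); });
    worker.join();
    assert(!event_log.empty());
    return event_log;
}
```

Because b finishes construction before a, a is destroyed first on return from the thread's initial function, and the log reads "baAB".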
In paragraph 4, edit
Calling the function std::abort() declared in <cstdlib> terminates the program without executing destructors for objects with automatic, thread, or static storage duration and without calling the functions passed to std::atexit().
To the list of storage durations in paragraph 1, between static and automatic, add
In paragraph 2, edit
Static, thread, and automatic durations are associated with objects introduced by declarations (3.1) and implicitly created by the implementation (12.2).
In paragraph 3, edit
The storage class specifiers static, thread_local, and auto are related to storage duration as described below.
In paragraph 1, edit
All objects which do not have dynamic storage duration, do not have thread storage duration, and are not local, have static storage duration.
Add a new section after 3.7.1 Static storage duration [basic.stc.static] with the following contents.
All objects declared with the thread_local keyword have thread storage duration. The storage for these objects shall last for the duration of the thread in which they are created. There is a distinct object per thread, and use of the declared name refers to the object associated with the current thread. An object with thread storage duration shall be initialized before its first use, and if initialized, shall be destroyed on thread exit.
In paragraph 4, edit
[ Note: in particular, a global allocation function is not called to allocate storage for objects with static storage duration (3.7.1), for objects with thread storage duration (3.7.2(new)), for objects of type std::type_info (5.2.8), for the copy of an object thrown by a throw expression (15.1). --end note ]
In paragraph 8, edit
If a program ends the lifetime of an object of type T with static (3.7.1), thread (3.7.2(new)), or automatic (3.7.3(new)) storage duration and if T has a non-trivial destructor,
In footnote 40, edit
that is, an object for which a destructor will be called implicitly -- either upon exit from the block for an object with automatic storage duration, upon exit from the thread for an object with thread storage duration, or upon exit from the program for an object with static storage duration.
In paragraph 9, edit
Creating a new object at the storage location that a const object with static, thread, or automatic storage duration occupies or, at the storage location that such a const object used to occupy before its lifetime ended results in undefined behavior.
Paragraph 2 remains unchanged, interpreting "static" as modifying initialization rather than as a reference to duration.
Other expressions are considered constant-expressions only for the purpose of non-local static object initialization (3.6.2). Such constant expressions shall evaluate to one of the following:
Paragraphs 4 (address constant expressions) and 5 (reference constant expressions) remain unchanged. The omission of thread storage duration becomes significant, though, in that objects with thread storage duration do not have constant addresses.
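The consequence can be sketched as follows. The constexpr specifier is not part of this proposal; it is used here only as a convenient way to demand a constant expression. The names static_var, thread_var, and current_thread_addr are hypothetical.

```cpp
#include <cassert>

int static_var = 0;               // static storage duration
thread_local int thread_var = 0;  // thread storage duration

// The address of a static-duration object is an address constant expression.
constexpr int* static_addr = &static_var;

// The corresponding line for thread_var is ill-formed, because each thread
// has its own object and the address is only known at run time:
// constexpr int* thread_addr = &thread_var;  // error: not a constant

// A run-time lookup is always possible instead.
int* current_thread_addr() {
    int* p = &thread_var;
    assert(p != nullptr);
    return p;
}
```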
In paragraph 4, edit
The zero-initialization (8.5) of all local objects with static storage duration (3.7.1) or thread storage duration (3.7.2(new)) is performed before any other initialization takes place. A local object of POD type (3.9) with static or thread storage duration initialized with constant-expressions is initialized before its block is first entered. An implementation is permitted to perform early initialization of other local objects with static or thread storage duration under the same conditions that an implementation is permitted to statically initialize an object with static or thread storage duration in namespace scope (3.6.2).
Paragraph 5 is unchanged, which by implication states that thread storage duration objects must be destructed.
In paragraph 1, add "thread_local" to the list of storage class specifiers.
In paragraph 1, edit
At most one storage-class-specifier shall appear in a given decl-specifier-seq, except that thread_local may appear with static and extern. If thread_local does appear, it shall be present in all declarations referring to the same object.
After paragraph 3, add a new paragraph
The thread_local specifier can be applied only to the names of objects of block scope that also specify static or to the names of objects of namespace scope. It specifies that the named object has thread storage duration (3.7.2(new)).
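A sketch of the permitted combinations, following the wording above: thread_local together with extern on a namespace-scope declaration, and thread_local together with static at block scope. The names error_code, next_ticket, and ticket_on_new_thread are hypothetical.

```cpp
#include <cassert>
#include <thread>

// Declaration, as it might appear in a header. thread_local is repeated
// on every declaration of the same object.
extern thread_local int error_code;

// Definition, in exactly one translation unit.
thread_local int error_code = 0;

// Block-scope thread variable, combined with static as the wording requires.
int next_ticket() {
    static thread_local int ticket = 100;
    return ++ticket;
}

// Sets error_code and draws one ticket on a new thread; returns the ticket.
int ticket_on_new_thread() {
    int drawn = 0;
    std::thread worker([&] { error_code = 7; drawn = next_ticket(); });
    worker.join();
    assert(drawn != 0);
    return drawn;
}
```

The new thread starts from its own ticket value of 100 and its own error_code, so neither call sequence disturbs the other thread's objects.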
In paragraph 4, edit
A static specifier used in the declaration of an object declares the object to have static storage duration (3.7.1), unless accompanied by the thread_local specifier, which declares the object to have thread storage duration (3.7.2(new)).
Paragraph 5 on extern is missing the parallel text.
In paragraph 2, edit
Automatic, register, thread, static, and namespace-scoped external variables can be initialized by arbitrary expressions involving literals and previously declared variables and functions.
Paragraph 7 remains unchanged, which implies that thread storage duration objects may be uninitialized at program startup.
In paragraph 14, edit as follows. The expanded scope of 3.6.2 leaves this text mostly untouched.
When an aggregate with static or thread storage duration is initialized with a brace-enclosed initializer-list, if all the member initializer expressions are constant expressions, and the aggregate is a POD type, the initialization shall be done during a static phase of initialization (3.6.2); otherwise, it is unspecified whether the initialization of members with constant expressions takes place during the static phase or during the dynamic phase of initialization.
In paragraph 6, edit
A member shall not be declared to have automatic storage duration (auto, register), with the thread_local storage-class-specifier unless also declared static, or with the extern storage-class-specifier.
In paragraph 1, edit
A static data member is not part of the subobjects of a class. For such a member declared thread_local, there is only one copy of the member per thread. For such a member not declared thread_local, there is only one copy of the data member shared by all the objects of the class.
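The distinction between the two kinds of static data member might be sketched as follows. The class Session and the function hits_on_new_thread are hypothetical. The shared member is made atomic only so that the example is free of data races.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical class with one per-thread and one program-wide static member.
struct Session {
    static thread_local int per_thread_hits;  // one copy per thread
    static std::atomic<int> total_hits;       // one copy shared by all threads
    static void hit() { ++per_thread_hits; ++total_hits; }
};

thread_local int Session::per_thread_hits = 0;
std::atomic<int> Session::total_hits{0};

// Calls Session::hit() n times on a new thread; returns that thread's count.
int hits_on_new_thread(int n) {
    assert(n >= 0);
    int seen = 0;
    std::thread worker([&] {
        for (int i = 0; i < n; ++i) Session::hit();
        seen = Session::per_thread_hits;
    });
    worker.join();
    return seen;
}
```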
In paragraph 8, edit
Default constructors are called implicitly to create class objects of static, thread, or automatic storage duration (3.7.1, 3.7.2(new), 3.7.3(new)) defined without an initializer (8.5), ...
In paragraph 5, edit
In addition, the destruction of temporaries bound to references shall take into account the ordering of destruction of objects with static, thread, or automatic storage duration (3.7.1, 3.7.2(new), 3.7.3(new));
In paragraph 10, edit
Destructors are invoked implicitly (1) for a constructed object with static storage duration (3.7.1) at program termination (3.6.3), (new) for a constructed object with thread storage duration (3.7.2(new)) at thread exit, (2) for a constructed object with automatic storage duration (3.7.3(new)) when the block in which the object is created exits (6.7), (3) for a constructed temporary object when the lifetime of the temporary object ends (12.2), (4) for a constructed object allocated by a new-expression (5.3.4), through use of a delete-expression (5.3.5), (5) in several situations due to the handling of exceptions (15.3).
In paragraph 4, edit
[ Note: the order in which objects with static or thread storage duration are initialized is described in 3.6.2 and 6.7. -- end note ]
In paragraph 4, edit
Exceptions thrown in destructors of objects with static storage duration or in constructors of static-duration namespace-scope objects are not caught by a function-try-block on main(). Likewise, exceptions thrown in destructors of objects with thread storage duration or in constructors of thread-duration namespace-scope objects are not caught by a function-try-block on the initial function of the thread.
std::terminate() function [except.terminate]
In paragraph 1, in the list of causes for termination, edit
when construction or destruction of a non-local object with static or thread storage duration exits using an exception (3.6.2), or
Another possibility is to propagate the exception to the joiner, but then there would be no distinction between the thread function exiting with an exception and one of its thread-duration objects exiting with an exception.
In paragraph 3, edit
The program is terminated without executing destructors for objects of automatic, thread, or static storage duration and without calling the functions passed to atexit() (3.6.3).
Paragraph 8 discusses the interaction of destruction and calling exit. The following edit is the minimum possible change to the standard to accommodate thread storage duration objects.
The function exit() has additional behavior in this International Standard:
- First, objects with static storage duration are destroyed and functions registered by calling atexit are called. Non-local objects with static storage duration are destroyed in the reverse order of the completion of their constructor. (Objects with either automatic or thread storage duration are not destroyed as a result of calling exit().) Functions registered with atexit are called in the reverse order of their registration, except that a function is called after any previously registered functions that had already been called at the time it was registered. A function registered with atexit before a non-local object obj1 of static storage duration is initialized will not be called until obj1's destruction has completed. A function registered with atexit after a non-local object obj2 of static storage duration is initialized will be called before obj2's destruction starts. A local static object obj3 is destroyed at the same time it would be if a function calling the obj3 destructor were registered with atexit at the completion of the obj3 constructor.