P1874R1
Dynamic Initialization Order of Non-Local Variables in Modules

Published Proposal,

This version:
http://wg21.link/p1874r1
Author:
(Apple)
Audience:
EWG, SG2
Project:
ISO/IEC JTC1/SC22/WG21 14882: Programming Language — C++

Abstract

The order of dynamic initialization of non-local variables with static storage duration (globals) imported from interface and header units is indeterminately sequenced relative to those of their importer. This has the potential to break code, including code which includes <iostream>, when moving to C++20. This paper explores what existing implementations do in this case, and defines an ordering as is currently done in both Clang and GCC. This paper resolves US082.

1. Effects of This Paper

Note: In all of the examples in this paper it is assumed that #include translation does not occur.

Status Quo:

import <iostream>;

struct G {
  G() {
    std::cout << "Constructing\n";
  }
} g;
// ☠️ std::ios_base::Init may not have been initialized yet,
// so std::cout may not have been initialized yet.

This Paper:

import <iostream>;

struct G {
  G() {
    std::cout << "Constructing\n";
  }
} g;
// ✅ std::ios_base::Init has been initialized,
// thus std::cout has been initialized.

2. The Problem

[basic.start.dynamic] states:

Dynamic initialization of non-local variables V and W with static storage duration are ordered as follows:

With textual #includes, the first rule holds for global initializers, but this breaks when moving to import, either explicitly or via #include translation. This happens because header units are their own translation units, and thus fall under the last rule. A similar problem exists with module interface units which transitively #include or import such a header.
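To make the textual #include case concrete, here is a minimal single-translation-unit sketch. StreamInit and init_log are hypothetical names that stand in for a library-provided stream initializer; they are not taken from any standard library implementation. Because both globals are defined in the same translation unit, the earlier definition is initialized first.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: a function-local static, so the log itself does not
// depend on the initialization order of namespace-scope variables.
std::vector<std::string>& init_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical stand-in for a global that a header such as <iostream>
// defines in order to set up its streams.
struct StreamInit {
    StreamInit() { init_log().push_back("StreamInit"); }
};

// With a textual #include, this definition is pasted into the including
// translation unit ahead of the user's own globals.
StreamInit stream_init;

struct G {
    G() {
        // Same translation unit, defined later: stream_init has already run.
        assert(!init_log().empty());
        init_log().push_back("G");
    }
} g;
```

Once the header becomes a header unit, stream_init and g live in different translation units, and this guarantee no longer applies under the status quo.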

3. The Solution

Clang already ran into this issue and has a simple fix: run all global initializers from transitively imported translation units at the point where those units are imported.

// H1.h
inline int a = init();

// H2.h
inline int b = init();

// TU.cpp
int c = init();
import "H1.h";
import "H2.h";

Clang modules tried to mimic includes as closely as possible, so in the above example Clang initializes c first, and then a before b. If you flip the order of the imports, it initializes b before a. While this is a reasonable model for Clang modules, which were intended to mimic #includes, it is a poor fit for C++20 modules, as it violates the rule that the order of imports doesn’t matter.

This paper explores a simpler user model based on dependency order. All global initializers from transitively imported translation units are run before any globals in the translation unit which imports them. For the above example, this means that the initialization of a and b is sequenced before the initialization of c, but the initializations of a and b are indeterminately sequenced relative to each other.
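The dependency-order model can be sketched as a post-order walk over the import graph. This is only an illustration under stated assumptions: ImportGraph, visit, and init_order are hypothetical names, each translation unit is reduced to a name plus the list of units it imports, and the walk returns one valid order out of possibly many.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each translation unit maps to the units it imports.
using ImportGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first post-order walk: every imported unit is emitted before its
// importer. Assumes the import graph is acyclic.
void visit(const ImportGraph& g, const std::string& tu,
           std::set<std::string>& done, std::vector<std::string>& order) {
    if (!done.insert(tu).second) return;  // already scheduled
    auto it = g.find(tu);
    if (it != g.end())
        for (const auto& dep : it->second) visit(g, dep, done, order);
    order.push_back(tu);
}

// Returns one order in which the units reachable from `root` could run
// their global initializers under the dependency-order model.
std::vector<std::string> init_order(const ImportGraph& g, const std::string& root) {
    assert(g.count(root) == 1 && "root must be a known translation unit");
    std::set<std::string> done;
    std::vector<std::string> order;
    visit(g, root, done, order);
    return order;
}
```

For the example above, with TU.cpp importing "H1.h" and "H2.h", the walk places both header units ahead of TU.cpp. The relative order of the two header units in the result is an artifact of the traversal, since the model leaves a and b indeterminately sequenced.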

3.1. The Catch

Initializers of inline variable V and variable W are ordered when "V is defined before W in every translation unit in which W is defined." If the above solution is applied blindly, then cycles can be created.

// H1.h
int external(int);

// H2.h
#include "H1.h"
inline int a = external(0);

// H3.h
#include "H1.h"
inline int b = external(1);

// M1.cppm
module;
#include "H2.h"
export module M1;
import "H3.h";

// M2.cppm
module;
#include "H3.h"
export module M2;
import "H2.h";

This example has the following sequenced-before graph:

[Graph: b -> a, a -> b]

M1 has an interface dependency on the header unit "H3.h", which contains a definition of b, thus b's initialization is sequenced before a's initialization. However, M2 has an interface dependency on the header unit "H2.h", which contains a definition of a, thus a's initialization is sequenced before b's initialization. This is a contradiction.

The ordering guarantees will need to be weakened for this case.
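The contradiction can be checked mechanically by treating the sequenced-before relation as a directed graph and looking for a cycle. The following is a minimal sketch, not part of the proposal; Graph, Mark, has_cycle, and check_catch_example are hypothetical names, and the edges are entered by hand from the example above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: an edge x -> y means "x is sequenced before y".
using Graph = std::map<std::string, std::vector<std::string>>;

enum class Mark { Unvisited, InProgress, Done };

// Depth-first search; meeting a node that is still in progress means the
// current path loops back on itself.
bool dfs_finds_cycle(const Graph& g, const std::string& node,
                     std::map<std::string, Mark>& marks) {
    Mark& m = marks[node];  // new entries are value-initialized to Unvisited
    if (m == Mark::InProgress) return true;
    if (m == Mark::Done) return false;
    m = Mark::InProgress;
    auto it = g.find(node);
    if (it != g.end())
        for (const auto& next : it->second)
            if (dfs_finds_cycle(g, next, marks)) return true;
    m = Mark::Done;
    return false;
}

bool has_cycle(const Graph& g) {
    std::map<std::string, Mark> marks;
    for (const auto& entry : g)
        if (dfs_finds_cycle(g, entry.first, marks)) return true;
    return false;
}

// Usage: the two edges from the example above form a cycle.
void check_catch_example() {
    Graph g;
    g["b"] = std::vector<std::string>{"a"};  // from M1: b before a
    g["a"] = std::vector<std::string>{"b"};  // from M2: a before b
    assert(has_cycle(g));
}
```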

4. Implementation Units?

If we’re defining an ordering for interfaces, why not also define one for module implementation units and solve the init order problem once and for all? While there are plausible ways to define an ordering for implementation units, there are several reasons why this direction was not chosen at this time:

5. Ship Vehicle

This paper is targeting C++20 as the standard library is subtly broken without it.

6. Proposed Polls

  1. Pick a model. In the graphs that follow a directed arrow represents a sequenced before relationship. Initializers with no forward path between them are indeterminately sequenced.

Note: SG2 and EWG approved the Relaxed Clang Model, and the wording below implements that.

// H1.h
int a = init();

// H2.h
int b = init();

// H3.h
int c = init();

// TU.cpp
int d = init();
import "H1.h";
import "H2.h";
int e = init();
import "H3.h";
[Graphs: sequenced-before edges for each model]

Current Clang Model: d -> a -> b -> e -> c
Relaxed Clang Model: d -> e, a -> e, b -> e (c has no edges)
Dependency Order Model: a -> d, b -> d, c -> d, d -> e
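The rule that initializers with no forward path between them are indeterminately sequenced can be expressed as a reachability query. The sketch below encodes the three graphs by hand and is only an illustration; Graph, Edges, sequenced_before, indeterminately_sequenced, and the three model functions are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

using Edges = std::vector<std::string>;
using Graph = std::map<std::string, Edges>;  // x -> y: x is sequenced before y

// Breadth-first search for a forward path from `from` to `to`.
bool sequenced_before(const Graph& g, const std::string& from, const std::string& to) {
    assert(from != to && "compare two distinct variables");
    std::set<std::string> seen{from};
    std::queue<std::string> pending;
    pending.push(from);
    while (!pending.empty()) {
        std::string node = pending.front();
        pending.pop();
        auto it = g.find(node);
        if (it == g.end()) continue;
        for (const auto& next : it->second) {
            if (next == to) return true;
            if (seen.insert(next).second) pending.push(next);
        }
    }
    return false;
}

// No forward path in either direction.
bool indeterminately_sequenced(const Graph& g, const std::string& x, const std::string& y) {
    return !sequenced_before(g, x, y) && !sequenced_before(g, y, x);
}

Graph current_clang_model() {
    Graph g;
    g["d"] = Edges{"a"}; g["a"] = Edges{"b"}; g["b"] = Edges{"e"}; g["e"] = Edges{"c"};
    return g;
}

Graph relaxed_clang_model() {
    Graph g;
    g["d"] = Edges{"e"}; g["a"] = Edges{"e"}; g["b"] = Edges{"e"};
    return g;
}

Graph dependency_order_model() {
    Graph g;
    g["a"] = Edges{"d"}; g["b"] = Edges{"d"}; g["c"] = Edges{"d"}; g["d"] = Edges{"e"};
    return g;
}
```

Under these encodings, c and e are ordered in the Current Clang Model and the Dependency Order Model (in opposite directions), but are indeterminately sequenced in the Relaxed Clang Model.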

7. Wording

The following wording assumes that the issue with interface dependency not applying to header units covered by US087 is fixed.

7.1. [basic.start.dynamic]

1+ A declaration D is appearance-ordered before a declaration E if
  • D appears in the same translation unit as E, or

  • the translation unit containing E has an interface dependency on the translation unit containing D,

in either case prior to E.

2 Dynamic initialization of non-local variables V and W with static storage duration are ordered as follows:

[ Note: This definition permits initialization of a sequence of ordered variables concurrently with another sequence. — end note ]
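As an informal reading aid for the appearance-ordered definition in 7.1, the sketch below models a translation unit as a sequence of declarations and imports and evaluates the relation for the poll example in section 6. It is a simplification under stated assumptions: Item, TranslationUnit, Program, declares, and appearance_ordered_before are hypothetical names, only direct imports are considered (transitive interface dependencies are ignored), and the relation is not the full ordering rule.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Item {
    enum Kind { Declaration, Import } kind;
    std::string name;  // variable name, or name of the imported unit
};
using TranslationUnit = std::vector<Item>;
using Program = std::map<std::string, TranslationUnit>;

bool declares(const TranslationUnit& tu, const std::string& var) {
    for (const Item& item : tu)
        if (item.kind == Item::Declaration && item.name == var) return true;
    return false;
}

// True if `d` is appearance-ordered before `e`. Precondition: `e` is
// declared in `unit`. Simplification: direct imports only.
bool appearance_ordered_before(const Program& p, const std::string& unit,
                               const std::string& d, const std::string& e) {
    assert(p.count(unit) == 1 && "unit must be part of the program");
    for (const Item& item : p.at(unit)) {
        if (item.kind == Item::Declaration) {
            if (item.name == e) return false;  // reached E; later items do not count
            if (item.name == d) return true;   // same unit, prior to E
        } else {
            auto imported = p.find(item.name);
            if (imported != p.end() && declares(imported->second, d))
                return true;                   // imported prior to E
        }
    }
    return false;
}

// The poll example from section 6.
Program poll_example() {
    Program p;
    p["H1.h"] = TranslationUnit{{Item::Declaration, "a"}};
    p["H2.h"] = TranslationUnit{{Item::Declaration, "b"}};
    p["H3.h"] = TranslationUnit{{Item::Declaration, "c"}};
    p["TU.cpp"] = TranslationUnit{{Item::Declaration, "d"}, {Item::Import, "H1.h"},
                                  {Item::Import, "H2.h"},   {Item::Declaration, "e"},
                                  {Item::Import, "H3.h"}};
    return p;
}
```

Under these assumptions, d, a, and b are each appearance-ordered before e, while c is not, which matches the edges listed for the Relaxed Clang Model.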