Doc No: | N2756 = 08-0266 |
Date: | 2008-09-16 |
Reply to: | Bill Seymour <stdbill.h@pobox.com> |
As a simple example,

    class A {
    public:
        int a = 7;
    };

would be equivalent to

    class A {
    public:
        int a;
        A() : a(7) {}
    };

The real benefits of member initializers do not become apparent until a class has multiple constructors. For many data members, especially private ones, all constructors initialize the member to a common value, as in the next example:

    class A {
    public:
        A(): a(7), b(5), hash_algorithm("MD5"), s("class A example") {}
        A(int a_val) : a(a_val), b(5), hash_algorithm("MD5"), s("Constructor run") {}
        A(int b_val) : a(7), b(b_val), hash_algorithm("MD5"), s("Constructor run") {}
        A(D d) : a(f(d)), b(g(d)), hash_algorithm("MD5"), s("Constructor run") {}
        int a, b;
    private:
        // Cryptographic hash to be applied to all A instances
        HashingFunction hash_algorithm;
        // String indicating state in object lifecycle
        std::string s;
    };

Even in this simple example, the redundant code is already problematic if the constructor arguments for hash_algorithm are copied incorrectly in one of A’s constructors or if one of the lifecycle states was accidentally misspelled as "Constructor Run". These kinds of errors can easily result in subtle bugs. Such inconsistencies are readily avoided using member initializers.

    class A {
    public:
        A(): a(7), b(5) {}
        A(int a_val) : a(a_val), b(5) {}
        A(int b_val) : a(7), b(b_val) {}
        A(D d) : a(f(d)), b(g(d)) {}
        int a, b;
    private:
        // Cryptographic hash to be applied to all A instances
        HashingFunction hash_algorithm("MD5");
        // String indicating state in object lifecycle
        std::string s("Constructor run");
    };

Not only does this eliminate redundant code that must be kept in sync by hand, it also makes the distinctions between the different constructors much clearer. (Indeed, in Java, where both forms of initialization are available, the use of member initializers is invariably preferred by experienced Java programmers in examples such as these.)
Now suppose that it is decided that MD5 hashes are not collision resistant enough and that SHA-1 hashes should be used. Without member initializers, all the constructors need to be updated. Unfortunately, if one developer is unaware of this change and creates a constructor that is defined in a different source file and continues to initialize the cryptographic algorithm to MD5, a very hard-to-detect bug will have been introduced. It seems better to keep the information in one place.
It may happen that a data member will usually have a particular value, but a few specialized constructors will need to be cognizant of that value. If a constructor initializes a particular member explicitly, the constructor initialization overrides the member initializations as shown below:

    class A {
    public:
        A(): a(7), b(5) {}
        A(int a_val) : a(a_val), b(5) {}
        A(int b_val) : a(7), b(b_val) {}
        A(D d) : a(f(d)), b(g(d)) {}
        // Copy constructor
        A(const A& aa) : a(aa.a), b(aa.b),
                         hash_algorithm(aa.hash_algorithm.getName()), s(aa.s) {}
        int a, b;
    private:
        // Cryptographic hash to be applied to all A instances
        HashingFunction hash_algorithm("MD5");
        // String indicating state in object lifecycle
        std::string s("Constructor run");
    };

A few additional points are worth noting.
During discussion in the Core Working Group at the September ’07 meeting in Kona, a question arose about the scope of identifiers in the initializer. Do we want to allow class scope with the possibility of forward lookup; or do we want to require that the initializers be well-defined at the point that they’re parsed?

    int x();

    struct S {
        int i;
        S() : i(x()) {} // currently well-formed, uses S::x()
        // ...
        static int x();
    };

    struct T {
        int i = x(); // should use T::x(), ::x() would be a surprise
        // ...
        static int x();
    };

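The lookup behaviour argued for above can be checked with a small sketch. It assumes a compiler that implements class-scope lookup for member initializers, as proposed in this paper; the return values 1 and 2 and the function demo_lookup are illustrative only.

```cpp
#include <cassert>

int x() { return 1; }              // namespace-scope function

struct T {
    int i = x();                   // class scope is searched first: finds T::x()
    static int x() { return 2; }   // declared after its use in the initializer
};

void demo_lookup() {
    T t;
    assert(t.i == 2);              // T::x() was called, not ::x()
}
```

If the initializer had to be well-defined at the point where it is parsed, the same declaration would silently pick up ::x() instead, which is the surprise the example in the text warns about.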

    struct S {
        int i(x); // data member with initializer
        // ...
        static int x;
    };

    struct T {
        int i(x); // member function declaration
        // ...
        typedef int x;
    };

One possible solution is to rely on the existing rule that, if a declaration could be an object or a function, then it’s a function:

    struct S {
        int i(j); // ill-formed...parsed as a member function,
                  // type j looked up but not found
        // ...
        static int j;
    };

A similar solution would be to apply another existing rule, currently used only in templates, that if T could be a type or something else, then it’s something else; and we can use “typename” if we really mean a type:

    struct S {
        int i(x);          // unambiguously a data member
        int j(typename y); // unambiguously a member function
    };

Both of those solutions introduce subtleties that are likely to be misunderstood by many users (as evidenced by the many questions on comp.lang.c++ about why
The solution proposed in this paper is to allow only initializers of the “= initializer-clause” and “{ initializer-list }” forms. That solves the ambiguity problem in most cases, for example:

    HashingFunction hash_algorithm{"MD5"};

Here, we could not use the = form because HashingFunction’s constructor is explicit.
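A minimal sketch of that situation follows. The HashingFunction class below is a hypothetical stand-in (this paper does not define it), and the accessor algorithm() is added only so the result can be observed.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the HashingFunction used in the examples.
class HashingFunction {
public:
    explicit HashingFunction(const char* name) : name_(name) {}
    const std::string& getName() const { return name_; }
private:
    std::string name_;
};

class A {
public:
    const std::string& algorithm() const { return hash_algorithm.getName(); }
private:
    HashingFunction hash_algorithm{"MD5"};      // OK: direct-list-initialization
    // HashingFunction hash_algorithm = "MD5";  // error: the constructor is explicit
};

void demo_explicit() {
    A a;
    assert(a.algorithm() == "MD5");
}
```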
In especially tricky cases, a type might have to be mentioned twice. Consider:

    vector<int> x = 3; // error: the constructor taking an int is explicit
    vector<int> x(3);  // three elements default-initialized
    vector<int> x{3};  // one element with the value 3

In that case, we have to choose between the two alternatives by using the appropriate notation:

    vector<int> x = vector<int>(3); // rather than vector<int> x(3);
    vector<int> x{3};               // one element with the value 3

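The difference between the two notations can be observed directly when they are used as member initializers. This is only a sketch; the struct S and its member names are made up for illustration.

```cpp
#include <cassert>
#include <vector>

struct S {
    std::vector<int> three_elems = std::vector<int>(3); // three elements, each 0
    std::vector<int> one_elem{3};                        // one element with the value 3
};

void demo_vector() {
    S s;
    assert(s.three_elems.size() == 3);
    assert(s.three_elems[0] == 0);
    assert(s.one_elem.size() == 1);
    assert(s.one_elem[0] == 3);
}
```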

    struct S {
        const int i = f();        // well-formed with forward lookup
        static const int j = f(); // always ill-formed for statics
        // ...
        constexpr static int f() { return 0; }
    };

    struct S {
        int i = j; // ill-formed without forward lookup, undefined behavior with
        int j = 3;
    };

(Unless caught by the compiler, i might be initialized with the undefined value of j.)
We believe:
Problem 1: This problem does not occur as we don’t propose the () notation. The = and {} initializer notations do not suffer from this problem.
Problem 2: adding the static keyword makes a number of differences, this being the least of them.
Problem 3: this is not a new problem, but is the same order-of-initialization problem that already exists with constructor initializers.
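To make the comparison in Problem 3 concrete, the following sketch shows that member initializers follow the same declaration-order rule as constructor initializers. The type names Ordered and SameRuleForCtor are hypothetical.

```cpp
#include <cassert>

struct Ordered {
    int j = 3;   // declared first, so initialized first
    int i = j;   // safe: j already holds 3
};

struct SameRuleForCtor {
    int j;
    int i;
    // Members are initialized in declaration order: j, then i.
    SameRuleForCtor() : j(3), i(j + 1) {}
};

void demo_order() {
    assert(Ordered().i == 3);
    assert(SameRuleForCtor().i == 4);
}
```

Swapping the two declarations in either struct would read j before it is initialized, which is exactly the existing hazard the text refers to.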
Because of this controversy, the authors no longer propose that auto be allowed for non-static data members.

In order to determine the layout of X, we now have 2-phase name lookup and ADL. Note that func could be either a type or a function; it may be found in T, the namespace of MyType, the associated namespace(s) of T when instantiated, the global namespace, an anonymous namespace, or any namespaces subject to a using directive. With care we could probably throw some concept_map lookup in for luck.

    template< class T >
    struct MyType : T {
        auto data = func();
        static const size_t erm = sizeof(data);
    };

Depending on the order of header inclusion I might even get different results for ADL, and break the One Definition Rule - which is not required to be diagnosed.
There are seven changes in suggested standardese to get from D2628, which CWG considered on Thursday in Sophia Antipolis, to the immediately previous version of this paper, N2673:
To get from N2673 to this paper, there are six changes to the suggested standardese, all of which we believe to be editorial:
There are four changes to get from N2712 to this paper:
(including from the mem-initializer or brace-or-equal-initializer for a non-static data member)
The potential scope of a name declared in a class consists not only of the declarative region following the name’s point of declaration, but also of all function bodies, brace-or-equal-initializers of non-static data members, and default arguments in that class (including such things in nested classes).
A name used in the definition of a member function (9.3) of class X following the function’s declarator-id, or in the brace-or-equal-initializer of a non-static data member (9.2) of class X, shall be declared in one of the following ways:
and change the last bullet to:
— if X is a member of namespace N, or is a nested class of a class that is a member of N, or is a local class or a nested class within a local class of a function that is a member of N, before the use of the name, in namespace N or in one of N’s enclosing namespaces.
The keyword this names a pointer to the object for which a non-static member function (9.3.2) is invoked or a non-static data member’s initializer (9.2) is evaluated. The keyword this shall be used only inside a non-static class member function body (9.3) or in a brace-or-equal-initializer for a non-static data member (9.2). The type of the expression is a pointer to the function’s or non-static data member’s class (9.3.2), possibly with cv-qualifiers on the class type. The expression is an rvalue.
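As an illustration of the wording above, the following sketch uses this inside member initializers. The struct Node and its members are hypothetical, chosen only to make the effect visible.

```cpp
#include <cassert>

struct Node {
    int value = 7;
    Node* self = this;            // this names the object being initialized
    int twice = this->value * 2;  // value is declared earlier, so it is already 7
};

void demo_this() {
    Node n;
    assert(n.self == &n);
    assert(n.twice == 14);
}
```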
An id-expression that denotes a non-static data member or non-static member function of a class can only be used:
— as part of a class member access (5.2.5) in which the object-expression refers to the member’s class or a class derived from that class, or
— to form a pointer to member (5.3.1), or
— in the body of a non-static member function of that class or of a class derived from that class (9.3.1), or
— in a mem-initializer for a constructor for that class or for a class derived from that class (12.6.2), or
— in a brace-or-equal-initializer for a non-static data member of that class or of a class derived from that class (12.6.2), or
— if that id-expression denotes a non-static data member and it is the sole constituent of an unevaluated operand, except for optional enclosing parentheses. [ Example:
...
Change paragraph 4:
The auto type-specifier can also be used in declaring an object in the condition of a selection statement (6.4) or an iteration statement (6.5), in the type-specifier-seq in a new-type-id (5.3.4), and in declaring a static data member with a brace-or-equal-initializer that appears within the member-specification of a class definition (9.4.2).
initializer:
    brace-or-equal-initializer
    ( expression-list )
brace-or-equal-initializer:
    = initializer-clause
    braced-init-list
- as the initializer in a variable definition (8.5)
- as the initializer in a new expression (5.3.4)
- in a return statement (6.6.3)
- as a function argument (5.2.2)
- as a subscript (5.2.1)
- as an argument to a constructor invocation (8.5, 5.2.3)
- as an initializer for a non-static data member (9.2)
- as a base-or-member initializer (12.6.2)
- on the right-hand side of an assignment (5.17)
member-declarator:
    declarator pure-specifier opt
    declarator brace-or-equal-initializer opt
    identifier opt : constant-expression
constant-initializer:
= constant-expression
A class is considered a completely-defined object type (3.9) (or complete type) at the closing } of the class-specifier. Within the class member-specification, the class is regarded as complete within function bodies, default arguments, exception-specifications, and brace-or-equal-initializers for non-static data members (including such things in nested classes). Otherwise it is regarded as incomplete within its own class member-specification.
A member-declarator can contain a constant-initializer only if it declares a static member (9.4) of const type, see 9.4.2.
A member can be initialized using a brace-or-equal-initializer. (For static data members, see 9.4.2 [class.static.data]; for non-static data members, see 12.6.2 [class.base.init]).
If a static data member is of const literal type, its declaration in the class definition can specify a brace-or-equal-initializer with an initializer-clause that is an integral constant expression. A static data member of literal type can be declared in the class definition with the constexpr specifier; if so, its declaration shall specify a brace-or-equal-initializer with an initializer-clause that is an integral constant expression. In both these cases, the member may appear in integral constant expressions. The member shall still be defined in a namespace scope if it is used in the program and the namespace scope definition shall not contain an initializer.
A default constructor for a class X is a constructor of class X that can be called without an argument. …
— its class has no virtual functions (10.3) and no virtual base classes (10.1), and
— no non-static data member of its class has a brace-or-equal-initializer, and
— all the direct base classes of its class have trivial default constructors, and
— for all the non-static data members of its class that are of class type (or array thereof), each such class has a trivial default constructor.
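The effect of the added bullet can be checked with a type trait. This is a sketch that assumes a library providing std::is_trivially_default_constructible; the struct names are hypothetical.

```cpp
#include <type_traits>

struct Plain     { int i; };       // no initializer: trivial default constructor
struct WithInit  { int i = 5; };   // initializer present: default constructor must run code

static_assert(std::is_trivially_default_constructible<Plain>::value,
              "Plain has a trivial default constructor");
static_assert(!std::is_trivially_default_constructible<WithInit>::value,
              "a brace-or-equal-initializer makes the default constructor non-trivial");
```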
Add a new first bullet to 12.6.2p4:
If a given non-static data member or base class is not named by a mem-initializer-id (including the case where there is no mem-initializer-list because the constructor has no ctor-initializer), then
— If the entity is a non-static data member that has a brace-or-equal-initializer, the entity is initialized as specified in 8.5.
— Otherwise, if the entity is a non-static ...
After the call to a constructor for class X has completed, if a member of X is neither initialized nor given a value during execution of the compound-statement of the body of the constructor, the member has indeterminate value. [ Example:

    struct A {
        A();
    };

    struct B {
        B(int);
    };

    struct C {
        C() { }     // initializes members as follows:
        A a;        // ok: calls A::A()
        const B b;  // error: B has no default constructor
        int i;      // ok: "i" has indeterminate value
        int j = 5;  // ok: "j" has the value "5"
    };

— end example ]
If a given non-static data member has both a brace-or-equal-initializer and a mem-initializer, the initialization specified by the mem-initializer is performed, and the non-static data member’s brace-or-equal-initializer is ignored. [ Example: given

    struct A {
        int i = /* some integer expression with side effects */ ;
        A(int arg) : i(arg) { }
        // ...
    };

the A(int) constructor will simply initialize i to the value of arg, and the side effects in i’s brace-or-equal-initializer will not take place. — end example ]
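Outside the proposed wording, the rule can be observed with a runnable sketch in which the side effect is a call counter. The names calls and next_id, and the added default constructor, are illustrative assumptions.

```cpp
#include <cassert>

int calls = 0;                      // counts evaluations of next_id()
int next_id() { return ++calls; }   // the "expression with side effects"

struct A {
    int i = next_id();              // used only when no mem-initializer names i
    A() {}                          // i is initialized from next_id()
    A(int arg) : i(arg) {}          // initializer above is ignored; next_id() not called
};

void demo_override() {
    A a;       // evaluates next_id(): calls becomes 1, a.i == 1
    A b(42);   // does not evaluate next_id(): calls stays 1
    assert(a.i == 1);
    assert(b.i == 42);
    assert(calls == 1);
}
```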
Member functions, including virtual functions (10.3), can be called during construction or destruction (12.6.2). When a virtual function is called directly or indirectly from a constructor (including from the mem-initializer or brace-or-equal-initializer for a non-static data member) or from a destructor, and the object to which the call applies is the object under construction or destruction, the function called is the one defined in the constructor or destructor’s own class or in one of its bases, but not a function overriding it in a class derived from the constructor or destructor’s class, or overriding it in one of the other base classes of the most derived object (1.8). If the virtual function call uses an explicit class member access (5.2.5) and the object-expression refers to the object under construction or destruction but its type is neither the constructor or destructor’s own class or one of its bases, the result of the call is undefined. [ Example:
...
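Separately from the elided example, the following sketch illustrates how the rule applies to a virtual call made from a member initializer. The classes Base and Derived and the return values are hypothetical.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}
    virtual int id() const { return 1; }
    int captured = id();   // evaluated while the Base subobject is under construction
};

struct Derived : Base {
    int id() const override { return 2; }
};

void demo_virtual() {
    Derived d;
    assert(d.captured == 1);  // Base::id() ran, not the override in Derived
    assert(d.id() == 2);      // after construction, the override is used
}
```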
A similar change appears in 12.7p5:
The typeid operator (5.2.8) can be used during construction or destruction (12.6.2). When typeid is used in a constructor (including from the mem-initializer or brace-or-equal-initializer for a non-static data member) or in a destructor, or used in a function called (directly or indirectly) from a constructor or from a destructor, if the operand of typeid refers to the object under construction or destruction, typeid yields the std::type_info object representing the constructor or destructor’s class. If the operand of typeid refers to the object under construction or destruction and the static type of the operand is neither the constructor or destructor’s class nor one of its bases, the result of typeid is undefined.
And again in 12.7p6:
Dynamic_casts (5.2.7) can be used during construction or destruction (12.6.2). When a dynamic_cast is used in a constructor (including from the mem-initializer or brace-or-equal-initializer for a non-static data member) or in a destructor, or used in a function called (directly or indirectly) from a constructor or from a destructor, if the operand of the dynamic_cast refers to the object under construction or destruction, this object is considered to be a most derived object that has the type of the constructor or destructor’s class. If the operand of the dynamic_cast refers to the object under construction or destruction and the static type of the operand is not a pointer to or object of the constructor or destructor’s own class or one of its bases, the dynamic_cast results in undefined behavior.
The implicitly-defined or explicitly-defaulted copy constructor for class X performs a memberwise copy of its subobjects. [ Note: brace-or-equal-initializers of non-static data members are ignored. See also the example in 12.6.2. — end note ] The order of copying is the same as the order of initialization of bases and members in a user-defined constructor (see 12.6.2). Each subobject is copied in the manner appropriate to its type:
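The note about copy constructors can be illustrated with a sketch that counts how often a member initializer is evaluated. The names evaluations and make_default and the struct X below are assumptions made for this example.

```cpp
#include <cassert>

int evaluations = 0;                           // counts evaluations of the initializer
int make_default() { ++evaluations; return 7; }

struct X {
    int m = make_default();
    X() {}        // uses the member initializer
    // The copy constructor is implicitly defined and copies m memberwise.
};

void demo_copy() {
    X a;          // evaluations == 1, a.m == 7
    a.m = 99;
    X b(a);       // memberwise copy: make_default() is not called again
    assert(b.m == 99);
    assert(evaluations == 1);
}
```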