N3412=12-0102
revision of N3366=12-0056
Jens Maurer
2012-09-19

Runtime-sized arrays with automatic storage duration (revision 2)

Changes compared to N3366

Motivation

Sometimes, a user wishes to allocate a local array whose size is not known at compile-time, but at runtime only. Nonetheless, the array's size will remain unchanged during the lifetime of the array.

Examples are

This paper proposes to add local runtime-sized arrays with automatic storage duration to C++, for example:

void f(std::size_t n)
{
   int a[n];
   for (std::size_t i = 0; i < n; ++i)
       a[i] = 2*i;
   std::sort(a, a+n);
}

Traditionally, the array bound "n" had to be a constant expression (see 8.3.4 dcl.array). For local arrays with automatic storage duration, this paper proposes to lift that restriction. The syntax is intended to be the same as that used for C99 variable length arrays (VLAs).

As a design guideline, the same rules should apply to "new T[n]" and a local array "T a[n]".

There is well-established existing practice with gcc, Clang, and Intel C++ all implementing a similar, if not identical feature. In fact, Douglas Gregor reported in c++std-ext-12553 on 2012-01-30:

Users really seem to want this feature. It's a fairly common extension, and when we tried to ban it out of principle (in Clang), our users reacted *very* strongly.

Scope

This paper does not propose to add all features of C99 variable length arrays to C++. In particular, the following features are explicitly excluded: Examples:
void f(std::size_t n)
{
  int a[n];
  unsigned int x = sizeof(a);            // ill-formed
  const std::type_info& ti = typeid(a);  // ill-formed
  typedef int t[n];                      // ill-formed
}

Discussion

Data structures that allocate from the heap access, by design, a global resource that is often highly contended in a multi-threaded program. Therefore, avoiding heap allocations is usually advantageous for performance. Allocating such data from the stack is much more efficient, because the stack is local to each thread and bytes on the stack are often cached locally. (Since each thread has a separate stack, it is unlikely that another thread on another CPU accesses the same data, thereby causing more expensive cache invalidations.)

The syntax does not require additional keywords. Instead, a restriction on the existing array declaration syntax is lifted in certain circumstances.

There is no reason to limit the feature to PODs as array element types, thus such a limitation is not proposed.

Stack overflow becomes more likely, in particular if the size depends on external input and is not properly checked. Some environments might therefore prohibit the use of the feature. Such a prohibition can be easily enforced with a static analysis tool.

There is no longer an upper bound on the size of a function's stack frame. This makes static analysis of stack usage harder.

Open Issues

Alternatives

The following alternatives were considered:

Changes to the Working Draft

Change in 3.9 basic.types paragraph 10:
A type is a literal type if it is:
Change in 3.9.2 basic.compound paragraph 2 (the same wording is added by the proposed resolution of core issue 1464):
These methods of constructing types can be applied recursively; restrictions are mentioned in 8.3.1 dcl.ptr, 8.3.4 dcl.array, 8.3.5 dcl.fct, and 8.3.2 dcl.ref. Constructing a type such that the number of bytes in its object representation exceeds the maximum value representable in the type std::size_t (18.2 support.types) is ill-formed.
Change in 4.2 conv.array paragraph 1:
An lvalue or rvalue expression of type "array of N T", "array of runtime bound of T", or "array of unknown bound of T" can be converted to a prvalue of type "pointer to T". The result is a pointer to the first element of the array.
Insert a new paragraph before 5.2.8 expr.typeid paragraph 2:
The typeid operator shall not be applied to an array of runtime bound.

When typeid is applied to a glvalue expression ...

Change in 5.3.3 expr.sizeof paragraph 1:
... The sizeof operator shall not be applied to an expression that has function or incomplete type, to an enumeration type whose underlying type is not fixed before all its enumerators have been declared, to an array of runtime bound, to the parenthesized name of such types, or to an lvalue that designates a bit-field. ...

Drafting note: 5.3.7 expr.unary.noexcept does not need to be changed, because the declaration of an array of runtime bound cannot be lexically part of the operand of a noexcept; see also 5.1.2p2 expr.prim.lambda.

Change in 6.5.4 stmt.ranged paragraph 1:
Insert a new paragraph before 7.1.3 dcl.typedef paragraph 3:
A typedef-name shall not name an array of runtime bound.

In a given non-class scope, a typedef specifier can be used to redefine the name of any type declared in that scope to refer to the type to which it already refers. [ Example: ... ]

Change in 7.1.6.2 dcl.type.simple paragraph 3:
The type denoted by decltype(e) is defined as follows:
Change in 8 dcl.decl paragraph 4:
noptr-declarator:
      declarator-id attribute-specifier-seqopt
      noptr-declarator parameters-and-qualifiers
      noptr-declarator [ constant-expressionopt expressionopt ] attribute-specifier-seqopt
      ( ptr-declarator )

Drafting note: Section 8.1 [dcl.name] defining the grammar term type-id is intentionally unchanged. Thus, constructing an array of runtime bound in a type-id is ill-formed, because the grammar continues to require all constant-expressions in array bounds.

Change in 8.3.1 dcl.ptr paragraph 1:
... Similarly, the optional attribute-specifier-seq (7.6.1) appertains to the pointer and not to the object pointed to. There shall be no pointers to arrays of runtime bound.
Change in 8.3.2 dcl.ref paragraph 5:
There shall be no references to references, no references to arrays of runtime bound, no arrays of references, and no pointers to references. ...
Change in 8.3.4 dcl.array paragraph 1 (partly taken from Mike Miller's drafting for core issue 1464):
In a declaration T D where D has the form
           D1 [ constant-expressionopt expressionopt ] attribute-specifier-seqopt
and the type of the identifier in the declaration T D1 is "derived-declarator-type-list T", then the type of the identifier of D is an array type; if the type of the identifier of D contains the auto type-specifier, the program is ill-formed. T is called the array element type; this type shall not be a reference type, the (possibly cv-qualified) type void, a function type, an array of unknown or runtime bound, or an abstract class type. Except as noted below, if the expression is omitted, the type of the identifier of D is "derived-declarator-type-list array of unknown bound of T". If the expression is present, it is implicitly converted to std::size_t. The expression is erroneous if: If the expression, after converting to std::size_t, is a core constant expression and the expression is erroneous, the program is ill-formed. If the constant-expression (5.19 expr.const) is present, it shall be an integral constant expression and its value shall be greater than zero. The constant expression specifies the bound of (number of elements in) the array. If the value of the constant expression is N, the array has N elements numbered 0 to N-1, and the type of the identifier of D is "derived-declarator-type-list array of N T". An object of array type If N is zero, an object of array type has no elements. Otherwise, it contains a contiguously allocated non-empty set of N subobjects of type T. Except as noted below, if the constant expression is omitted, the type of the identifier of D is "derived-declarator-type-list array of unknown bound of T", an incomplete object type. The type "derived-declarator-type-list array of N T" is a different type from the type "derived-declarator-type-list array of unknown bound of T", see 3.9 basic.types. Any type of the form "cv-qualifier-seq array of N T" is adjusted to "array of N cv-qualifier-seq T", and similarly for "array of unknown bound of T". The optional attribute-specifier-seq appertains to the array. [ Example:
  typedef int A[5], AA[2][3];
  typedef const A CA;         // type is "array of 5 const int"
  typedef const AA CAA;       // type is "array of 2 array of 3 const int"

  void f(unsigned int n) {
    int a[n];     // type of "a" is "array of runtime bound of int"
  }

-- end example ] [ Note: ... ]
Change in 8.3.4 dcl.array paragraph 3:
When several "array of" specifications are adjacent, a multidimensional array is created; only the first of the constant expressions that specify the bounds of the arrays may be omitted. In addition to ...
Add a new paragraph before 8.3.4 dcl.array paragraph 4:
An array of runtime bound shall only be used as the type of a local object with automatic storage duration.
Change in 8.3.5 dcl.fct paragraph 8:
If the type of a parameter includes a type of the form "array of runtime bound of T", "pointer to array of unknown bound of T", or "reference to array of unknown bound of T," the program is ill-formed. [ Footnote: ... ] Functions shall not have a return type of type array or function, although they may have a return type of type pointer or reference to such things.
Change in 8.5.1 dcl.init.aggr paragraph 2:
... An empty initializer list {} shall not be used as the initializer-clause for an array of unknown bound with static or thread storage duration. [ Footnote: The syntax provides for empty initializer-lists, but nonetheless C++ does not have zero length arrays. ]
Change in 8.5.1 dcl.init.aggr paragraph 6:
For types other than arrays of runtime bound (8.3.4 dcl.array), an An initializer-list is ill-formed if the number of initializer-clauses exceeds the number of members or elements to initialize. [ Example: ... ]
Change in 8.5.2 dcl.init.string paragraph 2:
[ Note: There cannot be more initializers than there are array elements; see 8.3.4 dcl.array. [ Example:
 char cv[4] = "asdf";                 // error
is ill-formed since there is no space for the implied trailing '\0'. -- end example ] -- end note ]
Change in 9.2 class.mem paragraph 10:
Non-static A non-static (9.4 class.static) data members member shall not have incomplete types. type or type "array of runtime bound". [ Note: In particular, a class C shall not contain a non-static member of class C, but it can contain a pointer or reference to an object of class C. ]
Change in 14.1 temp.param paragraph 7:
A non-type template-parameter shall not be declared to have floating point, class, array of runtime bound, or void type. [ Example: ... ]