Document #: | P1061R7 |
Date: | 2024-02-14 |
Project: | Programming Language C++ |
Audience: | CWG |
Reply-to: | Barry Revzin <barry.revzin@gmail.com>, Jonathan Wakely <cxx@kayari.org> |
R7 attempts to word the post-Varna version.
R6 added wording changes and some more complicated examples to motivate how to actually word this paper.
R5 has minor wording changes.
R4 significantly improves the wording after review in Issaquah.
R3 removes the exclusion of namespace-scope per EWG guidance.
R2 adds a section about implementation complexity, implementation experience, and wording.
R1 of this paper [P1061R1] was presented to EWG in Belfast 2019 [P1061R1.Minutes] which approved the direction as presented (12-5-2-0-1).
R0 of this paper [P1061R0] was presented to EWGI in Kona 2019 [P1061R0.Minutes], who reviewed it favorably and thought this was a good investment of our time (4-3-4-1-0). The consensus in the room was that the introduced pack need not be restricted to the trailing identifier.
Function parameter packs and tuples are conceptually very similar. Both are heterogeneous sequences of objects. Some problems are easier to solve with a parameter pack, some are easier to solve with a tuple. Today, it’s trivial to convert a pack to a tuple, but it’s somewhat more involved to convert a tuple to a pack. You have to go through std::apply() [N3915]:
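As a minimal sketch of the two directions in C++17 (the helper names `to_tuple` and `sum_elements` are made up for illustration and are not from the paper): going from a pack to a tuple is a single call, while going back requires handing a callable to `std::apply()`, and the pack only exists inside that callable.

```cpp
#include <tuple>

// Pack -> tuple: forward the pack straight into std::make_tuple.
template <class... Ts>
constexpr auto to_tuple(Ts... ts) {
    return std::make_tuple(ts...);
}

// Tuple -> pack: the elements are only available as a pack (elems)
// inside the lambda passed to std::apply.
template <class Tuple>
constexpr auto sum_elements(const Tuple& t) {
    return std::apply([](auto... elems) { return (0 + ... + elems); }, t);
}
```

Anything that needs the pack has to live inside the lambda body.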
This is great for cases where we just need to call a [non-overloaded] function or function object, but rapidly becomes much more awkward as we dial up the complexity. It gets worse still if I want to return from the outer scope based on what these elements happen to be.
How do we compute the dot product of two tuples? It’s a choose-your-own-adventure of awkward choices:
- Nested apply()
- Using index_sequence
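Both options could be sketched roughly as follows in C++17. The function names `dot_product_nested`, `dot_product_indexed` and `dot_product_impl` are chosen here for illustration, and both versions assume two non-empty tuples of equal length.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

// Option 1: nested std::apply. Each tuple is unpacked in its own lambda,
// and the inner lambda expands both packs in lockstep.
template <class P, class Q>
constexpr auto dot_product_nested(const P& p, const Q& q) {
    return std::apply([&](auto... p_elems) {
        return std::apply([&](auto... q_elems) {
            return (... + (p_elems * q_elems));
        }, q);
    }, p);
}

// Option 2: an index_sequence helper. The pack is a pack of indices,
// and std::get recovers the elements.
template <std::size_t... Is, class P, class Q>
constexpr auto dot_product_impl(std::index_sequence<Is...>, const P& p, const Q& q) {
    return (... + (std::get<Is>(p) * std::get<Is>(q)));
}

template <class P, class Q>
constexpr auto dot_product_indexed(const P& p, const Q& q) {
    return dot_product_impl(std::make_index_sequence<std::tuple_size_v<P>>{}, p, q);
}
```

For example, both `dot_product_nested(std::make_tuple(1, 2, 3), std::make_tuple(4, 5, 6))` and the indexed variant evaluate to 32.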
Regardless of which option you dislike the least, both are limited to only std::tuples. We don’t have the ability to do this at all for any of the other kinds of types that can be used in a structured binding declaration [P0144R2] - because we need to explicitly list the correct number of identifiers, and we might not know how many there are.
We propose to extend the structured bindings syntax to allow the user to introduce a pack as (at most) one of the identifiers:
std::tuple<X, Y, Z> f();
auto [x,y,z] = f(); // OK today
auto [...xs] = f(); // proposed: xs is a pack of length three containing an X, Y, and a Z
auto [x, ...rest] = f(); // proposed: x is an X, rest is a pack of length two (Y and Z)
auto [x,y,z, ...rest] = f(); // proposed: rest is an empty pack
auto [x, ...rest, z] = f(); // proposed: x is an X, rest is a pack of length one
// consisting of the Y, z is a Z
auto [...a, ...b] = f(); // ill-formed: multiple packs
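For comparison, here is a rough C++17 approximation of what `auto [x, ...rest] = f();` replaces for tuples. The helpers `tail` and `tail_impl` are hypothetical, and unlike the proposed pack, `rest` ends up as a new tuple holding copies rather than a pack of bindings.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical helper: copies elements 1..N-1 of a tuple into a new tuple.
template <class Tuple, std::size_t... Is>
constexpr auto tail_impl(const Tuple& t, std::index_sequence<Is...>) {
    return std::make_tuple(std::get<Is + 1>(t)...);
}

template <class Tuple>
constexpr auto tail(const Tuple& t) {
    static_assert(std::tuple_size_v<Tuple> > 0, "need at least one element for x");
    return tail_impl(t, std::make_index_sequence<std::tuple_size_v<Tuple> - 1>{});
}
```

With these, `std::get<0>(t)` plays the role of `x` and `tail(t)` the role of `rest`; this only works for `std::tuple`-like types that support `std::get`, which is part of the motivation for the proposed syntax.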
If we additionally add the structured binding customization machinery to std::integer_sequence, this could greatly simplify generic code:
Not only are these implementations more concise, but they are also more functional. I can just as easily use apply() with user-defined types as I can with std::tuple:
struct Point {
int x, y, z;
};
Point getPoint();
double calc(int, int, int);
double result = std::apply(calc, getPoint()); // ill-formed today, ok with proposed implementation
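Today, the closest equivalent for an aggregate like `Point` has to hard-code the number of bindings. A minimal sketch, where `apply3` is a hypothetical helper and `calc` is given a body (an assumption, the paper only declares it) so the example can be checked:

```cpp
#include <utility>

struct Point {
    int x, y, z;
};

// Assumed definition for illustration only.
constexpr double calc(int a, int b, int c) { return a + b + c; }

// Works only for types with exactly three bindings: the count is baked
// into the structured binding declaration.
template <class F, class T>
constexpr decltype(auto) apply3(F&& f, const T& value) {
    auto [a, b, c] = value;
    return std::forward<F>(f)(a, b, c);
}
```

A type with two or four members would need a separate `apply2` or `apply4`, which is the limitation that a pack-introducing binding would remove.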
Python 2 had always allowed for a syntax similar to C++17 structured bindings, where you have to provide all the identifiers:
>>> a, b, c, d, e = range(5) # ok
>>> a, *b = range(3)
File "<stdin>", line 1
a, *b = range(3)
^
SyntaxError: invalid syntax
But you could not do any more than that. Python 3 went one step further by way of PEP-3132 [PEP.3132]. That proposal allowed for a single starred identifier to be used, which would bind to all the elements as necessary:
The Python 3 behavior is synonymous with what is being proposed here. Notably, from that PEP:
Possible changes discussed were:
- Only allow a starred expression as the last item in the exprlist. This would simplify the unpacking code a bit and allow for the starred expression to be assigned an iterator. This behavior was rejected because it would be too surprising.
R0 of this proposal only allowed a pack to be introduced as the last item, which was changed in R1.
Unfortunately, this proposal has some implementation complexity. The issue is not so much this aspect:
This part is more or less straightforward - we have a dependent type and we introduce a pack from it, but we’re already in a template context where dealing with packs is just a normal thing.
The problem is this aspect:
We have not yet in the history of C++ had this notion of packs outside of dependent contexts. This is completely novel, and imposes a burden on implementations to have to track packs outside of templates where they previously had not.
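The two situations can be contrasted with a sketch. The name `sum_non_template` and the pack name `elems` come from the discussion in this paper; `sum_template` and the alias `SomeConcreteType` are assumptions made for illustration. The proposed spelling appears only in comments; the bodies use what is expressible in C++17, where the pack always has to be pushed into a generic lambda.

```cpp
#include <tuple>

// Dependent case: Tuple is a template parameter, so a pack introduced here
// would live in a template, which implementations already handle.
// Proposed spelling:  auto [...elems] = tuple;  return (... + elems);
template <class Tuple>
constexpr auto sum_template(Tuple tuple) {
    return std::apply([](auto... elems) { return (... + elems); }, tuple);
}

// Non-dependent case: nothing here is a template, yet the proposed
// declaration would introduce the pack elems directly in this function body.
using SomeConcreteType = std::tuple<int, long, short>;  // assumption

constexpr auto sum_non_template(SomeConcreteType tuple) {
    return std::apply([](auto... elems) { return (... + elems); }, tuple);
}
```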
However, in our estimation, this functionality is going to come to C++ in one form or another fairly soon. Reflection, in the latest form of [P1240R2], has many examples of introducing packs in non-template contexts as well - through the notion of a reflection range. That paper introduces several reifiers that can manipulate a newly-introduced pack, such as:
As with the structured bindings example in this paper - we have a non-dependent object outside of a template that we’re using to introduce a pack.
Furthermore, unlike some of the reflection examples, and some of the more generic pack facilities proposed in [P1858R2], this paper offers a nice benefit: all packs must still be declared before use. Even in the sum_non_template example which, as the name suggests, is not a template in any way, the pack elems needs an initial declaration. So any machinery that implementations need to track packs doesn’t need to be enabled everywhere - only when a pack declaration has been seen.
Jason Rice has implemented this in Clang. As far as we’ve been able to ascertain, it works great.
It is also available on Compiler Explorer.
The strategy the wording takes to handle sum_non_template above is to designate elems as a non-dependent pack and to state that non-dependent packs are instantiated immediately. In that example, elems is obviously non-dependent (nothing anywhere is dependent), so we just instantiate the fold-expression immediately.
But defining “non-dependent pack” is non-trivial. Examples from Richard Smith:
Is X<sizeof...(v)> a dependent type or is the pack v expanded eagerly because we already know its size?
One approach could be to say that packs are instantiated immediately only if they appear outside of any template. But then:
Here, despite being in a template, nothing is actually dependent.
I think the right line to draw here is to follow [CWG2074] - which asks about this example:
template<typename T>
void f() {
struct X {
typedef int type;
#ifdef DEPENDENT
T x;
#endif
};
X::type y; // #1
}
void g() { f<int>(); }
there is implementation variance in the treatment of #1, but whether or not DEPENDENT is defined appears to make no difference.
[…]
Perhaps the right answer is that the types should be dependent but a member of the current instantiation, permitting name lookup without typename.
I think the best rule would be to follow the suggested answer in this core issue: a structured binding pack is dependent if the type of its initializer (the E in 9.6 [dcl.struct.bind]) is dependent and it is not a member of the current instantiation. This would make neither of the ...v packs dependent, which seems conceptually correct.
Here is an interesting example, courtesy of Christof Meerwald:
This example demonstrates the need for having typename and/or template to disambiguate, even if we’re not in a template context.
Here is an interesting example, also courtesy of Christof Meerwald:
struct C
{
  int j;
  long l;
};
int g()
{
  auto [ ... i ] = C{ 1, 2L };
  return ( [c = i] () {
    if constexpr (sizeof(c) > sizeof(long)) {
      static_assert(false); // error?
    }
    struct C {
      // error, not templated?
      int f() requires (sizeof(c) == sizeof(int)) { return 1; }
      // error, not templated?
      int f() requires (sizeof(c) != sizeof(int)) { return 2; }
    };
    return C{}.f();
  } () + ... + 0 );
}
// error?
int v = g();
What happens here? Core’s intent is that this example is valid - the first static_assert declaration does not fire, this declares two different types C, both of which are valid, and v is initialized with the value 3.
Here’s an addendum from Jason Merrill, Jens Maurer, and John Spicer, slightly altered and somewhat gratuitously formatted hoping that it’s possible to understand:
struct C { int j; long l; };
int g() {
    auto [ ... i ] = C{ 1, 2L };
    if constexpr (sizeof...(i) == 0) {
        static_assert(false); // #1
    }
    return (
        // E1
        []{
            if constexpr (sizeof(int) == 0) {
                static_assert(false); // #2
            }
            return 1;
        }()
        *
        // E2
        [c=i]{
            if constexpr (sizeof(c) > sizeof(long)) {
                static_assert(false); // #3
            }
            if constexpr (sizeof(int) == 0) {
                static_assert(false); // #4
            }
            return 2;
        }()
        + ... + 0);
}
The expression E1 (that first immediately-invoked lambda) is completely non-dependent, no template packs, no structured binding packs, no packs of any kind. The expression E2 is an abridged version of the previous example’s lambda - it does depend on the structured binding pack i.
The intent is that the static_assert declarations in #1 and #3 do not fire (since i is not an empty pack and neither int nor long has larger size than long), and this matches user expectation. The static_assert in #2, if written in a regular function, would immediately fire - since it’s not guarded by any kind of template. So the questions are:
- Does #2 still fire in this context, or does it become attached to the pack somehow?
- Does #4 still fire in this context? It’s still exactly as non-dependent as it was before, but now it’s in a context that’s at least somewhat dependent - since we’re in a pack expansion over ...i
This brings up the question of what the boundary of dependence is.
The approach suggested by John Spicer is as follows. Consider this example (let si just be some statements):
Those statements get treated roughly as if they appeared in this context:
Of course, not exactly like that (we’re not literally introducing a generic lambda, there’s no added function scope, all of these statements are directly in foo so that returns work, etc.). But this is the model.
Importantly, it helps answer all of the questions in the previous examples: do the static_asserts fire in the Varna example addendum? No, none of them fire.
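The generic-lambda model can be approximated in C++17 today, which may help build intuition for why the assertions stay quiet. In this sketch (the function name `foo_model` and the particular conditions are invented for illustration), the statements following the would-be pack declaration are moved into a generic lambda, so a `static_assert` in a discarded `if constexpr` branch is never instantiated.

```cpp
#include <tuple>

struct C { int j; long l; };

// Rough stand-in for:
//     auto [...xs] = C{1, 2L};  s0; s1; s2;
// with the statements moved into a generic lambda that receives the pack.
constexpr int foo_model() {
    C c{1, 2L};
    return std::apply([](auto... xs) {
        if constexpr (sizeof...(xs) != 2) {
            // Discarded for a two-element pack, so never instantiated.
            static_assert(sizeof...(xs) == 3, "expected two or three elements");
        }
        return static_cast<int>(sizeof...(xs));
    }, std::make_tuple(c.j, c.l));
}
```

As the text notes, the proposal does not literally introduce a lambda or a new function scope; this only mirrors the treatment of the statements as templated.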
In addition to non-dependent packs, this paper also seems like it would offer the ability to declare a pack at namespace scope:
Structured bindings in namespace scope are a little odd to begin with, since they currently cannot be declared inline. A structured binding pack at namespace scope adds that much more complexity. However, rejecting it would be a very odd kind of restriction that would be surprising. If users want to do this, they should be able to. EWG also explicitly polled this question in a telecon, with a strong preference for allowing such namespace-scope declarations.
Add a drive-by fix to 7.5.6 [expr.prim.fold] after paragraph 3:
π A fold expression is a pack expansion.
Add a new grammar option for simple-declaration to 9.1 [dcl.pre], replacing identifier-list with sb-identifier-list in the structured binding form:
sb-pack-identifier:
    ... identifier
sb-identifier-list:
    identifier
    sb-pack-identifier
    sb-identifier-list , identifier
    sb-identifier-list , sb-pack-identifier
simple-declaration:
    decl-specifier-seq init-declarator-list_opt ;
    attribute-specifier-seq decl-specifier-seq init-declarator-list ;
    attribute-specifier-seq_opt decl-specifier-seq ref-qualifier_opt [ sb-identifier-list ] initializer ;
Adjust 9.1 [dcl.pre]/11 to delay static_asserts here too:
11 In a static_assert-declaration, the constant-expression E is contextually converted to bool and the converted expression shall be a constant expression ([expr.const]). If the value of the expression E when so converted is true or the expression is evaluated either in the context of a template definition or in the locus of a structured binding pack declaration, the declaration has no effect and the static_assert-message is an unevaluated operand ([expr.context]). Otherwise, the static_assert-declaration fails and […]
Change 9.1 [dcl.pre] paragraph 8, replacing identifier-list with sb-identifier-list:
8 A simple-declaration with an sb-identifier-list is called a structured binding declaration ([dcl.struct.bind]). The decl-specifier-seq shall contain only the type-specifier auto and cv-qualifiers. The initializer shall be of the form “= assignment-expression”, of the form “{ assignment-expression }”, or of the form “( assignment-expression )”, where the assignment-expression is of array or non-union class type.
Extend 9.3.4.6 [dcl.fct]/5:
5 The type of a function is determined using the following rules. The type of each parameter (including function parameter packs, after immediately expanding structured binding packs ([temp.variadic])) is determined from its own parameter-declaration ([dcl.decl]). After determining the type of each parameter, any parameter of type “array of T” or of function type T is adjusted to be “pointer to T”. After producing the list of parameter types, any top-level cv-qualifiers modifying a parameter type are deleted when forming the function type. The resulting list of transformed parameter types and the presence or absence of the ellipsis or a function parameter pack is the function’s parameter-type-list.
[Note 3: This transformation does not affect the types of the parameters. For example, int(*)(const int p, decltype(p)*) and int(*)(int, const int*) are identical types. - end note] [Example 2:
void f(char*); // #1
void f(char[]) {} // defines #1
void f(const char*) {} // OK, another overload
void f(char *const) {} // error: redefines #1
void g(char(*)[2]); // #2
void g(char[3][2]) {} // defines #2
void g(char[3][3]) {} // OK, another overload
void h(int x(const int)); // #3
void h(int (*)(int)) {} // defines #3
+ struct C {
+     int i;
+     long l;
+ };
+ auto [...v] = C{ 1, 0 };
+ C k(int, long); // #4
+ C k(decltype(v)... p) { // defines #4
+     return C{p...}; // non-dependent function parameter pack p
+                     // is instantiated immediately ([temp.variadic])
+ }
— end example]
Change 9.6 [dcl.struct.bind] paragraph 1, replacing identifier-list with sb-identifier-list:
1 A structured binding declaration introduces the identifiers v0, v1, v2, …, vN-1 of the sb-identifier-list as names ([basic.scope.declarative]) of structured bindings. A structured binding is either an identifier of the sb-identifier-list or an element of the pack introduced by an sb-pack-identifier. The declaration shall contain at most one sb-pack-identifier. Let cv denote the cv-qualifiers in the decl-specifier-seq.
Introduce new paragraphs after 9.6 [dcl.struct.bind] paragraph 1, introducing the terms “structured binding size” and SBi:
1+1 The structured binding size of a type E is the required number of names that need to be introduced by the structured binding declaration, as defined below. If there is no structured binding pack, then the number of elements in the sb-identifier-list shall be equal to the structured binding size. Otherwise, the number of elements of the structured binding pack is the structured binding size less the number of non-pack elements in the sb-identifier-list and the number of non-pack elements shall be no more than the structured binding size.
1+2 Let SBi denote the ith structured binding in the structured binding declaration after expanding the structured binding pack, if any. [ Note: If there is no structured binding pack, then SBi denotes vi. - end note ] [ Example:
struct C { int x, y, z; };
auto [a, b, c] = C(); // SB0 is a, SB1 is b, and SB2 is c
auto [d, ...e] = C(); // SB0 is d, the pack e (v1) contains two identifiers: SB1 and SB2
auto [...f, g] = C(); // the pack f (v0) contains two identifiers: SB0 and SB1, and SB2 is g
auto [h, i, j, k, ...l] = C(); // error: structured binding size is too small
- end example ]
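As a non-normative illustration of the arithmetic in paragraph 1+1, the number of elements in the structured binding pack can be computed as below. The helper `pack_element_count` is hypothetical and not part of the proposed wording; it returns -1 where the declaration would be ill-formed.

```cpp
#include <cstddef>

// binding_size:   the structured binding size of E
// non_pack_names: the number of non-pack identifiers in the sb-identifier-list
constexpr long pack_element_count(std::size_t binding_size, std::size_t non_pack_names) {
    return non_pack_names <= binding_size
        ? static_cast<long>(binding_size - non_pack_names)
        : -1;  // more non-pack names than elements: ill-formed
}

static_assert(pack_element_count(3, 1) == 2);   // auto [d, ...e] = C();
static_assert(pack_element_count(3, 4) == -1);  // auto [h, i, j, k, ...l] = C();
```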
Change 9.6 [dcl.struct.bind] paragraph 3 to define a structured binding size (replacing the requirement on the number of elements in the identifier-list, and vi with SBi) and extend the example:
3 If E is an array type with element type T, the structured binding size of E is equal to the number of elements of E. Each SBi is the name of an lvalue that refers to the element i of the array and whose type is T; the referenced type is T. [Note: The top-level cv-qualifiers of T are cv. - end note] [Example:
auto f() -> int(&)[2];
auto [ x, y ] = f(); // x and y refer to elements in a copy of the array return value
auto& [ xr, yr ] = f(); // xr and yr refer to elements in the array referred to by f's return value
+ auto g() -> int(&)[4];
+ auto [a, ...b, c] = g(); // a is the first element of the array, b is a pack containing the second and
+                          // third elements, and c is the fourth element
+ auto& [...e] = g(); // e is a pack referring to the four elements of the array
— end example]
Change 9.6 [dcl.struct.bind] paragraph 4 to define a structured binding size (replacing the requirement on the number of elements in the identifier-list, and vi with SBi):
4 Otherwise, if the qualified-id std::tuple_size<E> names a complete type, the expression std::tuple_size<E>::value shall be a well-formed integral constant expression and the structured binding size of E is equal to the value of that expression. […] Each SBi is the name of an lvalue of type Ti that refers to the object bound to ri; the referenced type is Ti.
Change 9.6 [dcl.struct.bind] paragraph 5 to define a structured binding size (replacing the requirement on the number of elements in the identifier-list, and vi with SBi):
5 Otherwise, all of E’s non-static data members shall be direct members of E or of the same base class of E, well-formed when named as e.name in the context of the structured binding, E shall not have an anonymous union member, and the structured binding size of E is equal to the number of non-static data members of E. Designating the non-static data members of E as m0, m1, m2, . . . (in declaration order), each SBi is the name of an lvalue that refers to the member mi of E and whose type is cv Ti, where Ti is the declared type of that member; the referenced type is cv Ti. The lvalue is a bit-field if that member is a bit-field.
Change 13.1 [temp.pre]/8 to extend the notion of what is a templated entity:
8 An entity is templated if it is
- (8.1) a template,
- (8.2) an entity defined ([basic.def]) or created ([class.temporary]) in a templated entity,
- (8.3) a member of a templated entity,
- (8.4) an enumerator for an enumeration that is a templated entity,
- (8.5) the closure type of a lambda-expression ([expr.prim.lambda.closure]) appearing in the declaration of a templated entity, or
- (8.6) an entity whose declaration’s locus is inhabited by the declaration of a structured binding pack.
[ Note 1: A local class, a local or block variable, or a friend function defined in a templated entity is a templated entity. — end note ]
[Example:
struct C
{
  int j;
  long l;
};
int g()
{
  auto [ ... i ] = C{ 1, 2L };
  return ( [c = i] () {
    struct C {
      int f() requires (sizeof(c) == sizeof(int)) { return 1; }
      int f() requires (sizeof(c) != sizeof(int)) { return 2; }
    };
    return C{}.f();
  } () + ... + 0 );
}
int v = g(); // OK: v == 3
- end example]
Add a new paragraph to 13.7.4 [temp.variadic], after paragraph 3:
3+ A structured binding pack is an identifier that introduces zero or more structured bindings ([dcl.struct.bind]). [ Example:
auto foo() -> int(&)[2];
auto [...a] = foo(); // a is a structured binding pack containing 2 elements
auto [b, c, ...d] = foo(); // d is a structured binding pack containing 0 elements
- end example]
In 13.7.4 [temp.variadic], change paragraph 4:
4 A pack is a template parameter pack, a function parameter pack, an init-capture pack, or a structured binding pack. The number of elements of a template parameter pack or a function parameter pack is the number of arguments provided for the parameter pack. The number of elements of an init-capture pack is the number of elements in the pack expansion of its initializer.
In 13.7.4 [temp.variadic], paragraph 5 (describing pack expansions) remains unchanged.
In 13.7.4 [temp.variadic], add a bullet to paragraph 8:
8 Such an element, in the context of the instantiation, is interpreted as follows:
- (8.1) if the pack is a template parameter pack, the element is a template parameter ([temp.param]) of the corresponding kind (type or non-type) designating the ith corresponding type or value template argument;
- (8.2) if the pack is a function parameter pack, the element is an id-expression designating the ith function parameter that resulted from instantiation of the function parameter pack declaration;
- (8.3) if the pack is an init-capture pack, the element is an id-expression designating the variable introduced by the ith init-capture that resulted from instantiation of the init-capture pack; otherwise
- (8.4) if the pack is a structured binding pack, the element is an id-expression designating the ith structured binding that resulted from the structured binding declaration.
Add a new paragraph after 13.7.4 [temp.variadic]
12 If all of the packs expanded by a pack expansion are not dependent, the pack expansion is instantiated immediately.
[ Example:
struct C { };
void g(...); // #1
template <typename T>
void f() {
    C arr[1];
    auto [...e] = arr;
    g(e...); // calls #1
}
void g(C); // #2
int main() {
    f<int>();
}
- end example ]
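The outcome in that example follows the rule that already applies to any non-dependent call inside a template: it is resolved where the template is defined, so overloads declared later are not considered. A C++17 analogue without packs, as a sketch (the base class `B` and the `constexpr` bodies are additions made here so that the result can be checked at compile time):

```cpp
struct B {};
struct C : B {};

constexpr int g(B) { return 1; }   // #1

template <typename T>
constexpr int f() {
    C c{};
    return g(c);                   // non-dependent call: resolved here, selects #1
}

constexpr int g(C) { return 2; }   // #2, declared too late to be seen by f

static_assert(f<int>() == 1);      // f uses #1 even though #2 is a better match
static_assert(g(C{}) == 2);        // a call after #2 is declared picks #2
```

Immediately instantiating a non-dependent pack expansion gives `g(e...)` the same definition-time behavior.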
Add a bullet to 13.8.3.3 [temp.dep.expr]/3:
3 An id-expression is type-dependent if it is a template-id that is not a concept-id and is dependent; or if its terminal name is
- (3.1) associated by name lookup with one or more declarations declared with a dependent type,
- (3.2) associated by name lookup with a non-type template-parameter declared with a type that contains a placeholder type,
- (3.3) associated by name lookup with a variable declared with a type that contains a placeholder type ([dcl.spec.auto]) where the initializer is type-dependent,
- (3.4) associated by name lookup with one or more declarations of member functions of a class that is the current instantiation declared with a return type that contains a placeholder type,
- (3.5) associated by name lookup with a structured binding declaration ([dcl.struct.bind]) whose brace-or-equal-initializer is type-dependent,
- (3.5b) associated by name lookup with a pack, unless that pack is a non-type template parameter pack whose types are non-dependent,
- (3.6) associated by name lookup with an entity captured by copy ([expr.prim.lambda.capture]) in a lambda-expression that has an explicit object parameter whose type is dependent ([dcl.fct]),
- (3.7) the identifier __func__ ([dcl.fct.def.general]), where any enclosing function is a template, a member of a class template, or a generic lambda,
- (3.8) a conversion-function-id that specifies a dependent type, or
- (3.9) dependent
Add a carve-out in 13.8.3.4 [temp.dep.constexpr]/4:
4 Expressions of the following form are value-dependent:
unless the identifier is a non-dependent pack, or all of the packs expanded in the fold-expression are non-dependent packs.
Bump __cpp_structured_bindings in 15.11 [cpp.predefined]:
Thanks to Michael Park and Tomasz Kamiński for their helpful feedback. Thanks to Richard Smith for help with the wording. Thanks especially to Jason Rice for the implementation.
[CWG2074] Richard Smith. 2015-01-20. Type-dependence of local class of function template.
https://wg21.link/cwg2074
[N3915] Peter Sommerlad. 2014-02-14. apply() call a function with arguments from a tuple (V3).
https://wg21.link/n3915
[P0144R2] Herb Sutter. 2016-03-16. Structured Bindings.
https://wg21.link/p0144r2
[P1061R0] Barry Revzin, Jonathan Wakely. 2018-05-01. Structured Bindings can introduce a Pack.
https://wg21.link/p1061r0
[P1061R0.Minutes] EWGI. 2019. Kona 2019 EWGI: P1061R0.
http://wiki.edg.com/bin/view/Wg21kona2019/P1061
[P1061R1] Barry Revzin, Jonathan Wakely. 2019-10-07. Structured Bindings can introduce a Pack.
https://wg21.link/p1061r1
[P1061R1.Minutes] EWG. 2019. Belfast 2019 EWG: P1061R1.
https://wiki.edg.com/bin/view/Wg21belfast/P1061-EWG
[P1240R2] Daveed Vandevoorde, Wyatt Childers, Andrew Sutton, Faisal Vali. 2022-01-14. Scalable Reflection.
https://wg21.link/p1240r2
[P1858R2] Barry Revzin. 2020-03-01. Generalized pack declaration and usage.
https://wg21.link/p1858r2
[PEP.3132] Georg Brandl. 2007. PEP 3132 – Extended Iterable Unpacking.
https://www.python.org/dev/peps/pep-3132/