ISO/IEC JTC1 SC22 WG21 N2957 = 09-0147 - 2009-09-25
Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org
Problem
Importance
Solution
Wording
Acknowledgments
References
The adoption of [N2927] restricts the referents of a lambda expression to the immediately-enclosing function or lambda expression. This restriction is severe and does damage to the utility of the lambda feature.
One principle of lambda design is that programmers should be able to replace all well-structured control constructs with function calls taking lambda expressions as arguments. [N2927] prevents that replacement, and is essentially equivalent to saying that the body of a nested loop cannot access function parameters.
Consider the following function to multiply square matricies.
matrix operator*( const matrix& a, const matrix& b ) { int n = a.numrows(); matrix c( n, n ); for ( int i = 0; i < n; i++ ) { for ( int j = 0; j < n; j++ ) { double t = 0.0; for ( int k = 0; k < n; k++ ) { t += a[i][k] * b[k][j]; } c[i][j] = t; } } return c; }
Its rewrite into functions and lambdas, as intended by the adopted [N2550], is as follows. Note that the rewrite is approximately the same size.
matrix operator*( const matrix& a, const matrix& b ) { int n = a.numrows(); matrix c( n, n ); for_range( 0, n-1, [&]( int i ){ for_range( 0, n-1, [&]( int j ){ double t = 0.0; for_range( 0, n-1, [&]( int k ){ t += a[i][k] * b[k][j]; } ); c[i][j] = t; } ); } ); return c; }
However this straightforward reformulation is ill-formed as per [N2927] wording for 5.1.1 [expr.prim.lambda] paragraphs 9–12. To make the code well-formed, the function must be rewritten as follows, where inserted text shows new code, which sometimes alter variable references.
matrix operator*( const matrix& a, const matrix& b ) { int n = a.numrows( ); matrix c( n, n ); for_range( 0, n-1, [&]( int i ){ const matrix& a1 = a; const matrix& b1 = b; int& n1 = n; matrix& c1 = c; for_range( 0, n1-1, [&]( int j ){ const matrix& a2 = a1; const matrix& b2 = b1; int& n2 = n1; int& i2 = i; double t = 0.0; for_range( 0, n1-1, [&]( int k ){ t += a2[i2][k] * b2[k][j]; } ); c1[i][j] = t; } ); } ); return c; }
Such extensive mechanical editing will inhibit use, make errors more likely, and complicate code maintenance. Such editing is properly the domain of the compiler.
One view of lambdas is that
they are primarily useful for very short inline function arguments
to standard algorithms,
and hence do not need general scoping support.
This view considerably underestimates the power of control abstraction.
Indeed, the algorithms in the standard library
are the beginning of control abstraction, not the end.
In a highly concurrent world,
programmers need more control constructs
than can be reasonably provided by the language.
Programmers will need to define and use their own constructs,
and they will be using those constructs
with the same frequency that
current programmers use for
and if
.
Consider this following example, adapted from [CrowlLeBlanc94], for Gaussian elimination. It contains two constrol structures, one for the triangulation and one for the row reduction. Only the latter is a common control construct.
int size = system.size(); triangulate( size, [&]( int pivot, int reduce ) { auto fraction = system[reduce][pivot] / system[pivot][pivot]; forall( pivot, size-1, [&]( int variable ) { system[reduce][variable] -= system[pivot][variable] * fraction; } ); } );
In particular, the triangulate
control construct
encapsulates all the available parallelism in the triangulation.
Just as importantly, it also captures all the necessary synchronization.
So,
this highly parallel code contains no threads and no synchronization,
which are the source of most effort and bugs in parallel programming.
More imporantly,
there is no particular commitment
to any sequential or parallel execution of its control constructs.
A vector machine might choose to execute
triangulate
serially and forall
concurrently.
On the other hand, a NUMA machine would likely
execute triangulate
with coarse parallelism
and forall
sequentially.
Good control abstraction enables more effective use of machine resources. But to be effectively programmed, lambda expressions must be able to easily substitute for ordinary well-structured control statements. For that easy substitution, we must restore the intent of [N2550].
In outline, the solution is as follows.
Define reaching scope for a lambda expression as the enclosing lambda expressions out to and including the innermost function. That is, equivalent to replacing the lambda expressions by blocks.
Define explicit and implicit captures with reference to the reaching scope.
Define a capture of an entity to be a use (formally) of that entity within the scope containing the lambda expression.
Define a lambda expression to be ill-formed if it uses (formally) anything from its reaching scope that it does not capture.
Define a lambda expression to be ill formed if it captures an entity that is not either defined or captured in the immediately enclosing lambda expression or function.
For entities captured by an outer lambda expression, define its nested lambda expression to capture those entities from the closure object of the outer lambda expression. That is, all captures are one level deep. This definition ensures that a closure object for a lambda expression contains all the information necessary to form a closure object for nested lambda expressions.
As [N2927] has not yet been incorporated into the working draft, all edits are applied to the wording in [N2927]. All edits fall in section 5.1.1 [expr.prim.lambda].
Before paragraph 10, add a new paragraph as follows.
The reaching scope of a lambda expression is the set of enclosing scopes up to and including the innermost enclosing function and its parameters. [Note: This reaching scope includes any intermediate lambda expressions and their parameter lists. The reaching scope of a lambda expression is the scope as it would have been if all lambda expressions were replaced by blocks. —end note]
Edit paragraph 10 as follows.
The identifiers in a capture-list are looked up using the usual rules for unqualified name lookup ([basic.lookup.unqual] 3.4.1); each such lookup shall find a variable or reference with automatic storage duration declared within the reaching scope of the lambda expression. An entity (i.e., a variable, a reference, or
this
) is said to be explicitly captured if it appears in the lambda-expression's capture-list. An explicitly captured entity is used ([basic.def.odr] 3.2) in the scope containing the lambda expression.
Edit paragraph 11 as follows.
If a lambda-expression has an associated capture-default and its compound-statement uses ([basic.def.odr] 3.2)
this
or a variable or reference with automatic storage durationdeclared in an enclosing function or lambda-expressionand the used entity is not explicitly captured, then the used entity is said to be implicitly captured; such entities shall be declared within the reaching scope of the lambda expression. An implicitly captured entity is used ([basic.def.odr] 3.2) in the scope containing the lambda expression. [Note: Thus the implicit capture of an entity by a nested lambda-expression can cause its implicit capture by the containing lambda-expression. Implicit uses ofthis
can result in implicit capture. —end note]
Edit paragraph 12 as follows.
If
this
is captured, either explicitly or implicitly, the lambda-expression shall appeardirectly inwithin the definition of a non-static member function, i.e., not in another lambda-expression.[Note: This rule prevents access from a nested lambda-expression to the members of the enclosing lambda-expression's closure object. —end note]
Move the existing paragraph 9 to after paragraph 12 and edit it as follows.
A lambda-expression's compound-statement can use (see above)If a lambda-expression uses ([basic.def.odr] 3.2)this
from an immediately-enclosing member function definition, as well as variables and references with automatic storage duration from an immediately-enclosing function definition or lambda-expression, provided these entities are captured (as described below). Any other use ([basic.def.odr] 3.2) of a variable or reference with automatic storage duration declared outside the lambda-expression is ill-formed.this
or a variable or reference with automatic storage duration from its reaching scope, and that entity is not captured by the lambda-expression, the program is ill-formed; diagnostic required. If a lambda-expression captures an entity, and that entity is not defined or captured in the immediately enclosing lamba expression or function, the program is ill-formed; diagnostic required. [Example:void f1(int i) { int const N = 20; auto m1 = [=]{ int const M = 30; auto m2 =
[=][i]{ int x[N][M]; // OK: N and M are not "used" x[0][0] = i; // OK: i is explicitly captured by m2 // and implicitly captured by m1// error: i is not declared in the immediately};// enclosing lambda-expression}; struct s1 { int f; int work(int n) { int m = n*n; int j = 40; auto m3 = [this,m]{ auto m4 = [&,j]{ // error: j not captured by m3 int x = n; // error: n implicitly captured by m4 // but not captured by m3 x += m; // OK: m implicitly captured by m4 // and explicitly captured by m3 x += N; // error: N is outside of the reaching scope x += f; // OK: this captured implicitly by m4 // and explicitly by m3 } } } } }—end example]
After the above paragraph, add a new paragraph as follows.
If a lambda-expression
m1
captures an entity, and that entity is captured by an immediately enclosing lambda-expressionm2
, thenm1
's capture is transformed as follows:
if
m2
captures the entity by copy,m1
captures the corresponding non-static data member ofm2
's closure type;if
m2
captures the entity by reference,m1
captures the same entity captured bym2
.[Note: Therefore, all captures are one level deep and a closure object for a lambda expression contains all the information necessary to form a closure object for nested lambda expressions. —end note] [Example: The nested lambda expressions and invocations below will output
123234
.int a = 1, b = 1, c = 1; auto m1 = [a, &b, &c]() mutable { auto m2 = [a, b, &c]() mutable { std::cout << a << b << c; a = 4; b = 4; c = 4; }; a = 3; b = 3; c = 3; m2(); }; a = 2; b = 2; c = 2; m1(); std::cout << a << b << c;
—end example]
William M. Miller provided many helpful comments and alternate wording suggestions.