Reaching Scope of Lambda Expressions

ISO/IEC JTC1 SC22 WG21 N2957 = 09-0147 - 2009-09-25

Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org

Problem
Importance
Solution
Wording
Acknowledgments
References

Problem

The adoption of [N2927] restricts the referents of a lambda expression to the immediately-enclosing function or lambda expression. This restriction is severe and does damage to the utility of the lambda feature.

One principle of lambda design is that programmers should be able to replace all well-structured control constructs with function calls taking lambda expressions as arguments. [N2927] prevents that replacement, and is essentially equivalent to saying that the body of a nested loop cannot access function parameters.

Consider the following function to multiply square matricies.

matrix operator*( const matrix& a, const matrix& b )
{
    int n = a.numrows();
    matrix c( n, n );
    for ( int i = 0; i < n; i++ ) {
        for ( int j = 0; j < n; j++ ) {
            double t = 0.0;
            for ( int k = 0; k < n; k++ ) {
                t += a[i][k] * b[k][j];
            }
            c[i][j] = t;
        }
    }
    return c;
}

Its rewrite into functions and lambdas, as intended by the adopted [N2550], is as follows. Note that the rewrite is approximately the same size.

matrix operator*( const matrix& a, const matrix& b )
{
    int n = a.numrows();
    matrix c( n, n );
    for_range( 0, n-1, [&]( int i ){
        for_range( 0, n-1, [&]( int j ){
            double t = 0.0;
            for_range( 0, n-1, [&]( int k ){
                t += a[i][k] * b[k][j];
            } );
            c[i][j] = t;
        } );
    } );
    return c;
}

However this straightforward reformulation is ill-formed as per [N2927] wording for 5.1.1 [expr.prim.lambda] paragraphs 9–12. To make the code well-formed, the function must be rewritten as follows, where inserted text shows new code, which sometimes alter variable references.

matrix operator*( const matrix& a, const matrix& b )
{
    int n = a.numrows( );
    matrix c( n, n );
    for_range( 0, n-1, [&]( int i ){
        const matrix& a1 = a;
        const matrix& b1 = b;
	int& n1 = n;
	matrix& c1 = c;
        for_range( 0, n1-1, [&]( int j ){
            const matrix& a2 = a1;
            const matrix& b2 = b1;
	    int& n2 = n1;
	    int& i2 = i;
            double t = 0.0;
            for_range( 0, n1-1, [&]( int k ){
                t += a2[i2][k] * b2[k][j];
            } );
            c1[i][j] = t;
        } );
    } );
    return c;
}

Such extensive mechanical editing will inhibit use, make errors more likely, and complicate code maintenance. Such editing is properly the domain of the compiler.

Importance

One view of lambdas is that they are primarily useful for very short inline function arguments to standard algorithms, and hence do not need general scoping support. This view considerably underestimates the power of control abstraction. Indeed, the algorithms in the standard library are the beginning of control abstraction, not the end. In a highly concurrent world, programmers need more control constructs than can be reasonably provided by the language. Programmers will need to define and use their own constructs, and they will be using those constructs with the same frequency that current programmers use for and if.

Consider this following example, adapted from [CrowlLeBlanc94], for Gaussian elimination. It contains two constrol structures, one for the triangulation and one for the row reduction. Only the latter is a common control construct.

int size = system.size();
triangulate( size, [&]( int pivot, int reduce ) {
    auto fraction = system[reduce][pivot] / system[pivot][pivot];
    forall( pivot, size-1, [&]( int variable ) {
        system[reduce][variable] -= system[pivot][variable] * fraction;
    } );
} );

In particular, the triangulate control construct encapsulates all the available parallelism in the triangulation. Just as importantly, it also captures all the necessary synchronization. So, this highly parallel code contains no threads and no synchronization, which are the source of most effort and bugs in parallel programming. More imporantly, there is no particular commitment to any sequential or parallel execution of its control constructs. A vector machine might choose to execute triangulate serially and forall concurrently. On the other hand, a NUMA machine would likely execute triangulate with coarse parallelism and forall sequentially.

Good control abstraction enables more effective use of machine resources. But to be effectively programmed, lambda expressions must be able to easily substitute for ordinary well-structured control statements. For that easy substitution, we must restore the intent of [N2550].

Solution

In outline, the solution is as follows.

Define reaching scope for a lambda expression as the enclosing lambda expressions out to and including the innermost function. That is, equivalent to replacing the lambda expressions by blocks.
Define explicit and implicit captures with reference to the reaching scope.
Define a capture of an entity to be a use (formally) of that entity within the scope containing the lambda expression.
Define a lambda expression to be ill-formed if it uses (formally) anything from its reaching scope that it does not capture.
Define a lambda expression to be ill formed if it captures an entity that is not either defined or captured in the immediately enclosing lambda expression or function.
For entities captured by an outer lambda expression, define its nested lambda expression to capture those entities from the closure object of the outer lambda expression. That is, all captures are one level deep. This definition ensures that a closure object for a lambda expression contains all the information necessary to form a closure object for nested lambda expressions.

Wording

As [N2927] has not yet been incorporated into the working draft, all edits are applied to the wording in [N2927]. All edits fall in section 5.1.1 [expr.prim.lambda].

Before paragraph 10, add a new paragraph as follows.

The reaching scope of a lambda expression is the set of enclosing scopes up to and including the innermost enclosing function and its parameters. [Note: This reaching scope includes any intermediate lambda expressions and their parameter lists. The reaching scope of a lambda expression is the scope as it would have been if all lambda expressions were replaced by blocks. —end note]

Edit paragraph 10 as follows.

The identifiers in a capture-list are looked up using the usual rules for unqualified name lookup ([basic.lookup.unqual] 3.4.1); each such lookup shall find a variable or reference with automatic storage duration declared within the reaching scope of the lambda expression. An entity (i.e., a variable, a reference, or this) is said to be explicitly captured if it appears in the lambda-expression's capture-list. An explicitly captured entity is used ([basic.def.odr] 3.2) in the scope containing the lambda expression.

Edit paragraph 11 as follows.

If a lambda-expression has an associated capture-default and its compound-statement uses ([basic.def.odr] 3.2) this or a variable or reference with automatic storage duration ~~declared in an enclosing function or lambda-expression~~ and the used entity is not explicitly captured, then the used entity is said to be implicitly captured; such entities shall be declared within the reaching scope of the lambda expression. An implicitly captured entity is used ([basic.def.odr] 3.2) in the scope containing the lambda expression. [Note: Thus the implicit capture of an entity by a nested lambda-expression can cause its implicit capture by the containing lambda-expression. Implicit uses of this can result in implicit capture. —end note]

Edit paragraph 12 as follows.

If this is captured, either explicitly or implicitly, the lambda-expression shall appear ~~directly in~~ within the definition of a non-static member function~~, i.e., not in another lambda-expression~~. ~~[Note: This rule prevents access from a nested lambda-expression to the members of the enclosing lambda-expression's closure object. —end note]~~

Move the existing paragraph 9 to after paragraph 12 and edit it as follows.

A lambda-expression's compound-statement can use (see above) this from an immediately-enclosing member function definition, as well as variables and references with automatic storage duration from an immediately-enclosing function definition or lambda-expression, provided these entities are captured (as described below). Any other use ([basic.def.odr] 3.2) of a variable or reference with automatic storage duration declared outside the lambda-expression is ill-formed. If a lambda-expression uses ([basic.def.odr] 3.2) this or a variable or reference with automatic storage duration from its reaching scope, and that entity is not captured by the lambda-expression, the program is ill-formed; diagnostic required. If a lambda-expression captures an entity, and that entity is not defined or captured in the immediately enclosing lamba expression or function, the program is ill-formed; diagnostic required. [Example:
void f1(int i) {
  int const N = 20;
  auto m1 = [=]{
     int const M = 30;
     auto m2 = [=][i]{
        int x[N][M]; // OK: N and M are not "used"
        x[0][0] = i; // OK: i is explicitly captured by m2
                     // and implicitly captured by m1
                     // error: i is not declared in the immediately
     };              // enclosing lambda-expression
  };
  struct s1 {
    int f;
    int work(int n) {
      int m = n*n;
      int j = 40;
      auto m3 = [this,m]{
        auto m4 = [&,j]{ // error: j not captured by m3
          int x = n; // error: n implicitly captured by m4
                     // but not captured by m3
          x += m; // OK: m implicitly captured by m4
                  // and explicitly captured by m3
          x += N; // error: N is outside of the reaching scope
          x += f; // OK: this captured implicitly by m4
                  // and explicitly by m3
        }
      }
    }
  }
}
—end example]

After the above paragraph, add a new paragraph as follows.

If a lambda-expression m1 captures an entity, and that entity is captured by an immediately enclosing lambda-expression m2, then m1's capture is transformed as follows:

if m2 captures the entity by copy, m1 captures the corresponding non-static data member of m2's closure type;

if m2 captures the entity by reference, m1 captures the same entity captured by m2.

[Note: Therefore, all captures are one level deep and a closure object for a lambda expression contains all the information necessary to form a closure object for nested lambda expressions. —end note] [Example: The nested lambda expressions and invocations below will output 123234.
int a = 1, b = 1, c = 1;
auto m1 = [a, &b, &c]() mutable {
    auto m2 = [a, b, &c]() mutable {
        std::cout << a << b << c;
        a = 4; b = 4; c = 4;
    };
    a = 3; b = 3; c = 3;
    m2();
};
a = 2; b = 2; c = 2;
m1();
std::cout << a << b << c;
—end example]

Acknowledgments

William M. Miller provided many helpful comments and alternate wording suggestions.

References

[N2927]: New wording for C++0x Lambdas (rev. 2), Daveed Vandevoorde, ISO/IEC JTC1 SC22 WG21 N2927 = 09-0117 - 2009-07-15, http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2009/n2927.pdf
[N2550]: Lambda Expressions and Closures: Wording for Monomorphic Lambdas (Revision 4), Jaakko Järvi, John Freeman, Lawrence Crowl, ISO/IEC JTC1 SC22 WG21 N2550 = 08-0060 - 2008-02-29, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2550.pdf
[CrowlLeBlanc94]: Parallel Programming with Control Abstraction, Lawrence A. Crowl, Thomas J. LeBlanc, ACM Transactions on Programming Languages and Systems 16(3): 524-576, May 1994, http://www.Crowl.org/Lawrence/paper/journals/1994J-TOPLAS-03-524/