Document number: N1649=04-0089
Author: Daveed Vandevoorde, Edison Design Group
Date: 9 Apr. 2004
Ever since the introduction of angle brackets, C++ programmers have been surprised by the fact that two consecutive right angle brackets must be separated by whitespace:

    #include <vector>
    typedef std::vector<std::vector<int> > Table;  // OK
    typedef std::vector<std::vector<bool>> Flags;  // Error

The problem is an immediate consequence of the “maximum munch” principle and the fact that >> is a valid token (right shift) in C++.
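The effect of maximum munch can be illustrated with a toy lexer. The sketch below is only an illustration under simplifying assumptions: the function names `tokenize` and `check_maximum_munch` are hypothetical, and the lexer recognizes just identifiers, numbers, and a handful of punctuators. It always takes the longest punctuator that matches, which is why two adjacent closing angle brackets come out as a single >> token.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy lexer (hypothetical, far simpler than a real C++ lexer): splits a
// string into identifiers/numbers and a few punctuators, always taking the
// longest punctuator that matches at the current position.
std::vector<std::string> tokenize(const std::string& src) {
    // Longer punctuators are listed first so that ">>" wins over ">".
    static const char* const punctuators[] = {">>", "<<", "::", ">", "<",
                                              "(",  ")",  ",",  ";"};
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char ch = static_cast<unsigned char>(src[i]);
        if (std::isspace(ch)) { ++i; continue; }
        if (std::isalnum(ch) || ch == '_') {
            // Identifier or number: consume the whole run of word characters.
            std::size_t j = i;
            while (j < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[j])) ||
                    src[j] == '_')) {
                ++j;
            }
            tokens.push_back(src.substr(i, j - i));
            i = j;
            continue;
        }
        std::string match(1, src[i]);  // fallback: one unknown character
        for (const char* p : punctuators) {
            const std::string candidate(p);
            if (src.compare(i, candidate.size(), candidate) == 0) {
                match = candidate;
                break;
            }
        }
        tokens.push_back(match);
        i += match.size();
    }
    return tokens;
}

// Usage: adjacent brackets form one ">>" token; a space yields two ">" tokens.
void check_maximum_munch() {
    assert(tokenize("vector<vector<bool>>").back() == ">>");
    assert(tokenize("vector<vector<int> >").back() == ">");
}
```

The lexer has no knowledge of templates, so it cannot tell that the programmer meant two closing brackets; any fix has to happen later, when the parser knows which angle brackets are open.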
This issue is a minor but persistent, annoying, and somewhat embarrassing problem. If the cost is reasonable, it therefore seems worthwhile to eliminate the surprise.
The purpose of this document is to explain ways to allow the >> token to be treated as two closing angle brackets, as well as to discuss the resulting issues.
The example above shows the most common context of double right angle brackets: Nested template-ids. However, the “new-style” cast syntax may also participate in such constructs. For example:

    static_cast<List<B>>(ld)

This situation currently occurs fairly rarely because the template-ids involved always represent class types, whereas these casts usually involve pointer, pointer-to-member, or reference types.
However, if template aliases make it into the language (and it seems likely they will), then template-ids will be able to represent nonclass types. It therefore seems desirable to address the issue for all constructs with right angle brackets, not just templates.
Solving our problem amounts to decreeing that under some circumstances a >> token is treated as two right angle brackets instead of a right shift operator. As it turns out, there are two general (and reasonable) approaches to defining those “circumstances.”
The first approach is the simplest: Decree that if a left angle bracket is active (i.e. not yet matched by a right angle bracket) the >> token is treated as two right angle brackets instead of a shift operator, except within a pair of parentheses within the angle brackets. A slight variation on that theme is to require at least two left angle brackets to be active, since otherwise the construct would be an error (because there would be an excess of right angle brackets).
This strategy is similar to the treatment of the > token: If a left angle bracket is active, the token is treated as a right angle bracket, except within parentheses. For example:

    A<(X>Y)> a;  // The first > token appears within parentheses and
                 // therefore is not a right angle bracket. The second one
                 // is a right angle bracket because a left angle bracket
                 // is active and no parentheses are more recently active.

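The first approach can be sketched as a pass over a token sequence that tracks which delimiters are active. This is a minimal illustration, not a parser: the function name `split_closing_brackets` is hypothetical, and the sketch assumes that every < token opens an angle bracket, whereas a real compiler only knows this from context (for example, after a template name).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of the first approach (hypothetical helper, simplified): rewrite a
// token sequence so that ">>" becomes two ">" tokens whenever a left angle
// bracket is active and no parenthesis has been opened more recently.
// Assumption: every "<" opens an angle bracket.
std::vector<std::string> split_closing_brackets(
        const std::vector<std::string>& tokens) {
    std::vector<char> open;  // stack of active '<' and '(' delimiters
    std::vector<std::string> out;
    for (const std::string& t : tokens) {
        if (t == "<" || t == "(") {
            open.push_back(t[0]);
            out.push_back(t);
        } else if (t == ")") {
            // Leaving the parentheses discards anything opened inside them.
            while (!open.empty() && open.back() != '(') open.pop_back();
            if (!open.empty()) open.pop_back();
            out.push_back(t);
        } else if (t == ">") {
            // Closes an angle bracket only if one is the innermost delimiter.
            if (!open.empty() && open.back() == '<') open.pop_back();
            out.push_back(t);
        } else if (t == ">>" && !open.empty() && open.back() == '<') {
            // An angle bracket is active: treat ">>" as two closing brackets.
            open.pop_back();
            if (!open.empty() && open.back() == '<') open.pop_back();
            out.push_back(">");
            out.push_back(">");
        } else {
            out.push_back(t);  // includes ">>" acting as a right shift
        }
    }
    return out;
}

// Usage: the shift inside parentheses survives, the unparenthesized one splits.
void check_first_approach() {
    const std::vector<std::string> kept = {"A", "<", "(", "X", ">>", "Y", ")", ">", "a"};
    assert(split_closing_brackets(kept) == kept);
    const std::vector<std::string> split =
        split_closing_brackets({"Y", "<", "X", "<", "1", ">>", "::", "c"});
    assert(split[5] == ">" && split[6] == ">");
}
```

The sketch splits >> even when only one angle bracket is active; the variation that requires two active brackets would add that condition to the same branch.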
Unfortunately, this first approach may break some existing programs. Consider the following example:

    #include <iostream>

    template<int I> struct X {
      static int const c = 2;
    };
    template<> struct X<0> {
      typedef int c;
    };
    template<typename T> struct Y {
      static int const c = 3;
    };
    static int const c = 4;

    int main() {
      std::cout << (Y<X<1> >::c >::c>::c) << '\n';
      std::cout << (Y<X< 1>>::c >::c>::c) << '\n';
    }

This program is valid today; it produces the following output:

    0
    3

With the right angle bracket rule proposed above, the >> token in the second statement would change its meaning (from right shift to double right angle bracket) and the output would therefore become:

    0
    0

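Because the rule exempts >> tokens inside parentheses, code that depends on the shift can keep its meaning by parenthesizing it. The following sketch reuses the declarations from the example; the function names are hypothetical, and the expected values assume a compiler that applies the proposed rule (as compilers in C++11 mode or later do).

```cpp
#include <cassert>

template<int I> struct X { static int const c = 2; };
template<> struct X<0> { typedef int c; };
template<typename T> struct Y { static int const c = 3; };
static int const c = 4;

// Parenthesized shift: X<(1 >> 4)> is X<0>, whose member c names the type
// int, so the whole expression is Y<int>::c.
int with_parentheses() {
    return Y<X<(1 >> ::c)>::c>::c;
}

// Unparenthesized, read with the proposed rule: ">>" closes both X<1> and
// Y<X<1>>, so the expression is (Y<X<1>>::c > ::c) > ::c, i.e. (3 > 4) > 4.
// A compiler using the old rule would read ">>" as a shift instead.
int without_parentheses() {
    return (Y<X< 1>>::c >::c>::c);
}

// Usage, assuming the proposed rule is in effect.
void check_parenthesized_shift() {
    assert(with_parentheses() == 3);
    assert(without_parentheses() == 0);
}
```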
To avoid the backward incompatibility, the alternative approach is to modify the rule proposed above to only treat the >> token as two right angle brackets when parsing template type arguments or template template arguments, but not when parsing template nontype arguments.
Another way to view this alternative approach is that a template argument is always parsed as far as possible (which may include right shift operators). When an argument is parsed, the next token must be a comma, a > treated as a single closing angle bracket, or (with this proposal) a >> token treated as a double angle bracket.
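This view of the second approach can be sketched as a tiny recursive-descent parser. It is an illustration only, under strong assumptions: the names `Parser` and `parse_template_id` are hypothetical, type arguments are plain or nested template-ids, and nontype arguments are integers combined with right shifts. A nontype argument is parsed as far as possible, so >> after an integer is always a shift; after a type argument, >> is accepted as two closing brackets.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

static bool is_number(const std::string& s) {
    if (s.empty()) return false;
    for (char ch : s) if (!std::isdigit(static_cast<unsigned char>(ch))) return false;
    return true;
}

static bool is_identifier(const std::string& s) {
    return !s.empty() && (std::isalpha(static_cast<unsigned char>(s[0])) || s[0] == '_');
}

struct Parser {
    std::vector<std::string> toks;
    std::size_t pos;
    bool half_used;  // the first ">" of the current ">>" token was already used

    explicit Parser(const std::vector<std::string>& t)
        : toks(t), pos(0), half_used(false) {}

    bool next_is(const std::string& s) const {
        return pos < toks.size() && toks[pos] == s;
    }
    std::string take() {
        if (pos >= toks.size()) throw std::runtime_error("unexpected end of input");
        return toks[pos++];
    }
    // Consume one closing bracket, taken from ">" or from one half of ">>".
    void close_angle() {
        if (half_used) { half_used = false; ++pos; }
        else if (next_is(">")) { ++pos; }
        else if (next_is(">>")) { half_used = true; }
        else throw std::runtime_error("expected ',' or '>'");
    }
    // Nontype argument: parsed as far as possible, so ">>" is always a shift.
    std::string parse_nontype() {
        std::string expr = take();
        while (next_is(">>")) {
            ++pos;
            const std::string rhs = take();
            if (!is_number(rhs)) throw std::runtime_error("expected shift operand");
            expr = "(" + expr + ">>" + rhs + ")";
        }
        return expr;
    }
    // Type argument: a name, optionally followed by a template argument list.
    std::string parse_type() {
        std::string result = take();
        if (!is_identifier(result)) throw std::runtime_error("expected a name");
        if (!next_is("<")) return result;
        ++pos;
        result += "<";
        for (;;) {
            const bool nontype = pos < toks.size() && is_number(toks[pos]);
            result += nontype ? parse_nontype() : parse_type();
            if (!next_is(",")) break;
            ++pos;
            result += ",";
        }
        close_angle();
        return result + ">";
    }
};

// Returns a normalized spelling of the template-id, or throws on a parse error.
std::string parse_template_id(const std::vector<std::string>& tokens) {
    Parser p(tokens);
    const std::string result = p.parse_type();
    if (p.half_used || p.pos != tokens.size())
        throw std::runtime_error("unexpected trailing tokens");
    return result;
}
```

With this sketch, the tokens of `vector<vector<int>>` parse as nested template-ids and the tokens of `X<1>>4>` parse with a shift, while the tokens of `Y<X<1>>` are rejected because the shift after `1` lacks an operand.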
The GNU and EDG C++ compilers currently implement the second proposed alternative for error recovery purposes. It would be trivial to promote the error recovery procedure to a correct parse procedure. (Other compilers appear to have a facility for the same purpose, but I do not know their exact strategy.)
As mentioned, the first proposal is analogous to the existing language rule for the > token. We therefore do not expect implementation difficulty for that approach either.
I suggest we pursue the first approach (which breaks some valid programs). Specifically, I propose that if even a single left angle bracket is active, a >> token not enclosed in parentheses is treated as two right angle brackets and not as a right shift operator.
My arguments for doing so are the following:
Reflector messages: c++std-ext-6767,6771,6773,6775,6779,6786,6788,6789,6792,6793,6794,6799,6801,6809.