N1512=03-0095

September 18, 2003

Evolution WG issues list

Maintainers: Bjarne Stroustrup (bs@cs.tamu.edu) and David Vandevoorde (daveed@edg.com).

This is an initial list of issues. We have "seeded" it with current topics in the evolution working group, reflector threads, and email messages received. We expect that the list will grow significantly and that the presentation and organization of this list will evolve to cope.

It is obvious to us that there are so many proposals for language extensions that most must be rejected in order to keep the language reasonably manageable and coherent.

We note the requests for an XML format, but haven't yet had a chance to do anything about that.

Naturally, we'd appreciate suggestions that we have forgotten and drafts for the various entries that are currently just placeholders.

Organization

This list is organized in three parts:

Proposals, numbered EPddd, which are proposals under active consideration for inclusion into C++ (or approved or rejected after such consideration).
Issues, numbered EIddd, which represent attempts to identify a set of related problems that might be addressed by a language extension. Some issues have come directly from users without serious suggestions for how to solve them.
Suggestions, numbered ESddd, which simply list suggestions made directly or indirectly to us. We’re not making value judgments about a suggestion beyond either having heard a suggestion repeatedly or received/seen a reasonably detailed description of what’s desired. Some suggestions will contain references to original written versions.

We try to give each suggestion, issue, and proposal a reasonably mnemonic name in addition to its number. Where we know of an original source, we’ll try to identify it. In our experience, most ideas have multiple sources, so we encourage people not to be too proprietary about “their” suggestions.

We expect that suggestions, issues, and proposals will eventually be extensively cross referenced so that people can see which proposals are the responses to which suggestions and issues.

For proposals, we’ll record evolution group straw votes and committee votes, if any. If a suggestion, issue, or proposal has been discussed and didn’t have strong support, we mark it LOW to indicate that no further EWG discussion will occur without new input.

The expected “lifecycle” of a language extension idea is that it will emerge from several suggestions, merge into an issue, and emerge for vote as one or more proposals. In that process several papers will be written.

Policies

The EWG (Evolution Working Group) tries to classify suggestions, issues, and proposals according to which larger language concerns they address and to favor proposals that:

Improve the support of generic programming
Improve the support for library building and use
Make the language more regular, predictable, and teachable (also known as “make simple things simple to express”)
Support the building of libraries that will make C++ a better platform for systems programming

We expect that these directions will eventually become reasonably well specified and be explicitly approved by the committee. Currently, they are simply synthesized from committee and working group discussions.

Typically, a proposal will not go to full committee before CWG or LWG has discussed it either on a reflector or at a meeting.

We would like to minimize the size and number of core language extensions. Consequently, for each proposal, we will consider if a standard library facility would be sufficient to solve the problem or if a combination of a small core language change plus a standard library extension would do the job.

Compatibility with C++03 and the zero-overhead principle are considered very important.

Proposals

EP001. decltype and auto.

See ES001. Note by Stroustrup: c++std-ext-5364. Paper by Järvi, Gregor, Siek, and Stroustrup: N1478=03-0061.

EP002. template aliases.

See ES004 (typedef template). Note by Mat Marcus and Gabriel dos Reis: N1449/03-0032. Note by Gabriel dos Reis and Bjarne Stroustrup: N1489.

EP003. #nomacros.

See EI001. Note by Stroustrup to be written.

EP004. extern template.

Note by Mat Marcus and Gabriel dos Reis to be written.

EP005. Dynamic libraries.

Discussion managed by Pete Becker.

EP006. Allow local classes as template parameters.

Paper by Anthony Williams: WG21/N1427-J16/03-0009. See ES043.

EP007. Move semantics.

Paper by Howard Hinnant: N1377=02-0035.

EP008. Null pointer constant.

Introduce the keyword nullptr to denote a value that can be assigned to any pointer, but not to non-pointers.

Paper by Herb Sutter and Bjarne Stroustrup: N1488.
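The distinction drawn here can be sketched as follows, assuming the keyword behaves as it was later standardized in C++11. The overload pair which and the two calling helpers are hypothetical names chosen for illustration only.

```cpp
#include <string>

// Hypothetical overload pair: one takes an integer, one takes a pointer.
inline std::string which(int)   { return "int"; }
inline std::string which(char*) { return "pointer"; }

// The literal 0 is an exact match for int, so the int overload is chosen
// even when the caller meant "no pointer".
inline std::string call_with_zero()    { return which(0); }

// nullptr converts to any pointer type but not to int,
// so only the pointer overload is viable.
inline std::string call_with_nullptr() { return which(nullptr); }

// int i = nullptr;   // would be an error: not assignable to a non-pointer
```

Under these assumptions, call_with_zero() selects the int overload and call_with_nullptr() selects the pointer overload.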

EP009. Static assertions

Paper by Robert Klarer and John Maddock: N1381/02-0039 "Proposal to Add Static Assertions to the Core Language".
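A minimal sketch of what such an assertion might look like, using the static_assert spelling that was later standardized in C++11. The template Packed and the function packed_demo are hypothetical, and the first check assumes a platform with 8-bit bytes.

```cpp
#include <climits>

// Namespace-scope assertion: checked once, at compile time.
// Assumes a platform with 8-bit bytes.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

template <class T>
struct Packed {
    // Checked for each instantiation: reject types too large for the slot.
    static_assert(sizeof(T) <= sizeof(void*),
                  "T must fit in a pointer-sized slot");
    T value;
};

inline int packed_demo() {
    Packed<char> p = { 'x' };   // accepted: char fits in a pointer-sized slot
    return static_cast<int>(sizeof(p.value));
}
```

Instantiating Packed with a type larger than a pointer would be rejected at compile time with the given message, at no run-time cost.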

EP010. Concepts

A mechanism for better specification of template arguments, leading to better error messages and selection of templates based on template argument types.

Note by Gabriel Dos Reis and Bjarne Stroustrup: N1522=03-0105. Note by Bjarne Stroustrup: N1510.

See ES046.

EP011. Generalized initializer lists

A mechanism for initializing containers with initializer lists and for using initializer lists as function arguments.

Note by Gabriel Dos Reis and Bjarne Stroustrup: N1509.

EP012. User-defined literals

Note by Bjarne Stroustrup: N1511.

Issues

EI001. Macro pollution.

Macros can arbitrarily change the meaning of any piece of code. This imposes restrictive defensive naming practices and even then leads to surprises and errors. Namespaces provide no defense. Since macros are typically found in headers (incl. standard headers) a programmer cannot be expected to know every macro used in a program.

Macros are very widely used, and some uses, such as #include guards and conditional compilation control macros, have no generally acceptable alternatives.
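As an illustration of why namespaces provide no defense, consider the sketch below. The macro saturate stands in for a macro defined in some header the programmer never looked at; it and the namespace geometry are hypothetical.

```cpp
// Stand-in for a function-like macro defined in some third-party header.
#define saturate(x) ((x) < 0 ? 0 : (x))

namespace geometry {
    // The namespace does not hide this name from the preprocessor.
    // Writing `geometry::saturate(v)` would be rewritten by the macro and
    // fail to compile; parenthesizing the name is the usual defensive trick,
    // because a function-like macro expands only when followed by '('.
    inline int (saturate)(int v) { return v > 100 ? 100 : v; }
}

inline int via_macro()    { return saturate(-5); }              // macro: 0
inline int via_function() { return (geometry::saturate)(-5); }  // function: -5
```

The same spelling therefore means two different things depending on whether the preprocessor sees it first, which is the kind of surprise this issue describes.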

See EP003. See ES042.

EI002. Simplify, generalize, and automate.

Too many simple things cannot be expressed simply and require error-prone workarounds. This discourages good style and complicates early teaching.

Paper by Francis Glassborow: WG21/N1445==J16/03-0027.

EI003. GUI.

C++ doesn't have a standard GUI. This is widely considered to mean that C++ doesn't have a GUI and can't be used for applications requiring a graphical user interface. It also hampers education by biasing teaching towards either proprietary extensions, limiting the utility of what is taught to a single library or vendor, or command line exercises widely perceived as boring and old-fashioned.

GUIs are often built using language extensions - thus rendering them non-portable.

Suggestions

ES001. typeof.

If we had typeof (or an equivalent) then an entire LWG proposal, Doug Gregor's result_of<> class, would become unnecessary. It would be nice to be able to write something like this:

	template <class T1, class T2, class T3>
		typeof(f(t1, t2, t3))
			foo(T1& t1, T2& t2, T3& t3) { return f(t1, t2, t3); }

Forwarded from LWG by Matt Austern. See EP001.
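For comparison, here is a sketch of how the same wish can be expressed with decltype and a trailing return type, in the form that was later standardized in C++11. The function name sum and the use of operator+ in place of f are assumptions made to keep the example self-contained.

```cpp
// The return type is computed from the expression itself; the trailing
// position lets the parameter names be used inside decltype.
template <class T1, class T2>
auto sum(T1& t1, T2& t2) -> decltype(t1 + t2) { return t1 + t2; }
```

With an int and a double argument, the deduced return type is double, with no result_of-style helper class involved.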

ES002. Solve the forwarding problem.

By "the forwarding problem" I mean: create a wrapper function f2 that forwards its arguments to a generic function f1 in exactly the same way as if f1 had been called directly. Not trivial, since some of f1's arguments may be of the form T, some T&, some const T&.

Forwarded from LWG by Matt Austern. See EI002.
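The difficulty can be made concrete with a small sketch in C++03. The overloads of f1 and the two wrapper attempts f2_const_ref and f2_ref are hypothetical; neither wrapper behaves like a direct call to f1 in all cases.

```cpp
#include <string>

// f1 is overloaded on constness, so the caller can observe what it was given.
inline std::string f1(int&)       { return "non-const lvalue"; }
inline std::string f1(const int&) { return "const"; }

// Attempt 1: forward by const reference. Accepts every argument,
// but f1 always sees a const object.
template <class T>
std::string f2_const_ref(const T& t) { return f1(t); }

// Attempt 2: forward by non-const reference. Preserves the constness of
// lvalues, but cannot be called with an rvalue such as the literal 1.
template <class T>
std::string f2_ref(T& t) { return f1(t); }
```

Covering every combination in C++03 requires an overload of the wrapper for each mix of const and non-const parameters, which grows exponentially with the number of arguments.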

ES003. Move semantics.

The T&& stuff, disgusting as the syntax may be, would be a solution to the forwarding problem.

Forwarded from LWG by Matt Austern. See EP007.

ES004. Typedef templates.

There are half a dozen places in the existing library where we could remove gratuitous ugliness if we had them, and at least as many in the TR proposals. See EP002 “template aliases”.

Forwarded from LWG by Matt Austern. See EP002.

Also: Herb Sutter: paper N1406/02-0064. Walter Brown paper: N1451/03-0034. Thread: c++std-ext-5658

ES005. Variable-length template parameter lists.

Forwarded from LWG by Matt Austern.

ES006. Overload new-style casts

The ability to overload new-style casts like static_cast and dynamic_cast. This is useful for smart pointers, and it would also be useful if we ever want to remove the gaps in allocator specification.

Forwarded from LWG by Matt Austern.

ES007. Extern templates.

Forwarded from LWG by Matt Austern. See EP004.

ES008. Forwarding constructors.

(Meaning two things: make it possible to forward one overload of X::X to another overload, and make it possible for a derived class to get constructors with the same signatures as its base has, without having to spell them out again.)

Forwarded from LWG by Matt Austern. See EI002.
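Both meanings can be sketched with the facilities that were later standardized in C++11 (delegating constructors and inheriting constructors). The classes Base and Derived are hypothetical, and this is one possible shape for the feature rather than the form requested here.

```cpp
#include <string>

class Base {
public:
    Base(int n, const std::string& s) : n_(n), s_(s) {}
    // First meaning: one overload of Base::Base forwards to another.
    explicit Base(int n) : Base(n, "default") {}
    int n() const { return n_; }
    const std::string& s() const { return s_; }
private:
    int n_;
    std::string s_;
};

class Derived : public Base {
public:
    // Second meaning: Derived gets constructors with the same signatures
    // as Base without spelling them out again.
    using Base::Base;
};
```

Here Derived can be constructed from an int or from an int and a string, and the one-argument form picks up the default string through the forwarding overload.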

ES009. Simple compile-time reflection.

One example is a simple way to write a compile-time test checking whether a generic class X defines a type called Foo. BS: This ought to relate to decltype (ES001). For example

	decltype(T).has_member(Foo);

Forwarded from LWG by Matt Austern.

ES010. Named template parameters.

This would be useful for policy classes.

One might argue for named function parameters as well, just for consistency's sake, but there isn't as strong a demand for it.

Forwarded from LWG by Matt Austern.

ES011. GUI.

Provide core language facilities, such as events, callbacks, and properties that make a good standard library GUI feasible.

ES012. Defaulting and inhibiting common operations

See EI002.

ES013. Class namespaces.

The suggestion is to allow a class’s namespace to be opened so that one can define several members at once. The main utility would be to avoid tedious (and therefore error-prone) repetition of template parameters for template classes.

Paper by Carl Daniel: WG21/1420==J16/03-0002.

ES014. templates of local types.

See EP006.

ES015. Dynamic libraries.

Discussion by Pete Becker. Early papers: Matt Austern: N1400/02-0058. Pete Becker: N1418/02-0076

ES016. long long.

As in C99, but consider overloading rules and literals.

ES017. Lambda.

???

ES018. >>

Allow >> to terminate two specializations; e.g. vector< list< int>> vli;

Not being able to do this surprises and annoys novices, leads to harder to read code, and occasionally catches even experts. It has also become the target of a popular jibe from GJ proponents.

ES019. Overloading for function objects

Allow function objects to participate in overload resolution. For example

	void f(int);
	struct X {
		void operator()(double);
	};
	X f;

	f(1);		// call the function
	f(2.0);	// call the object

ES020. Overload set entity.

Provide some way of collecting the overload set for a template argument and resolve it at the call point in the template.

ES021. Restrict.

Provide a way for the programmer to state that two pointers can be treated as referring to non-overlapping storage. “Like C99, but with precisely defined semantics”.

ES022. Sealed.

Provide a way of saying that a class cannot be derived from and/or that a virtual function cannot be overridden.

ES023. Modules

???

ES024. Properties.

Provide Delphi/C# like properties. It may be possible to provide properties as a standard-library facility.

ES025. Operator dot.

Allow operator.() to be defined.

ES026. Generalize (curly) initializers.

Give more uniform treatment of built-in and user-defined types; arrays should not have an unfair syntactical advantage over vector and other standard-library containers.

See also Daniel Gutson's "Non Default Constructors for Arrays", ES037.

ES027. String literals and floating-point nontype template parameters.

???

ES028. Global operators

Allow operators that don't need to be members (e.g., -> [] ()) to be defined at namespace scope.

ES029. Scoped enumerators.

That is, enumerators that are not implicitly exported to the enclosing scope but must be qualified with their enumeration name to be accessed. For example:

	enum E { a, b };
 	E x = a; 	// error: no ‘a’ in scope
	E y = E::b;	// ok

See ES030.
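A sketch of scoped, non-converting enumerators using the enum class spelling that was later standardized in C++11. That spelling combines this suggestion with ES030 and is only one possible syntax; the names Color and to_int are hypothetical.

```cpp
// Enumerators are not exported to the enclosing scope and do not
// implicitly convert to int.
enum class Color { red, green };

// Color c = red;        // error: no 'red' in scope
// int x = Color::red;   // error: no implicit conversion to int

// An explicit cast is required to obtain the underlying value.
inline int to_int(Color c) { return static_cast<int>(c); }
```

Qualified access such as Color::green is the only way to name an enumerator, and a conversion to int has to be requested explicitly.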

ES030. Non-converting enumerations

Provide enumerations with values that don't implicitly convert to int. For example

	explicit enum E { a, b };
	int x = a; // error

"explicit enum" might imply scoped enumerators.

ES031. Override/new.

Provide a way of saying that a function overrides an existing virtual function. Provide a way of saying that a function does not override an existing function.
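The problem behind the first request can be shown with a small sketch. The class names are hypothetical, and the override keyword used in Square is the spelling that was later standardized in C++11, shown as one possible answer.

```cpp
#include <string>

struct Shape {
    virtual ~Shape() {}
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    // Missing `const`: this declares a new function that hides, rather than
    // overrides, Shape::name(). The language requires no diagnostic.
    std::string name() { return "circle"; }
};

struct Square : Shape {
    // States the intent: rejected by the compiler if it does not override
    // a base-class virtual function.
    std::string name() const override { return "square"; }
};

inline std::string describe(const Shape& s) { return s.name(); }
```

Calling describe on a Circle silently reaches Shape::name(), whereas the same mistake in Square would have been a compile-time error.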

ES032. Finally.

Allow a finally clause for a try block.

ES033. Optional garbage collection.

Explicitly acknowledge that garbage collection is a valid implementation technique for C++ and define when destructors are called and what it means for an object to be unreferenced. See TC++PL3 C.9.1.

ES034. Enumerators with floating-point values.

Also, provide better support for enumerations used to express ranges. For example

	enum Range { low=1.4, high=7.9 };

See also ES030 and ES050.

ES035. LOW. Inline constants.

Daniel Gutson's paper.

ES036. LOW. Self methods

Daniel Gutson's paper.

ES037. Non-default array initialization

Daniel Gutson's paper.

ES038. Fix namespace alias

Namespace aliasing issues (e.g., extending namespaces through namespace alias names).

ES039. Allow switch on string.

For example:

	bool f(string s)
	{
		switch(s) {
		case "yes": return true;
		case "no": return false;
		default: throw Unexpected_string();
		}
	}

This is a frequent suggestion by novices. The frequency of requests has increased since the appearance of C#.

ES040. Variable length template argument lists.

The basic idea is ???

ES041. A keyword for declaring thread-local storage.

???

ES042. #nospam.

Provide a preprocessor mechanism for limiting macros entering and exiting a scope. For example:

	#nomacros
	#in A B
	…
	#out A X
	#endnomacros

No macros are expanded between #nomacros and #endnomacros unless explicitly enabled by #in. No macros defined between #nomacros and #endnomacros will be defined after #endnomacros unless explicitly enabled by #out.

Suggestion by Bjarne Stroustrup. After discussion in the EWG it was decided to look for a solution that allowed macros used by macros allowed in by “#in” to be used in the expansion of such macros only.

#nomacros should nest.

ES043. Allow local classes as template parameters.

Paper by Anthony Williams: WG21/N1427-J16/03-0009. See EP006.

ES044. Generate operators

Systematically, synthesize fundamental operators, such as == and != for regular value types.

Suggestion by Alex Stepanov. Relates to EI002.
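For comparison, part of this can be emulated today in library code. The sketch below derives != from a user-supplied == with a base-class template; the names EqualityComparable and Point are hypothetical, and this is a workaround rather than the language-level synthesis suggested here.

```cpp
// Helper that supplies != for any T that defines ==.
// The friend is found by argument-dependent lookup through the base class.
template <class T>
struct EqualityComparable {
    friend bool operator!=(const T& a, const T& b) { return !(a == b); }
};

struct Point : EqualityComparable<Point> {
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// The only operator the user writes by hand.
inline bool operator==(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}
```

The user still has to write == and opt in through the base class for every type, which is the repetition the suggestion aims to remove.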

ES045. Checked throw specifications

Statically check throw specifications and assume that extern "C" implies throw(). Suggestion by Sean Parent. Discussion on -ext. Or eliminate them.

ES046. Concepts/constraints

Provide a mechanism that ensures that every operation used by a template is available (a constraint) or a facility that ensures that only a specific set of operations are used by a template (a concept). Suggestions by Matt Austern, Gabriel dos Reis, Alex Stepanov, Bjarne Stroustrup, and probably every serious generic programming practitioner.

A proposal should ideally

guarantee good error messages
allow for overloading based on concept/constraint
allow for selective code generation based on concept/constraint

See EP010.

ES046. Nested functions

???

ES047. Null pointer constant

nullptr

ES047. Relax union restriction

Note by John Skaller:

The change
-----------

1. Remove the restriction that unions may
not contain members of a constructible type.

2. Add a restriction to those clauses which
define the semantics of generated default
constructors, copy constructors, copy assignment
operators, and destructors to the effect that
a program is ill formed if there is a need
to generate one or more of these functions
for a union containing a constructible type.

[-----------

There is an existing situation where
such a function is required and cannot be
generated, and that is when it must call
a function of a base class which is not accessible:
this feature is regularly exploited by programmers
to prevent objects being copied.
---------]


The motivation
---------------

In ISO C there is a universal mechanism
for using the same storage extent to
hold different objects at different times,
namely the union. It is universal in the sense
that all data types may be put into unions.

There are two primary uses for unions.

(a) The first use is to save storage by reusing
some space which is no longer required.
In this use, the position in the code
determines which component of the union,
if any, is in use.

(b) The second use is to store data
describing one of several different cases,
in this use, a discriminant either within
the union, or some associated storage,
is used to determine dynamically which component
is in use.

The canonical example is the transaction,
which includes the messages sent by windowing systems.

Unions provide two important features:

(a) they guarantee enough storage is reserved
for any one of the components

(b) they ensure that the storage is aligned
correctly for any one of the components

They also provide a convenient notation
to refer to the storage components.

Unions are not safe: the programmer may
inadvertently access a component for which
a value has not been stored.

In C++, exactly the same motivations
for unions exists as in C. There is a need
to save space and to correctly align storage
to hold heterogeneous data, whether the discriminant
is the program counter, an external variable,
or the first member of every component type.

However, in C++ there is an even stronger
motivation: in C we are not concerned
with execution of constructors and destructors.

In C++, use of an internal discriminant
allows automatic copying and destruction of the correct
union component.

The alternatives which most
programmers employ are serious design errors.
The first is to store all the alternatives in
a struct. The problem with this method is
not just that it wastes space, but that
all the values are initialised at one time,
and all destroyed at one time. If one of the
types does not admit a convenient default
constructor, it may even be impossible to use
this representation directly: instead a union
of pointer may be used, which is equivalent
to a proper union except that there is an extra
overhead referencing the components and
allocating heap storage.

I can't emphasise how endemic this design fault is:
it is used in text books. Here is a common example:

struct op {
char op_name;
op *left;
op *right;
};

This struct is used to represent an expression tree.
It is of course TOTALLY wrong. The correct representation is:

struct op;
struct unop { char op_name; op *arg; };
struct binop { char op_name; op *left; op *right; };
union opcase {
char id;
unop u;
binop b;
};
struct op {
enum {id_t, unop_t, binop_t} tag;
opcase n;
};

The reason is that: (1) it is not restricted
to arity 0, 1 and 2 operators. (2) the case discriminant
is explicit (rather than relying on NULL pointer checks).


I note in passing that a C pointer or SQL data base type
is actually a discriminated union of pointer to object
or NULL, and, data value or NIL, respectively.

The second design error is
using an abstract base and a derived
class for each alternative: it suffers
the same problem of allocation overhead
mentioned above, and almost invariably requires
a downcast to access the desired alternative.

At present then, C++ programmers can only use
the convenient, low overhead solution they desire
if the types involved are not constructible.
The change above removes that restriction,
and allows programmers to obtain correctly
sized and aligned storage which can be used
for any finite set of data types.

No code is broken by the proposal
since it is a relaxation of a restriction.

No safety is lost, even if constructible
types are used, since a union containing
constructible types requires a user defined
constructor, assignment operator, or destructor
to be used in contexts requiring construction,
assignment, or destruction.

My personal need here is two-fold.

First, I am generating C++ code
for a programming language which allows users
to specify a new primitive type by nominating a
C++ type.

This programming language needs to allocate storage
for these types in a block structured context,
which is the kind (a) of use mentioned above
where the program counter (position in the code)
determines what component of the union is used.
Indeed, it determines when to construct the
component, and when to destroy it. In this case
I need a naked union like:

union X {
string s;
vector v;
};

without any constructors or destructors, since
I will use placement new and explicit
destruction to build and destroy objects.
Since I'm emulating a stack frame, I'm happy
to prevent copying by leaving out copy and assignment
operators too.
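For illustration, the naked-union usage described above can be sketched under two assumptions that are not in the note. First, it relies on the unrestricted unions that were later standardized in C++11. Second, because that standard makes the implicit default constructor and destructor of such a union deleted, the sketch supplies empty ones. The names Slot and slot_demo and the element type int are hypothetical.

```cpp
#include <cstddef>
#include <new>
#include <string>
#include <vector>

typedef std::string      String;
typedef std::vector<int> IntVec;

union Slot {
    String s;
    IntVec v;
    Slot() {}    // constructs no member
    ~Slot() {}   // destroys no member; the user does so explicitly
    // Copy and assignment remain unavailable, as the note wishes.
};

inline std::size_t slot_demo() {
    Slot slot;
    new (&slot.s) String("hello");   // the position in the code decides
    std::size_t n = slot.s.size();   // which member is live
    slot.s.~String();                // explicit destruction
    new (&slot.v) IntVec(3, 0);      // reuse the same storage for v
    n += slot.v.size();
    slot.v.~IntVec();
    return n;
}
```

The union provides correctly sized and aligned storage for either member, while construction and destruction stay entirely under the programmer's control.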

Present workaround: I allocate the store as required
on the heap in some cases, and use a struct instead
of a union in others, costing time and storage, respectively.
In addition, there may be a problem failing to destroy
an object at the correct time in the second workaround.

The second use is the categorical sum, or discriminated
union, usage type (b). In this case I am emulating
ML style variants. The canonical ML example is the list:

type 'a list = Empty | Cons of 'a * 'a list

where 'a is a type variable. My representation is a tagged
pointer:

struct X { int caseno; void *p; };

where p is cast as appropriate. Unfortunately,
apart from costing allocations, there is a serious
semantic problem, since copying these pointers
does not copy the objects pointed at by p.

This second use is a very common need: I cited
before the example of transaction types,
which are all too often *incorrectly* encoded
using a base and derived types. That encoding
is popular partly due to ignorance of correct
structure, but also because it is relatively simple
to use RTTI to determine the case.

[The technique suffers from both allocation costs
and lack of type closure over the union type,
quite apart from confusing programs by adding
yet another abuse of inheritance]

ES048. ???

See EP011.

Note by Dave Abrahams:

As a separate related matter, I think we ought to be thinking about
removing the use of copy-initialization for forms not involving curly
braces. Among other things, that would allow:

auto_ptr<T> x = new T;

instead of the currently-required:

auto_ptr<T> x = auto_ptr<T>(new T);

(due to auto_ptr's explicit constructor). I know it says "explicit",
but that's really there to prevent unintended conversions. The
declaration above is highly intentional in either form, and probably
clearer in the first form!

This is also useful for cases like:

struct Y
{
Y();
explicit Y(int);
};

struct X
{
explicit X(Y);
};

int z = 3;

X x(Y(z)); // oops, declares a function

X& xx = x; // error

allowing

X x = Y(z);

instead of the awkward:

X x((Y(z)));

[I know it doesn't generalize to multiple arguments]

Also, I note that at least one important compiler front-end does
copy-elision for the copy-initialization case but NOT for the
direct-initialization case, which is awfully unintuitive. Why should
users have to worry about the fact that one form might be more
efficient than the other, at the whim of the compiler vendor?

ES049. Multimethods

Provide the ability to do a dynamic lookup on more than one operand. Frequent suggestion. Discussion in D&E.

Document from Julian Smith: N1463=03-0046.

ES050. Based enums

Allow the underlying type of an enum to be explicitly declared. For example:

	enum E : char { /* ... */ };
	enum N : int { /* ... */ };
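One practical consequence is that the size of the enumeration becomes predictable. The sketch below uses the syntax above as it was later standardized in C++11; the enumeration names, enumerators, and the helper small_size are hypothetical.

```cpp
// The underlying type fixes both the size and the range of the enumeration.
enum Small : unsigned char { s_lo = 0, s_hi = 255 };
enum Wide  : long long     { w_lo = 0, w_hi = 1LL << 40 };

// sizeof applied to an enumeration yields the size of its underlying type.
static_assert(sizeof(Small) == sizeof(unsigned char), "Small is one byte");
static_assert(sizeof(Wide)  == sizeof(long long),     "Wide matches long long");

inline unsigned small_size() { return static_cast<unsigned>(sizeof(Small)); }
```

Without the explicit underlying type, the implementation chooses an integral type large enough for the enumerators, so the size can differ between compilers.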

ES051. Derived enumerations

Allow derived enumerations. For example:

	enum B { a, b };
	enum D : B { c, d }; // members of D are a, b, c, and d