Doc. no.: N2034=06-0104
Date: 2006-06-19
Project: Programming Language C++
Reply to: Matt Austern <austern@google.com>
This is a list of suggested additions to the standard C++ library for the next version of C++ ("C++0x").
This list is intended as a stimulus to further work. Some of the items on this list already have one or more concrete proposals associated with them, others are vague hopes, and most are somewhere in between. Similarly, the scope of these items extend from minor changes in an existing interface to major research projects. I have listed a name or names for each item, and, if relevant, I have listed a paper or papers describing the idea. These ideas are not necessarily supported by anyone other than the person who suggested them, and you should not assume that any particular item will make it into C++0x. The items in this list are in no particular order.
Most of the entries in the wish list fall into a small number of general categories:
Requester: Dave Abrahams
Paper:
Simple and uniform display mechanism for arbitrary ordered lists. (Arrays, STL containers, pairs, tuples, etc.) cout << x should "just work" for such types.
Requester: Maarten Kronenburg, Bjarne Stroustrup, Matt
Austern, others
Paper: N1718, "A Proposal to add the Infinite Precision Integer
and Rational to the C++ Standard Library", M. J. Kronenburg.
Requester: Bjarne Stroustrup
Paper:
String versions of all (or) most functions that take C-style strings (e.g. filestream constructors)
Requester: Bjarne Stroustrup
Paper:
Simple functions for extracting int and floating-point values out of a string
Requester: Frank Yeh, John Medema, and others.
Paper:
Better member functions in string template class eg. trimming, upper/lower case conversion etc. Some specific suggestions:
Requester: Bjarne Stroustrup
Paper:
Requester: Matt Austern, Assaf Lavie
Paper:
Assaf Lavie adds: Text encoding libraries - all formats of Unicode and code page conversions, base64 (maybe through iostream manipulators), etc.
Requester: Bjarne Stroustrup
Paper:
Requester: Bjarne Stroustrup, Hans Boehm
Paper:
"Infinite precision" reals.
Requester: Bjarne Stroustrup, Michael Rasmussen, John
Medema, others
Paper:
Some form of simple graphic/GUI library (possibly a simple interface to the simpler parts of larger libraries)—a recurent theme.
Some thoughts on this from Michael Rasmussen:
The main architecture resembles the idea behind Java's JDBC. E.g. the
language/library provides a standard interface or set of interfaces and
developers of graphics libraries does the actual implementation. If put
into a design pattern terminology the standard interface is a factory
hiding all implementation specific stuff from the programmer. The standard
interface would specify features and their dependencies but otherwise
leave implementation descessions to the developers of the graphics
library.
There are many advantages to this architecture of which I will point out the most important ones:
It is hard for me to point out some disadvantages but of course one is obvious: The success depends on adaptation from developers of graphics libraries.
Requester: Bjarne Stroustrup, Chris Messer
Paper:
A facility for random-access to files; maybe a form of random-access iterators for filestreams.
One can totally facilitate this random-access by memory-mapping the desired file. Unfortunately, the API for file mapping is different between platforms. ( ie. mmap() vs. CreateFileMapping() ). I would find it extremely useful to have a standardized file-mapping API.
a range checked version of standard containers/iterators and algorithms
Requester: Bjarne Stroustrup, Beman Dawes
Paper:
a way of manipulating files that are part of a directory structure
Requester: unknown
Paper:
Process manipulation library—allows you to iterate processes running on the system and its threads, support forking, duping a process, killing etc.
Requester: Bjarne Stroustrup, Tanguy Fautré, Sunil Palicha,
Michael Hopkins
Paper:
a good linear algebra library (Matrices and vectors)
My main plea is on a Linear Algebra library which is definitely needed for numeric work, to stop people having to implement it themselves and usually doing a poor or incomplete job – either in terms of efficiency or generality (or both).
As well as the obvious requirements of easy interface to C & Fortran math libraries, PLEASE make sure that vector/matrix/tensor access & manipulation can be from 1..n with no loss of efficiency (or possibly with any arbitrary start and end, but at least 1..n) and not force it to be from 0..n-1. We do this in our own libraries by defining operator() thus: v( n ) -> v[ n-1 ] , M( n, m ) -> M[n-1][m-1] etc. This could be described as Fortran style and is definitely more natural for math/stats work. Efficient methods for representing and iterating slices with guaranteed no copying would also be very useful.
Iteration from 1..n is essential to tempt math & stats people to use the library - maybe as converts from Fortran or other languages which use this more math-friendly syntax as a default.
Ability to express sub-matrices, block matrices, projection/slices etc in a natural syntax. This is touched on in section 22.4 of BS’s book TC++PL using std::valarray and its’ related types which may be good candidates for implementing a standard linear algebra library. There are no guarantees that using these auxiliary types disallows copying though, and this suggests that a straightforward implementation could compromise runtime efficiency – see below.
Runtime efficiency will be an important part of convincing math users to use the library. This will obviously have an impact on the simple iterators and slice/projection operators that are provided and how they are implemented.
Operator overloading would clearly be useful for obvious cases ( *, /, +, -, ^ ), though many matrix and vector operations & algorithms will be much better implemented by calls to BLAS, LAPACK etc libs optimised for the individual platform (e.g. ACML, MKL, ATLAS). Maybe these could be overloaded/specialised as necessary when optimised implementations are available? Some ideas are mentioned in TC++PL 22.4.7.
boost::MultiArray and boost::uBLAS both look like promising work towards a standard linear algebra library facility though I have not used either myself so cannot comment on their efficiency or generality. Maybe the authors of these could be brought in on the design process? I suspect that heavy-duty template meta-programming like Blitz++ (and uBLAS?) would produce very long compile times though, and it is likely that linking to optimised C and Fortran libraries would be preferable in many cases – see above.
Linear algebra types should be generic for fundamental math types (i.e. float, double, long double, int, unsigned int, long, unsigned long, long long, unsigned long long) but there may be an argument for not extending this to all types – so maybe typelists would be useful?
There are many good ideas related to linear algebra types in Part III of Barton & Nackman book “Scientific & Engineering C++” (chapters 13, 15, 18), though it’s possible that C++ language facilities and programming paradigms that take into account language strengths and weaknesses have moved on in some areas since then.
Interfacing to C and Fortran libraries should be a ‘no-brainer’. This will probably mean:
that all values in a linear algebra object must be stored contiguously
providing a member function that returns a pointer to the first element
boost::function is useful for generalising math operations.
Requester: David Miller
Paper:
Versions of the standard containers with virtual destructors
Requester: Howard Hinnant
Paper:
Move semantics, which requires core and library support
Requester: Gary Powell
Paper:
A string formatting library. (Aka, type safe printf) John Bates suggests that one way to improve on printf would be to have explicit parameter substitution via numbering, as opposed to the implicit ordering in printf. This would be similar to Java's MessageFormat object or Microsoft's FormatMessage API and would allow better internationalization support.
Requester: Gary Powell
Paper:
A container to hold anything, like dynamic_any
Requester: Gary Powell
Paper:
A date/time library, a date is not a duration! And an end to time_t.
Requester: Gary Powell
Paper:
a graph library, aka boost/graph (supplements std containers)
Requester: Gary Powell
Paper:
math/octonion & math/quaterion's (used by game designers for 3d math.)
Requester: Gary Powell, Walter Brown, Andy Little
Paper:
a SI/Units library. ( e = m c2, and end to the mars lander crashes.)
One example of work in this area is Andy Little's pqs library, which addresses physical quantities as well as units.
Requester: Matt Austern
Paper:
New STL algorithms: copy_if, is_sorted, versions of the uninitialized_* algorithms that take allocator arguments.
Requester: Jonathan Schilling
Paper:
A SOAP- and XML-based web services binding to the standard library. There would also need to be some kind of metalanguage/directives mechanism for marking within the C++ source, which classes and functions are being exposed as web services.
Requester: Paul Bristow
Paper: N1668 = 04-0108, "A Proposal to add Mathematical Functions
for Statistics to the C++ Standard Library", Paul Bristow
More math functions for statistics: Student's t function, Kolmogorov statistic, incomplete beta function, digamma function, and so on.
Requester: Matt Austern
Paper:
A mechanism for accessing integer operations that every processor has: addition with carry, full-width multiplication, and division with remainder and quotient. The signatures might look something like this: pair<UInt, bool> add(UInt, UInt);, pair<UInt, UInt> mult(UInt, UInt);, pair<UInt, UInt> div(UInt, UInt);. We would probably want overloads for unsigned int, unsigned long, unsigned long long (assuming it's added to the core language), and maybe unsigned char.
Nick Krempel adds:
Firstly, I wanted to point out that addition with carry is already
possible in a reasonably natural way as follows:
unsigned int a, b, c;
// a and b are inputs
c = a + b;
if (c < a) // carry
else // no carry
This seems slightly easier to work with than a pair object, but I can see
why you may want the pair object version for uniformity.
Secondly, and more importantly, I wanted to emphasize the importance of a
2-by-1 division.
That is, a division taking a double-precision number (which could be 2
separate ‘UInt’s of the relevant type) and a single-precision divisor and
returns a single-precision result. An exception can be thrown if the
result doesn’t fit in single precision. Another version can also return a
single-precision remainder.
This operation is important for implementing multiprecision arithmetic
(for example a divide n-by-1 is easily implemented in terms of it), and it
is available as a single assembler instruction on Intel processors. It is
a natural inverse to the 1x1->2 full-width multiplication you propose.
Of course, it can be implemented in terms of usual (C++) arithmetic
operations, but this requires at least as much code as implementing your
proposed full multiplication in terms of the usual C++ multiplication (at
least 3 multiplications and some bit shifting), and of course will be a
number of times slower. So it seems in line with your proposal to include
this operation too.
Requester: Rüdiger Brünner, John Medema, Eduardo Yánez
Paper:
Generic database support; STL-based recordset operations
Requester: Rüdiger Brünner, Raffaele Mancuso,
Anubhav Hanjura, others
Paper:
Library for binary object serialization/deserialization including object factory, versioning support, support for native datatypes and Standard Library classes (std::basic_string, STL containers, std::complex, etc.)
Anubhav Hanjura adds: I would want C++ to support serialization and deserialization of complex data structures (may be based on the STL) completely, so that data streams can be exchanged without bothering about endianness or encoding/decoding specifics.
Requester: Rüdiger Brünner
Paper:
Fixed-point arithmetic library and currency class based upon it
Requester: Rüdiger Brünner, Assaf Lavie
Paper:
Singleton template (like in the Loki library) and probably also generic implementations of other common patterns. Assaf Lavie adds a request for machine-wide singleton classes.
Requester: Sylvain Pion
Paper:
The Sun compiler provides a built-in type for interval arithmetic, and Boost and CGAL have interval classes. There are many other implementations elsewhere.
Requester: Alisdair Meredith
Paper:
One item I would like to add to the LWG wish list if a 'class factory' or 'virtual constructor' implementation, again inspired by Loki. I have been wrestling with several points of the design space, trying to find the best balance. Was hoping to propose something for TR2 but there is one issue that really needs core (i.e. Evolution) support before we can have a truly satisfying interface, and that is the ability to work with any constructor rather than just the default.
A mix of perfect forwarding and var-arg template parameters will get us some way there, but I'm still not sure it is enough.
Another issue not immediately tied in is the progress on dynamic libraries, as class factories often lie at the heart of plug-in schemes.
Beyond that it is a matter of defining the optimal interface. The Loki version is quite policy driven eg. for return type. Do we want to allow a variety of returns, or fix on tr1::shared_ptr, or auto_pr (or even move_ptr if it arrives)?
Requester: Dmitriy Morozov and others
Paper:
The standard's allocator machinery, in principle, allows users to define allocators with alternate memory models. (The original motivation was MS-DOS's "near" and "far" memory models.) Allocators have reference and pointer typedefs. However, this support is seriously limited. First, the current standard doesn't allow any freedom for Alloc::reference: it is required to be T&. Second, 20.1.5/4-5 says that standard library implementations are allowed to ignore an allocator's memory model typedefs and assume that all allocators use the usual memory model.
These restrictions should be removed. This may require some core language changes. It may also require writing out requirements that an alternate memory model must satisfy.
Requester: Diego Bragato
Paper:
A standardized documentation facility (e.g. doxygen)
Requester: Diego Bragato
Paper:
A metadata language support (e.g. xdoclet,jdk1.5)
Requester: Diego Bragato
Paper:
Support for reversibility from the implementation to a "set of" design paradigms (e.g. UML and eiffel). I do feel that this would be a strong innovation. This feature can be nicely integrated with graph support, documentation, metadata, and even the graphical facilities
Requester: Diego Bragato
Paper:
Support for the design by contract method, including standardized support for testing available from the asserts in the contracts.
Requester: Diego Bragato
Paper:
Requester: Diego Bragato
Paper:
Specification for a web service container, taking into account also authentication
Requester: Liran Shaul
Paper:
Why doesn't the std::map class offer a method reserve that will allocate memory ahead of time, like the vector::reserve function?
I understand that this function won't be able to be as efficient as the vector/list matching function, However allocating all the memory (or at least most of it) In advance has 2 major advantages that I can think of:
Requester: Jeremy Jurksztowicz, James Buchanan, Lubos Dolezel, and
others.
Paper:
I was just going over the wishlist for the C++ standard and have another suggestion. As lock free programming comes more and more into mainstream, why not add an atomic type to C++? Something like 'volatile word', 'atomic<T>' or just make it a requirement that all accesses to volatile ints are atomic, for most common platforms it already is. This would make portable FIFO, and other lock free structure implementations trivial. Alternatively you could add a std::compare_and_swap function.
James Buchanan adds a request for mutexes, semaphores, and spinlocks. This would be enough for low level system programming on SMP machines.
Lubos Dolezel adds: Something like std::mutex would be great. Just a simple multiplatform class for mutexes with methods like lock(), unlock() and trylock() would simplify many projects.
Requester: Kirk Korver
Paper:
One of the annoyances I have with using the STL is it's implementation of std::for_each(). I think this algorithm is a great idea, and passing in the start and end ranges are nice for flexibility. But I ask you, how many times have you used for_each on EVERY ELEMENT in the collection? I would wager that it is upwards of 90%. Why do so much typing for something we do often?
I propose: The addition of the algorithm for_all, which is a replacement to for_each, where the functor is to be applied to ALL elements in the collection.
Here is the actual implementation I have in my code.
namespace std
{
// Author & Date: Kirk Korver 03 Jun 2004
// Purpose: wrap the std::for_each function to apply the functor to ALL entries
// Inputs:
// c - the container we care about
// f - the function or functor we want to apply
template <class Container, typename Functor> inline
Functor for_all(Container & c, Functor & f)
{
return std::for_each(c.begin(), c.end(), f);
}
};
Requester: Greg Osgood
Paper:
There is one feature that I would like to see added to the STL: the ability to specify the threading model for collections. In most implementations the threading model is defined at the project level. There are some situations in which accessing a collection if very critical to performance of the application. If the threading model is single threaded then the performance will be very close to the same as using arrays. However, if the threading model is multithread then the performance may suffer greatly. This seems pretty obvious to me, but if you have any questions or need clarification let me know.
Requester: Tyler Doubrava, Chris Messer
Paper:
It would be great if std::vector<> had an intuitive way to deallocate its memory. You can do so by calling the member function swap(), providing an empty instance of the same type. Most people, I believe, would have intuitively expected the clear() member function to do so.
I suggest adding a new member function, dealloc() which would call clear, and free the vector's memory.
Further comment from Chris Messer:
I believe it is important to maintain the functional difference between
clearing a list and freeing it. If one intends to re-fill the list, then
clearing saves the performance hassle of re-allocating, so I don't think
that freeing memory in clear() is a good idea. It would be nice to have a
member function tidy() that wasn't related to clear(), but that freed all
of the vector's unused capacity. (At least one implementation already
provides _Tidy as a nonstandard extension.)
Requester: Benjamin Hanson
Paper: lexertl: The
Modular Lexical Analyser Generator
I would like to see a lexical analyser generator library (and even maybe some kind of yacc library) added to std::tr2. Dave Handley and myself are currently developing a lexical analyser generator 'lexertl' which can be viewed at http://www.benhanson.net/lexertl.html.
James Buchanan adds:
Spirit is very nice, but is quite hackish. I think it's better to have some kind of template where you can choose the algorithm (LL(k)/resursive descent, GLR, LALR, LR(k)) and make the lexical analysis and parsing framework understand EBNF as a standard way for feeding in a CFG. Also, there should be additional types like symbol tables and an AST type.
Requester: James Buchanan
Paper:
Some kind of standard interface for dealing with graphics primitives (hooks into X or ncurses/dialog on Unix systems, or optionally SVGAlib on Linux, Win32 on Windows systems, whatever on Mac).
Also multimedia: A standard interface to the audio and video facilities on a platform.
Requester: James Buchanan
Paper:
Binary trees, splay trees, Red-Black trees, B trees, B+ trees, dancing trees (those used in Hans Reiser's Reiser File System).
Requester: James Buchanan
Paper:
Requester: James Buchanan
Paper:
This would be separate from the Standard C++ library proper, and become a second standard library. There is a draft of C for embedded systems that the embedded C++ stdlib could make use of as a starting point. The embedded stdlib would need to have interfaces for access to low-level system I/O (inport_b, outport_b and friends), etc.
Requester: Zach Laine
Paper:
The lack of uniformity between std::map iterators and other container's iterators could be partially addressed by providing "*_value" versions of std::for_each, std::accumulate, etc.. These versions would be specialized for iterators whose value_types are std::pairs, and would operate only on the .second member thereof, while still operating as before on iterators with non-std::pair value_types. Alternately, this could be accomplished by providing a new iterator that wraps a given iterator, and returns the wrapped iterator's value_type's .second member when dereferenced, if appropriate.
This lack of uniformity could be fully addressed by acknowledging that std::map is a bit different than the other containers, and creating a custom iterator type for its use. This new iterator type would still provide access to the key associated with its associated element, through a method called key_value() or similar, but would access its element's mapped value when dereferenced, and would access its element's mapped value's members using operator->().
Requester: Assaf Lavie
Paper:
To be able to open Unicode-named files using fstream (i.e. not just a string c'tor but also a wstring c'tor).
Requester: Assaf Lavie
Paper:
i.e. standard basic_string<tchar> where tchar compiles to either char or wchar_t
Requester: Assaf Lavie
Paper:
From Assaf Lavie:
I wish it was as easy to do web-stuff in C++ as it is in, say, perl. Classes for HTTP/FTP/HTTPS/XML/HTML/
From John Medema:
A unified socket/stream interface. Perhapsnstream nsck( "www.google.com:80", "#" )where "#" is a delimiter string for string parsing (i.e. the stream contains "1#2#3#4.4#5", and repeated calls nsck >> myString would yield myString values of 1, 2, 3, 4.4, and 5).
Requester: Assaf Lavie
Paper:
lambda expressions - std::for_each is hardly useful without them.
Requester: John Medema
Paper:
Wrapper libraries for common C functions. This would be largely unnecessary if we could get a richer std::string interface with conversions to and from C data types, but there are a great many socket functions that need dynamically allocated pointers.
Requester: Greg Osgood
Paper:
I would like to see a binary_find function in STL algorithm which would work just like binary_search except instead of returning a bool indicating whether or not the item was found, would return an iterator to the object if it is found and the end of the collection if it is not. Just like binary_search, there should be two versions one which takes a predicate. The function signature could be as follows:
template<class ForwardIterator, class Type>
ForwardIterator binary_find(
ForwardIterator _First,
ForwardIterator _Last,
const Type& _Val
);
template<class ForwardIterator, class Type, class BinaryPredicate>
ForwardIterator binary_find(
ForwardIterator _First,
ForwardIterator _Last,
const Type& _Val,
BinaryPredicate _Comp
);
(This would just be lower_bound followed by an equality check, but it might still be useful.)
Requester: Ravi Menon
Paper: A
candidate for a boost version of intrusive containers
We have observed performance problems with std stl containers (it is used at certain areas in our code), and sticking with old C code (space and time efficient) is too cumbersome.
Without getting into ugly details, one main issue we see with stl containers like map and hash_map is the 'extra allocation' required for the node - _Hashtable_node for hash_map and _Rb_tree_node for map. Even if we are storing pointers to objects (for key and value) to avoid copy ctor overheads for the pair, for MT systems with mutex overhead, it still is an issue.
Although I am a big fan of stl containers, this is one criticism I can't overcome from folks who are more used to old C style intrusive implementations (with all that unsafe casts to void* etc..).
No amount of positive arguments in favor like type safety, inlining (as opposed to fn pointers used for hash comparators and hash functions), readability, s/w maintenance etc.. can overcome the bias in favor of measurable (and significant) performance gains with intrusive lists.
Requester: Ravi Menon
Paper:
It would be useful to have some provision (a template argument?) to supply an alternative hash table used for unordered associative containers. E.g. power of 2 size hash table to avoid mod (%) arithmetic, and get to the bucket via bit-wise '&' - i.e. bucket = hash & (size - 1). I can think of some designs and willing to contribute if there is an interest.
Requester: Jeroen Vermeulen
Paper:
Cloning
Exceptions
We need a way to be able to create copies of exception objects, without full static type information. This helps parallelism because it allows us to write frameworks that perform multiple tasks independently of one another, and gather up their successes and failures for later processing. Just catch any exception coming out of each individual task, add it to a container of results, and proceed to the next independent task.
The std::exception hierarchy already has a vtable, so this is no great loss in terms of performance. Now, every concrete class derived from std::exception would hopefully implement this function, always returning a copy of itself as a std::auto_ptr<std::exception> to avoid changing the return type--but dynamically it would always point to an exception of the right type. Programmers implementing exception classes of their own would be called upon to do the same.
Requester: Fernando Lagos
Paper: http://www.codeproject.com/vcpp/stl/linked_map.asp
A family of adaptors/containers to make the standard associative
containers know the insertion/get/deletetion order. An example can be seen
at http://www.codeproject.com/vcpp/stl/linked_map.asp
Requester: George Faraj
Paper:
I'm writing to propose something to the library wish list. I haven't found this in the active issues list.std::vector<ConvertibleToInt> v1;
...
std::vector<int> v2 = v1; // error
std::vector<int> v2(v1.begin(), v2.end()); // fine
std::vector<std::vector<ConvertibleToInt> > v3;
std::vector<std::vector<int> > v4 = v3; // error
std::vector<std::vector<int> > v5(v3.begin(), v3.end()); // error
// fine but overly complicated
std::vector<std::vector<int> > v6;
std::vector<std::vector<ConvertibleToInt> >::iterator it, end = v3.end();
for (it = v3.begin(); it != end; ++it)
v6.push_back(std::vector<int>(it->begin(), it->end()));
// proposed syntax
std::vector<std::vector<int> > v7 = v3; // simple and intuitive
vector(vector<T, Allocator> const&);
template <typename Y, typename AllocatorY>
vector(vector<Y, AllocatorY> const&);
Requester: Andy Jewell
Paper:
How about an additional template parameter for vector and bitset and such,
that determines the index type, i.e. the type of the parameter to
operator[], or perhaps more generally the size_type.
Common uses would include enumerated types, or one the many abstractions
to make integers into separate types.
The only effect of this parameter would be to cause a compile time error
if the wrong type is used.
One could certainly wrap the existing classes to get this effect, which
many people do, but it seems a waste of effort.
Requester: Dennis Lubert
Paper:
Possibility to choose the underlying tree type of a std::map (probably via
template parameter, of course std::set and the multi variants too). This
probably goes together with the wish of additional tree types by James
Buchanan (I would like to see a AVL Tree implementation there too). The
main reason here is, that most std::map implementations use an rb tree
implementation. While this is quite ok in most times, it can get very
unbalanced if fed with badly ordered (sorted) data, which especially comes
when copying maps (seems mostly done by inserting in-order).
Requester: Dennis Lubert
Paper: http://www.boost.org/libs/assign/doc/index.html
One thing that I noted is especially handy for C++ newbies, is the
boost::assign family of functions that allows a very intuitive handling of
container input. I think most parts of those functions would be very good
to have in C++.
Requester: Oliver Kowalke
Paper:
http://boost-sandbox.sourceforge.net/libs/statechart/doc/index.html
I would suggest boost::statechart library which allows the implementation of finite state machines over templates. It follows the UML-statechart notation. It will be included into boost-1.34. I'm using it and find it very impressive.
Please send additions to this list to Matt Austern, matt @ lafstern.org