Doc. No.: X3J16/95-0013 WG21/ N0613 Date: January 31, 1995 Project: Programming Language C++ Reply To: J. Lawrence Podmolik Andersen Consulting jlp@chi.andersen.com Clause 23 (Containers Library) Issues List Revision 1 Revision History Version 1 - January 31, 1995: Distributed in pre-Austin mailing. Introduction This document is a summary of the issues identified in Clause 23. For each issue the status, a short description, and pointers to relevant reflector messages and papers are given. This evolving document will serve as a basis of discussion and historical for Containers issues and as a foundation of proposals for resolving specific issues. Issues ------------------------------------------------------------------------- Work Group: Library Issue Number: 1 Title: propose to add convenience functions to stl containers Sections: 23.1.5 through 23.1.8 and 23.2.1 through 23.2.4 Status: active Description: There are some common needs that are not currently provided as a class members of the stl container classes. These additional methods can be provided by individual programmers as needed, but in our experience are so commonly wanted that they deserve to be included as class members Proposed Resolution: Add the following method to all containers size_type void Container::clear() {return erase(begin(),end());} Note: the returned size_type matches another proposal that all methods which change the size of a container by "some" amount return that amount. add the following methods to containers that provide the equivalent void pop_something(): sections 23.1.7 through 23.1.8 (list, deque) T pop_front_value() {T ret = front(); pop_front(); return ret; } section 23.1.5 through 23.1.8 (vectors, list deque) T pop_back_value() {T ret = back(); pop_back(); return ret; } sections 23.1.9 and 23.1.11 (stack and priority_queue) T pop_value() {T ret = top(); pop(); return ret; } section 23.1.10 (queue) T pop_value() {T ret = front(); pop(); return ret; } Note: The method names are a suggestion only. It is the returned T that is wanted. Requestor: Frank Griswold: griswolf@roguewave.com Owner: Discussion: Alex Stepanov has indicated that the design philosophy of the STL was (approximately) every method should "do exactly one thing". While this criterion is elegant and correct, it could be improved to make the library's "natural" use somewhat easier. Balanced against this consideration is the cost incurred by making the change: The interface becomes slightly wider; and the library becomes slightly larger (and therefore harder to maintain). Rogue Wave has had a great deal of experience in the matter of choosing an optimum class interface. It is always difficult to explain to users why any particular additional proposed method costs more to implement, maintain, and use than it is worth. In this case, the additional functionality is parallel to many requests from users of the Tools.h++ classes, the implementations are minor, maintenance should be nil, and programmers who do not make use of the methods should incur very small compile time and no run time cost. ------------------------------------------------------------------------- Work Group: Library Issue Number: 2 Title: should some STL members return an iterator? Also: minor clarification of member insert() Sections: 23.1.5 through 23.1.8 and 23.2.1 through 23.2.4 Status: active Description: There are some methods in all the STL containers that currently return void. These methods can be used "in a more natural way" for certain coding techniques if they return an iterator. Also: The current description of the method iterator insert(iterator location, const T&) does not specify which iterator is returned. Proposed Resolution: If a container method might change the size of the container by exactly one, then it should return an iterator that points to the item inserted, or "just past" the item removed. (This covers both the needed clarification and the proposed change) The method listed below should return an iterator providing the location of the change called for by the method. These changes should be made in parallel on all the containers mentioned in the WP sections listed above. iterator Container::erase(iterator); The push and pop methods should return an iterator because doing so keeps the conceptual model consistent. Also note that for containers with only a forward iterator, the proposed version of push_back() returns information that could be expensive to recover. (Sections 2.3.1.5 through 2.3.1.8) iterator Container::push_front(const T&) {... return begin();} iterator Container::push_back(const T&) {... no way to do cheaply unless bidirectional ... return --end()} iterator Container::pop_front() { ... return begin(); } iterator Container::pop_back() { ... return end(); } the following method should return an iterator because doing so keeps the conceptual model consistent. iterator list::splice(iterator position, list& donor, iterator donor_location) Note that Container::insert(iterator,const T&) already returns an iterator for all classes in the STL. Requestor: Frank Griswold: griswolf@roguewave.com Owner: Discussion: Alex Stepanov has indicated that the design philosophy of the STL was (approximately) every method should "do exactly one thing". While this criterion is elegant and correct, it could be improved for the general user by allowing a more natural coding style which arises when the method returns an iterator referencing the item at the location where the action occurred. There is the further consideration that under some circumstances, it may be expensive to recover that location without such a returned iterator. Note also that the current specification does not require that any iterator remain valid after a change to the container (each container class specifies which iterators remain useful). Providing a returned "guaranteed good" iterator allows continued use of an iterator that cannot be guaranteed without this proposed change. Balanced against these considerations are the costs incurred by making the change: The cost of returning the iterator will never be more than the cost of creating, assigning and incrementing an iterator, and in many cases it will be possible to return (an assignment of) an iterator that was already in use. There is also the added cost to the program of having a return value in a register or on the stack. There are already methods in the STL containers that are consistent with the proposed conceptual model (For instance: Container:: insert(iterator const T&)). The proposed changes can therefore be seen as modifications to some classes to make the conceptual model consistent across all the STL containers. Example using erase // as it is now Collection::iterator hit = find(col.begin(),first_deletable); for(int i = 0; i < interim_deletes; ++i) iterator save = hit; // can this work? Collection::iterator save = hit; // maybe do something useful with hit col.erase(hit) hit = ++save; // oops: no guarantee this can work // so you might have to do some pretty fancy (and expensive) // extra work. Like do another search for each loop. Ugh. } // as proposed Collection::iterator hit = find(col.begin(),first_deletable); while(for int i = 0; i < interim_deletes; ++i) { // maybe do something useful with hit hit = col.erase(hit); // no need to increment hit: we get "next" back from call to erase() } ------------------------------------------------------------------------- Work Group: Library Issue Number: 3 Title: problems of nomenclature in STL classes Sections: 23.1.5 through 23.1.8 and 23.2.1 through 23.2.4 Status: active Description: -- list has the only method that takes a predicate object: remove_if(Predicate). This method's name should be changed to "remove." -- The STL container classes provide methods to "erase" or "remove" data; but the names of those methods are not consistent with their semantics across classes. -- Section 17.2.2.4.2 refers to "associative containers" These are not all associative (in the sense of an association between a key and a value). They are containers which internally mediate the location of their contained data rather than allowing the user to place the data. This request for renaming may be editorial. Proposed Resolution: -- No container class member name contain a trailing "_if," since overloading based on the signature of the method is sufficient. This will change the WP only for class list 23.1.7 class declaration; change remove_if to remove 23.1.7.2 paragraph 8 remains unchanged -- Also propose that the following conceptual model be used in naming methods which take data out of the container: If all the data that in some sense "matches" a key or a predicate object is removed, then the name of the method is "remove" parallel to list::remove()) of section 23.1.7.2 paragraph 8. By another proposal, these methods would also return a size_type instead of void. If data is erased at an iterator or in the range between two iterators, the method is named "erase" parallel to the erase() defined in 17.2.2.4.1 table 22. By another proposal, these methods would return either a size_type (range) or an iterator "just past" the point of erasure (single iterator). In summary: methods named erase "just erase right here" but methods named remove "search out and destroy." This will change the WP in sections: 17.2.2.4.2 table 25 associative containers change the first a.erase(k) to a.remove(k). Note the returned size_type which is already in conformance with the other proposal. 23.2.1, 23.2.2, 23.2.3, 23.2.4: rename erase(const key_type&) to remove... -- Also propose that section 17.2.2.4.2 be changed to refer to classes which are "not externally sequenced" or which are "internally sequenced" . This would suggest a parallel change in section 17.2.2.4.1 from "sequence" to "externally sequenced". We can live with any name which seems suitable to the editor or committee; and which does categorize the kind of container without linguistic traps. Requestor: Frank Griswold: griswolf@roguewave.com Owner: Discussion: Renaming list::remove_if. This simplifies life for programmers. We consider it "almost editorial" in the sense that no functionality would be lost. But since it changes the interface, we understand that it must be voted on. An alternative (and clearly not editorial) change would be to remove the method from list's interface: list is unique in having such a method. Rationalizing the use of "erase" to mean "erase right here" and "remove" to mean "find and remove all matches." This too is a simplification. As it now stands, the method "erase" overloads to mean two essentially different kinds of actions. This is a potential source of confusion; and is easily remedied. Renaming "associative containers." There are such things as really associative containers: The (multi)map classes are associative. Using the word associative to mean something else, particularly in the close context of maps is misleading. The WP is difficult enough to understand even without semantic traps. We do understand the difficulty of finding a meaningful and succinct replacement (and do understand that the word "associative" can be used to imply association of a large group). Our problem with the word is that it has such strong resonance with the meaning "mapping" that it makes it difficult to communicate about the "non-associative associative containers" such as Set. Ugh. The best we have been able to imagine or solicit is "internally sequenced" which is not very pleasing but which (we hope) is descriptive and not misleading. We do consider this editorial, but welcome the chance that more minds will find a better word. ------------------------------------------------------------------------- Work Group: Library Issue Number: 4 Title: should STL classes have fixed comparator semantics? Section: 17.2.2.4 (table 22), 23.1.5 through 23.2.4 Status: active Description: Table 22 specifies that the semantics of operator==(const Container a, const Container b) is "a.size() == b.size() && equal(a.begin(), a.end(), b.end())" This use of the algorithm equal() forces containers to hold data in the same order if they are to be ==. While this is often reasonable, it is not always the meaning that is wanted, particularly for data that is being used as an unordered (logical, not stl) set. Table 22 also specifies that the semantics of operator< (const Container, const Container) is "lexicographical_compare(a.begin(),a.end(),b.begin(),b.end())" As with operator==, this requirement for lexicographic ordering among containers, while often useful is not invariably what is wanted, particularly when the container is being used as an unordered (logical) set. Proposed Resolution: Provide suitable specialized meanings for the various operators by providing a traits class for each container. The standard should require (in sections 23.1.5 through 23.2.4) that these traits classes be specialized as follows: list, vector, deque: lexical comparison semantics (multi)set, (multi)map: set inclusion semantics In section 17.2.2.4 there should be a discussion like this: Any container which meets the STL specification must have an associated specialization of the container_traits class which provides a typedef binary_function&, const container&, int) comparitor; which may be used to provide the 6 comparison operators on that container. A complete description of the container must include a description of the comparison semantics provided for that container. Table 22 of section 17.2.2.4 would change in boxes defining operational semantics of operator==() and operator<() to say something like "either lexically comparison or set-inclusion comparison, depending on the container" There needs to be a discussion/definition of template lexical_comparitor; and template set_comparitor I'm not sure where these go. Chapter 25? lexical_comparitor makes use of lexicographical_compare (25.3.8), set_comparitor might make use of includes (25.3.5.1) but might not, depending on the container. Here is container_traits: template struct container_traits { typedef binary_function&, const container&, int> comparitor; }; Here is a required (partial) specialization: template struct container_traits { typedef lexical_comparitor comparitor; }; Requestor: Frank Griswold: griswolf@roguewave.com Owner: Discussion: There are actually two other possible resolutions. We believe they are inferior: -- "Leave things alone: Users can always use a function rather than an operator if they want different behavior." This solution is clear, simple, and wrong: Operator overloading can always be replaced by function call notation, but there are very few C++ practitioners who would willingly give up the clarity and elegance afforded by class-specific operators. -- Remove the rows from table 22 that refer to the semantics of the comparison operators (since they all depend on operator==() and operator<()); and insert a statement that the operators should be user-defined as needed. This solution is clear, simple, and moves the burden of providing reasonable semantics onto the production programmer, which we believe is wrong. If we are supplying a set of containers, it should be complete as well as correct. Although this issue bears some resemblance to the issue of providing a (std::) operator>=() in terms of operator<(); it is not the same: The problem here is not the possibility of providing unexpected (and perhaps incorrect) functionality. The problem is that the semantics for operator==() and for operator<() are not always exactly as the current standard defines them, and they _should_ not be fixed. We believe the best way to provide user-specifiable semantics for container comparisons is to require that they use a container_ comparitor that returns an int which is interpreted in the same way as the return from strcmp(). This three way result may then be used to implement all 6 of the comparison operators in a self-consistent way, and in constant additional time. This solution does impose slightly more structure on comparison semantics than necessary: operator==() need not be identical to "compare returns zero", but we regard this amount of structure as useful, reasonable and "never wrong in practice." Code something like this: template bool operator==(const container& a , const container& b) { // no size() comparison: the comparitor does all the work container_traits >::comparitor comp; return comp(a,b) == 0 ? true : false; } bool operator!=(const container& a , const container& b) { container_traits >::comparitor comp; return comp(a,b) != 0 ? true : false; } bool operator<(const container& a , const container& b) { container_traits >::comparitor comp; return comp(a,b) < 0 ? true : false; } // etc for the other 3 comparitors possible issues: This really depends on partial specialization to work. Users could still get particular semantics if they want to: class MyClass template struct container_traits { bool has_comp true; typedef set_comparitor comparitor; }; There has been some discussion of the possibility of moving the associative containers' key_compare into this traits class. Further thought leads us to believe that we do not want to force a key-type comparison into the container: A user may very well wish to use the same kind of container to hold the same type of key, but using different key comparison semantics between the two containers. There has also been some discussion of the possibility of adding another template argument to the containers: The comparison class. This would allow the user to specify for instance, list a; list b; Although this scheme is attractive, we believe it would add too much potential for error in actual use to be appropriate in the standard. For instance the two lists above cannot be compared because they are not of the same type. If the STL containers are to be useful as a "common denominator" among C++ classes, they should have a fixed meaning: classes of the same name containing the same type should always be comparable. Note that while users can still provide the comparison semantics they want for a given container of a given type, that semantics is fixed for any compilation. ------------------------------------------------------------------------- Work Group: Library Issue Number: 5 Title: should some STL members return a size_type? Sections: 23.1.5 through 23.1.8 and 23.2.1 through 23.2.4 Status: active Description: There are some methods in the STL containers that currently return void. These methods can be used "in a more natural way" if they return a count of the number of items removed or added. Note similar proposal requesting that some methods return iterators. Proposed Resolution: If a container method might change the size of the container by an unknown amount, it should return the amount of the change. The methods listed below should return a size_type indicating the number of items added to or removed from the container. These changes should be made in parallel on all the containers mentioned in the WP sections listed. section 23.1.5 through 23.1.8 and section 23.2.1 through 23.2.4 size_type Container::erase(iterator start, iterator boundary); section 23.1.5 through 23.1.8 template size_type Container::insert(iterator location InputIterator first, InputIterator boundary); section 23.2.1 through 23.2.4 template size_type Container::insert( inputIterator first, InputIterator boundary); section 23.1.5 through 23.1.8 size_type Container::insert(iterator location, size_type, const T&); section 23.2.1 through 23.2.4 size_type Container::insert(size_type, const T&); The methods listed below should return a size_type indicating the number of items added to or removed from the container. These changes apply only to the class: list section 23.1.7 and 23.1.7.2 para 10 size_type list::merge(list&); section 23.1.7 and 23.1.7.2 para 5 size_type list::splice(iterator position, list&); section 23.1.7 and 23.1.7.2 para 7 size_type list::splice(iterator position, iterator start, iterator boundary); section 23.1.7 and 23.1.7.2 para 9 size_type list::unique(); section 23.1.7 and 23.1.7.2 para 9 template size_type list::unique(BinaryPredicate); section 23.1.7 and 23.1.7.2 para 8 size_type list::remove(const T&); section 23.1.7 and 23.1.7.2 para 8 template size_type list::remove_if(Predicate); Note: this method's name is the subject of another proposal Requestor: Frank Griswold: griswolf@roguewave.com Owner: Discussion: Alex Stepanov has indicated that the design philosophy of the STL was (approximately) every method should "do exactly one thing". While this criterion is elegant and correct, it could be improved for the general user by allowing a more natural coding style which arises when the method returns a count. See examples below. Balanced against these considerations are the costs incurred by making the change: A size_type local variable will have to be created and manipulated, either by incrementing during the execution of the method, or by subtracting one call of size() from another. There is also the added cost to the program of having a return value in a register or on the stack. There are already methods in the STL containers that are consistent with the proposed conceptual model (set::erase(const T&) for instance). The proposed changes can therefore be seen as being modifications to some classes to make the conceptual model consistent across all the STL containers. Example using insert // as it is now... visible_lines_container::size_type cur_sz = visible_lines.size(); lines.insert(lines.end(),newLines.begin(),newLines.end()); if(lines.size() - cur_sz > current_empty_lines) scroll_visible_lines(); // as proposed if(lines.insert(lines.end(),newLines.begin(),newLines.end()) > current_empty_lines ) scroll_visible_lines(); Example using erase // as it is now OdaContainer::size_type cur_oda_sz = overdueAccounts.size(); overdueAccounts.erase(first_reconciled,end()); if(cur_oda_sz - overdueAccounts.size() > this_user_accounts) throw(something); // as proposed if(overdueAccounts.erase(first_reconciled,end()) > this_user_accounts) throw(something); Example using remove() // as it is now list kill_list; list::size_type killed = kill_list.size(); do { kill_list.remove(get_user_choice()); } while (0 == killed - kill_list.size()) // user chose badly // as proposed list kill_list; while(0 == kill_list.remove(get_user_choice())) ; /* spin until you get a proper choice */ ------------------------------------------------------------------------- Work Group: Library Issue Number: 6 Title: naming inconsistencies in bits Sections: 23.1.1 [lib.template.bits] Status: active Description: vector uses "flip()" to toggle a bit, bits uses "toggle()". I asked this question almost 3 years ago, and people didn't care much either way. The international folks then present mildly favored "toggle". Not wanting to waste time on such trivia, and trusting that Alex has as much right to speak for international users as anyone, I suggest that we use "flip" in place of "toggle" in bits (it occurs in two places). Likewise, let's rename "length()" to "size()" (if not done already), to be in-sync with the rest of STL. I have for some time suggested that we rename bits to bitset, making it less awkward to talk about ("bits" is not a singular noun). This is not a priority, but I would surely like to hear some feedback at least once in a row. I have also suggested that "to_ushort()" is redundant, since we have to_ulong(). Resolution: Requestor: Chuck Allison <72640.1507@compuserve.com> Owner: Emails: (none) Papers: (none) ------------------------------------------------------------------------- Work Group: Library Issue Number: 7 Title: adding vector::flip that toggles all bits Sections: 23.1.1 [lib.template.bits] Status: active Description: vector uses flip() only as something one can do with a "reference", a surrogate for single-bit access. Is there anything wrong with adding a vector::flip to toggle all bits, like bits does? Resolution: Requestor: Chuck Allison <72640.1507@compuserve.com> Owner: Emails: (none) Papers: (none) ------------------------------------------------------------------------- Work Group: Library Issue Number: 8 Title: add a nested reference class to bits Sections: 23.1.1 [lib.template.bits] Status: active Description: I propose that we add a nested reference class to bits, similar to vector, to allow for bits::operator[]. To be precise, add: template class bits { public: class reference { public: reference& operator=(bool); // for b[i] = x; reference& operator=(const reference&); // for b[i] = b[j]; bool operator~() const; // for ~b[i] operator bool(); // for x = b[i]; reference& flip(); // for b[i].flip(); }; reference operator[](size_t); // for b[i] // the rest as-is in bits... }; Note no public constructor for reference. I have implemented this successfully in Borland C++. Note also that there is still no reason for introducing an iterator class for bits. Resolution: Requestor: Chuck Allison <72640.1507@compuserve.com> Owner: Emails: (none) Papers: (none) ------------------------------------------------------------------------- Work Group: Library Issue Number: 9 Title: adding a "default value" argument to map/multimap constructors Sections: 23.2.3 [lib.map], 23.2.4 [lib.multimap] Status: active Description: As currently defined, when operator[] is applied to a map or a multimap and a new entry is inserted into the map as a result, the new value is initialized using the default constructor T(). This is not always desirable - sometimes it is useful to specify another default value. The USL library solved this problem by providing an alternate constructor whrein the user could specify the value to be used when new ("empty") entries were automatically inserted into the map. Such an option could be added to map and multimap in the current WP. The analogous map/multimap constructors might look something like this: map(const T& = T(), const Compare& comp = Compare()); multimap(const T& = T(), const Compare& comp = Compare()); Resolution: Requestor: Larry Podmolik Owner: Emails: (none) Papers: (none) ------------------------------------------------------------------------- Work Group: Library Issue Number: 10 Title: reference counted strings and begin()/end() Sections: 21.1.1.2 [lib.basic.string] Status: active Description: In Valley Forge, we accepted a version of basic_string that was modified to be compatible with STL. One of the issues that arose was ensuring that a reference counted version of basic_string could be efficiently implemented. One suggested implementation prevented structural sharing of strings whenever a non-const iterator was allowed to escape from the string. This can make innocent-looking code like the following inefficient: string s; string::const_iterator iter = s.begin(); ... The non-const version of begin() gets invoked, so we would have to prohibit all future sharing of s, even though we don't intent to modify it. To force the return of a const iterator involves a messy looking const_cast to a reference type. So: should we create distinct names for the functions that return const iterators? Resolution: Requestor: Larry Podmolik Owner: Emails: c++std-lib-2981, c++std-lib-2985 Papers: (none) -------------------------------------------------------------------------