Doc. no. N2024=06-0094
Date: 2006-06-23
Project: Programming Language C++
Reply to: Howard Hinnant <howard.hinnant@gmail.com>

C++ Standard Library Active Issues List (Revision R43)

Reference ISO/IEC IS 14882:1998(E)

Also see:

The purpose of this document is to record the status of issues which have come before the Library Working Group (LWG) of the ANSI (J16) and ISO (WG21) C++ Standards Committee. Issues represent potential defects in the ISO/IEC IS 14882:1998(E) document. Issues are not to be used to request new features.

This document contains only library issues which are actively being considered by the Library Working Group. That is, issues which have a status of New, Open, Ready, and Review. See Library Defect Reports List for issues considered defects and Library Closed Issues List for issues considered closed.

The issues in these lists are not necessarily formal ISO Defect Reports (DR's). While some issues will eventually be elevated to official Defect Report status, other issues will be disposed of in other ways. See Issue Status.

This document is in an experimental format designed for both viewing via a world-wide web browser and hard-copy printing. It is available as an HTML file for browsing or PDF file for printing.

Prior to Revision 14, library issues lists existed in two slightly different versions; a Committee Version and a Public Version. Beginning with Revision 14 the two versions were combined into a single version.

This document includes [bracketed italicized notes] as a reminder to the LWG of current progress on issues. Such notes are strictly unofficial and should be read with caution as they may be incomplete or incorrect. Be aware that LWG support for a particular resolution can quickly change if new viewpoints or killer examples are presented in subsequent discussions.

For the most current official version of this document see http://www.open-std.org/jtc1/sc22/wg21/. Requests for further information about this document should include the document number above, reference ISO/IEC 14882:1998(E), and be submitted to Information Technology Industry Council (ITI), 1250 Eye Street NW, Washington, DC 20005.

Public information as to how to obtain a copy of the C++ Standard, join the standards committee, submit an issue, or comment on an issue can be found in the comp.std.c++ FAQ. Public discussion of C++ Standard related issues occurs on news:comp.std.c++.

For committee members, files available on the committee's private web site include the HTML version of the Standard itself. HTML hyperlinks from this issues list to those files will only work for committee members who have downloaded them into the same disk directory as the issues list files.

Revision History

Issue Status

New - The issue has not yet been reviewed by the LWG. Any Proposed Resolution is purely a suggestion from the issue submitter, and should not be construed as the view of LWG.

Open - The LWG has discussed the issue but is not yet ready to move the issue forward. There are several possible reasons for open status:

A Proposed Resolution for an open issue is still not be construed as the view of LWG. Comments on the current state of discussions are often given at the end of open issues in an italic font. Such comments are for information only and should not be given undue importance.

Dup - The LWG has reached consensus that the issue is a duplicate of another issue, and will not be further dealt with. A Rationale identifies the duplicated issue's issue number.

NAD - The LWG has reached consensus that the issue is not a defect in the Standard, and the issue is ready to forward to the full committee as a proposed record of response. A Rationale discusses the LWG's reasoning.

Review - Exact wording of a Proposed Resolution is now available for review on an issue for which the LWG previously reached informal consensus.

Tentatively Ready - The issue has been reviewed online, but not in a meeting, and some support has been formed for the proposed resolution. Tentatively Ready issues may be moved to Ready and forwarded to full committee within the same meeting. Unlike Ready issues they will be reviewed in subcommittee prior to forwarding to full committee.

Ready - The LWG has reached consensus that the issue is a defect in the Standard, the Proposed Resolution is correct, and the issue is ready to forward to the full committee for further action as a Defect Report (DR).

DR - (Defect Report) - The full J16 committee has voted to forward the issue to the Project Editor to be processed as a Potential Defect Report. The Project Editor reviews the issue, and then forwards it to the WG21 Convenor, who returns it to the full committee for final disposition. This issues list accords the status of DR to all these Defect Reports regardless of where they are in that process.

TC - (Technical Corrigenda) - The full WG21 committee has voted to accept the Defect Report's Proposed Resolution as a Technical Corrigenda. Action on this issue is thus complete and no further action is possible under ISO rules.

WP - (Working Paper) - The proposed resolution has not been accepted as a Technical Corrigendum, but the full WG21 committee has voted to apply the Defect Report's Proposed Resolution to the working paper.

RR - (Record of Response) - The full WG21 committee has determined that this issue is not a defect in the Standard. Action on this issue is thus complete and no further action is possible under ISO rules.

Future - In addition to the regular status, the LWG believes that this issue should be revisited at the next revision of the standard. It is usually paired with NAD.

Issues are always given the status of New when they first appear on the issues list. They may progress to Open or Review while the LWG is actively working on them. When the LWG has reached consensus on the disposition of an issue, the status will then change to Dup, NAD, or Ready as appropriate. Once the full J16 committee votes to forward Ready issues to the Project Editor, they are given the status of Defect Report ( DR). These in turn may become the basis for Technical Corrigenda (TC), or are closed without action other than a Record of Response (RR ). The intent of this LWG process is that only issues which are truly defects in the Standard move to the formal ISO DR status.

Active Issues


23. Num_get overflow result

Section: 22.2.2.1.2 [lib.facet.num.get.virtuals]  Status: Open  Submitter: Nathan Myers  Date: 6 Aug 1998

The current description of numeric input does not account for the possibility of overflow. This is an implicit result of changing the description to rely on the definition of scanf() (which fails to report overflow), and conflicts with the documented behavior of traditional and current implementations.

Users expect, when reading a character sequence that results in a value unrepresentable in the specified type, to have an error reported. The standard as written does not permit this.

Further comments from Dietmar:

I don't feel comfortable with the proposed resolution to issue 23: It kind of simplifies the issue to much. Here is what is going on:

Currently, the behavior of numeric overflow is rather counter intuitive and hard to trace, so I will describe it briefly:

Further discussion from Redmond:

The basic problem is that we've defined our behavior, including our error-reporting behavior, in terms of C90. However, C90's method of reporting overflow in scanf is not technically an "input error". The strto_* functions are more precise.

There was general consensus that failbit should be set upon overflow. We considered three options based on this:

  1. Set failbit upon conversion error (including overflow), and don't store any value.
  2. Set failbit upon conversion error, and also set errno to indicated the precise nature of the error.
  3. Set failbit upon conversion error. If the error was due to overflow, store +-numeric_limits<T>::max() as an overflow indication.

Straw poll: (1) 5; (2) 0; (3) 8.

Proposed resolution:

Discussed at Lillehammer. General outline of what we want the solution to look like: we want to say that overflow is an error, and provide a way to distinguish overflow from other kinds of errors. Choose candidate field the same way scanf does, but don't describe the rest of the process in terms of format. If a finite input field is too large (positive or negative) to be represented as a finite value, then set failbit and assign the nearest representable value. Bill will provide wording.


96. Vector<bool> is not a container

Section: 23.2.5 [lib.vector]  Status: Open  Submitter: AFNOR  Date: 7 Oct 1998

vector<bool> is not a container as its reference and pointer types are not references and pointers.

Also it forces everyone to have a space optimization instead of a speed one.

See also: 99-0008 == N1185 Vector<bool> is Nonconforming, Forces Optimization Choice.

Proposed resolution:

[In Santa Cruz the LWG felt that this was Not A Defect.]

[In Dublin many present felt that failure to meet Container requirements was a defect. There was disagreement as to whether or not the optimization requirements constituted a defect.]

[The LWG looked at the following resolutions in some detail:
     * Not A Defect.
     * Add a note explaining that vector<bool> does not meet Container requirements.
     * Remove vector<bool>.
     * Add a new category of container requirements which vector<bool> would meet.
     * Rename vector<bool>.

No alternative had strong, wide-spread, support and every alternative had at least one "over my dead body" response.

There was also mention of a transition scheme something like (1) add vector_bool and deprecate vector<bool> in the next standard. (2) Remove vector<bool> in the following standard.]

[Modifying container requirements to permit returning proxies (thus allowing container requirements conforming vector<bool>) was also discussed.]

[It was also noted that there is a partial but ugly workaround in that vector<bool> may be further specialized with a customer allocator.]

[Kona: Herb Sutter presented his paper J16/99-0035==WG21/N1211, vector<bool>: More Problems, Better Solutions. Much discussion of a two step approach: a) deprecate, b) provide replacement under a new name. LWG straw vote on that: 1-favor, 11-could live with, 2-over my dead body. This resolution was mentioned in the LWG report to the full committee, where several additional committee members indicated over-my-dead-body positions.]

Discussed at Lillehammer. General agreement that we should deprecate vector<bool> and introduce this functionality under a different name, e.g. bit_vector. This might make it possible to remove the vector<bool> specialization in the standard that comes after C++0x. There was also a suggestion that in C++0x we could additional say that it's implementation defined whether vector<bool> refers to the specialization or to the primary template, but there wasn't general agreement that this was a good idea.

We need a paper for the new bit_vector class.


201. Numeric limits terminology wrong

Section: 18.2.1 [lib.limits]  Status: Open  Submitter: Stephen Cleary  Date: 21 Dec 1999

In some places in this section, the terms "fundamental types" and "scalar types" are used when the term "arithmetic types" is intended. The current usage is incorrect because void is a fundamental type and pointers are scalar types, neither of which should have specializations of numeric_limits.

Proposed resolution:

[Lillehammer: it remains true that numeric_limits is using imprecise language. However, none of the proposals for changed wording are clearer. A redesign of numeric_limits is needed, but this is more a task than an open issue.]


233. Insertion hints in associative containers

Section: 23.1.2 [lib.associative.reqmts]  Status: Open  Submitter: Andrew Koenig  Date: 30 Apr 2000

If mm is a multimap and p is an iterator into the multimap, then mm.insert(p, x) inserts x into mm with p as a hint as to where it should go. Table 69 claims that the execution time is amortized constant if the insert winds up taking place adjacent to p, but does not say when, if ever, this is guaranteed to happen. All it says it that p is a hint as to where to insert.

The question is whether there is any guarantee about the relationship between p and the insertion point, and, if so, what it is.

I believe the present state is that there is no guarantee: The user can supply p, and the implementation is allowed to disregard it entirely.

Additional comments from Nathan:
The vote [in Redmond] was on whether to elaborately specify the use of the hint, or to require behavior only if the value could be inserted adjacent to the hint. I would like to ensure that we have a chance to vote for a deterministic treatment: "before, if possible, otherwise after, otherwise anywhere appropriate", as an alternative to the proposed "before or after, if possible, otherwise [...]".

Proposed resolution:

In table 69 "Associative Container Requirements" in 23.1.2 [lib.associative.reqmts], in the row for a.insert(p, t), change

iterator p is a hint pointing to where the insert should start to search.

to

insertion adjacent to iterator p is preferred if more than one insertion point is valid.

and change

logarithmic in general, but amortized constant if t is inserted right after p.

to

logarithmic in general, but amortized constant if t is inserted adjacent to iterator p.

[Toronto: there was general agreement that this is a real defect: when inserting an element x into a multiset that already contains several copies of x, there is no way to know whether the hint will be used. The proposed resolution was that the new element should always be inserted as close to the hint as possible. So, for example, if there is a subsequence of equivalent values, then providing a.begin() as the hint means that the new element should be inserted before the subsequence even if a.begin() is far away. JC van Winkel supplied precise wording for this proposed resolution, and also for an alternative resolution in which hints are only used when they are adjacent to the insertion point.]

[Copenhagen: the LWG agreed to the original proposed resolution, in which an insertion hint would be used even when it is far from the insertion point. This was contingent on seeing a reference implementation showing that it is possible to implement this requirement without loss of efficiency. John Potter provided such a reference implementation.]

[Redmond: The LWG was reluctant to adopt the proposal that emerged from Copenhagen: it seemed excessively complicated, and went beyond fixing the defect that we identified in Toronto. PJP provided the new wording described in this issue. Nathan agrees that we shouldn't adopt the more detailed semantics, and notes: "we know that you can do it efficiently enough with a red-black tree, but there are other (perhaps better) balanced tree techniques that might differ enough to make the detailed semantics hard to satisfy."]

[Curaçao: Nathan should give us the alternative wording he suggests so the LWG can decide between the two options.]

[Lillehammer: The LWG previously rejected the more detailed semantics, because it seemed more loike a new feature than like defect fixing. We're now more sympathetic to it, but we (especially Bill) are still worried about performance. N1780 describes a naive algorithm, but it's not clear whether there is a non-naive implementation. Is it possible to implement this as efficently as the current version of insert?]

[Post Lillehammer: N1780 updated in post meeting mailing with feedback from Lillehammer with more information regarding performance. ]


254. Exception types in clause 19 are constructed from std::string

Section: 19.1 [lib.std.exceptions]  Status: Open  Submitter: Dave Abrahams  Date: 01 Aug 2000

Many of the standard exception types which implementations are required to throw are constructed with a const std::string& parameter. For example:

     19.1.5  Class out_of_range                          [lib.out.of.range]
     namespace std {
       class out_of_range : public logic_error {
       public:
         explicit out_of_range(const string& what_arg);
       };
     }

   1 The class out_of_range defines the type of objects  thrown  as  excep-
     tions to report an argument value not in its expected range.

     out_of_range(const string& what_arg);

     Effects:
       Constructs an object of class out_of_range.
     Postcondition:
       strcmp(what(), what_arg.c_str()) == 0.

There are at least two problems with this:

  1. A program which is low on memory may end up throwing std::bad_alloc instead of out_of_range because memory runs out while constructing the exception object.
  2. An obvious implementation which stores a std::string data member may end up invoking terminate() during exception unwinding because the exception object allocates memory (or rather fails to) as it is being copied.

There may be no cure for (1) other than changing the interface to out_of_range, though one could reasonably argue that (1) is not a defect. Personally I don't care that much if out-of-memory is reported when I only have 20 bytes left, in the case when out_of_range would have been reported. People who use exception-specifications might care a lot, though.

There is a cure for (2), but it isn't completely obvious. I think a note for implementors should be made in the standard. Avoiding possible termination in this case shouldn't be left up to chance. The cure is to use a reference-counted "string" implementation in the exception object. I am not necessarily referring to a std::string here; any simple reference-counting scheme for a NTBS would do.

Further discussion, in email:

...I'm not so concerned about (1). After all, a library implementation can add const char* constructors as an extension, and users don't need to avail themselves of the standard exceptions, though this is a lame position to be forced into. FWIW, std::exception and std::bad_alloc don't require a temporary basic_string.

...I don't think the fixed-size buffer is a solution to the problem, strictly speaking, because you can't satisfy the postcondition
  strcmp(what(), what_arg.c_str()) == 0
For all values of what_arg (i.e. very long values). That means that the only truly conforming solution requires a dynamic allocation.

Further discussion, from Redmond:

The most important progress we made at the Redmond meeting was realizing that there are two separable issues here: the const string& constructor, and the copy constructor. If a user writes something like throw std::out_of_range("foo"), the const string& constructor is invoked before anything gets thrown. The copy constructor is potentially invoked during stack unwinding.

The copy constructor is a more serious problem, becuase failure during stack unwinding invokes terminate. The copy constructor must be nothrow. Curaçao: Howard thinks this requirement may already be present.

The fundamental problem is that it's difficult to get the nothrow requirement to work well with the requirement that the exception objects store a string of unbounded size, particularly if you also try to make the const string& constructor nothrow. Options discussed include:

(Not all of these options are mutually exclusive.)

Proposed resolution:

Rationale:

Throwing a bad_alloc while trying to construct a message for another exception-derived class is not necessarily a bad thing. And the bad_alloc constructor already has a no throw spec on it (18.4.2.1).

Future:

All involved would like to see const char* constructors added, but this should probably be done for C++0X as opposed to a DR.

I believe the no throw specs currently decorating these functions could be improved by some kind of static no throw spec checking mechanism (in a future C++ language). As they stand, the copy constructors might fail via a call to unexpected. I think what is intended here is that the copy constructors can't fail.

[Pre-Sydney: reopened at the request of Howard Hinnant. Post-Redmond: James Kanze noticed that the copy constructors of exception-derived classes do not have nothrow clauses. Those classes have no copy constructors declared, meaning the compiler-generated implicit copy constructors are used, and those compiler-generated constructors might in principle throw anything.]


255. Why do basic_streambuf<>::pbump() and gbump() take an int?

Section: 27.5.2 [lib.streambuf]  Status: Open  Submitter: Martin Sebor  Date: 12 Aug 2000

The basic_streambuf members gbump() and pbump() are specified to take an int argument. This requirement prevents the functions from effectively manipulating buffers larger than std::numeric_limits<int>::max() characters. It also makes the common use case for these functions somewhat difficult as many compilers will issue a warning when an argument of type larger than int (such as ptrdiff_t on LLP64 architectures) is passed to either of the function. Since it's often the result of the subtraction of two pointers that is passed to the functions, a cast is necessary to silence such warnings. Finally, the usage of a native type in the functions signatures is inconsistent with other member functions (such as sgetn() and sputn()) that manipulate the underlying character buffer. Those functions take a streamsize argument.

Proposed resolution:

Change the signatures of these functions in the synopsis of template class basic_streambuf (27.5.2) and in their descriptions (27.5.2.3.1, p4 and 27.5.2.3.2, p4) to take a streamsize argument.

Although this change has the potential of changing the ABI of the library, the change will affect only platforms where int is different than the definition of streamsize. However, since both functions are typically inline (they are on all known implementations), even on such platforms the change will not affect any user code unless it explicitly relies on the existing type of the functions (e.g., by taking their address). Such a possibility is IMO quite remote.

Alternate Suggestion from Howard Hinnant, c++std-lib-7780:

This is something of a nit, but I'm wondering if streamoff wouldn't be a better choice than streamsize. The argument to pbump and gbump MUST be signed. But the standard has this to say about streamsize (27.4.1/2/Footnote):

[Footnote: streamsize is used in most places where ISO C would use size_t. Most of the uses of streamsize could use size_t, except for the strstreambuf constructors, which require negative values. It should probably be the signed type corresponding to size_t (which is what Posix.2 calls ssize_t). --- end footnote]

This seems a little weak for the argument to pbump and gbump. Should we ever really get rid of strstream, this footnote might go with it, along with the reason to make streamsize signed.

Rationale:

The LWG believes this change is too big for now. We may wish to reconsider this for a future revision of the standard. One possibility is overloading pbump, rather than changing the signature.

[ [2006-05-04: Reopened at the request of Chris (Krzysztof ?elechowski)] ]


258. Missing allocator requirement

Section: 20.1.5 [lib.default.con.req]  Status: Open  Submitter: Matt Austern  Date: 22 Aug 2000

From lib-7752:

I've been assuming (and probably everyone else has been assuming) that allocator instances have a particular property, and I don't think that property can be deduced from anything in Table 32.

I think we have to assume that allocator type conversion is a homomorphism. That is, if x1 and x2 are of type X, where X::value_type is T, and if type Y is X::template rebind<U>::other, then Y(x1) == Y(x2) if and only if x1 == x2.

Further discussion: Howard Hinnant writes, in lib-7757:

I think I can prove that this is not provable by Table 32. And I agree it needs to be true except for the "and only if". If x1 != x2, I see no reason why it can't be true that Y(x1) == Y(x2). Admittedly I can't think of a practical instance where this would happen, or be valuable. But I also don't see a need to add that extra restriction. I think we only need:

if (x1 == x2) then Y(x1) == Y(x2)

If we decide that == on allocators is transitive, then I think I can prove the above. But I don't think == is necessarily transitive on allocators. That is:

Given x1 == x2 and x2 == x3, this does not mean x1 == x3.

Example:

x1 can deallocate pointers from: x1, x2, x3
x2 can deallocate pointers from: x1, x2, x4
x3 can deallocate pointers from: x1, x3
x4 can deallocate pointers from: x2, x4

x1 == x2, and x2 == x4, but x1 != x4

Proposed resolution:

[Toronto: LWG members offered multiple opinions. One opinion is that it should not be required that x1 == x2 implies Y(x1) == Y(x2), and that it should not even be required that X(x1) == x1. Another opinion is that the second line from the bottom in table 32 already implies the desired property. This issue should be considered in light of other issues related to allocator instances.]


290. Requirements to for_each and its function object

Section: 25.1.1 [lib.alg.foreach]  Status: Open  Submitter: Angelika Langer  Date: 03 Jan 2001

The specification of the for_each algorithm does not have a "Requires" section, which means that there are no restrictions imposed on the function object whatsoever. In essence it means that I can provide any function object with arbitrary side effects and I can still expect a predictable result. In particular I can expect that the function object is applied exactly last - first times, which is promised in the "Complexity" section.

I don't see how any implementation can give such a guarantee without imposing requirements on the function object.

Just as an example: consider a function object that removes elements from the input sequence. In that case, what does the complexity guarantee (applies f exactly last - first times) mean?

One can argue that this is obviously a nonsensical application and a theoretical case, which unfortunately it isn't. I have seen programmers shooting themselves in the foot this way, and they did not understand that there are restrictions even if the description of the algorithm does not say so.

Proposed resolution:

[Lillehammer: This is more general than for_each. We don't want the function object in transform invalidiating iterators either. There should be a note somewhere in clause 17 (17, not 25) saying that user code operating on a range may not invalidate iterators unless otherwise specified. Bill will provide wording.]


299. Incorrect return types for iterator dereference

Section: 24.1.4 [lib.bidirectional.iterators], 24.1.5 [lib.random.access.iterators]  Status: Open  Submitter: John Potter  Date: 22 Jan 2001

In section 24.1.4 [lib.bidirectional.iterators], Table 75 gives the return type of *r-- as convertible to T. This is not consistent with Table 74 which gives the return type of *r++ as T&. *r++ = t is valid while *r-- = t is invalid.

In section 24.1.5 [lib.random.access.iterators], Table 76 gives the return type of a[n] as convertible to T. This is not consistent with the semantics of *(a + n) which returns T& by Table 74. *(a + n) = t is valid while a[n] = t is invalid.

Discussion from the Copenhagen meeting: the first part is uncontroversial. The second part, operator[] for Random Access Iterators, requires more thought. There are reasonable arguments on both sides. Return by value from operator[] enables some potentially useful iterators, e.g. a random access "iota iterator" (a.k.a "counting iterator" or "int iterator"). There isn't any obvious way to do this with return-by-reference, since the reference would be to a temporary. On the other hand, reverse_iterator takes an arbitrary Random Access Iterator as template argument, and its operator[] returns by reference. If we decided that the return type in Table 76 was correct, we would have to change reverse_iterator. This change would probably affect user code.

History: the contradiction between reverse_iterator and the Random Access Iterator requirements has been present from an early stage. In both the STL proposal adopted by the committee (N0527==94-0140) and the STL technical report (HPL-95-11 (R.1), by Stepanov and Lee), the Random Access Iterator requirements say that operator[]'s return value is "convertible to T". In N0527 reverse_iterator's operator[] returns by value, but in HPL-95-11 (R.1), and in the STL implementation that HP released to the public, reverse_iterator's operator[] returns by reference. In 1995, the standard was amended to reflect the contents of HPL-95-11 (R.1). The original intent for operator[] is unclear.

In the long term it may be desirable to add more fine-grained iterator requirements, so that access method and traversal strategy can be decoupled. (See "Improved Iterator Categories and Requirements", N1297 = 01-0011, by Jeremy Siek.) Any decisions about issue 299 should keep this possibility in mind.

Further discussion: I propose a compromise between John Potter's resolution, which requires T& as the return type of a[n], and the current wording, which requires convertible to T. The compromise is to keep the convertible to T for the return type of the expression a[n], but to also add a[n] = t as a valid expression. This compromise "saves" the common case uses of random access iterators, while at the same time allowing iterators such as counting iterator and caching file iterators to remain random access iterators (iterators where the lifetime of the object returned by operator*() is tied to the lifetime of the iterator).

Note that the compromise resolution necessitates a change to reverse_iterator. It would need to use a proxy to support a[n] = t.

Note also there is one kind of mutable random access iterator that will no longer meet the new requirements. Currently, iterators that return an r-value from operator[] meet the requirements for a mutable random access iterartor, even though the expression a[n] = t will only modify a temporary that goes away. With this proposed resolution, a[n] = t will be required to have the same operational semantics as *(a + n) = t.

Proposed resolution:

In section 24.1.4 [lib.bidirectdional.iterators], change the return type in table 75 from "convertible to T" to T&.

In section 24.1.5 [lib.random.access.iterators], change the operational semantics for a[n] to " the r-value of a[n] is equivalent to the r-value of *(a + n)". Add a new row in the table for the expression a[n] = t with a return type of convertible to T and operational semantics of *(a + n) = t.

[Lillehammer: Real problem, but should be addressed as part of iterator redesign]


309. Does sentry catch exceptions?

Section: 27.6 [lib.iostream.format]  Status: Open  Submitter: Martin Sebor  Date: 19 Mar 2001

The descriptions of the constructors of basic_istream<>::sentry (27.6.1.1.2 [lib.istream::sentry]) and basic_ostream<>::sentry (27.6.2.3 [lib.ostream::sentry]) do not explain what the functions do in case an exception is thrown while they execute. Some current implementations allow all exceptions to propagate, others catch them and set ios_base::badbit instead, still others catch some but let others propagate.

The text also mentions that the functions may call setstate(failbit) (without actually saying on what object, but presumably the stream argument is meant). That may have been fine for basic_istream<>::sentry prior to issue 195, since the function performs an input operation which may fail. However, issue 195 amends 27.6.1.1.2 [lib.istream::sentry], p2 to clarify that the function should actually call setstate(failbit | eofbit), so the sentence in p3 is redundant or even somewhat contradictory.

The same sentence that appears in 27.6.2.3 [lib.ostream::sentry], p3 doesn't seem to be very meaningful for basic_istream<>::sentry which performs no input. It is actually rather misleading since it would appear to guide library implementers to calling setstate(failbit) when os.tie()->flush(), the only called function, throws an exception (typically, it's badbit that's set in response to such an event).

Additional comments from Martin, who isn't comfortable with the current proposed resolution (see c++std-lib-11530)

The istream::sentry ctor says nothing about how the function deals with exemptions (27.6.1.1.2, p1 says that the class is responsible for doing "exception safe"(*) prefix and suffix operations but it doesn't explain what level of exception safety the class promises to provide). The mockup example of a "typical implementation of the sentry ctor" given in 27.6.1.1.2, p6, removed in ISO/IEC 14882:2003, doesn't show exception handling, either. Since the ctor is not classified as a formatted or unformatted input function, the text in 27.6.1.1, p1 through p4 does not apply. All this would seem to suggest that the sentry ctor should not catch or in any way handle exceptions thrown from any functions it may call. Thus, the typical implementation of an istream extractor may look something like [1].

The problem with [1] is that while it correctly sets ios::badbit if an exception is thrown from one of the functions called from the sentry ctor, if the sentry ctor reaches EOF while extracting whitespace from a stream that has eofbit or failbit set in exceptions(), it will cause an ios::failure to be thrown, which will in turn cause the extractor to set ios::badbit.

The only straightforward way to prevent this behavior is to move the definition of the sentry object in the extractor above the try block (as suggested by the example in 22.2.8, p9 and also indirectly supported by 27.6.1.3, p1). See [2]. But such an implementation will allow exceptions thrown from functions called from the ctor to freely propagate to the caller regardless of the setting of ios::badbit in the stream object's exceptions().

So since neither [1] nor [2] behaves as expected, the only possible solution is to have the sentry ctor catch exceptions thrown from called functions, set badbit, and propagate those exceptions if badbit is also set in exceptions(). (Another solution exists that deals with both kinds of sentries, but the code is non-obvious and cumbersome -- see [3].)

Please note that, as the issue points out, current libraries do not behave consistently, suggesting that implementors are not quite clear on the exception handling in istream::sentry, despite the fact that some LWG members might feel otherwise. (As documented by the parenthetical comment here: http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/papers/2003/n1480.html#309)

Also please note that those LWG members who in Copenhagen felt that "a sentry's constructor should not catch exceptions, because sentries should only be used within (un)formatted input functions and that exception handling is the responsibility of those functions, not of the sentries," as noted here http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/papers/2001/n1310.html#309 would in effect be either arguing for the behavior described in [1] or for extractors implemented along the lines of [3].

The original proposed resolution (Revision 25 of the issues list) clarifies the role of the sentry ctor WRT exception handling by making it clear that extractors (both library or user-defined) should be implemented along the lines of [2] (as opposed to [1]) and that no exception thrown from the callees should propagate out of either function unless badbit is also set in exceptions().

[1] Extractor that catches exceptions thrown from sentry:

struct S { long i; };

istream& operator>> (istream &strm, S &s)
{
    ios::iostate err = ios::goodbit;
    try {
        const istream::sentry guard (strm, false);
        if (guard) {
            use_facet<num_get<char> >(strm.getloc ())
                .get (istreambuf_iterator<char>(strm),
                      istreambuf_iterator<char>(),
                      strm, err, s.i);
        }
    }
    catch (...) {
        bool rethrow;
        try {
            strm.setstate (ios::badbit);
            rethrow = false;
        }
        catch (...) {
            rethrow = true;
        }
        if (rethrow)
            throw;
    }
    if (err)
        strm.setstate (err);
    return strm;
}

[2] Extractor that propagates exceptions thrown from sentry:

istream& operator>> (istream &strm, S &s)
{
    istream::sentry guard (strm, false);
    if (guard) {
        ios::iostate err = ios::goodbit;
        try {
            use_facet<num_get<char> >(strm.getloc ())
                .get (istreambuf_iterator<char>(strm),
                      istreambuf_iterator<char>(),
                      strm, err, s.i);
        }
        catch (...) {
            bool rethrow;
            try {
                strm.setstate (ios::badbit);
                rethrow = false;
            }
            catch (...) {
                rethrow = true;
            }
            if (rethrow)
                throw;
        }
        if (err)
            strm.setstate (err);
    }
    return strm;
}

[3] Extractor that catches exceptions thrown from sentry but doesn't set badbit if the exception was thrown as a result of a call to strm.clear().

istream& operator>> (istream &strm, S &s)
{
    const ios::iostate state = strm.rdstate ();
    const ios::iostate except = strm.exceptions ();
    ios::iostate err = std::ios::goodbit;
    bool thrown = true;
    try {
        const istream::sentry guard (strm, false);
        thrown = false;
        if (guard) {
            use_facet<num_get<char> >(strm.getloc ())
                .get (istreambuf_iterator<char>(strm),
                      istreambuf_iterator<char>(),
                      strm, err, s.i);
        }
    }
    catch (...) {
        if (thrown && state & except)
            throw;
        try {
            strm.setstate (ios::badbit);
            thrown = false;
        }
        catch (...) {
            thrown = true;
        }
        if (thrown)
            throw;
    }
    if (err)
        strm.setstate (err);

    return strm;
}

[Pre-Berlin] Reopened at the request of Paolo Carlini and Steve Clamage.

Proposed resolution:

Rationale:

The LWG agrees there is minor variation between implementations, but believes that it doesn't matter. This is a rarely used corner case. There is no evidence that this has any commercial importance or that it causes actual portability problems for customers trying to write code that runs on multiple implementations.


342. seek and eofbit

Section: 27.6.1.3 [lib.istream.unformatted]  Status: Open  Submitter: Howard Hinnant  Date: 09 Oct 2001

I think we have a defect.

According to lwg issue 60 which is now a dr, the description of seekg in 27.6.1.3 [lib.istream.unformatted] paragraph 38 now looks like:

Behaves as an unformatted input function (as described in 27.6.1.3, paragraph 1), except that it does not count the number of characters extracted and does not affect the value returned by subsequent calls to gcount(). After constructing a sentry object, if fail() != true, executes rdbuf()­>pubseekpos( pos).

And according to lwg issue 243 which is also now a dr, 27.6.1.3, paragraph 1 looks like:

Each unformatted input function begins execution by constructing an object of class sentry with the default argument noskipws (second) argument true. If the sentry object returns true, when converted to a value of type bool, the function endeavors to obtain the requested input. Otherwise, if the sentry constructor exits by throwing an exception or if the sentry object returns false, when converted to a value of type bool, the function returns without attempting to obtain any input. In either case the number of extracted characters is set to 0; unformatted input functions taking a character array of non-zero size as an argument shall also store a null character (using charT()) in the first location of the array. If an exception is thrown during input then ios::badbit is turned on in *this'ss error state. If (exception()&badbit)!= 0 then the exception is rethrown. It also counts the number of characters extracted. If no exception has been thrown it ends by storing the count in a member object and returning the value specified. In any event the sentry object is destroyed before leaving the unformatted input function.

And finally 27.6.1.1.2/5 says this about sentry:

If, after any preparation is completed, is.good() is true, ok_ != false otherwise, ok_ == false.

So although the seekg paragraph says that the operation proceeds if !fail(), the behavior of unformatted functions says the operation proceeds only if good(). The two statements are contradictory when only eofbit is set. I don't think the current text is clear which condition should be respected.

Further discussion from Redmond:

PJP: It doesn't seem quite right to say that seekg is "unformatted". That makes specific claims about sentry that aren't quite appropriate for seeking, which has less fragile failure modes than actual input. If we do really mean that it's unformatted input, it should behave the same way as other unformatted input. On the other hand, "principle of least surprise" is that seeking from EOF ought to be OK.

Pre-Berlin: Paolo points out several problems with the proposed resolution in Ready state:

Proposed resolution:

Change 27.6.1.3 [lib.istream.unformatted] to:

Behaves as an unformatted input function (as described in 27.6.1.3, paragraph 1), except that it does not count the number of characters extracted, does not affect the value returned by subsequent calls to gcount(), and does not examine the value returned by the sentry object. After constructing a sentry object, if fail() != true, executes rdbuf()->pubseekpos(pos). In case of success, the function calls clear(). In case of failure, the function calls setstate(failbit) (which may throw ios_base::failure).

[Lillehammer: Matt provided wording.]

Rationale:

In C, fseek does clear EOF. This is probably what most users would expect. We agree that having eofbit set should not deter a seek, and that a successful seek should clear eofbit. Note that fail() is true only if failbit or badbit is set, so using !fail(), rather than good(), satisfies this goal.


382. codecvt do_in/out result

Section: 22.2.1.5 [lib.locale.codecvt.byname]  Status: Open  Submitter: Martin Sebor  Date: 30 Aug 2002

It seems that the descriptions of codecvt do_in() and do_out() leave sufficient room for interpretation so that two implementations of codecvt may not work correctly with the same filebuf. Specifically, the following seems less than adequately specified:

  1. the conditions under which the functions terminate
  2. precisely when the functions return ok
  3. precisely when the functions return partial
  4. the full set of conditions when the functions return error
  1. 22.2.1.5.2, p2 says this about the effects of the function: ...Stops if it encounters a character it cannot convert... This assumes that there *is* a character to convert. What happens when there is a sequence that doesn't form a valid source character, such as an unassigned or invalid UNICODE character, or a sequence that cannot possibly form a character (e.g., the sequence "\xc0\xff" in UTF-8)?
  2. Table 53 says that the function returns codecvt_base::ok to indicate that the function(s) "completed the conversion." Suppose that the source sequence is "\xc0\x80" in UTF-8, with from pointing to '\xc0' and (from_end==from + 1). It is not clear whether the return value should be ok or partial (see below).
  3. Table 53 says that the function returns codecvt_base::partial if "not all source characters converted." With the from pointers set up the same way as above, it is not clear whether the return value should be partial or ok (see above).
  4. Table 53, in the row describing the meaning of error mistakenly refers to a "from_type" character, without the symbol from_type having been defined. Most likely, the word "source" character is intended, although that is not sufficient. The functions may also fail when they encounter an invalid source sequence that cannot possibly form a valid source character (e.g., as explained in bullet 1 above).

Finally, the conditions described at the end of 22.2.1.5.2, p4 don't seem to be possible:

"A return value of partial, if (from_next == from_end), indicates that either the destination sequence has not absorbed all the available destination elements, or that additional source elements are needed before another destination element can be produced."

If the value is partial, it's not clear to me that (from_next ==from_end) could ever hold if there isn't enough room in the destination buffer. In order for (from_next==from_end) to hold, all characters in that range must have been successfully converted (according to 22.2.1.5.2, p2) and since there are no further source characters to convert, no more room in the destination buffer can be needed.

It's also not clear to me that (from_next==from_end) could ever hold if additional source elements are needed to produce another destination character (not element as incorrectly stated in the text). partial is returned if "not all source characters have been converted" according to Table 53, which also implies that (from_next==from) does NOT hold.

Could it be that the intended qualifying condition was actually (from_next != from_end), i.e., that the sentence was supposed to read

"A return value of partial, if (from_next != from_end),..."

which would make perfect sense, since, as far as I understand it, partial can only occur if (from_next != from_end)?

Proposed resolution:

[Lillehammer: Defer for the moment, but this really needs to be fixed. Right now, the description of codecvt is too vague for it to be a useful contract between providers and clients of codecvt facets. (Note that both vendors and users can be both providers and clients of codecvt facets.) The major philosophical issue is whether the standard should only describe mappings that take a single wide character to multiple narrow characters (and vice versa), or whether it should describe fully general N-to-M conversions. When the original standard was written only the former was contemplated, but today, in light of the popularity of utf8 and utf16, that doesn't seem sufficient for C++0x. Bill supports general N-to-M conversions; we need to make sure Martin and Howard agree.]


385. Does call by value imply the CopyConstructible requirement?

Section: 17 [lib.library]  Status: Open  Submitter: Matt Austern  Date: 23 Oct 2002

Many function templates have parameters that are passed by value; a typical example is find_if's pred parameter in 25.1.2 [lib.alg.find]. Are the corresponding template parameters (Predicate in this case) implicitly required to be CopyConstructible, or does that need to be spelled out explicitly?

This isn't quite as silly a question as it might seem to be at first sight. If you call find_if in such a way that template argument deduction applies, then of course you'll get call by value and you need to provide a copy constructor. If you explicitly provide the template arguments, however, you can force call by reference by writing something like find_if<my_iterator, my_predicate&>. The question is whether implementation are required to accept this, or whether this is ill-formed because my_predicate& is not CopyConstructible.

The scope of this problem, if it is a problem, is unknown. Function object arguments to generic algorithms in clauses 25 [lib.algorithms] and 26 [lib.numerics] are obvious examples. A review of the whole library is necessary.

Proposed resolution:

[ This is really two issues. First, predicates are typically passed by value but we don't say they must be Copy Constructible. They should be. Second: is specialization allowed to transform value arguments into references? References aren't copy constructible, so this should not be allowed. ]


387. std::complex over-encapsulated

Section: 26.3 [lib.complex.numbers]  Status: Open  Submitter: Gabriel Dos Reis  Date: 8 Nov 2002

The absence of explicit description of std::complex<T> layout makes it imposible to reuse existing software developed in traditional languages like Fortran or C with unambigous and commonly accepted layout assumptions. There ought to be a way for practitioners to predict with confidence the layout of std::complex<T> whenever T is a numerical datatype. The absence of ways to access individual parts of a std::complex<T> object as lvalues unduly promotes severe pessimizations. For example, the only way to change, independently, the real and imaginary parts is to write something like

complex<T> z;
// ...
// set the real part to r
z = complex<T>(r, z.imag());
// ...
// set the imaginary part to i
z = complex<T>(z.real(), i);

At this point, it seems appropriate to recall that a complex number is, in effect, just a pair of numbers with no particular invariant to maintain. Existing practice in numerical computations has it that a complex number datatype is usually represented by Cartesian coordinates. Therefore the over-encapsulation put in the specification of std::complex<> is not justified.

Proposed resolution:

Add the following requirements to 26.3 [lib.complex.numbers] as 26.3/4:

If z is an lvalue expression of type cv std::complex<T> then

Moreover, if a is an expression of pointer type cv complex<T>* and the expression a[i] is well-defined for an integer expression i then:

In the header synopsis in 26.3.1 [lib.complex.synopsis], replace

  template<class T> T real(const complex<T>&);
  template<class T> T imag(const complex<T>&);

with

  template<class T> const T& real(const complex<T>&);
  template<class T>       T& real(      complex<T>&);
  template<class T> const T& imag(const complex<T>&);
  template<class T>       T& imag(      complex<T>&);

In 26.3.7 [lib.complex.value.ops] paragraph 1, change

  template<class T> T real(const complex<T>&);

to

  template<class T> const T& real(const complex<T>&);
  template<class T>       T& real(      complex<T>&);

and change the Returns clause to "Returns: The real part of x

.

In 26.3.7 [lib.complex.value.ops] paragraph 2, change

  template<class T> T imag(const complex<T>&);

to

  template<class T> const T& imag(const complex<T>&);
  template<class T>       T& imag(      complex<T>&);

and change the Returns clause to "Returns: The imaginary part of x

.

[Kona: The layout guarantee is absolutely necessary for C compatibility. However, there was disagreement about the other part of this proposal: retrieving elements of the complex number as lvalues. An alternative: continue to have real() and imag() return rvalues, but add set_real() and set_imag(). Straw poll: return lvalues - 2, add setter functions - 5. Related issue: do we want reinterpret_cast as the interface for converting a complex to an array of two reals, or do we want to provide a more explicit way of doing it? Howard will try to resolve this issue for the next meeting.]

[pre-Sydney: Howard summarized the options in n1589.]

Rationale:

The LWG believes that C99 compatibility would be enough justification for this change even without other considerations. All existing implementations already have the layout proposed here.


394. behavior of formatted output on failure

Section: 27.6.2.5.1 [lib.ostream.formatted.reqmts]  Status: Open  Submitter: Martin Sebor  Date: 27 Dec 2002

There is a contradiction in Formatted output about what bit is supposed to be set if the formatting fails. On sentence says it's badbit and another that it's failbit.

27.6.2.5.1, p1 says in the Common Requirements on Formatted output functions:

     ... If the generation fails, then the formatted output function
     does setstate(ios::failbit), which might throw an exception.

27.6.2.5.2, p1 goes on to say this about Arithmetic Inserters:

... The formatting conversion occurs as if it performed the following code fragment:

     bool failed =
         use_facet<num_put<charT,ostreambuf_iterator<charT,traits>
         > >
         (getloc()).put(*this, *this, fill(), val). failed();

     ... If failed is true then does setstate(badbit) ...

The original intent of the text, according to Jerry Schwarz (see c++std-lib-10500), is captured in the following paragraph:

In general "badbit" should mean that the stream is unusable because of some underlying failure, such as disk full or socket closure; "failbit" should mean that the requested formatting wasn't possible because of some inconsistency such as negative widths. So typically if you clear badbit and try to output something else you'll fail again, but if you clear failbit and try to output something else you'll succeed.

In the case of the arithmetic inserters, since num_put cannot report failure by any means other than exceptions (in response to which the stream must set badbit, which prevents the kind of recoverable error reporting mentioned above), the only other detectable failure is if the iterator returned from num_put returns true from failed().

Since that can only happen (at least with the required iostream specializations) under such conditions as the underlying failure referred to above (e.g., disk full), setting badbit would seem to be the appropriate response (indeed, it is required in 27.6.2.5.2, p1). It follows that failbit can never be directly set by the arithmetic (it can only be set by the sentry object under some unspecified conditions).

The situation is different for other formatted output functions which can fail as a result of the streambuf functions failing (they may do so by means other than exceptions), and which are then required to set failbit.

The contradiction, then, is that ostream::operator<<(int) will set badbit if the disk is full, while operator<<(ostream&, char) will set failbit under the same conditions. To make the behavior consistent, the Common requirements sections for the Formatted output functions should be changed as proposed below.

Proposed resolution:

[Kona: There's agreement that this is a real issue. What we decided at Kona: 1. An error from the buffer (which can be detected either directly from streambuf's member functions or by examining a streambuf_iterator) should always result in badbit getting set. 2. There should never be a circumstance where failbit gets set. That represents a formatting error, and there are no circumstances under which the output facets are specified as signaling a formatting error. (Even more so for string output that for numeric because there's nothing to format.) If we ever decide to make it possible for formatting errors to exist then the facets can signal the error directly, and that should go in clause 22, not clause 27. 3. The phrase "if generation fails" is unclear and should be eliminated. It's not clear whether it's intended to mean a buffer error (e.g. a full disk), a formatting error, or something else. Most people thought it was supposed to refer to buffer errors; if so, we should say so. Martin will provide wording.]

Rationale:


396. what are characters zero and one

Section: 23.3.5.1 [lib.bitset.cons]  Status: Open  Submitter: Martin Sebor  Date: 5 Jan 2003

23.3.5.1, p6 [lib.bitset.cons] talks about a generic character having the value of 0 or 1 but there is no definition of what that means for charT other than char and wchar_t. And even for those two types, the values 0 and 1 are not actually what is intended -- the values '0' and '1' are. This, along with the converse problem in the description of to_string() in 23.3.5.2, p33, looks like a defect remotely related to DR 303.

http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/lwg-defects.html#303

23.3.5.1:
  -6-  An element of the constructed string has value zero if the
       corresponding character in str, beginning at position pos,
       is 0. Otherwise, the element has the value one.
    
23.3.5.2:
  -33-  Effects: Constructs a string object of the appropriate
        type and initializes it to a string of length N characters.
        Each character is determined by the value of its
        corresponding bit position in *this. Character position N
        ?- 1 corresponds to bit position zero. Subsequent decreasing
        character positions correspond to increasing bit positions.
        Bit value zero becomes the character 0, bit value one becomes
        the character 1.
    

Also note the typo in 23.3.5.1, p6: the object under construction is a bitset, not a string.

Proposed resolution:

Change the constructor's function declaration immediately before 23.3.5.1 [lib.bitset.cons] p3 to:

    template <class charT, class traits, class Allocator>
    explicit
    bitset(const basic_string<charT, traits, Allocator>& str,
           typename basic_string<charT, traits, Allocator>::size_type pos = 0,
           typename basic_string<charT, traits, Allocator>::size_type n =
             basic_string<charT, traits, Allocator>::npos,
           charT zero = charT('0'), charT one = charT('1'))

Change the first two sentences of 23.3.5.1 [lib.bitset.cons] p6 to: "An element of the constructed string has value 0 if the corresponding character in str, beginning at position pos, is zero. Otherwise, the element has the value 1.

Change the text of the second sentence in 23.3.5.1, p5 to read: "The function then throws invalid_argument if any of the rlen characters in str beginning at position pos is other than zero or one. The function uses traits::eq() to compare the character values."

Change the declaration of the to_string member function immediately before 23.3.5.2 [lib.bitset.members] p33 to:

    template <class charT, class traits, class Allocator>
    basic_string<charT, traits, Allocator> 
    to_string(charT zero = charT('0'), charT one = charT('1')) const;

Change the last sentence of 23.3.5.2 [lib.bitset.members] p33 to: "Bit value 0 becomes the character zero, bit value 1 becomes the character one.

Change 23.3.5.3 [lib.bitset.operators] p8 to:

Returns:

  os << x.template to_string<charT,traits,allocator<charT> >(
      use_facet<ctype<charT> >(os.getloc()).widen('0'),
      use_facet<ctype<charT> >(os.getloc()).widen('1'));

Rationale:

There is a real problem here: we need the character values of '0' and '1', and we have no way to get them since strings don't have imbued locales. In principle the "right" solution would be to provide an extra object, either a ctype facet or a full locale, which would be used to widen '0' and '1'. However, there was some discomfort about using such a heavyweight mechanism. The proposed resolution allows those users who care about this issue to get it right.

We fix the inserter to use the new arguments. Note that we already fixed the analogous problem with the extractor in issue 303.


397. ostream::sentry dtor throws exceptions

Section: 27.6.2.3 [lib.ostream::sentry]  Status: Open  Submitter: Martin Sebor  Date: 5 Jan 2003

17.4.4.8, p3 prohibits library dtors from throwing exceptions.

27.6.2.3, p4 says this about the ostream::sentry dtor:

    -4- If ((os.flags() & ios_base::unitbuf) && !uncaught_exception())
        is true, calls os.flush().
    

27.6.2.6, p7 that describes ostream::flush() says:

    -7- If rdbuf() is not a null pointer, calls rdbuf()->pubsync().
        If that function returns ?-1 calls setstate(badbit) (which
        may throw ios_base::failure (27.4.4.3)).
    

That seems like a defect, since both pubsync() and setstate() can throw an exception.

Proposed resolution:

[ The contradiction is real. Clause 17 says destructors may never throw exceptions, and clause 27 specifies a destructor that does throw. In principle we might change either one. We're leaning toward changing clause 17: putting in an "unless otherwise specified" clause, and then putting in a footnote saying the sentry destructor is the only one that can throw. PJP suggests specifying that sentry::~sentry() should internally catch any exceptions it might cause. ]


398. effects of end-of-file on unformatted input functions

Section: 27.6.2.3 [lib.ostream::sentry]  Status: Open  Submitter: Martin Sebor  Date: 5 Jan 2003

While reviewing unformatted input member functions of istream for their behavior when they encounter end-of-file during input I found that the requirements vary, sometimes unexpectedly, and in more than one case even contradict established practice (GNU libstdc++ 3.2, IBM VAC++ 6.0, STLPort 4.5, SunPro 5.3, HP aCC 5.38, Rogue Wave libstd 3.1, and Classic Iostreams).

The following unformatted input member functions set eofbit if they encounter an end-of-file (this is the expected behavior, and also the behavior of all major implementations):

    basic_istream<charT, traits>&
    get (char_type*, streamsize, char_type);
    

Also sets failbit if it fails to extract any characters.

    basic_istream<charT, traits>&
    get (char_type*, streamsize);
    

Also sets failbit if it fails to extract any characters.

    basic_istream<charT, traits>&
    getline (char_type*, streamsize, char_type);
    

Also sets failbit if it fails to extract any characters.

    basic_istream<charT, traits>&
    getline (char_type*, streamsize);
    

Also sets failbit if it fails to extract any characters.

    basic_istream<charT, traits>&
    ignore (int, int_type);
    

    basic_istream<charT, traits>&
    read (char_type*, streamsize);
    

Also sets failbit if it encounters end-of-file.

    streamsize readsome (char_type*, streamsize);
    

The following unformated input member functions set failbit but not eofbit if they encounter an end-of-file (I find this odd since the functions make it impossible to distinguish a general failure from a failure due to end-of-file; the requirement is also in conflict with all major implementation which set both eofbit and failbit):

    int_type get();
    

    basic_istream<charT, traits>&
    get (char_type&);
    

These functions only set failbit of they extract no characters, otherwise they don't set any bits, even on failure (I find this inconsistency quite unexpected; the requirement is also in conflict with all major implementations which set eofbit whenever they encounter end-of-file):

    basic_istream<charT, traits>&
    get (basic_streambuf<charT, traits>&, char_type);
    

    basic_istream<charT, traits>&
    get (basic_streambuf<charT, traits>&);
    

This function sets no bits (all implementations except for STLport and Classic Iostreams set eofbit when they encounter end-of-file):

    int_type peek ();
    

Proposed resolution:

Informally, what we want is a global statement of intent saying that eofbit gets set if we trip across EOF, and then we can take away the specific wording for individual functions. A full review is necessary. The wording currently in the standard is a mishmash, and changing it on an individual basis wouldn't make things better. Dietmar will do this work.


401.  incorrect type casts in table 32 in lib.allocator.requirements

Section: 20.1.5 [lib.default.con.req]  Status: Open  Submitter: Markus Mauhart  Date: 27 Feb 2003

I think that in par2 of 20.1.5 [lib.default.con.req] the last two lines of table 32 contain two incorrect type casts. The lines are ...

  a.construct(p,t)   Effect: new((void*)p) T(t)
  a.destroy(p)       Effect: ((T*)p)?->~T()

.... with the prerequisits coming from the preceding two paragraphs, especially from table 31:

  alloc<T>             a     ;// an allocator for T
  alloc<T>::pointer    p     ;// random access iterator
                              // (may be different from T*)
  alloc<T>::reference  r = *p;// T&
  T const&             t     ;

For that two type casts ("(void*)p" and "(T*)p") to be well-formed this would require then conversions to T* and void* for all alloc<T>::pointer, so it would implicitely introduce extra requirements for alloc<T>::pointer, additionally to the only current requirement (being a random access iterator).

Proposed resolution:

"(void*)p" should be replaced with "(void*)&*p" and that "((T*)p)?->" should be replaced with "(*p)." or with "(&*p)->".

Note: Actually I would prefer to replace "((T*)p)?->dtor_name" with "p?->dtor_name", but AFAICS this is not possible cause of an omission in 13.5.6 [over.ref] (for which I have filed another DR on 29.11.2002).

[Kona: The LWG thinks this is somewhere on the border between Open and NAD. The intend is clear: construct constructs an object at the location p. It's reading too much into the description to think that literally calling new is required. Tweaking this description is low priority until we can do a thorough review of allocators, and, in particular, allocators with non-default pointer types.]


408. Is vector<reverse_iterator<char*> > forbidden?

Section: 24.1 [lib.iterator.requirements]  Status: Open  Submitter: Nathan Myers  Date: 3 June 2003

I've been discussing iterator semantics with Dave Abrahams, and a surprise has popped up. I don't think this has been discussed before.

24.1 [lib.iterator.requirements] says that the only operation that can be performed on "singular" iterator values is to assign a non-singular value to them. (It doesn't say they can be destroyed, and that's probably a defect.) Some implementations have taken this to imply that there is no need to initialize the data member of a reverse_iterator<> in the default constructor. As a result, code like

std::vector<std::reverse_iterator<char*> > v(7); v.reserve(1000);

invokes undefined behavior, because it must default-initialize the vector elements, and then copy them to other storage. Of course many other vector operations on these adapters are also left undefined, and which those are is not reliably deducible from the standard.

I don't think that 24.1 was meant to make standard-library iterator types unsafe. Rather, it was meant to restrict what operations may be performed by functions which take general user- and standard iterators as arguments, so that raw pointers would qualify as iterators. However, this is not clear in the text, others have come to the opposite conclusion.

One question is whether the standard iterator adaptors have defined copy semantics. Another is whether they have defined destructor semantics: is

{ std::vector<std::reverse_iterator<char*> > v(7); }

undefined too?

Note this is not a question of whether algorithms are allowed to rely on copy semantics for arbitrary iterators, just whether the types we actually supply support those operations. I believe the resolution must be expressed in terms of the semantics of the adapter's argument type. It should make clear that, e.g., the reverse_iterator<T> constructor is actually required to execute T(), and so copying is defined if the result of T() is copyable.

Issue 235, which defines reverse_iterator's default constructor more precisely, has some relevance to this issue. However, it is not the whole story.

The issue was whether

reverse_iterator() { }

is allowed, vs.

reverse_iterator() : current() { }

The difference is when T is char*, where the first leaves the member uninitialized, and possibly equal to an existing pointer value, or (on some targets) may result in a hardware trap when copied.

8.5 paragraph 5 seems to make clear that the second is required to satisfy DR 235, at least for non-class Iterator argument types.

But that only takes care of reverse_iterator, and doesn't establish a policy for all iterators. (The reverse iterator adapter was just an example.) In particular, does my function

template <typename Iterator> void f() { std::vector<Iterator> v(7); }

evoke undefined behavior for some conforming iterator definitions? I think it does, now, because vector<> will destroy those singular iterator values, and that's explicitly disallowed.

24.1 shouldn't give blanket permission to copy all singular iterators, because then pointers wouldn't qualify as iterators. However, it should allow copying of that subset of singular iterator values that are default-initialized, and it should explicitly allow destroying any iterator value, singular or not, default-initialized or not.

Related issue: 407

Proposed resolution:

[ We don't want to require all singular iterators to be copyable, because that is not the case for pointers. However, default construction may be a special case. Issue: is it really default construction we want to talk about, or is it something like value initialization? We need to check with core to see whether default constructed pointers are required to be copyable; if not, it would be wrong to impose so strict a requirement for iterators. ]


416. definitions of XXX_MIN and XXX_MAX macros in climits

Section: 18.2.2 [lib.c.limits]  Status: Open  Submitter: Martin Sebor  Date: 18 Sep 2003

Given two overloads of the function foo(), one taking an argument of type int and the other taking a long, which one will the call foo(LONG_MAX) resolve to? The expected answer should be foo(long), but whether that is true depends on the #defintion of the LONG_MAX macro, specifically its type. This issue is about the fact that the type of these macros is not actually required to be the same as the the type each respective limit.
Section 18.2.2 of the C++ Standard does not specify the exact types of the XXX_MIN and XXX_MAX macros #defined in the <climits> and <limits.h> headers such as INT_MAX and LONG_MAX and instead defers to the C standard.
Section 5.2.4.2.1, p1 of the C standard specifies that "The values [of these constants] shall be replaced by constant expressions suitable for use in #if preprocessing directives. Moreover, except for CHAR_BIT and MB_LEN_MAX, the following shall be replaced by expressions that have the same type as would an expression that is an object of the corresponding type converted according to the integer promotions."
The "corresponding type converted according to the integer promotions" for LONG_MAX is, according to 6.4.4.1, p5 of the C standard, the type of long converted to the first of the following set of types that can represent it: int, long int, long long int. So on an implementation where (sizeof(long) == sizeof(int)) this type is actually int, while on an implementation where (sizeof(long) > sizeof(int)) holds this type will be long.
This is not an issue in C since the type of the macro cannot be detected by any conforming C program, but it presents a portability problem in C++ where the actual type is easily detectable by overload resolution.

Proposed resolution:

[Kona: the LWG does not believe this is a defect. The C macro definitions are what they are; we've got a better mechanism, std::numeric_limits, that is specified more precisely than the C limit macros. At most we should add a nonnormative note recommending that users who care about the exact types of limit quantities should use <limits> instead of <climits>.]


417. what does ctype::do_widen() return on failure

Section: 22.2.1.1.2 [lib.locale.ctype.virtuals]  Status: Open  Submitter: Martin Sebor  Date: 18 Sep 2003

The Effects and Returns clauses of the do_widen() member function of the ctype facet fail to specify the behavior of the function on failure. That the function may not be able to simply cast the narrow character argument to the type of the result since doing so may yield the wrong value for some wchar_t encodings. Popular implementations of ctype<wchar_t> that use mbtowc() and UTF-8 as the native encoding (e.g., GNU glibc) will fail when the argument's MSB is set. There is no way for the the rest of locale and iostream to reliably detect this failure.

Proposed resolution:

[Kona: This is a real problem. Widening can fail. It's unclear what the solution should be. Returning WEOF works for the wchar_t specialization, but not in general. One option might be to add a default, like narrow. But that's an incompatible change. Using traits::eof might seem like a good idea, but facets don't have access to traits (a recurring problem). We could have widen throw an exception, but that's a scary option; existing library components aren't written with the assumption that widen can throw.]


418. exceptions thrown during iostream cleanup

Section: 27.4.2.1.6 [lib.ios::Init]  Status: Open  Submitter: Martin Sebor  Date: 18 Sep 2003

The dtor of the ios_base::Init object is supposed to call flush() on the 6 standard iostream objects cout, cerr, clog, wcout, wcerr, and wclog. This call may cause an exception to be thrown.

17.4.4.8, p3 prohibits all library destructors from throwing exceptions.

The question is: What should this dtor do if one or more of these calls to flush() ends up throwing an exception? This can happen quite easily if one of the facets installed in the locale imbued in the iostream object throws.

Proposed resolution:

[Kona: We probably can't do much better than what we've got, so the LWG is leaning toward NAD. At the point where the standard stream objects are being cleaned up, the usual error reporting mechanism are all unavailable. And exception from flush at this point will definitely cause problems. A quality implementation might reasonably swallow the exception, or call abort, or do something even more drastic.]


419. istream extractors not setting failbit if eofbit is already set

Section: 27.6.1.1.2 [lib.istream::sentry]  Status: Open  Submitter: Martin Sebor  Date: 18 Sep 2003

27.6.1.1.2, p2 says that istream::sentry ctor prepares for input if is.good() is true. p4 then goes on to say that the ctor sets the sentry::ok_ member to true if the stream state is good after any preparation. 27.6.1.2.1, p1 then says that a formatted input function endeavors to obtain the requested input if the sentry's operator bool() returns true. Given these requirements, no formatted extractor should ever set failbit if the initial stream rdstate() == eofbit. That is contrary to the behavior of all implementations I tested. The program below prints out eof = 1, fail = 0 eof = 1, fail = 1 on all of them.

#include <sstream>
#include <cstdio>

int main()
{
    std::istringstream strm ("1");

    int i = 0;

    strm >> i;

    std::printf ("eof = %d, fail = %d\n",
                 !!strm.eof (), !!strm.fail ());

    strm >> i;

    std::printf ("eof = %d, fail = %d\n",
                 !!strm.eof (), !!strm.fail ());
}


Comments from Jerry Schwarz (c++std-lib-11373):
Jerry Schwarz wrote:
I don't know where (if anywhere) it says it in the standard, but the formatted extractors are supposed to set failbit if they don't extract any characters. If they didn't then simple loops like
while (cin >> x);
would loop forever.
Further comments from Martin Sebor:
The question is which part of the extraction should prevent this from happening by setting failbit when eofbit is already set. It could either be the sentry object or the extractor. It seems that most implementations have chosen to set failbit in the sentry [...] so that's the text that will need to be corrected.

Pre Berlin: This issue is related to 342. If the sentry sets failbit when it finds eofbit already set, then you can never seek away from the end of stream.

Proposed resolution:

Kona: Possibly NAD. If eofbit is set then good() will return false. We then set ok to false. We believe that the sentry's constructor should always set failbit when ok is false, and we also think the standard already says that. Possibly it could be clearer.


421. is basic_streambuf copy-constructible?

Section: 27.5.2.1 [lib.streambuf.cons]  Status: Open  Submitter: Martin Sebor  Date: 18 Sep 2003

The reflector thread starting with c++std-lib-11346 notes that the class template basic_streambuf, along with basic_stringbuf and basic_filebuf, is copy-constructible but that the semantics of the copy constructors are not defined anywhere. Further, different implementations behave differently in this respect: some prevent copy construction of objects of these types by declaring their copy ctors and assignment operators private, others exhibit undefined behavior, while others still give these operations well-defined semantics.

Note that this problem doesn't seem to be isolated to just the three types mentioned above. A number of other types in the library section of the standard provide a compiler-generated copy ctor and assignment operator yet fail to specify their semantics. It's believed that the only types for which this is actually a problem (i.e. types where the compiler-generated default may be inappropriate and may not have been intended) are locale facets. See issue 439.

Proposed resolution:

27.5.2 [lib.streambuf]: Add into the synopsis, public section, just above the destructor declaration:

basic_streambuf(const basic_streambuf& sb);
basic_streambuf& operator=(const basic_streambuf& sb);

Insert after 27.5.2.1, paragraph 2:

basic_streambuf(const basic_streambuf& sb);

Constructs a copy of sb.

Postcondtions:

                eback() == sb.eback()
                gptr()  == sb.gptr()
                egptr() == sb.egptr()
                pbase() == sb.pbase()
                pptr()  == sb.pptr()
                epptr() == sb.epptr()
                getloc() == sb.getloc()
basic_streambuf& operator=(const basic_streambuf& sb);

Assigns the data members of sb to this.

Postcondtions:

                eback() == sb.eback()
                gptr()  == sb.gptr()
                egptr() == sb.egptr()
                pbase() == sb.pbase()
                pptr()  == sb.pptr()
                epptr() == sb.epptr()
                getloc() == sb.getloc()

Returns: *this.

27.7.1 [lib.stringbuf]:

Option A:

Insert into the basic_stringbuf synopsis in the private section:

basic_stringbuf(const basic_stringbuf&);             // not defined
basic_stringbuf& operator=(const basic_stringbuf&);  // not defined
Option B:

Insert into the basic_stringbuf synopsis in the public section:

basic_stringbuf(const basic_stringbuf& sb);
basic_stringbuf& operator=(const basic_stringbuf& sb);

27.7.1.1, insert after paragraph 4:

basic_stringbuf(const basic_stringbuf& sb);

Constructs an independent copy of sb as if with sb.str(), and with the openmode that sb was constructed with.

Postcondtions:

               str() == sb.str()
               gptr()  - eback() == sb.gptr()  - sb.eback()
               egptr() - eback() == sb.egptr() - sb.eback()
               pptr()  - pbase() == sb.pptr()  - sb.pbase()
               getloc() == sb.getloc()

Note: The only requirement on epptr() is that it point beyond the initialized range if an output sequence exists. There is no requirement that epptr() - pbase() == sb.epptr() - sb.pbase().

basic_stringbuf& operator=(const basic_stringbuf& sb);

After assignment the basic_stringbuf has the same state as if it were initially copy constructed from sb, except that the basic_stringbuf is allowed to retain any excess capacity it might have, which may in turn effect the value of epptr().

27.8.1.1 [lib.filebuf]

Insert at the bottom of the basic_filebuf synopsis:

private:
  basic_filebuf(const basic_filebuf&);             // not defined
  basic_filebuf& operator=(const basic_filebuf&);  // not defined

[Kona: this is an issue for basic_streambuf itself and for its derived classes. We are leaning toward allowing basic_streambuf to be copyable, and specifying its precise semantics. (Probably the obvious: copying the buffer pointers.) We are less sure whether the streambuf derived classes should be copyable. Howard will write up a proposal.]

[Sydney: Dietmar presented a new argument against basic_streambuf being copyable: it can lead to an encapsulation violation. Filebuf inherits from streambuf. Now suppose you inhert a my_hijacking_buf from streambuf. You can copy the streambuf portion of a filebuf to a my_hijacking_buf, giving you access to the pointers into the filebuf's internal buffer. Perhaps not a very strong argument, but it was strong enough to make people nervous. There was weak preference for having streambuf not be copyable. There was weak preference for having stringbuf not be copyable even if streambuf is. Move this issue to open for now. ]

Rationale:

27.5.2 [lib.streambuf]: The proposed basic_streambuf copy constructor and assignment operator are the same as currently implied by the lack of declarations: public and simply copies the data members. This resolution is not a change but a clarification of the current standard.

27.7.1 [lib.stringbuf]: There are two reasonable options: A) Make basic_stringbuf not copyable. This is likely the status-quo of current implementations. B) Reasonable copy semantics of basic_stringbuf can be defined and implemented. A copyable basic_streambuf is arguably more useful than a non-copyable one. This should be considered as new functionality and not the fixing of a defect. If option B is chosen, ramifications from issue 432 are taken into account.

27.8.1.1 [lib.filebuf]: There are no reasonable copy semantics for basic_filebuf.


422. explicit specializations of member functions of class templates

Section: 17.4.3.1 [lib.reserved.names]  Status: Open  Submitter: Martin Sebor  Date: 18 Sep 2003

It has been suggested that 17.4.3.1, p1 may or may not allow programs to explicitly specialize members of standard templates on user-defined types. The answer to the question might have an impact where library requirements are given using the "as if" rule. I.e., if programs are allowed to specialize member functions they will be able to detect an implementation's strict conformance to Effects clauses that describe the behavior of the function in terms of the other member function (the one explicitly specialized by the program) by relying on the "as if" rule.

Proposed resolution:

Add the following sentence immediately after the text of 17.4.3.1 [lib.reserved.names], p1:

The behavior of a program that declares explicit specializations of any members of class templates or explicit specializations of any member templates of classes or class templates defined in this library is undefined.

[Kona: straw poll was 6-1 that user programs should not be allowed to specialize individual member functions of standard library class templates, and that doing so invokes undefined behavior. Post-Kona: Martin provided wording.]

[Sydney: The LWG agrees that the standard shouldn't permit users to specialize individual member functions unless they specialize the whole class, but we're not sure these words say what we want them to; they could be read as prohibiting the specialization of any standard library class templates. We need to consult with CWG to make sure we use the right wording.]


423. effects of negative streamsize in iostreams

Section: 27 [lib.input.output]  Status: Open  Submitter: Martin Sebor  Date: 18 Sep 2003

A third party test suite tries to exercise istream::ignore(N) with a negative value of N and expects that the implementation will treat N as if it were 0. Our implementation asserts that (N >= 0) holds and aborts the test.

I can't find anything in section 27 that prohibits such values but I don't see what the effects of such calls should be, either (this applies to a number of unformatted input functions as well as some member functions of the basic_streambuf template).

Proposed resolution:

I propose that we add to each function in clause 27 that takes an argument, say N, of type streamsize a Requires clause saying that "N >= 0." The intent is to allow negative streamsize values in calls to precision() and width() but disallow it in calls to streambuf::sgetn(), istream::ignore(), or ostream::write().

[Kona: The LWG agreed that this is probably what we want. However, we need a review to find all places where functions in clause 27 take arguments of type streamsize that shouldn't be allowed to go negative. Martin will do that review.]


424. normative notes

Section: 17.3.1.1 [lib.structure.summary]  Status: Open  Submitter: Martin Sebor  Date: 18 Sep 2003

The text in 17.3.1.1, p1 says:
"Paragraphs labelled "Note(s):" or "Example(s):" are informative, other paragraphs are normative."
The library section makes heavy use of paragraphs labeled "Notes(s)," some of which are clearly intended to be normative (see list 1), while some others are not (see list 2). There are also those where the intent is not so clear (see list 3).
List 1 -- Examples of (presumably) normative Notes:
20.4.1.1, p3, 20.4.1.1, p10, 21.3.1, p11, 22.1.1.2, p11, 23.2.1.3, p2, 25.3.7, p3, 26.2.6, p14a, 27.5.2.4.3, p7.
List 2 -- Examples of (presumably) informative Notes:
18.4.1.3, p3, 21.3.5.6, p14, 22.2.1.5.2, p3, 25.1.1, p4, 26.2.5, p1, 27.4.2.5, p6.
List 3 -- Examples of Notes that are not clearly either normative or informative:
22.1.1.2, p8, 22.1.1.5, p6, 27.5.2.4.5, p4.
None of these lists is meant to be exhaustive.

Proposed resolution:

[Definitely a real problem. The big problem is there's material that doesn't quite fit any of the named paragraph categories (e.g. Effects). Either we need a new kind of named paragraph, or we need to put more material in unnamed paragraphs jsut after the signature. We need to talk to the Project Editor about how to do this. ]


427. stage 2 and rationale of DR 221

Section: 22.2.2.1.2 [lib.facet.num.get.virtuals]  Status: Open  Submitter: Martin Sebor  Date: 18 Sep 2003

The requirements specified in Stage 2 and reiterated in the rationale of DR 221 (and echoed again in DR 303) specify that num_get<charT>:: do_get() compares characters on the stream against the widened elements of "012...abc...ABCX+-"

An implementation is required to allow programs to instantiate the num_get template on any charT that satisfies the requirements on a user-defined character type. These requirements do not include the ability of the character type to be equality comparable (the char_traits template must be used to perform tests for equality). Hence, the num_get template cannot be implemented to support any arbitrary character type. The num_get template must either make the assumption that the character type is equality-comparable (as some popular implementations do), or it may use char_traits<charT> to do the comparisons (some other popular implementations do that). This diversity of approaches makes it difficult to write portable programs that attempt to instantiate the num_get template on user-defined types.

Proposed resolution:

[Kona: the heart of the problem is that we're theoretically supposed to use traits classes for all fundamental character operations like assignment and comparison, but facets don't have traits parameters. This is a fundamental design flaw and it appears all over the place, not just in this one place. It's not clear what the correct solution is, but a thorough review of facets and traits is in order. The LWG considered and rejected the possibility of changing numeric facets to use narrowing instead of widening. This may be a good idea for other reasons (see issue 459), but it doesn't solve the problem raised by this issue. Whether we use widen or narrow the num_get facet still has no idea which traits class the user wants to use for the comparison, because only streams, not facets, are passed traits classes. The standard does not require that two different traits classes with the same char_type must necessarily have the same behavior.]

Informally, one possibility: require that some of the basic character operations, such as eq, lt, and assign, must behave the same way for all traits classes with the same char_type. If we accept that limitation on traits classes, then the facet could reasonably be required to use char_traits<charT>

.

430. valarray subset operations

Section: 26.5.2.4 [lib.valarray.sub]  Status: Open  Submitter: Martin Sebor  Date: 18 Sep 2003

The standard fails to specify the behavior of valarray::operator[](slice) and other valarray subset operations when they are passed an "invalid" slice object, i.e., either a slice that doesn't make sense at all (e.g., slice (0, 1, 0) or one that doesn't specify a valid subset of the valarray object (e.g., slice (2, 1, 1) for a valarray of size 1).

Proposed resolution:

[Kona: the LWG believes that invalid slices should invoke undefined behavior. Valarrays are supposed to be designed for high performance, so we don't want to require specific checking. We need wording to express this decision.]


431. Swapping containers with unequal allocators

Section: 20.1.5 [lib.default.con.req], 25 [lib.algorithms]  Status: Open  Submitter: Matt Austern  Date: 20 Sep 2003

Clause 20.1.5 [lib.default.con.req] paragraph 4 says that implementations are permitted to supply containers that are unable to cope with allocator instances and that container implementations may assume that all instances of an allocator type compare equal. We gave implementers this latitude as a temporary hack, and eventually we want to get rid of it. What happens when we're dealing with allocators that don't compare equal?

In particular: suppose that v1 and v2 are both objects of type vector<int, my_alloc> and that v1.get_allocator() != v2.get_allocator(). What happens if we write v1.swap(v2)? Informally, three possibilities:

1. This operation is illegal. Perhaps we could say that an implementation is required to check and to throw an exception, or perhaps we could say it's undefined behavior.

2. The operation performs a slow swap (i.e. using three invocations of operator=, leaving each allocator with its original container. This would be an O(N) operation.

3. The operation swaps both the vectors' contents and their allocators. This would be an O(1) operation. That is:

    my_alloc a1(...);
    my_alloc a2(...);
    assert(a1 != a2);

    vector<int, my_alloc> v1(a1);
    vector<int, my_alloc> v2(a2);
    assert(a1 == v1.get_allocator());
    assert(a2 == v2.get_allocator());

    v1.swap(v2);
    assert(a1 == v2.get_allocator());
    assert(a2 == v1.get_allocator());
  

Proposed resolution:

[Kona: This is part of a general problem. We need a paper saying how to deal with unequal allocators in general.]

[pre-Sydney: Howard argues for option 3 in n1599.]


446. Iterator equality between different containers

Section: 24.1 [lib.iterator.requirements], 23.1 [lib.container.requirements]  Status: Open  Submitter: Andy Koenig  Date: 16 Dec 2003

What requirements does the standard place on equality comparisons between iterators that refer to elements of different containers. For example, if v1 and v2 are empty vectors, is v1.end() == v2.end() allowed to yield true? Is it allowed to throw an exception?

The standard appears to be silent on both questions.

Proposed resolution:

[Sydney: The intention is that comparing two iterators from different containers is undefined, but it's not clear if we say that, or even whether it's something we should be saying in clause 23 or in clause 24. Intuitively we might want to say that equality is defined only if one iterator is reachable from another, but figuring out how to say it in any sensible way is a bit tricky: reachability is defined in terms of equality, so we can't also define equality in terms of reachability. ]


454. basic_filebuf::open should accept wchar_t names

Section: 27.8.1.3 [lib.filebuf.members]  Status: Open  Submitter: Bill Plauger  Date: 30 Jan 2004

    basic_filebuf *basic_filebuf::open(const char *, ios_base::open_mode);

should be supplemented with the overload:

    basic_filebuf *basic_filebuf::open(const wchar_t *, ios_base::open_mode);

Depending on the operating system, one of these forms is fundamental and the other requires an implementation-defined mapping to determine the actual filename.

[Sydney: Yes, we want to allow wchar_t filenames. Bill will provide wording.]

Proposed resolution:

Change from:

basic_filebuf<charT,traits>* open(
	const char* s,
	ios_base::openmode mode );

Effects: If is_open() != false, returns a null pointer. Otherwise, initializes the filebuf as required. It then opens a file, if possible, whose name is the NTBS s ("as if" by calling std::fopen(s,modstr)).

to:

basic_filebuf<charT,traits>* open(
	const char* s,
	ios_base::openmode mode );

basic_filebuf<charT,traits>* open(
	const wchar_t* ws,
	ios_base::openmode mode );

Effects: If is_open() != false, returns a null pointer. Otherwise, initializes the filebuf as required. It then opens a file, if possible, whose name is the NTBS s ("as if" by calling std::fopen(s,modstr)). For the second signature, the NTBS s is determined from the WCBS ws in an implementation-defined manner.

(NOTE: For a system that "naturally" represents a filename as a WCBS, the NTBS s in the first signature may instead be mapped to a WCBS; if so, it follows the same mapping rules as the first argument to open.)

Rationale:

Slightly controversial, but by a 7-1 straw poll the LWG agreed to move this to Ready. The controversy was because the mapping between wide names and files in a filesystem is implementation defined. The counterargument, which most but not all LWG members accepted, is that the mapping between narrow files names and files is also implemenation defined.

[Lillehammer: Moved back to "open" status, at Beman's urging. (1) Why just basic_filebuf, instead of also basic_fstream (and possibly other things too). (2) Why not also constructors that take std::basic_string? (3) We might want to wait until we see Beman's filesystem library; we might decide that it obviates this.]


456. Traditional C header files are overspecified

Section: 17.4.1.2 [lib.headers]  Status: Open  Submitter: Bill Plauger  Date: 30 Jan 2004

The C++ Standard effectively requires that the traditional C headers (of the form <xxx.h>) be defined in terms of the newer C++ headers (of the form <cxxx>). Clauses 17.4.1.2/4 and D.5 combine to require that:

The rules were left in this form despited repeated and heated objections from several compiler vendors. The C headers are often beyond the direct control of C++ implementors. In some organizations, it's all they can do to get a few #ifdef __cplusplus tests added. Third-party library vendors can perhaps wrap the C headers. But neither of these approaches supports the drastic restructuring required by the C++ Standard. As a result, it is still widespread practice to ignore this conformance requirement, nearly seven years after the committee last debated this topic. Instead, what is often implemented is:

The practical benefit for implementors with the second approach is that they can use existing C library headers, as they are pretty much obliged to do. The practical cost for programmers facing a mix of implementations is that they have to assume weaker rules:

There also exists the possibility of subtle differences due to Koenig lookup, but there are so few non-builtin types defined in the C headers that I've yet to see an example of any real problems in this area.

It is worth observing that the rate at which programmers fall afoul of these differences has remained small, at least as measured by newsgroup postings and our own bug reports. (By an overwhelming margin, the commonest problem is still that programmers include <string> and can't understand why the typename string isn't defined -- this a decade after the committee invented namespace std, nominally for the benefit of all programmers.)

We should accept the fact that we made a serious mistake and rectify it, however belatedly, by explicitly allowing either of the two schemes for declaring C names in headers.

[Sydney: This issue has been debated many times, and will certainly have to be discussed in full committee before any action can be taken. However, the preliminary sentiment of the LWG was in favor of the change. (6 yes, 0 no, 2 abstain) Robert Klarer suggests that we might also want to undeprecate the C-style .h headers.]

Proposed resolution:


458. 24.1.5 contains unintented limitation for operator-

Section: 24.1.5 [lib.random.access.iterators]  Status: Open  Submitter: Daniel Frey  Date: 27 Feb 2004

In 24.1.5 [lib.random.access.iterators], table 76 the operational semantics for the expression "r -= n" are defined as "return r += -n". This means, that the expression -n must be valid, which is not the case for unsigned types.

[ Sydney: Possibly not a real problem, since difference type is required to be a signed integer type. However, the wording in the standard may be less clear than we would like. ]

Proposed resolution:

To remove this limitation, I suggest to change the operational semantics for this column to:

{ Distance m = n; if (m >= 0) while (m--) --r; else while (m++) ++r; return r; }

459. Requirement for widening in stage 2 is overspecification

Section: 22.2.2.1.2 [lib.facet.num.get.virtuals]  Status: Open  Submitter: Martin Sebor  Date: 16 Mar 2004

When parsing strings of wide-character digits, the standard requires the library to widen narrow-character "atoms" and compare the widened atoms against the characters that are being parsed. Simply narrowing the wide characters would be far simpler, and probably more efficient. The two choices are equivalent except in convoluted test cases, and many implementations already ignore the standard and use narrow instead of widen.

First, I disagree that using narrow() instead of widen() would necessarily have unfortunate performance implications. A possible implementation of narrow() that allows num_get to be implemented in a much simpler and arguably comparably efficient way as calling widen() allows, i.e. without making a virtual call to do_narrow every time, is as follows:

  inline char ctype<wchar_t>::narrow (wchar_t wc, char dflt) const
  {
      const unsigned wi = unsigned (wc);

      if (wi > UCHAR_MAX)
          return typeid (*this) == typeid (ctype<wchar_t>) ?
                 dflt : do_narrow (wc, dflt);

      if (narrow_ [wi] < 0) {
         const char nc = do_narrow (wc, dflt);
         if (nc == dflt)
             return dflt;
         narrow_ [wi] = nc;
      }

      return char (narrow_ [wi]);
  }

Second, I don't think the change proposed in the issue (i.e., to use narrow() instead of widen() during Stage 2) would be at all drastic. Existing implementations with the exception of libstdc++ currently already use narrow() so the impact of the change on programs would presumably be isolated to just a single implementation. Further, since narrow() is not required to translate alternate wide digit representations such as those mentioned in issue 303 to their narrow equivalents (i.e., the portable source characters '0' through '9'), the change does not necessarily imply that these alternate digits would be treated as ordinary digits and accepted as part of numbers during parsing. In fact, the requirement in 22.2.1.1.2 [lib.locale.ctype.virtuals], p13 forbids narrow() to translate an alternate digit character, wc, to an ordinary digit in the basic source character set unless the expression (ctype<charT>::is(ctype_base::digit, wc) == true) holds. This in turn is prohibited by the C standard (7.25.2.1.5, 7.25.2.1.5, and 5.2.1, respectively) for charT of either char or wchar_t.

[Sydney: To a large extent this is a nonproblem. As long as you're only trafficking in char and wchar_t we're only dealing with a stable character set, so you don't really need either 'widen' or 'narrow': can just use literals. Finally, it's not even clear whether widen-vs-narrow is the right question; arguably we should be using codecvt instead.]

Proposed resolution:

Change stage 2 so that implementations are permitted to use either technique to perform the comparison:

  1. call widen on the atoms and compare (either by using operator== or char_traits<charT>::eq) the input with the widened atoms, or
  2. call narrow on the input and compare the narrow input with the atoms
  3. do (1) or (2) only if charT is not char or wchar_t, respectively; i.e., avoid calling widen or narrow if it the source and destination types are the same

462. Destroying objects with static storage duration

Section: 3.6.3 [basic.start.term], 18.3 [lib.cstdint]  Status: Open  Submitter: Bill Plauger  Date: 23 Mar 2004

3.6.3 Termination spells out in detail the interleaving of static destructor calls and calls to functions registered with atexit. To match this behavior requires intimate cooperation between the code that calls destructors and the exit/atexit machinery. The former is tied tightly to the compiler; the latter is a primitive mechanism inherited from C that traditionally has nothing to do with static construction and destruction. The benefits of intermixing destructor calls with atexit handler calls is questionable at best, and very difficult to get right, particularly when mixing third-party C++ libraries with different third-party C++ compilers and C libraries supplied by still other parties.

I believe the right thing to do is defer all static destruction until after all atexit handlers are called. This is a change in behavior, but one that is likely visible only to perverse test suites. At the very least, we should permit deferred destruction even if we don't require it.

Proposed resolution:

[If this is to be changed, it should probably be changed by CWG. At this point, however, the LWG is leaning toward NAD. Implementing what the standard says is hard work, but it's not impossible and most vendors went through that pain years ago. Changing this behavior would be a user-visible change, and would break at least one real application.]


463. auto_ptr usability issues

Section: 20.4.5 [lib.meta.unary]  Status: Open  Submitter: Rani Sharoni  Date: 7 Dec 2003

TC1 CWG DR #84 effectively made the template<class Y> operator auto_ptr<Y>() member of auto_ptr (20.4.5.3/4) obsolete.

The sole purpose of this obsolete conversion member is to enable copy initialization base from r-value derived (or any convertible types like cv-types) case:

#include <memory>
using std::auto_ptr;

struct B {};
struct D : B {};

auto_ptr<D> source();
int sink(auto_ptr<B>);
int x1 = sink( source() ); // #1 EDG - no suitable copy constructor

The excellent analysis of conversion operations that was given in the final auto_ptr proposal (http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/papers/1997/N1128.pdf) explicitly specifies this case analysis (case 4). DR #84 makes the analysis wrong and actually comes to forbid the loophole that was exploited by the auto_ptr designers.

I didn't encounter any compliant compiler (e.g. EDG, GCC, BCC and VC) that ever allowed this case. This is probably because it requires 3 user defined conversions and in fact current compilers conform to DR #84.

I was surprised to discover that the obsolete conversion member actually has negative impact of the copy initialization base from l-value derived case:

auto_ptr<D> dp;
int x2 = sink(dp); // #2 EDG - more than one user-defined conversion applies

I'm sure that the original intention was allowing this initialization using the template<class Y> auto_ptr(auto_ptr<Y>& a) constructor (20.4.5.1/4) but since in this copy initialization it's merely user defined conversion (UDC) and the obsolete conversion member is UDC with the same rank (for the early overloading stage) there is an ambiguity between them.

Removing the obsolete member will have impact on code that explicitly invokes it:

int y = sink(source().operator auto_ptr<B>());

IMHO no one ever wrote such awkward code and the reasonable workaround for #1 is:

int y = sink( auto_ptr<B>(source()) );

I was even more surprised to find out that after removing the obsolete conversion member the initialization was still ill-formed: int x3 = sink(dp); // #3 EDG - no suitable copy constructor

This copy initialization semantically requires copy constructor which means that both template conversion constructor and the auto_ptr_ref conversion member (20.4.5.3/3) are required which is what was explicitly forbidden in DR #84. This is a bit amusing case in which removing ambiguity results with no candidates.

I also found exception safety issue with auto_ptr related to auto_ptr_ref:

int f(auto_ptr<B>, std::string);
auto_ptr<B> source2();

// string constructor throws while auto_ptr_ref
// "holds" the pointer
int x4 = f(source2(), "xyz"); // #4

The theoretic execution sequence that will cause a leak:

  1. call auto_ptr<B>::operator auto_ptr_ref<B>()
  2. call string::string(char const*) and throw

According to 20.4.5.3/3 and 20.4.5/2 the auto_ptr_ref conversion member returns auto_ptr_ref<Y> that holds *this and this is another defect since the type of *this is auto_ptr<X> where X might be different from Y. Several library vendors (e.g. SGI) implement auto_ptr_ref<Y> with Y* as member which is much more reasonable. Other vendor implemented auto_ptr_ref as defectively required and it results with awkward and catastrophic code: int oops = sink(auto_ptr<B>(source())); // warning recursive on all control paths

Dave Abrahams noticed that there is no specification saying that auto_ptr_ref copy constructor can't throw.

My proposal comes to solve all the above issues and significantly simplify auto_ptr implementation. One of the fundamental requirements from auto_ptr is that it can be constructed in an intuitive manner (i.e. like ordinary pointers) but with strict ownership semantics which yield that source auto_ptr in initialization must be non-const. My idea is to add additional constructor template with sole propose to generate ill-formed, diagnostic required, instance for const auto_ptr arguments during instantiation of declaration. This special constructor will not be instantiated for other types which is achievable using 14.8.2/2 (SFINAE). Having this constructor in hand makes the constructor template<class Y> auto_ptr(auto_ptr<Y> const&) legitimate since the actual argument can't be const yet non const r-value are acceptable.

This implementation technique makes the "private auxiliary class" auto_ptr_ref obsolete and I found out that modern C++ compilers (e.g. EDG, GCC and VC) consume the new implementation as expected and allow all intuitive initialization and assignment cases while rejecting illegal cases that involve const auto_ptr arguments.

The proposed auto_ptr interface:

namespace std {
    template<class X> class auto_ptr {
    public:
        typedef X element_type;

        // 20.4.5.1 construct/copy/destroy:
        explicit auto_ptr(X* p=0) throw();
        auto_ptr(auto_ptr&) throw();
        template<class Y> auto_ptr(auto_ptr<Y> const&) throw();
        auto_ptr& operator=(auto_ptr&) throw();
        template<class Y> auto_ptr& operator=(auto_ptr<Y>) throw();
        ~auto_ptr() throw();

        // 20.4.5.2 members:
        X& operator*() const throw();
        X* operator->() const throw();
        X* get() const throw();
        X* release() throw();
        void reset(X* p=0) throw();

    private:
        template<class U>
        auto_ptr(U& rhs, typename
unspecified_error_on_const_auto_ptr<U>::type = 0);
    };
}

One compliant technique to implement the unspecified_error_on_const_auto_ptr helper class is using additional private auto_ptr member class template like the following:

template<typename T> struct unspecified_error_on_const_auto_ptr;

template<typename T>
struct unspecified_error_on_const_auto_ptr<auto_ptr<T> const>
{ typedef typename auto_ptr<T>::const_auto_ptr_is_not_allowed type; };

There are other techniques to implement this helper class that might work better for different compliers (i.e. better diagnostics) and therefore I suggest defining its semantic behavior without mandating any specific implementation. IMO, and I didn't found any compiler that thinks otherwise, 14.7.1/5 doesn't theoretically defeat the suggested technique but I suggest verifying this with core language experts.

Further changes in standard text:

Remove section 20.4.5.3

Change 20.4.5/2 to read something like: Initializing auto_ptr<X> from const auto_ptr<Y> will result with unspecified ill-formed declaration that will require unspecified diagnostic.

Change 20.4.5.1/4,5,6 to read:

template<class Y> auto_ptr(auto_ptr<Y> const& a) throw();

4 Requires: Y* can be implicitly converted to X*.

5 Effects: Calls const_cast<auto_ptr<Y>&>(a).release().

6 Postconditions: *this holds the pointer returned from a.release().

Change 20.4.5.1/10

template<class Y> auto_ptr& operator=(auto_ptr<Y> a) throw();

10 Requires: Y* can be implicitly converted to X*. The expression delete get() is well formed.

LWG TC DR #127 is obsolete.

Notice that the copy constructor and copy assignment operator should remain as before and accept non-const auto_ptr& since they have effect on the form of the implicitly declared copy constructor and copy assignment operator of class that contains auto_ptr as member per 12.8/5,10:

struct X {
    // implicit X(X&)
    // implicit X& operator=(X&)
    auto_ptr<D> aptr_;
};

In most cases this indicates about sloppy programming but preserves the current auto_ptr behavior.

Dave Abrahams encouraged me to suggest fallback implementation in case that my suggestion that involves removing of auto_ptr_ref will not be accepted. In this case removing the obsolete conversion member to auto_ptr<Y> and 20.4.5.3/4,5 is still required in order to eliminate ambiguity in legal cases. The two constructors that I suggested will co exist with the current members but will make auto_ptr_ref obsolete in initialization contexts. auto_ptr_ref will be effective in assignment contexts as suggested in DR #127 and I can't see any serious exception safety issues in those cases (although it's possible to synthesize such). auto_ptr_ref<X> semantics will have to be revised to say that it strictly holds pointer of type X and not reference to an auto_ptr for the favor of cases in which auto_ptr_ref<Y> is constructed from auto_ptr<X> in which X is different from Y (i.e. assignment from r-value derived to base).

Proposed resolution:

[Redmond: punt for the moment. We haven't decided yet whether we want to fix auto_ptr for C++-0x, or remove it and replace it with move_ptr and unique_ptr.]


466. basic_string ctor should prevent null pointer error

Section: 21.3.1 [lib.string.cons]  Status: Open  Submitter: Daniel Frey  Date: 10 Jun 2004

Today, my colleagues and me wasted a lot of time. After some time, I found the problem. It could be reduced to the following short example:

  #include <string>
  int main() { std::string( 0 ); }

The problem is that the tested compilers (GCC 2.95.2, GCC 3.3.1 and Comeau online) compile the above without errors or warnings! The programs (at least for the GCC) resulted in a SEGV.

I know that the standard explicitly states that the ctor of string requires a char* which is not zero. STLs could easily detect the above case with a private ctor for basic_string which takes a single 'int' argument. This would catch the above code at compile time and would not ambiguate any other legal ctors.

Proposed resolution:

[Redmond: No great enthusiasm for doing this. If we do, however, we want to do it for all places that take charT* pointers, not just the single-argument constructor. The other question is whether we want to catch this at compile time (in which case we catch the error of a literal 0, but not an expression whose value is a null pointer), at run time, or both.]


470. accessing containers from their elements' special functions

Section: 23 [lib.containers]  Status: Open  Submitter: Martin Sebor  Date: 28 Jun 2004

The standard doesn't prohibit the destructors (or any other special functions) of containers' elements invoked from a member function of the container from "recursively" calling the same (or any other) member function on the same container object, potentially while the container is in an intermediate state, or even changing the state of the container object while it is being modified. This may result in some surprising (i.e., undefined) behavior.

Read email thread starting with c++std-lib-13637 for more.

Proposed resolution:

Add to Container Requirements the following new paragraph:

    Unless otherwise specified, the behavior of a program that
    invokes a container member function f from a member function
    g of the container's value_type on a container object c that
    called g from its mutating member function h, is undefined.
    I.e., if v is an element of c, directly or indirectly calling
    c.h() from v.g() called from c.f(), is undefined.

[Redmond: This is a real issue, but it's probably a clause 17 issue, not clause 23. We get the same issue, for example, if we try to destroy a stream from one of the stream's callback functions.]


471. result of what() implementation-defined

Section: 18.6.1 [lib.type.info]  Status: Open  Submitter: Martin Sebor  Date: 28 Jun 2004

[lib.exception] specifies the following:

    exception (const exception&) throw();
    exception& operator= (const exception&) throw();

    -4- Effects: Copies an exception object.
    -5- Notes: The effects of calling what() after assignment
        are implementation-defined.

First, does the Note only apply to the assignment operator? If so, what are the effects of calling what() on a copy of an object? Is the returned pointer supposed to point to an identical copy of the NTBS returned by what() called on the original object or not?

Second, is this Note intended to extend to all the derived classes in section 19? I.e., does the standard provide any guarantee for the effects of what() called on a copy of any of the derived class described in section 19?

Finally, if the answer to the first question is no, I believe it constitutes a defect since throwing an exception object typically implies invoking the copy ctor on the object. If the answer is yes, then I believe the standard ought to be clarified to spell out exactly what the effects are on the copy (i.e., after the copy ctor was called).

[Redmond: Yes, this is fuzzy. The issue of derived classes is fuzzy too.]

Proposed resolution:


473. underspecified ctype calls

Section: 22.2.1.1 [lib.locale.ctype]  Status: Open  Submitter: Martin Sebor  Date: 1 Jul 2004

Most ctype member functions come in two forms: one that operates on a single character at a time and another form that operates on a range of characters. Both forms are typically described by a single Effects and/or Returns clause.

The Returns clause of each of the single-character non-virtual forms suggests that the function calls the corresponding single character virtual function, and that the array form calls the corresponding virtual array form. Neither of the two forms of each virtual member function is required to be implemented in terms of the other.

There are three problems:

1. One is that while the standard does suggest that each non-virtual member function calls the corresponding form of the virtual function, it doesn't actually explicitly require it.

Implementations that cache results from some of the virtual member functions for some or all values of their arguments might want to call the array form from the non-array form the first time to fill the cache and avoid any or most subsequent virtual calls. Programs that rely on each form of the virtual function being called from the corresponding non-virtual function will see unexpected behavior when using such implementations.

2. The second problem is that either form of each of the virtual functions can be overridden by a user-defined function in a derived class to return a value that is different from the one produced by the virtual function of the alternate form that has not been overriden.

Thus, it might be possible for, say, ctype::widen(c) to return one value, while for ctype::widen(&c, &c + 1, &wc) to set wc to another value. This is almost certainly not intended. Both forms of every function should be required to return the same result for the same character, otherwise the same program using an implementation that calls one form of the functions will behave differently than when using another implementation that calls the other form of the function "under the hood."

3. The last problem is that the standard text fails to specify whether one form of any of the virtual functions is permitted to be implemented in terms of the other form or not, and if so, whether it is required or permitted to call the overridden virtual function or not.

Thus, a program that overrides one of the virtual functions so that it calls the other form which then calls the base member might end up in an infinite loop if the called form of the base implementation of the function in turn calls the other form.

Proposed resolution:

Lillehammer: Part of this isn't a real problem. We already talk about caching. 22.1.1/6 But part is a real problem. ctype virtuals may call each other, so users don't know which ones to override to avoid avoid infinite loops.

This is a problem for all facet virtuals, not just ctype virtuals, so we probably want a blanket statement in clause 22 for all facets. The LWG is leaning toward a blanket prohibition, that a facet's virtuals may never call each other. We might want to do that in clause 27 too, for that matter. A review is necessary. Bill will provide wording.


479. Container requirements and placement new

Section: 23.1 [lib.container.requirements]  Status: Open  Submitter: Herb Sutter  Date: 1 Aug 2004

Nothing in the standard appears to make this program ill-formed:

  struct C {
    void* operator new( size_t s ) { return ::operator new( s ); }
    // NOTE: this hides in-place and nothrow new
  };

  int main() {
    vector<C> v;
    v.push_back( C() );
  }

Is that intentional? We should clarify whether or not we intended to require containers to support types that define their own special versions of operator new.

[ Lillehammer: A container will definitely never use this overridden operator new, but whether it will fail to compile is unclear from the standard. Are containers supposed to use qualified or unqualified placement new? 20.4.1.1 is somewhat relevant, but the standard doesn't make it completely clear whether containers have to use Allocator::construct(). If containers don't use it, the details of how containers use placement new are unspecified. That is the real bug, but it needs to be fixed as part of the allocator overhaul. Weak support that the eventual solution should make this code well formed. ]

Proposed resolution:


482. Swapping pairs

Section: 20.2.2 [lib.pairs], 25.2.2 [lib.alg.swap]  Status: Open  Submitter: Andrew Koenig  Date: 14 Sep 2004

(Based on recent comp.std.c++ discussion)

Pair (and tuple) should specialize std::swap to work in terms of std::swap on their components. For example, there's no obvious reason why swapping two objects of type pair<vector<int>, list<double> > should not take O(1).

Proposed resolution:

[Lillehammer: We agree it should be swappable. Howard will provide wording.]


484. Convertible to T

Section: 24.1.1 [lib.input.iterators]  Status: Open  Submitter: Chris Jefferson  Date: 16 Sep 2004

From comp.std.c++:

I note that given an input iterator a for type T, then *a only has to be "convertable to T", not actually of type T.

Firstly, I can't seem to find an exact definition of "convertable to T". While I assume it is the obvious definition (an implicit conversion), I can't find an exact definition. Is there one?

Slightly more worryingly, there doesn't seem to be any restriction on the this type, other than it is "convertable to T". Consider two input iterators a and b. I would personally assume that most people would expect *a==*b would perform T(*a)==T(*b), however it doesn't seem that the standard requires that, and that whatever type *a is (call it U) could have == defined on it with totally different symantics and still be a valid inputer iterator.

Is this a correct reading? When using input iterators should I write T(*a) all over the place to be sure that the object i'm using is the class I expect?

This is especially a nuisance for operations that are defined to be "convertible to bool". (This is probably allowed so that implementations could return say an int and avoid an unnessary conversion. However all implementations I have seen simply return a bool anyway. Typical implemtations of STL algorithms just write things like while(a!=b && *a!=0). But strictly speaking, there are lots of types that are convertible to T but that also overload the appropriate operators so this doesn't behave as expected.

If we want to make code like this legal (which most people seem to expect), then we'll need to tighten up what we mean by "convertible to T".

Proposed resolution:

[Lillehammer: The first part is NAD, since "convertible" is well-defined in core. The second part is basically about pathological overloads. It's a minor problem but a real one. So leave open for now, hope we solve it as part of iterator redesign.]


485. output iterator insufficently constrained

Section: 24.1.2 [lib.output.iterators]  Status: Open  Submitter: Chris Jefferson  Date: 13 Oct 2004

The note on 24.1.2 Output iterators insufficently limits what can be performed on output iterators. While it requires that each iterator is progressed through only once and that each iterator is written to only once, it does not require the following things:

Note: Here it is assumed that x is an output iterator of type X which has not yet been assigned to.

a) That each value of the output iterator is written to: The standard allows: ++x; ++x; ++x;

b) That assignments to the output iterator are made in order X a(x); ++a; *a=1; *x=2; is allowed

c) Chains of output iterators cannot be constructed: X a(x); ++a; X b(a); ++b; X c(b); ++c; is allowed, and under the current wording (I believe) x,a,b,c could be written to in any order.

I do not believe this was the intension of the standard?

Proposed resolution:

[Lillehammer: Real issue. There are lots of constraints we intended but didn't specify. Should be solved as part of iterator redesign.]


488. rotate throws away useful information

Section: 25.2.10 [lib.alg.rotate]  Status: Open  Submitter: Howard Hinnant  Date: 22 Nov 2004

rotate takes 3 iterators: first, middle and last which point into a sequence, and rearranges the sequence such that the subrange [middle, last) is now at the beginning of the sequence and the subrange [first, middle) follows. The return type is void.

In many use cases of rotate, the client needs to know where the subrange [first, middle) starts after the rotate is performed. This might look like:

  rotate(first, middle, last);
  Iterator i = advance(first, distance(middle, last));

Unless the iterators are random access, the computation to find the start of the subrange [first, middle) has linear complexity. However, it is not difficult for rotate to return this information with negligible additional computation expense. So the client could code:

  Iterator i = rotate(first, middle, last);

and the resulting program becomes significantly more efficient.

While the backwards compatibility hit with this change is not zero, it is very small (similar to that of lwg 130), and there is a significant benefit to the change.

Proposed resolution:

In 25p2, change:

  template<class ForwardIterator>
    void rotate(ForwardIterator first, ForwardIterator middle,
                ForwardIterator last);

to:

  template<class ForwardIterator>
    ForwardIterator rotate(ForwardIterator first, ForwardIterator middle,
                           ForwardIterator last);

In 25.2.10, change:

  template<class ForwardIterator>
    void rotate(ForwardIterator first, ForwardIterator middle,
                ForwardIterator last);

to:

  template<class ForwardIterator>
    ForwardIterator rotate(ForwardIterator first, ForwardIterator middle,
                           ForwardIterator last);

In 25.2.10 insert a new paragraph after p1:

Returns: first + (last - middle).

[ The LWG agrees with this idea, but has one quibble: we want to make sure not to give the impression that the function "advance" is actually called, just that the nth iterator is returned. (Calling advance is observable behavior, since users can specialize it for their own iterators.) Howard will provide wording. ]

[Howard provided wording for mid-meeting-mailing Jun. 2005.]


492. Invalid iterator arithmetic expressions

Section: 23 [lib.containers], 24 [lib.iterators], 25 [lib.algorithms]  Status: Open  Submitter: Thomas Mang  Date: 12 Dec 2004

Various clauses other than clause 25 make use of iterator arithmetic not supported by the iterator category in question. Algorithms in clause 25 are exceptional because of 25 [lib.algorithms], paragraph 9, but this paragraph does not provide semantics to the expression "iterator - n", where n denotes a value of a distance type between iterators.

1) Examples of current wording:

Current wording outside clause 25:

23.2.2.4 [lib.list.ops], paragraphs 19-21: "first + 1", "(i - 1)", "(last - first)" 23.3.1.1 [lib.map.cons], paragraph 4: "last - first" 23.3.2.1 [lib.multimap.cons], paragraph 4: "last - first" 23.3.3.1 [lib.set.cons], paragraph 4: "last - first" 23.3.4.1 [lib.multiset.cons], paragraph 4: "last - first" 24.4.1 [lib.reverse.iterators], paragraph 1: "(i - 1)"

[Important note: The list is not complete, just an illustration. The same issue might well apply to other paragraphs not listed here.]

None of these expressions is valid for the corresponding iterator category.

Current wording in clause 25:

25.1.1 [lib.alg.foreach], paragraph 1: "last - 1" 25.1.3 [lib.alg.find.end], paragraph 2: "[first1, last1 - (last2-first2))" 25.2.8 [lib.alg.unique], paragraph 1: "(i - 1)" 25.2.8 [lib.alg.unique], paragraph 5: "(i - 1)"

However, current wording of 25 [lib.algorithms], paragraph 9 covers neither of these four cases:

Current wording of 25 [lib.algorithms], paragraph 9:

"In the description of the algorithms operator + and - are used for some of the iterator categories for which they do not have to be defined. In these cases the semantics of a+n is the same as that of

{X tmp = a;
advance(tmp, n);
return tmp;
}

and that of b-a is the same as of return distance(a, b)"

This paragrpah does not take the expression "iterator - n" into account, where n denotes a value of a distance type between two iterators [Note: According to current wording, the expression "iterator - n" would be resolved as equivalent to "return distance(n, iterator)"]. Even if the expression "iterator - n" were to be reinterpreted as equivalent to "iterator + -n" [Note: This would imply that "a" and "b" were interpreted implicitly as values of iterator types, and "n" as value of a distance type], then 24.3.4/2 interfers because it says: "Requires: n may be negative only for random access and bidirectional iterators.", and none of the paragraphs quoted above requires the iterators on which the algorithms operate to be of random access or bidirectional category.

2) Description of intended behavior:

For the rest of this Defect Report, it is assumed that the expression "iterator1 + n" and "iterator1 - iterator2" has the semantics as described in current 25 [lib.algorithms], paragraph 9, but applying to all clauses. The expression "iterator1 - n" is equivalent to an result-iterator for which the expression "result-iterator + n" yields an iterator denoting the same position as iterator1 does. The terms "iterator1", "iterator2" and "result-iterator" shall denote the value of an iterator type, and the term "n" shall denote a value of a distance type between two iterators.

All implementations known to the author of this Defect Report comply with these assumptions. No impact on current code is expected.

3) Proposed fixes:

Change 25 [lib.algorithms], paragraph 9 to:

"In the description of the algorithms operator + and - are used for some of the iterator categories for which they do not have to be defined. In this paragraph, a and b denote values of an iterator type, and n denotes a value of a distance type between two iterators. In these cases the semantics of a+n is the same as that of

{X tmp = a;
advance(tmp, n);
return tmp;
}

,the semantics of a-n denotes the value of an iterator i for which the following condition holds: advance(i, n) == a, and that of b-a is the same as of return distance(a, b)".

Comments to the new wording:

a) The wording " In this paragraph, a and b denote values of an iterator type, and n denotes a value of a distance type between two iterators." was added so the expressions "b-a" and "a-n" are distinguished regarding the types of the values on which they operate. b) The wording ",the semantics of a-n denotes the value of an iterator i for which the following condition holds: advance(i, n) == a" was added to cover the expression 'iterator - n'. The wording "advance(i, n) == a" was used to avoid a dependency on the semantics of a+n, as the wording "i + n == a" would have implied. However, such a dependency might well be deserved. c) DR 225 is not considered in the new wording.

Proposed fixes regarding invalid iterator arithmetic expressions outside clause 25:

Either a) Move modified 25 [lib.algorithms], paragraph 9 (as proposed above) before any current invalid iterator arithmetic expression. In that case, the first sentence of 25 [lib.algorithms], paragraph 9, need also to be modified and could read: "For the rest of this International Standard, ...." / "In the description of the following clauses including this ...." / "In the description of the text below ..." etc. - anyways substituting the wording "algorithms", which is a straight reference to clause 25. In that case, 25 [lib.algorithms] paragraph 9 will certainly become obsolete. Alternatively, b) Add an appropiate paragraph similar to resolved 25 [lib.algorithms], paragraph 9, to the beginning of each clause containing invalid iterator arithmetic expressions. Alternatively, c) Fix each paragraph (both current wording and possible resolutions of DRs) containing invalid iterator arithmetic expressions separately.

5) References to other DRs:

See DR 225. See DR 237. The resolution could then also read "Linear in last - first".

Proposed resolution:

[Lillehammer: Minor issue, but real. We have a blanket statement about this in 25/11. But (a) it should be in 17, not 25; and (b) it's not quite broad enough, because there are some arithmetic expressions it doesn't cover. Bill will provide wording.]


498. Requirements for partition() and stable_partition() too strong

Section: 25.2.12 [lib.alg.partitions]  Status: Open  Submitter: Sean Parent, Joe Gottman  Date: 4 May 2005

Problem: The iterator requirements for partition() and stable_partition() [25.2.12] are listed as BidirectionalIterator, however, there are efficient algorithms for these functions that only require ForwardIterator that have been known since before the standard existed. The SGI implementation includes these (see http://www.sgi.com/tech/stl/partition.html and http://www.sgi.com/tech/stl/stable_partition.html).

Proposed resolution:

Change 25.2.12 from

template<class BidirectionalIterator, class Predicate> 
BidirectionalIterator partition(BidirectionalIterato r first, 
                                BidirectionalIterator last, 
                                Predicate pred); 

to

template<class ForwardIterator, class Predicate> 
ForwardIterator partition(ForwardIterator first, 
                          ForwardIterator last, 
                          Predicate pred); 

Change the complexity from

At most (last - first)/2 swaps are done. Exactly (last - first) applications of the predicate are done.

to

If ForwardIterator is a bidirectional_iterator, at most (last - first)/2 swaps are done; otherwise at most (last - first) swaps are done. Exactly (last - first) applications of the predicate are done.

Rationale:

Partition is a "foundation" algorithm useful in many contexts (like sorting as just one example) - my motivation for extending it to include forward iterators is slist - without this extension you can't partition an slist (without writing your own partition). Holes like this in the standard library weaken the argument for generic programming (ideally I'd be able to provide a library that would refine std::partition() to other concepts without fear of conflicting with other libraries doing the same - but that is a digression). I consider the fact that partition isn't defined to work for ForwardIterator a minor embarrassment.

[Mont Tremblant: Moved to Open, request motivation and use cases by next meeting. Sean provided further rationale by post-meeting mailing.]


502. Proposition: Clarification of the interaction between a facet and an iterator

Section: 22.1.1.1.1 [lib.locale.category]  Status: Open  Submitter: Christopher Conrade Zseleghovski  Date: 7 Jun 2005

Motivation:

This requirement seems obvious to me, it is the essence of code modularity. I have complained to Mr. Plauger that the Dinkumware library does not observe this principle but he objected that this behaviour is not covered in the standard.

Proposed resolution:

Append the following point to 22.1.1.1.1:

6. The implementation of a facet of Table 52 parametrized with an InputIterator/OutputIterator should use that iterator only as character source/sink respectively. For a *_get facet, it means that the value received depends only on the sequence of input characters and not on how they are accessed. For a *_put facet, it means that the sequence of characters output depends only on the value to be formatted and not of how the characters are stored.

[ Berlin: Moved to Open, Need to clean up this area to make it clear locales don't have to contain open ended sets of facets. Jack, Howard, Bill. ]


503. more on locales

Section: 22.2 [lib.locale.categories]  Status: Open  Submitter: P.J. Plauger  Date: 20 Jun 2005

a) In 22.2.1.1 para. 2 we refer to "the instantiations required in Table 51" to refer to the facet *objects* associated with a locale. And we almost certainly mean just those associated with the default or "C" locale. Otherwise, you can't switch to a locale that enforces a different mapping between narrow and wide characters, or that defines additional uppercase characters.

b) 22.2.1.5 para. 3 (codecvt) has the same issues.

c) 22.2.1.5.2 (do_unshift) is even worse. It *forbids* the generation of a homing sequence for the basic character set, which might very well need one.

d) 22.2.1.5.2 (do_length) likewise dictates that the default mapping between wide and narrow characters be taken as one-for-one.

e) 22.2.2 para. 2 (num_get/put) is both muddled and vacuous, as far as I can tell. The muddle is, as before, calling Table 51 a list of instantiations. But the constraint it applies seems to me to cover *all* defined uses of num_get/put, so why bother to say so?

f) 22.2.3.1.2 para. 1(do_decimal_point) says "The required instantiations return '.' or L'.'.) Presumably this means "as appropriate for the character type. But given the vague definition of "required" earlier, this overrules *any* change of decimal point for non "C" locales. Surely we don't want to do that.

g) 22.2.3.1.2 para. 2 (do_thousands_sep) says "The required instantiations return ',' or L','.) As above, this probably means "as appropriate for the character type. But this overrules the "C" locale, which requires *no* character ('\0') for the thousands separator. Even if we agree that we don't mean to block changes in decimal point or thousands separator, we should also eliminate this clear incompatibility with C.

h) 22.2.3.1.2 para. 2 (do_grouping) says "The required instantiations return the empty string, indicating no grouping." Same considerations as for do_decimal_point.

i) 22.2.4.1 para. 1 (collate) refers to "instantiations required in Table 51". Same bad jargon.

j) 22.2.4.1.2 para. 1 (do_compare) refers to "instantiations required in Table 51". Same bad jargon.

k) 22.2.5 para. 1 (time_get/put) uses the same muddled and vacuous as num_get/put.

l) 22.2.6 para. 2 (money_get/put) uses the same muddled and vacuous as num_get/put.

m) 22.2.6.3.2 (do_pos/neg_format) says "The instantiations required in Table 51 ... return an object of type pattern initialized to {symbol, sign, none, value}." This once again *overrides* the "C" locale, as well as any other locale."

3) We constrain the use_facet calls that can be made by num_get/put, so why don't we do the same for money_get/put? Or for any of the other facets, for that matter?

4) As an almost aside, we spell out when a facet needs to use the ctype facet, but several also need to use a codecvt facet and we don't say so.

Proposed resolution:

[ Berlin: Bill to provide wording. ]


504. Integer types in pseudo-random number engine requirements

Section: TR1 5.1.1 [tr.rand.req]  Status: Ready  Submitter: Walter Brown  Date: 3 Jul 2005

In [tr.rand.req], Paragraph 2 states that "... s is a value of integral type, g is an ... object returning values of unsigned integral type ..."

Proposed resolution:

In 5.1.1 [tr.rand.req], Paragraph 2 replace

... s is a value of integral type, g is an lvalue of a type other than X that defines a zero-argument function object returning values of unsigned integral type unsigned long int, ...

In 5.1.1 [tr.rand.seq], Table 16, replace in the line for X(s)

creates an engine with the initial internal state determined by static_cast<unsigned long>(s)

[ Mont Tremblant: Both s and g should be unsigned long. This should refer to the constructor signatures. Jens provided wording post Mont Tremblant. ]

[ Berlin: N1932 adopts the proposed resolution: see 26.3.1.3/1e and Table 3 row 2. Moved to Ready. ]

Rationale:

Jens: Just requiring X(unsigned long) still makes it possible for an evil library writer to also supply a X(int) that does something unexpected. The wording above requires that X(s) always performs as if X(unsigned long) would have been called. I believe that is sufficient and implements our intentions from Mont Tremblant. I see no additional use in actually requiring a X(unsigned long) signature. u.seed(s) is covered by its reference to X(s), same arguments.


512. Seeding subtract_with_carry_01 from a single unsigned long

Section: TR1 5.1.4.4 [tr.rand.eng.sub1]  Status: Ready  Submitter: Walter Brown  Date: 3 Jul 2005

Paragraph 8 specifies the algorithm by which a subtract_with_carry_01 engine is to be seeded given a single unsigned long. This algorithm is seriously flawed in the case where the engine parameter w (also known as word_size) exceeds 31 [bits]. The key part of the paragraph reads:

sets x(-r) ... x(-1) to (lcg(1)*2**(-w)) mod 1

and so forth.

Since the specified linear congruential engine, lcg, delivers numbers with a maximum of 2147483563 (just a shade under 31 bits), then when w is, for example, 48, each of the x(i) will be less than 2**-17. The consequence is that roughly the first 400 numbers delivered will be conspicuously close to either zero or one.

Unfortunately, this is not an innocuous flaw: One of the predefined engines in [tr.rand.predef], namely ranlux64_base_01, has w = 48 and would exhibit this poor behavior, while the original N1378 proposal states that these pre-defined engines are intended to be of "known good properties."

Proposed resolution:

In 5.1.4.4 [tr.rand.eng.sub1], replace the "effects" clause for void seed(unsigned long value = 19780503) by

Effects: If value == 0, sets value to 19780503. In any case, with a linear congruential generator lcg(i) having parameters mlcg = 2147483563, alcg = 40014, clcg = 0, and lcg(0) = value, sets carry(-1) and x(-r) … x(-1) as if executing

linear_congruential<unsigned long, 40014, 0, 2147483563> lcg(value);
seed(lcg);
to (lcg(1) · 2-w) mod 1 … (lcg(r) · 2-w) mod 1, respectively. If x(-1) == 0, sets carry(-1) = 2-w, else sets carry(-1) = 0.

[ Jens provided revised wording post Mont Tremblant. ]

[ Berlin: N1932 adopts the originally-proposed resolution of the issue. Jens's supplied wording is a clearer description of what is intended. Moved to Ready. ]

Rationale:

Jens: I'm using an explicit type here, because fixing the prose would probably not qualify for the (with issue 504 even stricter) requirements we have for seed(Gen&).


515. Random number engine traits

Section: TR1 5.1.2 [tr.rand.synopsis]  Status: Open  Submitter: Walter Brown  Date: 3 Jul 2005

To accompany the concept of a pseudo-random number engine as defined in Table 17, we propose and recommend an adjunct template, engine_traits, to be declared in [tr.rand.synopsis] as:

template< class PSRE >
class engine_traits;

This templateÕs primary purpose would be as an aid to generic programming involving pseudo-random number engines. Given only the facilities described in tr1, it would be very difficult to produce any algorithms involving the notion of a generic engine. The intent of this proposal is to provide, via engine_traits<>, sufficient descriptive information to allow an algorithm to employ a pseudo-random number engine without regard to its exact type, i.e., as a template parameter.

For example, today it is not possible to write an efficient generic function that requires any specific number of random bits. More specifically, consider a cryptographic application that internally needs 256 bits of randomness per call:

template< class Eng, class InIter, class OutIter >
void crypto( Eng& e, InIter in, OutIter out );

Without knowning the number of bits of randomness produced per call to a provided engine, the algorithm has no means of determining how many times to call the engine.

In a new section [tr.rand.eng.traits], we proposed to define the engine_traits template as:

template< class PSRE >
class engine_traits
{
  static  std::size_t  bits_of_randomness = 0u;
  static  std::string  name()  { return "unknown_engine"; }
  // TODO: other traits here
};

Further, each engine described in [tr.rand.engine] would be accompanied by a complete specialization of this new engine_traits template.

Proposed resolution:

[ Berlin: Walter: While useful for implementation per TR1, N1932 has no need for this feature. ]


516. Seeding subtract_with_carry_01 using a generator

Section: TR1 5.1.4.4 [tr.rand.eng.sub1]  Status: Ready  Submitter: Walter Brown  Date: 3 Jul 2005

Paragraph 6 says:

... obtained by successive invocations of g, ...

We recommend instead:

... obtained by taking successive invocations of g mod 2**32, ...

as the context seems to require only 32-bit quantities be used here.

Proposed resolution:

Berlin: N1932 adopts the proposed resultion: see 26.3.3.4/7. Moved to Ready.


518. Are insert and erase stable for unordered_multiset and unordered_multimap?

Section: TR1 6.3 [tr.hash]  Status: Review  Submitter: Matt Austern  Date: 3 Jul 2005

Issue 371 deals with stability of multiset/multimap under insert and erase (i.e. do they preserve the relative order in ranges of equal elements). The same issue applies to unordered_multiset and unordered_multimap.

Proposed resolution:


520. Result_of and pointers to data members

Section: TR1 3.6 [tr.func.bind]  Status: Tentatively Ready  Submitter: Pete Becker  Date: 3 Jul 2005

In the original proposal for binders, the return type of bind() when called with a pointer to member data as it's callable object was defined to be mem_fn(ptr); when Peter Dimov and I unified the descriptions of the TR1 function objects we hoisted the descriptions of return types into the INVOKE pseudo-function and into result_of. Unfortunately, we left pointer to member data out of result_of, so bind doesn't have any specified behavior when called with a pointer to member data.

Proposed resolution:

[ Pete and Peter will provide wording. ]

In 20.5.4 [lib.func.ret] ([tr.func.ret]) p3 add the following bullet after bullet 2:

  1. If F is a member data pointer type R T::*, type shall be cv R& when T1 is cv U1&, R otherwise.

[ Peter provided wording. ]


521. Garbled requirements for argument_type in reference_wrapper

Section: TR1 2.1.2 [tr.util.refwrp.refwrp]  Status: Ready  Submitter: Pete Becker  Date: 3 Jul 2005

2.1.2/3, second bullet item currently says that reference_wrapper<T> is derived from unary_function<T, R> if T is:

a pointer to member function type with cv-qualifier cv and no arguments; the type T1 is cv T* and R is the return type of the pointer to member function;

The type of T1 can't be cv T*, 'cause that's a pointer to a pointer to member function. It should be a pointer to the class that T is a pointer to member of. Like this:

a pointer to a member function R T0::f() cv (where cv represents the member function's cv-qualifiers); the type T1 is cv T0*

Similarly, bullet item 2 in 2.1.2/4 should be:

a pointer to a member function R T0::f(T2) cv (where cv represents the member function's cv-qualifiers); the type T1 is cv T0*

Proposed resolution:

Change bullet item 2 in 2.1.2/3:

Change bullet item 2 in 2.1.2/4:


522. Tuple doesn't define swap

Section: TR1 6.1 [tr.tuple]  Status: Open  Submitter: Andy Koenig  Date: 3 Jul 2005

Tuple doesn't define swap(). It should.

[ Berlin: Doug to provide wording. ]

Proposed resolution:


523. regex case-insensitive character ranges are unimplementable as specified

Section: TR1 7 [tr.re]  Status: New  Submitter: Eric Niebler  Date: 1 Jul 2005

A problem with TR1 regex is currently being discussed on the Boost developers list. It involves the handling of case-insensitive matching of character ranges such as [Z-a]. The proper behavior (according to the ECMAScript standard) is unimplementable given the current specification of the TR1 regex_traits<> class template. John Maddock, the author of the TR1 regex proposal, agrees there is a problem. The full discussion can be found at http://lists.boost.org/boost/2005/06/28850.php (first message copied below). We don't have any recommendations as yet.

-- Begin original message --

The situation of interest is described in the ECMAScript specification (ECMA-262), section 15.10.2.15:

"Even if the pattern ignores case, the case of the two ends of a range is significant in determining which characters belong to the range. Thus, for example, the pattern /[E-F]/i matches only the letters E, F, e, and f, while the pattern /[E-f]/i matches all upper and lower-case ASCII letters as well as the symbols [, \, ], ^, _, and `."

A more interesting case is what should happen when doing a case-insentitive match on a range such as [Z-a]. It should match z, Z, a, A and the symbols [, \, ], ^, _, and `. This is not what happens with Boost.Regex (it throws an exception from the regex constructor).

The tough pill to swallow is that, given the specification in TR1, I don't think there is any effective way to handle this situation. According to the spec, case-insensitivity is handled with regex_traits<>::translate_nocase(CharT) -- two characters are equivalent if they compare equal after both are sent through the translate_nocase function. But I don't see any way of using this translation function to make character ranges case-insensitive. Consider the difficulty of detecting whether "z" is in the range [Z-a]. Applying the transformation to "z" has no effect (it is essentially std::tolower). And we're not allowed to apply the transformation to the ends of the range, because as ECMA-262 says, "the case of the two ends of a range is significant."

So AFAICT, TR1 regex is just broken, as is Boost.Regex. One possible fix is to redefine translate_nocase to return a string_type containing all the characters that should compare equal to the specified character. But this function is hard to implement for Unicode, and it doesn't play nice with the existing ctype facet. What a mess!

-- End original message --

[ John Maddock adds: ]

One small correction, I have since found that ICU's regex package does implement this correctly, using a similar mechanism to the current TR1.Regex.

Given an expression [c1-c2] that is compiled as case insensitive it:

Enumerates every character in the range c1 to c2 and converts it to it's case folded equivalent. That case folded character is then used a key to a table of equivalence classes, and each member of the class is added to the list of possible matches supported by the character-class. This second step isn't possible with our current traits class design, but isn't necessary if the input text is also converted to a case-folded equivalent on the fly.

ICU applies similar brute force mechanisms to character classes such as [[:lower:]] and [[:word:]], however these are at least cached, so the impact is less noticeable in this case.

Quick and dirty performance comparisons show that expressions such as "[X-\\x{fff0}]+" are indeed very slow to compile with ICU (about 200 times slower than a "normal" expression). For an application that uses a lot of regexes this could have a noticeable performance impact. ICU also has an advantage in that it knows the range of valid characters codes: code points outside that range are assumed not to require enumeration, as they can not be part of any equivalence class. I presume that if we want the TR1.Regex to work with arbitrarily large character sets enumeration really does become impractical.

Finally note that Unicode has:

Three cases (upper, lower and title). One to many, and many to one case transformations. Character that have context sensitive case translations - for example an uppercase sigma has two different lowercase forms - the form chosen depends on context(is it end of a word or not), a caseless match for an upper case sigma should match either of the lower case forms, which is why case folding is often approximated by tolower(toupper(c)).

Probably we need some way to enumerate character equivalence classes, including digraphs (either as a result or an input), and some way to tell whether the next character pair is a valid digraph in the current locale.

Hoping this doesn't make this even more complex that it was already,

Proposed resolution:


524. regex named character classes and case-insensitivity don't mix

Section: TR1 7 [tr.re]  Status: New  Submitter: Eric Niebler  Date: 1 Jul 2005

This defect is also being discussed on the Boost developers list. The full discussion can be found here: http://lists.boost.org/boost/2005/07/29546.php

-- Begin original message --

Also, I may have found another issue, closely related to the one under discussion. It regards case-insensitive matching of named character classes. The regex_traits<> provides two functions for working with named char classes: lookup_classname and isctype. To match a char class such as [[:alpha:]], you pass "alpha" to lookup_classname and get a bitmask. Later, you pass a char and the bitmask to isctype and get a bool yes/no answer.

But how does case-insensitivity work in this scenario? Suppose we're doing a case-insensitive match on [[:lower:]]. It should behave as if it were [[:lower:][:upper:]], right? But there doesn't seem to be enough smarts in the regex_traits interface to do this.

Imagine I write a traits class which recognizes [[:fubar:]], and the "fubar" char class happens to be case-sensitive. How is the regex engine to know that? And how should it do a case-insensitive match of a character against the [[:fubar:]] char class? John, can you confirm this is a legitimate problem?

I see two options:

1) Add a bool icase parameter to lookup_classname. Then, lookup_classname( "upper", true ) will know to return lower|upper instead of just upper.

2) Add a isctype_nocase function

I prefer (1) because the extra computation happens at the time the pattern is compiled rather than when it is executed.

-- End original message --

For what it's worth, John has also expressed his preference for option (1) above.

Proposed resolution:


525. type traits definitions not clear

Section: TR1 4.5 [tr.meta.unary]  Status: Open  Submitter: Robert Klarer  Date: 11 Jul 2005

It is not completely clear how the primary type traits deal with cv-qualified types. And several of the secondary type traits seem to be lacking a definition.

[ Berlin: Howard to provide wording. ]

Proposed resolution:


526. Is it undefined if a function in the standard changes in parameters?

Section: 23.1.1 [lib.sequence.reqmts]  Status: Open  Submitter: Chris Jefferson  Date: 14 Sep 2005

Problem: There are a number of places in the C++ standard library where it is possible to write what appear to be sensible ways of calling functions, but which can cause problems in some (or all) implementations, as they cause the values given to the function to be changed in a way not specified in standard (and therefore not coded to correctly work). These fall into two similar categories.

1) Parameters taken by const reference can be changed during execution of the function

Examples:

Given std::vector<int> v:

v.insert(v.begin(), v[2]);

v[2] can be changed by moving elements of vector

Given std::list<int> l:

l.remove(*l.begin());

Will delete the first element, and then continue trying to access it. This is particularily vicious, as it will appear to work in almost all cases.

2) A range is given which changes during the execution of the function: Similarly,

v.insert(v.begin(), v.begin()+4, v.begin()+6);

This kind of problem has been partly covered in some cases. For example std::copy(first, last, result) states that result cannot be in the range [first, last). However, does this cover the case where result is a reverse_iterator built from some iterator in the range [first, last)? Also, std::copy would still break if result was reverse_iterator(last + 1), yet this is not forbidden by the standard

Solution:

One option would be to try to more carefully limit the requirements of each function. There are many functions which would have to be checked. However as has been shown in the std::copy case, this may be difficult. A simpler, more global option would be to somewhere insert text similar to:

If the execution of any function would change either any values passed by reference or any value in any range passed to a function in a way not defined in the definition of that function, the result is undefined.

Such code would have to at least cover chapters 23 and 25 (the sections I read through carefully). I can see no harm on applying it to much of the rest of the standard.

Some existing parts of the standard could be improved to fit with this, for example the requires for 25.2.1 (Copy) could be adjusted to:

Requires: For each non-negative integer n < (last - first), assigning to *(result + n) must not alter any value in the range [first + n, last).

However, this may add excessive complication.

One other benefit of clearly introducing this text is that it would allow a number of small optimisations, such as caching values passed by const reference.

Matt Austern adds that this issue also exists for the insert and erase members of the ordered and unordered associative containers.

[ Berlin: Lots of controversey over how this should be solved. Lots of confusion as to whether we're talking about self referencing iterators or references. Needs a good survey as to the cases where this matters, for which implementations, and how expensive it is to fix each case. ]

Proposed resolution:


527. tr1::bind has lost its Throws clause

Section: TR1 3.6.3 [tr.func.bind.bind]  Status: Open  Submitter: Peter Dimov  Date: 01 Oct 2005

The original bind proposal gives the guarantee that tr1::bind(f, t1, ..., tN) does not throw when the copy constructors of f, t1, ..., tN don't.

This guarantee is not present in the final version of TR1.

I'm pretty certain that we never removed it on purpose. Editorial omission? :-)

[ Berlin: not quite editorial, needs proposed wording. ]

Proposed resolution:

In 20.5.10.1.3 [lib.func.bind.bind] ([tr.func.bind.bind]), add a new paragraph after p2:

Throws: Nothing unless one of the copy constructors of f, t1, t2, ..., tN throws an exception.

Add a new paragraph after p4:

Throws: nothing unless one of the copy constructors of f, t1, t2, ..., tN throws an exception.

528. TR1: issue 6.19 vs 6.3.4.3/2 (and 6.3.4.5/2)

Section: TR1 6.3.4 [tr.unord.unord]  Status: Open  Submitter: Paolo Carlini  Date: 12 Oct 2005

while implementing the resolution of issue 6.19 I'm noticing the following: according to 6.3.4.3/2 (and 6.3.4.5/2), for unordered_set and unordered_multiset:

"The iterator and const_iterator types are both const types. It is unspecified whether they are the same type"

Now, according to the resolution of 6.19, we have overloads of insert with hint and erase (single and range) both for iterator and const_iterator, which, AFAICS, can be meaningful at the same time *only* if iterator and const_iterator *are* in fact different types.

Then, iterator and const_iterator are *required* to be different types? Or that is an unintended consequence? Maybe the overloads for plain iterators should be added only to unordered_map and unordered_multimap? Or, of course, I'm missing something?

Proposed resolution:

Add to 6.3.4.3p2 (and 6.3.4.5p2):

2 ... The iterator and const_iterator types are both const constant iterator types. It is unspecified whether they are the same type.

Add a new subsection to 17.4.4 [lib.conforming]:

An implementation shall not supply an overloaded function signature specified in any library clause if such a signature would be inherently ambiguous during overload resolution due to two library types referring to the same type.

[Note: For example, this occurs when a container's iterator and const_iterator types are the same. -- end note]

[ Post-Berlin: Beman supplied wording. ]


529. The standard encourages redundant and confusing preconditions

Section: 17.4.3.8 [lib.res.on.required]  Status: Open  Submitter: David Abrahams  Date: 25 Oct 2005

17.4.3.8/1 says:

Violation of the preconditions specified in a function's Required behavior: paragraph results in undefined behavior unless the function's Throws: paragraph specifies throwing an exception when the precondition is violated.

This implies that a precondition violation can lead to defined behavior. That conflicts with the only reasonable definition of precondition: that a violation leads to undefined behavior. Any other definition muddies the waters when it comes to analyzing program correctness, because precondition violations may be routinely done in correct code (e.g. you can use std::vector::at with the full expectation that you'll get an exception when your index is out of range, catch the exception, and continue). Not only is it a bad example to set, but it encourages needless complication and redundancy in the standard. For example:

  21 Strings library 
  21.3.3 basic_string capacity

  void resize(size_type n, charT c);

  5 Requires: n <= max_size()
  6 Throws: length_error if n > max_size().
  7 Effects: Alters the length of the string designated by *this as follows:

The Requires clause is entirely redundant and can be dropped. We could make that simplifying change (and many others like it) even without changing 17.4.3.8/1; the wording there just seems to encourage the redundant and error-prone Requires: clause.

Proposed resolution:

1. Change 17.4.3.8/1 to read:

Violation of the preconditions specified in a function's Required behavior: paragraph results in undefined behavior unless the function's Throws: paragraph specifies throwing an exception when the precondition is violated.

2. Go through and remove redundant Requires: clauses. Specifics to be provided by Dave A.

[ Berlin: The LWG requests a detailed survey of part 2 of the proposed resolution. ]


530. Must elements of a string be contiguous?

Section: 21.3 [lib.basic.string]  Status: Ready  Submitter: Matt Austern  Date: 15 Nov 2005

Issue 69, which was incorporated into C++03, mandated    that the elements of a vector must be stored in contiguous memory.    Should the same also apply to basic_string?

We almost require contiguity already. Clause 23.3.4 [lib.multiset]   defines operator[] as data()[pos]. What's missing   is a similar guarantee if we access the string's elements via the   iterator interface.

Given the existence of data(), and the definition of   operator[] and at in terms of data,   I don't believe it's possible to write a useful and standard-   conforming basic_string that isn't contiguous. I'm not   aware of any non-contiguous implementation. We should just require   it.

Proposed resolution:

Add the following text to the end of 21.3 [lib.basic.string], paragraph 2.

 

The characters in a string are stored contiguously, meaning that if   s is a basic_string<charT, Allocator>, then   it obeys the identity   &*(s.begin() + n) == &*s.begin() + n   for all 0 <= n < s.size().


531. array forms of unformatted input functions

Section: 27.6.1.3 [lib.istream.unformatted]  Status: Ready  Submitter: Martin Sebor  Date: 23 Nov 2005

The array forms of unformatted input functions don't seem to have well-defined semantics for zero-element arrays in a couple of cases. The affected ones (istream::get() and istream::getline()) are supposed to terminate when (n - 1) characters are stored, which obviously can never be true when (n == 0) holds to start with. See c++std-lib-16071.

Proposed resolution:

I suggest changing 27.6.1.3, p7 (istream::get()), bullet 1 to read:

and similarly p17 (istream::getline()), bullet 3 to:

In addition, to clarify that istream::getline() must not store the terminating NUL character unless the the array has non-zero size, Robert Klarer suggests in c++std-lib-16082 to change 27.6.1.3, p20 to read:

In any case, provided (n > 0) is true, it then stores a null character (using charT()) into the next successive location of the array.


532. Tuple comparison

Section: TR1 6.1.3.5 [tr.tuple.rel]  Status: Open  Submitter: David Abrahams  Date: 29 Nov 2005

Where possible, tuple comparison operators <,<=,=>, and > ought to be defined in terms of std::less rather than operator<, in order to support comparison of tuples of pointers.

Proposed resolution:

change 6.1.3.5/5 from:

Returns: The result of a lexicographical comparison between t and u. The result is defined as: (bool)(get<0>(t) < get<0>(u)) || (!(bool)(get<0>(u) < get<0>(t)) && ttail < utail), where rtail for some tuple r is a tuple containing all but the first element of r. For any two zero-length tuples e and f, e < f returns false.

to:

Returns: The result of a lexicographical comparison between t and u. For any two zero-length tuples e and f, e < f returns false. Otherwise, the result is defined as: cmp( get<0>(t), get<0>(u)) || (!cmp(get<0>(u), get<0>(t)) && ttail < utail), where rtail for some tuple r is a tuple containing all but the first element of r, and cmp(x,y) is an unspecified function template defined as follows.

Where T is the type of x and U is the type of y:

if T and U are pointer types and T is convertible to U, returns less<U>()(x,y)

otherwise, if T and U are pointer types, returns less<T>()(x,y)

otherwise, returns (bool)(x < y)

[ Berlin: This issue is much bigger than just tuple (pair, containers, algorithms). Dietmar will survey and work up proposed wording. ]


534. Missing basic_string members

Section: 21.3 [lib.basic.string]  Status: Review  Submitter: Alisdair Meredith  Date: 16 Nov 2005

OK, we all know std::basic_string is bloated and already has way too many members. However, I propose it is missing 3 useful members that are often expected by users believing it is a close approximation of the container concept. All 3 are listed in table 71 as 'optional'

i/ pop_back.

This is the one I feel most strongly about, as I only just discovered it was missing as we are switching to a more conforming standard library <g>

I find it particularly inconsistent to support push_back, but not pop_back.

ii/ back.

There are certainly cases where I want to examine the last character of a string before deciding to append, or to trim trailing path separators from directory names etc. *rbegin() somehow feels inelegant.

iii/ front

This one I don't feel strongly about, but if I can get the first two, this one feels that it should be added as a 'me too' for consistency.

I believe this would be similarly useful to the data() member recently added to vector, or at() member added to the maps.

Proposed resolution:

Add the following members to definition of class template basic_string, 21.3p7

void pop_back ()

const charT & front() const
charT & front()

const charT & back() const
charT & back()

Add the following paragraphs to basic_string description

21.3.4p5

const charT & front() const
charT & front()

Precondition: !empty()

Effects: Equivalent to operator[](0).

21.3.4p6

const charT & back() const
charT & back()

Precondition: !empty()

Effects: Equivalent to operator[]( size() - 1).

21.3.5.5p10

void pop_back ()

Precondition: !empty()

Effects: Equivalent to erase( size() - 1, 1 ).

Update Table 71: (optional sequence operations) Add basic_string to the list of containers for the following operations.

a.front()
a.back()
a.push_back()
a.pop_back()
a[n]

[ Berlin: Has support. Alisdair provided wording. ]


535. std::string::swap specification poorly worded

Section: 21.3.5.8 [lib.string::swap]  Status: Ready  Submitter: Beman Dawes  Date: 14 Dec 2005

std::string::swap currently says for effects and postcondition:

Effects: Swaps the contents of the two strings.

Postcondition: *this contains the characters that were in s, s contains the characters that were in *this.

Specifying both Effects and Postcondition seems redundant, and the postcondition needs to be made stronger. Users would be unhappy if the characters were not in the same order after the swap.

Proposed resolution:

Effects: Swaps the contents of the two strings.

Postcondition: *this contains the same sequence of characters that were was in s, s contains the same sequence of characters that were was in *this.


536. Container iterator constructor and explicit convertibility

Section: 23.1 [lib.container.requirements]  Status: Open  Submitter: Joaquín M López Muñoz  Date: 17 Dec 2005

The iterator constructor X(i,j) for containers as defined in 23.1.1 and 23.2.2 does only require that i and j be input iterators but nothing is said about their associated value_type. There are three sensible options:

  1. iterator's value_type is exactly X::value_type (modulo cv).
  2. iterator's value_type is *implicitly* convertible to X::value_type.
  3. iterator's value_type is *explicitly* convertible to X::value_type.

The issue has practical implications, and stdlib vendors have taken divergent approaches to it: Dinkumware follows 2, libstdc++ follows 3.

The same problem applies to the definition of insert(p,i,j) for sequences and insert(i,j) for associative contianers, as well as assign.

[ The following added by Howard and the example code was originally written by Dietmar. ]

Valid code below?

#include <vector> 
#include <iterator> 
#include <iostream> 

struct foo 
{ 
    explicit foo(int) {} 
}; 

int main() 
{ 
    std::vector<int> v_int; 
    std::vector<foo> v_foo1(v_int.begin(), v_int.end()); 
    std::vector<foo> v_foo2((std::istream_iterator<int>(std::cin)), 
                             std::istream_iterator<int>()); 
} 

Proposed resolution:

[ Berlin: Some support, not universal, for respecting the explicit qualifier. ]


537. Typos in the signatures in 27.6.1.3/42-43 and 27.6.2.4

Section: 27.6.1.3 [lib.istream.unformatted]  Status: Ready  Submitter: Paolo Carlini  Date: 12 Feb 2006

In the most recent working draft, I'm still seeing:

seekg(off_type& off, ios_base::seekdir dir)

and

seekp(pos_type& pos)

seekp(off_type& off, ios_base::seekdir dir)

that is, by reference off and pos arguments.

Proposed resolution:

After 27.6.1.3p42 change:

basic_istream<charT,traits>& seekg(off_type& off, ios_base::seekdir dir);

After 27.6.2.4p1 change:

basic_ostream<charT,traits>& seekp(pos_type& pos);

After 27.6.2.4p3 change:

basic_ostream<charT,traits>& seekp(off_type& off, ios_base::seekdir dir);

538. 241 again: Does unique_copy() require CopyConstructible and Assignable?

Section: 25.2.8 [lib.alg.unique]  Status: Ready  Submitter: Howard Hinnant  Date: 9 Feb 2006

I believe I botched the resolution of 241 "Does unique_copy() require CopyConstructible and Assignable?" which now has WP status.

This talks about unique_copy requirements and currently reads:

-5- Requires: The ranges [first, last) and [result, result+(last-first)) shall not overlap. The expression *result = *first shall be valid. If neither InputIterator nor OutputIterator meets the requirements of forward iterator then the value type of InputIterator must be CopyConstructible (20.1.3). Otherwise CopyConstructible is not required.

The problem (which Paolo discovered) is that when the iterators are at their most restrictive (InputIterator, OutputIterator), then we want InputIterator::value_type to be both CopyConstructible and CopyAssignable (for the most efficient implementation). However this proposed resolution only makes it clear that it is CopyConstructible, and that one can assign from *first to *result. This latter requirement does not necessarily imply that you can:

*first = *first;

Proposed resolution:

-5- Requires: The ranges [first, last) and [result, result+(last-first)) shall not overlap. The expression *result = *first shall be valid. If neither InputIterator nor OutputIterator meets the requirements of forward iterator then the value type value_type of InputIterator must be CopyConstructible (20.1.3) and Assignable. Otherwise CopyConstructible is not required.

539. partial_sum and adjacent_difference should mention requirements

Section: 26.4 [lib.rand]  Status: Open  Submitter: Marc Schoolderman  Date: 6 Feb 2006

There are some problems in the definition of partial_sum and adjacent_difference in 26.4 [lib.numeric.ops]

Unlike accumulate and inner_product, these functions are not parametrized on a "type T", instead, 26.4.3 [lib.partial.sum] simply specifies the effects clause as;

Assigns to every element referred to by iterator i in the range [result,result + (last - first)) a value correspondingly equal to
((...(* first + *( first + 1)) + ...) + *( first + ( i - result )))

And similarly for BinaryOperation. Using just this definition, it seems logical to expect that:

char i_array[4] = { 100, 100, 100, 100 };
int  o_array[4];

std::partial_sum(i_array, i_array+4, o_array);

Is equivalent to

int o_array[4] = { 100, 100+100, 100+100+100, 100+100+100+100 };

i.e. 100, 200, 300, 400, with addition happening in the result type, int.

Yet all implementations I have tested produce 100, -56, 44, -112, because they are using an accumulator of the InputIterator's value_type, which in this case is char, not int.

The issue becomes more noticeable when the result of the expression *i + *(i+1) or binary_op(*i, *i-1) can't be converted to the value_type. In a contrived example:

enum not_int { x = 1, y = 2 };
...
not_int e_array[4] = { x, x, y, y };
std::partial_sum(e_array, e_array+4, o_array);

Is it the intent that the operations happen in the input type, or in the result type?

If the intent is that operations happen in the result type, something like this should be added to the "Requires" clause of 26.4.3/4 [lib.partial.sum]:

The type of *i + *(i+1) or binary_op(*i, *(i+1)) shall meet the requirements of CopyConstructible (20.1.3) and Assignable (23.1) types.

(As also required for T in 26.4.1 [lib.accumulate] and 26.4.2 [lib.inner.product].)

The "auto initializer" feature proposed in N1894 is not required to implement partial_sum this way. The 'narrowing' behaviour can still be obtained by using the std::plus<> function object.

If the intent is that operations happen in the input type, then something like this should be added instead;

The type of *first shall meet the requirements of CopyConstructible (20.1.3) and Assignable (23.1) types. The result of *i + *(i+1) or binary_op(*i, *(i+1)) shall be convertible to this type.

The 'widening' behaviour can then be obtained by writing a custom proxy iterator, which is somewhat involved.

In both cases, the semantics should probably be clarified.

26.4.4 [lib.adjacent.difference] is similarly underspecified, although all implementations seem to perform operations in the 'result' type:

unsigned char i_array[4] = { 4, 3, 2, 1 };
int o_array[4];

std::adjacent_difference(i_array, i_array+4, o_array);

o_array is 4, -1, -1, -1 as expected, not 4, 255, 255, 255.

In any case, adjacent_difference doesn't mention the requirements on the value_type; it can be brought in line with the rest of 26.4 [lib.numeric.ops] by adding the following to 26.4.4/2 [lib.adjacent.difference]:

The type of *first shall meet the requirements of CopyConstructible (20.1.3) and Assignable (23.1) types."

Proposed resolution:

[ Berlin: Giving output iterator's value_types very controversial. Suggestion of adding signatures to allow user to specify "accumulator". ]


540. shared_ptr<void>::operator*()

Section: TR1 2.2.3.5 [tr.util.smartptr.shared.obs]  Status: Ready  Submitter: Martin Sebor  Date: 15 Oct 2005

I'm trying to reconcile the note in tr.util.smartptr.shared.obs, p6 that talks about the operator*() member function of shared_ptr:

Notes: When T is void, attempting to instantiate this member function renders the program ill-formed. [Note: Instantiating shared_ptr<void> does not necessarily result in instantiating this member function. --end note]

with the requirement in temp.inst, p1:

The implicit instantiation of a class template specialization causes the implicit instantiation of the declarations, but not of the definitions...

I assume that what the note is really trying to say is that "instantiating shared_ptr<void> *must not* result in instantiating this member function." That is, that this function must not be declared a member of shared_ptr<void>. Is my interpretation correct?

Proposed resolution:

Change 2.2.3.5p6

-6- Notes: When T is void, attempting to instantiate this member function renders the program ill-formed. [Note: Instantiating shared_ptr<void> does not necessarily result in instantiating this member function. --end note] it is unspecified whether this member function is declared or not, and if so, what its return type is, except that the declaration (although not necessarily the definition) of the function shall be well-formed.

541. shared_ptr template assignment and void

Section: TR1 2.2.3 [tr.util.smartptr.shared]  Status: Tentatively Ready  Submitter: Martin Sebor  Date: 16 Oct 2005

Is the void specialization of the template assignment operator taking a shared_ptr<void> as an argument supposed be well-formed?

I.e., is this snippet well-formed:

shared_ptr<void> p;
p.operator=<void>(p);

Gcc complains about auto_ptr<void>::operator*() returning a reference to void. I suspect it's because shared_ptr has two template assignment operators, one of which takes auto_ptr, and the auto_ptr template gets implicitly instantiated in the process of overload resolution.

The only way I see around it is to do the same trick with auto_ptr<void> operator*() as with the same operator in shared_ptr<void>.

PS Strangely enough, the EDG front end doesn't mind the code, even though in a small test case (below) I can reproduce the error with it as well.

template <class T>
struct A { T& operator*() { return *(T*)0; } };

template <class T>
struct B {
    void operator= (const B&) { }
    template <class U>
    void operator= (const B<U>&) { }
    template <class U>
    void operator= (const A<U>&) { }
};

int main ()
{
    B<void> b;
    b.operator=<void>(b);
}

Proposed resolution:

In [lib.memory] change:

template<class X> class auto_ptr;
template<> class auto_ptr<void>;

In [lib.auto.ptr]/2 add the following before the last closing brace:

template<> class auto_ptr<void>
{
public:
    typedef void element_type;
};

542. shared_ptr observers

Section: TR1 2.2.3.5 [tr.util.smartptr.shared.obs]  Status: New  Submitter: Martin Sebor  Date: 18 Oct 2005

Peter Dimov wrote: To: C++ libraries mailing list Message c++std-lib-15614 [...] The intent is for both use_count() and unique() to work in a threaded environment. They are intrinsically prone to race conditions, but they never return garbage.

This is a crucial piece of information that I really wish were captured in the text. Having this in a non-normative note would have made everything crystal clear to me and probably stopped me from ever starting this discussion :) Instead, the sentence in p12 "use only for debugging and testing purposes, not for production code" very strongly suggests that implementations can and even are encouraged to return garbage (when threads are involved) for performance reasons.

How about adding an informative note along these lines:

Note: Implementations are encouraged to provide well-defined behavior for use_count() and unique() even in the presence of multiple threads.

I don't necessarily insist on the exact wording, just that we capture the intent.

Proposed resolution:


543. valarray slice default constructor

Section: 26.3.4 [lib.complex.members]  Status: New  Submitter: Howard Hinnant  Date: 3 Nov 2005

If one explicitly constructs a slice or glice with the default constructor, does the standard require this slice to have any usable state? It says "creates a slice which specifies no elements", which could be interpreted two ways:

  1. There are no elements to which the slice refers (i.e. undefined).
  2. The slice specifies an array with no elements in it (i.e. defined).

Here is a bit of code to illustrate:

#include <iostream>
#include <valarray>

int main()
{
    std::valarray<int> v(10);
    std::valarray<int> v2 = v[std::slice()];
    std::cout << "v[slice()].size() = " << v2.size() << '\n';
}

Is the behavior undefined? Or should the output be:

v[slice()].size() = 0

There is a similar question and wording for gslice at 26.3.6.1p1.

Proposed resolution:


544. minor NULL problems in C.2

Section: C.2 [diff.library]  Status: Tentatively Ready  Submitter: Martin Sebor  Date: 25 Nov 2005

According to C.2.2.3, p1, "the macro NULL, defined in any of <clocale>, <cstddef>, <cstdio>, <cstdlib>, <cstring>, <ctime>, or <cwchar>." This is consistent with the C standard.

However, Table 95 in C.2 fails to mention <clocale> and <cstdlib>.

In addition, C.2, p2 claims that "The C++ Standard library provides 54 standard macros from the C library, as shown in Table 95." While table 95 does have 54 entries, since a couple of them (including the NULL macro) are listed more than once, the actual number of macros defined by the C++ Standard Library may not be 54.

Proposed resolution:

I propose we add <clocale> and <cstdlib> to Table 96 and remove the number of macros from C.2, p2 and reword the sentence as follows:

The C++ Standard library provides 54 standard macros from defines a number macros corresponding to those defined by the C Standard library, as shown in Table 96.

545. When is a deleter deleted?

Section: TR1 2.2.3.2 [tr.util.smartptr.shared.dest]  Status: New  Submitter: Matt Austern  Date: 10 Jan 2006

The description of ~shared_ptr doesn't say when the shared_ptr's deleter, if any, is destroyed. In principle there are two possibilities: it is destroyed unconditionally whenever ~shared_ptr is executed (which, from an implementation standpoint, means that the deleter is copied whenever the shared_ptr is copied), or it is destroyed immediately after the owned pointer is destroyed (which, from an implementation standpoint, means that the deleter object is shared between instances). We should say which it is.

Proposed resolution:

Add after the first sentence of [lib.util.smartptr.getdeleter]/1:

The returned pointer remains valid as long as there exists a shared_ptr instance that owns d.

[Note: it is unspecified whether the pointer remains valid longer than that. This can happen if the implementation doesn't destroy the deleter until all weak_ptr instances in the ownership group are destroyed. -- end note]


546. _Longlong and _ULonglong are integer types

Section: TR1 5.1.1 [tr.rand.req]  Status: New  Submitter: Matt Austern  Date: 10 Jan 2006

The TR sneaks in two new integer types, _Longlong and _Ulonglong, in [tr.c99]. The rest of the TR should use that type.  I believe this affects two places. First, the random number requirements, 5.1.1/10-11, lists all of the types with which template parameters named IntType and UIntType may be instantiated. _Longlong (or "long long", assuming it is added to C++0x) should be added to the IntType list, and UIntType (again, or "unsigned long long") should be added to the UIntType list.  Second, 6.3.2 lists the types for which hash<> is required to be instantiable. _Longlong and _Ulonglong should be added to that list, so that people may use long long as a hash key.

Proposed resolution:


547. division should be floating-point, not integer

Section: TR1 5.1.3 [tr.rand.var]  Status: New  Submitter: Matt Austern  Date: 10 Jan 2006

Paragraph 10 describes how a variate generator uses numbers produced by an engine to pass to a generator. The sentence that concerns me is: "Otherwise, if the value for engine_value_type::result_type is true and the value for Distribution::input_type is false [i.e. if the engine produces integers and the engine wants floating-point values], then the numbers in s_eng are divided by engine().max() - engine().min() + 1 to obtain the numbers in s_e." Since the engine is producing integers, both the numerator and the denominator are integers and we'll be doing integer division, which I don't think is what we want. Shouldn't we be performing a conversion to a floating-point type first?

Proposed resolution:


548. May random_device block?

Section: TR1 5.1.6 [tr.rand.device]  Status: Open  Submitter: Matt Austern  Date: 10 Jan 2006

Class random_device "produces non-deterministic random numbers", using some external source of entropy. In most real-world systems, the amount of available entropy is limited. Suppose that entropy has been exhausted. What is an implementation permitted to do? In particular, is it permitted to block indefinitely until more random bits are available, or is the implementation required to detect failure immediately? This is not an academic question. On Linux a straightforward implementation would read from /dev/random, and "When the entropy pool is empty, reads to /dev/random will block until additional environmental noise is gathered." Programmers need to know whether random_device is permitted to (or possibly even required to?) behave the same way.

[ Berlin: Walter: N1932 considers this NAD. Does the standard specify whether std::cin may block? ]

Proposed resolution:


549. Undefined variable in binomial_distribution

Section: TR1 5.1.7.5 [tr.rand.dist.bin]  Status: Ready  Submitter: Matt Austern  Date: 10 Jan 2006

Paragraph 1 says that "A binomial distributon random distribution produces integer values i>0 with p(i) = (n choose i) * p*i * (1-p)^(t-i), where t and p are the parameters of the distribution. OK, that tells us what t, p, and i are. What's n?

Proposed resolution:

Berlin: Typo: "n" replaced by "t" in N1932: see 26.3.7.2.2/1.


550. What should the return type of pow(float,int) be?

Section: 26.5 [lib.numarray]  Status: New  Submitter: Howard Hinnant  Date: 12 Jan 2006

Assuming we adopt the C compatibility package from C99 what should be the return type of the following signature be:

?  pow(float, int);

C++03 says that the return type should be float. TR1 and C90/99 say the return type should be double. This can put clients into a situation where C++03 provides answers that are not as high quality as C90/C99/TR1. For example:

#include <math.h>

int main()
{
    float x = 2080703.375F;
    double y = pow(x, 2);
}

Assuming an IEEE 32 bit float and IEEE 64 bit double, C90/C99/TR1 all suggest:

y = 4329326534736.390625

which is exactly right. While C++98/C++03 demands:

y = 4329326510080.

which is only approximately right.

I recommend that C++0X adopt the mixed mode arithmetic already adopted by Fortran, C and TR1 and make the return type of pow(float,int) be double.

Proposed resolution:


551. <ccomplex>

Section: TR1 8.2 [tr.c99.ccmplx]  Status: New  Submitter: Howard Hinnant  Date: 23 Jan 2006

Previously xxx.h was parsable by C++. But in the case of C99's <complex.h> it isn't. Otherwise we could model it just like <string.h>, <cstring>, <string>:

In the case of C's complex, the C API won't compile in C++. So we have:

The ? can't refer to the C API. TR1 currently says:

Proposed resolution:

Strike 8.2 and 8.3 (and any other references to <ccomplex> and <complex.h>).


552. random_shuffle and its generator

Section: 25.2.11 [lib.alg.random.shuffle]  Status: New  Submitter: Martin Sebor  Date: 25 Jan 2006

...is specified to shuffle its range by calling swap but not how (or even that) it's supposed to use the RandomNumberGenerator argument passed to it.

Shouldn't we require that the generator object actually be used by the algorithm to obtain a series of random numbers and specify how many times its operator() should be invoked by the algorithm?

Proposed resolution:


553. very minor editorial change intptr_t / uintptr_t

Section: TR1 8.22.1 [tr.c99.cstdint.syn]  Status: New  Submitter: Paolo Carlini  Date: 30 Jan 2006

In the synopsis, some types are identified as optional: int8_t, int16_t, and so on, consistently with C99, indeed.

On the other hand, intptr_t and uintptr_t, are not marked as such and probably should, consistently with C99, 7.18.1.4.

Proposed resolution:


554. Problem with lwg DR 184 numeric_limits<bool>

Section: 18.2.1.5 [lib.numeric.special]  Status: New  Submitter: Howard Hinnant  Date: 29 Jan 2006

I believe we have a bug in the resolution of: lwg 184 (WP status).

The resolution spells out each member of numeric_limits<bool>. The part I'm having a little trouble with is:

static const bool traps = false;

Should this not be implementation defined? Given:

int main()
{
     bool b1 = true;
     bool b2 = false;
     bool b3 = b1/b2;
}

If this causes a trap, shouldn't numeric_limits<bool>::traps be true?

Proposed resolution:

Change 18.2.1.5p3:

-3- The specialization for bool shall be provided as follows:
namespace std { 
   template <> class numeric_limits<bool> {
      ...
      static const bool traps = false implementation-defined;
      ...
   };
}

555. TR1, 8.21/1: typo

Section: TR1 8.21 [tr.c99.boolh]  Status: New  Submitter: Paolo Carlini  Date: 2 Feb 2006

This one, if nobody noticed it yet, seems really editorial: s/cstbool/cstdbool/

Proposed resolution:

Change 8.21p1:

-1- The header behaves as if it defines the additional macro defined in <cstdbool> by including the header <cstdbool>.

556. is Compare a BinaryPredicate?

Section: 25.3 [lib.alg.sorting]  Status: New  Submitter: Martin Sebor  Date: 5 Feb 2006

In 25, p8 we allow BinaryPredicates to return a type that's convertible to bool but need not actually be bool. That allows predicates to return things like proxies and requires that implementations be careful about what kinds of expressions they use the result of the predicate in (e.g., the expression in if (!pred(a, b)) need not be well-formed since the negation operator may be inaccessible or return a type that's not convertible to bool).

Here's the text for reference:

...if an algorithm takes BinaryPredicate binary_pred as its argument and first1 and first2 as its iterator arguments, it should work correctly in the construct if (binary_pred(*first1, first2)){...}.

In 25.3, p2 we require that the Compare function object return true of false, which would seem to preclude such proxies. The relevant text is here:

Compare is used as a function object which returns true if the first argument is less than the second, and false otherwise...

Proposed resolution:

I think we could fix this by rewording 25.3, p2 to read somthing like:

-2- Compare is used as a function object which returns true if the first argument a BinaryPredicate. The return value of the function call operator applied to an object of type Compare, when converted to type bool, yields true if the first argument of the call is less than the second, and false otherwise. Compare comp is used throughout for algorithms assuming an ordering relation. It is assumed that comp will not apply any non-constant function through the dereferenced iterator.

557. TR1: div(_Longlong, _Longlong) vs div(intmax_t, intmax_t)

Section: TR1 8.11.1 [tr.c99.cinttypes.syn]  Status: New  Submitter: Paolo Carlini  Date: 6 Feb 2006

I'm seeing a problem with such overloads: when, _Longlong == intmax_t == long long we end up, essentially, with the same arguments and different return types (lldiv_t and imaxdiv_t, respectively). Similar issue with abs(_Longlong) and abs(intmax_t), of course.

Comparing sections 8.25 and 8.11, I see an important difference, however: 8.25.3 and 8.25.4 carefully describe div and abs for _Longlong types (rightfully, because not moved over directly from C99), whereas there is no equivalent in 8.11: the abs and div overloads for intmax_t types appear only in the synopsis and are not described anywhere, in particular no mention in 8.11.2 (at variance with 8.25.2).

I'm wondering whether we really, really, want div and abs for intmax_t...

Proposed resolution:

Change the <cstdint> synopsis in 8.11.1:

...
intmax_t imaxabs(intmax_t i);
intmax_t abs(intmax_t i);

imaxdiv_t imaxdiv(intmax_t numer, intmax_t denom);
imaxdiv_t div(intmax_t numer, intmax_t denom);
...

558. lib.input.iterators Defect

Section: 24.1.1 [lib.input.iterators]  Status: New  Submitter: David Abrahams  Date: 9 Feb 2006

24.1.1 Input iterators [lib.input.iterators]

1 A class or a built-in type X satisfies the requirements of an input iterator for the value type T if the following expressions are valid, where U is the type of any specified member of type T, as shown in Table 73.

There is no capital U used in table 73. There is a lowercase u, but that is clearly not meant to denote a member of type T. Also, there's no description in 24.1.1 of what lowercase a means. IMO the above should have been...Hah, a and b are already covered in 24.1/11, so maybe it should have just been:

Proposed resolution:

Change 24.1.1p1:

-1- A class or a built-in type X satisfies the requirements of an input iterator for the value type T if the following expressions are valid, where U is the type of any specified member of type T, as shown in Table 73.

559. numeric_limits<const T>

Section: 18.2 [lib.support.limits]  Status: New  Submitter: Martin Sebor  Date: 19 Feb 2006

18.2.1, p2 requires implementations to provide specializations of the numeric_limits template for each scalar type. While this could be interepreted to include cv-qualified forms of such types such an interepretation is not reflected in the synopsis of the <limits> header.

The absence of specializations of the template on cv-qualified forms of fundamental types makes numeric_limits difficult to use in generic code where the constness (or volatility) of a type is not always immediately apparent. In such contexts, the primary template ends up being instantiated instead of the provided specialization, typically yielding unexpected behavior.

Proposed resolution:

Require that specializations of numeric_limits on cv-qualified fundamental types have the same semantics as those on the unqualifed forms of the same types.

Add to the synopsis of the <limits> header, immediately below the declaration of the primary template, the following:

template <class T> class numeric_limits<const T>;
template <class T> class numeric_limits<volatile T>;
template <class T> class numeric_limits<const volatile T>;

Add a new paragraph to the end of 18.2.1.1, with the following text:

-new-para- The value of each member of a numeric_limits specialization on a cv-qualified T is equal to the value of the same member of numeric_limits<T>.


560. User-defined allocators without default constructor

Section: 20.1.5 [lib.default.con.req]  Status: New  Submitter: Sergey P. Derevyago  Date: 17 Feb 2006

1. The essence of the problem.

User-defined allocators without default constructor are not explicitly supported by the standard but they can be supported just like std::vector supports elements without default constructor.

As a result, there exist implementations that work well with such allocators and implementations that don't.

2. The cause of the problem.

1) The standard doesn't explicitly state this intent but it should. In particular, 20.1.5p5 explicitly state the intent w.r.t. the allocator instances that compare non-equal. So it can similarly state the intent w.r.t. the user-defined allocators without default constructor.

2) Some container operations are obviously underspecified. In particular, 21.3.7.1p2 tells:

template<class charT, class traits, class Allocator>
  basic_string<charT,traits,Allocator> operator+(
    const charT* lhs,
    const basic_string<charT,traits,Allocator>& rhs
  );
Returns: basic_string<charT,traits,Allocator>(lhs) + rhs.

That leads to the basic_string<charT,traits,Allocator>(lhs, Allocator()) call. Obviously, the right requirement is:

Returns: basic_string<charT,traits,Allocator>(lhs, rhs.get_allocator()) + rhs.

It seems like a lot of DRs can be submitted on this "Absent call to get_allocator()" topic.

3. Proposed actions.

1) Explicitly state the intent to allow for user-defined allocators without default constructor in 20.1.5 Allocator requirements.

2) Correct all the places, where a correct allocator object is available through the get_allocator() call but default Allocator() gets passed instead.

4. Code sample.

Let's suppose that the following memory pool is available:

class mem_pool {
      // ...
      void* allocate(size_t size);
      void deallocate(void* ptr, size_t size);
};

So the following allocator can be implemented via this pool:

class stl_allocator {
      mem_pool& pool;

 public:
      explicit stl_allocator(mem_pool& mp) : pool(mp) {}
      stl_allocator(const stl_allocator& sa) : pool(sa.pool) {}
      template <class U>
      stl_allocator(const stl_allocator<U>& sa)  : pool(sa.get_pool()) {}
      ~stl_allocator() {}

      pointer allocate(size_type n, std::allocator<void>::const_pointer = 0)
      {
       return (n!=0) ? static_cast<pointer>(pool.allocate(n*sizeof(T))) : 0;
      }

      void deallocate(pointer p, size_type n)
      {
       if (n!=0) pool.deallocate(p, n*sizeof(T));
      }

      // ...
};

Then the following code works well on some implementations and doesn't work on another:

typedef basic_string<char, char_traits<char>, stl_allocator<char> > 
  tl_string;
mem_pool mp;
tl_string s1("abc", stl_allocator<int>(mp));
printf("(%s)\n", ("def"+s1).c_str());

In particular, on some implementations the code can't be compiled without default stl_allocator() constructor.

The obvious way to solve the compile-time problems is to intentionally define a NULL pointer dereferencing default constructor

stl_allocator() : pool(*static_cast<mem_pool*>(0)) {}

in a hope that it will not be called. The problem is that it really gets called by operator+(const char*, const string&) under the current 21.3.7.1p2 wording.

Proposed resolution:


561. inserter overly generic

Section: 24.4.2.6.5 [lib.inserter]  Status: New  Submitter: Howard Hinnant  Date: 21 Feb 2006

The declaration of std::inserter is:

template <class Container, class Iterator>
insert_iterator<Container>
inserter(Container& x, Iterator i);

The template parameter Iterator in this function is completely unrelated to the template parameter Container when it doesn't need to be. This causes the code to be overly generic. That is, any type at all can be deduced as Iterator, whether or not it makes sense. Now the same is true of Container. However, for every free (unconstrained) template parameter one has in a signature, the opportunity for a mistaken binding grows geometrically.

It would be much better if inserter had the following signature instead:

template <class Container>
insert_iterator<Container>
inserter(Container& x, typename Container::iterator i);

Now there is only one free template parameter. And the second argument to inserter must be implicitly convertible to the container's iterator, else the call will not be a viable overload (allowing other functions in the overload set to take precedence). Furthermore, the first parameter must have a nested type named iterator, or again the binding to std::inserter is not viable. Contrast this with the current situation where any type can bind to Container or Iterator and those types need not be anything closely related to containers or iterators.

This can adversely impact well written code. Consider:

#include <iterator>
#include <string>

namespace my
{

template <class String>
struct my_type {};

struct my_container
{
template <class String>
void push_back(const my_type<String>&);
};

template <class String>
void inserter(const my_type<String>& m, my_container& c) {c.push_back(m);}

}  // my

int main()
{
    my::my_container c;
    my::my_type<std::string> m;
    inserter(m, c);
}

Today this code fails because the call to inserter binds to std::inserter instead of to my::inserter. However with the proposed change std::inserter will no longer be a viable function which leaves only my::inserter in the overload resolution set. Everything works as the client intends.

To make matters a little more insidious, the above example works today if you simply change the first argument to an rvalue:

    inserter(my::my_type(), c);

It will also work if instantiated with some string type other than std::string (or any other std type). It will also work if <iterator> happens to not get included.

And it will fail again for such inocuous reaons as my_type or my_container privately deriving from any std type.

It seems unfortunate that such simple changes in the client's code can result in such radically differing behavior.

Proposed resolution:

Change 24.2:

24.2 Header <iterator> synopsis
...
template <class Container, class Iterator>
   insert_iterator<Container> inserter(Container& x, Iterator typename Container::iterator i);
...

Change 24.4.2.5:

24.4.2.5 Class template insert_iterator
...
template <class Container, class Iterator>
   insert_iterator<Container> inserter(Container& x, Iterator typename Container::iterator i);
...

Change 24.4.2.6.5:

24.4.2.6.5 inserter

template <class Container, class Inserter>
   insert_iterator<Container> inserter(Container& x, Inserter typename Container::iterator i);
-1- Returns: insert_iterator<Container>(x,typename Container::iterator(i)).

562. stringbuf ctor inefficient

Section: 27.7 [lib.string.streams]  Status: New  Submitter: Martin Sebor  Date: 23 Feb 2006

For better efficiency, the requirement on the stringbuf ctor that takes a string argument should be loosened up to let it set epptr() beyond just one past the last initialized character just like overflow() has been changed to be allowed to do (see issue 432). That way the first call to sputc() on an object won't necessarily cause a call to overflow. The corresponding change should be made to the string overload of the str() member function.

Proposed resolution:

Change 27.7.1.1, p3 of the Working Draft, N1804, as follows:

explicit basic_stringbuf(const basic_string<charT,traits,Allocator>& str,
               ios_base::openmode which = ios_base::in | ios_base::out);

-3- Effects: Constructs an object of class basic_stringbuf, initializing the base class with basic_streambuf() (27.5.2.1), and initializing mode with which. Then calls str(s). copies the content of str into the basic_stringbuf underlying character sequence. If which & ios_base::out is true, initializes the output sequence such that pbase() points to the first underlying character, epptr() points one past the last underlying character, and pptr() is equal to epptr() if which & ios_base::ate is true, otherwise pptr() is equal to pbase(). If which & ios_base::in is true, initializes the input sequence such that eback() and gptr() point to the first underlying character and egptr() points one past the last underlying character.

Change the Effects clause of the str() in 27.7.1.2, p2 to read:

-2- Effects: Copies the contents of s into the basic_stringbuf underlying character sequence and initializes the input and output sequences according to mode. If mode & ios_base::out is true, initializes the output sequence such that pbase() points to the first underlying character, epptr() points one past the last underlying character, and pptr() is equal to epptr() if mode & ios_base::in is true, otherwise pptr() is equal to pbase(). If mode & ios_base::in is true, initializes the input sequence such that eback() and gptr() point to the first underlying character and egptr() points one past the last underlying character.

-3- Postconditions: If mode & ios_base::out is true, pbase() points to the first underlying character and (epptr() >= pbase() + s.size()) holds; in addition, if mode & ios_base::in is true, (pptr() == pbase() + s.data()) holds, otherwise (pptr() == pbase()) is true. If mode & ios_base::in is true, eback() points to the first underlying character, and (gptr() == eback()) and (egptr() == eback() + s.size()) hold.


563. stringbuf seeking from end

Section: 27.7.1.3 [lib.stringbuf.virtuals]  Status: New  Submitter: Martin Sebor  Date: 23 Feb 2006

According to Table 92 (unchanged by issue 432), when (way == end) the newoff value in out mode is computed as the difference between epptr() and pbase().

This value isn't meaningful unless the value of epptr() can be precisely controlled by a program. That used to be possible until we accepted the resolution of issue 432, but since then the requirements on overflow() have been relaxed to allow it to make more than 1 write position available (i.e., by setting epptr() to some unspecified value past pptr()). So after the first call to overflow() positioning the output sequence relative to end will have unspecified results.

In addition, in in|out mode, since (egptr() == epptr()) need not hold, there are two different possible values for newoff: epptr() - pbase() and egptr() - eback().

Proposed resolution:

Change the newoff column in the last row of Table 94 to read:

the end high mark pointer minus the beginning pointer (xend high_mark - xbeg).

564. stringbuf seekpos underspecified

Section: 27.7.1.3 [lib.stringbuf.virtuals]  Status: New  Submitter: Martin Sebor  Date: 23 Feb 2006

The effects of the seekpos() member function of basic_stringbuf simply say that the function positions the input and/or output sequences but fail to spell out exactly how. This is in contrast to the detail in which seekoff() is described.

Proposed resolution:

Change 27.7.1.3, p13 to read:

-13- Effects: Same as seekoff(off_type(sp), ios_base::beg, which). Alters the stream position within the controlled sequences, if possible, to correspond to the stream position stored in sp (as described below).


565. xsputn inefficient

Section: 27.5.2.4.5 [lib.streambuf.virt.put]  Status: New  Submitter: Martin Sebor  Date: 23 Feb 2006

streambuf::xsputn() is specified to have the effect of "writing up to n characters to the output sequence as if by repeated calls to sputc(c)."

Since sputc() is required to call overflow() when (pptr() == epptr()) is true, strictly speaking xsputn() should do the same. However, doing so would be suboptimal in some interesting cases, such as in unbuffered mode or when the buffer is basic_stringbuf.

Assuming calling overflow() is not really intended to be required and the wording is simply meant to describe the general effect of appending to the end of the sequence it would be worthwhile to mention in xsputn() that the function is not actually required to cause a call to overflow().

Proposed resolution:

Add the following sentence to the xsputn() Effects clause in 27.5.2.4.5, p1 (N1804):

-1- Effects: Writes up to n characters to the output sequence as if by repeated calls to sputc(c). The characters written are obtained from successive elements of the array whose first element is designated by s. Writing stops when either n characters have been written or a call to sputc(c) would return traits::eof(). It is uspecified whether the function calls overflow() when (pptr() == epptr()) becomes true or whether it achieves the same effects by other means.

In addition, I suggest to add a footnote to this function with the same text as Footnote 292 to make it extra clear that derived classes are permitted to override xsputn() for efficiency.


566. array forms of unformatted input function undefined for zero-element arrays

Section: 27.6.1.3 [lib.istream.unformatted]  Status: New  Submitter: Martin Sebor  Date: 23 Feb 2006

The array forms of unformatted input functions don't have well-defined semantics for zero-element arrays in a couple of cases. The affected ones (istream::get() and getline()) are supposed to terminate when (n - 1) characters are stored, which obviously can never be true when (n == 0) to start with.

Proposed resolution:

I propose the following changes (references are relative to the Working Draft (document N1804).

Change 27.6.1.3, p8 (istream::get()), bullet 1 as follows:

if (n < 1) is true or (n - 1) characters are stored;

Similarly, change 27.6.1.3, p18 (istream::getline()), bullet 3 as follows:

(n < 1) is true or (n - 1) characters are stored (in which case the function calls setstate(failbit)).

Finally, change p21 as follows:

In any case, provided (n > 0) is true, it then stores a null character (using charT()) into the next successive location of the array.


567. streambuf inserter and extractor should be unformatted

Section: 27.6 [lib.iostream.format]  Status: New  Submitter: Martin Sebor  Date: 25 Feb 2006

Issue 60 explicitly made the extractor and inserter operators that take a basic_streambuf* argument formatted input and output functions, respectively. I believe that's wrong, certainly in the case of the extractor, since formatted functions begin by extracting and discarding whitespace. The extractor should not discard any characters.

Proposed resolution:

I propose to change each operator to behave as unformatted input and output function, respectively. The changes below are relative to the working draft document number N1804.

Specifically, change 27.6.1.2.3, p14 as follows:

Effects: Behaves as an unformatted input function (as described in 27.6.1.2.127.6.1.3, paragraph 1).

And change 27.6.2.5.3, p7 as follows:

Effects: Behaves as an unformatted output function (as described in 27.6.2.5.127.6.2.6, paragraph 1).


568. log2 overloads missing

Section: TR1 8.16.4 [tr.c99.cmath.over]  Status: New  Submitter: Paolo Carlini  Date: 7 Mar 2006

log2 is missing from the list of "additional overloads" in 8.16.4p1.

Proposed resolution:

Add log2 to the list of functions in 8.16.4p1.


569. Postcondition for basic_ios::clear(iostate) incorrectly stated

Section: 27.4.4.3 [lib.iostate.flags]  Status: Tentatively Ready  Submitter: Seungbeom Kim  Date: 10 Mar 2006

Section: 27.4.4.3 [lib.iostate.flags]

Paragraph 4 says:

void clear(iostate state = goodbit);

Postcondition: If rdbuf()!=0 then state == rdstate(); otherwise rdstate()==state|ios_base::badbit.

The postcondition "rdstate()==state|ios_base::badbit" is parsed as "(rdstate()==state)|ios_base::badbit", which is probably what the committee meant.

Proposed resolution:

Rationale:

This is a duplicate of issue 272.


570. Request adding additional explicit specializations of char_traits

Section: 21.1 [lib.char.traits]  Status: New  Submitter: Jack Reeves  Date: 6 Apr 2006

Currently, the Standard Library specifies only a declaration for template class char_traits<> and requires the implementation provide two explicit specializations: char_traits<char> and char_traits<wchar_t>. I feel the Standard should require explicit specializations for all built-in character types, i.e. char, wchar_t, unsigned char, and signed char.

I have put together a paper (N1985) that describes this in more detail and includes all the necessary wording.

Proposed resolution:


571. Update C90 references to C99?

Section: 1.2 [intro.refs]  Status: New  Submitter: Beman Dawes  Date: 8 Apr 2006

1.2 Normative references [intro.refs] of the WP currently refers to ISO/IEC 9899:1990, Programming languages - C. Should that be changed to ISO/IEC 9899:1999?

What impact does this have on the library?

Proposed resolution:

In 1.2/1 [intro.refs] of the WP, change:


572. Oops, we gave 507 WP status

Section: TR1 5.1.3 [tr.rand.var]  Status: New  Submitter: Howard Hinnant  Date: 11 Apr 2006

In Berlin, as a working group, we voted in favor of N1932 which makes issue 507 moot: variate_generator has been eliminated. Then in full committee we voted to give this issue WP status (mistakenly).

Proposed resolution:

Strike the proposed resolution of issue 507.


573. C++0x file positioning should handle modern file sizes

Section: 27.4.3 [lib.fpos]  Status: New  Submitter: Beman Dawes  Date: 12 Apr 2006

There are two deficiencies related to file sizes:

  1. It doesn't appear that the Standard Library is specified in a way that handles modern file sizes, which are often too large to be represented by an unsigned long.
  2. The std::fpos class does not currently have the ability to set/get file positions.

The Dinkumware implementation of the Standard Library as shipped with the Microsoft compiler copes with these issues by:

  1. Defining fpos_t be long long, which is large enough to represent any file position likely in the foreseeable future.
  2. Adding member functions to class fpos. For example,
    fpos_t seekpos() const;
    

Because there are so many types relating to file positions and offsets (fpos_t, fpos, pos_type, off_type, streamoff, streamsize, streampos, wstreampos, and perhaps more), it is difficult to know if the Dinkumware extensions are sufficient. But they seem a useful starting place for discussions, and they do represent existing practice.

Proposed resolution:


574. DR 369 Contradicts Text

Section: 27.3 [lib.iostream.objects]  Status: New  Submitter: Pete Becker  Date: 18 Apr 2006

lib.iostream.objects requires that the standard stream objects are never destroyed, and it requires that they be destroyed.

DR 369 adds words to say that we really mean for ios_base::Init objects to force construction of standard stream objects. It ends, though, with the phrase "these stream objects shall be destroyed after the destruction of dynamically ...". However, the rule for destruction is stated in the standard: "The objects are not destroyed during program execution."

Proposed resolution:


575. the specification of ~shared_ptr is MT-unfriendly, makes implementation assumptions

Section: 20.6.6.2.2 [lib.util.smartptr.shared.dest]  Status: New  Submitter: Peter Dimov  Date: 23 Apr 2006

[tr.util.smartptr.shared.dest] says in its second bullet:

"If *this shares ownership with another shared_ptr instance (use_count() > 1), decrements that instance's use count."

The problem with this formulation is that it presupposes the existence of an "use count" variable that can be decremented and that is part of the state of a shared_ptr instance (because of the "that instance's use count".)

This is contrary to the spirit of the rest of the specification that carefully avoids to require an use count variable. Instead, use_count() is specified to return a value, a number of instances.

In multithreaded code, the usual implicit assumption is that a shared variable should not be accessed by more than one thread without explicit synchronization, and by introducing the concept of an "use count" variable, the current wording implies that two shared_ptr instances that share ownership cannot be destroyed simultaneously.

In addition, if we allow the interpretation that an use count variable is part of shared_ptr's state, this would lead to other undesirable consequences WRT multiple threads. For example,

p1 = p2;

would now visibly modify the state of p2, a "write" operation, requiring a lock.

Proposed resolution:

Change the first two bullets of [lib.util.smartptr.shared.dest]/1 to:

Add the following paragraph after [lib.util.smartptr.shared.dest]/1:

[Note: since the destruction of *this decreases the number of instances in *this's ownership group by one, all shared_ptr instances that share ownership with *this will report an use_count() that is one lower than its previous value after *this is destroyed. --end note]


576. find_first_of is overconstrained

Section: 25.1.4 [lib.alg.find.first.of]  Status: New  Submitter: Doug Gregor  Date: 25 Apr 2006

In 25.1.4 Find First [lib.alg.find.first], the two iterator type parameters to find_first_of are specified to require Forward Iterators, as follows:

template<class ForwardIterator1, class ForwardIterator2>
  ForwardIterator1
  find_first_of(ForwardIterator1 first1, ForwardIterator1 last1,
                        ForwardIterator2 first2, ForwardIterator2 last2);
template<class ForwardIterator1, class ForwardIterator2,
                  class BinaryPredicate>
ForwardIterator1
  find_first_of(ForwardIterator1 first1, ForwardIterator1 last1,
                         ForwardIterator2 first2, ForwardIterator2 last2,
                        BinaryPredicate pred);

However, ForwardIterator1 need not actually be a Forward Iterator; an Input Iterator suffices, because we do not need the multi-pass property of the Forward Iterator or a true reference.

Proposed resolution:

Change the declarations of find_first_of to:

template<class ForwardIterator1InputIterator1, class ForwardIterator2>
  ForwardIterator1InputIterator1
  find_first_of(ForwardIterator1InputIterator1 first1, ForwardIterator1InputIterator1 last1,
                        ForwardIterator2 first2, ForwardIterator2 last2);
template<class ForwardIterator1InputIterator1, class ForwardIterator2,
                  class BinaryPredicate>
ForwardIterator1InputIterator1
  find_first_of(ForwardIterator1InputIterator1 first1, ForwardIterator1InputIterator1 last1,
                         ForwardIterator2 first2, ForwardIterator2 last2,
                        BinaryPredicate pred);

577. upper_bound(first, last, ...) cannot return last

Section: 25.3.3.2 [lib.upper.bound]  Status: New  Submitter: Seungbeom Kim  Date: 3 May 2006

ISO/IEC 14882:2003 says:

25.3.3.2 upper_bound

Returns: The furthermost iterator i in the range [first, last) such that for any iterator j in the range [first, i) the following corresponding conditions hold: !(value < *j) or comp(value, *j) == false.

From the description above, upper_bound cannot return last, since it's not in the interval [first, last). This seems to be a typo, because if value is greater than or equal to any other values in the range, or if the range is empty, returning last seems to be the intended behaviour. The corresponding interval for lower_bound is also [first, last].

Proposed resolution:

Change [lib.upper.bound]:

Returns: The furthermost iterator i in the range [first, last)] such that for any iterator j in the range [first, i) the following corresponding conditions hold: !(value < *j) or comp(value, *j) == false.


578. purpose of hint to allocator::allocate()

Section: 20.6.1.1 [lib.allocator.members]  Status: New  Submitter: Martin Sebor  Date: 17 May 2006

The description of the allocator member function allocate() requires that the hint argument be either 0 or a value previously returned from allocate(). Footnote 227 further suggests that containers may pass the address of an adjacent element as this argument.

I believe that either the footnote is wrong or the normative requirement that the argument be a value previously returned from a call to allocate() is wrong. The latter is supported by the resolution to issue 20-004 proposed in c++std-lib-3736 by Nathan Myers. In addition, the hint is an ordinary void* and not the pointer type returned by allocate(), with the two types potentially being incompatible and the requirement impossible to satisfy.

See also c++std-lib-14323 for some more context on where this came up (again).

Proposed resolution:

Remove the requirement in 20.6.1.1, p4 that the hint be a value previously returned from allocate(). Specifically, change the paragraph as follows:

Requires: hint is either 0 or previously obtained from member allocate and not yet passed to member deallocate the address of a byte in memory (1.7).


579. erase(iterator) for unordered containers should not return an iterator

Section: 23.1.3 [lib.unord.req]  Status: New  Submitter: Joaquín M López Muñoz  Date: 13 Jun 2006

See N2023 for full discussion.

Proposed resolution:

Option 1:

The problem can be eliminated by omitting the requirement that a.erase(q) return an iterator. This is, however, in contrast with the equivalent requirements for other standard containers.

Option 2:

a.erase(q) can be made to compute the next iterator only when explicitly requested: the technique consists in returning a proxy object implicitly convertible to iterator, so that

iterator q1=a.erase(q);

works as expected, while

a.erase(q);

does not ever invoke the conversion-to-iterator operator, thus avoiding the associated computation. To allow this technique, some sections of TR1 along the line "return value is an iterator..." should be changed to "return value is an unspecified object implicitly convertible to an iterator..." Although this trick is expected to work transparently, it can have some collateral effects when the expression a.erase(q) is used inside generic code.


580. unused allocator members

Section: 20.1.6 [lib.allocator.requirements]  Status: New  Submitter: Martin Sebor  Date: 14 June 2006

C++ Standard Library templates that take an allocator as an argument are required to call the allocate() and deallocate() members of the allocator object to obtain storage. However, they do not appear to be required to call any other allocator members such as construct(), destroy(), address(), and max_size(). This makes these allocator members less than useful in portable programs.

Proposed resolution:

It's unclear to me whether the absence of the requirement to use these allocator members is an unintentional omission or a deliberate choice. However, since the functions exist in the standard allocator and since they are required to be provided by any user-defined allocator I believe the standard ought to be clarified to explictly specify whether programs should or should not be able to rely on standard containers calling the functions.

I propose that all containers be required to make use of these functions.


581. flush() not unformatted function

Section: 27.6.2.6 [lib.ostream.unformatted]  Status: New  Submitter: Martin Sebor  Date: 14 June 2006

The resolution of issue 60 changed basic_ostream::flush() so as not to require it to behave as an unformatted output function. That has at least two in my opinion problematic consequences:

First, flush() now calls rdbuf()->pubsync() unconditionally, without regard to the state of the stream. I can't think of any reason why flush() should behave differently from the vast majority of stream functions in this respect.

Second, flush() is not required to catch exceptions from pubsync() or set badbit in response to such events. That doesn't seem right either, as most other stream functions do so.

Proposed resolution:

I propose to revert the resolution of issue 60 with respect to flush(). Specifically, I propose to change 27.6.2.6, p7 as follows:

Effects: Behaves as an unformatted output function (as described in 27.6.2.6, paragraph 1). If rdbuf() is not a null pointer, constructs a sentry object. If this object returns true when converted to a value of type bool the function calls rdbuf()->pubsync(). If that function returns -1 calls setstate(badbit) (which may throw ios_base::failure (27.4.4.3)). Otherwise, if the sentry object returns false, does nothing.Does not behave as an unformatted output function (as described in 27.6.2.6, paragraph 1).

582. specialized algorithms and volatile storage

Section: 20.6.4.1 [lib.uninitialized.copy]  Status: New  Submitter: Martin Sebor  Date: 14 June 2006

The specialized algorithms [lib.specialized.algorithms] are specified as having the general effect of invoking the following expression:

new (static_cast<void*>(&*i))
    typename iterator_traits<ForwardIterator>::value_type (x)

            

This expression is ill-formed when the type of the subexpression &*i is some volatile-qualified T.

Proposed resolution:

In order to allow these algorithms to operate on volatile storage I propose to change the expression so as to make it well-formed even for pointers to volatile types. Specifically, I propose the following changes to clauses 20 and 24. Change 20.6.4.1, p1 to read:

Effects:

typedef typename iterator_traits<ForwardIterator>::pointer    pointer;
typedef typename iterator_traits<ForwardIterator>::value_type value_type;

for (; first != last; ++result, ++first)
    new (static_cast<void*>(const_cast<pointer>(&*result))
        value_type (*first);

            

change 20.6.4.2, p1 to read

Effects:

typedef typename iterator_traits<ForwardIterator>::pointer    pointer;
typedef typename iterator_traits<ForwardIterator>::value_type value_type;

for (; first != last; ++result, ++first)
    new (static_cast<void*>(const_cast<pointer>(&*first))
        value_type (*x);

            

and change 20.6.4.3, p1 to read

Effects:

typedef typename iterator_traits<ForwardIterator>::pointer    pointer;
typedef typename iterator_traits<ForwardIterator>::value_type value_type;

for (; n--; ++first)
    new (static_cast<void*>(const_cast<pointer>(&*first))
        value_type (*x);

            

In addition, since there is no partial specialization for iterator_traits<volatile T*> I propose to add one to parallel such specialization for <const T*>. Specifically, I propose to add the following text to the end of 24.3.1, p3:

and for pointers to volatile as

namespace std {
template<class T> struct iterator_traits<volatile T*> {
typedef ptrdiff_t difference_type;
typedef T value_type;
typedef volatile T* pointer;
typedef volatile T& reference;
typedef random_access_iterator_tag iterator_category;
};
}

            

Note that the change to iterator_traits isn't necessary in order to implement the specialized algorithms in a way that allows them to operate on volatile strorage. It is only necesassary in order to specify their effects in terms of iterator_traits as is done here. Implementations can (and some do) achieve the same effect by means of function template overloading.


583. div() for unsigned integral types

Section: 26.7 [lib.c.math]  Status: New  Submitter: Beman Dawes  Date: 15 Jun 2006

There is no div() function for unsigned integer types.

There are several possible resolutions. The simplest one is noted below. Other possibilities include a templated solution.

Proposed resolution:

Add to 26.7 [lib.c.math] paragraph 8:

struct udiv_t div(unsigned, unsigned);
struct uldiv_t div(unsigned long, unsigned long);
struct ulldiv_t div(unsigned long long, unsigned long long);

584. missing int pow(int,int) functionality

Section: 26.7 [lib.c.math]  Status: New  Submitter: Beman Dawes  Date: 15 Jun 2006

There is no pow() function for any integral type.

Proposed resolution:

Add something like:

template< typename T>
T power( T x, int n );
// requires: n >=0

585. facet error reporting

Section: 22.2 [lib.locale.categories]  Status: New  Submitter: Martin Sebor, Paolo Carlini  Date: 22 June 2006

Section 22.2, paragraph 2 requires facet get() members that take an ios_base::iostate& argument, err, to ignore the (initial) value of the argument, but to set it to ios_base::failbit in case of a parse error.

We believe there are a few minor problems with this blanket requirement in conjunction with the wording specific to each get() member function.

First, besides get() there are other member functions with a slightly different name (for example, get_date()). It's not completely clear that the intent of the paragraph is to include those as well, and at least one implementation has interpreted the requirement literally.

Second, the requirement to "set the argument to ios_base::failbit suggests that the functions are not permitted to set it to any other value (such as ios_base::eofbit, or even ios_base::eofbit | ios_base::failbit).

However, 22.2.2.1.2, p5 (Stage 3 of num_get parsing) and p6 (bool parsing) specifies that the do_get functions perform err |= ios_base::eofbit, which contradicts the earlier requirement to ignore err's initial value.

22.2.6.1.2, p1 (the Effects clause of the money_get facet's do_get member functions) also specifies that err's initial value be used to compute the final value by ORing it with either ios_base::failbit or withios_base::eofbit | ios_base::failbit.

Proposed resolution:

We believe the intent is for all facet member functions that take an ios_base::iostate& argument to:

To that effect we propose to change 22.2, p2 as follows:

The put() members make no provision for error reporting. (Any failures of the OutputIterator argument must be extracted from the returned iterator.) Unless otherwise specified, the get() members that take an ios_base::iostate& argument whose value they ignore, but set to ios_base::failbit in case of a parse error., err, start by evaluating err = ios_base::goodbit, and may subsequently set err to either ios_base::eofbit, or ios_base::failbit, or ios_base::eofbit | ios_base::failbit in response to reaching the end-of-file or in case of a parse error, or both, respectively.


586. string inserter not a formatted function

Section: 21.3.7.9 [lib.string.io]  Status: New  Submitter: Martin Sebor  Date: 22 June 2006

Section and paragraph numbers in this paper are relative to the working draft document number N2009 from 4/21/2006.

The basic_string extractor in 21.3.7.9, p1 is clearly required to behave as a formatted input function, as is the std::getline() overload for string described in p7.

However, the basic_string inserter described in p5 of the same section has no such requirement. This has implications on how the operator responds to exceptions thrown from xsputn() (formatted output functions are required to set badbit and swallow the exception unless badbit is also set in exceptions(); the string inserter doesn't have any such requirement).

I don't see anything in the spec for the string inserter that would justify requiring it to treat exceptions differently from all other similar operators. (If it did, I think it should be made this explicit by saying that the operator "does not behave as a formatted output function" as has been made customary by the adoption of the resolution of issue 60).

Proposed resolution:

I propose to change the Effects clause in 21.3.7.9, p5, as follows:

Effects: Begins by constructing a sentry object k as if k were constructed by typename basic_ostream<charT, traits>::sentry k (os). If bool(k) is true, Behaves as a formatted output function (27.6.2.5.1). After constructing a sentry object, if this object returns true when converted to a value of type bool, determines padding as described in 22.2.2.2.2, then inserts the resulting sequence of characters seq as if by calling os.rdbuf()->sputn(seq , n), where n is the larger of os.width() and str.size(); then calls os.width(0). If the call to sputn fails, calls os.setstate(ios_base::failbit).

This proposed resilution assumes the resolution of issue 394 (i.e., that all formatted output functions are required to set ios_base::badbit in response to any kind of streambuf failure), and implicitly assumes that a return value of sputn(seq, n) other than n indicates a failure.


587. iststream ctor missing description

Section: D.7.2.1 [depr.istrstream.cons]  Status: New  Submitter: Martin Sebor  Date: 22 June 2006

The iststream(char*, streamsize) ctor is in the class synopsis in D.7.2 but its signature is missing in the description below (in D.7.2.1).

Proposed resolution:

This seems like a simple editorial issue and the missing signature can be added to the one for const char* in paragraph 2.

----- End of document -----