1. Revision History
r0: initial revision
2. Intro
Project Euler is a project with many mathematical-related questions that are intended to encourage the reader to write a small program to compute the result. In this case, one of the problems there, no. 37 [EULER], helped reveal a pitfall coming from the definition ofstd :: counted_iterator
.
3. Problem description
Look at this example code [CE-FILTER]:#include <ranges>#include <iostream>namespace rv = std :: views ; int main () { for ( auto i : rv :: iota ( 0 ) | rv :: filter ([]( auto i ) { return i < 10 ; }) | rv :: take ( 10 )) std :: cout << i << '\n' ; }
Compiler explorer gets a timeout when trying to run this simple example, instead
of printing the numbers from 0 to 9. Running the same code locally, it runs for
very long time. Tracking the roots of the issue, the problem is that
uses
when the range isn’t
and
increments the internal iterator even if the counter has reached the requested
count. In this case, the filter never returns when trying to increment it once
again (at least not until
reaches the UB case of signed overflow).
The example above is just for illustration, but we can think about cases where
it isn’t clear for the user how many items the filter is expected to return, so
limiting the output count with
becomes dangerous and results in
unexpected behavior.
The problem mentioned in the intro is one that actually describes a filter that
return exactly 11 elements, so trying to use
on it means the program
never ends even while it got 11 elements already.
3.1. Counter argument
If the user doesn’t know how many items the filter returns, it’s possible that the filter returns less items than whattake
tries to take, and never ends
anyway. So one might claim that the user must know the number of items returned
by the filter and then there is no point in using take
when they are the same.
It still leaves the cases that there is a huge performance penalty on incrementing the iterator one additional time, but those are maybe considered less interesting.
3.2. The real problem
The real problem we see is when using input ranges, e.g.basic_istream_view
.
In these cases, advancing the internal iterator means hanging forever if no
additional input exists and the stream isn’t closed or eating an additional
input that can’t be retrieved anymore. For example [CE-ISTREAM]:
#include <ranges>#include <iostream>#include <sstream>#include <cassert>namespace rn = std :: ranges ; namespace rv = rn :: views ; int main () { auto iss = std :: istringstream ( "0 1 2" ); for ( auto i : rn :: istream_view < int > ( iss ) | rv :: take ( 1 )) std :: cout << i << '\n' ; auto i = 0 ; iss >> i ; std :: cout << i << std :: endl ; // flush it in case the assert fails assert ( i == 1 ); // FAILS, i == 2 }
It means that one can’t use ranges when parsing input, for example.
4. Design of suggested solution
4.1. The main changes
We suggest changing the behavior ofcounted_iterator
operators around 0 length
so it doesn’t increment the internal iterator when reaching 0 length, and as a
result doesn’t decrement it when getting back from 0 to 1 length. This requires
changing base ()
behavior too, including the return type to be by value.
4.2. random_ access_ iterator
case kept as-is
To reduce the amount of changes required, we keep the current behavior for
case, so we don’t have to touch the additional operators
defined only for this category. The rational behind it is that for
case we can expect the view to either have all the items
ready or able to compute all of them efficiently, so it doesn’t suffer from this
issue.
4.3. Open question: Constructing with 0 length
We don’t have a good solution for the case that user constructs
with
as argument for length. This puts it in an
inconsistent internal state, as the next operations will be
or
,
and those expect the iterator to be one step back, with the changes suggested
here.
Please note that
and
are the only operations involving the state
of the internal iterator and still legal for
constructed with
;
Option 1: Require that if
,
must be decrementable, and actually
decrement it in the c-tor. (This option assumes the only reason to create such
an
is to decrement it anyway.)
Option 2: Require that if
,
must be "the one before" the actual
iterator (leaving it to the user to decide how to handle, and if neither
nor
are ever called on it, it doesn’t matter what the user does).
Option 3: Mark this case internally (e.g. with
) and handle specially
when decrementing (
"jumps" to
after decrementing the internal
iterator). Please note that
doesn’t need any special handling here.
5. Proposed Wording
Note: Wording doesn’t include any of the options suggested for the c-tor.
Under 23.5.6.3 [counted.iter.access]:
constexpr I base () const & ;
Effects: Equivalent to:
return length ? current : next ( current );
Note: calling
base ()
twice isn’t safe when I
isn’t forward_iterator
Effects: Equivalent to:
constexpr I base () && ;
Returns:
std :: move ( length ? current : next ( current ))
.
Returns:
.
Under 23.5.6.5 [counted.iter.nav]:
constexpr counted_iterator & operator ++ ();
Preconditions:
length > 0
.Effects: Equivalent to:
if ( length > 1 ) ++ current ;
-- length ;
return * this ;
Preconditions:
.
Effects: Equivalent to:
Preconditions:
.
Effects: Equivalent to:
constexpr counted_iterator & operator -- ()
requires bidirectional_ iterator < I > ;
Effects: Equivalent to:
if ( length ) -- current ;
++ length ;
return * this ;
Effects: Equivalent to:
6. Effect on current state
This breaks ABI (as every inline function does). If implementers don’t accept this change, or if WG21 feels it’s too late to apply it to C++20, we can addlazy_counted_iterator
(or bikeshed a better name) which has the same behavior
as counted_iterator
except for the mentioned changes. In additional, it
requires adding lazy_take
and lazy_counted_view
that uses it instead of counted_iterator
.
In such a case, we suggest to consider requiring
for
, as it always does the wrong thing for
if it doesn’t reach the
of the input (while for
and
it’s mainly just a possible performance issue and
rarely an infinite loop, as mentioned, and for
the
incrementing is usually no-op).
At the very least, we have to warn users against using
and
its consumers with
.