ISO/IEC JTC1 SC22 WG21 N3350=12-0040
Date:
Jeffrey Yasskin <jyasskin@google.com>
Source at http://code.google.com/p/cxx1y-range/
Ranges are ubiquitous in programs that use STL algorithms, but they're always written as adjacent iterator arguments, which makes them hard to use. Several groups have proposed a new set of range concepts that would allow ranges to be passed around as single objects. Two notable examples are the Boost.Range library and Andrei Alexandrescu's range concepts in D.
This paper, and the library at http://code.google.com/p/cxx1y-range/, attempt to take these ideas and produce a minimal library that's useful today and is suitable for standardization.
This library makes certain changes from the existing range libraries:
Boost's range library focuses on defining a set of Range concepts that allow Containers to be Ranges. Because Containers are heavy-weight, this forces Boost to use references for all of their uses of Ranges, including Ranges that are captured by Range Adapters. This worked fine in C++98, where users couldn't name the types of the Range Adapters in order to declare local variables, but with C++11's auto
keyword, users can now save adapted ranges. Since Boost's adapted ranges capture references to their arguments, which can be temporaries, this is no longer a safe design.
This paper makes the opposite choice: it defines a concrete type and no concepts, and the concrete type is a lightweight view into a container. This implies that algorithms can't treat Ranges and Containers interchangeably, but there are ways around this, which I explore below. Of the Boost types, this paper's std::range
is most like boost::iterator_range
.
Andrei Alexandrescu's range concepts provide a complete replacement for iterators. However, they require that the whole machinery be adopted at once, which isn't appropriate for a widely-used language like C++. This paper's std::range
allows users to program against the Range concepts while still accepting data from iterator-oriented containers.
This paper's range<>
omits Alexandrescu's ForwardRange.save()
operation, on the theory that it's just a type assertion for most ForwardRanges (since they provide a full-featured copy operation), so most algorithms would forget to call save()
. This would introduce bugs when those algorithms were used with the rare ForwardRange that could take advantage of it to save work on plain copies. It's better to omit the optimization than to lay traps.
Like the D ranges, but unlike the STL's iterators, this paper's range
uses the same names for O(1) and O(N) operations. Some reviewers found the O(N) names (advance
and shorten
) confusing, and with static_assert()
, users can now check the category explicitly if they're concerned. The one operation we didn't make work for all categories was operator[]
on the theory that indexing is much more likely to indicate a programmer mistake than other operations.
This proposal allows negative indices for indexing operations, following the lead of several scripting languages. Neither of the earlier range implementations does this for arbitrary ranges, but D does allow users to offset from the end of a random-access range.
As mentioned above, you can't simply write an algorithm that takes either Containers or std::range<>
s and treats them interchangeably:
template<typename T>
void my_algorithm(T argument) { // Copies any containers.
while (!argument.empty()) {
process(argument.front());
// Fails on vectors; destructive on deques; fine on ranges:
argument.pop_front();
}
}
The limits of template argument deduction make it inconvenient to take a std::range
explicitly:
template<typename Iter>
void my_algorithm(std::range<Iter> argument) {
while (!argument.empty()) {
process(argument.front());
argument.pop_front();
}
}
void user() {
ContainerT my_container;
// Fails! std::range<ContainerT::iterator> has an implicit
// conversion from ContainerT, but because the type of the
// explicit argument doesn't match "std::range<*>", the
// compiler can't deduce the template argument.
my_algorithm(my_container);
// Works, but verbose:
my_algorithm(std::make_range(my_container));
}
The best option seems to be for algorithms that want to take either containers or ranges, and use ranges internally, to take their parameter by T&
or const T&
and use make_range() to build a range out of it explicitly:
template<typename T>
void partial_sum(T& container) {
using std::make_range;
auto range = make_range(container);
typename T::value_type sum = 0;
// One would actually use a range-based for loop
// in this case, but you get the idea:
for (; !range.empty(); range.pop_front()) {
sum += range.front();
range.front() = sum;
}
}
void user1() {
std::vector<int> v(5, 2);
partial_sum(v);
assert(v[3] == 8);
}
By storing the result of make_range()
in an auto
variable, we allow authors of future range types to overload it as the identity operation on types that are already ranges.
The one case where taking std::range
explicitly seems to make sense is range<T*>
. If an algorithm needs contiguous iterators, this can allow its authors to avoid making it a template:
void my_write(int fd, std::range<const char*> buffer) {
write(fd, buffer.begin(), buffer.size());
}
void user2() {
std::string s("Hello world");
my_write(1, s);
}
A std::range<Iterator>
represents a half-open iterator range built from two iterators, 'begin'
, and 'end'
. If end
is not reachable from begin
, the behavior is undefined.
The mutability of elements of the range is controlled by the Iterator argument. Instantiate range<Foo::iterator>
or range<T*>
, or call make_range(non_const_container)
, and you get a mutable range. Instantiate range<Foo::const_iterator>
or range<const T*>
, or call make_range(const_container)
, and you get a constant range.
namespace std {
template<typename Iterator>
class range {
// types
typedef iterator_traits< Iterator >::iterator_category iterator_category;
typedef iterator_traits< Iterator >::value_type value_type;
typedef iterator_traits< Iterator >::difference_type difference_type;
typedef iterator_traits< Iterator >::reference reference;
typedef iterator_traits< Iterator >::pointer pointer;
// constructors
range();
constexpr range(Iterator begin, Iterator end);
template<typename R>
constexpr range(R && r);
template<typename R>
constexpr range(R && r);
// iterator access
constexpr Iterator begin() const;
constexpr Iterator end() const;
// element access
constexpr reference front() const;
constexpr reference back() const;
constexpr reference operator[](difference_type index) const;
// size
constexpr bool empty() const;
constexpr difference_type size() const;
// traversal from the beginning of the range
void pop_front();
void pop_front(difference_type n);
void pop_front_upto(difference_type n);
// traversal from the end of the range
void pop_back();
void pop_back(difference_type n);
void pop_back_upto(difference_type n);
// creating derived ranges
pair< range, range > split(difference_type index) const;
range slice(difference_type start, difference_type stop) const;
range slice(difference_type start) const;
};
// deducing constructor wrappers
template<typename Iterator>
constexpr range< Iterator > make_range(Iterator begin, Iterator end);
template<typename Range>
constexpr auto make_range(Range && r) -> range< decltype(begin(r))>;
template<typename Range>
constexpr auto make_ptr_range(Range && r) -> range< decltype(&*begin(r))>;
}
These functions do the same thing as the constructor with the same signature. They just allow users to avoid writing the iterator type.
template<typename Iterator>
constexpr range< Iterator > make_range(Iterator begin, Iterator end);
template<typename Range>
constexpr auto make_range(Range && r) -> range< decltype(begin(r))>;
begin(r)
and end(r)
return the same type.
template<typename Range>
constexpr auto make_ptr_range(Range && r) -> range< decltype(&*begin(r))>;
begin(r)
and end(r)
return the same type,&*(i + N) == (&*i) + N
, and&*begin
(r) has a pointer type. typedef iterator_traits< Iterator >::iterator_category iterator_category;
The iterator category of Iterator
.
typedef iterator_traits< Iterator >::value_type value_type;
The type of elements of the range. Not cv-qualified.
typedef iterator_traits< Iterator >::difference_type difference_type;
The type of the size of the range and offsets within the range.
typedef iterator_traits< Iterator >::reference reference;
The return type of element access methods: front()
, back()
, etc.
range();
Creates a range of default-constructed (not value-initialized) iterators. For most Iterator
types, this will be an invalid range.
constexpr range(Iterator begin, Iterator end);
end
is reachable from begin
.
this->begin() == begin && this->end() == end
template<typename R>
constexpr range(R && r);
Iterator
is not a pointer type,begin(r)
and end(r)
return the same type, andIterator
.template<typename R>
constexpr range(R && r);
This constructor creates a range<T*>
from any range with contiguous iterators. Because dereferencing a past-the-end iterator can be undefined behavior, empty ranges get initialized with nullptr
rather than &*begin
().
Iterator
is a pointer type T*
,begin(r)
and end(r)
return the same type,i
of that type satisfy the invariant &*(i + N) == (&*i) + N
, and&*begin()
is convertible to T*
using only qualification conversions [conv.qual] (since pointer conversions stop the pointer from pointing to an array element).constexpr reference front() const;
O(1)
!empty
()
a reference to the element at the front of the range.
constexpr reference back() const;
iterator_category
is convertible to bidirectional_iterator_tag
.
O(2) (Involves copying and decrementing an iterator, so not quite as cheap as front()
)
!empty
()
a reference to the element at the front of the range.
constexpr reference operator[](difference_type index) const;
This method is drawn from scripting language indexing. It indexes forward from the beginning of the range if the argument is positive, or backwards from the end of the array if the argument is negative.
constexpr bool empty() const;
O(1)
true
if the range contains no elements.
constexpr difference_type size() const;
iterator_category
is convertible to forward_iterator_tag
.
O(1) if iterator_category
is convertible to random_access_iterator_tag
. O(size()
) otherwise.
the number of times pop_front()
can be called before empty()
becomes true.
void pop_front();
Advances the beginning of the range by one element.
!empty
()
void pop_front(difference_type n);
Advances the beginning of the range by n
elements.
O(1) if iterator_category
is convertible to random_access_iterator_tag
, O(n
) otherwise.
n >= 0
, and there must be at least n
elements in the range.
void pop_front_upto(difference_type n);
Advances the beginning of the range by at most n
elements, stopping if the range becomes empty. A negative argument causes no change.
O(1) if iterator_category
is convertible to random_access_iterator_tag
, O(min(n, #-elements-in-range)
) otherwise.
Could be provided as a free function with little-to-no loss in efficiency.
void pop_back();
Moves the end of the range earlier by one element.
iterator_category
is convertible to bidirectional_iterator_tag
.
O(1)
!empty
()
void pop_back(difference_type n);
Moves the end of the range earlier by n
elements.
iterator_category
is convertible to bidirectional_iterator_tag
.
O(1) if iterator_category
is convertible to random_access_iterator_tag
, O(n
) otherwise.
n >= 0
, and there must be at least n
elements in the range.
void pop_back_upto(difference_type n);
Moves the end of the range earlier by min(n, size())
elements. A negative argument causes no change.
iterator_category
is convertible to bidirectional_iterator_tag
.
O(1) if iterator_category
is convertible to random_access_iterator_tag
, O(min(n, #-elements-in-range)
) otherwise.
Could be provided as a free function with little-to-no loss in efficiency.
pair< range, range > split(difference_type index) const;
Divides the range into two pieces at index
, where a positive index
represents an offset from the beginning of the range and a negative index
represents an offset from the end. range[index]
is the first element in the second piece. If index >= size()
, the second piece will be empty. If index < -size()
, the first piece will be empty.
iterator_category
is convertible to forward_iterator_tag
.
iterator_category
is convertible to random_access_iterator_tag:
O(1)iterator_category
is convertible to bidirectional_iterator_tag
, abs(index)
iterator increments or decrementsindex >= 0
, index
iterator incrementssize() + (size() + index)
iterator increments.a pair of adjacent ranges.
range slice(difference_type start, difference_type stop) const;
A sub-range from start
to stop
(not including stop
, as usual). start
and stop
are interpreted as for operator[]
, with negative values offsetting from the end of the range. Omitting the stop
argument makes the sub-range continue to the end of the original range. Positive arguments saturate to the end of the range, and negative arguments saturate to the beginning. If stop
is before start
, returns an empty range beginning and ending at start
.
iterator_category
is convertible to forward_iterator_tag
.
iterator_category
is convertible to random_access_iterator_tag:
O(1)iterator_category
is convertible to bidirectional_iterator_tag
, at most min(abs(start), size()) + min(abs(stop), size())
iterator increments or decrementsstart >= 0 && stop >= 0
, max(start, stop)
iterator incrementssize() + max(start', stop')
iterator increments, where start'
and stop'
are the offsets of the elements start
and stop
refer to.slice(start)
should be implemented with a different overload, rather than defaulting stop
to numeric_limits<difference_type>::max()
, because using a default would force non-random-access ranges to use an O(size()
) algorithm to compute the end rather than the O(1) they're capable of.
Certain other pieces of the library should change to make life easier for range<>
.
std::begin()
, std::end()
, and std::distance
should become constexpr
.contiguous_iterator_tag
to let range<T*>
safely construct from user-defined types that control contiguous sequences of objects.*_upto
methods could take advantage of a std::advance_upto(it, n, end)
algorithm on plain iterators, which advances the iterator either n increments or to 'end', whichever comes first.Many functions will be able to take either Ranges or Containers as arguments. We should work out what concept their argument follows. "Iterable"?
Versions of all of the algorithms should be added that accept ranges. While it would be possible to make them take std::range<>
itself, a set of Range concepts will likely work better in the long run, leaving std::range<>
as an interface between the iterator and range worlds.
Boost's adapters are a great idea and should be added to the standard.
Infinite ranges could be defined, whose empty()
method returns std::false_type
.
We should figure out how to fit OutputIterators into the range framework.
A collection of any_range
types would be useful.
make_range
taking a single iterator argument representing the beginning of a range that ends with a default-constructed Iterator
. This would help with using iterators like istream_iterator
. However, using just make_range()
could be confusing and lead to people writing incorrect ranges of more common iterators. Is there a better name? Inherit from std::pair<Iterator, Iterator>?
This interface contains some functions that could be provided as free algorithms rather than member functions, and all of the pop_*()
functions could be replaced by slice()
at the cost of some extra iterator copies. This makes them more awkward to use, but makes it easier for users to write their own types that follow the same interface. On the other hand, a range_facade
could be provided to help users write new ranges, and it could provide the members. Such functions are marked with a note in their documentation. (Of course, all of these member functions could be provided as free functions using the iterator access methods, but one goal here is to allow people to program without touching iterators at all.)
&*(i + N) == (&*i) + N
invariant is currently impossible to check for user-defined types. We need a contiguous_iterator_tag
to let users assert it. N+1
-element tuple<>
. This is tricky to implement with negative indices in the optimal number of increments or decrements for a bidirectional iterator, but it should be possible. Do we want it?