Doc no: N2231=07-0091
Reply to: Matt Austern <austern@google.com>
Date: 2007-03-19
STL singly linked lists
The standard container class
list is a doubly linked list. It
provides bidirectional iterators, and the obvious implementation is a collection
of list nodes each of which contains a pointer to both the following and the
preceding node. Doubly linked lists are more flexible than singly linked lists,
but they also have higher overhead in both space and time. In practice, many
programs don't need the flexibility of doubly linked lists; singly linked lists
are the primary data structure in many functional languages, and structs that
are essentially hand-written singly linked list nodes are a common pattern for C
APIs.
There is extensive experience with STL singly linked lists classes. Many C++
standard library implementations provide an
slist class as an extension; it's
similar to list, but it has forward
iterators instead of bidirectional. The SGI, Dinkumware, and Metrowerks
libraries all include slist. This
proposal is a subset of the SGI version. (And hence a subset of the GNU version,
which is descended from the SGI.)
This proposal is a pure extension. It does not propose any changes to the list
class, or to the sequence concepts. A future proposal might reasonably introduce
new concepts for node-based containers, and might implement algorithms in terms
of node-based concepts.
Design decisions
Except where there is reason to differ,
slist is modeled on
list.
Since singly linked lists have strictly less functionality than doubly linked
lists, the only reason to include them in the library is performance. It's
important to notice, furthermore, that this isn't a matter of asymptotic
complexity, but a matter of bytes and cycles. Inserting a new node is
O(1) in either a singly linked list or a
doubly linked list, but in a singly linked list it's faster. This fact
influences most of the design decisions in this proposal.
Insert-before versus insert-after
There is one fundamental design decision that can be expressed in several
different ways. From the point of view of the standard it is most naturally
expressed as a question about the complexity of certain container operations,
such as insert and erase, but it is more illuminating to think about this
decision in terms of the implementation.
Consider a typical C singly-linked list node, such as this example from libxml2:
typedef struct _xmlEnumeration
xmlEnumeration;
struct _xmlEnumeration
{
struct
_xmlEnumeration *next; /* next one
*/
const
xmlChar
*name; /* Enumeration name
*/
};
A list would just be represented as a pointer to the first list node, an
xmlEnumeration*. An "iterator",
i.e. a way of referring to a particular list element, would also just be a
pointer to some list node. Traversing the list is a loop with a skeleton of this
form:
while (p != 0)
{
...
p =
p->next;
}
Inserting a new element into such a list is equally straightforward: create a
new node, and fix up pointers:
xmlEnumeration *new =
malloc(sizeof(xmlEnumeration));
new->name =
strdup(name);
new->next =
p->next;
p->next = new;
Note that the new node is inserted after
p, not before. This is unavoidable
given the definition of
xmlEnumeration, since there is no
way to find the node whose next
pointer points to p.
It's easy to translate this general idea into a C++ class. The core of such a
list class might be written something like this:
struct list_node_base
{
list_node_base() : next(0) {
}
list_node_base*
next;
};
template <typename
Value>
struct list_node : public list_node_base
{
list_node(const Value& v) :
list_node_base(), val(v) { }
Value
val;
};
template <typename
Value>
struct list_iterator
{
typedef Value
value_type;
typedef Value*
pointer;
typedef Value&
reference;
typedef ptrdiff_t
difference_type;
typedef std::forward_iterator_tag
iterator_category;
list_iterator(list_node_base* n)
: node(n) { }
value_type& operator*() const
{
return
static_cast<list_node<value_type>*>(node)->val;
}
list_iterator& operator++() {
node = node->next; return *this;
}
list_iterator operator++(int)
{
list_iterator
tmp(*this); ++*this; return tmp;
}
bool operator==(const
list_iterator<value_type>& x) const
{
return node ==
x.node;
}
bool operator!=(const
list_iterator<value_type>& x) const
{
return node !=
x.node;
}
list_node_base*
node;
};
template <typename
Value>
class slist
{
private:
typedef list_node<Value>
Node;
list_node_base
head;
public:
slist() { head.next = 0;
}
slist(const slist& x) :
head() {
const list_node_base*
from = &x.head;
list_node_base* to =
&this->head;
while (from->next
!= 0) {
const
Node* nextf =
static_cast<Node*>(from->next);
to->next = new Node(*nextf);
from =
from->next;
to =
to->next;
}
}
slist<Value>&
operator=(const slist& x) {
if (&x != this)
{
slist<Value> tmp(x);
this->swap(tmp);
}
return
*this;
}
~slist()
{
list_node_base* p =
head.next;
while (p != 0)
{
Node* tmp
= static_cast<Node*>(p);
p =
p->next;
delete
tmp;
}
}
void swap(slist<Value>&
x) { std::swap(this->head.next, x.head.next);
}
public:
typedef
list_iterator<Value>
iterator;
iterator begin() { return
iterator(head.next); }
iterator end() { return
iterator(0); }
iterator before_begin() { return
iterator(&head); }
public:
iterator insert_after(iterator i,
const Value& x) {
Node* p = new
Node(x);
p->next =
i.node->next;
i.node->next =
p;
return
iterator(p);
}
iterator erase_after(iterator i)
{
Node* p =
static_cast<Node*>(i.node->next);
i.node->next =
p->next;
delete
*p;
return
iterator(i.node->next);
}
};
In some respects this is an ordinary STL container with an ordinary iterator
interface, but in other respects it is highly nontraditional. It differs from
std::list more than one might have
expected at first sight. While
std::list provides
O(1)
insert(i), which inserts an element
before i, and
erase(i), which deletes the element
that i points to, this class
instead provides O(1)
insert_after and
erase_after. If we wanted to
provide the more traditional versions in this design, they would have to be
O(N): essentially, they would take an
iterator argument, search for the preceding list node, and and then call
insert_after or
erase_after. (Or, of course, if we
didn't provide insert and
erase, it would be easy for users
to write their own functions that do exactly the same thing.) This is directly
analogous to the use model of C-style singly linked lists like
xmlEnumeration.
These restrictions aren't inevitable. It is perfectly possible to write an slist
with O(1)
insert and
erase, at the cost of making
iterators slightly more complicated. The general idea is quite simple: the
iterator points to a node before the one it
conceptually refers to. For example, the
operator* above could be replaced
by the following definition:
value_type& operator*() const
{
return
static_cast<list_node<value_type>*>(node->next)->val;
}
This permits "insert-before" semantics because in low-level implementation terms
insert(i, x) would insert the new
list node after
i.node, but from the user's view it
would insert the new element before the element referred to by
*i. One consequence is that
end() must become slightly more
complicated. Conceptually a past-the-end iterator points to a nonexistent
element, which in these terms would mean that
end().node->next, as opposed to
just end().node, is a null pointer,
and end().node is the last node in
the list. In this design, then,
slist will have to have two member
variables instead of one: it still needs
head, but it also needs a
last_node member variable so that
end() can return a node whose
next pointer is null. Finally,
insert and
erase also have to become slightly
more complicated to maintain the invariant that
last_node is always the last node
in the list.
We thus have two different designs, which can be characterized from an
implementation or a user point of view:
-
Does an iterator point to the list node of the element it refers to, or to
the preceding list node?
-
Is a past-the-end iterator represented as a null pointer, or as a pointer to
the last list node?
-
Does an slist object contain a single pointer, the list head, or does it
contain both a head pointer and a tail pointer?
-
Are the traditional STL definitions of insert and erase
O(1), or do programmers need to use
insert-after and erase-after if they want O(1) operations?
This is the most important design choice for a singly-linked list proposal.
This proposal chooses the version in which an iterator points to the list node
of the element it refers to, instead of the preceding node. The main
disadvantage of that design is that programmers have to use
insert_after and
erase_after instead of the more
familiar insert and
erase. The advantage, however, is
that all other operations are more efficient: iterator dereference is a single
pointer dereference, and the slist
object itself doesn't need to use a second word to store a tail pointer.
Admittedly these performance issues are small: they involve changes in small
constant factors, not changes of asymptotic complexity. I still consider them
important, however, because those constant factors are the only reason for using
slist instead of
std::list in the first place. The
main design goal for this proposal is that
slist should have zero overhead
relative to a hand-written C-style linked list like
xmlEnumeration,
and a design in which iterators point to the preceding list node does not
satisfy that goal. Long experience shows that for some uses the restrictions of
C-style or lisp-style singly linked lists are not onerous. For other uses,
std::list is still available.
A smaller question: should we provide O(N)
insert and erase, or omit those operations entirely? I have chosen the latter,
because providing operations with the same names as those in
std::list, but with much worse
performance, would invite bugs. It's also the more conservative choice, because
it's always easier to add features at a later date than to remove them. One
consequence of this fact is that
slist will not satisfy the Sequence
requirements of Table 67.
Size
The only way to calculate the size of a singly linked list is to walk through
it, which is an O(N) operation. All
existing STL containers have a
size() member function, and indeed
this is part of the container requirements of Table 65. We have three choices
for slist::size():
-
Provide an O(N)
slist::size(), essentially
equivalent to std::distance(begin(), end())
-
Provide an O(1)
slist::size(), where the
container keeps the node count as a separate member variable that it updates
whenever nodes are added or destroyed.
-
Don't provide slist::size() at
all.
I have chosen the third option.
Option 3 is superior to Option 1 because experience has shown that users find an
O(N)
size surprising. Users expect such
a simple member function to be fast; it's too tempting to write code like
if (l.size() > 1) { ... }
and so an O(N)
size invites serious performance
bugs. Since users can already get the effect of an
O(N)
size by writing
std:distance(l.begin(), l.end()),
leaving it out doesn't decrease
slist's functionality. This does
mean that we will have an STL container that doesn't satisfy the Container
requirements of table 65, but we already have precedent for that: the unordered
associative containers don't satisfy all of those requirements either. It seems
more important to have a useful set of container member functions than to worry
about strict conformance to this interface.
The choice between Option 3 and Option 2 is more a matter of judgment. I have
chosen Option 3 for the same reason that I chose insert-after instead of
insert-before: Option 3 is more consistent with the goal of zero overhead
compared to a hand-written C-style linked list. Maintaining a count doubles the
size of an slist object (one word
for the list head and one for the count), and it slows down every operation that
changes the number of nodes. In most cases this isn't a change in asymptotic
complexity (the one change in asymptotic complexity is is one of the forms of
splice), but it is nonzero
overhead. It's a cost that all users would have to pay for, whether they need
this feature or not, and, for users who care about maintaining a count, it's
just as easy to maintain it outside the list, by incrementing the count with
every insert and decrementing it with every erase, as it is to maintain the
count within the list.
Node-based algorithms
The standard list class includes a number of special functions that are
reminiscent of STL algorithms, including
remove,
unique, and
sort. These member functions don't
duplicate the freestanding STL algorithms:
std::remove operates by copying
elements from one location in a sequence to another, while
std::list::remove operates on the
list's node structure itself. The freestanding
std::remove can't create and
destroy nodes, since it only knows about iterators, but
list::remove can.
All of these algorithms can be implemented for singly-linked lists (Common Lisp
is an existence proof, as is the SGI
slist), and this proposal includes
all of them.
This is not the only reasonable decision. An alternative would be to define a
framework for node-based algorithms (probably based on the notion of iterators
with the ability to mutate the next/previous relationship) and then pull out all
of these operations from both list
and slist, defining them as generic
algorithms for arbitrary node-based structures. That alternative would be
elegant, and would be consistent with the principles of generic programming. I
am not proposing it at this time for two related reasons. First, it would mean
that this proposal wouldn't be a pure extension—it would modify an existing
component, list. Second, generic
node-based algorithms are experimental, while an
slist with special algorithms is
existing practice.
Proposed wording
Add the following class synopsis to section 23.2 [lib.sequences], and the rest
of the text as a new subsection of 23.2.
Header <slist> synopsis
namespace std
{
template <class T, class
Allocator = allocator<T> > class slist;
template <class T, class
Allocator>
bool operator==(const
slist<T,Allocator>& x, const slist<T,Allocator>& y);
template <class T, class
Allocator>
bool operator<
(const slist<T,Allocator>& x, const slist<T,Allocator>&
y);
template <class T, class
Allocator>
bool operator!=(const
slist<T,Allocator>& x, const slist<T,Allocator>& y);
template <class T, class
Allocator>
bool operator>
(const slist<T,Allocator>& x, const slist<T,Allocator>&
y);
template <class T, class
Allocator>
bool
operator>=(const slist<T,Allocator>& x, const
slist<T,Allocator>& y);
template <class T, class
Allocator>
bool
operator<=(const slist<T,Allocator>& x, const
slist<T,Allocator>& y);
template <class T, class
Allocator>
void
swap(slist<T,Allocator>& x, slist<T,Allocator>& y);
}
23.2.x Class template slist
An slist is a container that
supports forward iterators and allows constant time insert and erase operations
anywhere within the sequence, with storage management handled automatically.
Fast random access to list elements is not supported.
A slist satisfies all of the
requirements of a container (table 65), except that the
size() member function is not
provided. Descriptions are provided here only for operations on
slist that are not described in
that table or for operations where there is additional semantic information.
namespace std {
template <class T, class
Allocator = allocator<T> >
class slist {
public:
// types:
typedef typename
Allocator::reference reference;
typedef typename
Allocator::const_reference const_reference;
typedef
implementation_defined iterator; // See
23.1
typedef
implementation_defined const_iterator; // See 23.1
typedef
implementation_defined size_type; // See 23.1
typedef
implementation_defined difference_type;// See 23.1
typedef T value_type;
typedef Allocator
allocator_type;
typedef typename
Allocator::pointer pointer;
typedef typename
Allocator::const_pointer const_pointer;
// 23.2.x.1
construct/copy/destroy:
explicit slist(const
Allocator& = Allocator());
explicit
slist(size_type n, const T& value = T(),
const Allocator& = Allocator());
template <class
InputIterator>
slist(InputIterator first, InputIterator last,
const Allocator& = Allocator());
slist(const
slist<T,Allocator>&x);
~slist();
slist<T,Allocator>& operator=(const slist<T,Allocator>&x);
template <class
InputIterator>
void
assign(InputIterator first, InputIterator last);
void assign(size_type
n, const T& t);
allocator_type
get_allocator() const;
// 23.2.x.2
iterators:
iterator before_begin();
const_iterator before_begin() const;
iterator
begin();
const_iterator
begin() const;
iterator end();
const_iterator end()
const;
// capacity:
bool empty() const;
size_type max_size()
const;
// 23.2.x.3 element
access:
reference front();
const_reference
front() const;
// 23.2.x.4
modifiers:
void push_front(const
T& x);
void
pop_front();
iterator
insert_after(iterator position, const T& x);
void
insert_after(iterator position, size_type n, const T& x);
template <class
InputIterator>
void
insert_after(iterator position, InputIterator first,
InputIterator last);
iterator
erase_after(iterator position);
iterator
erase_after(iterator position, iterator last);
void
swap(slist<T,Allocator>&);
void clear();
// 23.2.x.5 slist
operations:
void
splice_after(iterator position, slist<T,Allocator>& x);
void
splice_after(iterator position, slist<T,Allocator>& x, iterator i);
void
splice_after(iterator position, slist<T,Allocator>& x, iterator
first,
iterator last);
void remove(const
T& value);
template <class
Predicate> void remove_if(Predicate pred);
void unique();
template <class
BinaryPredicate> void
unique(BinaryPredicate binary_pred);
void
merge(slist<T,Allocator>& x);
template <class
Compare> void merge(slist<T,Allocator>& x, Compare comp);
void sort();
template <class
Compare> void sort(Compare comp);
void reverse();
};
// Comparison
operators
template <class T, class
Allocator>
bool operator==(const
slist<T,Allocator>& x, const slist<T,Allocator>& y);
template <class T, class
Allocator>
bool operator< (const
slist<T,Allocator>& x, const slist<T,Allocator>& y);
template <class T, class
Allocator>
bool operator!=(const
slist<T,Allocator>& x, const slist<T,Allocator>& y);
template <class T, class
Allocator>
bool operator>
(const slist<T,Allocator>& x, const slist<T,Allocator>&
y);
template <class T, class
Allocator>
bool
operator>=(const slist<T,Allocator>& x, const
slist<T,Allocator>& y);
template <class T, class
Allocator>
bool
operator<=(const slist<T,Allocator>& x, const
slist<T,Allocator>& y);
// 23.2.x.6 specialized
algorithms:
template <class T, class
Allocator>
void
swap(slist<T,Allocator>& x, slist<T,Allocator>& y);
}
3.2.x.1 slist constructors, copy, and
assignment
explicit slist(const Allocator& =
Allocator());
Effects: Constructs an empty slist, using
the specified allocator.
Complexity: Constant.
explicit slist(size_type n, const T&
value= T(),
const Allocator& = Allocator());
Effects: Constructs an slist with n
copies of value, using the specified allocator.
Complexity: Linear in n.
template <class InputIterator>
slist(InputIterator first, InputIterator
last,
const
Allocator& = Allocator());
Effects: Constructs an slist equal to the
range [first,last).
Complexity: Linear in distance(first,
last)
template <class InputIterator>
void assign(InputIterator first,
InputIterator last);
Effects: clear();
insert_after(before_begin(), first, last);
void assign(size_type n, const T& t);
Effects: clear();
insert_after(before_begin(), n, t);
23.2.x.2 slist iterators
iterator before_begin();
const_iterator before_begin()
const;
Returns: A non-dereferenceable iterator
that, when incremented, is equal to the iterator returned by begin().
23.2.x.3 slist element access
reference front();
const_reference front() const;
Returns: *begin().
23.2.x.4 slist modifiers
None of the overloads of insert_after shall affect the validity of iterators and
reference, and erase_after shall invalidate only the iterators and references to
the erased elements. If an exception is thrown during insert_after there shall
be no effect. Insertion of n elements into an slist is linear in n, and the
number of calls to the copy constructor of T is exactly equal to n. Erasing n
elements from an slist is linear time in n and the number of calls to the
destructor of type T is exactly equal to n.
void push_front(const T& x);
Effects: insert_after(before_begin(), x);
void pop_front();
Effects: erase_after(before_begin());
iterator insert_after(iterator position,
const T& x);
Requires:
position is dereferenceable or equal to before_begin().
Effects:
Inserts a copy of x after
position.
Returns:
An iterator pointing to the copy of x.
void insert_after(iterator
position, size_type n, const T& x);
Requires:
position is dereferenceable or equal to before_begin().
Effects:
Inserts n copies of x after position.
template <class
InputIterator>
void insert_after(iterator position,
InputIterator first,
InputIterator last);
Requires:
position is dereferenceable or equal to before_begin().
first
and last are not iterators in
*this.
Effects:
Inserts copies of elements in [first, last) after position.
iterator
erase_after(iterator position);
Requires:
The iterator following position is dereferenceable.
Effects:
Erases the element pointed to by the iterator following
position.
Returns:
An iterator pointing to the element following the one that was erased, or
end() if no such element exists.
iterator
erase_after(iterator position, iterator last);
Requires:
All iterators in the range (position, last) are
dereferenceable.
Effects:
Erases the elements in the range (position, last).
Returns:
last
void clear();
Effects:
Erases all elements in the range [begin(), end()).
23.2.x.5 slist operations
void splice_after(iterator
position, slist<T,Allocator>& x);
Requires:
position is dereferenceable or equal to before_begin(). &x !=
this.
Effects:
Inserts the contents of x before position, and x becomes empty. Invalidates
all iterators and references to x.
Throws:
nothing
Complexity:
O(1)
void splice_after(iterator
position, slist<T,Allocator>& x, iterator i);
Requires:
position
is dereferenceable or equal to before_begin(). The iterator following i is a
dereferenceable iterator in x.
Effects:
Inserts the element following i into *this, following position, and removes it
from x. Invalidates only the iterators and references to the spliced
elements.
Throws:
nothing
Complexity: O(1)
void splice_after(iterator
position, slist<T,Allocator>& x, iterator first,
iterator last);
Requires:
position
is dereferenceable or equal to before_begin(). [first, last) is a valid range
in x, and all iterators in the range (first, last) are dereferenceable.
position is not an iterator in the range (first, last).
Effects:
Inserts elements in the range (first, last)after position and removes the
elements from x.
Invalidates
only the iterators and references to the spliced
elements.
void remove(const T&
value);
template <class Predicate> void
remove_if(Predicate pred);
Effects:
Erases all the elements in the list referred by a list iterator i for which
the following conditions hold: *i == value, pred(*i) is true. This operation
shall be stable: the relative order of the elements that are not removed is
the same as their relative order in the original list.
Throws:
Nothing unless an exception is thrown by the equality comparison or the
predicate.
Complexity:
Exactly distance(begin(), end()) applications of the corresponding predicate.
void unique();
template <class BinaryPredicate>
void unique(BinaryPredicate binary_pred);
Effects:
Eliminates all but the first element from every consecutive group of equal
elements referred to by the iterator i in the range [first + 1, last) for
which *i == *(i-1) (for the version of unique with no arguments) or pred(*i,
*(i - 1)) (for the version of unique with a predicate argument) holds.
Throws:
Nothing unless an exception in thrown by the equality comparison or the
predicate.
Complexity:
If the range (last - first) is not empty, exactly (last - first) - 1
applications of the corresponding predicate, otherwise no applications of the
predicate.
void
merge(slist<T,Allocator>& x);
template <class Compare> void
merge(slist<T,Allocator>& x, Compare comp);
Requires:
comp defines a strict weak ordering (25.3), and *this and x are both sorted
according to this ordering.
Effects:
Merges x into *this. This operation shall be stable:
for
equivalent elements in the two lists, the elements from *this shall always
precede the elements from x. x is empty after the merge.
If
an exception is thrown other than by a comparison there are no effects.
Complexity:
At most size() + x.size() - 1 comparisons.
void sort();
template <class Compare> void
sort(Compare comp);
Requires:
operator< (for the first version) or comp (for the second version) defines
a strict weak ordering (25.3).
Effects:
Sorts the list according to the operator<or a Compare function object. This
operation shall be stable: the relative order of the equivalent elements is
preserved. If an exception is thrown the order of the elements in *this is
indeterminate.
Complexity:
Approximately NlogN comparisons, where N == distance(begin(), end()).
void reverse();
Effects:
Reverses the order of the elements in the list.
Throws:
Nothing.
Complexity:
Linear time.
23.2.x.6 slist specialized algorithms
template <class T, class
Allocator>
void swap(slist<T,Allocator>& x,
slist<T,Allocator>& y);
Effects:
x.swap(y)