Doc No: N2448=07-0318
Revises: N2231=07-0091
Reply to: Matt Austern <austern@google.com>
Date: 2007-10-19

STL singly linked lists (revision 2)


The standard container class std::list is a doubly linked list. It provides bidirectional iterators, and the obvious implementation is a collection of list nodes each of which contains a pointer to both the following and the preceding node. Doubly linked lists are more flexible than singly linked lists, but they also have higher overhead in both space and time. In practice, many programs don't need the flexibility of doubly linked lists; singly linked lists are the primary data structure in many functional languages, and structs that are essentially hand-written singly linked list nodes are a common idiom in C APIs.

There is extensive experience with STL singly linked lists classes. Many C++ standard library implementations provide an slist class as an extension; it's similar to list, but it has forward iterators instead of bidirectional. The SGI, Dinkumware, and Metrowerks libraries all include slist. This proposal is roughly a subset of the SGI version. (And hence a subset of the GNU version, which is descended from SGI's.)

This proposal is a pure extension. It does not propose any changes to the list class, or to the container or iterator concepts. A future proposal might reasonably introduce new concepts for node-based containers, and might implement algorithms in terms of node-based concepts.

Revision history

N2448 is a minor revision of N2231. The main differences are:

Design decisions

Except where there is reason to differ, forward_list is modeled on list.

Since singly linked lists have strictly less functionality than doubly linked lists, the only reason to include them in the library is performance. It's important to notice, furthermore, that the performance difference isn't a matter of asymptotic complexity, but a matter of bytes and cycles. Inserting a new node is O(1) in either a singly linked list or a doubly linked list, but in a singly linked list it's faster. This fact influences most of the design decisions in this proposal. The main goal of this proposal is that forward_list should have zero space or time overhead relative to a hand-written C-style singly linked list.

Insert-before versus insert-after

There is one fundamental design decision that can be expressed in several different ways. From the point of view of the standard it is most naturally expressed as a question about the complexity of certain container operations, such as insert and erase, but it is more illuminating to think about this decision in terms of the implementation.

Consider a typical C singly-linked list node, such as this example from libxml2:
typedef struct _xmlEnumeration xmlEnumeration;
struct _xmlEnumeration {
    struct _xmlEnumeration   *next;     /* next one */
    const xmlChar            *name;     /* Enumeration name */
};
A list would just be represented as a pointer to the first list node, an xmlEnumeration*. An "iterator", i.e. a way of referring to a particular list element, would also just be a pointer to some list node. Traversing the list is a loop with a skeleton of this form:
while (p != 0) {
  ...
  p = p->next;
}
Inserting a new element into such a list is equally straightforward: create a new node, and fix up pointers:
xmlEnumeration *new = malloc(sizeof(xmlEnumeration));
new->name = strdup(name);
new->next = p->next;
p->next = new;
Note that the new node is inserted after p, not before. This is unavoidable given the definition of xmlEnumeration, since there is no way to find the node whose next pointer points to p.

It's easy to translate this general idea into a C++ class. The core of such a list class might be written something like this:
struct list_node_base {
  list_node_base() : next(0) { }
  list_node_base* next;
};

template <typename Value>
struct list_node : public list_node_base {
  list_node(const Value& v) : list_node_base(), val(v) { }
  Value val;
};

template <typename Value>
struct list_iterator {
  typedef Value value_type;
  typedef Value* pointer;
  typedef Value& reference;
  typedef ptrdiff_t difference_type;
  typedef std::forward_iterator_tag iterator_category;

  list_iterator(list_node_base* n) : node(n) { }

  value_type& operator*() const {
    return static_cast<list_node<value_type>*>(node)->val;
  }

  list_iterator& operator++() { node = node->next; return *this; }
  list_iterator operator++(int) {
    list_iterator tmp(*this); ++*this; return tmp;
  }

  bool operator==(const list_iterator<value_type>& x) const {
    return node == x.node;
  }
  bool operator!=(const list_iterator<value_type>& x) const {
    return node != x.node;
  }

  list_node_base* node;
};

template <typename Value>
class list {
private:
  typedef list_node<Value> Node;
  list_node_base head;
public:
  list() { head.next = 0; }

  list(const forward_list& x) : head() {
    const list_node_base* from = &x.head;
    list_node_base* to = &this->head;
    while (from->next != 0) {
      const Node* nextf = static_cast<Node*>(from->next);
      to->next = new Node(*nextf);
      from = from->next;
      to = to->next;
    }
  }

  list<Value>& operator=(const forward_list& x) {
    if (&x != this) {
      list<Value> tmp(x);
      this->swap(tmp);
    }
    return *this;
  }

  ~list() {
    list_node_base* p = head.next;
    while (p != 0) {
      Node* tmp = static_cast<Node*>(p);
      p = p->next;
      delete tmp;
    }
  }

  void swap(list<Value>& x) { std::swap(this->head.next, x.head.next); }

public:
  typedef list_iterator<Value> iterator;
  iterator begin() { return iterator(head.next); }
  iterator end() { return iterator(0); }
  iterator before_begin() { return iterator(&head); }

public:
  iterator insert_after(iterator i, const Value& x) {
    Node* p = new Node(x);
    p->next = i.node->next;
    i.node->next = p;
    return iterator(p);
  }

  iterator erase_after(iterator i) {
    Node* p = static_cast<Node*>(i.node->next);
    i.node->next = p->next;
    delete *p;
    return iterator(i.node->next);
  }
};

In some respects this is an ordinary STL container with an ordinary iterator interface, but in other respects it is highly nontraditional. It differs from std::list more than one might have expected at first sight. While std::list provides O(1) insert(i), which inserts an element before i, and erase(i), which deletes the element that i points to, this class instead provides O(1) insert_after and erase_after. If we wanted to provide the more traditional versions in this design, they would have to be O(N): essentially, they would take an iterator argument, search for the preceding list node, and and then call insert_after or erase_after. (Or, of course, if we didn't provide insert and erase, it would be easy for users to write their own functions that do exactly the same thing.) This is directly analogous to the use model of C-style singly linked lists like xmlEnumeration.

These restrictions aren't inevitable. It is perfectly possible to write a singly linked list with O(1) insert and erase, at the cost of making iterators slightly more complicated. The general idea is simple: the iterator points to a node before the one it conceptually refers to. For example, the operator* above could be replaced by the following definition:
  value_type& operator*() const {
    return static_cast<list_node<value_type>*>(node->next)->val;
  }
This permits "insert-before" semantics because in low-level implementation terms insert(i, x) would insert the new list node after i.node, but from the user's view it would insert the new element before the element referred to by *i. One consequence is that end() must become slightly more complicated. Conceptually a past-the-end iterator points to a nonexistent element, which in these terms would mean that end().node->next, as opposed to just end().node, is a null pointer, and end().node is the last node in the list. In this design, then, the list class would have to have two member variables instead of one: it still needs head, but it also needs a last_node member variable so that end() can return a node whose next pointer is null. Finally, insert and erase also have to become slightly more complicated to maintain the invariant that last_node is always the last node in the list.

We thus have two different designs, which can be characterized from an implementation or a user point of view:
This is the most important design choice for a singly-linked list proposal.

This proposal chooses the version in which an iterator points to the list node of the element it refers to, instead of the preceding node. We could continue to use the names insert and erase and just have them do something different, but that would be an invitation to bugs. Since the operations do something different, the should be called something different. I have chosen the names insert_after and erase_after, which are the same names used for these operations in the SGI slist class.

The main disadvantage of this design is that programmers have to use insert_after and erase_after instead of the more familiar insert and erase. The advantage, however, is that all other operations are more efficient: iterator dereference is a single pointer dereference, and the list object itself doesn't need to use a second word to store a tail pointer. Admittedly these performance issues are small: they involve changes in small constant factors, not changes of asymptotic complexity. I still consider them important, however, because those constant factors are the only reason for using a singly linked list instead of a doubly linked list the first place. The main design goal for this proposal is that forward_list should have zero overhead relative to a hand-written C-style linked list like xmlEnumeration, and a design in which iterators point to the preceding list node does not satisfy that goal. Long experience shows that for some uses the restrictions of C-style or lisp-style singly linked lists are not onerous. For other uses, std::list is still available.

A smaller question: should we provide O(N) insert and erase, or omit those operations entirely? The SGI STL chose the former. This proposal chooses the latter, because providing operations with the same names as those in std::list, but with much worse performance, would invite bugs. It's also the more conservative choice, because it's always easier to add features at a later date than to remove them. One consequence of this fact is that forward_list will not satisfy the Sequence requirements of Table 89.

An additional alternative that was proposed on the reflector: rename operator++ as operator--, and provide iterators that are decrementable but not incrementable. (There is no such iterator concept in the current iterator concept hierarchy.) Insert would still be insert-after, in that we would have an l.ins(p, x) operation that inserted x at a position following p in the traversal, but, since we've renamed the iterator that follows p to be the one obtained by operator--, this could arguably be described as inserting it "before" p.

I have rejected this alternative for a simple reason: it doesn't solve the problem. The only reason we care about the issue of insert-before versus insert-after semantics is that it would be nice for forward_list to model the same concepts as list and vector. And the only reason we care about concept modeling, in turn, is to make it possible to write generic container algorithms that can be used with forward_list as well as with other containers; for example, it would be nice if the BGL adjacency_list class could use be templatized on container type and could use either a list or a forward_list. But this renaming trick does not achieve that goal. It would give us a container that's even less like list and vector than the foward_list that I have proposed. Not only the container, but even its iterators, would be unusable with generic algorithms written with other containers in mind. This idea is clever, but it's a cure that's considerably worse than the disease.

Size

The only way to calculate the size of a singly linked list is to walk through it, which is an O(N) operation. All existing STL containers have a size() member function, and indeed this is part of the container requirements of Table 87. We have three choices for forward_list::size():
  1. Provide an O(N) forward_list::size(), essentially equivalent to std::distance(begin(), end())
  2. Provide an O(1) forward_list::size(), where the container keeps the node count as a separate member variable that it updates whenever nodes are added or removed.
  3. Don't provide forward_list::size() at all.

I have chosen the third option.

Option 3 is superior to Option 1 because experience has shown that users find an O(N) size surprising. Users expect such a simple member function to be fast; it's too tempting to write code like
if (l.size() > 1) { ... }
and so an O(N) size invites serious performance bugs. Since users can already get the effect of an O(N) size by writing std:distance(l.begin(), l.end()), leaving it out doesn't decrease forward_list's functionality. This does mean that we will have an STL container that fails to satisfy the Container requirements of table 87, but we already have precedent for that: the unordered associative containers don't satisfy all of those requirements either. It seems more important to have a useful set of container member functions than to worry about strict conformance to this interface.

The choice between Option 3 and Option 2 is more a matter of judgment. I have chosen Option 3 for the same reason that I chose insert-after instead of insert-before: Option 3 is more consistent with the goal of zero overhead compared to a hand-written C-style linked list. Maintaining a count doubles the size of a forward_list object (one word for the list head and one for the count), and it slows down every operation that changes the number of nodes. In most cases this isn't a change in asymptotic complexity (the one change in asymptotic complexity is in one of the forms of splice), but it is nonzero overhead. It's a cost that all users would have to pay for, whether they need this feature or not, and, for users who care about maintaining a count, it's just as easy to maintain it outside the list, by incrementing the count with every insert and decrementing it with every erase, as it is to maintain the count within the list.

Note that similar considerations apply to std::list::size().

Node-based algorithms

The std::list class includes a number of special functions that are reminiscent of STL algorithms, including remove, unique, and sort. These member functions don't duplicate the freestanding STL algorithms: std::remove operates by copying elements from one location in a sequence to another, while std::list::remove operates on the list's node structure itself. The freestanding std::remove can't create and destroy nodes, since it only knows about iterators, but list::remove can.

All of these algorithms can be implemented for singly-linked lists (Common Lisp is an existence proof, as is the SGI slist), and this proposal includes all of them.

This is not the only reasonable decision. An alternative would be to define a framework for node-based algorithms (probably based on the notion of iterators with the ability to mutate the next/previous relationship) and then pull out all of these operations from both list and forward_list, defining them as generic algorithms for arbitrary node-based structures. That alternative would be elegant, and would be consistent with the principles of generic programming. I am not proposing it at this time for two related reasons. First, it would mean that this proposal wouldn't be a pure extension—it would modify an existing component, std::list. Second, generic node-based algorithms are experimental, while a singly linked list class with special algorithms is existing practice.

Proposed wording

Add the following class synopsis to section 23.2 [lib.sequences], and the rest of the text as a new subsection of 23.2.

Header <forward_list> synopsis

namespace std {
  template <class T, class Allocator = allocator<T> > class forward_list;
  template <class T, class Allocator>
    bool operator==(const forward_list<T,Allocator>& x, const forward_list<T,Allocator>& y);
  template <class T, class Allocator>
    bool operator< (const forward_list<T,Allocator>& x, const forward_list<T,Allocator>& y);
  template <class T, class Allocator>
    bool operator!=(const forward_list<T,Allocator>& x, const forward_list<T,Allocator>& y);
  template <class T, class Allocator>
    bool operator> (const forward_list<T,Allocator>& x, const forward_list<T,Allocator>& y);
  template <class T, class Allocator>
    bool operator>=(const forward_list<T,Allocator>& x, const forward_list<T,Allocator>& y);
  template <class T, class Allocator>
    bool operator<=(const forward_list<T,Allocator>& x, const forward_list<T,Allocator>& y);
  template <class T, class Allocator>
    void swap(forward_list<T,Allocator>& x, forward_list<T,Allocator>& y);
}

23.2.x Class template forward_list

A forward_list is a container that supports forward iterators and allows constant time insert and erase operations anywhere within the sequence, with storage management handled automatically. Fast random access to list elements is not supported.

A forward_list satisfies all of the requirements of a container (table 87), except that the size() member function is not provided. Descriptions are provided here only for operations on forward_list that are not described in that table or for operations where there is additional semantic information.

namespace std {
  template <class T, class Allocator = allocator<T> >
  class forward_list {
  public:
    // types:
    typedef typename Allocator::reference reference;
    typedef typename Allocator::const_reference const_reference;
    typedef implementation_defined iterator;       // See 23.1
    typedef implementation_defined const_iterator; // See 23.1
    typedef implementation_defined size_type;      // See 23.1
    typedef implementation_defined difference_type;// See 23.1
    typedef T value_type;
    typedef Allocator allocator_type;
    typedef typename Allocator::pointer pointer;
    typedef typename Allocator::const_pointer const_pointer;

    // 23.2.x.1 construct/copy/destroy:
    explicit forward_list(const Allocator& = Allocator());
    explicit forward_list(size_type n);
    explicit forward_list(size_type n, const T& value,
                   const Allocator& = Allocator());
    template <class InputIterator>
      forward_list(InputIterator first, InputIterator last,
            const Allocator& = Allocator());
    forward_list(const forward_list<T,Allocator>& x);
   
forward_list(forward_list<T,Allocator>&& x);
    ~forward_list();
    forward_list<T,Allocator>& operator=(const forward_list<T,Allocator>& x);
    forward_list<T,Allocator>& operator=(const forward_list<T,Allocator>&& x);
    template <class InputIterator>
      void assign(InputIterator first, InputIterator last);
    void assign(size_type n, const T& t);
    allocator_type get_allocator() const;

    // 23.2.x.2 iterators:
    iterator before_begin();
    const_iterator before_begin() const;
    iterator begin();
    const_iterator begin() const;
    iterator end();
    const_iterator end() const;

    const_iterator cbegin() const;
    const_iterator cbefore_begin() const;
    const_iterator cend() const;

    // capacity:
    bool empty() const;
    size_type max_size() const;
    void resize(size_type sz);
    void resize(size_type sz, value_type c);

    // 23.2.x.3 element access:
    reference front();
    const_reference front() const;

    // 23.2.x.4 modifiers:
    template <class... Args> void push_front(Args&&... args);
    void pop_front();
   
    iterator insert_after(const_iterator position, const T& x);
    template <class... Args> iterator emplace_after(const_iterator position, Args&&... args);
    void insert_after(const_iterator position, size_type n, const T& x);
    template <class InputIterator>
      void insert_after(const_iterator position, InputIterator first, InputIterator last);
    iterator erase_after(const_iterator position);
    iterator erase_after(const_iterator position, iterator last);
    void swap(forward_list<T,Allocator>&);
    void clear();

    // 23.2.x.5 forward_list operations:
    void splice_after(const_iterator position, forward_list<T,Allocator>& x);
    void splice_after(const_iterator position, forward_list<T,Allocator>& x, const_iterator i);
    void splice_after(const_iterator position, forward_list<T,Allocator>& x, const_iterator first, const_iterator last);

    void remove(const T& value);
    template <class Predicate> void remove_if(Predicate pred);

    void unique();
    template <class BinaryPredicate> void unique(BinaryPredicate binary_pred);

    void merge(forward_list<T,Allocator>&& x);
    template <class Compare> void merge(forward_list<T,Allocator>&& x, Compare comp);

    void sort();
    template <class Compare> void sort(Compare comp);

    void reverse();
  };

  // Comparison operators
  template <class T, class Allocator>
    bool operator==(const forward_list<T,Allocator>& x, const forward_list<T,Allocator>& y);
  template <class T, class Allocator>
    bool operator< (const forward_list<T,Allocator>& x, const forward_list<T,Allocator>& y);
  template <class T, class Allocator>
    bool operator!=(const forward_list<T,Allocator>& x, const forward_list<T,Allocator>& y);
  template <class T, class Allocator>
    bool operator> (const forward_list<T,Allocator>& x, const forward_list<T,Allocator>& y);
  template <class T, class Allocator>
    bool operator>=(const forward_list<T,Allocator>& x, const forward_list<T,Allocator>& y);
  template <class T, class Allocator>
    bool operator<=(const forward_list<T,Allocator>& x, const forward_list<T,Allocator>& y);

  // 23.2.x.6 specialized algorithms:
  template <class T, class Allocator>
    void swap(forward_list<T,Allocator>& x, forward_list<T,Allocator>& y);
  template <class T, class Allocator>
    void swap(forward_list<T,Allocator>&& x, forward_list<T,Allocator>& y);
  template <class T, class Allocator>
    void swap(forward_list<T,Allocator>& x, forward_list<T,Allocator>&& y);
}

3.2.x.1 forward_list constructors, copy, and assignment
 
explicit forward_list(const Allocator& = Allocator());
Effects: Constructs an empty forward_list, using the specified allocator.
Complexity: Constant.

explicit forward_list(size_type n);
explicit forward_list(size_type n, const T& value, const Allocator& = Allocator());
Effects: Constructs a forward_list with n copies of value, using the specified allocator.
Complexity: Linear in n.

template <class InputIterator>
forward_list(InputIterator first, InputIterator last, const Allocator& = Allocator());
Effects: Constructs a forward_list equal to the range [first, last).
Complexity: Linear in distance(first, last)

template <class InputIterator>
void assign(InputIterator first, InputIterator last);
Effects: clear(); insert_after(before_begin(), first, last);

void assign(size_type n, const T& t);
Effects: clear(); insert_after(before_begin(), n, t);

23.2.x.2 forward_list iterators

iterator before_begin();
const_iterator before_begin() const;
const_iterator cbefore_begin() const;
Returns: A non-dereferenceable iterator that, when incremented, is equal to the iterator returned by begin().

23.2.x.3 forward_list element access

reference front();
const_reference front() const;
Returns: *begin().

23.2.x.4 forward_list modifiers

None of the overloads of insert_after shall affect the validity of iterators and reference, and erase_after shall invalidate only the iterators and references to the erased elements. If an exception is thrown during insert_after there shall be no effect. Insertion of n elements into a forward_list is linear in n, and the number of calls to the copy constructor of T is exactly equal to n. Erasing n elements from a forward_list is linear time in n and the number of calls to the destructor of type T is exactly equal to n.

template <class... Args> void push_front(Args&&... args);
Effects: Inserts an object of type value_type constructed with value_type(std::forward<Args>(args)...) at the beginning of the list.

void pop_front();
Effects: erase_after(before_begin()).

iterator insert_after(const_iterator position, const T& x);
Requires: position is dereferenceable or equal to before_begin().
Effects: Inserts a copy of x after position.
Returns: An iterator pointing to the copy of x.

void insert_after(const_iterator position, size_type n, const T& x);
Requires: position is dereferenceable or equal to before_begin().
Effects: Inserts n copies of x after position.

template <class InputIterator>
void insert_after(const_iterator position, InputIterator first, InputIterator last);
Requires: position is dereferenceable or equal to before_begin(). first and last are not iterators in *this.
Effects: Inserts copies of elements in [first, last) after position.

template <class... Args>
iterator emplace_after(const_iterator position, Args&&... args);
Requires: position is dereferenceable or equal to before_begin().
Effects: Inserts an object of type value_type constructed with value_type(std::forward<Args>(args)...) after position.

iterator erase_after(const_iterator position);

Requires: The iterator following position is dereferenceable.
Effects: Erases the element pointed to by the iterator following position.
Returns: An iterator pointing to the element following the one that was erased, or end() if no such element exists.

iterator erase_after(const_iterator position, iterator last);
Requires: All iterators in the range (position, last) are dereferenceable.
Effects: Erases the elements in the range (position, last).
Returns: last

void clear();
Effects: Erases all elements in the range [begin(), end()).

23.2.x.5 forward_list operations

void splice_after(const_iterator position, forward_list<T,Allocator>& x);
Requires: position is dereferenceable or equal to before_begin(). &x != this.
Effects: Inserts the contents of x before position, and x becomes empty. Invalidates all iterators and references to x.
Throws: nothing
Complexity: O(1)

void splice_after(const_iterator position, forward_list<T,Allocator>& x, const_iterator i);
Requires: position is dereferenceable or equal to before_begin(). The iterator following i is a dereferenceable iterator in x.
Effects: Inserts the element following i into *this, following position, and removes it from x. Invalidates only the iterators and references to the spliced elements.
Throws: nothing
Complexity: O(1)

void splice_after(const_iterator position, forward_list<T,Allocator>& x, const_iterator first, const_iterator last);
Requires: position is dereferenceable or equal to before_begin(). [first, last) is a valid range in x, and all iterators in the range (first, last) are dereferenceable. position is not an iterator in the range (first, last).
Effects: Inserts elements in the range (first, last)after position and removes the elements from x. Invalidates only the iterators and references to the spliced elements.

void remove(const T& value);
template <class Predicate> void remove_if(Predicate pred);
Effects: Erases all the elements in the list referred by a list iterator i for which the following conditions hold: *i == value, pred(*i) is true. This operation shall be stable: the relative order of the elements that are not removed is the same as their relative order in the original list.
Throws: Nothing unless an exception is thrown by the equality comparison or the predicate.
Complexity: Exactly distance(begin(), end()) applications of the corresponding predicate.

void unique();
template <class BinaryPredicate> void unique(BinaryPredicate binary_pred);
Effects: Eliminates all but the first element from every consecutive group of equal elements referred to by the iterator i in the range [first + 1, last) for which *i == *(i-1) (for the version of unique with no arguments) or pred(*i, *(i - 1)) (for the version of unique with a predicate argument) holds.
Throws: Nothing unless an exception in thrown by the equality comparison or the predicate.
Complexity: If the range (last - first) is not empty, exactly (last - first) - 1 applications of the corresponding predicate, otherwise no applications of the predicate.

void merge(forward_list<T,Allocator>&& x);
template <class Compare> void merge(forward_list<T,Allocator>&& x, Compare comp);
Requires: comp defines a strict weak ordering (25.3), and *this and x are both sorted according to this ordering.
Effects: Merges x into *this. This operation shall be stable: for equivalent elements in the two lists, the elements from *this shall always precede the elements from x. x is empty after the merge. If an exception is thrown other than by a comparison there are no effects.
Complexity: At most size() + x.size() - 1 comparisons.

void sort();
template <class Compare> void sort(Compare comp);
Requires: operator< (for the first version) or comp (for the second version) defines a strict weak ordering (25.3).
Effects: Sorts the list according to the operator< or a Compare function object. This operation shall be stable: the relative order of the equivalent elements is preserved. If an exception is thrown the order of the elements in *this is indeterminate.
Complexity: Approximately N log N comparisons, where N == distance(begin(), end()).

void reverse();
Effects: Reverses the order of the elements in the list.
Throws: Nothing.
Complexity: Linear time.

23.2.x.6 forward_list specialized algorithms

template <class T, class Allocator>
   void swap(forward_list<T,Allocator>& x, forward_list<T,Allocator>& y);
template <class T, class Allocator>
   void swap(forward_list<T,Allocator>&& x, forward_list<T,Allocator>& y);
template <class T, class Allocator>
   void swap(forward_list<T,Allocator>& x, forward_list<T,Allocator>&& y);
Effects: x.swap(y)