Uninitialized algorithms for relocation

Document #: P3516R0 [Latest] [Status]
Date: 2025-01-11
Project: Programming Language C++
Audience: LEWG
Reply-to: Louis Dionne
<>
Giuseppe D’Angelo
<>

1 Introduction

P2786 is currently making progress through EWG and LEWG. In Poland, LEWG expressed a strong desire for that language feature to come with a proper user-facing library interface. This paper proposes such an interface. That interface is heavily influenced by the uninitialized algorithms that were originally proposed in P1144 which, after implementation experience in libc++, seem to be almost exactly what we need.

This paper aims to provide the high-level algorithms that all programmers will want to use, while purposefully not touching on some of the more contentious design points of P2786/P1144. The high-level algorithms proposed in this paper are compatible with essentially any definition of “trivial relocation” we might introduce in the language in C++26 or later.

2 Motivation

Today, many algorithms and containers use some flavor of assignment, either as an implementation detail or as part of their API promise. In cases where the use of assignment is merely an implementation detail, writing the code in terms of relocation can often result in cleaner code and improved performance (since move-construction is often faster than move-assignment).

Furthermore, in a future where the language itself provides a definition of relocation (and in particular trivial relocation), algorithms and containers written in terms of relocation today would benefit from a massive speedup for trivially relocatable types without requiring any change. Indeed, the algorithms introduced in this paper boil down to memmove (or memcpy if we can prove the absence of overlap) in trivial cases, which are fairly common.

3 Proposal

We propose adding algorithms std::uninitialized_relocate, std::uninitialized_relocate_n, and std::uninitialized_relocate_backward. We also add the equivalent ranges:: functions and the parallel algorithm overloads. At a high level, these algorithms relocate a range of objects from one memory location to another, which is a very useful primitive when writing containers and algorithms.

However, the library interface proposed here is intentionally independent of the details of the underlying language feature.

Design points:

4 Usage examples

In these examples, we ignore allocator support for simplicity, and we use iterator instead of const_iterator as the parameter types to lighten the code. The “real” code for implementing these functions does not look meaningfully different.

4.1 vector::emplace / vector::insert

Assignment based
Relocation based
template <class ...Args>
constexpr iterator
vector<T>::emplace(iterator position, Args&&... args) {
  if [[unlikely]] (size() == capacity()) {
    // slow path where we grow the vector
  }

  if (position == end()) {
    std::construct_at(std::to_address(position),
                      std::forward<Args>(args)...);
    ++end_;
  } else {
    T tmp(std::forward<Args>(args)...);

    // Create a gap at the end by moving the last
    // element into the "new last position".
    std::construct_at(std::to_address(end()),
                      std::move(back()));
    ++end_;

    // The element before the last one is moved-from:
    // shift everything else by one position to the
    // right to open a gap at the insert location.
    std::move_backward(position, end() - 2, end() - 1);

    // Now there's a gap at the insert location, move the
    // new element into it.
    *position = std::move(tmp);
  }
  return position;
}
template <class ...Args>
constexpr iterator
vector<T>::emplace(iterator position, Args&&... args) {
  if [[unlikely]] (size() == capacity()) {
    // slow path where we grow the vector
  }

  if (position == end()) {
    std::construct_at(std::to_address(position),
                      std::forward<Args>(args)...);
    ++end_;
  } else {
    T tmp(std::forward<Args>(args)...);

    // Create a gap inside the vector by relocating
    // everything to the right.
    try {
      std::uninitialized_relocate_backward(position, end(),
                                           end() + 1);
    } catch (...) {
      // basic guarantee: vector lost its tail
      end_ = std::to_address(position);
      throw;
    }

    // Move-construct the new value into the gap created above.
    try {
      std::construct_at(std::to_address(position), std::move(tmp));
    } catch (...) {
      // basic guarantee: vector lost its tail
      std::destroy(position + 1, end() + 1);
      end_ = std::to_address(position);
      throw;
    }

    ++end_;
  }
  return position;
}

A few things are worth noting:

4.2 vector::erase

Assignment based
Relocation based
constexpr iterator
vector<T>::erase(iterator first, iterator last) {
  if (first == last)
    return last;

  auto new_end = std::move(last, end(), first);
  std::destroy(new_end, end());

  end_ -= (last - first);
  return first;
}
constexpr iterator
vector<T>::erase(iterator first, iterator last) {
  if (first == last)
    return last;

  // Destroy the range being erased and relocate the
  // tail of the vector into the created gap.
  std::destroy(first, last);
  std::uninitialized_relocate(last, end(), first);

  end_ -= (last - first);
  return first;
}

Things worth noting:

4.3 vector::emplace_back / vector::push_back

Note that the vector growth policy presented here is more naive than what std::vector typically does. Also keep in mind that we must provide the strong exception safety guarantee here.

Move-construction based
Relocation based
template <class ...Args>
constexpr reference
vector<T>::emplace_back(Args&& ...args) {
  if [[likely]] (size() < capacity()) {
    std::construct_at(std::to_address(end()),
                      std::forward<Args>(args)...);
    ++end_;
    return back();
  }

  // Move or copy all the elements into a large
  // enough vector.
  vector<T> tmp;
  tmp.reserve((size() + 1) * 2);
  for (auto& element : *this) {
    [[assume(tmp.size() < tmp.capacity())]];
    tmp.emplace_back(std::move_if_noexcept(element));
  }

  [[assume(tmp.size() < tmp.capacity())]];
  tmp.emplace_back(std::forward<Args>(args)...);

  swap(tmp);
  return back();
}
template <class ...Args>
constexpr reference
vector<T>::emplace_back(Args&& ...args) {
  if [[likely]] (size() < capacity()) {
    std::construct_at(std::to_address(end()),
                      std::forward<Args>(args)...);
    ++end_;
    return back();
  }

  // Relocate all the elements into a large enough vector,
  // or copy if moving might throw.
  vector<T> tmp;
  tmp.reserve((size() + 1) * 2);
  if constexpr (is_nothrow_move_constructible_v<T>) {
    tmp.end_ = std::uninitialized_relocate(begin(), end(), tmp.begin_);
  } else {
    for (auto& element : *this) {
      [[assume(tmp.size() < tmp.capacity())]];
      tmp.emplace_back(element);
    }
  }

  // At this point, the original contents of the vector are in tmp, which is large enough.
  // Construct the new last element.
  try {
    [[assume(tmp.size() < tmp.capacity())]];
    tmp.emplace_back(std::forward<Args>(args)...);
  } catch (...) {
    swap(tmp);
    throw;
  }

  swap(tmp);
  return back();
}

Things worth noting:

5 Compatibility with trivial relocation proposals

If we had trivial relocation in the language, these algorithms would not need to change significantly, assuming that trivial relocation is equivalent to move + destroy when both are valid. We can’t imagine a trivial relocation design where that wouldn’t be the case.

If C++ gets a notion of trivial relocation that increases the expressive power over move + destroy (like P2786), then these uninitialized algorithms should be slightly generalized to support types that are not movable but that are still trivially relocatable. Thus, these algorithms would be able to operate either on types that are movable, or on types that are not movable but trivially relocatable instead. That would be a small backwards compatible change.

6 Wording

6.1 [algorithms.requirements]

Modify 26.2 [algorithms.requirements] as shown:

6.2 [specialized.algorithms.general]

Modify 26.11.1 [specialized.algorithms.general] as shown:

template<class T>
  constexpr void* voidify(T& obj) noexcept {
    return addressof(obj);
  }

+template<class Source, class Dest>
+ concept relocatable-from =
+   same_as<Source, Dest> &&
+   move_constructible<Dest>;
+
+template<class T>
+ requires relocatable-from<T, T>
+  constexpr T* relocate-at(T* dest, T* source) {
+      struct guard { T *t; ~guard() { destroy_at(t); } } g(source);
+      return ::new (voidify(*dest)) T(std::move(*source));
+  }

6.3 [special.mem.concepts]

Add at the end of 26.11.2 [special.mem.concepts]:

template<class I>
concept nothrow-bidirectional-iterator = // exposition only
  nothrow-forward-iterator<I> &&
  bidirectional_iterator<I> &&
  nothrow-sentinel-for<I, I>;
template<class R>
concept nothrow-bidirectional-range = // exposition only
  nothrow-forward-range<R> &&
  nothrow-bidirectional-iterator<iterator_t<R>>;

6.4 [memory.syn]

In 20.2.2 [memory.syn], add to the <memory> header synopsis (before [unique.ptr]):

template<class NoThrowInputIterator, class NoThrowForwardIterator>
  constexpr NoThrowForwardIterator
    uninitialized_relocate(NoThrowInputIterator first,    // freestanding
                           NoThrowInputIterator last,
                           NoThrowForwardIterator result);

template<class ExecutionPolicy, class NoThrowForwardIterator1, class NoThrowForwardIterator2>
  constexpr NoThrowForwardIterator2
    uninitialized_relocate(ExecutionPolicy&& exec,    // see [algorithms.parallel.overloads]
                           NoThrowForwardIterator1 first,
                           NoThrowForwardIterator1 last,
                           NoThrowForwardIterator2 result);

template<class NoThrowInputIterator, class Size, class NoThrowForwardIterator>
  constexpr NoThrowForwardIterator
    uninitialized_relocate_n(NoThrowInputIterator first,    // freestanding
                             Size n,
                             NoThrowForwardIterator result);

template<class ExecutionPolicy, class NoThrowForwardIterator1, class Size, class NoThrowForwardIterator2>
  constexpr NoThrowForwardIterator2
    uninitialized_relocate_n(ExecutionPolicy&& exec,    // see [algorithms.parallel.overloads]
                             NoThrowForwardIterator1 first,
                             Size n,
                             NoThrowForwardIterator2 result);

template<class NoThrowBidirectionalIterator1, class NoThrowBidirectionalIterator2>
  constexpr NoThrowBidirectionalIterator2
    uninitialized_relocate_backward(NoThrowBidirectionalIterator1 first,    // freestanding
                                    NoThrowBidirectionalIterator1 last,
                                    NoThrowBidirectionalIterator2 result);

template<class ExecutionPolicy, class NoThrowBidirectionalIterator1, class NoThrowBidirectionalIterator2>
  constexpr NoThrowBidirectionalIterator2
    uninitialized_relocate_backward(ExecutionPolicy&& exec, // see [algorithms.parallel.overloads]
                                    NoThrowBidirectionalIterator1 first,
                                    NoThrowBidirectionalIterator1 last,
                                    NoThrowBidirectionalIterator2 result);

namespace ranges {
  template<class I, class O>
    using uninitialized_relocate_result = in_out_result<I, O>;    // freestanding

  template<nothrow-input-iterator I, nothrow-sentinel-for<I> S1,
           nothrow-forward-iterator O, nothrow-sentinel-for<O> S2>
    requires relocatable-from<iter_value_t<I>, iter_value_t<O>>
      constexpr uninitialized_relocate_result<I, O>
        uninitialized_relocate(I ifirst, S1 ilast, O ofirst, S2 olast);    // freestanding

  template<nothrow-input-range IR, nothrow-forward_range<I> OR>,
    requires relocatable-from<range_value_t<IR>, range_value_t<OR>>
      constexpr uninitialized_relocate_result<borrowed_iterator_t<IR>, borrowed_iterator_t<OR>>
        uninitialized_relocate(IR&& in_range, OR&& out_range);    // freestanding

  template<class I, class O>
    using uninitialized_relocate_n_result = in_out_result<I, O>;    // freestanding

  template<nothrow-input-iterator I,
           nothrow-forward-iterator O, nothrow-sentinel-for<O> S>
    requires relocatable-from<iter_value_t<I>, iter_value_t<O>>
      constexpr uninitialized_relocate_n_result<I, O>
        uninitialized_relocate_n(I ifirst, iter_difference_t<I> n, O ofirst, S olast);    // freestanding

  template<class I, class O>
    using uninitialized_relocate_backward_result = in_out_result<I, O>;    // freestanding

  template<nothrow-bidirectional-iterator I, nothrow-sentinel-for<I> S1,
           nothrow-bidirectional-iterator O, nothrow-sentinel-for<O> S2>
    requires relocatable-from<iter_value_t<I>, iter_value_t<O>>
      constexpr uninitialized_relocate_backward_result<I, O>
        uninitialized_relocate_backward(I ifirst, S1 ilast, O ofirst, S2 olast);    // freestanding

  template<nothrow-bidirectional-range IR, nothrow-bidirectional-range OR>
    requires relocatable-from<range_value_t<IR>, range_value_t<OR>>
      constexpr uninitialized_relocate_backward_result<borrowed_iterator_t<IR>, borrowed_iterator_t<OR>>
        uninitialized_relocate_backward(IR&& in_range, OR&& out_range);    // freestanding
}

6.5 [specialized.algorithms]

6.5.1 [special.relocate]

Append to 26.11 [specialized.algorithms] a new subclause, uninitialized_relocate [special.relocate], with these contents:

template<class NoThrowInputIterator, class NoThrowForwardIterator>
  constexpr NoThrowForwardIterator
    uninitialized_relocate(NoThrowInputIterator first,
                           NoThrowInputIterator last,
                           NoThrowForwardIterator result);
NoThrowForwardIterator first_result = result;
try {
  while (first != last) {
    relocate-at(addressof(*result), addressof(*first));
    ++first;
    ++result;
  }
} catch (...) {
  destroy(++first, last);
  destroy(first_result, result);
  throw;
}

return result;
namespace ranges {
  template<nothrow-input-iterator I, nothrow-sentinel-for<I> S1,
           nothrow-forward-iterator O, nothrow-sentinel-for<O> S2>
    requires relocatable-from<iter_value_t<I>, iter_value_t<O>>
      constexpr uninitialized_relocate_result<I, O>
        uninitialized_relocate(I ifirst, S1 ilast, O ofirst, O2 olast);

  template<nothrow-input-range IR, nothrow-forward_range<I> OR>,
    requires relocatable-from<range_value_t<IR>, range_value_t<OR>>
      constexpr uninitialized_relocate_result<borrowed_iterator_t<IR>, borrowed_iterator_t<OR>>
        uninitialized_relocate(IR&& in_range, OR&& out_range);
}
O first_result = ofirst;
try {
  while (ifirst != ilast && ofirst != olast) {
    relocate-at(addressof(*ofirst), addressof(*ifirst));
    ++ifirst;
    ++ofirst;
  }
} catch (...) {
  destroy(++ifirst, ilast);
  destroy(first_result, ofirst);
  throw;
}

return {std::move(ifirst), ofirst};

6.5.2 [special.relocate_n]

Append to 26.11 [specialized.algorithms] a new subclause, uninitialized_relocate_n [special.relocate_n], with these contents:

template<class NoThrowInputIterator, class Size, class NoThrowForwardIterator>
  constexpr NoThrowForwardIterator
    uninitialized_relocate_n(NoThrowInputIterator first,
                             Size n,
                             NoThrowForwardIterator result);
NoThrowForwardIterator first_result = result;
try {
  while (n > 0) {
    relocate-at(addressof(*result), addressof(*first));
    ++first;
    ++result;
    --n;
  }
} catch (...) {
  destroy_n(++first, --n);
  destroy(first_result, result);
  throw;
}

return result;
namespace ranges {
  template<nothrow-input-iterator I,
           nothrow-forward-iterator O, nothrow-sentinel-for<O> S>
    requires relocatable-from<iter_value_t<I>, iter_value_t<O>>
      constexpr uninitialized_relocate_n_result<I, O>
        uninitialized_relocate_n(I ifirst, iter_difference_t<I> n, O ofirst, S olast);
}
O first_result = ofirst;
try {
  while (n > 0 && ofirst != olast) {
    relocate-at(addressof(*ofirst), addressof(*ifirst));
    ++ifirst;
    ++ofirst;
    --n;
  }
} catch (...) {
  destroy_n(++ifirst, --n);
  destroy(first_result, ofirst);
  throw;
}

return {std::move(ifirst), ofirst};

6.5.3 [special.relocate_backward]

Append to 26.11 [specialized.algorithms] a new subclause, uninitialized_relocate_backward [special.relocate_backward], with these contents:

template<class NoThrowBidirectionalIterator1, class NoThrowBidirectionalIterator2>
  constexpr NoThrowBidirectionalIterator2
    uninitialized_relocate_backward(NoThrowBidirectionalIterator1 first,
                                    NoThrowBidirectionalIterator1 last,
                                    NoThrowBidirectionalIterator2 result);
NoThrowBidirectionalIterator2 result_last = result;
try {
  while (last != first) {
    --last;
    --result;
    relocate-at(addressof(*result), addressof(*last));
  }
} catch (...) {
  destroy(first, last);
  destroy(++result, result_last);
  throw;
}

return {last, result};
namespace ranges {
  template<nothrow-bidirectional-iterator I, nothrow-sentinel-for<I> S1,
           nothrow-bidirectional-iterator O, nothrow-sentinel-for<O> S2>
    requires relocatable-from<iter_value_t<I>, iter_value_t<O>>
      constexpr uninitialized_relocate_backward_result<I, O>
        uninitialized_relocate_backward(I ifirst, S1 ilast, O ofirst, S2 olast);

  template<nothrow-bidirectional-range IR, nothrow-bidirectional-range OR>
    requires relocatable-from<range_value_t<IR>, range_value_t<OR>>
      constexpr uninitialized_relocate_backward_result<borrowed_iterator_t<IR>, borrowed_iterator_t<OR>>
        uninitialized_relocate_backward(IR&& in_range, OR&& out_range);
}
O result_last = olast;
try {
  while (ilast != ifirst && olast != ofirst) {
    --ilast;
    --olast;
    relocate-at(addressof(*olast), addressof(*ilast));
  }
} catch (...) {
  destroy(ifirst, ilast);
  destroy(++olast, result_last);
  throw;
}

return {ilast, olast};