1. Introduction
This is a library paper to describe how size feedback can be used with allocators:
-
In the case of
, the language feature proposed in [P0901R5] could be used.std :: allocator -
In the case of custom allocators, the availability of the traits interface allows standard library containers to leverage this functionality.
2. Motivation
Consider code adding elements to
:
std :: vector < int > v = { 1 , 2 , 3 }; // Expected: v.capacity() == 3 // Add an additional element, triggering a reallocation. v . push_back ( 4 );
Many allocators only allocate in fixed-sized chunks of memory, rounding up
requests. Our underlying heap allocator received a request for 12 bytes (
) while constructing
. For several implementations, this request
is turned into a 16 byte region.
2.1. Why not realloc
poses several problems problems:
-
We have unclear strategies for growing a container. Consider the two options (
anda
) outlined in the comments of the example below. Assume, as with the § 2 Motivation example, that there is a 16 byte size class for our underlying allocator.b std :: vector < int > v = { 1 , 2 , 3 }; // Expected: v.capacity() == 3 // Add an additional element. Does the vector: // a) realloc(sizeof(int) * 4), adding a single element // b) realloc(sizeof(int) * 6), attempt to double, needing reallocation v . push_back ( 4 ); // Add another element. // a) realloc(sizeof(int) * 5), fails in-place, requires reallocation // b) size + 1 <= capacity, no reallocation required v . push_back ( 5 ); Option
requires every append (after the initial allocation) try toa
, requiring an interaction with the allocator each time.realloc Option
requires guesswork to probe the allocator’s true size. By doubling (the typical growth pattern forb
in libc++ and libstdc++), we skip past the existing allocation’s size boundary. While we can use a reduced growth rate, doing so requires more interactions with the allocator (calls tostd :: vector
), as well as copies (when we cannot resize in-place).realloc Both of these options add extra branching on the control flow for growing the buffer.
-
When the allocator cannot resize the buffer in-place, it allocates new memory and
's the contents. For non-trivially copyable types, this is disastrous. The language lacks support for reallocating in-place without moving. While this function could be implemented as magic library function, it could not interact with many underlyingmemcpy
implementations (libc, in the case of libc++ and libstdc++) used to buildmalloc
, nor could it interact with user-provided replacement functions.operator new In-place reallocation has been considered in [N3495] (throwing
when unable) and [P0894R1] (returningstd :: bad_alloc false
when unable). This has sometimes been referred to as an "extend" or an "expand" interface. Extending an existing allocation requires a round-trip to the allocator (and its requisite branch/call overheads) to determine whether the allocation could be expanded. As discussed previously, we need to probe for the true size of the backing object (until we find extend fails) until we give up and apply a multiplicative expansion factor.
For fixed-size allocators it makes more sense, and is much simpler for containers to adapt to, if the allocator is able to over-allocate on the initial request and inform the caller how much memory was made available.
2.2. Why not ask the allocator for the size
We could also explore APIs that answer the question: "If I ask for
bytes,
how many do I actually get?" [jemalloc] and [TCMalloc] call this
.
-
requires an extra call / size calculation. This requires we duplicate work, as the size calculation is performed as part of allocation itself.nallocx -
impairs telemetry for monitoring allocation request sizes. If anallocx
return value is cached, the user appears to be asking for exactly as many bytes asnallocx
said they would get.nallocx
See also [P0901R5]'s discussion "
: not as awesome as it looks."
3. Proposal
Wording relative to [N4868].
We propose a free function to return the allocation size simultaneously with an allocation.
Amend [memory.syn]:
namespace std { // ... elided ... // [allocator.traits], allocator traits template < class Alloc > struct allocator_traits ; template < class Pointer > struct allocation_result { Pointer ptr ; size_t count ; }; template < class Allocator > [[ nodiscard ] constexpr allocation_result < typename allocator_traits < Allocator >:: pointer > allocate_at_least ( Allocator & a , size_t n ); // [default.allocator], the default allocator template < class T > class allocator ; template < class T , class U > constexpr bool operator == ( const allocator < T >& , const allocator < U >& ) noexcept ; // ... elided ... }
Amend [allocator.traits.types] to ensure structured binding compatibility:
The class template
has the template parameters, data members, and special members specified above. It has no base classes or members other than those specified.
allocation_result
Insert [allocator.traits.other]:
template < class Allocator > [[ nodiscard ]] constexpr allocation_result < typename allocator_traits < Allocator >:: pointer > allocate_at_least ( Allocator & a , size_t n );
- Returns:
if that expression is well-formed; otherwise,
a . allocate_at_least ( n ) .
{ a . allocate ( n ), n }
Amend [default.allocator.general]:
namespace std { template < class T > class allocator { public : using value_type = T ; using size_type = size_t ; using difference_type = ptrdiff_t ; using propagate_on_container_move_assignment = true_type ; using is_always_equal = true_type ; constexpr allocator () noexcept ; constexpr allocator ( const allocator & ) noexcept ; template < class U > constexpr allocator ( const allocator < U >& ) noexcept ; constexpr ~ allocator (); constexpr allocator & operator = ( const allocator & ) = default ; [[ nodiscard ]] constexpr T * allocate ( size_t n ); [[ nodiscard ]] constexpr allocation_result < T *> allocate_at_least ( size_t n ); constexpr void deallocate ( T * p , size_t n ); }; }
Amend [allocator.members]:
Except for the destructor, member functions of the default allocator shall not introduce data races as a result of concurrent calls to those member functions from different threads. Calls to these functions that allocate or deallocate a particular unit of storage shall occur in a single total order, and each such deallocation call shall happen before the next allocation (if any) in this order.
[[ nodiscard ]] constexpr T * allocate ( size_t n );
- Mandates:
is not an incomplete type ([basic.types]).
T - Returns: A pointer to the initial element of an array of
n .
T - Throws:
if
bad_array_new_length , or
numeric_limits < size_t >:: max () / sizeof ( T ) < n if the storage cannot be obtained.
bad_alloc - Remarks: The storage for the array is obtained by calling
, but it is unspecified when or how often this function is called. This function starts the lifetime of the array object, but not that of any of the array elements.
:: operator new [[ nodiscard ]] constexpr allocation_result < T *> allocate_at_least ( size_t n );
- Mandates:
is not an incomplete type ([basic.types]).
T - Returns:
, where
allocation_result < T *> { ptr , count } is a pointer to the initial element of an array of
ptr
count and
T .
count >= n - Throws:
if
bad_array_new_length , or
numeric_limits < size_t >:: max () / sizeof ( T ) < n if the storage cannot be obtained.
bad_alloc - Remarks: The storage for the array is obtained by calling
, but it is unspecified when or how often this function is called. This function starts the lifetime of the array object, but not that of any of the array elements.
:: operator new constexpr void deallocate ( T * p , size_t n );
Preconditions:is a pointer value obtained from
p .
allocate equals the value passed as the first argument to the invocation of
n which returned
allocate .
p - Preconditions:
- If
is memory that was obtained by a call to
p , let ret be the value returned and req be the value passed as the first argument to that call.
allocate_at_least is equal to
p and
ret . ptr is a value such that .
n - Otherwise,
is a pointer value obtained from
p .
allocate equals the value passed as the first argument to the invocation of
n which returned
allocate .
p - Effects: Deallocates the storage referenced by
.
p - Remarks: Uses
, but it is unspecified when this function is called.
:: operator delete
Amend [allocator.requirements.general], [tab:cpp17.allocator]. The return type
is used in the table for consistency with
but should be resolved according to the LWG issue.
a . allocate ( n )
X :: pointer Memory is allocated for an array of
n and such an object is created but array elements are not constructed.
T [Example 1: When reusing storage denoted by some pointer value
,
p can be used to implicitly create a suitable array object and obtain a pointer to it. — end example]
launder ( reinterpret_ cast < T *> ( new ( p ) byte [ n * sizeof ( T )]))
may throw an appropriate exception.
allocate [Note 1: If n == 0, the return value is unspecified. — end note]See Note C, below.
a . allocate ( n , y )
X :: pointer Same as . The use of
a . allocate ( n ) is unspecified, but it is intended as an aid to locality.
y
a . allocate ( n )
a . allocate_at_least ( n )
allocation_result < X :: pointer > Returns: where
allocation_result < X :: pointer > { ptr , count } is memory allocated for an array of
ptr
count and such an object is created but array elements are not constructed, such that
T .
count >= n may throw an appropriate exception. See Note C, below.
allocate_at_least See Note D, below.
a . deallocate ( p , n ) (not used)
Preconditions:is a value returned by an earlier call to
p that has not been invalidated by an intervening call to
allocate .
deallocate matches the value passed to
n to obtain this memory.
allocate - Preconditions:
- If
is memory that was obtained by a call to
p , let ret be the value returned and req be the value passed as the first argument of that call.
a . allocate_at_least is equal to
p and
ret . ptr is a value such that .
n - Otherwise,
is a pointer value obtained from
p .
allocate equals the value passed as the first argument to the invocation of
n which returned
allocate .
p has not been invalidated by an intervening call to
p .
deallocate - Throws: Nothing.
... elided ...
Note C: If
, the return value is unspecified.
n == 0 Note D: An allocator need not support
, but no default is provided in
allocate_at_least . If an allocator has an
allocator_traits member, it shall satisify the requirements.
allocate_at_least
Amend [version.syn]
#define __cpp_lib_allocate_at_least PLACEHOLDER // also in <memory>
4. Design Considerations
4.1. allocate
selection
There are multiple approaches here:
-
Return a pointer-size pair, as presented.
-
Overload
and return via a reference parameter. This potentially hampers optimization depending on ABI, as the value is returned via memory rather than a register.allocate
4.2. Size Return Value
In [Prague], LEWG discussed the return value. For compatibility with the existing allocator APIs, which work in units of objects rather than bytes, this proposal chooses to continue to return an integer number of objects.
Additionally, for types with non-trivial alignment requirements, we must allocate storage for objects, rather than bytes, as raw bytes do not convey the appropriate alignment needs to the allocator.
For example: In the
case, many implementations use 3 pointers
for representing the state of the container (begin, end, and capacity). If we
preserved the precise value returned by the underlying allocator, we may not be
able to legally form the capacity pointer. For these implementations,
replacing the capacity pointer with a capacity in bytes would be an ABI break.
4.3. deallocate
changes
We now require flexibility on the size we pass to
. For container
types using this allocator interface, they are faced with the choice of storing both the original size request as well as the provided size (requiring
additional auxillary storage), or being unable to reproduce one during the call
to
.
As the true size is expected to be useful for the
of a
or
instance, the returned size is available without additional storage
requirements. The original (requested) size is unavailable, so we relax the
size parameter requirements for these allocations.
In [P0901R5], we anticipate use cases where the user does not store enough
information about the size to return a precise, byte-level count. We may be
allocating
's and receive enough storage to allocate 3.5 objects. Requiring
callers return the precise size may be impractical (
frequently
stores its capacity as a
) without additionally auxillary storage.
In this paper, we may be similarly unable to return the true size back. In the [Prague] review of this paper, we discussed cases where we rounded down to
the multiple of the size being allocated. Alternatively, custom data
structures may use
to obtain storage, but quantize that. In
Abseil’s [Cord], data is stored in chunks with a length-encoding tag at the
start of each chunk. These tags quantize the capacity in
that can be represented to multiples of 16- and 32-bytes. This makes the
encoded length more compact, but means we cannot reproduce the size the
allocator provided us precisely.
4.4. Interaction with polymorphic_allocator
is implemented using virtual functions. Adding new
methods, such as the proposed allocate API would require taking an ABI break.
4.5. Zero-Sized Requests
In Prague, LEWG discussed the behavior of
.
This maximizes implementation freedom.
-
The return value may not be dereferenceable (even if not
).nullptr -
Other implementations, such as a stack allocator, could just return the current offset.
5. Revision History
5.1. R5 → R6
Applied wording feedback from the reflector.
-
In [memory.syn] added missing template parameter and
.[[ nodiscard ]] -
In [memory.traits.other] added missing template parameter.
-
In [allocator.members]:
-
Removed
.a . -
Clarified
is the value of the argument to the call.req -
Used math operators, not C++ operators.
-
-
Removed unneeded parentheses around
.allocate -
Fixed wording in [tab:cpp17.allocator].
-
Inserted line break between "Note C" and "Note D".
-
Changed "note D" from must to shall.
5.2. R4 → R5
Applied wording feedback from [LWGNovTelecon].
-
Generally, replaced
withtypename T class T -
In [allocator.traits], removed change as redundant with [memory.syn]
-
In [allocator.traits.type]:
-
Fixed insert editing on [allocator.traits.type]
-
Added note (in paper) for structured binding intent of [allocator.traits.type] wording
-
-
In [allocator.traits.other]
-
Moved
to new stable heading, as it is not a member ofallocate_at_least allocator_traits -
Fixed otherwise clause of
allocate_at_least -
Added
to[[ nodiscard ]] allocate_at_least
-
-
In [allocator.members], applied math font in
editsdeallocate -
In [allocator.members], removed shall in
deallocate -
In [allocator.members], applied code font to
callsallocate -
In [allocator.members], removed structured binding in
textdeallocate -
In [allocator.members], bulleted cases for clarity.
-
In [allocator.requirements.general]
-
Moved Note 1 from
to be "Note C" shared withallocate allocate_at_least -
Moved "Note C" to "Note D" in the defaults column.
-
Removed value from return type for
allocate_at_least -
Added what
actually returns in third columnallocate_at_least -
Removed
qualification from (now) "Note D"std :: -
Removed definition of
(the helper) in (now) "Note D" rather than redefining it.allocate_at_least -
Applied modifications from [allocator.members]
to table.deallocate -
Required that if a
member is provided, it must meet the requirements.allocate_at_least
-
5.3. R3 → R4
-
Applied wording feedback from Pablo Halpern and David Stone
-
Added discussion about changes to deallocation
-
Proposed feature test macro
5.4. R2 → R3
Applied LEWG feedback from [Prague].
-
Renamed
tosized_ptr_t
, based on discussion in Prague that this library proposal (which returns a count of storage for objects) have a distinct type from the underlying language proposal ([P0901R5], which returns bytes). This was one of the names bikeshedded during the review of P0901R5 ([P0901R5-Review]).allocation_result -
Renamed
(the returned size) ton
, based on discussion in Prague to clarify that is measured in objects, rather than bytes. Additionally, augmented discussion of thecount
changes for why implementations may not be able to preserve the precise size returned by the underlying allocator.deallocate
Poll: We prefer to return number of bytes.
SF F N A SA 0 0 6 6 5
-
Changed the template parameter to pointer type to allow for fancy pointers.
POLL: Change the template parameter to sized_ptr_t to pointer type (to support fancy ptr).Unanimous consent
-
Added design intent and wording that
be compatible with structured bindings. This wording was adapted from the proposed resolution of [LWG3373].allocation_result -
Added
constexpr -
Removed default template parameter from
allocation_result -
Based on LEWG feedback, changed
to deduce return type from the allocator. Since the pointer type is chosen by the allocator, the underlying object type is less relevant. This also removed the default template parameter.allocate_at_least
POLL: We want to deduce the object type from the allocator (alloc_least_n [sic]).
SF F N A SA 4 8 1 0 0
POLL: We like the design, please revise and do initial wording review (with Pablo, and David Stone) send to LWG (tentatively Ready).
SF F N A SA 6 12 0 0 0
5.5. R1 → R2
Applied LEWG feedback from [Cologne].
Poll: We want a feature like this.
SF F N A SA 2 9 2 0 0
-
As users may specialize
, this revision moves to a free functionstd :: allocator_traits
as suggested via a LEWG poll.std :: allocate_at_least
Poll: Name choice
5
allocate_with_size 4
overallocate 2
allocate_with_oversubscribe 6
allocate_oversize 14
allocate_at_least Poll: Prefer a struct rather than a requirement of structured bindings.
SF F N A SA 2 8 0 2 1