1. Motivation
Performance sensitive code is impacted by the cost of initializing and
manipulating strings. When writing data into a
, a
programmer is faced with an unhappy choice:
-
Pay for extra initialization —
, which zero initializes, followed by copy.resize -
Pay for extra copies — Populate a temporary buffer, copy it to the string.
-
Pay for extra "bookkeeping" —
followed by small appends, each of which checks capacity and null terminates.reserve
The effects of these unhappy choices can all be measured at scale, yet C++'s hallmark is to leave no room between the language (or in this case the library) and another language.
LEWGI polled on this at the [SAN] Meeting:
We should promise more committee time to [P1072R1], knowing that our time is scarce and this will leave less time for other work?
Unanimous consent
LEWG at the [SAN] Meeting:
We want to solve this problem
SF F N A SA 17 5 0 0 0 Willing to solve this for string without solving it for vector
SF F N A SA 6 9 2 1 0
2. Proposal
This proposal addresses the problem by adding
:
template<typename Operation> constexpr void resize_and_overwrite(size_type n, Operation op);
Effects: Alters the value of
as follows:
* this Invokes
- — If
, erases the last
n <= size () elements.
size () - n - — If
, appends
n > size () default-initialized elements.
n - size () .
erase ( begin () + op ( data (), n ), end ()) Remarks:
If
throws the behavior is undefined.
op Let
before the call to
o = size () .
resize_and_overwrite Let
.
m = op ( data (), n )
otherwise the behavior is undefined.
m <= n If
,
m > o shall replace ([expr.ass]) the values stored in the character array
op . Until replaced, the values may be indeterminate [basic.indet] [Note:
[ data () + o , data () + m ) may not be
* ( data () + o ) . - end note]
CharT ()
may write to
op . Any value written will be replaced with CharT() after
data () + n + 1 returns. [Note: This facilitiates interoperation with functions that write a trailing null. - end note ]
op When
is called
op is in an unspecified state.
* this shall not bind or in any way access
op .
* this
shall not allow its first argument to be accessible after it returns.
op
In order to enable
, this proposal makes it
implementation-defined whether
uses
and
to
construct and destroy the "char-like objects" that it controls. See § 5.2 Allocator Interaction and § 8 Wording below for more details.
3. Implementation
includes a private implementation of a prior version of this
proposal (based on an earlier revision) and uses it to avoid a dynamic
allocation in
[LIBC++].
4. Examples
4.1. Stamping a Pattern
Consider writing a pattern several times into a string:
std :: string GeneratePattern ( const std :: string & pattern , size_t count ) { std :: string ret ; ret . reserve ( pattern . size () * count ); for ( size_t i = 0 ; i < count ; i ++ ) { // SUB-OPTIMAL: // * Writes 'count' nulls // * Updates size and checks for potential resize 'count' times ret . append ( pattern ); } return ret ; }
Alternatively, we could adjust the output string’s size to its final size,
avoiding the bookkeeping in
at the cost of extra initialization:
std :: string GeneratePattern ( const std :: string & pattern , size_t count ) { std :: string ret ; const auto step = pattern . size (); // SUB-OPTIMAL: We memset step*count bytes only to overwrite them. ret . resize ( step * count ); for ( size_t i = 0 ; i < count ; i ++ ) { // GOOD: No bookkeeping memcpy ( ret . data () + i * step , pattern . data (), step ); } return ret ; }
With this proposal:
std :: string GeneratePattern ( const std :: string & pattern , size_t count ) { std :: string ret ; const auto step = pattern . size (); // GOOD: No initialization ret . resize_and_overwrite ( step * count , [ & ]( char * buf , size_t n ) { for ( size_t i = 0 ; i < count ; i ++ ) { // GOOD: No bookkeeping memcpy ( buf + i * step , pattern . data (), step ); } return step * count ; }); return ret ; }
4.2. Interacting with C
Consider wrapping a C API while working in terms of C++'s
vocabulary. We anticipate over-allocating, as computation of the length of
the data is done simultaneously with the computation of the contents.
extern "C" { int compress ( void * out , size_t * out_size , const void * in , size_t in_size ); size_t compressBound ( size_t bound ); } std :: string CompressWrapper ( std :: string_view input ) { std :: string compressed ; // Compute upper limit of compressed input. size_t size = compressBound ( input . size ()); // SUB-OPTIMAL: Extra initialization compressed . resize ( size ); int ret = compress ( &* compressed . begin (), & size , input . data (), input . size ()); if ( ret != OK ) { throw ... some error ... } // Set actual size. compressed . erase ( size ); return compressed ; }
With this proposal:
extern "C" { int compress ( void * out , size_t * out_size , const void * in , size_t in_size ); size_t compressBound ( size_t bound ); } std :: string CompressWrapper ( std :: string_view input ) { std :: string compressed ; // Compute upper limit of compressed input. size_t size = compressBound ( input . size ()); int ret ; // GOOD: No initialization compressed . resize_and_overwrite ( size , [ & ]( char * buf , size_t n ) { ret = compress ( buf , & size , input . data (), input . size ()); return * size ; }); if ( ret != OK ) { throw ... some error ... } return compressed ; }
5. Design Considerations
5.1. Method vs. Alternatives
During the [SAN] Meeting LEWG expressed a preference for
implementing this functionality as a new method on
(as
proposed in [P1072R0]) rather than a standalone "storage buffer"
type (one option in [P1072R1]):
Method on string vs storage_buffer type:
Strong Method Method Neutral Type Strong Type 9 2 3 2 2
During the [KOA] Meeting, LEWG expressed a preference for not weakening
's invariants after
returns. Under this
revision:
-
A default-initialized buffer is only accessible to
.op -
Under penalty of undefined behavior,
must assign (replace) the values in resulting buffer. The default initialized portion (op
) above is erased and inaccessible to the program without further violation of class invariants.[ data () + m , data () + n ) -
This proposal follows the "lambda" option discussed in [SAN].
Several other alternatives were discussed at [SAN], [KOA], and on the reflector. Please see § 6 Alternatives Considered below for more details.
5.2. Allocator Interaction
Unlocking the desired optimizations requires some change to
's interaction with allocators. This proposal does what
we think is the simplest possible change: remove the requirement that
call
or
.
This restriction should be acceptable because
is
defined to only hold "non-array trivial standard-layout" types.
[strings.general] p1:
This Clause describes components for manipulating sequences of any non-array trivial standard-layout (6.7) type. Such types are called char-like types, and objects of char-like types are called char-like objects or simply characters.
Removing calls to
and
is compatible with
as long as
is false
,
which should be the case in practice.
Along the way, this proposal clarifies ambiguous language in [string.require] and [container.requirements.general] by:
-
Stating explicitly that
is allocator-aware.basic_string -
Introducing the term "allocator-aware" earlier in [container.requirements.general].
-
Not attempting to mention other "components" in [container.requirements.general]. The other allocator-aware containers (just
andbasic_string
?) can independently state that they are "allocator-aware" in their own clauses.regex :: match_results
libstdc++ and msvc allow strings of non-trivial type. That might force
those libraries to continue to support
and
as
"extensions". On the other hand, libc++ disallows non-trivial
s, so any such extensions are non-portable. See godbolt.org.
5.3. Undefined Behavior
exposes users to UB if they read indeterminate
([basic.indet]) values in the string buffer passed to
. Despite
this foot-gun,
is appealing because:
-
Uninitialized buffers are not uncommon in C++. See for example
or the recently adoptednew T [ 20 ]
.make_unique_for_overwrite -
Dynamic analyzers like Memory Sanitizer [MSan] and Valgrind [Valgrind] can catch uninitialized reads.
The version of this proposal seen by LEWG in [Cologne] throws
when
returns a value greater than
. The authors now
recommend and LEWG confirmed via [LEWGNovTelecon] to use UB. For the
to
have written greater than
bytes, it would have had to overrun the provided
buffer (already UB).
The authors further recommend UB if
were to throw.
is motivated by measured performance
improvements that are small individually but significant at scale.
Providing exception guarantees would incur overhead that is in tension
with the original motivation.
5.4. Bikeshedding
What do we want to call this method?
-
(as proposed). This is consistent withresize_and_overwrite
[P1973R1]. Unlike themake_unique_for_overwrite
version, this concludes with the buffer being overwritten (as far asmake_unique
specifies), rather than requiring further action be taken.op -
(as proposed). This is consistent withresize_default_init
[P1020R1], adopted in San Diego. This name was removed from the working paper subsequently and renaemdmake_unique_default_init
.make_unique_for_overwrite -
(as proposed in R0). "Uninitialized" is different from "default-initialized", which is what we want to specify. Also,resize_uninitialized
is already used elsewhere in the Library to mean "not constructed at all", as inuninitialized
; so this would be an overloading of the term.uninitialized_copy
6. Alternatives Considered
6.1. Tag Type
At the [SAN] Meeting, LEWG showed support for a tag argument type:
Approval (vote for as many as you find acceptable)
13 Go back to resize_uninitialized 15 Do tag type (default_initialize) for c’tor / resize() 12 Continue with storage_buffer (as per R2 of this paper) 7 Crack open with a lambda 7 RAII separate type
For example:
std :: string GeneratePattern ( const std :: string & pattern , size_t count ) { const auto step = pattern . size (); // GOOD: No initialization std :: string ret ( step * count , std :: string :: default_init ); for ( size_t i = 0 ; i < count ; i ++ ) { // GOOD: No bookkeeping memcpy ( ret . data () + i * step , pattern . data (), step ); } return ret ; }
Benefits:
-
A constructor that takes a tag type simplifies some use cases, like the example above.
-
Matches existing practice in Boost. See § 7.7 Boost.
Drawbacks:
-
Feature creep / complexity — A tag type invites generalizing for all allocator-aware container types. We agree that this is desirable and even has implementation experience in Boost. However default initialization is not enough. There is also "implicit construction" (see [P1010R1], [P0593R2]) and "relocation" (see [P1144R0], [P1029R1]). Neither of these features are yet in the language. It is too early to generalize. Note that the second poll quoted in § 1 Motivation shows support for solving this problem for
but not[ basic_ ] string
.vector -
In reflector discussion of
[Zero], there was a preference for avoiding tag types. The standard library hasmake_unique_default_init
, notcopy_backward
with a tag, andcopy
, rather thancount_if
with a predicate.count
Conclusion:
LEWG should explore tags for allocator-aware containers, but that work should not block near-term enablement of efficient [std::]string builders.
6.2. Non-Lambda Approach
In [P1072R0] and [P1072R3] of this proposal, the authors considered a
method
/
which left a
default-initialized buffer accessible to users of the
instance
after the method returned. This method was rejected in [KOA], due to the
weakened class invariants.
For illustration:
extern "C" { int compress ( void * out , size_t * out_size , const void * in , size_t in_size ); size_t compressBound ( size_t bound ); } std :: string CompressWrapper ( std :: string_view input ) { std :: string compressed ; // Compute upper limit of compressed input. size_t size = compressBound ( input . size ()); // GOOD: No initialization compressed . resize_default_init ( size ); int ret = compress ( &* compressed . begin (), & size , input . data (), input . size ()); if ( ret != OK ) { throw ... some error ... } // Suppose size is the value of size before the call to compress and size' // is the value of size after the call to compress. // // If size' < size, then: // std::cout << compressed[size' + 1] // ...triggers a read of uninitialized data. // Set actual size. compressed . erase ( size ); return compressed ; }
6.3. Standalone Type: storage_buffer
In [P1072R1], we considered
, a standalone type
providing a
method (similar to the
method
proposed here) and
(to promise, under penalty of UB, that the
buffer had been initialized).
At the [SAN] Meeting, this approach received support from LEWGI in light of the [post-Rapperswil] email review indicating support for a distinct type. This approach was rejected by the larger LEWG room in San Diego Meeting Diego.
The proposed type would be move-only.
std :: string GeneratePattern ( const std :: string & pattern , size_t count ) { std :: storage_buffer < char > tmp ; const auto step = pattern . size (); // GOOD: No initialization tmp . prepare ( step * count + 1 ); for ( size_t i = 0 ; i < count ; i ++ ) { // GOOD: No bookkeeping memcpy ( tmp . data () + i * step , pattern . data (), step ); } tmp . commit ( step * count ); return std :: string ( std :: move ( tmp )); }
For purposes of the container API,
corresponds to the committed portion of the buffer. This leads to more consistency when working with (and
explicitly copying to) other containers via iterators, for example:
std :: storage_buffer < char > buf ; buf . prepare ( 100 ); * fill in data * buf . commit ( 50 ); std :: string a ( buf . begin (), buf . end ()); std :: string b ( std :: move ( buf )); assert ( a == b );
Benefits:
-
By being move-only, we would not have the risk of copying types with trap representations (thereby triggering UB).
-
Uninitialized data is only accessible from the
type. For an API working withstorage_buffer
, no invariants are weakened. Crossing an API boundary with abasic_string
is much more obvious than a "storage_buffer
with possibly uninitialized data." Uninitialized bytes (the promise made bybasic_string
) never escape into the valid range ofcommit
.basic_string
Drawbacks:
-
requires introducing an extra type to the standard library, even though its novel functionality (fromstorage_buffer
andstring
) is limited to the initialization abilities.vector -
is often implemented with a short-string optimization (SSO) and an extra type would need to implement that (likely by additional checks when moving to/from thebasic_string
) that are often unneeded.storage_buffer
6.4. Externally-Allocated Buffer Injection
In [P1072R1], we considered that
could "adopt" an
externally
'd buffer. At the [SAN] Meeting, we
concluded that this was:
-
Not critical
-
Not constrained in the future
-
Overly constraining to implementers. Allowing users to provide their own buffers runs into the "offset problem". Consider an implementation that stores its
andsize
inline with its data, socapacity
.sizeof ( container ) == sizeof ( void * ) class container { struct Rep { size_t size ; size_t capacity ; }; Rep * rep_ ; }; If using a
-style implementation, the mismatch in offsets requires an O(N) move to shift the contents into place and trigger a possible reallocation.Rep -
Introducing new pitfalls. It would be easy to mix
andnew []
inadvertently.allocator :: allocate
7. Related Work
7.1. Google
Google has a local extension to
called
which is wrapped as
.
-
[Abseil] uses this to avoid bookkeeping overheads in
andStrAppend
.StrCat -
-
In decompression, the final size of the output buffer is known before the contents are ready.
-
During compression, an upperbound on the final compressed size is known, allowing data to be efficiently added to the output buffer (eliding
's checks) and the string to be shrunk to its final, correct size.append
-
-
[Protobuf] avoids extraneous copies or initialization when the size is known before data is available (especially during parsing or serialization).
7.2. MongoDB
MongoDB has a string builder that could have been implemented in terms of
as a return value. However, as explained by Mathias Stearn, zero
initialization was measured and was too costly. Instead a custom string builder
type is used:
E.g.: https://github.com/mongodb/mongo/blob/master/src/mongo/db/fts/unicode/string.h
/** * Strips diacritics and case-folds the utf8 input string, as needed to support * options. * * The options field specifies what operations to *skip*, so kCaseSensitive * means to skip case folding and kDiacriticSensitive means to skip diacritic * striping. If both flags are specified, the input utf8 StringData is returned * directly without any processing or copying. * * If processing is performed, the returned StringData will be placed in * buffer. buffer’s contents (if any) will be replaced. Since we may return * the input unmodified the returned StringData’s lifetime is the shorter of * the input utf8 and the next modification to buffer. The input utf8 must not * point into buffer. */ static StringData caseFoldAndStripDiacritics ( StackBufBuilder * buffer , StringData utf8 , SubstrMatchOptions options , CaseFoldMode mode );
(Comments re-wrapped.)
7.3. VMware
VMware has an internal string builder implementation that avoids
due, in part, to
's zero-writing behavior. This is similar in spirit to
the MongoDB example above.
7.4. Discussion on std-proposals
This topic was discussed in 2013 on std-proposals in a thread titled "Add
basic_string::resize_uninitialized (or a similar mechanism)":
https://groups.google.com/a/isocpp.org/forum/#!topic/std-proposals/XIO4KbBTxl0
7.5. DynamicBuffer
The [N4734] (the Networking TS) has dynamic buffer types.
7.6. P1020R1
See also [P1020R1] "Smart pointer creation functions for default initialization". Adopted in San Diego.
7.7. Boost
Boost provides a related optimization for vector-like containers, introduced in [SVN r85964] by Ion Gaztañaga.
E.g.: boost/container/vector.hpp:
//! <b>Effects</b>: Constructs a vector that will use a copy of allocator a //! and inserts n default initialized values. //! //! <b>Throws</b>: If allocator_type’s allocation //! throws or T’s default initialization throws. //! //! <b>Complexity</b>: Linear to n. //! //! <b>Note</b>: Non-standard extension vector ( size_type n , default_init_t ); vector ( size_type n , default_init_t , const allocator_type & a ) ... void resize ( size_type new_size , default_init_t ); ...
These optimizations are also supported in Boost Container’s
,
,
,
, and
.
7.8. Thrust
The Thrust library has "a RAII-type
which has a vector-like interface and a constructor with a tag
parameter that indicates its elements should not be initialized." -
[Bryce Adelstein Lelbach].
E.g. thrust/thrust/detail/temporary_array.inl:
template < typename T , typename TemporaryArray , typename Size > __host__ __device__ typename thrust :: detail :: disable_if < avoid_initialization < T >:: value >:: type construct_values ( TemporaryArray & a , Size n ) { a . default_construct_n ( a . begin (), n ); } // end construct_values()
8. Wording
Wording is relative to [N4878].
Motivation for some of these edits can be found in § 5.2 Allocator Interaction.
8.1. [basic.string]
In [basic.string], in the synopsis, add
:
... // 21.3.3.5, capacity constexpr size_type size() const noexcept; constexpr size_type length() const noexcept; constexpr size_type max_size() const noexcept; constexpr void resize(size_type n, charT c); constexpr void resize(size_type n); template<typename Operation> constexpr void resize_and_overwrite(size_type n, Operation op); constexpr size_type capacity() const noexcept; constexpr void reserve(size_type res_arg); constexpr void shrink_to_fit(); constexpr void clear() noexcept; [[nodiscard]] constexpr bool empty() const noexcept;
In [string.require] clarify that
is an
allocator-aware container, and then add an exception for
and
.
In every specialization
, the type
basic_string < charT , traits , Allocator > shall name the same type as
allocator_traits < Allocator >:: value_type
charT . Every object of type, and the typeuses an object of type
basic_string < charT , traits , Allocator > to allocate and free storage for the contained
Allocator objects as needed. The Allocator object used is obtained as described in [container.requirements.general]. In every specialization
charT
basic_string < charT , traits , Allocator > shall satisfy the character traits requirements ([char.traits]). [Note: The program is ill-formed if
traits is not the same type as
traits :: char_type . — end note]
charT is an allocator-aware container as described in [container.requirements.general], except that it is implementation-defined whether
basic_string and
allocator_traits :: construct are used to construct and destroy the contained
allocator_traits :: destroy objects.
charT References, pointers, and iterators referring to the elements of a
sequence may be invalidated by the following uses of that
basic_string object:
basic_string
...
In [string.capacity]:
constexpr void resize(size_type n, charT c);
- Effects: Alters the value of
as follows:
* this
- — If
, erases the last
n <= size () elements.
size () - n - — If
, appends
n > size () copies of
n - size () .
c constexpr void resize(size_type n);
- Effects: Equivalent to
.
resize ( n , charT ()) template<typename Operation> constexpr void resize_and_overwrite(size_type n, Operation op);
Effects: Alters the value of
as follows:
* this Invokes
- — If
, erases the last
n <= size () elements.
size () - n - — If
, appends
n > size () default-initialized elements.
n - size () .
erase ( begin () + op ( data (), n ), end ()) Remarks:
If
throws the behavior is undefined.
op Let
before the call to
o = size () .
resize_and_overwrite Let
.
m = op ( data (), n )
otherwise the behavior is undefined.
m <= n If
,
m > o shall replace ([expr.ass]) the values stored in the character array
op . Until replaced, the values may be indeterminate [basic.indet] [Note:
[ data () + o , data () + m ) may not be
* ( data () + o ) . - end note]
CharT ()
may write to
op . Any value written will be replaced with CharT() after
data () + n + 1 returns. [Note: This facilitiates interoperation with functions that write a trailing null. - end note ]
op When
is called
op is in an unspecified state.
* this shall not bind or in any way access
op .
* this
shall not allow its first argument to be accessible after it returns.
op constexpr size_type capacity() const noexcept;...
8.2. [container.requirements.general]
In [container.requirements.general] clarify the ambiguous "components affected by this subclause" terminology in p3. Just say "allocator-aware containers".
- All of the containers defined in this Clause except
are allocator-aware containers.
array For the components affected by this subclause that declare an allocator_typeObjectsobjectsstored inthese componentsallocator-aware containers, unless otherwise specified, shall be constructed using the functionand destroyed using the function
allocator_traits < allocator_type >:: rebind_traits < U >:: construct (20.10.9.3), where
allocator_traits < allocator_type >:: rebind_traits < U >:: destroy is either
U or an internal type used by the container. ...
allocator_type :: value_type
We can then simplify p15:
All of the containers defined in this Clause and in 21.3.3 exceptAllocator-aware containers meet the additional requirements described in Table 78.meet the additional requirements of an allocator-aware container, as
array
9. History
9.1. R6 → R7
-
Rebased on [N4878].
-
Fixed CompressWrapper example.
-
Allowed writing one past the end.
9.2. R5 → R6
LEWG reviewed R5 via [LEWGNovTelecon].
POLL:should be able to work in constexpr context.
resize_default_init No objection to unanimous consent. Attendance: 23
POLL: We prefer UB over throwing an exception if the operation reports it wrote more bytes than it has access to.No objection to unanimous consent. Attendance: 21
POLL: We want to renamefor consistency, in a mailing list review.
resize_default_init
SF F N A SA 3 12 0 1 0
Modifications since R5:
-
Indicated poll outcomes for UB when
reports an invalid value.op -
Renamed to
. This is based on naming discussions forresize_and_overwrite
being ultimately renamed tostd :: make_unique_default_init
, but differs slightly because the memory accessible has been overwritten. Overwriting the buffer is an integral part of the operation performed.std :: make_unique_for_overwrite -
Applied
constexpr
9.3. R4 → R5
A draft of this revision was presented to LEWG at the [Cologne] meeting.
Forward to LWG for C++23
SF F N A SA 10 5 1 0 0
-
Rebased on [N4830].
-
Proposed UB instead of
.throwing std :: range_error
9.4. R3 → R4
-
Applied feedback from the [KOA] meeting.
-
Moved to using a callback routine, rather than leaving invariants weakened from the time
returned until values in the buffer were assigned.resize_default_init
9.5. R2 → R3
-
Applied Jonathan Wakely’s editorial comments on § 8 Wording.
-
Rebased on [N4791].
-
Editorial changes to § 1 Motivation and § 6.1 Tag Type to support our design choice.
-
Added the reference to [LIBC++] in § 3 Implementation.
9.6. R1 → R2
Applied feedback from [SAN] Meeting reviews.
-
Reverted design to "option A" proposed in [P1072R0].
-
Switched from
toresize_uninitialized
.resize_default_init -
Added discussion of alternatives considered.
-
Specified allocator interaction.
-
Added wording.
9.7. R0 → R1
Applied feedback from LEWG [post-Rapperswil] Email Review:
-
Shifted to a new vocabulary types:
/storage_buffer storage_node -
Added presentation of
as a new container typestorage_buffer -
Added presentation of
as a node-like typestorage_node
-
-
Added discussion of design and implementation considerations.
-
Most of [P1010R1] Was merged into this paper.
10. Acknowledgments
Thanks go to Arthur O’Dwyer for help with wording and proof
reading, to Jonathan Wakely for hunting down the language that
makes
allocator-aware, and to Glen Fernandes, Corentin Jabot, Billy O’Neal, and Mathias Stearn for
design discussion. Special thanks to Eric Fiselier for providing
the implmentation.