Is std::string's lack of a non-const .data() member function an oversight or an intentional design based on pre-C++11 std::string semantics? In either case, this lack of functionality tempts developers to use unsafe alternatives in several legitimate scenarios. This paper argues for the addition of a non-const .data() member function for std::string to improve uniformity in the standard library and to help C++ developers write correct code.
P0272R1 added a note for Annex C.4 regarding the implications of this change and fixed minor formatting issues.
This paper brings to discussion an issue that was originally brought forward in LWG issue 2391 by Michael Bradshaw. It was moved to a LEWG issue in 2015 and hasn't been looked at since. This is being elevated to a paper in the hopes that it will finally be addressed.
C libraries occasionally include routines that have char * parameters. One example is the lpCommandLine parameter of the CreateProcess function in the Windows API. Because the data() member of std::string is const, it cannot be used to make std::string objects work with the lpCommandLine parameter. Developers are tempted to use .front() instead, as in the following example.
std::string programName;
// ...
if( CreateProcess( NULL, &programName.front(), /* etc. */ ) ) {
    // etc.
} else {
    // handle error
}
Note that when programName is empty, the programName.front() expression causes undefined behavior. A temporary empty C-string fixes the bug.
std::string programName;
// ...
char emptyString[] = {'\0'};
if( CreateProcess( NULL, programName.empty() ? emptyString : &programName.front(), /* etc. */ ) ) {
    // etc.
} else {
    // handle error
}
If there were a non-const .data() member, as there is with std::vector, the correct code would be straightforward.
std::string programName;
// ...
if( CreateProcess( NULL, programName.data(), /* etc. */ ) ) {
    // etc.
} else {
    // handle error
}
A non-const .data() std::string member function is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters. This is common in older code and in code that needs to be portable to older C compilers.
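The following is a minimal sketch of that scenario, assuming a C++17 or later compiler where the non-const .data() overload is available. The routine legacy_upcase and the wrapper upcased are hypothetical names invented for illustration; legacy_upcase stands in for any C-style function whose C-string parameter lacks const qualification.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for a legacy C-style routine whose parameter is
// char* rather than const char*. It upper-cases the NUL-terminated buffer
// in place and never writes to the terminator itself.
void legacy_upcase(char* s) {
    assert(s != nullptr);
    for (; *s != '\0'; ++s) {
        *s = static_cast<char>(std::toupper(static_cast<unsigned char>(*s)));
    }
}

// Hands the string's own buffer to the routine. This relies on the
// non-const data() overload (assumption: C++17 or later).
std::string upcased(std::string text) {
    legacy_upcase(text.data());  // well-defined even when text is empty
    return text;
}
```

No special case is needed for an empty string, because .data() still returns a pointer to a valid terminator.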
The wording here was taken directly from LWG issue 2391.
Change class template basic_string synopsis, [basic.string], as indicated:
namespace std {
  template<class charT, class traits = char_traits<charT>,
           class Allocator = allocator<charT> >
  class basic_string {
  public:
    […]
    // 21.4.7, string operations:
    const charT* c_str() const noexcept;
    const charT* data() const noexcept;
    charT* data() noexcept;
    allocator_type get_allocator() const noexcept;
    […]
  };
}
Add the following sequence of paragraphs following [string.accessors] p3, as indicated:
charT* data() noexcept;
Returns: A pointer p such that p + i == &operator[](i) for each i in [0,size()].
Complexity: Constant time.
Requires: The program shall not alter the value stored at p + size().
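As an illustration only (not part of the proposed wording), the Returns and Requires clauses above can be exercised with a short sketch. It assumes a C++17 or later compiler that provides the non-const overload; the helper names data_matches_indexing and fill_with are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true when p + i == &s[i] holds for every i in [0, s.size()],
// where p is the pointer returned by the non-const data().
bool data_matches_indexing(std::string& s) {
    char* p = s.data();
    for (std::size_t i = 0; i <= s.size(); ++i) {
        if (p + i != &s[i]) {
            return false;
        }
    }
    return true;
}

// Writes through the pointer, touching only [p, p + size()).
// The value at p + size() is read but never altered, as the
// Requires clause demands.
void fill_with(std::string& s, char c) {
    char* p = s.data();
    for (std::size_t i = 0; i < s.size(); ++i) {
        p[i] = c;
    }
    assert(p[s.size()] == '\0');
}
```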
Add the following section after Annex C.4.3:
Change: Non-const .data() member added.
Rationale: The lack of a non-const .data() differed from the similar member of std::vector. This change regularizes behavior for this International Standard.
Effect on original feature: Overloaded functions which have differing code paths for char* and const char* arguments will execute differently when called with a non-const string's .data() member in this International Standard.
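The effect described above can be sketched as follows, assuming a C++17 or later compiler. The overload set which and the two wrapper functions are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical overload set with a different code path per pointer type.
inline const char* which(char*) { return "char*"; }
inline const char* which(const char*) { return "const char*"; }

// With the non-const data() overload (C++17 or later), a non-const string
// selects the char* overload. Before the change, the same call selected
// the const char* overload.
std::string selected_for_mutable() {
    std::string s = "abc";
    return which(s.data());
}

// A const string is unaffected: data() still returns const char*.
std::string selected_for_const() {
    const std::string s = "abc";
    return which(s.data());
}

// Small usage check of the behavior described in the comments above.
void demo() {
    assert(selected_for_mutable() == "char*");
    assert(selected_for_const() == "const char*");
}
```

Code that needs the earlier behavior for a non-const string can call .c_str(), which continues to return const char*.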