Document number: P1679R3
Date: 2020-06-13
Project: WG21, Library Working Group
Authors: Wim Leflere wim.leflere@gmail.com, Paul Fee paul.f.fee@gmail.com
This paper proposes adding a member function contains to the class templates basic_string and basic_string_view. The function checks whether a string contains a given substring.
Small wording update based on LWG feedback
Rephrased case sensitivity section
Feature test macro added
Small wording update based on LWG feedback
Wording added
Made all functions constexpr, after std::string was made constexpr [1]
Merged content from P1657R0 [2]
Initial version
Checking whether a given string contains a given substring is a common task, yet the standard library has no dedicated function for it.
Standard libraries of many other programming languages include routines for performing such a check, for example:
- Python: the in operator, which calls an object's __contains__(self, item) method [3]
- Java: the contains method [4]
- C#: the Contains method [5]
- Rust: the contains method [6]
And so on.
Some C++ libraries other than the standard library that implement a string type also include such methods. For example, the Qt library has the classes QString [7] and QStringRef (analogous to std::string_view), both of which have contains member functions.
C++ will be easier to teach to people coming from other languages, as they may already be familiar with the contains method of that language's string class.
A range of options exist for substring checking.
std::string haystack = "no place for needles";
Using the C library:
if (strstr(haystack.c_str(), "needle"))
Using the C++ standard library:
if (haystack.find("needle") != std::string::npos)
Using the Boost algorithms library [8]:
if (boost::contains(haystack, "needle"))
The proposed changes would provide a concise, unambiguous method for substring checking in which the intent is clearly expressed.
if (haystack.contains("needle"))
The 'standard' [9] way of checking whether a string contains a substring is to use the find member function.
if (str.find(substr) != std::string::npos)
    std::cout << "found!\n";
But using find requires an extra step of thought when writing the code: you are trying to do something positive (check for containment), but you have to express it negatively (check for inequality).
Reading the code requires an extra step as well. Are we looking for the actual position? Checking that the string contains a substring? Or checking that it doesn't?
A contains member function would make the programmer's intention clearer and the code more readable.
if (str.contains(substr))
    std::cout << "found!\n";
The proposed change would improve the teachability of C++ for beginners, because the contains function better matches the programmer's intention and is a simpler construct to write and remember than find.
The string contains function would complete the three string-checking musketeers, together with the prefix and suffix checks starts_with and ends_with [10].
Python uses the in operator, such as:
if 'needle' in haystack:
Adopting a similar approach in C++ would involve a new keyword. A new keyword risks breaking backwards compatibility with code that already uses in for other purposes, such as variable names. Hence changes to the standard library are preferred.
This proposal adds a member function contains to the class templates basic_string and basic_string_view.
Another option considered was to add a free function contains to namespace std, as in Boost [8].
The drawback of a free function is that the order of its parameters is ambiguous: contains(string, substring) vs contains(substring, string).
A member function offers consistency with other popular languages, such as Java, C# and Rust. It is also consistent with starts_with and ends_with [10].
Containers such as set and map have a contains method. For multiset and multimap containers, this offers a performance boost over the count method, since contains can return on the first match. With set and map, the benefits are API consistency with multiset and multimap, along with clearer expression of intent by returning a bool rather than a count.
std::set<foo> haystack;
if (haystack.count(needle)) { /* found */ }
if (haystack.contains(needle)) { /* clearly found */ }
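The difference between counting and testing membership can be sketched as follows. This is only an illustration written against C++17: has_key is a hypothetical helper, not part of any proposal, and it assumes that the container contains member is equivalent to comparing find against end().

```cpp
#include <cassert>
#include <set>

// Hypothetical helper (an assumption for illustration): a membership test
// expressed through find, which can stop at the first matching element.
template <class Container, class Key>
bool has_key(const Container& c, const Key& key) {
    return c.find(key) != c.end();
}

// Contrasts count and a boolean membership test on a multiset.
void check_container_membership() {
    const std::multiset<int> haystack{1, 3, 3, 3, 7};

    // count reports how many equivalent elements exist...
    assert(haystack.count(3) == 3);

    // ...whereas a membership test only answers yes or no.
    assert(has_key(haystack, 3));
    assert(!has_key(haystack, 4));
}
```

With a multiset, count has to account for every equivalent element, while the membership test needs only one.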
The proposed contains method for substring checks is not directly analogous to the container operation. Rather than searching for a member within the container, a contains operation on a string searches for a substring.
Since the proposed method is called on a string (or string_view) object, the context is clear. The same reuse of a method name can be seen with the find method, which is provided by both containers and string objects. Likewise, the Python in operator performs substring searches on strings and membership searches on containers.
The starts_with, ends_with and find methods are case sensitive.
Likewise, the proposed contains member function is case sensitive.
Some libraries offer case insensitive searches.
For example, Boost string algorithms provides icontains [11].
Qt's QString::contains takes a parameter that defaults to case sensitive, but allows case insensitivity to be specified.
However, case sensitivity is a complex topic for character sets beyond ASCII. Therefore the scope of this proposal is limited to case sensitive substring checks.
The starts_with and ends_with methods each have three overloads. This proposal has the same set of overloads:
// basic_string:
constexpr bool contains(basic_string_view<charT, traits> str) const noexcept;
constexpr bool contains(charT ch) const noexcept;
constexpr bool contains(const charT* str) const;
// basic_string_view:
constexpr bool contains(basic_string_view<charT, traits> str) const noexcept;
constexpr bool contains(charT ch) const noexcept;
constexpr bool contains(const charT* str) const;
An overload accepting a basic_string is not required, since basic_string has a non-explicit conversion operator to basic_string_view.
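How the three overloads divide up the possible argument types can be sketched with free-function stand-ins that compile as C++17. The name sv_contains is hypothetical and chosen only for this illustration; the proposal itself adds member functions, and the sketch is limited to char rather than being templated on charT and traits.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-ins for the three proposed overloads, written as free
// functions over std::string_view (an assumption made for this sketch).
constexpr bool sv_contains(std::string_view s, std::string_view x) noexcept {
    return s.find(x) != std::string_view::npos;
}
constexpr bool sv_contains(std::string_view s, char x) noexcept {
    return s.find(x) != std::string_view::npos;
}
constexpr bool sv_contains(std::string_view s, const char* x) {
    return s.find(x) != std::string_view::npos;
}

// A string literal selects the const charT* overload.
static_assert(sv_contains("haystack", "hay"), "literal argument");
// A character selects the charT overload.
static_assert(sv_contains("haystack", 'y'), "character argument");
static_assert(!sv_contains("haystack", 'z'), "absent character");
// A string_view selects the basic_string_view overload.
static_assert(sv_contains("haystack", std::string_view{"stack"}), "view argument");

// A std::string argument needs no overload of its own: it converts
// implicitly to std::string_view and selects the first overload.
bool contains_string_argument() {
    const std::string needle = "stack";
    return sv_contains("haystack", needle);
}
```

Because string_view::find is already constexpr, the stand-ins can be checked at compile time, which mirrors the constexpr marking on the proposed functions.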
In [basic.string], add:
constexpr bool contains(basic_string_view<charT, traits> x) const noexcept;
constexpr bool contains(charT x) const noexcept;
constexpr bool contains(const charT* x) const;
After [string.ends.with], add:
basic_string::contains [string.contains]
constexpr bool contains(basic_string_view<charT, traits> x) const noexcept;
constexpr bool contains(charT x) const noexcept;
constexpr bool contains(const charT* x) const;
Effects: Equivalent to: return basic_string_view<charT, traits>(data(), size()).contains(x);
In [string.view.template], add:
constexpr bool contains(basic_string_view x) const noexcept;
constexpr bool contains(charT x) const noexcept;
constexpr bool contains(const charT* x) const;
In [string.view.ops], add:
constexpr bool contains(basic_string_view x) const noexcept;
constexpr bool contains(charT x) const noexcept;
constexpr bool contains(const charT* x) const;
Effects: Equivalent to: return find(x) != npos;
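Since the Effects element defines contains purely in terms of find, the behaviour at the edges follows from find. The sketch below spells out a few of those consequences; contains_per_wording is a hypothetical name used only here, and the function is a C++17 stand-in for the proposed member rather than the member itself.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-in that applies the proposed wording literally:
// "Equivalent to: return find(x) != npos;"
constexpr bool contains_per_wording(std::string_view s, std::string_view x) noexcept {
    return s.find(x) != std::string_view::npos;
}

// An empty substring is found at position 0 of any string, including an empty one.
static_assert(contains_per_wording("abc", ""), "empty needle, non-empty haystack");
static_assert(contains_per_wording("", ""), "empty needle, empty haystack");

// A non-empty substring is never found in an empty or shorter string.
static_assert(!contains_per_wording("", "a"), "empty haystack");
static_assert(!contains_per_wording("abc", "abcd"), "needle longer than haystack");

// The comparison is case sensitive, as with find.
static_assert(!contains_per_wording("Needle", "needle"), "case differs");
```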
In [version.syn], add:
#define __cpp_lib_string_contains YYYYMML // also in <string>, <string_view>
Adjust the placeholder value as needed so as to denote this proposal’s date of adoption.
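One way user code might consume the feature test macro is sketched below. The wrapper name has_substring is hypothetical; the sketch assumes only that an implementation defines __cpp_lib_string_contains in <string> and <string_view> when the member function is available, and it falls back to find otherwise.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical wrapper: uses the proposed member function where the library
// advertises it through the feature test macro, and find otherwise.
constexpr bool has_substring(std::string_view haystack, std::string_view needle) noexcept {
#if defined(__cpp_lib_string_contains)
    return haystack.contains(needle);
#else
    return haystack.find(needle) != std::string_view::npos;
#endif
}

// Both branches give the same answers, so these hold on any C++17 or later toolchain.
static_assert(has_substring("no place for needles", "needle"), "substring present");
static_assert(!has_substring("no place for needles", "thread"), "substring absent");
```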