ISO/IEC JTC1 SC22 WG21
N3599
Richard Smith
2013-03-13
N2765 added the ability for users to define their own literal suffixes. Several forms of literal operators are available, with one notable omission: there is no template form of literal operator for character and string literals. N2750 justifies this restriction based on two factors:
raw form of string literal, in which "Hello, " L"Worl\u0044!" is distinguishable from L"Hello, World!", but this interacted badly with phases of translation, and
Neither of these is still true, and we now have evidence that a literal operator template for string literals would be valuable; indeed, in one codebase where literal operators are not yet permitted, this form of literal operator has been requested more frequently than any of the forms which C++11 permits.
With literal operator templates, it is possible to write a type-safe printf facility:
// A tuple of types.
template<typename ...Ts> struct types {
  template<typename T> using push_front = types<T, Ts...>;
  template<template<typename...> class F> using apply = F<Ts...>;
};

// Select a type from a format character.
template<char K> struct format_type_impl;
template<> struct format_type_impl<'d'> { using type = int; };
template<> struct format_type_impl<'f'> { using type = double; };
template<> struct format_type_impl<'s'> { using type = const char *; };
// ...
template<char K> using format_type = typename format_type_impl<K>::type;

// Build a tuple of types from a format string.
template<char ...String>
struct format_types;
template<>
struct format_types<> : types<> {};
template<char Char, char ...String>
struct format_types<Char, String...> : format_types<String...> {};
template<char ...String>
struct format_types<'%', '%', String...> : format_types<String...> {};
template<char Fmt, char ...String>
struct format_types<'%', Fmt, String...> :
  format_types<String...>::template push_front<format_type<Fmt>> {};

// Typed printf-style formatter.
template<typename ...Args> struct formatter {
  int operator()(Args ...a) {
    return std::printf(str, a...);
  }
  const char *str;
};

template<typename CharT, CharT ...String>
typename format_types<String...>::template apply<formatter>
operator""_printf() {
  static_assert(std::is_same<CharT, char>(), "can only use printf on narrow strings");
  static const CharT data[] = { String..., 0 };
  return { data };
}

void log_bad_guess(const char *name, int guess, int actual) {
  "Hello %s, you guessed %d which is too %s\n"_printf(
    name, guess, guess < actual ? "low" : "high");
}
This is not possible with the existing support for string literal operators, because the type of the literal cannot depend on the contents of the string.
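The type computation at the heart of this example does not itself depend on the proposed feature, so it can be exercised today by spelling out the character pack by hand. The following is a minimal sketch under that assumption. The names `type_list`, `fmt_type`, and `fmt_types` are illustrative stand-ins for the `types`, `format_type`, and `format_types` templates above, and only the `%d`, `%f`, and `%s` conversions are handled.

```cpp
#include <type_traits>

// Illustrative type list (hypothetical name; mirrors `types` above).
template<typename ...Ts> struct type_list {
  template<typename T> using push_front = type_list<T, Ts...>;
};

// Map a conversion character to the argument type it expects.
template<char K> struct fmt_type;
template<> struct fmt_type<'d'> { using type = int; };
template<> struct fmt_type<'f'> { using type = double; };
template<> struct fmt_type<'s'> { using type = const char *; };

// Walk the character pack and collect one type per conversion.
template<char ...S> struct fmt_types;
template<> struct fmt_types<> { using type = type_list<>; };

// Ordinary character: skip it.
template<char C, char ...S>
struct fmt_types<C, S...> { using type = typename fmt_types<S...>::type; };

// "%%" is a literal percent sign and consumes no argument.
template<char ...S>
struct fmt_types<'%', '%', S...> { using type = typename fmt_types<S...>::type; };

// "%K" consumes one argument of the type selected by K.
template<char K, char ...S>
struct fmt_types<'%', K, S...> {
  using type = typename fmt_types<S...>::type::
      template push_front<typename fmt_type<K>::type>;
};

// The pack below spells the format string "%d%% %s" by hand.
using deduced = fmt_types<'%', 'd', '%', '%', ' ', '%', 's'>::type;
static_assert(std::is_same<deduced, type_list<int, const char *>>::value,
              "expected (int, const char *)");
```

With the proposed literal operator template, the pack would be supplied by the compiler from the string literal rather than written out character by character.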
By a similar mechanism to the type-safe printf, literal operator templates allow the user to validate that a string literal conforms to a specific syntax or structure during translation.
class SpecialString {
public:
  constexpr static bool IsValidString(const char *str) { /* ... */ }
  explicit SpecialString(const char *str) : str(str) { assert(IsValidString(str)); }
  const char *get() { return str; }
private:
  struct Checked {};
  SpecialString(Checked, const char *str) : str(str) {}
  template<typename CharT, CharT ...> friend SpecialString operator""_special();
  const char *str;
};
template<typename CharT, CharT ...String> SpecialString operator""_special() {
  constexpr static CharT data[] = { String..., 0 };
  static_assert(SpecialString::IsValidString(data), "not a valid string");
  return SpecialString(SpecialString::Checked(), data);
}
Again, this is not possible with the existing support for string literal operators, because the literal's value is not available in constant expressions within the literal operator.
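To illustrate why having the characters as template arguments makes this check possible, the following sketch validates a hand-written character pack during translation. It assumes a deliberately simple rule (every character is a lowercase ASCII letter); the names `all_lower` and `checked_string` are hypothetical and stand in for whatever `IsValidString` would enforce.

```cpp
// Hypothetical validation rule: every character is a lowercase ASCII letter.
template<char ...S> struct all_lower;
template<> struct all_lower<> {
  static constexpr bool value = true;
};
template<char C, char ...S> struct all_lower<C, S...> {
  static constexpr bool value =
      C >= 'a' && C <= 'z' && all_lower<S...>::value;
};

// Instantiating this type with an invalid pack is a compile-time error.
template<char ...S> struct checked_string {
  static_assert(all_lower<S...>::value, "not a valid string");
  static constexpr unsigned length = sizeof...(S);
};

// Accepted: the pack spells "abc".
static_assert(checked_string<'a', 'b', 'c'>::length == 3, "");
// Rejected if uncommented: 'B' violates the rule.
// static_assert(checked_string<'a', 'B'>::length == 2, "");
```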
Some commercial applications desire to obfuscate some of their string literals, so that (for instance) running the Unix strings command on their binary does not reveal potentially-sensitive information, such as features the customer has not paid for, or diagnostic messages which are specific to another customer. With a literal operator template, this is possible without disrupting the flow or readability of the client code.
template<typename CharT> struct encoded_string {
  operator std::basic_string<CharT>() { /* ... decode ... */ }
  // ...
};
namespace {
  template<typename CharT, CharT ...String> struct encode {
    // Cast back to CharT: the XOR promotes to int, which would otherwise narrow.
    static constexpr CharT data[] = { static_cast<CharT>(String ^ 0xa3)..., 0 };
  };
  template<typename CharT, CharT ...String> const CharT encode<CharT, String...>::data[];
  template<typename CharT, CharT ...String> static encoded_string<CharT> operator""_hidden() {
    return encode<CharT, String...>::data;
  }
}
void report_secret_thing() {
  my_ostream << "secret thing happened"_hidden << std::endl;
}
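The decode step is elided above. As a rough sketch of one way the round trip could work, the following encodes a hand-written character pack with the same XOR key and decodes it at run time. The names `xor_encoded` and `decode_hidden` are hypothetical, the pack is spelled out manually because the proposed operator is not assumed to be available, and a single-byte XOR is used purely for illustration rather than as a recommended obfuscation scheme.

```cpp
#include <string>

// Stores the pack XOR-ed with a fixed key; only the encoded bytes
// appear in the object file.
template<char ...S> struct xor_encoded {
  static const char data[sizeof...(S) + 1];
};
template<char ...S>
const char xor_encoded<S...>::data[sizeof...(S) + 1] = {
    static_cast<char>(S ^ 0xa3)..., 0 };

// Recover the original text at run time by applying the same key again.
template<char ...S> std::string decode_hidden() {
  std::string out(xor_encoded<S...>::data, sizeof...(S));
  for (char &c : out)
    c = static_cast<char>(c ^ 0xa3);
  return out;
}
```

For example, `decode_hidden<'h', 'i'>()` yields the string `hi`, while the stored array holds the encoded bytes.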
Access to the contents of a string literal as a template parameter pack allows string data to be deduplicated during translation, which in turn permits value comparisons to be performed rapidly by comparing the addresses of strings. Without literal operator templates, this requires either runtime overhead to perform the interning, or for the programmer to explicitly construct an object to hold the canonical value of a string. These costs can be avoided with a literal operator template:
std::map<std::string, const char*> intern_map;
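template<char ...String> struct register_intern {
  static constexpr char intern[] = { String..., 0 };
  static register_intern register_;
  register_intern() { intern_map[intern] = intern; }
};
// Out-of-class definition, required because intern is odr-used.
template<char ...String> constexpr char register_intern<String...>::intern[];
template<char ...String> register_intern<String...> register_intern<String...>::register_;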
template<typename CharT, CharT ...String> constexpr const char *operator""_intern() {
  static_assert(std::is_same<CharT, char>(), "can only intern narrow strings");
  return (&register_intern<String...>::register_,
          register_intern<String...>::intern);
}
static_assert("foo"_intern == "foo"_intern, "");
Qt defines the macros SIGNAL and SLOT, which encode a method signature in order to allow it to be dynamically invoked:
#define SIGNAL(x) "1" #x
#define SLOT(x) "2" #x
// ...
QObject::connect(sender, SIGNAL(thingHappened(int)),
receiver, SLOT(onThingHappened(int)));
Before the results of the SIGNAL and SLOT macros can be used, they must first be canonicalized (by removing spaces, canonicalizing the location of the const keyword, and so on). With a literal operator template, this canonicalization can be performed during translation.
#define SIGNAL(x) #x ## _qt_signal
#define SLOT(x) #x ## _qt_slot
Add a new form of literal operator template for a cooked string literal:
template<typename CharT, CharT ...String>
This form will be used if a non-template literal operator for the string literal is not available. The first template argument will be the element type of the string, and the remaining arguments are the code units in the string literal (excluding its terminating null character).
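The mapping from literal to template arguments can be illustrated with an ordinary function template that has the same template parameter shape and is called with explicit arguments. This is only a sketch: `code_unit_count` is a hypothetical name, and the calls below write out by hand the arguments that the proposal would have the compiler supply.

```cpp
#include <cstddef>

// Same template parameter shape as the proposed operator, but an ordinary
// function template so that it compiles as standard C++11.
template<typename CharT, CharT ...String>
constexpr std::size_t code_unit_count() {
  return sizeof...(String);
}

// Under the proposal, "abc"_x would be treated as
// operator""_x<char, 'a', 'b', 'c'>(): three code units, no terminating null.
static_assert(code_unit_count<char, 'a', 'b', 'c'>() == 3, "");
// L"hi"_x would pass wchar_t as the element type.
static_assert(code_unit_count<wchar_t, L'h', L'i'>() == 2, "");
// An empty string literal yields an empty pack.
static_assert(code_unit_count<char>() == 0, "");
```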
N2750 expressed a concern that users may wish to use a raw form of string literal. The form proposed herein is a cooked literal operator; no raw form is proposed. If users wish to capture the contents of a string literal as written, a literal operator template can be combined with a raw string literal:
R"(.*\.\(org\|com\|net\))"_regexp
No literal operator template is proposed for character literals. The author does not wish to encourage the use of multi-byte character literals, and for single-byte character literals, the feature would have extremely limited utility. Indeed, no use cases are known for this feature, and any possible cases could be addressed by using a string literal instead of a character literal.
The term of art literal operator template is split into two terms, numeric literal operator template and string literal operator template. The term literal operator template is retained and refers to either form.
Replace "literal operator template" with "numeric literal operator template" in [lex.ext] (2.14.8)/3 and [lex.ext] (2.14.8)/4:
[...] Otherwise, S shall contain a raw literal operator or a numeric literal operator template (13.5.8) but not both. [...] Otherwise (S contains a numeric literal operator template), L is treated as a call of the form [...]
Change in [lex.ext] (2.14.8)/5:
If L is a user-defined-string-literal, let C be the element type of the string literal as determined by its encoding-prefix, let str be the literal without its ud-suffix, and let len be the number of code units in str (i.e., its length excluding the terminating null character). If S contains a literal operator with parameter types const C * and std::size_t, the literal L is treated as a call of the form
operator "" X(str, len)
Otherwise, S shall contain a string literal operator template (13.5.8), and L is treated as a call of the form
operator "" X<C, e's1', e's2', ... e'sk'>()
where e is empty when the encoding-prefix is u8 and is otherwise the encoding-prefix of the string literal, and str contains the sequence of code units s1s2...sk (excluding the terminating null character).
Change in [over.literal] (13.5.8)/5:
A numeric literal operator template is a literal operator template whose template-parameter-list has a single template-parameter that is a non-type template parameter pack (14.5.3) with element type char. A string literal operator template is a literal operator template whose template-parameter-list comprises a type template-parameter C followed by a non-type template parameter pack with element type C. The declaration of a literal operator template shall have an empty parameter-declaration-clause and shall declare either a numeric literal operator template or a string literal operator template.
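For comparison, the numeric form described by this wording is already part of C++11 and can be shown directly. The sketch below declares a numeric literal operator template with the hypothetical suffix `_ndigits` that reports how many characters the literal was spelled with; the string form proposed here would differ only in taking a leading type parameter followed by a pack of that type.

```cpp
// Numeric literal operator template: a single pack of char and an
// empty parameter list.
template<char ...Digits>
constexpr unsigned operator""_ndigits() {
  return sizeof...(Digits);
}

// The pack receives the spelling of the literal, one char per character.
static_assert(1234_ndigits == 4, "'1', '2', '3', '4'");
static_assert(3.14_ndigits == 4, "'3', '.', '1', '4'");

// The proposed string form would be declared as, for example:
//   template<typename CharT, CharT ...String> unsigned operator""_len();
```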