Document #: | P2585R0 |
Date: | 2022-05-14 |
Project: | Programming Language C++ |
Audience: | LEWG |
Reply-to: | Barry Revzin <barry.revzin@gmail.com> |
[P2286R8] adds support for formatting any range whose underlying type is formattable. Additionally, it adds support for different kinds of formatting that users can opt into, while also providing a default choice for associative containers that is more suited to what those containers represent.
For example, simply using "{}" as the specifier, we get the following outputs:

Expression | Output |
---|---|
std::vector<std::pair<int, int>>{{1, 2}, {3, 4}} | [(1, 2), (3, 4)] |
std::set<std::pair<int, int>>{{1, 2}, {3, 4}} | {(1, 2), (3, 4)} |
std::map<int, int>{{1, 2}, {3, 4}} | {1: 2, 3: 4} |
In each case, we have a range over a pair of ints, but we have three different outputs - as appropriate for the different kinds of containers.
However, this distinction is a result of [P2286R8] explicitly providing formatters for all the standard library map and set containers, and applying those changes to them. This is something that users can do for their own containers as well, but which also means that it is something users have to do - if this is the behavior they want. For instance, the containers in Boost or Abseil just format like normal ranges:
Expression | Output |
---|---|
boost::container::flat_set<int>{1, 2, 3} | [1, 2, 3] |
absl::flat_hash_map<int, int>{{1, 2}, {3, 4}} | [(1, 2), (3, 4)] |
This output isn’t wrong, per se. It’s just that it’s not ideal. And, sure, Abseil could certainly add the code necessary to make this happen. Which, at a bare minimum, would be:
    template <class K, class V, class... Rest>
        requires std::formattable<const K, char> && std::formattable<const V, char>
    struct std::formatter<absl::flat_hash_map<K, V, Rest...>, char>
        : std::range_formatter<std::pair<const K, V>, char>
    {
        constexpr formatter() {
            this->set_brackets("{", "}");
            this->underlying().set_brackets({}, {});
            this->underlying().set_separator(": ");
        }
    };
Now, this isn’t a lot of code, nor is it especially complex. But it is writing code to do what seems like it should be the default behavior.
And, indeed, this is what {fmt} does. By default, formatting a map (whether it’s a std::map or an absl::flat_hash_map) does give you a string like {1: 2, 3: 4} rather than [(1, 2), (3, 4)]. {fmt} determines whether a type is map-like simply by checking if it has a member type named mapped_type. Similarly, {fmt} treats a type as set-like if it has a member type named key_type (and it is not map-like).
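To make the shape of that check concrete, here is a minimal sketch of the two tests using the C++17 detection idiom. The trait names (has_mapped_type, has_key_type, is_map_like, is_set_like) are hypothetical and chosen for illustration; this is not {fmt}'s actual code, only the rule as described above.

```cpp
#include <map>
#include <set>
#include <type_traits>
#include <unordered_map>
#include <vector>

// Detects a nested member type named mapped_type (hypothetical trait name).
template <class T, class = void>
struct has_mapped_type : std::false_type {};
template <class T>
struct has_mapped_type<T, std::void_t<typename T::mapped_type>> : std::true_type {};

// Detects a nested member type named key_type (hypothetical trait name).
template <class T, class = void>
struct has_key_type : std::false_type {};
template <class T>
struct has_key_type<T, std::void_t<typename T::key_type>> : std::true_type {};

// Map-like: the type exposes mapped_type.
template <class T>
inline constexpr bool is_map_like = has_mapped_type<T>::value;

// Set-like: the type exposes key_type and is not map-like.
template <class T>
inline constexpr bool is_set_like = has_key_type<T>::value && !is_map_like<T>;

static_assert(is_map_like<std::map<int, int>>);
static_assert(is_set_like<std::set<int>>);
static_assert(!is_map_like<std::vector<int>> && !is_set_like<std::vector<int>>);
```

Because the check only looks at member type names, any third-party container that follows the standard associative container conventions is picked up without opting in.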
We can do something similar in the standard library.
There are several kinds of range formatting styles:
- map: {k1: v1, k2: v2}
- set: {v1, v2}
- sequence: [v1, v2]
- string: abc
- debug_string: "\naan"
We can introduce an enum class for all of these kinds:
Note that range_format_kind::disabled is 0 so that converting to bool checks if it’s enabled.
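The enumeration referred to here can be sketched from the synopsis in the proposed wording later in this paper. Since it is a scoped enumeration, the conversion to bool has to be explicit; the is_enabled helper below is a hypothetical name used only to show that conversion.

```cpp
// Enumerators in the order given by the proposed synopsis.
enum class range_format_kind {
    disabled,      // value 0: range formatting is turned off
    map,
    set,
    sequence,
    string,
    debug_string
};

// Hypothetical helper: a scoped enumeration does not convert to bool
// implicitly, so the "is it enabled" check uses an explicit cast.
constexpr bool is_enabled(range_format_kind k) {
    return static_cast<bool>(k);
}

static_assert(!is_enabled(range_format_kind::disabled));
static_assert(is_enabled(range_format_kind::sequence));
```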
And a variable template (allowed to be specialized) to determine the right kind:
    template <class R>
    inline constexpr auto format_kind = unspecified;

    template <input_range R>
    inline constexpr range_format_kind format_kind<R> = []{
        if constexpr (requires { typename R::key_type; typename R::mapped_type; }
                      and is-2-tuple<range_reference_t<R>>) {
            return range_format_kind::map;
        } else if constexpr (requires { typename R::key_type; }) {
            return range_format_kind::set;
        } else {
            return range_format_kind::sequence;
        }
    }();
format_kind<U> for non-range types U is unspecified to allow for future evolution in case we want to do this for non-range types. For now, it doesn’t cost anything.
As far as heuristics go, this is a pretty safe bet. While we’ve previously tried to do a heuristic for view and then had to walk it back a lot, here the stakes are much lower. Worst-case, we’re just talking about getting the default formatting wrong - for some definition of wrong - though we’d still format all the elements. And there probably aren’t too many range types floating around that define key_type and mapped_type but aren’t maps? Note that we’re being especially safe here by checking not just R::key_type and R::mapped_type but also that the reference type is a two-tuple (by which I mean that it’s either a specialization of std::pair or a specialization of std::tuple whose size is 2). This added check ensures that we don’t accidentally end up with an ill-formed default whereas the previous default could have worked fine (on the off chance that we have some type with a key_type and a mapped_type yet is somehow a range of int).
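The effect of the extra two-tuple check can be seen in a small sketch. It is written against C++17 so that it does not depend on library support for range formatting; guess_kind, is_two_tuple, element_t and odd_range are hypothetical names standing in for format_kind, is-2-tuple, remove_cvref_t<range_reference_t<R>> and the unusual type mentioned above.

```cpp
#include <iterator>
#include <map>
#include <set>
#include <tuple>
#include <type_traits>
#include <utility>
#include <vector>

enum class range_format_kind { disabled, map, set, sequence, string, debug_string };

// Stand-in for is-2-tuple: a pair, or a tuple with exactly two elements.
template <class T> struct is_two_tuple : std::false_type {};
template <class A, class B> struct is_two_tuple<std::pair<A, B>> : std::true_type {};
template <class A, class B> struct is_two_tuple<std::tuple<A, B>> : std::true_type {};

template <class T, class = void> struct has_key_type : std::false_type {};
template <class T>
struct has_key_type<T, std::void_t<typename T::key_type>> : std::true_type {};

template <class T, class = void> struct has_mapped_type : std::false_type {};
template <class T>
struct has_mapped_type<T, std::void_t<typename T::mapped_type>> : std::true_type {};

// The range's element type with references and cv-qualifiers removed.
template <class R>
using element_t = std::remove_cv_t<std::remove_reference_t<
    decltype(*std::begin(std::declval<R&>()))>>;

// Same three-way decision as the format_kind default shown earlier.
template <class R>
constexpr range_format_kind guess_kind() {
    if constexpr (has_key_type<R>::value && has_mapped_type<R>::value
                  && is_two_tuple<element_t<R>>::value) {
        return range_format_kind::map;
    } else if constexpr (has_key_type<R>::value) {
        return range_format_kind::set;
    } else {
        return range_format_kind::sequence;
    }
}

// Hypothetical oddity: map-like member types, but a range of int.
struct odd_range : std::vector<int> {
    using key_type = int;
    using mapped_type = int;
};

static_assert(guess_kind<std::map<int, int>>() == range_format_kind::map);
static_assert(guess_kind<odd_range>() == range_format_kind::set);
```

In this sketch odd_range falls through to the set branch, so its elements can still be formatted; without the two-tuple check it would be treated as a map and the key/value formatting would not compile for int elements.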
Note that we’re deliberately not making any sort of guess about string. Not looking at convertibility to string or string_view or char const* or anything of the sort. If users want some container to be formatted as a string, they can do so by explicitly specializing format_kind - which isn’t that much effort at all. The problem is that coming up with a heuristic would complicate format_kind, since we can’t just check convertibility to string_view, we have to specifically check convertibility to basic_string_view<charT>, and now suddenly we have this extra charT parameter. Is it worth it? I’m not sure it is.
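For a sense of how small that opt-in is, here is a sketch of the explicit specialization. The format_kind below is a simplified stand-in that defaults every type to sequence (the proposed default applies the heuristic instead), and small_text is a hypothetical user-defined container.

```cpp
#include <vector>

enum class range_format_kind { disabled, map, set, sequence, string, debug_string };

// Simplified stand-in for the proposed variable template: everything
// defaults to sequence here, which is an assumption made for brevity.
template <class R>
inline constexpr range_format_kind format_kind = range_format_kind::sequence;

// Hypothetical user-defined container of characters.
struct small_text {
    std::vector<char> chars;
    auto begin() const { return chars.begin(); }
    auto end() const { return chars.end(); }
};

// The whole opt-in: one explicit specialization of the variable template.
template <>
inline constexpr range_format_kind format_kind<small_text> = range_format_kind::string;

static_assert(format_kind<small_text> == range_format_kind::string);
static_assert(format_kind<std::vector<char>> == range_format_kind::sequence);
```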
If we then change the way the standard library does formatting based on format_kind<R>, then with no additional code changes necessary, all the Boost and Abseil containers would start being formatted in the same kind of way that the standard library ones are today, as would many other user-defined associative containers.
Moreover, for user-defined containers, specializing a variable template is quite a bit less work than almost anything else you can do. So even for containers where the heuristic is wrong, this design offers an easier way forward to get the desired behavior.
While this design as-is has not been implemented, {fmt}’s approach is pretty comparable.
This is in terms of [P2286R8]. The wording here chooses to define formatter<R, charT> as inheriting from an exposition-only default-range-formatter<format_kind<R>, R, charT> rather than a bunch of specializations for each format_kind<R> case because this makes it much easier for future extensions, where users can create more constrained specializations of formatter for enabled ranges.
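The inheritance structure can be illustrated with a toy model that produces strings directly. Everything here is hypothetical and simplified: kind_of stands in for format_kind and is hard-wired for std::map and std::set, default_toy_formatter stands in for default-range-formatter, and toy_formatter stands in for formatter<R, charT>. It is a sketch of the dispatch pattern only, not of the proposed wording.

```cpp
#include <map>
#include <ostream>
#include <set>
#include <sstream>
#include <string>
#include <string_view>
#include <type_traits>
#include <vector>

enum class kind { map, set, sequence };

// Toy stand-in for format_kind<R>, hard-wired per container for brevity.
template <class R>
inline constexpr kind kind_of = kind::sequence;
template <class K, class V>
inline constexpr kind kind_of<std::map<K, V>> = kind::map;
template <class K>
inline constexpr kind kind_of<std::set<K>> = kind::set;

// Shared helper: writes the elements between brackets, separated by ", ".
template <class R, class WriteElement>
std::string join(const R& r, std::string_view open, std::string_view close,
                 WriteElement write_element) {
    std::ostringstream out;
    out << open;
    bool first = true;
    for (const auto& element : r) {
        if (!first) out << ", ";
        first = false;
        write_element(out, element);
    }
    out << close;
    return out.str();
}

// One partial specialization per kind; the primary is left undefined.
template <kind K, class R>
struct default_toy_formatter;

template <class R>
struct default_toy_formatter<kind::sequence, R> {
    static constexpr std::string_view open = "[", close = "]";
    std::string format(const R& r) const {
        return join(r, open, close,
                    [](std::ostream& out, const auto& e) { out << e; });
    }
};

template <class R>
struct default_toy_formatter<kind::set, R> {
    static constexpr std::string_view open = "{", close = "}";
    std::string format(const R& r) const {
        return join(r, open, close,
                    [](std::ostream& out, const auto& e) { out << e; });
    }
};

template <class R>
struct default_toy_formatter<kind::map, R> {
    static constexpr std::string_view open = "{", close = "}";
    std::string format(const R& r) const {
        return join(r, open, close, [](std::ostream& out, const auto& kv) {
            out << kv.first << ": " << kv.second;
        });
    }
};

// The user-facing template only selects a base class by kind.
template <class R>
struct toy_formatter : default_toy_formatter<kind_of<R>, R> {};
```

With this in place, toy_formatter<std::map<int, int>>{}.format(m) for a map holding {1, 2} and {3, 4} yields {1: 2, 3: 4}. The user-facing template stays a single declaration, which leaves room for more constrained specializations of it later.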
Change 22.14.1 [format.syn]:
    namespace std {
      // ...
      // [format.formatter], formatter
      template<class T, class charT = char> struct formatter;

      // [format.range.formatter], class template range_formatter
    + enum class range_format_kind {
    +   disabled,
    +   map,
    +   set,
    +   sequence,
    +   string,
    +   debug_string
    + };
    +
    + template<class R>
    +   inline constexpr unspecified format_kind = unspecified;
    +
    + template<ranges::input_range R>
    +   inline constexpr range_format_kind format_kind<R> = see below;

      template<class T, class charT = char>
          requires same_as<remove_cvref_t<T>, T> && formattable<T, charT>
        class range_formatter;

    + template<range_format_kind K, ranges::input_range R, class charT>
    +   struct default-range-formatter; // exposition only

      template<ranges::input_range R, class charT>
    -     requires (!same_as<remove_cvref_t<ranges::range_reference_t<R>>, R>)
    +     requires (format_kind<R> != range_format_kind::disabled)
                && formattable<ranges::range_reference_t<R>, charT>
    -   struct formatter<R, charT>;
    +   struct formatter<R, charT>
    +     : default-range-formatter<format_kind<R>, R, charT>
    +   { };

      // ...
    }
Add to [format.range]:
a For a type R, format_kind<R> is defined as follows:

- (a.1) If same_as<remove_cvref_t<ranges::range_reference_t<R>>, R> is true, format_kind<R> is range_format_kind::disabled. [Note: This prevents constraint recursion for ranges whose reference type is the same range type. For example, std::filesystem::path is a range of std::filesystem::path. -end note]
- (a.2) Otherwise, if the qualified-id R::key_type is valid and denotes a type:
  - (a.2.1) If the qualified-id R::mapped_type is valid and denotes a type, let U be remove_cvref_t<ranges::range_reference_t<R>>. If either U is a specialization of pair or U is a specialization of tuple and tuple_size_v<U> == 2, format_kind<R> is range_format_kind::map.
  - (a.2.2) Otherwise, format_kind<R> is range_format_kind::set.
- (a.3) Otherwise, format_kind<R> is range_format_kind::sequence.

b Remarks: Pursuant to [namespace.std], users may specialize format_kind for cv-unqualified program-defined types. Such specializations shall be usable in constant expressions ([expr.const]) and have type const range_format_kind.
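Rule (a.1) can be checked in isolation with a short sketch. It uses C++17 stand-ins rather than the ranges library, and is_self_referential_range and element_t are hypothetical names introduced for illustration.

```cpp
#include <filesystem>
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

// C++17 stand-in for remove_cvref_t<ranges::range_reference_t<R>>.
template <class R>
using element_t = std::remove_cv_t<std::remove_reference_t<
    decltype(*std::begin(std::declval<R&>()))>>;

// Hypothetical trait for rule (a.1): the element type is the range type itself.
template <class R>
inline constexpr bool is_self_referential_range = std::is_same_v<element_t<R>, R>;

// Iterating a path yields paths, so it would get range_format_kind::disabled.
static_assert(is_self_referential_range<std::filesystem::path>);
// An ordinary container is unaffected by the rule.
static_assert(!is_self_referential_range<std::vector<int>>);
```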
And later:
    namespace std {
    - template<ranges::input_range R, class charT>
    -     requires (!same_as<remove_cvref_t<ranges::range_reference_t<R>>, R>)
    -           && formattable<ranges::range_reference_t<R>, charT>
    - struct formatter<R, charT> {
    + template <ranges::input_range R, class charT>
    + struct default-range-formatter<range_format_kind::sequence, R, charT> {
      private:
        using maybe-const-r = fmt-maybe-const<R, charT>;
        range_formatter<remove_cvref_t<ranges::range_reference_t<maybe-const-r>>, charT> underlying_; // exposition only

      public:
        constexpr void set_separator(basic_string_view<charT> sep);
        constexpr void set_brackets(basic_string_view<charT> opening, basic_string_view<charT> closing);

        template <class ParseContext>
          constexpr typename ParseContext::iterator
            parse(ParseContext& ctx);

        template <class FormatContext>
          typename FormatContext::iterator
            format(maybe-const-r& elems, FormatContext& ctx) const;
      };
    }
13 [Note: The (!same_as<remove_cvref_t<ranges::range_reference_t<R>>, R>) constraint prevents constraint recursion for ranges whose reference type is the same range type. For example, std::filesystem::path is a range of std::filesystem::path. -end note]

14 Effects: Equivalent to underlying_.set_separator(sep);
Change the wording for associative containers as follows:
1 For each of map, multimap, unordered_map, and unordered_multimap, the library provides the following formatter specialization where map-type is the name of the template:

    namespace std {
    - template <class charT, class Key, formattable<charT> T, class... U>
    -   requires formattable<const Key, charT>
    - struct formatter<map-type<Key, T, U...>, charT>
    + template <ranges::input_range R, class charT>
    + struct default-range-formatter<range_format_kind::map, R, charT>
      {
      private:
    -   using maybe-const-map = fmt-maybe-const<map-type<Key, T, U...>, charT>; // exposition only
    -   range_formatter<remove_cvref_t<ranges::range_reference_t<maybe-const-map>>, charT> underlying_; // exposition only
    +   using maybe-const-map = fmt-maybe-const<R, charT>; // exposition only
    +   using element-type = remove_cvref_t<ranges::range_reference_t<maybe-const-map>>; // exposition only
    +   range_formatter<element-type, charT> underlying_; // exposition only

      public:
        constexpr formatter();

        template <class ParseContext>
          constexpr typename ParseContext::iterator
            parse(ParseContext& ctx);

        template <class FormatContext>
          typename FormatContext::iterator
            format(maybe-const-map& r, FormatContext& ctx) const;
      };
    }
2 Mandates: Either:
3 Effects: Equivalent to:

    underlying_.set_brackets(STATICALLY-WIDEN<charT>("{"), STATICALLY-WIDEN<charT>("}"));
    underlying_.underlying().set_brackets({}, {});
    underlying_.underlying().set_separator(STATICALLY-WIDEN<charT>(": "));

4 Effects: Equivalent to return underlying_.parse(ctx);

    template <class FormatContext>
      typename FormatContext::iterator
        format(maybe-const-map& r, FormatContext& ctx) const;

5 Effects: Equivalent to return underlying_.format(r, ctx);
6 For each of set, multiset, unordered_set, and unordered_multiset, the library provides the following formatter specialization where set-type is the name of the template:

    namespace std {
    - template <class charT, class Key, class... U>
    -   requires formattable<const Key, charT>
    - struct formatter<set-type<Key, U...>, charT>
    + template <ranges::input_range R, class charT>
    + struct default-range-formatter<range_format_kind::set, R, charT>
      {
      private:
    -   range_formatter<Key, charT> underlying_; // exposition only
    +   using maybe-const-set = fmt-maybe-const<R, charT>; // exposition only
    +   range_formatter<remove_cvref_t<ranges::range_reference_t<maybe-const-set>>, charT> underlying_; // exposition only

      public:
        constexpr formatter();

        template <class ParseContext>
          constexpr typename ParseContext::iterator
            parse(ParseContext& ctx);

        template <class FormatContext>
          typename FormatContext::iterator
    -       format(const set-type<Key, U...>& r, FormatContext& ctx) const;
    +       format(maybe-const-set& r, FormatContext& ctx) const;
      };
    }
7 Effects: Equivalent to:
8 Effects: Equivalent to return underlying_.parse(ctx);

9 Effects: Equivalent to return underlying_.format(r, ctx);
Also add this partial specialization to handle string types:
    template <range_format_kind K, ranges::input_range R, class charT>
      requires (K == range_format_kind::string || K == range_format_kind::debug_string)
    struct default-range-formatter<K, R, charT>
    {
    private:
      formatter<basic_string<charT>, charT> underlying_; // exposition only

    public:
      template <class ParseContext>
        constexpr typename ParseContext::iterator
          parse(ParseContext& ctx);

      template <class FormatContext>
        typename FormatContext::iterator
          format(see below& r, FormatContext& ctx) const;
    };
1 Mandates: same_as<remove_cvref_t<range_reference_t<R>>, charT> is true.

2 Effects: Equivalent to:

    template <class FormatContext>
      typename FormatContext::iterator
        format(see below& r, FormatContext& ctx) const;

3 The type of r is const R& if input_range<const R> is true and R& otherwise.

4 Effects: Equivalent to return underlying_.format(basic_string<charT>(from_range, r), ctx);
Thanks to Jeff Garland for initially pointing out this distinction, Tomasz Kamiński for suggesting this approach, Victor Zverovich for having more or less already implemented it, and Tim Song for everything else.
[P2286R8] Barry Revzin. 2022-05-13. Formatting Ranges. https://wg21.link/p2286r8