Language-based variants extend and enhance the sum type capabilities offered by C++. They do not replace std::variant, which still has viable use cases, but they do address a number of the drawbacks of a standard library approach. This paper proposes a syntax that extends C++ to make variants a language-level feature.
P0095R2. Split the original paper into individual proposals, keeping this paper only for proposed language variant syntax. Pattern matching for built-in types and opt-in syntax for pattern matching were split into separate papers.
P0095R1. Merged in blog post developments. Added nullptr patterns, @ patterns, and pattern guards. A mechanism for dealing with assignment was also added. Wording as it relates to patterns was added. Made expression and statement inspects use a single keyword.
C++ Language Support for Pattern Matching and Variants blog post. Sketched out several ideas on how a more extensive pattern matching feature would look. Discussed an extension mechanism which would allow any type to act tuple-like or variant-like. lvariant is used instead of enum union based on feedback in Kona.
Kona 2015 Meeting. There was discussion on whether or not a partial pattern-matching solution would be sufficient for incorporation of a language-based variant. While exploration of a partial solution had consensus at 5-12-8-2-0, exploration of a full solution had a strong consensus at 16-6-5-1-0. The question was also asked whether or not we want a language-based variant and the result was 2-19-6-0-1.
P0095R0. The initial version of this paper presented in Kona. It motivated the need for a language-based variant and sketched a basic design for such a feature with the minimal pattern matching required.
Standard library variants have provided type safety and expressiveness to the sum type support in C++. This is a good result that has enabled important functional idioms to be used, but as a tool std::variant is made less applicable by the limitations placed upon it by the language.
In addition, the authors feel that standard library variants are a complicated feature to explain to novice programmers, and are fraught with pitfalls and potential bugs.
This paper presents a design for language-level variants that addresses the shortcomings of a pure standard library variant, in a syntax that the authors feel will be elegant for creating sum type solutions and intuitive for C++ programmers at all levels of experience.
The following snippet illustrates our proposed syntax.
// This lvariant implements a value representing the various commands
// available in a hypothetical shooter game.
lvariant command {
  std::size_t set_score; // Set the score to the specified value
  std::monostate fire_missile; // Fire a missile
  unsigned fire_laser; // Fire a laser with the specified intensity
  double rotate; // Rotate the ship by the specified degrees.
};
// Output a human readable string corresponding to the specified 'cmd' command
// to the specified 'stream'.
std::ostream& operator<<( std::ostream& stream, const command cmd ) {
  return inspect( cmd ) {
    set_score value =>
      stream << "Set the score to " << value << ".\n"
    fire_missile m =>
      stream << "Fire a missile.\n"
    fire_laser intensity =>
      stream << "Fire a laser with " << intensity << " intensity.\n"
    rotate degrees =>
      stream << "Rotate by " << degrees << " degrees.\n"
  };
}
// Create a new command 'cmd' that sets the score to '10'.
command cmd = command::set_score( 10 );
The current library-based variants solve an important need, but they are too complicated for novice users. We describe difficult corner cases, the pitfalls of using types as tags, and the difficulty of writing portable code using a library-based variant. All of these problems suggest the necessity of a language-based variant feature in C++.
Basic struct types that have independently varying member variables¹ have a close relationship to the std::tuple class. Consider the following two types:
// point type as a struct
struct point {
  double x;
  double y;
  double z;
};
// point type as a tuple
using point = std::tuple< double, double, double >;
It is clear that both point types above can represent a 3D mathematical point. The difference between these two types is, essentially, the tag which is used to discriminate between the three elements. In the struct case, an identifier is used (x, y, and z), and in the std::tuple case, an integer index is used (0, 1, and 2).
Although these two point implementations are more-or-less interchangeable, it is not always preferable to use a struct instead of a std::tuple, nor vice versa. In particular, we have the following general recommendations:
- … std::tuple must be used.
- … struct should be used.
- … std::tuple is preferable.
- … struct for improved code clarity.
Some may argue that through use of std::get, which allows one to fetch a member of a tuple by type, one can achieve all the benefits of a struct by using a tuple instead. To take advantage of this feature, one needs to ensure that each element of the tuple has its own distinct type. This can be accomplished through use of a wrapper. For example:
struct x { double value; };
struct y { double value; };
struct z { double value; };
using point = std::tuple< x, y, z >;
Now one could use std::get<x> to fetch the ‘x’ value of the tuple, std::get<y> for ‘y’, and so on.
Should we use this approach everywhere and deprecate the use of struct in any context? In the authors’ opinion we should not. The use of wrapper types is much more complicated to both read and understand than a plain struct. For example, the wrapper types that were introduced, such as the ‘x’ type, make little sense outside of their corresponding tuples, yet they are peers to it in scope. Also, the heavy syntax makes it difficult to understand exactly what is intended by this code.
What does all this have to do with lvariants? The lvariant is to std::variant as struct is to std::tuple. A variant type that represents a distance in an x direction, a y direction, or a z direction (mathematically called a “copoint”) has a similar look and feel to the std::tuple version of point.
struct x { double value; };
struct y { double value; };
struct z { double value; };
using copoint = std::variant< x, y, z >;
This copoint implementation has the same drawbacks that the std::tuple implementation of point has. An lvariant version of copoint, on the other hand, is easier to grok and doesn’t require special tag types at all.
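To make the cost of the wrapper types concrete, here is a minimal C++17 sketch of working with the std::variant copoint above. The axis_of and magnitude_of helpers are hypothetical names introduced only for illustration; they are not part of the proposal.

```cpp
#include <string>
#include <variant>

// Wrapper types exist only to give each alternative a distinct type.
struct x { double value; };
struct y { double value; };
struct z { double value; };
using copoint = std::variant< x, y, z >;

// Hypothetical helper: name the axis the copoint lies along.
std::string axis_of( const copoint & c ) {
  if( std::holds_alternative< x >( c ) ) return "x";
  if( std::holds_alternative< y >( c ) ) return "y";
  return "z";
}

// Hypothetical helper: extract the distance, whichever axis is active.
double magnitude_of( const copoint & c ) {
  return std::visit( []( const auto & axis ) { return axis.value; }, c );
}
```

Every use site has to mention the wrapper types x, y, and z, which live in the enclosing scope rather than inside copoint itself.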
Some variation of the following example is common when illustrating a std::variant type:
void f( std::variant< double, std::string > v ) {
  if( std::holds_alternative< double >( v ) ) {
    std::cout << "Got a double " << std::get< double >( v ) << std::endl;
  }
  else {
    std::cout << "Got a string " << std::get< std::string >( v ) << std::endl;
  }
}
This illustrates how quickly variants can be disassembled when they are simple, but it is hardly representative of how complex variant types are used. The primary problem in the above snippet is that there are no compile-time guarantees that ensure all of the n alternatives are covered. For the more general scenario, a visit function is provided.²
struct f_visitor {
  void operator()( const double d ) {
    std::cout << "Got a double " << d << std::endl;
  }
  void operator()( const std::string & s ) {
    std::cout << "Got a string " << s << std::endl;
  }
};
void f( std::variant< double, std::string > v ) {
  std::visit( f_visitor(), v );
}
Aside from the unsightly verbosity of the above code, the mechanism by which this works makes the visitor’s operator() rules work by SFINAE, which is a significant developer complication. Using a template parameter as part of a catch-all clause is going to necessarily produce strange error messages.
struct f_visitor {
  template< typename T >
  void operator()( const T & t ) {
    // oops
    std::cout << "I got something " << t.size() << std::endl;
  }
};
void f( std::variant< double, std::string > v ) {
  // Unhelpful error message awaits. Erroneous line won't be pointed out.
  std::visit( f_visitor(), v );
}
While the utility of type selection and SFINAE for visitors is quite clear for advanced C++ developers, it presents significant hurdles for the beginning or even intermediate developer. This is especially true considering that the visit function is the only way to guarantee a compilation error when not all cases are handled.
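A common C++17 idiom trims some of the verbosity by merging several lambdas into a single visitor. The sketch below assumes a hypothetical overloaded helper and a hypothetical describe function, neither of which appears in this paper. It still relies on overload resolution inside std::visit, so it does not remove the complexity discussed above.

```cpp
#include <string>
#include <variant>

// Hypothetical helper: inherit the call operators of all supplied lambdas.
template< class... Ts > struct overloaded : Ts... { using Ts::operator()...; };
template< class... Ts > overloaded( Ts... ) -> overloaded< Ts... >;

// Build a description of whichever alternative is active.
std::string describe( const std::variant< double, std::string > & v ) {
  return std::visit( overloaded{
    []( double d ) { return "Got a double " + std::to_string( d ); },
    []( const std::string & s ) { return "Got a string " + s; }
  }, v );
}
```

Removing either lambda makes the std::visit call ill-formed, which is the exhaustiveness guarantee the text refers to; the resulting diagnostic is typically long and points into library internals.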
Using types as accessors with a std::variant works for many use cases, but not all. If there is a repeated type, the only options are to either use wrapper types or to work with the real underlying discriminator, an integer index. To illustrate the problems with using the index, consider the following implementation of copoint:
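The snippet itself is not reproduced here. As a minimal sketch, assuming the repeated-type form that the surrounding discussion describes, such a copoint and its index-based access could look as follows in C++17; along_y, axis_index, and distance are hypothetical helper names.

```cpp
#include <cstddef>
#include <variant>

// Assumed form of the repeated-type copoint. The alternatives can only be
// told apart by position: 0 = x, 1 = y, 2 = z.
using copoint = std::variant< double, double, double >;

// Hypothetical helper: the index must be spelled out at construction.
copoint along_y( double d ) {
  return copoint( std::in_place_index< 1 >, d );
}

// Hypothetical helper: which axis is active, as a bare integer.
std::size_t axis_index( const copoint & c ) { return c.index(); }

// Hypothetical helper: read the stored distance by index.
double distance( const copoint & c ) {
  switch( c.index() ) {
    case 0: return std::get< 0 >( c );
    case 1: return std::get< 1 >( c );
    default: return std::get< 2 >( c );
  }
}
```

Nothing in the type records that index 1 means "y"; that knowledge lives only in comments and in the reader's head.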
Because the double type is repeated in the variant, std::get<double> is ill-formed and the standard std::visit cannot tell the three alternatives apart. Using the numeric index to work around the issue brings its own problems, however. Consider the following visitor:
struct visit_f {
  void operator()( std::integral_constant<std::size_t, 0>, double d ) {
    std::cout << d << " in x" << std::endl;
  }
  void operator()( std::integral_constant<std::size_t, 1>, double d ) {
    std::cout << d << " in y" << std::endl;
  }
  void operator()( std::integral_constant<std::size_t, 2>, double d ) {
    std::cout << d << " in z" << std::endl;
  }
};
Here we introduce yet another advanced C++ feature, compile-time integral constants. In the opinion of the authors, this is unfriendly to novices. The problem of duplicated types can be even more insidious, however…
Consider the following code:
On most platforms, this code will compile and run without a problem. However, if std::size_t happens to be typedef’d to be the same type as unsigned on a particular platform, a compilation error will ensue. The only two options for fixing the error are to fall back to using the index or to make custom wrapper types, and this is assuming one can edit the library type.
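The portability hazard can be sketched in C++17 as follows. The count alias and the helper names are hypothetical and chosen only to mirror the unsigned versus std::size_t situation described above; this is not the snippet the paper originally showed.

```cpp
#include <cstddef>
#include <type_traits>
#include <variant>

// Hypothetical variant whose alternatives may or may not be distinct types.
using count = std::variant< unsigned, std::size_t >;

// True only on platforms where std::size_t is the same type as unsigned.
constexpr bool alternatives_collide = std::is_same_v< unsigned, std::size_t >;

// Index-based access compiles on every platform.
std::size_t as_size( const count & c ) {
  return c.index() == 0 ? std::get< 0 >( c ) : std::get< 1 >( c );
}

// Type-based access is ill-formed when the alternatives collide, so it is
// kept in a template where the unused branch is never instantiated.
template< typename Variant >
std::size_t size_alternative( const Variant & c ) {
  if constexpr( alternatives_collide ) {
    return std::get< 1 >( c );
  } else {
    return std::get< std::size_t >( c );
  }
}
```

Code that simply writes std::get< std::size_t >( c ) without such a guard builds on one platform and fails on another, with no change to the source.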
Also notable is that third-party libraries that are free to change their underlying types create abstraction leaks when used with a library-based variant.
// Is this code future proof? Not likely. Looks like a foot-gun to me.
using database_handle = std::variant< ORACLE_HANDLE, BERKELEY_HANDLE >;
Because lvariants require identifiers as tags, they aren’t susceptible to this problem:
lvariant
The definition of an lvariant has the same syntax as a union, but with an lvariant keyword as in the following example:
// This lvariant implements a value representing the various commands
// available in a hypothetical shooter game.
lvariant command {
  std::size_t set_score; // Set the score to the specified value
  std::monostate fire_missile; // Fire a missile
  unsigned fire_laser; // Fire a laser with the specified intensity
  double rotate; // Rotate the ship by the specified degrees.
};
Each member declaration consists of a type followed by its corresponding identifier.
An lvariant has a default constructor if its first field also has a default constructor. A default-constructed lvariant is set to the first field’s default-constructed value.
Assignment at construction can be used to set the lvariant to a particular value. The lvariant is used as a namespace when specifying specific alternatives.
lvariant instances can also be assigned in the course of a program’s execution.
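For readers who want to experiment today, the three operations just described (default construction, construction of a named alternative, and later reassignment) have rough C++17 analogues on std::variant. The sketch below is an approximation under stated assumptions, not the proposed semantics: the command alias stands in for the lvariant, alternatives are selected by index instead of by name, and the helper function names are hypothetical.

```cpp
#include <cstddef>
#include <variant>

// Stand-in for the 'command' lvariant. Alternatives by index:
// 0 = set_score, 1 = fire_missile, 2 = fire_laser, 3 = rotate.
using command = std::variant< std::size_t, std::monostate, unsigned, double >;

// Default construction selects the first alternative, value-initialized.
command default_command() { return command{}; }

// Rough analogue of 'command cmd = command::set_score( 10 );'.
command make_set_score( std::size_t score ) {
  return command( std::in_place_index< 0 >, score );
}

// Rough analogue of assigning the 'rotate' alternative at run time.
void rotate_by( command & cmd, double degrees ) {
  cmd.emplace< 3 >( degrees );
}
```

The indices 0 and 3 carry the meaning that the identifiers set_score and rotate carry in the lvariant version.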
Extracting values from an lvariant is accomplished with a new inspect keyword. While pattern matching is covered in an accompanying paper, P1308, we provide some basic examples below for exposition purposes.
inspect( cmd ) {
  set_score value =>
    stream << "Set the score to " << value << ".\n";
  fire_missile m =>
    stream << "Fire a missile.\n";
  fire_laser intensity =>
    stream << "Fire a laser with " << intensity << " intensity.\n";
  rotate degrees =>
    stream << "Rotate by " << degrees << " degrees.\n";
}
As with library-based variants, the behavior of assignment when an exception is thrown is of considerable concern. We propose the following for lvariants:
This provides a safe default and supports users of differing philosophies.
The “I’m broken. You deal with it.” philosophy allows the lvariant to get into a state where the only valid operations are assignment and destruction. This is accomplished by overriding the assignment operator and allowing the ‘std::valueless_by_exception’ exception to pass through to callers.
lvariant Foo {
  PossiblyThrowingMoveAssignmentType field1;
  std::string field2;
  // Possibly throw a 'std::valueless_by_exception' exception which makes this
  // object only assignable and destructible.
  Foo& operator=(const Foo& rhs);
  Foo& operator=(Foo&& rhs); // implementation skipped
};
Foo& Foo::operator=(const Foo& rhs)
{
  // This can possibly throw a 'std::valueless_by_exception' exception.
  lvariant(*this) = rhs;
  return *this;
}
The “exceptions are for the weak” philosophy essentially terminates the program if there’s an exception on assignment. This is accomplished by marking the assignment operator noexcept.
lvariant Foo {
  PossiblyThrowingMoveAssignmentType field1;
  std::string field2;
  Foo& operator=(const Foo& rhs) noexcept;
  Foo& operator=(Foo&& rhs) noexcept; // implementation skipped
};
Foo& Foo::operator=(const Foo& rhs) noexcept
{
  // An exception escaping a noexcept function calls std::terminate.
  lvariant(*this) = rhs;
  return *this;
}
The “embrace emptiness” philosophy switches to a special empty state if there’s an exception on assignment. This is accomplished by handling the std::valueless_by_exception exception within the assignment operator.
lvariant Foo {
  PossiblyThrowingMoveAssignmentType field1;
  std::string field2;
  std::monostate empty;
  Foo& operator=(const Foo& rhs);
  Foo& operator=(Foo&& rhs); // implementation skipped
};
Foo& Foo::operator=(const Foo& rhs)
{
  try {
    lvariant(*this) = rhs;
  }
  catch(std::valueless_by_exception&) {
    // Fall back to the dedicated 'empty' alternative.
    lvariant(*this) = Foo::empty();
  }
  return *this;
}
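For comparison with today’s library type: in C++17, std::variant reports this state through the valueless_by_exception() member function and through index() returning std::variant_npos, not through an exception of that name. The sketch below uses a hypothetical fragile alternative and holder alias. The standard only says a variant might become valueless after such a failure, so the resulting index is reported, not assumed.

```cpp
#include <cstddef>
#include <variant>

// Hypothetical alternative whose constructor from int always throws. The
// user-provided copy constructor keeps it from being trivially copyable.
struct fragile {
  fragile( int ) { throw 42; }
  fragile( const fragile & ) {}
};

using holder = std::variant< float, fragile >;

// Attempt to switch from float to fragile and swallow the exception.
void try_switch( holder & v ) {
  try {
    v.emplace< 1 >( 0 );
  } catch( int ) {
    // The old float may already have been destroyed at this point.
  }
}

// Index after the failed switch: std::variant_npos when valueless.
std::size_t index_after_failure() {
  holder v( std::in_place_index< 0 >, 12.f );
  try_switch( v );
  return v.index();
}

// Assignment is always a valid way out of the valueless state.
bool recovers_by_assignment() {
  holder v( std::in_place_index< 0 >, 12.f );
  try_switch( v );
  v = 1.f;
  return std::holds_alternative< float >( v ) && std::get< float >( v ) == 1.f;
}
```

Whichever state the failed emplace leaves behind, a later assignment restores a usable value, which matches the "only assignment and destruction are valid" contract described above.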
lvariants
Pattern matching is the easiest way to work with lvariants. Consider the following binary tree with int leaves.
Say we need to write a function which returns the sum of a tree object’s leaf values. Variant patterns are just what we need. A pattern which matches an alternative consists of the alternative’s name followed by a pattern for its associated value.
int sum_of_leaves( const tree & t ) {
  return inspect( t ) {
    leaf i => i
    branch b => sum_of_leaves(*b.first) + sum_of_leaves(*b.second)
  };
}
Assuming we can pattern match on the std::pair type, which is discussed in the companion paper, this could be rewritten as follows.
int sum_of_leaves( const tree & t ) {
  return inspect( t ) {
    leaf i => i
    branch [left, right] => sum_of_leaves(*left) + sum_of_leaves(*right)
  };
}
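For contrast, the same function can be written against std::variant in C++17. This is a minimal sketch under stated assumptions: the tree layout (an int leaf or a pair of owning pointers), the node member, and the make_leaf and make_branch helpers are hypothetical, since the lvariant definition of tree is not reproduced here.

```cpp
#include <memory>
#include <utility>
#include <variant>

struct tree;
using tree_ptr = std::unique_ptr< tree >;

// A recursive variant needs a named struct so that 'tree' can be referred
// to inside its own alternatives. Index 0 = leaf, index 1 = branch.
struct tree {
  std::variant< int, std::pair< tree_ptr, tree_ptr > > node;
};

// Hypothetical construction helpers.
tree make_leaf( int i ) { return tree{ i }; }

tree make_branch( tree left, tree right ) {
  return tree{ std::pair< tree_ptr, tree_ptr >(
    std::make_unique< tree >( std::move( left ) ),
    std::make_unique< tree >( std::move( right ) ) ) };
}

// Sum the leaves by testing for the leaf alternative first.
int sum_of_leaves( const tree & t ) {
  if( const int * leaf = std::get_if< int >( &t.node ) ) {
    return *leaf;
  }
  const auto & branch = std::get< 1 >( t.node );
  return sum_of_leaves( *branch.first ) + sum_of_leaves( *branch.second );
}
```

The alternatives have no names here, so the reader has to know that index 1 is the branch case; the inspect versions above state that directly.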
We conclude that types-as-tags are for astronauts, but variants are for everyone. None of the library implementations thus far proposed are easy enough to be used by beginners; a language feature is necessary. In the authors’ opinion a library-based variant should complement a language-based variant, but not replace it. And with language-based variants comes pattern matching, another highly desirable feature in the language.
Thanks to Vicente Botet Escribá, John Skaller, Dave Abrahams, Bjarne Stroustrup, Bengt Gustafsson, and the C++ committee as a whole for productive design discussions. Also, Yuriy Solodkyy, Gabriel Dos Reis, and Bjarne Stroustrup’s prior research into generalized pattern matching as a C++ library has been very helpful.
¹ See The C++ Core Guidelines rule C.2.
² Compare that code to the same for an lvariant:
lvariant double_or_string {
  double with_double;
  std::string with_string;
};
void f( double_or_string v ) {
  inspect( v ) {
    with_double d =>
      std::cout << "Got a double " << d << std::endl;
    with_string s =>
      std::cout << "Got a string " << s << std::endl;
  }
}