Proposal for C2y
WG14 3316
Title: The `void`-_which-binds_, v2: typesafe parametric polymorphism
Author, affiliation: Alex Celeste, Perforce
Date: 2024-08-19
Proposal category: New feature
Target audience: Compiler/tooling developers, library developers, application developers
C has a built-in mechanism for handling generically-typed data: void *
. Unfortunately, this has
a problem: it achieves genericity by simply discarding all type information!
We propose a mechanism to define native functions and data structures that are both polymorphic and strongly-typed, as commonly seen in functional programming languages or those that have adopted functional idioms.
The mechanism does not impose any runtime or ABI burden.
Reply-to: Alex Celeste (aceleste@perforce.com)
Document No: N3316
Revises: N2853
Date: 2024-08-30
- add discussion of function pointer usability issues, proposal for conversion
- add qualifier-generic binding in response to Committee request
- add discussion of container types
- small fixes, typo changes, completely reworked normative wording
- rebase proposed changes against latest draft
- original proposal
C can define polymorphic functions by using void *
, but this comes at the cost of complete type
erasure.
This means that while a function can be defined to operate on any type of data, it trades the ability to do so for the lost ability to statically check that the operations it wants to perform on that data are actually valid for its concrete type. It cannot guarantee that two operands to be compared originate from the same kind of data, and it cannot even guarantee that the comparison operator intends to work with them if they are!
In many other languages, expressing the necessary constraints is easy. In many languages influenced by ML, "type variables" can be used in the function signature, and in C++, templates can be used to parameterize a function over types, although templates are not parametric in quite the same way (templates are not polymorphic functions, but instead instantiate to monomorphic functions).
We propose a matched pair of two attributes which can appertain to a pointed-to void
, which force
the pointer to either bind to a concrete type (for operator implementations and container type
definitions), or to bind consistently with other void
pointers within the same toplevel
function signature. Because these are attributes, they impose no runtime cost, and if they are
removed from a correct program, ensure it will remain binary-identical to the type-safe version.
This has the convenient impact that implementation conformance can be achieved by doing nothing.
Other languages, especially those descending from or influenced by ML, have the notion of "type variables", by which function (and other) types can be parameterized: at the point of use, an operand type is substituted into the type of the function, and if the same type variable appears in two places in the function type, it has to have the same binding to a type deduced from the operands, enforcing type safety.
So in an ML-like or Haskell-like language, a function can have a type like this:
map :: forall a. ([a], (a -> a)) -> [a]
...which we can understand to take some sequence of elements of type a
, a callback which
transforms objects of type a
to other objects also of type a
, and returns another sequence
which (we will assume) contains the results of each transformation. We can't call this function
with a callback that operates on objects of a type not compatible with the objects in the input
buffer.
Informatively, a similar C++ signature might be:
template <typename A>
auto map (Seq <A> const &, Function <A (A)> const &) -> Seq <A>;
Though it is important to remember that C++ templates are not functions; they merely instantiate
into monomorphic functions, so this is not the same. Polymorphic functions in Haskell and ML
have a truly polymorphic type for a single identity; the same function can be referenced, passed
around, and still used polymorphically at a distance. This property is also true of C functions
that use void *
- they can be referenced by pointer and still applied to different argument
types - and is the one we wish to preserve.
To express this idiomatically in C, we could use void *
, and the function signature would
probably look something like:
void map (void * in
, void * out
, void (* op) (void const *, void *)
, void * (* step) (void *));
(either the step
function, or a step size and some crafty casts to char *
, is required to
actually navigate the lists in the absence of size information about the elements)
Nothing prevents us from providing a mistyped op
- or even a mistyped step
:
not even navigating a polymorphic sequence is statically assured to make sense!
Surprisingly, this is actually possible to enforce (with a lot of supporting verbosity) in pure
C11. However, the technique is too verbose and fragile for real-world use and does not scale at
all. Examples are provided at the end of an untyped program which can miscompile, a typechecked
C11 program, and an equivalent typechecked program using the void
-which-binds, showing the
relative verbosity and the improvements in readabilty of the full feature.
We propose to add four new attributes which can appertain to the type void
, when it appears as the
referenced type of a pointer type used as either a function parameter or a function return. These
are:
[[bind_var (id)]]
[[bind_type (type)]]
[[qual_var (id)]]
[[qual_with (id)]]
where id
is an identifier that introduces a type variable, and type
is a concrete type name
other than void
(it does not otherwise need to be complete).
The names of these attributes are tentative.
Syntactically, because bind_var
and bind_type
appertain to the type void
itself, they appear
after it in a pointer type specifier. They may also appertain to a typedef of void
. qual_var
and qual_with
also appertain to the pointed-to type, but are not limited to void
.
(note that the syntax requires them to appear at the end of the specifier list, so if the type is
qualified-void
e.g. void const
, they have to appear void const [[here]]
and not
void [[here]] const
, which forces the use of "west const" const void [[here]]
if the void
and the attribute originate from the same macro)
Both of the bind
attributes add constraints which restrict which pointer types the given
parameter or return value may be converted to without an explicit cast. Effectively both of these
attributes add virtual qualifiers that associate with the pointed-to type and therefore cannot be
discarded except by explicit cast-off.
qual_with
adds a similar constraint based on the qualifier deduced by qual_var
, enforcing that
the pointer type annotated with it must only be converted as if it had a deduced qualifier.
The simpler attribute is bind_type
. This adds the constraints that the pointer with the
annotated referenced type can only implicitly convert to or from a (suitably qualified) pointer to
type
, or be assigned to another (suitably qualified) pointer to void
with the same annotation.
For example:
void const [[bind_type (float)]] *
float_op_impl (void const [[bind_type (float)]] * p1, void [[bind_type (float)]] * p2)
{
float const * fp1 = p1; // OK, bound type
float * fp2 = p2; // OK
int const * ip1 = p1; // new constraint violation
void const * vp1 = p1; // new constraint violation
int const * ip2 = p2; // new constraint violation
void const * vp2 = p2; // new constraint violation
fp2 = p1; // (existing constraint violation)
ip2 = (int *)p2; // OK, explicit cast
vp2 = (void *)p2; // OK, explicit cast-off of bind_type
return fp1; // OK, bound type
return vp1; // new constraint violation
return (float const *)ip1; // ...OK at least by these rules
return p1; // OK, same bind_type
}
This attribute is not really intended to be used in polymorphic functions themselves, but rather in
the operators passed to the polymorphic function, which are monomorphic in themselves but usually
will require a signature that explicitly works with void *
(because int * (*) (int *)
and
void * (*) (void *)
are not convertible or guaranteed to have similar representations), in order
to be passed to the actual polymorphic function (but see below for more on both of these points).
This attribute is used on the parameters and return type of the actual polymorphic function. It is also appertained to the parameters and return type of any callback operators accepted by the polymorphic function, within its own signature.
bind_var
adds the constraint that whatever the referenced types of the pointers that are
converted to or from (possibly-qualified) void *
in the argument types and returned value
conversion, must be consistent between all void
pointers annotated with the same id
, within
each instance of a call to that function.
In other words it introduces a type variable and all void *
annotated with the same type
variable must convert to or from the same (ignoring qualifiers) pointer type.
For example:
void [[bind_var (A)]] *
select_first (void [[bind_var (A)]] * a, void [[bind_var (A)]] * b, void [[bind_var (B)]] * c)
{
return a; // OK, a must have bound to an A*
return b; // OK, b must have bound to an A*
return c; // new constraint violation, c may have bound to a different type
}
int * ip;
float * fp;
ip = select_first (ip, ip, fp); // OK, a and b are both A
fp = select_first (fp, fp, ip); // same
ip = select_first (ip, ip, ip); // OK, A and B can both be the same, it just doesn't know that
ip = select_first (fp, fp, ip); // new constraint violation, mismatched A between return/param
select_first (ip, fp, fp); // new constraint violation, mismatched A between params 1 & 2
The two attributes combine: bind_type
can bind its type variable to the type introduced by a
bind_var
annotation, to allow operator callbacks to be type-checked against data arguments to a
higher-order polymorphic function:
void map (
void [[bind_var (A)]] * in
, void [[bind_var (A)]] * out
, void (* op) (void const [[bind_type (A)]] *, void [[bind_type (A)]] *)
, void [[bind_type (A)]] * (* step) (void [[bind_type (A)]] *)
);
The distinction is that bind_var
deduces a value for the type variable from context, while
bind_type
enforces that the conversion is consistent when there is no input context (it is for
conversions that will occur within the function body, from typed-void
to typed-void
, and can
therefore be used in declarations local to the definition).
bind_type
therefore works with untyped callbacks and enforces a type for their arguments within
the function's scope only. If the callback had typed-void
parameters already, bind_var
could
also appear here and deduce a type from the callback's existing bind_type
attributes (which
would still have to be consistent with the type deduced from in
and out
in this particular
example).
In the above declaration of a polymorphic map (implementation is in the Examples section), all of
the void *
type components have been annotated as having to bind to the same concrete object
type.
Therefore:
int i1[10], i2[10];
float f1[10], f2[10];
void int_op (void const [[bind_type (int)]] *, void [[bind_type (int)]] *);
void [[bind_type (int)]] * int_step (void [[bind_type (int)]] *);
void float_op (void const [[bind_type (float)]] *, void [[bind_type (float)]] *);
void [[bind_type (float)]] * float_step (void [[bind_type (float)]] *);
map (i1, i2, int_op, int_step); // OK, all void * bind to int *
map (f1, f2, float_op, float_step); // OK, all void * bind to float *
map (i1, i2, float_op, float_step); // new constraint violation - operator types do not match data
map (i1, f2, int_op, float_step); // new constraint violation - operator and data types mismatch
Finally, if a polymorphic function is called from within another polymorphic function, the type
variables bind to the type variables of the void *
pointers passed directly to it, rather than
to void
:
// yikes!
void map2 (
void [[bind_var (A)]] * in1
, void [[bind_var (A)]] * out1
, void (* op1) (void const [[bind_type (A)]] *, void [[bind_type (A)]] *)
, void [[bind_type (A)]] * (* step1) (void [[bind_type (A)]] *)
, void [[bind_var (B)]] * in2
, void [[bind_var (A)]] * out2
, void (* op2) (void const [[bind_type (B)]] *, void [[bind_type (B)]] *)
, void [[bind_type (B)]] * (* step2) (void [[bind_type (B)]] *)
) {
map (in1, out1, op1, step1); // OK, all void * bind to A *
map (in2, out2, op2, step2); // OK, all void * bind to B *
map (in1, out1, op2, step2); // new constraint violation - operator types do not match data
map (in1, out2, op1, step2); // new constraint violation - operator and data types mismatch
}
Within the function body, void
-pointers annotated by bind_var
may only implicitly convert to
other void
-pointers with the same type variable. Conversions to other types, including pointers
to void
without the same type variable (inc. no type variable), require an explicit cast.
A void
-pointer annotated by bind_type
with a type variable that was introduced by bind_var
has the same constraint (consistent with the constraint if it had named a concrete type).
qual_var
is very similar to bind_var
, in that it introduces a type-level variable that binds
to part of the type that the appertained pointed-to type was implicitly converted from. However,
instead of introducing a variable for the pointed-to object type, it introduces a variable for
the qualification of the pointed-to object type.
Like bind_var
, qual_var
can bind to the qualifiers of the type named in a bind_type
attribute,
or to the qualifier variable named by another qual_var
attribute if one polymorphic function
calls another polymorphic function. Just like bind_var
, if two qual_var
attributes declare the
same qualifier variable in the same scope, they must bind to exactly the same qualifiers on the
pre-conversion type.
Unlike bind_var
, which works with void
as the object type and therefore has no other object
information to work with, it is possible for a pointer to be declared as pointing to a qualified
type and also to qual_var
. This is actually necessary: if a pointer is declared as pointing to
unqualified void, only pointers to unqualified object types would be able to convert to it,
so the listed qualifiers on the pointed-to type actually describe the maximum qualification that
qual_var
can bind (i.e. it can bind to less qualification than whatever is declared using
qualifier specifiers).
This same restriction doesn't exist in the return type (or whatever other "output" type is annotated
by qual_with
), which can (and generally should) be a pointer to an unqualified object type, and
must either be immediately implicitly converted back to a pointer with at-least as much
qualification of the pointed-to type annotated by the corresponding qual_var
, or discarded (in
practice functions using this are likely to be logically nodiscard
).
This attribute implements a feature which is already in C23 but currently uses ad-hoc signatures, namely the qualifier-generic search functions. These provide the best examples:
char [[qual_with (Q)]] * strchr (const char [[qual_var (Q)]] * s, int c);
The strchr
function is already defined to return a pointer with the same qualification as its
argument (not as its formal parameter), and this signature is able to capture that: if the
argument applied to s
was char const *
, Q
binds const
, and therefore the return type
char Q *
must be converted to char const *
; or if the argument to s
was char *
, Q
binds nothing and char Q *
is allowed to convert to char *
or a pointer to a more qualified
char
. However, Q
can never bind to volatile
because the parameter is not a pointer to
a volatile
-qualified type, so no such pointer can be used as an argument value.
As is demonstrated by the example, qual_var
can be used with complete object types, unlike
bind_var
, because it captures a different part of the pointed-to type.
The relationship between qual_var
and qual_with
is symmetric to the relationship between
bind_var
and bind_type
: qual_with
does not deduce any qualifiers from the conversion,
because its purpose is to ensure that appropriate qualification is added by the "output" conversion.
The "operator" functions that make use of the bind_type
attribute suffer from an unfortunate
usability problem, which is that although the concrete type of the pointed-to object is known by
the function definition, the pointers cannot be dereferenced because they still have an incomplete
C language type: the attribute does not change the meaning of the annotated C code, so there is no
ability added to use the pointers "conveniently", without converting the pointer type.
This is hardly insurmountable but makes the code much less elegant to read:
// compar functions for qsort:
int cmp_float_void (void const [[bind_type(float)]] * lv
, void const [[bind_type(float)]] * rv) {
float const * lhs = lv;
float const * rhs = rv; // this verbosity isn't great
return *lhs < *rhs ? -1
: *lhs > *rhs ? +1
: 0;
}
int cmp_float_simple (float const * lhs, float const * rhs) {
return *lhs < *rhs ? -1
: *lhs > *rhs ? +1
: 0;
}
Consequently we propose that it should be permitted to cast a pointer to any given function type,
to a pointer to a function type with the same signature but with some or all of the argument types
or the return type substituted for (equivalently-qualified) pointers to void
, annotated by
bind_type
with the substituted type name, if they were pointers to complete object types in the
original signature. This would allow:
qsort (base, n, sz, (Compar)cmp_float_simple);
even though cmp_float_simple
doesn't have the same signature as the compar
for qsort
and is
not a compatible function type.
We propose that this permission should only be made available to (possibly parenthesized/generically selected) identifiers that declare a function, and not to generalized function pointers. The reason for the restriction is that the conversion is not guaranteed to be a no-op on targets where data pointers can have different representations, and therefore function types that differ only in parameter and return pointer types can potentially have different ABIs. This primarily affects the embedded space. A cast of an identifier which designates a specific function can be implemented by a compiler-generated shim function wrapping that specific function in the new signature, but this cannot implement generalized (non-constant) function pointers.
(Compar)cmp_float_simple
// compiler inserts the static shim:
static int @shim$cmp_float_simple (void const [[bind_type(float)]] * lv
, void const [[bind_type(float)]] * rv) {
return cmp_float_simple (lv, rv);
}
(Compar)some_dynamic_pointer
// static shim is not possible
There is an existing requirement in 6.3.2.3 that when a pointer to a function is converted to a pointer to a function of another type and back again, the result compares equal to the original pointer. This is achievable with compiler-inserted shims, since the compiler (or at least, the linker) would be able to construct a list of the generated shim functions and their originals to check for in a back-conversion. This is not free, but on platforms where the conversion is not trivial anyway, is potentially an acceptable cost as users are less likely to be freely converting between function types in that situation. (on common desktop platforms where the shim is not needed anyway, the conversion back will always be a no-op too)
We would propose that the bind_var
and qual_var
attributes apply immediately to the functions
qsort
and bsearch
in <stdlib.h>
and to memchr
in <string.h>
; and that the qual_var
attribute apply immediately to the functions strchr
, strpbrk
, strrchr
and strstr
from
<string.h>
; and to the functions wcschr
, wcspbrk
, wcsrchr
, wcsstr
and wmemchr
from
<wchar.h>
:
void [[qual_with (Q), bind_type (T)]] * bsearch (void const * key
, void const [[qual_var (Q), bind_var (T)]] * base
, size_t nmemb
, size_t size
, int (*compar)(void const [[bind_type (T)]] *, void const [[bind_type (T)]] *));
void qsort (void [[bind_var (T)]] * base
, size_t nmemb
, size_t size
, int (*compar)(void const [[bind_type (T)]] *, void const [[bind_type (T)]] *));
void [[qual_with (Q)]] * memchr (void const [[qual_var (Q)]] * s, int c, size_t n);
char [[qual_with (Q)]] * strchr (char const [[qual_var (Q)]] * s, int c);
char [[qual_with (Q)]] * strpbrk (char const [[qual_var (Q)]] * s1, const char * s2);
char [[qual_with (Q)]] * strrchr (char const [[qual_var (Q)]] * s, int c);
char [[qual_with (Q)]] * strstr (char const [[qual_var (Q)]] * s1, const char * s2);
wchar_t [[qual_with (Q)]] * wcschr (wchar_t const [[qual_var (Q)]] * s, wchar_t c);
wchar_t [[qual_with (Q)]] * wcspbrk (wchar_t const [[qual_var (Q)]] * s1, const wchar_t *s2);
wchar_t [[qual_with (Q)]] * wcsrchr (wchar_t const [[qual_var (Q)]] * s, wchar_t c);
wchar_t [[qual_with (Q)]] * wcsstr (wchar_t const [[qual_var (Q)]] * s1, const wchar_t *s2);
wchar_t [[qual_with (Q)]] * wmemchr (wchar_t const [[qual_var (Q)]] * s, wchar_t c, size_t n);
(i.e. none of the functions deduce object type or qualification from the return conversion, and
neither bsearch
nor qsort
deduce object type from compar
, even though doing so is possible,
only from base
)
The names and exact placement of the attributes can be subject to further discussion. In practice we expect that users would wrap them inside macros anyway, for instance:
#define Auto(T) void [[bind_var (T)]]
#define Void(T) void [[bind_type (T)]]
void map (
Auto (A) * in
, Auto (A) * out
, void (* op) (const Void (A) *, Void (A) *)
, Void (A) * (* step) (Void (A) *)
);
void int_op (const Void (int) *, Void (int) *);
(again note that the choice to deduce from both in
and op
, but not out
or step
, is a design
decision and all four could be deduced from, or even just one of the four; wherever the type
information exists)
Overloads and _Generic
do not provide a suitable alternative to this feature. The ad-hoc
polymorphism provided by _Generic
operates on a fixed set of known types. The operator name that
it can be used to construct is a second-class language feature that does not have a single address.
To pass such an operator around, it must be deconstructed somehow and a non-generic overload
selected.
This functionality is completely orthogonal; where an ad-hoc polymorphic "overloaded" function provides different suitable implementations for a number of different types, a parametrically polymorphic function is completely unaware of the concrete type of its arguments and always provides the exact same implementation regardless of it.
Templates are similarly orthogonal: they do not create a first-class polymorphic language feature (remaining polymorphic when passed by value), but instantiate into separate monomorphic functions. They can also specialize, which betrays the principle of always providing the exact same body implementation down to the instruction level.
A different proposal targeting C2y suggests adding runtime-polymorphic types to the language.
This feature can be used to implement essentially any algorithm that could be expressed using
parametric polymorphism, but does not perform any type-checking. Essentially it expands the
functionality of void *
itself to allow whole object footprints to become untyped. In general
we feel like these proposals are best suited to target different use cases - N3212 is better
suited to use cases where the object type really is intended to be dynamic, such as relying
on a dynamic object footprint, rather than functions where it is simply abstracted and can be
checked for consistency against the other arguments. It also does not propose any way to improve
type safety, so does not help with any case where the existing void *
feature is already enough
to implement an algorithm but is not enough to be sure it is correct. The sort
examples from
N3212 are an area which we feel this proposal handles better, while the atomic
examples are
a use case that this proposal cannot handle at all and benefit very well from its dynamic
approach.
The implementation impact is lower than it seems.
Firstly, there is no ABI impact whatsoever from this feature. Because the constraints are
applied by attributes, they do not change the representation type of the pointers to generic data,
which remain void *
at all times. There is no binary difference between a program that chooses
to make use of this typechecking feature and one that does not.
This is different from C++, where a template function instantiates into multiple implementations with semantically separate identities and which all have different signature types. There remains only one polymorphic function, with one type, and whose type remains polymorphic even after it is referenced indirectly.
This follows directly from the principle that a correct program with standard attributes remains correct and has the exact same observable meaning if the attributes are removed.
Secondly, if polymorphic type checking proves too difficult for a smaller implementation, since these constraints are only introduced by attribute - the implementation can choose to be unable to diagnose violations of the constraints and will still be conforming, because it will always compile a correct program to exactly the same meaning as a higher-end compiler that does understand the attributes. Therefore, implementations are eased-in to needing to support polymorphism, and do not need to provide full checking in order to conform.
It is possible and entirely plausible that only analysis tools and not primary compilers would ever bother to implement the constraints for these attributes.
Thirdly, implementation experience with Helix QAC suggests that ML-style type inference with unification is "not too difficult" for a small team (one) to implement in a short amount of time. We found that C could easily be extended with back-propagating inferred pointer types, which are used to drive a number of different type-based (proprietary) analyses. (In QAC these properties are mostly inferred from usage rather than specified explicitly by user attributes.)
ML subset compilers are frequently implemented as student projects, so this is a widely understood feature in the broader compiler community.
A request that immediately arose from the first version of this proposal was that bind_var
should
also be available for use in structures, to make container types slightly more resilient.
If bind_var
and friends are adopted for use with function types, it therefore follows that the
logical next line of development would be to define the attributes in order to allow syntax along
the lines of the following:
[[bind_var (T)]] struct LinkedList {
struct LinkedList [[bind_type (T)]] * next;
void [[bind_type (T)]] * data;
};
Such a structure definition is the same type as a generic linked list that can hold elements of
any type, but enforces that the type held in data
is consistent with the declaration of the pointer
to the head, and that the type of the node pointed to by next
will similarly be consistent with the
type of this node. This would enforce that the list only contains elements of a single element type
without having to instantiate or metaprogram-up a specific LinkedList
for the member type. This has
the advantage that containers of distinct (but internally-consistent) element types can be handled
by algorithm functions (map
etc.) that are themselves fully polymorphic; if LinkedList
needed to
be instantiated into multiple struct types, the handling functions would themselves also need to
have a distinct instance for each concrete type, ending up with something like C++ templates instead
of a proper parametrically-polymorphic system with only one generic function and one generic
container type.
Note that unlike polymorphic functions, which deduce bindings for the type variable from use, polymorphic types would need to have the type explicitly declared:
struct LinkedList [[bind_type (int)]] * head = malloc (sizeof (struct LinkedList));
Note it is not possible to write auto [[bind_type (int)]] * head =
in the current version of the
language because of syntax limitations on auto
(the star is an implementation-defined extension).
This could be a separate improvement that composes well.
We do not propose wording for this feature at this time, though overall the development should be simpler than the basis established for function types, and the extension seems reasonable.
Although the feature described by N3212 generally fits better against other use cases, it has
room to compose well with void
-which-binds in situations where a function accepts more than one
dynamically-typed argument. For instance:
void generic_from_to (_Type T
, _Var(T) const * from
, _Var(T) * to) {
T temporary;
// work using the temporary, from, and to
}
could avoid having, and needing to check, a dependent type in the signature by using void *
:
void generic_from_to (_Type T
, void const [[bind_var (A)]] * from
, void [[bind_var (A)]] * to) {
T temporary; // can still do this bit
// same work because `_Var(X) *` is defined to be a `void *` in ABI
}
The proposed changes are based on the latest public draft of C2y, which is N3220. Bolded text is new text when inlined into an existing sentence.
Modify 6.7.13 "Attributes":
Modify 6.7.13.2 to add the four new standard attribute names:
The identifier in a standard attribute shall be one of:
bind_type
bind_var
deprecated
fallthrough
maybe_unused
nodiscard
noreturn
_Noreturn
qual_var
qual_with
unsequenced
reproducible
(ED: not in alphabetical order?)
Note that 6.7.13.2 paragraph 3 and footnote 173 remain true for the new attributes without changes.
Add two new sections after 6.7.13.8:
6.7.13.9 The
bind_var
andbind_type
attributesConstraints
The
bind_var
attribute shall be applied to the referenced type of a pointer in the return or parameter types of a function declaration or function pointer.
bind_var
shall have an argument clause, which shall have the form
(
identifier)
The
bind_type
attribute shall be applied to the referenced type of a pointer in the return or parameter types of a function declaration or function pointer, or to the referenced type of a pointer type specified within the scope of a function that usesbind_var
in its signature.
bind_type
shall have an argument clause, which shall have the form
(
type-name)
or
(
identifier)
where identifier is an identifier introduced by abind_var
attribute appearing in the function signature.The
bind_var
andbind_type
attributes shall not be applied to any type other than possibly-qualifiedvoid
. footnote)footnote) they can therefore apply to aliases for
void
created bytypedef
.Semantics
The
bind_var
andbind_type
attributes indicate that a pointer declared as a possibly-qualified pointer tovoid
is intended to be used with statically-checked types.A pointer that specifies the
bind_var
attribute is a binding pointer. A binding pointer with an identifier A as its attribute argument is binding to A.A pointer that specifies the
bind_type
attribute is a constrained pointer. A constrained pointer with an identifier A as its attribute argument is constrained to A. A constrained pointer with a type name T as its attribute argument is constrained to T.Within the scope of its declaration, a binding pointer shall only be converted to a pointer binding to the same identifier, or to a pointer constrained to the same identifier, except by means of an explicit cast.
Within the scope of its declaration, a constrained pointer shall only be converted to a pointer binding to the same identifier, or to a pointer constrained to the same identifier or type name, or to a pointer to (possibly-qualified)
void
, except by means of an explicit cast.For a function call expression, if a parameter of the called function, or the return type, is or derives from a binding pointer; the corresponding type ignoring qualification in the type of the argument, or the type that the return type is converted or assigned to, determines the actual type. If the actual type is a binding pointer or constrained pointer in the calling scope, the actual type is the identifier or type name it is binding or constrained to; otherwise, the actual type is the referenced type; or
void
if the corresponding type in the calling context is not a pointer.For each binding pointer in the type of the called function that binds to the same identifier, the actual type shall be the same footnote).
footnote) two types that are compatible might not be the same type.
For each constrained pointer in the type of the called function that is constrained to an identifier, the actual type shall be the same as the actual type of all binding pointers in the type of the called function that bind to that identifier.
For each constrained pointer in the type of the called function that is constrained to a type name, the actual type shall be compatible with the constrained type, or
void
.The
__has_c_attribute
conditional inclusion expression (6.10.2) shall return the value ######L when givenbind_var
orbind_type
as the pp-tokens operand.If an identifier has been bound by
bind_var
in the current scope, and is also visible in the current scope as a typedef name, thebind_type
attribute interprets it as the identifier bound bybind_var
.NOTE there is no restriction against two binding pointers with distinct identifiers binding to or from the same type outside the function, but internally the function treats them as unrelated.
Recommended Practice
The
bind_var
attribute should be used in the parameters or return type of a polymorphic function to indicate that two separate generic pointers in its signature are intended to work on the same concrete type, in any given invocation.This attribute allows usage of the polymorphic function to be type-checked against the concrete types of its arguments and destination, as well as the expected types to be operated on by any callback operators passed to it.
An implementation is encouraged to emit a diagnostic if the identifier introduced by
bind_var
has also been declared in any name space in the current scope when the attribute is encountered.An implementation is encouraged to emit a diagnostic if two declarations of a function do not make consistent use of the
bind_var
andbind_type
attributes.The
bind_type
attribute should be used with a type name in the parameters or return type of a callback operator that is declared with a signature that uses pointers tovoid
because it is intended to work with polymorphic higher-order functions (likeqsort
), but that is itself intended to operate on values with a concrete type.The
bind_type
attribute used with an identifier should be used to indicate when the concrete type of the pointed-to object is expected to be the same as the concrete type pointed to by a binding pointer. This can for instance communicate that a return type is the same as a parameter type, but that it doesn't contribute to "deduction" and be discarded or cast to some other type.EXAMPLE 1 This polymorphic function returns a reference to an element within the first buffer. It cannot correctly return a reference to an element within the second buffer:
int i1[10]; float f1[10]; void [[bind_var (A)]] * first (void [[bind_var (A)]] * p1 , void [[bind_var (B)]] * p2) { return p1; // correct // return p2; // constraint violation } int * ip = first (i1, f1); // OK float * fp = first (f1, i1); // OK ip = first (i1, i1); // also OK, the function just doesn't know A and B are compatible here fp = first (i1, f1); // constraint violation, A and A do not match
EXAMPLE 2
qsort
is a polymorphic function that needs to accept a compare function that will work on its base array, so the following signature can communicate this:void my_qsort (void [[bind_var (T)]] * base , size_t nmemb , size_t size , int (*compar)(void const [[bind_var (T)]] * , void const [[bind_var (T)]] *));) my_qsort (f1, 10, sizeof (float), float_compare); // OK my_qsort (i1, 10, sizeof (int), float_compare); // constraint violation, A and A do not match
EXAMPLE 3 This callback is intended to be passed to
qsort
to sort an array offloat
values. Its operating parameter types are float pointers, but the signature ofqsort
requires them to be declared as pointers tovoid
:int float_compare (const void [[bind_type (float)]] * l , const void [[bind_type (float)]] * r) { const float * fl = l; const float * fr = r; return *fl < *fr ? -1 : *fl > *fr ? +1 : 0; }
EXAMPLE 4 This polymorphic function only intends to deduce the type
A
from itsin
parameter, and requiresout
to match its type exactly:void move_bytes (size_t sz , void const [[bind_var (A)]] * in , void [[bind_type (A)]] * out); int const a[10] = { ... }; int b[10]; move_bytes (sizeof a, &a, &b); // Note that the actual type of A is int, // not int const, so these match
This can be helpful for communicating intent more clearly. This function declares that the returned pointer type has the same concrete object type as the parameter, without deducing the binding from it, which (for instance) allows it to be cast immediately to another type or converted to
void *
:void [[bind_type (A)]] * do_step (size_t sz, void [[bind_var (A)]] * in); int i[10]; void * p = do_step (sizeof (int), &i[0]); // OK - the converted result // doesn't contribute to deduction so // the (nop) conversion to void* is OK // even though the actual type is int*
6.7.13.10 The
qual_var
andqual_with
attributesConstraints
The
qual_var
attribute shall be applied to the referenced type of a pointer in the parameter types of a function declaration or function pointer.The
qual_with
attribute shall be applied to the referenced type of a pointer in the return or parameter types of a function declaration or function pointer.The
qual_var
andqual_with
attributes shall have an argument clause, which shall have the form
(
identifier)
The
qual_with
attribute shall only appear in the declaration of a function that also contains aqual_var
attribute with the same argument clause.Semantics
The
qual_with
attribute applied to a pointed-to type indicates that it is intended to be treated as if it had the same qualification as the type that is converted to the parameter type with thequal_var
attribute with a matching identifier, within the arguments for a given function call expression.If the return type of the function contains the
qual_with
attribute, its return value shall not be used in a way that would discard any qualification existing before implicit conversion, from the type at the corresponding position in the type of the argument value assigned to a parameter containing thequal_var
attribute with a matching identifier.If two
qual_var
attributes appear in the function's parameter types, the qualification associated with the identifier is the combined qualification of the types at the corresponding positions in both argument types, before implicit conversion.The
__has_c_attribute
conditional inclusion expression (6.10.2) shall return the value ######L when givenqual_var
orqual_with
as the pp-tokens operand.Recommended practice
The
qual_var
andqual_type
arguments should be used to indicate that the result of a function call has the same qualification as the input(s) before implicit conversion. This is useful for functions that intended to return a pointer to an unqualified element if given an unqualified array, and to a qualified element if given a qualified array.An implementation is encouraged to emit a diagnostic if the identifier introduced by
qual_var
has also been declared in any name space in the current scope when the attribute is encountered.EXAMPLE 1 The canonical example is the search functions from the Standard Library, which may be used to search both
const
and non-const
arrays:// declared in string.h // if the input pointer is to const, the result is to const void [[qual_with (Q)]] * memchr (void const [[qual_var (Q)]] * s, int c, size_t n); char [[qual_with (Q)]] * strchr (char const [[qual_var (Q)]] * s, int c); char const a1[100] = ... char a2[100] = ... char const * pcc; char * pcm; pcc = strchr (a1, '6'); // valid: a1 is pointer to const, result goes to pointer to const pcm = strchr (a2, '6'); // valid: a2 is unqualified, result goes to unqualified pcc = strchr (a2, '6'); // valid: a2 is unqualified, result goes to pointer to const pcm = strchr (a1, '6'); // diagnostic: a1 is pointer to const, result goes to unqualified
EXAMPLE 2 Using
qual_var
on a pointer to an unqualified type doesn't make sense:char [[qual_with (Q)]] * get_elem (char [[qual_var (Q)]] *, size_t at); char const a1[100] = ... char a2[100] = ... char const * p1 = get_elem (a1, 6); // error: can't pass a1 to this parameter! char * p2 = get_elem (a2, 10); // valid, but the attributes don't add anything
EXAMPLE 3 When two parameters use the same identifier, the qualification associated with the identifier by
qual_with
is combined from both parameters:void [[qual_with (Q)]] * one_of (void const volatile [[qual_var (Q)]] * l , void const volatile [[qual_var (Q)]] * r); void const * pcv; void volatile * pvv; void const volatile * pcvv; pcv = one_of (pcv, pcv); // valid, only const qualified pcv = one_of (pcv, pvv); // diagnostic, lost volatile qualification pvv = one_of (pcv, pvv); // diagnostic, lost const qualification pcvv = one_of (pcv, pvv); // valid
Modify 6.3.2.3, paragraph 8:
A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If the return type or **any parameter type of the converted type is a pointer to possibly-qualified void, and the ** corresponding return type or parameter type of the type before conversion is a pointer to an identically-qualified object type, and the expression being converted is an identifier declared as having function type footnote), using the converted pointer to call the underlying **function has the same effect as a call through a pointer of the original type. Otherwise, ** if a converted pointer is used to call a function whose type is not compatible with the referenced type, the behavior is undefined.
footnote) as opposed to an identifier declared as having pointer to function type.
Modify 7.24.5.1 "The bsearch
generic function", changing the signature in the synopsis:
#include <stdlib.h> void [[qual_with (Q), bind_type (T)]] * bsearch (void const * key , void const [[qual_var (Q), bind_var (T)]] * base , size_t nmemb , size_t size , int (*compar)(void const [[bind_type (T)]] *, void const [[bind_type (T)]] *));
Replace "the bsearch
generic function" with "the bsearch
function" in paragraph 2 and
paragraph 4, and the clause title.
Leave paragraph 5 as-is for clarity.
Remove paragraph 6 "The external declaration ... is visible" and footnote 355.
Modify 7.24.5.2 "The qsort
function", changing the signature in the synopsis:
#include <stdlib.h> void qsort (void [[bind_var (T)]] * base , size_t nmemb , size_t size , int (*compar)(void const [[bind_type (T)]] *, void const [[bind_type (T)]] *));
Remove 7.26.5 "Search functions", paragraph 2 "The external declarations ...", paragraph 3, and footnote 365.
Modify 7.26.5.2 "The memchr
generic function", changing the signature in the synopsis:
#include <string.h> void [[qual_with (Q)]] * memchr (void const [[qual_var (Q)]] * s, int c, size_t n);
Replace "the memchr
generic function" with "the memchr
function" in paragraph 2 and
paragraph 3, and the clause title.
Modify 7.26.5.3 "The strchr
generic function", changing the signature in the synopsis:
#include <string.h> char [[qual_with (Q)]] * strchr (char const [[qual_var (Q)]] * s, int c);
Replace "the strchr
generic function" with "the strchr
function" in paragraph 2 and
paragraph 3, and the clause title.
Modify 7.26.5.5 "The strpbrk
generic function", changing the signature in the synopsis:
#include <string.h> char [[qual_with (Q)]] * strpbrk (char const [[qual_var (Q)]] * s1, const char * s2);
Replace "the strpbrk
generic function" with "the strpbrk
function" in paragraph 2 and
paragraph 3, and the clause title.
Modify 7.26.5.6 "The strrchr
generic function", changing the signature in the synopsis:
#include <string.h> char [[qual_with (Q)]] * strrchr (char const [[qual_var (Q)]] * s, int c);
Replace "the strrchr
generic function" with "the strrchr
function" in paragraph 2 and
paragraph 3, and the clause title.
Modify 7.26.5.8 "The strstr
generic function", changing the signature in the synopsis:
#include <string.h> char [[qual_with (Q)]] * strstr (char const [[qual_var (Q)]] * s1, const char * s2);
Replace "the strstr
generic function" with "the strstr
function" in paragraph 2 and
paragraph 3, and the clause title.
Remove 7.31.4.5 "Wide string search functions", paragraph 2 "The external declarations ...", paragraph 3, and footnote 413.
Modify 7.31.4.5.2 "The wcschr
generic function", changing the signature in the synopsis:
#include <wchar.h> wchar_t [[qual_with (Q)]] * wcschr (wchar_t const [[qual_var (Q)]] * s, wchar_t c);
Replace "the wcschr
generic function" with "the wcschr
function" in paragraph 2 and
paragraph 3, and the clause title.
Modify 7.31.4.5.4 "The wcspbrk
generic function", changing the signature in the synopsis:
#include <wchar.h> wchar_t [[qual_with (Q)]] * wcspbrk (wchar_t const [[qual_var (Q)]] * s1, const wchar_t *s2);
Replace "the wcspbrk
generic function" with "the wcspbrk
function" in paragraph 2 and
paragraph 3, and the clause title.
Modify 7.31.4.5.5 "The wcsrchr
generic function", changing the signature in the synopsis:
#include <wchar.h> wchar_t [[qual_with (Q)]] * wcsrchr (wchar_t const [[qual_var (Q)]] * s, wchar_t c);
Replace "the wcsrchr
generic function" with "the wcsrchr
function" in paragraph 2 and
paragraph 3, and the clause title.
Modify 7.31.4.5.7 "The wcsstr
generic function", changing the signature in the synopsis:
#include <wchar.h> wchar_t [[qual_with (Q)]] * wcsstr (wchar_t const [[qual_var (Q)]] * s1, const wchar_t *s2);
Replace "the wcsstr
generic function" with "the wcsstr
function" in paragraph 2 and
paragraph 3, and the clause title.
Modify 7.31.4.5.5 "The wmemchr
generic function", changing the signature in the synopsis:
#include <wchar.h> wchar_t [[qual_with (Q)]] * wmemchr (wchar_t const [[qual_var (Q)]] * s, wchar_t c, size_t n);
Replace "the wmemchr
generic function" with "the wmemchr
function" in paragraph 2 and
paragraph 3, and the clause title.
Would WG14 like to add something along the lines of the bind_type
, bind_var
, qual_var
and
qual_with
attributes specified in N3316 to C2y?
Would WG14 like to allow function designator expressions to be cast to more-generic function pointers, as specified in N3316, in C2y?
Would WG14 like to change the specification of the generic search and sort functions bsearch
,
qsort
, memchr
, strchr
, strpbrk
, strrchr
, strstr
, wcschr
, wcspbrk
, wcsrchr
,
wcsstr
and wmemchr
to include the bind_var
and qual_var
attributes instead of QVoid
et al.?
Would WG14 like to see something along the lines of bind_var
and bind_type
for structure
or container types?
C2y latest public draft
Parametric polymorphism
Towards type-checked polymorphism
Polymorphic Types
// basic array mapper
// we can use it with mis-typed arguments
typedef void (* Mutate) (void *, void *);
typedef void * (* Step) (void *);
void addOneInt (void * in, void * out) { *(int *)out = *(int *)in + 1; }
void addOneFloat (void * in, void * out) { *(float *)out = *(float *)in + 1.0f; }
void * step_float (void * p) { return (float *)p + 1; }
void * step_int (void * p) { return (int *)p + 1; }
int ia[10];
float fa[10];
void map (void * array_in, void * array_out, int size, Mutate mut, Step step) {
void * in = array_in;
void * out = array_out;
for (int i = 0; i < size; ++ i, in = step (in), out = step (out)) {
mut (in, out);
}
}
void incrArrays (void) {
map (ia, ia, 10, addOneInt, step_int);
map (fa, fa, 10, addOneFloat, step_float);
// oh no: this also compiles, because of void*
map (fa, fa, 10, addOneInt, step_float);
map (ia, fa, 10, addOneInt, step_float);
map (ia, fa, 10, addOneInt, step_float);
map (ia, ia, 10, addOneInt, step_float);
map (fa, fa, 10, addOneInt, step_int);
map (ia, ia, 10, addOneFloat, step_int);
map (ia, fa, 10, addOneFloat, step_int);
map (ia, ia, 10, addOneFloat, step_int);
map (ia, ia, 10, addOneFloat, step_int);
map (ia, ia, 10, addOneFloat, step_float);
}
// basic array mapper
// enhanced with type checking despite accepting arrays of any type - checks
// that the operand kind of the mapped function matches the array element type
// i.e. map :: ([T], T -> T) -> [T]
typedef void (* Mutate) (void *, void *);
typedef void * (* Step) (void *);
void addOneInt (void * in, void * out) { *(int *)out = *(int *)in + 1; }
void addOneFloat (void * in, void * out) { *(float *)out = *(float *)in + 1.0f; }
void * step_float (void * p) { return (float *)p + 1; }
void * step_int (void * p) { return (int *)p + 1; }
int ia[10];
float fa[10];
void map_impl (void * array_in, void * array_out, int size, Mutate mut, Step step) {
void * in = array_in;
void * out = array_out;
for (int i = 0; i < size; ++ i, in = step (in), out = step (out)) {
mut (in, out);
}
}
#define FunctionDescriptor(Type, Func) union { Func func; Type T; }
#define same_type(A, B) _Generic(1 ? (A) : (B) \
, void *: 0 \
, void const *: 0 \
, void volatile *: 0 \
, void const volatile *: 0 \
, default: 1)
#define check_same_type(A, B) _Static_assert (same_type (A, B), "types must match");
#define check_array_size
#define map(in, out, size, mut, step) do { \
check_same_type ((in), (out)); \
\
check_same_type ((in), &(mut).T); \
check_same_type ((in), &(step).T); \
\
map_impl ((in), (out), (size), (mut).func, (step).func); \
} while (0)
typedef FunctionDescriptor (int, Mutate) MutInt;
typedef FunctionDescriptor (int, Step) StepInt;
typedef FunctionDescriptor (float, Mutate) MutFloat;
typedef FunctionDescriptor (float, Step) StepFloat;
MutInt addOneInt_g;
StepInt stepInt_g;
MutFloat addOneFloat_g;
StepFloat stepFloat_g;
void incrArrays (void) {
map (ia, ia, 10, addOneInt_g, stepInt_g);
map (fa, fa, 10, addOneFloat_g, stepFloat_g);
// no longer compile!
// map (fa, fa, 10, addOneInt, step_float);
// map (ia, fa, 10, addOneInt, step_float);
// map (ia, fa, 10, addOneInt, step_float);
// map (ia, ia, 10, addOneInt, step_float);
// map (fa, fa, 10, addOneInt, step_int);
// map (ia, ia, 10, addOneFloat, step_int);
// map (ia, fa, 10, addOneFloat, step_int);
// map (ia, ia, 10, addOneFloat, step_int);
// map (ia, ia, 10, addOneFloat, step_int);
// map (ia, ia, 10, addOneFloat, step_float);
}
// basic array mapper
// enhanced with type checking despite accepting arrays of any type - checks
// that the operand kind of the mapped function matches the array element type
// i.e. map :: ([T], T -> T) -> [T]
#define Auto(T) void [[bind_var (T)]]
#define Void(T) void [[bind_type (T)]]
#define Mutate(T, N) void (* N) (Void (T) *, Void (T) *)
#define Step(T, N) Void (T) * (* N) (Void (T) *)
void addOneInt (Void (int) * in, Void (int) * out) { *(int *)out = *(int *)in + 1; }
void addOneFloat (Void (float) * in, Void (float) * out) { *(float *)out = *(float *)in + 1.0f; }
Void (float) * step_float (Void (float) * p) { return (float *)p + 1; }
Void (int) * step_int (Void (int) * p) { return (int *)p + 1; }
int ia[10];
float fa[10];
void map (Auto (A) * array_in, Auto (A) * array_out, int size, Mutate(A, mut), Step (A, step)) {
Void (A) * in = array_in;
Void (A) * out = array_out;
for (int i = 0; i < size; ++ i, in = step (in), out = step (out)) {
mut (in, out);
}
}
void incrArrays (void) {
map (ia, ia, 10, addOneInt, step_int);
map (fa, fa, 10, addOneFloat, step_float);
// no longer compile!
// map (fa, fa, 10, addOneInt, step_float);
// map (ia, fa, 10, addOneInt, step_float);
// map (ia, fa, 10, addOneInt, step_float);
// map (ia, ia, 10, addOneInt, step_float);
// map (fa, fa, 10, addOneInt, step_int);
// map (ia, ia, 10, addOneFloat, step_int);
// map (ia, fa, 10, addOneFloat, step_int);
// map (ia, ia, 10, addOneFloat, step_int);
// map (ia, ia, 10, addOneFloat, step_int);
// map (ia, ia, 10, addOneFloat, step_float);
}