Document Number: P3248R1.
Date: 2024-06-16.
Reply to: Gonzalo Brito Gadeschi <gonzalob _at_ nvidia.com>.
Authors: Gonzalo Brito Gadeschi.
Audience: SG1, SG22, EWG, LEWG.
Require [u]intptr_t
Changelog
- R1:
  - Add "Header file inconsistency between C and C++" discussion to "Design" section.
  - Add context of C programming language efforts to require [u]intptr_t.
  - Add clarifications with respect to Memory Tagging.
  - Add C23 specification of [u]intptr_t.
  - Add impact analysis on conforming and non-conforming implementations.
- R0: initial draft.
Motivation
Proposals like P2835 and P3125 use [u]intptr_t as an integer type capable of holding a pointer value in their APIs [1]. However, [u]intptr_t being optional forces sub-optimal design choices, such as making APIs optional or introducing workarounds.
The potential absence of [u]intptr_t compromises the portability of high-level software, and attempts to address this introduce software engineering overhead and potential portability bugs, as seen in libvlc PR#1519.
This proposal advocates for requiring [u]intptr_t in C++ to ensure that all C++ code can rely on integer types capable of holding a pointer value.
Status quo
C Programming language semantics of [u]intptr_t
The ISO/IEC 9899:2023 Working Draft specifies [u]intptr_t semantics as follows:
7.22.1.4 Integer types capable of holding object pointers
The following type designates a signed integer type, other than a bit-precise integer type, with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:
intptr_t
The following type designates an unsigned integer type, other than a bit-precise integer type, with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:
uintptr_t
These types are optional.
Other sections of the specification provide additional operations that preserve [u]intptr_t values:
- memcpy
- I/O functions like fprintf/fscanf on [u]intptr_t.
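As an illustration of the two value-preserving operations listed above, the following is a minimal sketch, not text from the C standard. The helper names copy_through_bytes and copy_through_text are hypothetical, and snprintf/sscanf stand in for fprintf/fscanf so that the example needs no file. It only shows that the integer value survives both paths on an implementation that provides uintptr_t.

```cpp
#include <cassert>
#include <cinttypes>  // PRIuPTR, SCNuPTR
#include <cstddef>    // std::size_t
#include <cstdint>    // std::uintptr_t
#include <cstdio>     // std::snprintf, std::sscanf
#include <cstring>    // std::memcpy

// Hypothetical helper: copies a uintptr_t through a byte buffer with memcpy.
std::uintptr_t copy_through_bytes(std::uintptr_t in) {
  unsigned char bytes[sizeof in];
  std::memcpy(bytes, &in, sizeof in);
  std::uintptr_t out = 0;
  std::memcpy(&out, bytes, sizeof out);
  return out;
}

// Hypothetical helper: formats a uintptr_t as decimal text and parses it back
// using the <cinttypes> format macros for uintptr_t.
std::uintptr_t copy_through_text(std::uintptr_t in) {
  char text[64];
  const int n = std::snprintf(text, sizeof text, "%" PRIuPTR, in);
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof text);
  (void)n;  // silence unused-variable warnings when NDEBUG is defined
  std::uintptr_t out = 0;
  std::sscanf(text, "%" SCNuPTR, &out);
  return out;
}
```

Both helpers operate on the integer only; converting the result back to a pointer is governed by the conversion rules quoted in this section.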
ISO/IEC CD TS 6010 - A provenance-aware memory object model for C (N3005) explores extending these guarantees.
C++'s [expr.reinterpret.cast#5] brings C [u]intptr_t semantics into C++ as follows:
A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value ([basic.compound]); mappings between pointers and integers are otherwise implementation-defined.
C++ [cstdio.syn#1] imports fprintf/fscanf from C.
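The round-trip guarantee quoted above can be sketched as follows. This is an illustrative example under the assumption that the implementation provides std::uintptr_t; the names round_trip and read_through_round_trip are hypothetical and do not come from any proposal.

```cpp
#include <cassert>
#include <cstdint>  // std::uintptr_t

// Hypothetical helper: converts an object pointer to an integer of
// sufficient size and back to the same pointer type. Per
// [expr.reinterpret.cast], the result has the original pointer value.
template <class T>
T* round_trip(T* p) {
  const auto bits = reinterpret_cast<std::uintptr_t>(p);
  return reinterpret_cast<T*>(bits);
}

// Reads an object through a pointer that went through the integer round-trip.
int read_through_round_trip() {
  int value = 42;
  int* q = round_trip(&value);
  assert(q == &value);  // compares equal to the original pointer
  return *q;
}
```

Note that the integer is passed through unmodified; any other mapping between pointers and integers remains implementation-defined.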
Requiring [u]intptr_t in the C Programming Language
The C programming language proposal N2889 explored requiring [u]intptr_t. It was rejected for C23 but adopted into ISO/IEC CD TS 6010 - A provenance-aware memory object model for C (N3005) to enable C to gain experience with the proposal. There is consensus that this is the right approach, but there is not enough implementation experience yet.
Impact analysis
Impact on conforming implementations
A survey found ubiquitous support for [u]intptr_t in conforming C++ implementations (*):
- C++ Standard Library implementations assume that [u]intptr_t is available: libstdc++, libc++, and Microsoft STL.
- C++ compilers support [u]intptr_t on all targets, including those with non-standard pointer sizes: GCC, Clang, MSVC.
- C++ platform ABIs specify the size and alignment of pointers and the calling convention of integer types, fixing the ABI of [u]intptr_t. Extended integer types avoid breaking the ABI of intmax_t when introducing a wider [u]intptr_t (this used to be a problem, see N2889).
We did not find any conforming implementation that is inconsistent in C and C++ with respect to the availability of [u]intptr_t: all implementations found provide these types in the headers of both programming languages.
We did not find any conforming implementation that:
- would stop conforming if C++ were to require [u]intptr_t, or
- does not already provide [u]intptr_t.
Therefore, we conclude that C++ requiring [u]intptr_t:
- does not regress current implementation support, and
- does not require any implementation effort,
for any currently conforming implementation.
(*) Many C++ implementations are non-conforming in one way or another, but here we focus on pointers.
Impact on non-conforming implementations
Full support for [u]intptr_t cannot be expected on platforms that lack full support for pointers. All the non-conforming implementations found are non-conforming with respect to pointer support. For example, their I/O functions (fprintf/fscanf) or memcpy to unaligned addresses do not uphold the validity requirements for pointer round-trips (e.g. via %p).
We evaluate the impact on these implementations in terms of what "partial" support for [u]intptr_t can be provided and at what effort (e.g. at least to document which partial support, if any, is provided).
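We found that the following non-conforming platforms would not be impacted by C++ requiring [u]intptr_t:
- CHERI C/C++: provides [u]intptr_t, documenting limitations on its support. For more details, see, e.g., the CHERI C/C++ Programming Guide or the more recent: Zaliva et al., Formal Mechanised Semantics of CHERI C: Capabilities, Undefined Behaviour, and Provenance, ASPLOS '24.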
We found that the following non-conforming platforms may be impacted by C++ requiring [u]intptr_t:
- IBM i (see also IBM AS/400): uses PowerPC AS Tagged Memory Extensions. Its ILE C++ compiler already documents standards compliance limitations, including the lack of [u]intptr_t (even though these types are currently optional). Whether it can implement [u]intptr_t is to be determined, but if it can, whether it does so may depend on other factors like customer demand.
- Elbrus has memory tagging capabilities: in protected mode, pointers are 128-bit wide and include a memory address, an object size, and an offset, but [u]intptr_t is only 64-bit wide and does not support ptr2int2ptr round-trips. CMake and libfmt support its compiler, and the latter employs a fallback in case [u]intptr_t is not available. Whether it can implement [u]intptr_t is to be determined, and if it can, whether it does so may depend on other factors.
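To make the kind of fallback mentioned above concrete, the following is a hypothetical sketch of a portability shim. It is not libfmt's code, and the names pointer_bits and has_uintptr are assumptions made for illustration. It relies on the rule that UINTPTR_MAX is defined if and only if the implementation defines uintptr_t.

```cpp
#include <cstdint>      // std::uintptr_t, UINTPTR_MAX (both currently optional)
#include <type_traits>  // std::is_unsigned_v

// Hypothetical shim: pick an unsigned integer type for pointer bits.
#if defined(UINTPTR_MAX)
using pointer_bits = std::uintptr_t;
inline constexpr bool has_uintptr = true;
#else
// Fallback guess: the widest standard unsigned integer type. Nothing
// guarantees that a pointer round-trips through it on such a platform.
using pointer_bits = unsigned long long;
inline constexpr bool has_uintptr = false;
#endif

static_assert(std::is_unsigned_v<pointer_bits>,
              "the shim always selects an unsigned integer type");
```

Every library that wants to be portable today has to carry a shim of this shape and decide what to do when has_uintptr is false; requiring [u]intptr_t would make the #else branch unnecessary.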
Header file inconsistency between C and C++
On a platform on which the C implementation does not provide these types (we only found non-conforming implementations of this kind), the <stdint.h> header does not declare them in the C programming language (e.g. when processed by a C compiler).
Per [support.c.header.other.1], in C++ <stdint.h> has the same content as <cstdint>. On a platform on which [u]intptr_t is not available to C via <stdint.h>, it is required to be available to C++ via both <stdint.h> and <cstdint>:
// C: intptr_t is not available
#include <stdint.h>
intptr_t val;
// C++ via <stdint.h>: intptr_t is required
#include <stdint.h>
intptr_t val;
// C++ via <cstdint>: std::intptr_t is required
#include <cstdint>
std::intptr_t val;
This disconnect on platforms on which the C implementation does not provide [u]intptr_t may impact developer productivity on those platforms.
Design
Design alternatives:
1. C++ requires [u]intptr_t.
2. C++ adds new integer types, different from [u]intptr_t, capable of holding a pointer value.
3. Do nothing.
This proposal advocates for Option 1, i.e., for C++ to require [u]intptr_t, because:
- Pre-existing code: All implementations surveyed provide these types on all platforms. This has led to a large corpus of pre-existing code using [u]intptr_t. Requiring [u]intptr_t makes this code portable to all platforms C++ supports. Inventing new C++ types would make this code non-idiomatic and cause significant churn in all ecosystems for little added value.
- Compatibility with C: By requiring these types in C++ with the same semantics as C, we ensure C++ remains forward compatible with C eventually requiring these types: if that were to happen, C++ would get exactly the same semantics and ABI. This is particularly important with respect to the pointer provenance rules as specified in TS 6010. Adding new C++ types that are not available in C would reduce C++'s compatibility with C. There is, however, a nuanced header file inconsistency between C and C++ that is covered in the "Header file inconsistency between C and C++" section.
- ABI: Platforms whose ABI specifies intmax_t to be smaller than the platform's pointer size are allowed to provide wider [u]intptr_t integer types since C23 and C++23 due to extended integer type support.
- Cost: There is a cost to doing nothing. Significant time was spent on atomic_ref::address to find a sub-optimal solution when the right solution everyone agrees on is uintptr_t.
Usage Guideline
[u]intptr_t is well suited for C++ language or C++ Standard Library APIs that need an integer type capable of holding a pointer value, i.e., an integer type with a lossless conversion from/to pointer.
Some features or APIs may only need an integer type capable of holding a pointer address. C and C++ do not currently provide an integer type suited for this use case, but some implementations do provide one as an extension on platforms where this distinction is crucial, e.g., CHERI C/C++ implementations provide ptraddr_t in <stddef.h> (the CHERI C/C++ Programming Guide is currently outdated and mentions vaddr_t instead of ptraddr_t).
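As an example of a use that only needs the pointer address, consider an alignment check. The sketch below is illustrative: the name is_aligned is hypothetical, and it assumes that the implementation-defined pointer-to-integer mapping yields the numeric address, as on mainstream flat-address-space platforms. It is written with std::uintptr_t because standard C++ has no address-only integer type.

```cpp
#include <cassert>
#include <cstddef>  // std::size_t
#include <cstdint>  // std::uintptr_t

// Hypothetical helper: returns true if the address of p is a multiple of
// `alignment`, which must be a power of two. Only the numeric address is
// inspected; the integer is never converted back to a pointer, so the
// lossless round-trip property of uintptr_t is not needed here.
inline bool is_aligned(const void* p, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

On a platform that distinguishes pointer values from pointer addresses, an address-only type such as CHERI's ptraddr_t would express the intent of this function more precisely.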
Wording changes
Modify [cstdint.syn]:
- The header <cstdint> supplies integer types having specified widths, and macros that specify limits of integer types.
// all freestanding
namespace std {
using int8_t = signed integer type; // optional
using int16_t = signed integer type; // optional
using int32_t = signed integer type; // optional
using int64_t = signed integer type; // optional
using intN_t = see below; // optional
using int_fast8_t = signed integer type;
using int_fast16_t = signed integer type;
using int_fast32_t = signed integer type;
using int_fast64_t = signed integer type;
using int_fastN_t = see below; // optional
using int_least8_t = signed integer type;
using int_least16_t = signed integer type;
using int_least32_t = signed integer type;
using int_least64_t = signed integer type;
using int_leastN_t = see below; // optional
using intmax_t = signed integer type;
using intptr_t = signed integer type; // optional
using uint8_t = unsigned integer type; // optional
using uint16_t = unsigned integer type; // optional
using uint32_t = unsigned integer type; // optional
using uint64_t = unsigned integer type; // optional
using uintN_t = see below; // optional
using uint_fast8_t = unsigned integer type;
using uint_fast16_t = unsigned integer type;
using uint_fast32_t = unsigned integer type;
using uint_fast64_t = unsigned integer type;
using uint_fastN_t = see below; // optional
using uint_least8_t = unsigned integer type;
using uint_least16_t = unsigned integer type;
using uint_least32_t = unsigned integer type;
using uint_least64_t = unsigned integer type;
using uint_leastN_t = see below; // optional
using uintmax_t = unsigned integer type;
using uintptr_t = unsigned integer type; // optional
}
#define INTN_MIN see below
#define INTN_MAX see below
#define UINTN_MAX see below
#define INT_FASTN_MIN see below
#define INT_FASTN_MAX see below
#define UINT_FASTN_MAX see below
#define INT_LEASTN_MIN see below
#define INT_LEASTN_MAX see below
#define UINT_LEASTN_MAX see below
#define INTMAX_MIN see below
#define INTMAX_MAX see below
#define UINTMAX_MAX see below
#define INTPTR_MIN see below // optional
#define INTPTR_MAX see below // optional
#define UINTPTR_MAX see below // optional
#define PTRDIFF_MIN see below
#define PTRDIFF_MAX see below
#define SIZE_MAX see below
#define SIG_ATOMIC_MIN see below
#define SIG_ATOMIC_MAX see below
#define WCHAR_MIN see below
#define WCHAR_MAX see below
#define WINT_MIN see below
#define WINT_MAX see below
#define INTN_C(value) see below
#define UINTN_C(value) see below
#define INTMAX_C(value) see below
#define UINTMAX_C(value) see below
- The header defines all types and macros the same as the C standard library header <stdint.h> except that the types intptr_t and uintptr_t and the macros INTPTR_MIN, INTPTR_MAX, and UINTPTR_MAX are always defined and are not optional. See also: ISO/IEC 9899:2018, 7.20.
- All types that use the placeholder N are optional when N is not 8, 16, 32, or 64. The exact-width types intN_t and uintN_t for N = 8, 16, 32, and 64 are also optional; however, if an implementation defines integer types with the corresponding width and no padding bits, it defines the corresponding typedef-names. Each of the macros listed in this subclause is defined if and only if the implementation defines the corresponding typedef-name.
[Note 1: The macros INTN_C and UINTN_C correspond to the typedef-names int_leastN_t and uint_leastN_t, respectively. — end note]
Acknowledgements
Jens Gustedt for their help with coordinating with WG14, TS 6010, N2889, and establishing a contact with the IBM AS/400 team. Nikolaos Strimpas and Alibek Omarov for their help in documenting the impact on Elbrus. Aaron Ballman, Jessica Clarke, Jonathan Wakely, Ville Voutilainen, and many others, for feedback that resulted in substantial improvements to the proposal.
[1] This does not imply that these proposals make correct use of these types; the Usage Guideline section covers that.