Document Number: | DXXXXR1 |
Date: | 2020-06-15 |
Author: | Anthony Williams, Just Software Solutions Ltd |
Audience: | EWG |
This paper relates to P1726: Pointer lifetime-end zap and provenance, too. My argument is that in many ways pointers are already treated like bags of bits by the language, so we should be consistent, and treat them as such throughout. A consequence of this is that there can be no "lifetime-end pointer zap".
This paper provides a series of examples. I believe all these examples are clearly defined by the standard because pointers are scalar types, and therefore trivially copyable types.
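The two properties this argument rests on can be checked at compile time. The following is a minimal sketch using the standard `<type_traits>` header; the struct `X` and the function name `pointers_are_trivially_copyable_scalars` are illustrative names chosen here, not part of the paper.

```cpp
#include <type_traits>

struct X { int i; };  // illustrative type, mirroring the struct used in later examples

// Object pointers are scalar types ...
static_assert(std::is_scalar<int *>::value, "int* is a scalar type");
static_assert(std::is_scalar<X *>::value, "X* is a scalar type");

// ... and scalar types are trivially copyable types.
static_assert(std::is_trivially_copyable<int *>::value,
              "int* is trivially copyable");
static_assert(std::is_trivially_copyable<X *>::value,
              "X* is trivially copyable");

// Hypothetical helper so the same facts can also be queried as a value.
constexpr bool pointers_are_trivially_copyable_scalars() {
    return std::is_scalar<void *>::value &&
           std::is_trivially_copyable<void *>::value;
}
```

If any of these assertions failed, the byte-wise copies in the examples that follow would lose their footing.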
As shown by these examples, the "pointer zap" from the final sentence of [basic.stc] p4 ("Any other use of an invalid pointer value has implementation-defined behavior"), and especially note 31 ("Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault"), is clearly incompatible with pointers being trivially copyable types from [basic.types] p3. Consequently we should either strike that permission from the standard and require that invalid pointer values are still copyable and comparable, or we should decide that pointers are not trivially copyable after all, which would have far-reaching consequences.
All standard references are to the C++ working draft from the 2020-04 mailing: N4861.
All these examples have been tested with gcc, clang and MSVC. Links to compiler explorer are provided for each example.
Strike the final sentence and note 31 from [basic.stc] p4:
When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of that region of storage become invalid pointer values (6.8.2). Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. [struck: Any other use of an invalid pointer value has implementation-defined behavior. (footnote 31)]
Add a new sentence to the end of [basic.stc] p4:
Copying and assigning invalid pointer values preserves the value representation. Comparisons involving an invalid pointer value return an unspecified result. An invalid pointer value will become a valid pointer value if a region of storage with dynamic storage duration is allocated and the value representation of a pointer to the newly allocated storage cast to the same pointer type as the erstwhile-invalid pointer value is the same as the value representation of the erstwhile-invalid pointer value.
memcpy on a pointer

    #include <assert.h>
    #include <string.h>

    int main() {
        int *x = new int(42);
        int *y = nullptr;
        memcpy(&y, &x, sizeof(x));
        assert(x == y);
        assert(*y == 42);
    }
Here, we use memcpy to copy the bits of a pointer from one pointer to another. The second pointer is now valid and points to the same thing the original did because pointers are trivially copyable ([basic.types] p3).
memcpy via a buffer

    #include <assert.h>
    #include <string.h>

    int main() {
        int *x = new int(42);
        int *y = nullptr;
        unsigned char buffer[sizeof(x)];
        memcpy(buffer, &x, sizeof(x));
        memcpy(&y, buffer, sizeof(x));
        assert(x == y);
        assert(*y == 42);
    }
Here, we use memcpy to copy the bits of a pointer from one pointer to a buffer, and then from that buffer to another pointer. The second pointer is now valid and points to the same thing the original did because pointers are trivially copyable ([basic.types] p2).
reinterpret_cast to an integer

    #include <assert.h>
    #include <string.h>
    #include <stdint.h>

    int main() {
        int *x = new int(42);
        int *y = nullptr;
        uintptr_t temp = reinterpret_cast<uintptr_t>(x);
        y = reinterpret_cast<int *>(temp);
        assert(x == y);
        assert(*y == 42);
    }
Here we rely on the provision of [expr.reinterpret.cast] p5 that a pointer may be cast to an integer and back and retain its value.
memcpy with modifications

    #include <assert.h>
    #include <string.h>

    int main() {
        int *x = new int(42);
        int *y = nullptr;
        unsigned char buffer[sizeof(x)];
        memcpy(buffer, &x, sizeof(x));
        for(auto &c : buffer) {
            c ^= 0x55;
        }
        for(auto &c : buffer) {
            c ^= 0x55;
        }
        memcpy(&y, buffer, sizeof(x));
        assert(x == y);
        assert(*y == 42);
    }
Now we take example 2 a step further: after the first memcpy we perform a reversible modification on the bits in the buffer, then reverse that modification before the second memcpy copies the buffer back. Since the bits in the buffer now hold their original values, we can copy them to a pointer, which will have the same value, because pointers are trivially copyable.
memcpy and write to file

    #include <assert.h>
    #include <string.h>
    #include <stdio.h>

    int main() {
        int *x = new int(42);
        int *y = nullptr;
        unsigned char buffer[sizeof(x)];
        memcpy(buffer, &x, sizeof(x));

        auto file = fopen("tempfile", "wb");
        auto written = fwrite(buffer, 1, sizeof(buffer), file);
        assert(written == sizeof(buffer));
        fclose(file);

        memset(buffer, 0, sizeof(buffer));

        file = fopen("tempfile", "rb");
        auto read = fread(buffer, 1, sizeof(buffer), file);
        assert(read == sizeof(buffer));
        fclose(file);

        memcpy(&y, buffer, sizeof(x));
        assert(x == y);
        assert(*y == 42);
    }
This time we are copying the pointer to a buffer, writing our bytes to a file, clearing the buffer and reading the bytes back from the file, then copying the bytes back to the pointer. If our file is unmodified then the buffer will have the same contents after reading as it did before writing, so copying the buffer back to the pointer yields the same value, and the pointer is again valid and points to the same object.
    #include <assert.h>
    #include <string.h>
    #include <stdio.h>
    #include <new>

    struct X {
        int i;
    };

    int main() {
        X *x = new X{42};
        X *y = nullptr;
        unsigned char buffer[sizeof(x)];
        memcpy(buffer, &x, sizeof(x));
        x->~X();
        new(x) X{99};
        memcpy(&y, buffer, sizeof(x));
        assert(x == y);
        assert(y->i == 99);
        assert(x->i == 99);
    }
This time, we destroy the pointed-to object and recreate a new object with a new value at the same memory location.
The pointer x still holds the same bit pattern, and still points to a valid object, so both the original pointer x and the newly constructed copy y point to the new object, and all is well by [basic.life] p8.
delete and new the object

    #include <assert.h>
    #include <string.h>
    #include <stdio.h>
    #include <new>

    struct X {
        int i;
    };

    int main() {
        X *x = new X{42};
        X *y = nullptr;
        unsigned char buffer[sizeof(x)];
        memcpy(buffer, &x, sizeof(x));
        delete x;
        y = new X{99};
        unsigned char buffer2[sizeof(x)];
        memcpy(buffer2, &y, sizeof(x));
        if(memcmp(buffer, buffer2, sizeof(x))) {
            printf("Different address\n");
            return 0;
        }
        memcpy(&x, buffer2, sizeof(x));
        assert(x == y);
        assert(y->i == 99);
        assert(x->i == 99);
    }
This time, we destroy the pointed-to object with delete and create a new object holding a new value with new.
We then copy the new pointer into a buffer and compare the buffers. If the buffers are different, then the pointers are clearly different and our test doesn't work, so we stop.
If the buffers are the same, then we copy the new buffer (which is a copy of our new pointer) into the old pointer.
x is now a copy of the raw bits of our new pointer, so everything must work.
delete and new the object again

    #include <assert.h>
    #include <string.h>
    #include <stdio.h>
    #include <new>

    struct X {
        int i;
    };

    int main() {
        X *x = new X{42};
        X *y = nullptr;
        unsigned char buffer[sizeof(x)];
        memcpy(buffer, &x, sizeof(x));
        delete x;
        y = new X{99};
        unsigned char buffer2[sizeof(x)];
        memcpy(buffer2, &y, sizeof(x));
        if(memcmp(buffer, buffer2, sizeof(x))) {
            printf("Different address\n");
            return 0;
        }
        assert(x == y);
        assert(y->i == 99);
        assert(x->i == 99);
    }
This is the same as example 7, except we don't copy the raw bits from the new buffer over our old pointer.
We know that the bits of x and the bits of y are the same because we compared them with memcmp. Since the pointers are trivially copyable, the value of the pointer is determined by the value representation, which is the set of bits of the object representation. Since we know the object representation is the same, the value representation must be the same, so the pointers must have the same value.
Since the pointers must have the same value, x must be equal to y, and must point to the same object, and all is well.
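The premise of this argument, that two pointers have identical object representations, can be expressed as a small byte-wise comparison. The sketch below is illustrative only: `same_object_representation` and `representation_demo` are hypothetical names introduced here, and the demo deliberately uses only valid pointers so that it does not depend on the question the paper is raising.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical helper (not from the paper): true if a and b have
// byte-for-byte identical object representations.
template <typename T>
bool same_object_representation(const T &a, const T &b) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte-wise comparison only makes sense for trivially copyable types");
    return std::memcmp(&a, &b, sizeof(T)) == 0;
}

bool representation_demo() {
    int a = 1;
    int b = 2;
    int *p = &a;
    int *q = nullptr;
    std::memcpy(&q, &p, sizeof(p));  // q receives exactly the bytes of p
    int *r = &b;                     // points to a different object

    assert(q == p);  // same bytes, so same value
    return same_object_representation(p, q) &&
           !same_object_representation(p, r);
}
```

The demo returns true when the byte-wise copy compares equal and the pointer to a distinct object compares different, which is the same test the example performs with memcmp on its two buffers.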
std::atomic to hold the pointer

    #include <assert.h>
    #include <string.h>
    #include <stdio.h>
    #include <new>
    #include <atomic>

    struct X {
        int i;
    };

    int main() {
        X *x = new X{42};
        X *y = nullptr;
        std::atomic<X *> p(x);
        delete x;
        y = new X{99};
        X *temp = y;
        if(!p.compare_exchange_strong(temp, y)) {
            printf("Different address\n");
            return 0;
        }
        assert(x == y);
        assert(y->i == 99);
        assert(x->i == 99);
    }
This is the same as example 8, except instead of using memcmp to determine the equivalence, we use compare_exchange_strong, which compares the pointers as if with memcmp.
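For readers less familiar with the compare-and-exchange interface, the following minimal sketch shows its two outcomes using only valid pointers. The function name `cas_demo` and the local variables are illustrative and not taken from the paper.

```cpp
#include <atomic>
#include <cassert>

bool cas_demo() {
    int a = 1;
    int b = 2;
    std::atomic<int *> p(&a);

    // Mismatch: the stored pointer is &a, but we expect &b.
    int *expected = &b;
    bool first = p.compare_exchange_strong(expected, &b);
    // On failure the stored value is written back into expected.
    assert(!first);
    assert(expected == &a);

    // Match: expected now holds the stored pointer, so the exchange succeeds
    // and &b is stored.
    bool second = p.compare_exchange_strong(expected, &b);
    assert(second);

    return !first && second && p.load() == &b;
}
```

In the paper's example the success branch is the interesting one: reaching it means the atomic object's stored bits matched those of the expected pointer.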
std::atomic to hold the pointer, comparison the other way round

    #include <assert.h>
    #include <string.h>
    #include <stdio.h>
    #include <new>
    #include <atomic>

    struct X {
        int i;
    };

    int main() {
        X *x = new X{42};
        X *y = nullptr;
        delete x;
        y = new X{99};
        std::atomic<X *> p(y);
        if(!p.compare_exchange_strong(x, nullptr)) {
            printf("Different address\n");
            return 0;
        }
        assert(x == y);
        assert(y->i == 99);
        assert(x->i == 99);
    }
This is the same as example 9, except that rather than comparing the temp value copied from y with our stored pointer, we store the new value in the atomic, and compare it to our original x. This still works because the compare_exchange_strong compares as if using memcmp, so we are comparing the object representation of x against the object representation of the copy of y stored in p: if the pointers have the same object representation then they have the same value representation, so must be the same and point to the same object.