[ub] Type punning to avoid copying

Gabriel Dos Reis gdr at axiomatics.org
Mon Sep 9 23:54:48 CEST 2013


Richard Smith <richardsmith at google.com> writes:

| On Mon, Sep 9, 2013 at 10:44 AM, Gabriel Dos Reis <gdr at axiomatics.org> wrote:
| 
|     Jeffrey Yasskin <jyasskin at google.com> writes:
| 
|     | On Sun, Sep 8, 2013 at 9:48 PM, Gabriel Dos Reis <gdr at axiomatics.org> wrote:
|     | > |     In fact, it is not at all clear that either programmers or compilers
|     | > |     want a notion of 'object resuscitation'.
|     | > |
|     | > |
|     | > | I don't know what you mean by 'resuscitation' here. Can you elaborate?
|     | >
|     | > Resuscitation here refers to the idea that
|     | >
|     | >   "there exists a set of times when objects' lifetimes begin and end,
|     | >    and that set gives the program defined behavior, then the program has
|     | >    defined behavior"
|     | >
|     | > If we don't have a uniquely defined point in time where the lifetime of
|     | > an object starts, that means that either it never started or it started
|     | > and ended multiple times.
|     |
|     | I think you've misread Richard here. He didn't say "a set of times
| 
|     I am not sure but only Richard knows whether he meant to talk about a
|     specific object's lifetime or the lifetimes of a collection of objects.
| 
|     If the latter, that is a very surprising way to tackle the issue since
|     it does not appear to shed more light than the conversation about the
|     lifetime of a given object.
| 
| 
| I did mean the latter.
| 
| One problem we face is that, for a given execution of a given program, the
| lifetime of a given object is not uniquely determined.

I think this is where we differ, at a very fundamental level.
Truth be told, this is the first time I have seen this suggestion.

In my discussions with Bjarne, I never got the impression that this is
what he conceived or wanted.  But, he will surely chime in...

| And that's not a bug,
| it's a feature, and people rely on it. Consider this contrived example:
| 
| struct A { int x; int y; };
| struct B { double d; };
| static_assert(sizeof(A) == sizeof(B), "");
| 
| int f() {
|   alignas(A, B) char my_buffer[sizeof(A)];
|   memcpy(my_buffer, data_from_network, sizeof(A)); // #1
|   if (getch() == 'x') // #2
|     return x + reinterpret_cast<A*>(my_buffer)->x; // #3
|   else
|     return x + reinterpret_cast<B*>(my_buffer)->d; // #4
| }
| 
| [Under my interpretation of the current rules, and assuming the static_assert
| does not fire:]
| 
| In the case where the 'if' condition is true, an object of type A at address
| &my_buffer must have had its lifetime begin before line #1, and last until
| after line #3. And conversely, when the 'if' condition is false, an object of
| type B at address &my_buffer must have had its lifetime begin before line #1,
| and last until after line #4.
| 
| Both of these cases appear to have defined behavior, assuming the memcpy copies
| in a valid object representation for the chosen type: storage of the
| appropriate size and alignment was obtained for an object with trivial
| initialization ([basic.life] (3.8)p1), so the object's lifetime began. But by
| the time we reach line #2, we *do not know* whether there is an object of type
| A or an object of type B in my_buffer.
| 
| 
|     | when an object's (singular, possessive) lifetimes begin and end"; he
|     | said "a set of times when objects' (plural, possessive) lifetimes
|     | begin and end". I read that as a mapping of an object to the point in
|     | time when its lifetime begins and the point in time when its lifetime
|     | ends.
| 
|     Here is the original paragraph:
| 
|     # More generally, my view of how the lifetime rules in [basic.life]p1 are
|     # intended to work is:
|     #  * If there exists a set of times when objects' lifetimes begin and end,
|     #    and that set gives the program defined behavior, then the program has
|     #    defined behavior
|     #  * Otherwise, the program has undefined behavior
|     #
|     # (In effect, the programmer chooses when lifetimes begin and end, and does
|     # not need to write this intent in the source code.) Different choices of
|     # lifetime beginning/end can only change whether the program has defined
|     # behavior, and cannot imbue it with two different defined behaviors, so
|     # this approach seems to be coherent, and (I think) captures what people
|     # expect.
| 
| 
| Right. Note that there can be multiple different mappings of object to
| lifetime, all of which give the program defined behavior (for instance, we
| don't know exactly when the object's lifetime starts in the above example
| code). We have a fundamental coherence property: if there are multiple
| different such mappings with defined behavior, all of them must have the *same*
| defined behavior.

This coherence problem exists only if you believe that the C++ object
model was designed to have such multiple mappings for object lifetime.  This
explains why we have been talking past each other until now :-)

-- Gaby
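
For reference alongside the example quoted above, here is a minimal sketch
(not taken from the thread) of the copying alternative that the subject line
alludes to. Instead of reinterpreting my_buffer as an A or a B, the bytes are
copied into a declared object of the chosen type, so there is never any doubt
about which object exists or when its lifetime began. The names read_as and
pick are hypothetical, and the static_assert assumes a platform where two ints
occupy the same space as one double, as the quoted example also does.

```cpp
#include <cstring>
#include <type_traits>

struct A { int x; int y; };
struct B { double d; };
static_assert(sizeof(A) == sizeof(B), "sketch assumes equal sizes");

// Hypothetical helper: copy sizeof(T) bytes into a real, declared T.
// The lifetime of `value` begins at its declaration, independent of
// whatever type the source bytes were originally written as.
template <typename T>
T read_as(const unsigned char* bytes) {
  static_assert(std::is_trivially_copyable<T>::value,
                "memcpy is only valid for trivially copyable types");
  T value;
  std::memcpy(&value, bytes, sizeof(T));
  return value;
}

// Hypothetical stand-in for lines #2 to #4 of the quoted example:
// the caller decides which interpretation is wanted.
double pick(const unsigned char* bytes, bool want_a) {
  if (want_a)
    return read_as<A>(bytes).x;  // corresponds to #3
  return read_as<B>(bytes).d;    // corresponds to #4
}
```

This costs a copy, which is exactly what the reinterpret_cast version tries to
avoid; whether compilers elide that copy in practice is a separate question
from the lifetime rules debated here.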

