[ub] Type punning to avoid copying

Howard Hinnant howard.hinnant at gmail.com
Thu Jul 25 19:40:36 CEST 2013


On Jul 25, 2013, at 11:45 AM, Gabriel Dos Reis <gdr at cs.tamu.edu> wrote:

> Howard Hinnant <howard.hinnant at gmail.com> writes:
> 
> | It was this SO question that started this thread:
> | 
> | http://stackoverflow.com/q/17789928/576911
> 
> OK, this version obviously breaks the rule and invokes undefined behavior.
> We won't go over that again.  But, let me point out that this particular
> type-based alias analysis in GCC has a long history and tradition -- and
> most certainly longer in high-performance compilers.  When it was
> implemented in GCC by Mark Mitchell (I think), we found that type-based
> alias analysis found a lot of broken (C++) code.  You can search the GCC
> mailing list archives for the long discussions, and also the web.  So
> it was moved under a dedicated -fstrict-aliasing flag, then after
> a while activated at -O2.  This is ancient history, by which I mean it
> happened *15* years ago at least, circa 1998.  Independently, people wanted a
> dedicated flag for activating the warning; that was introduced more than
> a decade ago, in 2002. 
> 
> | I'm curious: The accepted answer uses memcpy and the claim is that
> | this is a correct answer to the question.  That is it does not exhibit
> | undefined behavior.  My current understanding is that I agree with
> | this answer.  But I wanted to check here.  Do people here agree that:
> | 
> | http://stackoverflow.com/a/17790026/576911
> | 
> | does not break the aliasing rules, or otherwise invoke undefined behavior?
> 
> This version does not break aliasing rules.  The question of whether it
> invokes undefined behavior is less obvious.  The current rules guarantee
> that if you memcpy() to a place, and memcpy() the same bytes back, you
> get the original value and that is well defined.  The rules also say
> that if you don't initialize an object, but you do a read access you
> invoke undefined behavior unless the object is of character type or array
> thereof.  Now, there is an interesting twist here: int32_t admits no
> trap representation (just like characters), but I cannot find a rule
> that says that reading an int32_t object that isn't initialized is OK,
> or that says whether memcpy() into the object is initialization.  Note,
> the first memcpy() isn't from an int32_t.
> 
> If that analysis is correct, I would welcome a rule in the language that
> says that memcpy() into an object of a type that admits no trap
> representation is equivalent to initialization with value obtained from
> appropriate interpretation of the value representation bits of the
> "initializing byte" sequence.  I think this won't hinder existing
> program transformations since memcpy() is essentially understood as
> ending previous object lifetime and starting a new one.

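For reference, the round trip that the quoted text describes as well defined (memcpy() to a place, then memcpy() the same bytes back) can be sketched as follows.  This is only an illustration of that one guarantee; the helper name round_trip_through_bytes is hypothetical and does not appear in the SO question or answer.

```cpp
#include <cstring>

// Hypothetical helper, not from the thread: copy the bytes of a float out
// to an unsigned char buffer, then copy the same bytes back into a float.
float round_trip_through_bytes(float x)
{
    unsigned char bytes[sizeof(float)];
    std::memcpy(bytes, &x, sizeof(x));   // float -> bytes
    float y;
    std::memcpy(&y, bytes, sizeof(y));   // same bytes -> float
    return y;                            // holds the original value of x
}
```

Note that this differs from the SO answer in one respect: here the bytes copied into y came from a float, whereas in the answer the bytes copied into the int32_t came from a float.
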
Ok, thanks.  I'm copying the entire thing here for reference:

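#include <cassert>
#include <cstdint>
#include <cstring>
using std::int32_t;

float InverseSquareRoot2(float x)
{
    float xhalf = 0.5f*x;
    int32_t i;
    assert(sizeof(x) == sizeof(i));
    std::memcpy(&i, &x, sizeof(i));    // copy the bytes of x into i
    i = 0x5f3759df - (i>>1);           // initial estimate of 1/sqrt(x)
    std::memcpy(&x, &i, sizeof(i));    // copy the bytes of i back into x
    x = x*(1.5f - xhalf*x*x);          // one Newton-Raphson refinement
    return x;
}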

If we did that (make InverseSquareRoot2 implementation defined, well defined, whatever), would that also confer the same status to this rewrite of InverseSquareRoot2?

float InverseSquareRoot4(float x)
{
    union U
    {
        float as_float;
        int32_t as_int;
    };
    U u = {x};
    U tmp;
    float xhalf = 0.5f*x;
    std::memcpy(&tmp.as_int, &u.as_float, sizeof(U));
    std::memcpy(&u.as_int, &tmp.as_int, sizeof(U));
    u.as_int = 0x5f3759df - (u.as_int>>1);
    std::memcpy(&tmp.as_float, &u.as_int, sizeof(U));
    std::memcpy(&u.as_float, &tmp.as_float, sizeof(U));
    u.as_float = u.as_float*(1.5f - xhalf*u.as_float*u.as_float);
    return u.as_float;
}
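
Both versions above rely on the same copy-the-bytes step.  A minimal sketch of that step as a reusable helper, with the size check moved to compile time, might look like the following.  The names copy_bits and InverseSquareRootSketch are hypothetical (they are not from the SO answer), and the sketch assumes both types are trivially copyable.  Whether reading the memcpy()'d int32_t is fully well defined is exactly the question under discussion here, so this shows the technique and does not settle that question.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not from the thread: produce a To whose bytes are a
// copy of the bytes of `from`, instead of reading through a cast pointer.
// Assumes To and From are trivially copyable.
template <class To, class From>
To copy_bits(const From& from)
{
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// The same algorithm as InverseSquareRoot2, written with the helper.
float InverseSquareRootSketch(float x)
{
    float xhalf = 0.5f*x;
    std::int32_t i = copy_bits<std::int32_t>(x);
    i = 0x5f3759df - (i>>1);
    x = copy_bits<float>(i);
    return x*(1.5f - xhalf*x*x);
}
```

The static_assert replaces the runtime assert(sizeof(x) == sizeof(i)), so a size mismatch is rejected when the code is compiled instead of when it runs.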

Howard


