[ub] Type punning to avoid copying (was: unions and undefined behavior)

Lawrence Crowl Lawrence at Crowl.org
Thu Jul 25 03:31:06 CEST 2013


On 7/24/13, Nevin Liber <nevin at eviloverlord.com> wrote:
> On 24 July 2013 09:20, Nevin Liber <nevin at eviloverlord.com> wrote:
>> On 24 July 2013 06:58, Gabriel Dos Reis <gdr at cs.tamu.edu> wrote:
>>> The "union hack" is treacherous and needs far more investigation
>>> than it appears.
>>
>> So treacherous that people have been managing to use it for the last 30
>> years or so?
>>
>>>  Are we going to mandate that the reinterpretation of
>>> the bits cannot possibly yield a trap representation?
>>
>> I wouldn't.  This is for non-portable machine-specific code.  If you blow
>> a trap representation or an alignment access, you are on your own.
>>
>> If I wanted to do this via copying, I'd just use Java.
>
> Assuming no alignment, packing, endian or malformed packet issues, here is
> an example of what people want to do:
>
> enum class Protocol : char
> {
>     UDP = 17,
>     // ...
> };
>
> struct IPHeader
> {
>     //...
>     Protocol protocol;
>     // ...
> };
>
> struct UDPHeader : IPHeader
> {
>     //...
>     uint16_t length;
>     //...
>     char     data[1];
> };
>
> union Header
> {
>     IPHeader  ip;
>     UDPHeader udp;
>     //...
> };
>
> // Deliver payload to f without copying
> void Payload(dynarray<char> const& packet,
>              std::function<void(char const*, char const*)> const& f)
> {
>     // Want to overlay Header on packet.data()
>     auto& header = *static_cast<Header const*>(
>         static_cast<void const*>(packet.data()));
>
>     switch (header.ip.protocol)
>     {
>         case Protocol::UDP:
>             f(&header.udp.data[0],
>               &header.udp.data[header.udp.length - 8]);
>             break;
>         //...
>     }
>
> }
>
> Q1:  How many places has undefined behavior been invoked in the above?

I do not see any because you are aliasing with char arrays, and so you are
pretty much covered on the big problems.
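
For reference, the character-type exemption in the aliasing rules is usually
read as: the storage of any object may be inspected through an lvalue of type
char or unsigned char. Here is a minimal sketch of that reading. The helper
name byte_sum is hypothetical and not from the example above, and the sketch
shows only the object-to-char direction.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (an illustration, not code from this thread):
// sums the bytes of an object's representation by viewing its storage
// through unsigned char, which the aliasing rules allow for any object.
template <class T>
unsigned byte_sum(T const& obj)
{
    auto const* p = reinterpret_cast<unsigned char const*>(&obj);
    unsigned sum = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
        sum += p[i];  // each read goes through an unsigned char lvalue
    return sum;
}
```

Calling byte_sum on a std::uint32_t holding 0x01020304 adds the bytes 1, 2, 3
and 4 in whatever order the machine stores them, so the result does not depend
on endianness.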

(Personally, I detest the fact that we have overloaded "characters" with "raw
memory", but that's not likely to get fixed any time soon.)

> Q2:  What is the correct way to write this code?

-- 
Lawrence Crowl

