Jens Maurer
2001-09-13

Issue 273: POD classes and operator&()

The Issue

The following is a quote from issue 273, as submitted by Andrei Iltchenko:
I think that the definition of a POD class in the current version of the Standard is overly permissive in that it allows for POD classes for which a user-defined operator function operator& may be defined. Given that the idea behind POD classes was to achieve compatibility with C structs and unions, this makes 'Plain old' structs and unions behave not quite as one would expect them to.

[...]

The fact that the definition of a POD class allows for POD classes for which a user-defined operator& is defined, may also present major obstacles to implementers of the offsetof macro from <cstddef>

POD

These sections in the standard define the semantics of "POD":

Aggregate

These sections in the standard define the semantics of "aggregate":

Analysis

Nothing in the current wording of the standard makes a struct or union with a user-defined operator& a non-POD type. However, that is certainly surprising. At least one example in the standard (3.9p2) assumes that operator& has its default meaning. The normative text cautiously only talks about "underlying storage" (3.9p2) and "pointer to T" (3.9p3). It is not directly specified in the normative text how that pointer may be obtained when operator& is overloaded.
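The observation that an overloaded operator& does not affect POD-ness can be checked on a present-day compiler. The sketch below assumes the C++11 type traits std::is_trivially_copyable and std::is_standard_layout, which postdate this paper and are used here only as stand-ins for the two POD properties that memcpy and offsetof rely on. The helper function name is hypothetical.

```cpp
#include <type_traits>

// Same shape as the struct used throughout this paper: one data member
// plus an overloaded unary operator&.
struct A
{
  int x;
  void operator&() const {}
};

// Assumption: these C++11 traits approximate the POD requirements that
// matter here (byte-wise copyability and C-compatible layout).
static_assert(std::is_trivially_copyable<A>::value,
              "operator& does not make A non-trivially-copyable");
static_assert(std::is_standard_layout<A>::value,
              "operator& does not change the layout category of A");

// Hypothetical helper so the result can also be inspected at run time.
bool overloaded_address_of_keeps_pod_properties()
{
  return std::is_trivially_copyable<A>::value &&
         std::is_standard_layout<A>::value;
}
```

Both static assertions hold because an operator function is an ordinary non-virtual member function and contributes nothing to the object representation.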

Older source code inherited from C which relies on the semantics laid out in 3.9p2 will not have operator& overloaded, since C does not permit operator overloading.

The following sections demonstrate how the two problematic areas of PODs with an overloaded operator& can be resolved: obtaining the address of an object, and implementing "offsetof".

Getting the address of an object with overloaded operator&

To retrieve the address of the storage of an object whose class overloads operator&, the following technique can be applied:
#include <cstring>

struct A
{
  int x;
  void operator&() const;
};

int main()
{
  A a;
  unsigned char space[sizeof(A)];
  std::memcpy(space, &reinterpret_cast<unsigned char&>(static_cast<A&>(a)), sizeof(A));
}
Note that obtaining the address of the storage of some object a requires that a be an lvalue; it can therefore legally be converted to a reference.
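The technique can be wrapped in a small helper and exercised in both directions, copying the bytes out and back in again. This is only a sketch: the names storage_of and save_modify_restore are hypothetical and do not appear in the standard or in the issue.

```cpp
#include <cstring>

struct A
{
  int x;
  void operator&() const {}   // overloaded: &a no longer yields an A*
};

// Hypothetical helper: returns the address of the first byte of the
// storage of obj. The built-in & is applied to an unsigned char lvalue,
// so the overloaded operator& of T is never considered.
template <class T>
unsigned char* storage_of(T& obj)
{
  return &reinterpret_cast<unsigned char&>(obj);
}

// Saves the bytes of an A, modifies the object, restores the bytes,
// and returns the value of the member afterwards.
int save_modify_restore()
{
  A a;
  a.x = 42;

  unsigned char space[sizeof(A)];
  std::memcpy(space, storage_of(a), sizeof(A));   // copy out

  a.x = 7;                                        // modify in between

  std::memcpy(storage_of(a), space, sizeof(A));   // copy back
  return a.x;                                     // original value again
}
```

Calling save_modify_restore() yields 42, the value stored before the object was modified.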

offsetof

"offsetof" is usually implemented (non-conformingly) as a macro along the lines of
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
(Implementation taken from GCC 3.0.1.)

This implementation fails if the type of the member has operator& overloaded:

#include <cstddef>

struct A
{
  int x;
  void operator&() const;
};

struct B
{
  A a;
};

int main()
{
  int o = offsetof(B, a);
}
If "offsetof" is instead defined using the reinterpret_cast technique from the previous section, the code compiles with some compilers:
#define offsetof(TYPE, MEMBER) \
  ((size_t)&reinterpret_cast<unsigned char&>(((TYPE*)0)->MEMBER))
Note that "offsetof" could also be implemented by other non-standard means proprietary to the implementation; this shows only one example.
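For illustration, the same idea can be shown without a null pointer by measuring the distance between two byte addresses inside a real object. This is a sketch only, not a replacement for an implementation's "offsetof"; the names byte_address and offset_of_a_in_B are hypothetical, and B is given an extra leading member (tag) so that the offset is not trivially zero.

```cpp
#include <cstddef>

struct A
{
  int x;
  void operator&() const {}   // would break a naive &member expression
};

struct B
{
  int tag;                    // hypothetical extra member, for a non-zero offset
  A a;
};

// Hypothetical helper: address of the first byte of obj, obtained with
// the built-in & on an unsigned char lvalue rather than T's operator&.
template <class T>
const unsigned char* byte_address(const T& obj)
{
  return &reinterpret_cast<const unsigned char&>(obj);
}

// Computes the offset of member a within B from an actual object.
std::size_t offset_of_a_in_B()
{
  B b = { 0, { 0 } };
  return static_cast<std::size_t>(byte_address(b.a) - byte_address(b));
}
```

On an implementation whose "offsetof" copes with the overloaded operator&, offset_of_a_in_B() agrees with offsetof(B, a).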

Conclusion

Option 1: Forbidding overloading of operator& in PODs puts the burden on the users, because they potentially have to change existing code exploiting this "feature".

Option 2: Continuing to allow operator& in PODs puts the burden on the implementors; they have to adjust their "offsetof" macros for this newly discovered problem.

I favor option 2, but I'd like to have some guidance from the core working group on this.

Proposed Resolution

Option 1

Replace in 9 paragraph 4
A POD-struct is an aggregate class that has no non-static data members of type pointer to member, non-POD-struct, non-POD-union (or array of such types) or reference, and has no user-defined copy assignment operator and no user-defined destructor. Similarly, a POD-union is an aggregate union that has no non-static data members of type pointer to member, non-POD-struct, non-POD-union (or array of such types) or reference, and has no user-defined copy assignment operator and no user-defined destructor. A POD class is a class that is either a POD-struct or a POD-union.
by
A POD-struct is an aggregate class that has no non-static data members of type pointer to member, non-POD-struct, non-POD-union (or array of such types) or reference, and has no user-defined copy assignment operator, no user-defined destructor, and no overloaded operator &. Similarly, a POD-union is an aggregate union that has no non-static data members of type pointer to member, non-POD-struct, non-POD-union (or array of such types) or reference, and has no user-defined copy assignment operator, no user-defined destructor, and no overloaded operator &. A POD class is a class that is either a POD-struct or a POD-union.

Option 2

Replace the example in 3.9 paragraph 2 by

    #define N sizeof(T)
    char buf[N];
    T obj;                          //   obj  initialized to its original value
    // Note: PODs can have operator& overloaded.
    memcpy(buf, &reinterpret_cast<unsigned char&>(static_cast<T&>(obj)), N);

    //  between these two calls to  memcpy, obj  might be modified
    memcpy(&reinterpret_cast<unsigned char&>(static_cast<T&>(obj)), buf, N);
    //  at this point, each subobject of  obj  of scalar type
                                    //  holds its original value
Add a footnote in 18.1 paragraph 5:
[Footnote: Note that PODs can have operator& overloaded.]