Offset Of Class Members

Sulley

2024-06-15

Programming - C++

offsetof is a macro built into C++ (declared in <cstddef>) that evaluates to the byte offset of a non-static data member within a class. Unfortunately, the macro comes with some fundamental restrictions in C++ that limit where it can be used. This blog post tries to work around these restrictions and build a generic replacement for offsetof.

All code is based on C++20.

The Built-in `offsetof`

Let's start with formalizing the problem. Given a class type T and a non-static data member m, the built-in macro offsetof receives both T and m to return m's relative offset inside T, for example:

struct T {
  char c;
  int i;
  double d;
};

offsetof(T, c);   // 0
offsetof(T, i);   // 4
offsetof(T, d);   // 8

However, if T is not a standard-layout type, use of offsetof is only conditionally-supported (before C++17 the result was simply undefined). This means we should not rely on offsetof for non-standard-layout types.
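One way to guard against this, sketched below, is to check std::is_standard_layout_v before relying on offsetof. The types Plain and Poly and the helper kOffsetofIsPortable are made up for illustration and do not come from the linked sample.

```cpp
#include <cstddef>      // offsetof, std::size_t
#include <type_traits>  // std::is_standard_layout_v

// Hypothetical example types, not taken from the linked sample.
struct Plain { char c; int i; double d; };           // standard-layout
struct Poly  { virtual ~Poly() = default; int x; };  // has a virtual function

// Hypothetical helper: true when offsetof is fully guaranteed for T.
template <typename T>
inline constexpr bool kOffsetofIsPortable = std::is_standard_layout_v<T>;

static_assert(kOffsetofIsPortable<Plain>);
static_assert(!kOffsetofIsPortable<Poly>);

// For a standard-layout type the first member sits at offset 0 and
// every later member respects its own alignment.
static_assert(offsetof(Plain, c) == 0);
static_assert(offsetof(Plain, i) % alignof(int) == 0);
static_assert(offsetof(Plain, d) % alignof(double) == 0);
```

The assertions deliberately avoid hard-coding 4 and 8, since the exact padding depends on the platform ABI.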

Here is a code sample: https://godbolt.org/z/zhxWKaW93. When you compile it, the compiler warns that 'offsetof' within non-standard-layout type 'ab' is conditionally-supported, although compilation succeeds. This is because the standard only guarantees offsetof for standard-layout types and leaves it to the compiler whether and how to support the macro for other types.

Besides, as the code sample shows, offsetof does not support virtual inheritance. If you try to get the offset of a data member that lives in a virtual base class, you get the error: invalid access to non-static data member in virtual base of NULL object.

It's worth noting that the offsetof macro expands to an integral constant expression of type std::size_t, so its value is determined at compile time. This is quite important because any replacement we build should keep that property.

Pointer-to-member and Template Argument Deduction

So what can we do to eliminate these warnings and, going further, solve the virtual inheritance problem? The first attempt uses the pointer-to-member syntax and template argument deduction. We can write the following code:

// Version 1 
template <typename C, typename M>
constexpr std::size_t OffsetOf(M C::* p)
{
  return (char*)&(((C*)nullptr)->*p) - (char*)nullptr;
}

constexpr auto offset = OffsetOf(&T::c);

However, this gives us an error saying that 'reinterpret_cast' is not a constant expression, as stipulated by the standard. This means you cannot evaluate offset at compile time, and the constexpr specifier is of no use here. Besides the compile-time problem, this snippet cannot deal with multiple inheritance. For example:

struct A {
    int a;
    float b;
};

struct B {
    double c;
    int d;
};

struct X : A, B {};

OffsetOf(&X::c); // 0, Wrong!

In this case, OffsetOf returns 0 while the correct result is 8. This is because when you pass &X::c as the argument, the template arguments are deduced as C=B and M=double, so the instantiated function is OffsetOf(double B::* p). To fix this, we need an additional template parameter (named B in the code below, not to be confused with struct B) that indicates the real class of interest.
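The deduction result can be checked directly with decltype. The sketch below reuses A, B and X from the example; MemberPointerClass and c_in_x are hypothetical names introduced only for this illustration.

```cpp
#include <type_traits>

struct A { int a; float b; };
struct B { double c; int d; };
struct X : A, B {};

// Hypothetical helper: extracts the class part of a pointer-to-member type.
template <typename T>
struct MemberPointerClass;

template <typename C, typename M>
struct MemberPointerClass<M C::*> { using type = C; };

// &X::c names a member declared in B, so its type is double B::*, not double X::*.
static_assert(std::is_same_v<decltype(&X::c), double B::*>);
static_assert(!std::is_same_v<decltype(&X::c), double X::*>);
static_assert(std::is_same_v<MemberPointerClass<decltype(&X::c)>::type, B>);

// The pointer still converts implicitly to a pointer to member of the derived class.
constexpr double X::* c_in_x = &X::c;
static_assert(c_in_x != nullptr);
```

So the information that the caller wrote X is lost at the point of deduction, which is why it has to be passed separately.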

// Version 2
template <typename B = void, typename C, typename M>
constexpr std::size_t OffsetOf(M C::* p)
{
    using Base = typename std::conditional<std::is_same_v<B, void>, C, B>::type;
    return (char*)&((Base*)nullptr->*p) - (char*)nullptr;
}

struct A {
    int a;
    float b;
};

struct B {
    double c;
    int d;
};

struct X : A, B {};

OffsetOf(&A::a);    // 0
OffsetOf(&A::b);    // 4
OffsetOf(&B::c);    // 0
OffsetOf(&B::d);    // 8
OffsetOf<X>(&X::a); // 0
OffsetOf<X>(&X::b); // 4
OffsetOf<X>(&X::c); // 8
OffsetOf<X>(&X::d); // 16

Good! Now OffsetOf takes an extra template parameter that defaults to void. If it is not provided, regular argument deduction decides the class; otherwise the provided type is used for the cast.
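The selection logic can be looked at in isolation. In this sketch, SelectBase is a hypothetical alias that mirrors the Base typedef in Version 2, written with std::conditional_t and std::is_void_v instead of the longer spelling.

```cpp
#include <type_traits>

struct A { int a; float b; };
struct B { double c; int d; };
struct X : A, B {};

// Hypothetical alias mirroring `Base` in Version 2:
// use the deduced class unless an explicit class was supplied.
template <typename Explicit, typename Deduced>
using SelectBase = std::conditional_t<std::is_void_v<Explicit>, Deduced, Explicit>;

// OffsetOf(&B::c): nothing supplied, so the deduced class B is used.
static_assert(std::is_same_v<SelectBase<void, B>, B>);
// OffsetOf<X>(&X::c): X was supplied, so it overrides the deduced class B.
static_assert(std::is_same_v<SelectBase<X, B>, X>);
```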

The problems of compile-time evaluation and virtual inheritance still remain: (char*) acts as a reinterpret_cast, which cannot be used in a constant expression, and we are casting nullptr instead of a real pointer to an object of type Base, which breaks down under virtual inheritance because locating a virtual base requires reading data from an actual object.

To solve the virtual inheritance issue, we can create a dummy object of type Base and use this object to get the relative offset. In this way, we are leveraging a real object rather than a fake nullptr pointer.
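Before the full version, the idea of measuring against a real object can be sketched at run time. RuntimeOffset and DiamondOffsetsLookSane are hypothetical helpers, not part of the versions in this post, and the sketch assumes the usual practice of comparing char addresses inside one object, which mainstream compilers support.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>  // std::addressof

struct AA { int a; };
struct BB : virtual AA { int b; };
struct CC : virtual AA { int c; };
struct DD : BB, CC { int d; };

// Hypothetical helper: byte distance from the start of `object` to `member`,
// where `member` must be a subobject of `object`.
template <typename Object, typename Member>
std::size_t RuntimeOffset(const Object& object, const Member& member)
{
    const char* begin = reinterpret_cast<const char*>(std::addressof(object));
    const char* where = reinterpret_cast<const char*>(std::addressof(member));
    return static_cast<std::size_t>(where - begin);
}

// Checks only layout-independent facts; the exact numbers depend on the ABI.
// Usage: assert(DiamondOffsetsLookSane());
inline bool DiamondOffsetsLookSane()
{
    DD object;  // a real object, so the virtual base AA can be located
    const std::size_t a = RuntimeOffset(object, object.a);
    const std::size_t d = RuntimeOffset(object, object.d);
    return a + sizeof(int) <= sizeof(DD) && d + sizeof(int) <= sizeof(DD) && a != d;
}
```

This works for the virtual base, but only at run time; the rest of the post is about getting the same answer at compile time.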

We can write the following version 3 code:

// Version 3
template <typename T, T v>
struct ClassMemberTraits;

template <typename C, typename M, M C::* v>
struct ClassMemberTraits<M C::*, v>
{
    using ClassType = C;
    using MemberType = M;
    constexpr static M C::* value = v;
};

template <
    typename Traits, 
    typename B = void,
    typename M = typename Traits::MemberType, 
    typename C = typename Traits::ClassType, 
    M C::* p   = Traits::value >
struct OffsetHelper
{
    using Base = typename std::conditional<std::is_same_v<B, void>, C, B>::type;

    union U
    {
        char c;
        Base base;
        constexpr U() noexcept : c{} {};
        constexpr ~U() noexcept {};
    };
    constexpr static U dummy {};

    constexpr static std::size_t GetOffsetOf()
    {
        return (char*)&(dummy.base.*p) - (char*)&dummy;
    }
};

template <auto MemberPtr, typename B = void>
std::size_t GetOffsetOf = OffsetHelper<ClassMemberTraits<decltype(MemberPtr), MemberPtr>, B>::GetOffsetOf();

#define OffsetOf(C, M) GetOffsetOf<&C::M, C>

struct A {
    int a;
    float b;
};

struct B {
    double c;
    int d;
};

struct X : A, B {};

struct AA { int a; };
struct BB : public virtual AA   { int b; };
struct CC : public virtual AA   { int c; };
struct DD : public BB, public CC { int d; };

OffsetOf(A, a);    // 0
OffsetOf(A, b);    // 4
OffsetOf(B, c);    // 0
OffsetOf(B, d);    // 8
OffsetOf(X, a);    // 0
OffsetOf(X, b);    // 4
OffsetOf(X, c);    // 8
OffsetOf(X, d);    // 16

OffsetOf(DD, a);   // 32
OffsetOf(DD, b);   // 8
OffsetOf(DD, c);   // 24
OffsetOf(DD, d);   // 28

Things are getting more complicated. Let's go through the ideas:

Our goal is to have a real object of type Base when accessing its member, so we need a class template (struct) to store such an object. Of course, we should also try to keep it constexpr. The OffsetHelper struct is defined to accomplish this.
Since it is now a class rather than a function, we cannot pass in a pointer to member and let the template deduce the arguments. Instead, we must supply the template arguments by hand: Base, MemberType, ClassType and the pointer-to-member value p. To make extracting these types and values easier, we use another class template, ClassMemberTraits, which plays the role the function template played before: it stores the types and the pointer-to-member value. Using it is simple: the first argument is a pointer-to-member type and the second argument is a value of that type. decltype supplies the first from the second.
Inside OffsetHelper, we still define the Base type according to the passed-in argument B. Then we define a union U containing a char and a Base object, and create a constexpr static U object dummy. You may ask why we don't directly create a constexpr Base object. This is because when Base has a virtual base class, it cannot be constructed by a constexpr constructor. You can use static inline Base base{} to avoid the union, but that, like the union version, ends up materializing an object in the program because the result is not constexpr. Not a good idea.
Finally, we take the address of dummy in the function GetOffsetOf, and use a macro OffsetOf() plus a helper variable template GetOffsetOf to obtain the offset.
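As a small usage illustration of the traits step described above, the sketch below repeats ClassMemberTraits from Version 3 and applies it to a made-up struct S (S and the alias Traits are hypothetical and only serve this example).

```cpp
#include <type_traits>

template <typename T, T v>
struct ClassMemberTraits;

template <typename C, typename M, M C::* v>
struct ClassMemberTraits<M C::*, v>
{
    using ClassType = C;
    using MemberType = M;
    constexpr static M C::* value = v;
};

struct S { char c; int i; };  // hypothetical example type

// decltype supplies the pointer-to-member type; the value follows as the second argument.
using Traits = ClassMemberTraits<decltype(&S::i), &S::i>;

static_assert(std::is_same_v<Traits::ClassType, S>);
static_assert(std::is_same_v<Traits::MemberType, int>);
static_assert(Traits::value == &S::i);
```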

The next step, of course, is to make it constexpr! The difficulty is the reinterpret_cast hidden in (char*). Note that in the expression (char*)&(dummy.base.*p) - (char*)&dummy, we are actually taking the difference of two addresses of different types (e.g., int* and float*), and C++ disallows subtracting two pointers of different types without such a cast. So we need to maintain not only an object of type Base, but also a separate member of the type of interest whose offset we control. Our goal is to make the address obtained through the pointer to member equal to the address of that maintained member.

We can write the following code:

// Version 4
#include <cstddef>   // std::size_t
#include <iostream>

template <typename T, T v>
struct ClassMemberTraits;

template <typename C, typename M, M C::* v>
struct ClassMemberTraits<M C::*, v>
{
    using ClassType = C;
    using MemberType = M;
    constexpr static M C::* value = v;
};

#pragma pack(push, 1)
template<typename M, std::size_t Offset>
struct MemberAt
{
    char padding[Offset];
    M member;
};
#pragma pack(pop)

template<typename M>
struct MemberAt<M, 0>
{
    M member;
};

template<typename B, typename M, std::size_t Offset>
union PaddedUnion
{
    char c;
    B base;
    MemberAt<M, Offset> member;
};

template <
    typename Traits, 
    typename B,
    typename M = typename Traits::MemberType, 
    typename C = typename Traits::ClassType, 
    M C::* p   = Traits::value,
    std::size_t Offset = 0 >
struct OffsetHelper
{
    constexpr static PaddedUnion<B, M, Offset> dummy{};
    constexpr static std::size_t GetOffsetOf()
    {
        if constexpr (&(dummy.base.*p) > &dummy.member.member)
        {
            return OffsetHelper<Traits, B, M, C, p, Offset + sizeof(M)>::GetOffsetOf();
        }
        else
        {
            return Offset;
        }
    }
};

template <auto MemberPtr, typename B>
constexpr std::size_t GetOffsetOf = OffsetHelper<ClassMemberTraits<decltype(MemberPtr), MemberPtr>, B>::GetOffsetOf();

#define OffsetOf(C, M) GetOffsetOf<&C::M, C>

struct A {
    int a;
    float b;
};

struct B {
    double c;
    int d;
};

struct X : A, B {};

struct AA { int a; };
struct BB : public virtual AA   { int b; };
struct CC : public virtual AA   { int c; };
struct DD : public BB, public CC { int d; };


constexpr auto A_a =  OffsetOf(A, a);
constexpr auto A_b =  OffsetOf(A, b);
constexpr auto B_c =  OffsetOf(B, c);
constexpr auto B_d =  OffsetOf(B, d);
constexpr auto X_a =  OffsetOf(X, a);
constexpr auto X_b =  OffsetOf(X, b);
constexpr auto X_c =  OffsetOf(X, c);
constexpr auto X_d =  OffsetOf(X, d);

constexpr auto DD_a =  OffsetOf(DD, a);
constexpr auto DD_b =  OffsetOf(DD, b);
constexpr auto DD_c =  OffsetOf(DD, c);
constexpr auto DD_d =  OffsetOf(DD, d);

static_assert(OffsetOf(A, a) == 0, "");
static_assert(OffsetOf(A, b) == 4, "");
static_assert(OffsetOf(B, c) == 0, "");
static_assert(OffsetOf(B, d) == 8, "");
static_assert(OffsetOf(X, a) == 0, "");
static_assert(OffsetOf(X, b) == 4, "");
static_assert(OffsetOf(X, c) == 8, "");
static_assert(OffsetOf(X, d) == 16, "");

static_assert(OffsetOf(DD, a) == 32, "");
static_assert(OffsetOf(DD, b) == 8, "");
static_assert(OffsetOf(DD, c) == 24, "");
static_assert(OffsetOf(DD, d) == 28, "");

Now it's computed at compile time and works for multiple inheritance as well as virtual inheritance! Unfortunately, it fails on types with special layouts, e.g., types with array members, #pragma pack or alignas. You can see the failing cases at https://godbolt.org/z/9xE8azGrG. This is because inside the OffsetHelper struct, we recursively compute the offset by adding sizeof(M), the size of the member type. When the real offset is not a multiple of sizeof(M), as can happen with array members, stepping by sizeof(M) skips over it and produces the wrong result.

What about changing the step to sizeof(M) < alignof(C) ? sizeof(M) : alignof(C)? That is still wrong for the assertion static_assert(OffsetOf(al, arr) == 10, "") because of a mismatch between the member size and the type alignment.
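The failure mode can be reproduced with plain numbers. In the sketch below, GuessByStride is a hypothetical model of Version 4's recursion (it works on integers rather than addresses), Packed is a made-up type, and a 4-byte int is assumed.

```cpp
#include <cstddef>

static_assert(sizeof(int) == 4, "this sketch assumes a 4-byte int");

// Hypothetical model of Version 4's search: keep adding `stride`
// (sizeof(M) in the real code) while the real offset is still ahead.
constexpr std::size_t GuessByStride(std::size_t real_offset, std::size_t stride)
{
    std::size_t guess = 0;
    while (real_offset > guess) {
        guess += stride;
    }
    return guess;
}

#pragma pack(push, 1)
struct Packed { char tag; int value; };  // hypothetical packed type
#pragma pack(pop)

// With 1-byte packing, value sits right after tag.
static_assert(offsetof(Packed, value) == 1);

// Naturally aligned case: an int at offset 4 is reached exactly (0 -> 4).
static_assert(GuessByStride(4, sizeof(int)) == 4);
// Packed case: the guesses are 0, 4, ... so offset 1 is skipped and 4 is returned.
static_assert(GuessByStride(offsetof(Packed, value), sizeof(int)) == 4);
```

Any fixed stride larger than 1 has this problem for some layout, which is what motivates a search that can land on every byte offset.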

Before moving to the next version, let's step into the details of this version of code.

As before, we use ClassMemberTraits to extract the class type, member type and pointer-to-member object.
We define a union PaddedUnion holding an object of type Base and a data member of type M placed at a specified offset Offset. Offset indicates how far the data member currently is from the start of base. It may be any non-negative value, and it is incremented recursively as long as &(dummy.base.*p) > &dummy.member.member holds, that is, as long as the current offset has not yet reached the real one; the remaining distance is exactly &(dummy.base.*p) - &dummy.member.member. So why don't we just use &(dummy.base.*p) - &dummy.member.member? Because it is not a constant expression (the two pointers do not point into the same array or the same object), and you cannot use static_cast to convert them to char* either.
All we can do is increment Offset and check whether it has reached the real offset of the data member, i.e., &(dummy.base.*p). So we use an if constexpr to check the condition and recurse.
Last, we provide a utility macro OffsetOf(C, M) so that it is used the same way as the built-in macro offsetof.

Yes, the core idea is to guess the offset. The guessing stops when &(dummy.base.*p) == &dummy.member.member. Since we are guessing anyway, we can guess more efficiently by using binary search.

We have the following version 5 code (test cases at https://godbolt.org/z/7MGKrePKc):

// Version 5
#include <cstddef>   // std::size_t
#include <iostream>

template <typename T>
struct ClassMemberTraits;

template <typename C, typename M>
struct ClassMemberTraits<M C::*>
{
    using ClassType = C;
    using MemberType = M;
};

#pragma pack(push, 1)
template<typename M, std::size_t Offset>
struct MemberAt
{
    char padding[Offset];
    M member;
};
#pragma pack(pop)

template<typename M>
struct MemberAt<M, 0>
{
    M member;
};

template<typename B, typename M, std::size_t Offset>
union PaddedUnion
{
    char c;
    B base;
    MemberAt<M, Offset> member;
};

// ~~~~~ Begin core modification ~~~~~
template <
    auto MemberPtr,
    typename B,
    std::size_t Low,
    std::size_t High,
    std::size_t Mid = (Low + High) / 2>
struct OffsetHelper
{
    // `typename` is optional here in C++20 but keeps older compilers happy.
    using M = typename ClassMemberTraits<decltype(MemberPtr)>::MemberType;
    
    constexpr static PaddedUnion<B, M, Mid> dummy{};
    constexpr static std::size_t GetOffsetOf()
    {
        if constexpr (&(dummy.base.*MemberPtr) > &dummy.member.member)
        {
            return OffsetHelper<MemberPtr, B, Mid + 1, High>::GetOffsetOf();
        }
        else if constexpr (&(dummy.base.*MemberPtr) < &dummy.member.member)
        {
            return OffsetHelper<MemberPtr, B, Low, Mid>::GetOffsetOf();
        }
        else
        {
            return Mid;
        }
    }
};
// ~~~~~ End core modification ~~~~~

template <auto MemberPtr, typename B>
constexpr std::size_t GetOffsetOf = OffsetHelper<MemberPtr, B, 0, sizeof(B)>::GetOffsetOf();

#define OffsetOf(C, M) GetOffsetOf<&C::M, C>

If you get an error saying that a constexpr variable cannot have a non-literal type, you can add the following constexpr constructor and destructor to PaddedUnion:

constexpr PaddedUnion() noexcept : c{} {}
constexpr ~PaddedUnion() noexcept {}

The core idea is simple: use binary search to guess the offset. We only need to slightly modify the OffsetHelper struct and the definition of GetOffsetOf. Now all tests pass! Congratulations to you, and to me!
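The search itself can be modeled with ordinary integers. In this sketch, GuessByBisection is a hypothetical stand-in for the OffsetHelper recursion: real_offset plays the role of &(dummy.base.*MemberPtr) and mid plays the role of &dummy.member.member.

```cpp
#include <cstddef>

// Hypothetical model of Version 5's recursion, written as a loop over integers.
constexpr std::size_t GuessByBisection(std::size_t real_offset,
                                       std::size_t low, std::size_t high)
{
    while (low < high) {
        const std::size_t mid = (low + high) / 2;
        if (real_offset > mid) {
            low = mid + 1;   // the member lies further right
        } else if (real_offset < mid) {
            high = mid;      // the member lies further left
        } else {
            return mid;      // exact hit
        }
    }
    return low;
}

// Every byte offset in [0, 16) can be hit, including odd ones.
static_assert(GuessByBisection(0, 0, 16) == 0);
static_assert(GuessByBisection(1, 0, 16) == 1);
static_assert(GuessByBisection(10, 0, 16) == 10);
static_assert(GuessByBisection(15, 0, 16) == 15);
```

Unlike a fixed stride, bisection narrows the range down to single bytes, so it does not depend on the member size or alignment, and it needs only about log2(sizeof(B)) instantiations.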

Conclusion

Getting a robust and generic offsetof in C++ is really, really hard. The built-in macro offsetof is implemented by the compiler, and it should be your choice most of the time. If you do care about the warnings, or you want to compute offsets under virtual inheritance, our final version may appeal to you.

Reference

https://gist.github.com/graphitemaster/494f21190bb2c63c5516
