offsetof is a built-in macro in C++ (declared in <cstddef>) that evaluates the offset, in bytes, of a non-static data member from the start of its class. Unfortunately, the macro comes with some fundamental restrictions in C++ that rule out many uses of it. This blog post tries to work around these restrictions and provide a generic replacement for offsetof.
All code is based on C++20.
The Built-in offsetof
Let's start by formalizing the problem. Given a class type T and a non-static data member m, the built-in macro offsetof takes both T and m and returns m's relative offset inside T, for example:
1 | struct T { |
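As a minimal sketch of such a call (the struct S and its members are hypothetical and are not the original listing), the following only asserts properties that hold on any conforming implementation, since the exact offsets depend on the platform's alignment rules:

```cpp
#include <cstddef>  // offsetof, std::size_t

// Hypothetical standard-layout type, used only for illustration.
struct S {
    char   tag;
    int    count;
    double value;
};

// The first member of a standard-layout class sits at offset 0.
static_assert(offsetof(S, tag) == 0);

// Members with the same access control are laid out in declaration order.
// The exact numbers are platform-dependent, so only the ordering is checked.
static_assert(offsetof(S, tag) < offsetof(S, count));
static_assert(offsetof(S, count) < offsetof(S, value));
```

On a typical 64-bit platform the three offsets are likely 0, 4 and 8, but that is an assumption about the ABI rather than something the language guarantees.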
However, if T is not a standard-layout type, use of offsetof is only conditionally-supported (before C++17 the result was simply undefined). This means we should not rely on offsetof for non-standard-layout types in portable code.
Here is the code sample: https://godbolt.org/z/zhxWKaW93. When you compile it, the compiler warns that 'offsetof' within non-standard-layout type 'ab' is conditionally-supported, though it does compile successfully. This is because the standard only guarantees offsetof for standard-layout types and leaves it up to the compiler whether and how to implement the macro for non-standard-layout types.
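To see which types fall on which side of that line, a small sketch using std::is_standard_layout_v can help. The three structs and the variable template kOffsetofIsPortable are hypothetical names introduced here for illustration:

```cpp
#include <type_traits>

struct Plain { int a; double b; };                  // standard-layout
struct Poly  { virtual ~Poly() = default; int a; }; // virtual function: not standard-layout
struct Mixed { int a; private: int b; };            // mixed access control: not standard-layout

static_assert(std::is_standard_layout_v<Plain>);
static_assert(!std::is_standard_layout_v<Poly>);
static_assert(!std::is_standard_layout_v<Mixed>);

// Hypothetical helper: true when offsetof on T is guaranteed by the standard
// rather than merely conditionally-supported.
template <class T>
inline constexpr bool kOffsetofIsPortable = std::is_standard_layout_v<T>;
```

A static_assert(kOffsetofIsPortable<T>) placed next to each offsetof call is one way to make the restriction explicit instead of relying on a compiler warning.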
Besides, as seen in the code sample, offsetof does not support virtual inheritance. If you try to get the offset of a data member of a virtual base class, you will get the error: invalid access to non-static data member in virtual base of NULL object.
It's worth noting that the offsetof macro expands to an integral constant expression of type std::size_t. This means that its value is determined at compile time. This is quite important, because the replacements developed below should preserve that property.
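A short sketch of what being a constant expression buys in practice. The names Header, kPrefixBytes and OffsetTag are hypothetical and only serve the illustration:

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Hypothetical standard-layout type.
struct Header {
    char tag;
    int  length;
};

// The expression has type std::size_t.
static_assert(std::is_same_v<decltype(offsetof(Header, length)), std::size_t>);

// Because the value is known at compile time, it can be used as an array bound...
inline constexpr std::array<unsigned char, offsetof(Header, length)> kPrefixBytes{};

// ...and as a non-type template argument.
template <std::size_t N>
struct OffsetTag {
    static constexpr std::size_t value = N;
};
using LengthOffset = OffsetTag<offsetof(Header, length)>;
static_assert(LengthOffset::value == offsetof(Header, length));
```

Any replacement that only works at run time cannot be used in these positions.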
Pointer-to-member and Template Argument Deduction
So what can we do to eliminate these warnings and, beyond that, solve the virtual inheritance problem? The first attempt uses the pointer-to-member syntax and template argument deduction. We can write the following code:
1 | // Version 1 |
However, this gives us an error saying that 'reinterpret_cast' is not a constant expression, as stipulated by the standard. This means you cannot evaluate the offset at compile time, nor can you add the constexpr specifier. Apart from the compile-time pitfall, this snippet of code also cannot deal with multiple inheritance. For example:
1 | struct A { |
In this case, OffsetOf outputs 0 while the correct output should be 8. This is because when you pass &X::c as the argument, the template arguments are actually deduced as C=B and M=double, so the instantiated function template is OffsetOf(double B::* p). To fix this, we need to provide an additional template parameter B that indicates the real class of interest.
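The deduction behaviour can be checked directly. The sketch below is not the original Version 1 listing: the struct bodies are assumed, the helper DeducedClass is hypothetical, and the run-time OffsetOf uses a local object instead of a null pointer to stay clear of null dereferences. It also assumes C is default-constructible and follows the common practice of measuring byte distances through char pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Assumed shapes for the types discussed above.
struct A { double a; };
struct B { double c; };
struct X : A, B {};

// &X::c names a member declared in B, so its type is double B::*, not double X::*.
static_assert(std::is_same_v<decltype(&X::c), double B::*>);

// Hypothetical helper that exposes which class gets deduced for C.
template <class C, class M>
C DeducedClass(M C::*);
static_assert(std::is_same_v<decltype(DeducedClass(&X::c)), B>);

// Run-time only sketch: the reinterpret_casts rule out constexpr.
template <class C, class M>
std::size_t OffsetOf(M C::* p) {
    C object{};  // assumption: C is default-constructible
    const char* member_address = reinterpret_cast<const char*>(&(object.*p));
    const char* object_address = reinterpret_cast<const char*>(&object);
    return static_cast<std::size_t>(member_address - object_address);
}

inline void Demo() {
    // C is deduced as B, so this is the offset of c inside B, not inside X.
    assert(OffsetOf(&X::c) == 0);
}
```

The result 0 is correct for B but not for X, which is exactly the mismatch described above.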
1 | // Version 2 |
Good! Now OffsetOf takes an extra template argument that defaults to void. If it is not provided, regular argument deduction happens; otherwise the provided argument is used for the type cast.
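One possible shape for such a function, as a sketch rather than the original Version 2 listing. The extra parameter is called Target here (the post calls it B) to avoid clashing with the struct named B. As an assumption of this sketch, it again measures against a default-constructed local object instead of a null pointer:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Assumed shapes for the types discussed above.
struct A { double a; };
struct B { double c; };
struct X : A, B {};

// Target defaults to void; when it is void, the deduced class C is used.
template <class Target = void, class C, class M>
std::size_t OffsetOf(M C::* p) {
    using Base = std::conditional_t<std::is_void_v<Target>, C, Target>;
    static_assert(std::is_base_of_v<C, Base>,
                  "Target must be C itself or a class derived from C");
    Base object{};  // assumption: Base is default-constructible
    const char* member_address = reinterpret_cast<const char*>(&(object.*p));
    const char* object_address = reinterpret_cast<const char*>(&object);
    return static_cast<std::size_t>(member_address - object_address);
}

inline void Demo() {
    // No explicit argument: C is deduced as B, giving the offset inside B.
    assert(OffsetOf(&X::c) == 0);
    // Explicit argument: the offset inside X, which lies after the A subobject
    // on common ABIs that lay out base classes in declaration order.
    assert(OffsetOf<X>(&X::c) != 0);
}
```

Putting the defaulted parameter first lets callers write OffsetOf<X>(&X::c) while C and M are still deduced from the argument.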
The problems of compile-time evaluation and virtual inheritance still remain: the (char*) cast is a reinterpret_cast, which cannot be used in a constant expression, and we are casting a nullptr instead of a real pointer to an object of type Base, which does not work with virtual inheritance because locating a virtual base requires a real object.
To solve the virtual inheritance issue, we can create a dummy object of type Base and use this object to get the relative offset. In this way, we are leveraging a real object rather than a fake nullptr pointer.
We can write the following version 3 code:
1 | // Version 3 |
Things are getting more complicated. Let's go through the ideas:
- Our goal is to create a real object of type Base when accessing its member, so we need a class template (struct) to store such an object. Of course, we should also try to maintain its constexpr-ness. The OffsetHelper struct is defined to accomplish this goal.
- As it is now a class rather than a function, we cannot pass in a pointer-to-member as a function argument and let the template deduce the arguments. Rather, we must pass in the template arguments by hand. We need Base, MemberType, ClassType and the pointer-to-member object p. To facilitate extracting types and values, we use another class template, ClassMemberTraits, which functions similarly to the function template and stores the types and the pointer-to-member object. Using it is quite simple: the first argument is a pointer-to-member type, and the second argument is a corresponding value. decltype serves this purpose.
- Inside OffsetHelper, we still define the Base type according to the passed-in argument B. Then we define a union U, containing a char and a Base object, and create a constexpr static U object dummy. You may ask why we don't directly create a constexpr Base object. Well, this is because when Base involves virtual inheritance, we cannot instantiate it with a constexpr constructor. You can use static inline Base base{} to bypass the need for a union, but it, like the union version, actually creates an object of type Base at run time because it is not constexpr. Not a good idea.
- Then, we take the address of dummy in the function GetOffsetOf, and use a macro OffsetOf() and a helper variable template GetOffsetOf to obtain the offset.
The next step, of course, is to make it constexpr! The difficulty is the reinterpret_cast behind (char*).
Note that in the expression (char*)&(dummy.base.*p) - (char*)&dummy, we are actually comparing two addresses of different types (e.g., int* and float*). However, C++ disallows subtracting two pointers of different types. So we need to maintain not only an object of type Base, but also a copy of its member of interest. Our goal is to make the address obtained through the pointer-to-member object equal to the address of the maintained member.
We can write the following code:
1 | // Version 4 |
Now it is computed at compile time and works for multiple inheritance as well as virtual inheritance! But unfortunately, it fails on types with a special layout, e.g., types with array members, #pragma pack or alignas. You can see the failing cases at https://godbolt.org/z/9xE8azGrG. This is because inside the OffsetHelper struct, we recursively compute the offset by adding sizeof(M), the size of the member type. When the type layout is special, such as when it includes array members, stepping by sizeof(M) can skip over the real offset and produce the wrong result.
What about changing the step to sizeof(M) < alignof(C) ? sizeof(M) : alignof(C)? It is still wrong for the assertion static_assert(OffsetOf(al, arr) == 10, "") due to a mismatch between member size and type alignment.
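To make the failure with the sizeof(M) step concrete, here is a small sketch that models only the sequence of guesses, not the pointer comparison itself. The struct P and the function SteppingReaches are hypothetical and are not taken from the linked test cases:

```cpp
#include <cstddef>

// Hypothetical type with an array member that does not start at a multiple
// of its own size.
struct P {
    char tag;
    char arr[10];
};

// Models the guessing loop: visits 0, step, 2*step, ... up to `limit` and
// reports whether `target` is ever hit exactly.
constexpr bool SteppingReaches(std::size_t target, std::size_t step,
                               std::size_t limit) {
    for (std::size_t guess = 0; guess <= limit; guess += step) {
        if (guess == target) {
            return true;
        }
    }
    return false;
}

// Stepping by sizeof(arr) visits 0, 10, ... and jumps over the real offset.
static_assert(!SteppingReaches(offsetof(P, arr), sizeof(P::arr), sizeof(P)));

// Stepping one byte at a time always finds it.
static_assert(SteppingReaches(offsetof(P, arr), 1, sizeof(P)));
```

On mainstream ABIs arr sits at offset 1, so no multiple of its 10-byte size can ever match.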
Before moving to the next version, let's step into the details of this version of code.
- As before, we use ClassMemberTraits to extract the class type, member type and pointer-to-member object.
- We define a union PaddedUnion to represent a Base type object and a data member of type M at a specified offset Offset. Note that Offset indicates how far the data member currently is from base. It may be any non-negative value and is recursively incremented as long as &(dummy.base.*p) > &dummy.member.member is satisfied. That is, the current offset of member has not yet reached its real offset, and the distance is exactly &(dummy.base.*p) - &dummy.member.member. Okay, why don't we just use &(dummy.base.*p) - &dummy.member.member? Because it is not a constant expression (the two pointers do not point into the same array or to the same object), and you cannot use static_cast to cast them to char* either.
- The only thing we can do is to increment Offset and see whether the member's address equals the real address of the data member, i.e., &(dummy.base.*p). So we use an if constexpr to check the condition and recursively try again.
- Last, we use a utility macro OffsetOf(C, M) to align its use with the built-in macro offsetof.
Yes, the core idea is to guess the offset. The guessing stops when &(dummy.base.*p) == &dummy.member.member. Since we are guessing anyway, we can guess more efficiently by using binary search.
We have the following version 5 code (test cases at https://godbolt.org/z/7MGKrePKc):
1 | // Version 5 |
If you get an error saying that constexpr variable cannot have non-literal type, you can add the following constexpr constructor and destructor to PaddedUnion:
1 | constexpr PaddedUnion() noexcept : c{} {} |
The core idea is simple: use binary search to guess the offset. We just need to slightly modify the OffsetHelper struct and the definition of GetOffsetOf(). Now, all tests pass! Congratulations to you, and to me!
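The search strategy on its own can be sketched like this. It is not the Version 5 listing: the pointer comparison inside the union is replaced by a stand-in predicate built from the built-in offsetof, and the names FirstOffsetReached, Sample and kTrueOffset are hypothetical:

```cpp
#include <cstddef>

// Returns the smallest value in [low, high) for which `reached` is true,
// assuming `reached` is false below some threshold and true from it onwards.
template <class Predicate>
constexpr std::size_t FirstOffsetReached(std::size_t low, std::size_t high,
                                         Predicate reached) {
    while (low < high) {
        const std::size_t middle = low + (high - low) / 2;
        if (reached(middle)) {
            high = middle;      // the answer is middle or something smaller
        } else {
            low = middle + 1;   // the answer is strictly larger than middle
        }
    }
    return low;
}

// Hypothetical type whose member offsets are not multiples of the member sizes.
struct Sample {
    char tag;
    char arr[10];
    int  value;
};

constexpr std::size_t kTrueOffset = offsetof(Sample, value);

// Stand-in for "the guessed member address has reached the real one".
static_assert(FirstOffsetReached(0, sizeof(Sample), [](std::size_t guess) {
                  return guess >= kTrueOffset;
              }) == kTrueOffset);
```

The number of guesses grows with the logarithm of sizeof(Sample) instead of linearly, and since every byte offset is a candidate, special layouts no longer cause a guess to skip the real offset.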
Conclusion
Getting a robust and generic offsetof is really hard in C++. Although the built-in macro offsetof is implemented in a compiler-specific way, it should still be your choice most of the time. If you do care about the warnings, or you want to calculate offsets under virtual inheritance, our final version of the code may appeal to you.
Reference
https://gist.github.com/graphitemaster/494f21190bb2c63c5516