Discovering offsetof

A few days ago, while I was taking part in a small programming contest, I came across an interesting question related to pointer arithmetic in the C programming language and wanted to elaborate on it.


Pointer arithmetic

Pointer arithmetic is the process of performing arithmetic operations on pointers. The set of operations you can perform on pointers is limited to additions, subtractions, increments, decrements, comparisons and assignments.

Wikipedia has a pretty straightforward way of defining how pointer arithmetic relates to array indexing:

In C, array indexing is formally defined in terms of pointer arithmetic; that is, the language specification requires that array[i] be equivalent to *(array + i).

A trivial example

Arithmetic operations can be performed on pointers much as on other variables. The compiler uses the pointed-to type as the unit of measure when applying arithmetic to a pointer. As such, if you increment a pointer to an element of an int array by one, the pointer will point to the next element of the array or, if you prefer, its address will be incremented by sizeof(int) bytes.
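To make the unit-of-measure point concrete, here is a minimal sketch. The helper names same_element and int_stride are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Returns 1 when indexing and pointer arithmetic designate the same
 * element, i.e. array[i] and *(array + i) are interchangeable. */
int same_element(const int *array, size_t i)
{
    return &array[i] == array + i && array[i] == *(array + i);
}

/* Distance in bytes between a pointer to int and the next int.
 * Casting to char * makes the subtraction count bytes, not ints. */
size_t int_stride(const int *p)
{
    return (size_t)((const char *)(p + 1) - (const char *)p);
}
```

For an int array a with at least i + 1 elements, same_element(a, i) should return 1, and int_stride(a) should equal sizeof(int).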

A very simple example of using pointer arithmetic would be to loop through an array:
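A minimal sketch of such a loop, assuming a NUL-terminated char array. The function name count_chars is hypothetical; array and ptr follow the names used in the discussion around it.

```c
#include <stddef.h>

/* Walks a NUL-terminated char array with pointer arithmetic and
 * returns the number of characters before the terminator. */
size_t count_chars(const char *array)
{
    const char *ptr = array;   /* second pointer; array is left untouched */
    size_t count = 0;

    /* *(ptr++) reads the current character, then ptr moves forward */
    while (*(ptr++) != '\0')
        count++;
    return count;
}
```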

We use a second pointer so that we can keep the original pointer to array.

In the previous example, the postfix ++ operator binds to ptr before the indirection operator * does, and this would hold even without the parentheses, since postfix ++ has a higher precedence than *. The postfix increment yields the value ptr had before being incremented, so * dereferences that earlier address and returns the value stored there, while ptr itself moves on to the next element.

In the previous example, ptr was of type char *, which means that on each iteration of the loop, the address it pointed to moved forward by sizeof(char), that is, by one byte.

The original problem

Now that we know what pointer arithmetic refers to, let’s get back to the original question.

The problem is to deduce the pointer to a structure given a pointer to one of its members. We’re given the type of the structure as well as the name of one of its members. No assumption should be made about the number of members the structure has, or their sizes.

For the purpose of the challenge, we will consider the following structure of type t_struct which holds a float, an int (our member, named member) and a char.
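A sketch of what such a structure could look like. Only the type name t_struct and the field name member come from the text; the tag s_struct and the field names f and c are assumptions.

```c
typedef struct s_struct
{
    float   f;        /* assumed name for the float field */
    int     member;   /* the member we will be given a pointer to */
    char    c;        /* assumed name for the char field */
}   t_struct;
```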

The function we need to write takes a pointer to a member of a structure and returns a pointer to the actual structure. It shall be prototyped as follows:
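One possible prototype, assuming the member in question is the int named member. The function name get_struct_ptr and the parameter name member_ptr are hypothetical, as is the s_struct tag used for the forward declaration.

```c
/* A forward declaration of the structure type is enough for a prototype. */
typedef struct s_struct t_struct;

/* Takes the address of the member, returns the address of its structure. */
t_struct *get_struct_ptr(int *member_ptr);
```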

In order to solve the challenge, we will need to:

  • Find out what the offset between the top of the structure and the actual address of the member variable is.
  • Provide a solution that is compiler and platform agnostic.
  • Avoid being sensitive to the layout of the structure (i.e. our solution must work regardless of whether other members are added to or removed from the structure).

I found myself wondering for quite a long time how I could possibly find the right solution, and first came up with several intermediate, not-quite-working solutions that relied on some absolutely non-portable tricks.

I then searched the web for a way to achieve this using the standard library that comes bundled with every ISO/ANSI C compliant compiler, and found out about offsetof. It is a standard macro available in stddef.h which computes the offset between the start of a structure and one of its members. It is traditionally defined as follows:
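A traditional way of writing this macro is sketched below. To avoid clashing with the real offsetof from stddef.h, it is given the hypothetical name traditional_offsetof; the parameter names st and m are the ones used in the breakdown of the macro. The C standard does not guarantee that this form works, so treat it as an explanation aid rather than something to ship.

```c
#include <stddef.h>

/* st: structure type, m: member name.
 * Pretends a structure of type st lives at address zero, takes the
 * address of its member m, and converts that address to size_t. */
#define traditional_offsetof(st, m) ((size_t)&(((st *)0)->m))
```

On common compilers, traditional_offsetof(T, m) is expected to give the same value as offsetof(T, m).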

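Let’s break down this macro into smaller pieces:

  • The constant zero is cast to a pointer of type st*, giving a pointer to a structure assumed to sit at address zero.
  • The -> operator designates the member m of that structure, where m expands to the name of the member in our structure.
  • The address of this member is taken with the & operator.
  • The address is cast to size_t. Since the structure is assumed to start at address zero, the address of the member is also its offset.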

The first obvious reaction is to wonder how this could possibly work without crashing, since a null pointer appears to be dereferenced. Interestingly enough, most compilers will not actually read through the pointer: they deduce that we only want the address of the member in the structure it points to, not its value. Accessing the value would indeed have caused a crash.

The test was conducted on GCC 4.6.

What happens with different platforms and compilers, you might ask? Well, as offsetof is part of the standard C library, every conforming compiler is required to provide it in stddef.h, either as a macro defined like the one above or as one that relies on a compiler intrinsic.

Using offsetof to solve our problem

To use offsetof, we need to specify the type of the structure we are operating on and the name of the member whose offset from the top of the structure we want to compute. In our case, that would be:
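A minimal sketch of that call. The wrapper function member_offset is hypothetical, and the structure definition repeats the layout described earlier with assumed names for the tag and for the float and char fields.

```c
#include <stddef.h>

typedef struct s_struct
{
    float   f;        /* assumed field name */
    int     member;
    char    c;        /* assumed field name */
}   t_struct;

/* Offset, in bytes, of member from the start of a t_struct. */
size_t member_offset(void)
{
    return offsetof(t_struct, member);
}
```

The exact value depends on the platform's sizes and padding, which is why it is computed rather than hard-coded.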

Computing the actual address of the structure

Now that we can calculate the offset between the address of the structure and the address of the member, it is easy to deduce the address of the structure. Let’s illustrate this with a complete working example:
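A sketch of one way to put the pieces together, assuming the structure layout described earlier. The function name get_struct_ptr, the parameter name member_ptr, the tag s_struct and the field names f and c are all hypothetical.

```c
#include <stddef.h>

typedef struct s_struct
{
    float   f;        /* assumed field name */
    int     member;
    char    c;        /* assumed field name */
}   t_struct;

/* Given the address of the member field, returns the address of the
 * t_struct that contains it. */
t_struct *get_struct_ptr(int *member_ptr)
{
    /* Cast to char * so that the subtraction is done in bytes,
     * the unit in which offsetof expresses its result. */
    return (t_struct *)((char *)member_ptr - offsetof(t_struct, member));
}
```

Given a variable s of type t_struct, calling get_struct_ptr(&s.member) should return &s, and it keeps doing so if members are added to or removed from the structure.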

my_offsetof

During the contest it wasn’t specified whether I could use the standard offsetof macro, so I wanted to push things a little bit further by writing my own macro that would do the same job as well as being portable.

The problem with the expression we’ve seen above is that it relies on how the compiler evaluates it at compile time. To address this issue, we can use a simple macro that uses a perfectly valid address instead of the zero address used by offsetof:
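One way such a macro could be written is sketched below. The name my_offsetof comes from this post; the parameter names ptr and m, the helper get_struct_ptr_portable, its local variable dummy, the tag s_struct and the field names f and c are assumptions made for the illustration.

```c
#include <stddef.h>

/* ptr: pointer to a valid structure, m: member name.
 * Subtracts the address of the structure from the address of its
 * member, both viewed as char * so the result is in bytes. */
#define my_offsetof(ptr, m) ((size_t)((char *)&(ptr)->m - (char *)(ptr)))

typedef struct s_struct
{
    float   f;        /* assumed field name */
    int     member;
    char    c;        /* assumed field name */
}   t_struct;

t_struct *get_struct_ptr_portable(int *member_ptr)
{
    t_struct dummy;   /* never read; only its addresses are used */

    return (t_struct *)((char *)member_ptr - my_offsetof(&dummy, member));
}
```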

In this case, the my_offsetof macro does not take the type of the structure as a parameter, but a pointer to a valid t_struct structure. It does, however, have the disadvantage of being slower (it also performs a subtraction) and of taking more memory, since we need to allocate space for an additional structure.