`offsetof` can wait for now; let’s talk pointers quick.
In C and C++, there is a datatype called a pointer. Here’s an example:
/* Define some number, `x` */
int x = 0;
/* Assign memory location of `x` to a pointer, `p` */
int *p = &x;
/* Dereference `p`, setting what `p` points to (in this case, `x`) to 5 */
*p = 5; // this changes the value of x to 5
And that’s a very quick introduction to pointers. Play around with them if the concept’s not clear to you. The key points for now are:
- The `&` operator gets the memory location of an object. (`&` is also used to create a reference in C++, which is a whole different thing.)
- The `*` operator has two meanings here:
  - In the declaration of `p`, it denotes the datatype of `p` as a pointer (read as “`p` is a pointer to `int`”).
  - In an expression, `*` is used to dereference `p`, accessing the value `p` points to (in this case, `x`).
Now, also cool is that you can do pointer arithmetic. For example, we can (most likely) do this:
int x = 5;
int y = 8;
/* Set pointer to `x` */
int *p = &x;
/* Modify `x` */
*p = 100;
/* Increment `p` */
p++; //same result as p += 1
/* Modify `y` (undefined behavior: nothing guarantees `y` sits right after `x`, though it may appear to work) */
*p = 1000;
Here, we do the same thing we did before with `x` and `p`. However, then we increment `p`, and then we are able to dereference `p` and access the value of `y`! How is this possible? Well, local variables are allocated in the same segment of memory in the program (the stack, but that’s not important for now), and they’re often allocated right next to each other in memory, so adding `sizeof(int)` bytes to a pointer to `x` can make `p` point to the next integer in memory, `y`. The compiler is free to lay locals out in a different order or keep them in registers, though, so treat this as an illustration of how memory works and not as something to rely on.
Now, that `p++` operation might be confusing, because `++` usually only increments a number by 1, and `sizeof(int) == 4` (for most platforms used today). Don’t worry though; pointer arithmetic is a well defined C operation:
If `p` is a pointer to some element of an array, then `p++` increments `p` to point to the next element, and `p+=i` increments it to point `i` elements beyond where it currently does.
Key things about pointer arithmetic here:
- `p++` here actually increments the pointer by `sizeof(int)` bytes, since `p` points to an `int`.
Almost there to `offsetof`; let’s talk about `struct`s first. A `struct` is very simple; it’s just a container that holds arbitrary data. For instance:
/* Create struct named my_struct */
struct my_struct {
int x;
float z;
long num;
char str[4];
};
/* Make an instance of my_struct named s that holds our data */
struct my_struct s = {
10,
5.8,
11342,
"hey" // 4 bytes, last one is '\0'
};
So, we’ve got a bunch of stuff stored contiguously in memory (ignore struct padding for now). Using what we just learned about pointers, we can access elements of a `struct` in exactly the same way using a pointer to a `struct`!
#include <stdio.h>

/* Our struct from before (byte ranges assume a typical 64-bit Linux system) */
struct my_struct {
    int x;        // bytes 0-3
    float z;      // bytes 4-7
    long num;     // bytes 8-15
    char str[4];  // bytes 16-19, then 4 bytes of padding (20-23)
};

int main(void)
{
    struct my_struct s = { 10, 5.8f, 11342, "hey" };

    /* Make `p` point to `s` */
    struct my_struct *p = &s;

    printf("prints 10: %d\n", *((int*)p));
    printf("prints 5.8: %f\n", *(float*)((char*)p + 4));
    return 0;
}
So we have this pointer to a `struct my_struct`, and in the first `printf` we’re casting it to an `int` pointer, then dereferencing it. This should make perfect sense: the `int` we want to access, `x`, is the first element of the `struct`; therefore, if we cast our `my_struct` pointer, `p`, to an `int` pointer, it will point at the `int`.
That final `printf` has a bit more going on in it. Let’s write the relevant part again to get a better look:
*(float*)((char*)p + 4)
This piece of code takes the pointer, `p`, casts it to type `char *`, adds `4` to it, casts that incremented pointer to type `float *`, and dereferences it to access the `float z` stored inside the `struct my_struct` pointed to by `p`.
The final `float *` cast and dereference is the same as in the previous `printf`, and shouldn’t be confusing. Looking at the potentially confusing part: we cast `p` to type `char *` because of what we previously learned about pointer arithmetic: if we do `p + n`, we increment `p` by `n * sizeof(*p)` bytes, that is, by `n` whole objects of the type `p` points to. On my system, this math checks out:
printf("%p\n", (void *)p); // printed 0x7ffcda62f9e0 on my system
printf("%p\n", (void *)(p + 4)); // printed 0x7ffcda62fa40 on my system
printf("%td\n", (char *)(p + 4) - (char *)p); // prints 96, and 4 * sizeof(struct my_struct) == 96!
If we tried to access the memory at `p+4`, 96 bytes past the start of `s`, that would be undefined behavior. We’d just be accessing random, garbage memory! The program might print 0 or some other leftover value (that location is not many bytes away from the rest of our program data), or it might crash with a segmentation fault.
We need to increment `p` by the correct offset value by adding `sizeof(int)` bytes to `p`, since the `4` bytes making up `int x` are all that separates `p` from the `struct` member we want to access, `float z`. By casting `p` to type `char *`, we can now do `p+4` and get the expected result since `sizeof(char) == 1`.
Determining the byte offset for each element of a `struct` manually is beyond the scope of this article: see here for a quick explanation about it, and consult C/C++ documentation for the size of each primitive type on your platform (for instance, `long` is `4` bytes on Windows, but `8` on most other 64-bit systems).
At this point, everything here should be fairly clear. Let’s recap:
- A `struct` allows us to store data in contiguous locations in memory.
- We can access a `struct` data member by adding the bytes needed to store all previous data members (plus any padding between them) to the memory location of the `struct` object.
- A `struct` has padding, which aligns each member to a boundary suitable for its type and rounds `sizeof(struct)` up to a multiple of the strictest member alignment (commonly 4 or 8, depending on your system), so it can be confusing and unportable to compute the offset of a `struct` member manually.
If any points remain confusing, try drawing out a memory diagram, or try it yourself with more examples. If you’ve made it this far, you should be able to understand this stuff.
For further reading, I’d recommend this link, The Lost Art of Structure Packing. Everything I’ve written about applies to a basic `struct` in both C and C++, but, as the link mentions for C++, “classes that look like structs may ignore the rule that the address of a struct is the address of its first member!” (source).
offsetof
Finally, we’re here! Hopefully you noticed my bold text before and can already guess what `offsetof` does, but now let’s cover the details. First, some information from the man page:
/* From man offsetof(3) */
#include <stddef.h>
size_t offsetof(type, member);
/* The macro offsetof() returns the offset of the field `member` from
* the start of the structure `type`.
*/
Let’s bullet the major takeaways here:
- `offsetof` is a macro, not a function.
- `offsetof` is evaluated at compile time (good for runtime performance).
- `offsetof` takes an argument `type`, and an argument `member`.
Here’s a quick example of a call to `offsetof`, using our `struct` named `my_struct` as an example:
#include <stddef.h>
/* Returns 8 (for a Linux 64-bit system) */
size_t my_offset = offsetof(struct my_struct, num);
Now, let’s actually see how `offsetof` is implemented. Wikipedia has a very good article on the topic, so let’s examine their implementation of `offsetof`:
#define offsetof(type, member) \
((size_t)&(((type *)0)->member))
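This looks a bit similar to our previous example, but a lot more confusing. Let’s unravel it:
- Cast `0` to a pointer to `type`, pretending that an object of type `type` starts at memory address 0.
- Use the `->` operator to name `member` inside that pretend object.
- Take the address of that member with the `&` operator. Since the object “starts” at 0, that address is the member’s offset.
- Cast the result to `size_t`, in case the value returned is a very large number.
In practice, this means we can access a `struct` member in a more programmatic way using this calculated offset value: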
/* Our struct from before */
struct my_struct {
int x; // bytes 0-4
float z; // bytes 4-8
long num; // bytes 8-16
char str[4]; // bytes 16-24 (padding!!!)
};
struct my_struct s = (struct my_struct) {
10,
5.8,
11342,
"hey"
};
struct my_struct *p = &s;
/* Get offset of num (== 8 on Linux) */
size_t my_offset = offsetof(struct my_struct, num);
/* Access the field with the offset */
long my_num = *(long *)((char *)p + my_offset);
As a final note about `offsetof`: a bunch of people like to have a long argument over whether this macro is actually undefined behavior or not (on paper, it dereferences a null pointer), which is why compilers like gcc implement the macro using a compiler builtin, `__builtin_offsetof`. All I know for sure is that the code works with gcc and MSVC; therefore, while it’s probably safe to use your own implementation in practice, it’s better to just use `offsetof` as defined in `stddef.h`.
offsetof
Pointer arithmetic, `offsetof`: it might all come off as needlessly complex to a freshman computer science student. “Why would I ever need this in real life?” he asks himself while not paying attention to any of his classes because he thinks they’re all pointless, only to come out of school not having learned anything because he decided everything was “pointless.” However, I promise that `offsetof` has actual uses.
First, I’d just like to highlight how useful of a learning exercise this was. Obtaining a deep understanding of what `offsetof` is and how to use it has allowed us to strengthen our understanding of the memory layout of a `struct` and see how pointers work, two of the most important concepts for a C programmer to understand.
Anyways, here is a quick code snippet that roughly illustrates how `offsetof` might be used in a real production C or C++ program:
struct Database {
/* A ton of data members in here */
...
int data_member;
};
struct UpdateEntry {
const char *data_name;
size_t offset;
void (*update_func)(void *);
};
Now, we might create an array of `struct UpdateEntry`, which would look something like this:
struct UpdateEntry table[] = {
    /* Lots of entries here... */
    { "my favorite member", offsetof(struct Database, data_member), &my_update_func }
};
It should hopefully be clear what we’re doing here: the `struct UpdateEntry` contains some string, an offset into our database, and a pointer to a function `update_func`. This could allow us to generically update our database items through each of these entries present in the array by, for instance, doing a function call like this:
// Earlier in the file, a struct Database named db is defined
struct UpdateEntry *my_entry = &table[some_madeup_index];
// The cast binds tighter than `+`, so the extra parentheses are optional but harmless
my_entry->update_func(((uint8_t *)&db) + my_entry->offset);
// uint8_t comes from <stdint.h>; the uint8_t * argument converts to void * implicitly in both C and C++
We have now passed the member `data_member` by pointer to the function pointed to by `update_func`, and we can now update any data member of the `struct` as we please in a fairly self-contained and generic fashion.
This is the major use case of `offsetof`: there’s not really another easy way to statically reference a certain member of a `struct` in C. `offsetof` allows us to do this, making programming constructs like my `struct UpdateEntry` possible and, therefore, relatively common among larger C codebases.
Site last updated: 2025-01-05