In this document, we examine some desirable extensions that some C compilers have implemented to make flexible array members ("FAMs") more usable, consider alternatives for compilers that do not implement these extensions, and ask whether the lack of a portable and straightforward way to automatically allocate memory for a FAM upon initialization is a deficiency in the ISO C standard.
FAMs are a feature introduced in the C99 standard. To motivate use cases, let's consider a deficient interface that could be helped by them: the Berkeley sockets API. With the sockets API, one fills in the following structure to create or connect to a UNIX domain socket:
#define _POSIX_C_SOURCE 200809L
#include <sys/un.h>
struct sockaddr_un {
    sa_family_t sun_family;
    char sun_path[/* unspecified size */];
};
This structure may also contain additional non-standard members which a portable application need not be concerned with. Here sun_family should always be set to the magic value AF_UNIX, and sun_path contains a not-necessarily null-terminated pathname in the filesystem for the socket.
POSIX leaves the size of the sun_path array unspecified, and in fact it is not clear whether it is permitted to be a flexible array member. For the purposes of this discussion, we'll assume hereafter that sun_path is not a FAM, as is the case in every major implementation. (After all, the sockets API predates the introduction of FAMs.)
In practice, sun_path is usually quite small (108 bytes on Linux, for instance), and its size is chosen arbitrarily by the standard library. This is an undesirable situation: if the pathname may contain multibyte characters, one may be limited to a pathname with as few as sizeof((struct sockaddr_un){0}.sun_path)/MB_LEN_MAX characters. If the size of the sun_path array were omitted from its declaration, and the array were the last member of a structure that has at least one other member, then it would constitute a flexible array member.
The idea is that one can allocate extra memory at the end of the structure to provide as much room for the flexible array member as one wishes; the flexible array member then provides a name for this extra room. One could then use it like this (error checking omitted for brevity):
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/un.h>
#define PATH "/run/foo/bar/socket"
int main(void) {
    /* Note that sizeof yields the size of the structure as if the flexible
     * array member were not there (since the compiler doesn't know the size of
     * the memory that we're managing for it), except that there might be extra
     * padding at the end as necessary to satisfy alignment requirements. */
    struct sockaddr_un *const addr = malloc(sizeof(*addr) + sizeof(PATH));
    /* This seems to be the only strictly conforming way to clear
     * out the non-standard members of the structure. An all-bits-zero
     * representation (i.e. the product of using memset()) need not be
     * the same thing as what we'd get with default initialization. */
    memcpy(addr, &(struct sockaddr_un){.sun_family = AF_UNIX}, sizeof(struct sockaddr_un));
    /* Copy the pathname, including its terminating null byte. */
    memcpy(&addr->sun_path, PATH, sizeof(PATH));
    /* do something with our structure */
    free(addr);
}
However, for most usages dynamic allocation is overkill. If the string or its length is known at compile-time, we'd like to simply do
struct sockaddr_un addr = {
    .sun_family = AF_UNIX,
    .sun_path = "/run/foo/bar/socket"
};
Unfortunately, if sun_path is a flexible array member, this is no longer possible in strictly conforming ISO C. When a structure with a FAM is automatically allocated, it is given no room for the FAM, but there is little sense in not "doing the right thing" when an explicit initializer is provided, as is done here. Some compilers, such as TinyCC, support this as an extension: if an initializer is provided for a FAM, then the appropriate amount of room is allocated.
If an initializer is known to have a fixed size, then a disgusting but portable and strictly conforming way to get exactly what we want is to do our own "memory management" on the stack using a union. We can create an array of char that is big enough to hold our structure together with its flexible array member:
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/un.h>
#define PATH "/run/foo/bar/socket"
int main(void) {
    union {
        struct sockaddr_un addr;
        char spc[sizeof(struct sockaddr_un) + sizeof(PATH)];
    } addr = {0};
    addr.addr.sun_family = AF_UNIX;
    memcpy(&addr.addr.sun_path, PATH, sizeof(PATH));
}
Given that this is possible, one wonders why it isn't supported more cleanly and directly in C. With a GCC extension, variable-length arrays inside unions and structures, a similar trick works to allocate a variable amount of space for the flexible array member. This might be useful where malloc() is not available and where the structure with the flexible array member is only needed within a block of code.