Technique: Objects in C

While C is not an object-oriented language, objects can be implemented using structures.

Common first-pass approaches to library implementations involve APIs that operate on static library data. We typically reach a point where we need our library interfaces to work on any number of instances, instead of just internal static instances.

Objects can be thought of as structures, and member functions are those which take a pointer to the particular structure which represents the instance that should be operated on. The structure should contain all of the relevant data that should be contained with an object.

For example, consider a circular buffer structure:

struct circular_buf_t {
	uint8_t * buffer;
	size_t head;
	size_t tail;
	size_t max; //of the buffer
	bool full;
};

To ensure our C APIs can be used with multiple objects, we need to:

  1. Ensure that all mutable configuration options and context-specific information (e.g., state) in stored in the structure
  2. Ensure that the necessary APIs take a pointer to the structure (to avoid copies and enable state modifications if necessary)
    1. Functions that do not modify the structure should specify the input parameter as const for clarity

To support multiple objects, we would declare a struct instance for each object. We invoke the associated APIs with whatever struct we need to use at that time.

Our APIs are then adjusted to take a structure pointer as the first parameter. For example:

/// Retrieve a value from the buffer
/// Returns 0 on success, -1 if the buffer is empty
int circular_buf_get(struct circular_buf_t* me, uint8_t * data);

References

Technique: Encapsulation and Information Hiding in C

Encapsulation is usually associated with object-oriented programming languages, and information hiding is not straightforward to achieve in C. However, these two principles can still be applied by creating interfaces that work on “opaque” structures.

The first step is to forward-declare a structure in a header file. This structure will NOT have a definition in the header file, so users cannot declare this type directly in their code. They can only work with pointers to this type.

// Opaque circular buffer structure
typedef struct circular_buf_t circular_buf_t;

The definition for this structure will be kept inside of the corresponding .c file. This will ensure that we can change the structure definition as needed, without requiring end users to update their code. The users only need to know that this type of structure exists.

We often provide a corresponding “handle” type that is mapped as a pointer to this opaque structure, a void *, or a uintptr_t. This is a better indication that the user cannot directly dereference the type.

// Handle type, the way users interact with the API
typedef circular_buf_t* cbuf_handle_t;
Note

If not using a pointer to the structure as the handle type, then inside of the implementation you would cast to the appropriate type.

You could also completely eliminate the struct forward definition and instead only supply a generic void * or uintptr_t handle type.

All public interfaces will be declared in the header file and must operate on the handle type. All private interfaces and data will be defined in the corresponding .c file so they are not visible to consumers of the header.

When working with opaque types, an initialization function is often required in order to generate a valid handle.

/// Pass in a storage buffer and size 
/// Returns a circular buffer handle 
cbuf_handle_t circular_buf_init(uint8_t* buffer, size_t size);
Note

Recall that since our opaque struct has no public definition in the .h file, users cannot declare these structures on their own. If you do not want to utilize dynamic memory allocation (common in embedded software), you can instead statically pre-allocate a pool of objects inside of your .c file. Be sure to assert or return a NULL value if the assignment fails because all pre-allocated objects are in use. Also indicate to users that they should check for failure.

That handle is then passed as an input parameter to all public functions associated with the opaque type:

/// Reset the circular buffer to empty, head == tail
void circular_buf_reset(cbuf_handle_t me);

/// Put version 1 continues to add data if the buffer is full
/// Old data is overwritten
void circular_buf_put(cbuf_handle_t me, uint8_t data);

A “destructor” is usually required in order to free any previously allocated data:

/// Free a circular buffer structure. 
/// Does not free data buffer; owner is responsible for that 
void circular_buf_free(cbuf_handle_t me);

Summary

  • Forward-declare structures or provide handle typedefs to users
  • Declare an initialization function that can generate an “object” of the appropriate type and return the handle to the user
  • Declare a destructor that can free any memory allocated to the “object” with the associated handle
  • All public interfaces are declared in the header file and operate on the handle type
  • All private interfaces and data is declared within the .c file (without extern!) so that external modules cannot access them

References

Technique: Refactoring Global Data to Support Multiple Instances

5 August 2020 by Phillip Johnston • Last updated 14 December 2021Many applications are initially written to support a single instance of an object in a system. This often maps directly to the initial design of a system. However, future growth often requires these assumptions to be adjusted. Functions built around the assumption of a single instance of global data will often need to be updated to support multiple instances. In other cases, the use of concurrency requires that a function is made reentrant by removing internal references to global data and converting functions to use pointers or values as …

Paper: What Every Systems Programmer Should Know About Lockless Concurrency

21 May 2020 by Phillip Johnston • Last updated 15 August 2023 “What Every Systems Programmer Should Know About Lockless Concurrency” is a short paper by Matt Kline that explores ways we can synchronize concurrent operations without relying on standard OS constructs, such as mutexes, semaphores, and condition variables. These strategies are also used to implement those same OS constructs. The paper also briefly touches on sources of concurrency problems that arise from compilers and from the hardware itself. Note: This paper was featured as a Reading Club assignment. Files What Every Systems Programmer Should Know About Lockless Concurrency Archive …

Enabling Users to Override Functions in C & C++

8 May 2020 by Phillip Johnston • Last updated 14 December 2021 There are many cases were we want to give our users the ability to override functions in a library or module. This is often for be for customization or optimization purposes. The Pattern With many C and C++ compilers, we can declare function and variable definitions as “weakly linked” using __attribute__((weak)) or the __weak alias: __attribute__((weak)) void weak_function() { // Intentional no-op } Note: While ELF allows you to declare a symbol with weak linkage and not supply a definition, this approach will fail with Mach-O objects and …

How Non-Member Functions Improve Encapsulation

We’ve received permission from Scott Meyers to re-publish this article on our website after the version on Dr. Dobbs went down. I’ve taken the liberty of converting some of the bold print to code font, as well as clarifying some code example indentation/bracketing.

Even if you don’t completely agree with Scott’s conclusion, this article still serves as an excellent discussion about the idea of encapsulation.

How Non-Member Functions Improve Encapsulation

by Scott Meyers

When it comes to encapsulation, sometimes less is more.

I’ll start with the punchline: If you’re writing a function that can be implemented as either a member or as a non-friend non-member, you should prefer to implement it as a non-member function. That decision increases class encapsulation. When you think encapsulation, you should think non-member functions.

Surprised? Read on.

Background

When I wrote the first edition of Effective C++ in 1991 [1], I examined the problem of determining where to declare a function that was related to a class. Given a class C and a function f related to C, I developed the following algorithm:

if (f needs to be virtual)
{
	make f a member function of C;
}
else if (f is operator>> or operator<<)
{
	make f a non-member function;
	if (f needs access to non-public members of C)
	{
		make f a friend of C;
	}
}
else if (f needs type conversions on its left-most argument)
{
	make f a non-member function;
	if (f needs access to non-public members of C)
	{
		make f a friend of C;
	}
}
else
{
	make f a member function of C;
}

This algorithm served me well through the years, and when I revised Effective C++ for its second edition in 1997 [2], I made no changes to this part of the book.

In 1998, however, I gave a presentation at Actel, where Arun Kundu observed that my algorithm dictated that functions should be member functions even when they could be implemented as non-members that used only C’s public interface. Is that really what I meant, he asked me? In other words, if f could be implemented as a member function or a non-friend non-member function, did I really advocate making it a member function? I thought about it for a moment, and I decided that that was not what I meant. I therefore modified the algorithm to look like this [3]:

if (f needs to be virtual)
{
	make f a member function of C;
}
else if (f is operator>> or operator<<)
{
	make f a non-member function;
	if (f needs access to non-public members of C)
	{
		make f a friend of C;
	}
}
else if (f needs type conversions on its left-most argument)
{
	make f a non-member function;
	if (f needs access to non-public members of C)
	{
		make f a friend of C;
	}
}
else if (f can be implemented via C's public interface)
{
	make f a non-member function;
}
else
{
	make f a member function of C;
}

Since then, I’ve been battling programmers who’ve taken to heart the lesson that being object-oriented means putting functions inside the classes containing the data on which the functions operate. After all, they tell me, that’s what encapsulation is all about.

They are mistaken.

Encapsulation

Encapsulation is a means, not an end. There’s nothing inherently desirable about encapsulation. Encapsulation is useful only because it yields other things in our software that we care about. In particular, it yields flexibility and robustness. Consider this struct, whose implementation I think we’ll all agree is unencapsulated:

struct Point 
{
	int x, y;
};

The weakness of this struct is that it’s not flexible in the face of change. Once clients started using this struct, it would, practically speaking, be very hard to change it; too much client code would be broken. If we later decided we wanted to compute x and y instead of storing those values, we’d probably be out of luck. We’d be similarly thwarted if we decided a superior design would be to look x and y up in a database. This is the real problem with poor encapsulation: it precludes future implementation changes. Unencapsulated software is inflexible, and as a result, it’s not very robust. When the world changes, the software is unable to gracefully change with it. (Remember that we’re talking here about what is practical, not what is possible. It’s clearly possible to change struct Point, but if enough code is dependent on it in its current form, it’s not practical.)

Now consider a class with an interface that offers clients capabilities similar to those afforded by the struct above, but with an encapsulated implementation:

class Point {
public:
  int getXValue() const;
  int getYValue() const;
  void setXValue(int newXValue);
  void setYValue(int newYValue);

private:
... // whatever...
};

This interface supports the implementation used by the struct (storing x and y as ints), but it also affords alternative implementations, such as those based on computation or database lookup. This is a more flexible design, and the flexibility makes the resulting software more robust. If the class’s implementation is found lacking, it can be changed without requiring changes to client code. Assuming the declarations of the public member functions remain unchanged, client source code is unaffected. (If a suitable implementation has been adopted [4], clients need not even recompile.)

Encapsulated software is more flexible than unencapsulated software, and, all other things being equal, that flexibility makes it the superior design choice.

Degrees of Encapsulation

The class above doesn’t fully encapsulate its implementation. If the implementation changes, there’s still code that might break. In particular, the member functions of the class might break. In all likelihood, they are dependent on the particulars of the data members of the class. Still, it seems clear that the class is more encapsulated than the struct, and we’d like to have a way to state this more formally.

It’s easily done. The reason the class is more encapsulated than the struct is that more code might be broken if the (public) data members in the struct change than if the (private) data members of the class change. This leads to a reasonable approach to evaluating the relative encapsulations of two implementations: if changing one might lead to more broken code than would the corresponding change to the other, the former is less encapsulated than the latter. This definition is consistent with our intuition that if making a change is likely to break a lot of code, we’re less likely to make that change than we would be to make a different change that affected less code. There is a direct relationship between encapsulation (how much code might be broken if something changes) and practical flexibility (the likelihood that we’ll make a particular change).

An easy way to measure how much code might be broken is to count the functions that might be affected. That is, if changing one implementation leads to more potentially broken functions than does changing another implementation, the first implementation is less encapsulated than the second. If we apply this reasoning to the struct above, we see that changing its data members may break an unknowably large number of functions — every function that uses the struct. In general, we can’t count how many functions this is, because there’s no way to locate all the code that uses a particular struct. This is especially true for library code. However, the number of functions that might be broken if the class’s data members change is easy to determine: it’s all the functions that have access to the private part of the class. That’s just four functions (assuming none are declared in the private part of the class), and we know that because they’re all conveniently listed in the class definition. Since they’re the only functions that have access to the private parts of the class, they’re the only functions that can be affected if those parts change.

Encapsulation and Non-Member Functions

We’ve now seen that a reasonable way to gauge the amount of encapsulation in a class is to count the number of functions that might be broken if the class’s implementation changes. That being the case, it becomes clear that a class with n member functions is more encapsulated than a class with n+1 member functions. And that observation is what justifies my argument for preferring non-member non-friend functions to member functions: if a function f could be implemented as a member function or as a non-friend non-member function, making it a member would decrease encapsulation, while making it a non-member wouldn’t. Since functionality is not at issue here (the functionality of f is available to class clients regardless of where f is located), we naturally prefer the more encapsulated design. It’s important that we’re trying to choose between member functions and non-friend non-member functions. Just like member functions, friend functions may be broken when a class’s implementation changes, so the choice between member functions and friend functions is properly made on behavioral grounds. Furthermore, we now see that the common claim that ‘‘friend functions violate encapsulation’’ is not quite true. Friends don’t violate encapsulation, they just decrease it — in exactly the same manner as member functions.

This analysis applies to any kind of member functions, including static ones. Adding a static member function to a class when its functionality could be implemented as a non-friend non-member decreases encapsulation by exactly the same amount as does adding a non-static member function. One implication of this is that it’s generally a bad idea to move a free function into a class as a static member just to show that it’s related to the class. For example, if I have an abstract base class for Widgets and then use a factory function [4,5,6] to make it possible for clients to create Widgets, the following is a common, but inferior way to organize things:

// a design less encapsulated than it could be
class Widget {
... // all the Widget stuff; may be
// public, private, or protected

public:

// could also be a non-friend non-member
static Widget* make(/* params */);
};

A better design is to move make out of Widget, thus increasing the overall encapsulation of the system. To show that Widget and make are related, the proper tool is a namespace:

// a more encapsulated design
namespace WidgetStuff {
  class Widget { ... };
  Widget* make( /* params */ );
};

Alas, there is a weakness to this design when templates enter the picture. For details, see the accompanying sidebar.

Syntax Issues

If you’re like many people with whom I’ve discussed this issue, you’re likely to have reservations about the syntactic implications of my advice that non-friend non-member functions should be preferred to member functions, even if you buy my argument about encapsulation. For example, suppose a class Wombat supports the functionality of both eating and sleeping. Further suppose that the eating functionality must be implemented as a member function, but the sleeping functionality could be implemented as a member or as a non-friend non-member function. If you follow my advice from above, you’d declare things like this:

class Wombat {
public:
  void eat(double tonsToEat);
...
};

void sleep(Wombat& w, double hoursToSnooze);

That would lead to a syntactic inconsistency for class clients, because for a Wombat w, they’d write

w.eat(.564);

to make it eat, but they would write

sleep(w, 2.57);

to make it sleep. Using only member functions, things would look much neater:

class Wombat {
public:
  void eat(double tonsToEat);
  void sleep(double hoursToSnooze);
  ...
};

w.eat(.564);
w.sleep(2.57);

Ah, the uniformity of it all! But this uniformity is misleading, because there are more functions in the world than are dreamt of by your philosophy.

To put it bluntly, non-member functions happen. Let us continue with the Wombat example. Suppose you write software to model these fetching creatures, and imagine that one of the things you frequently need your Wombats to do is sleep for precisely half an hour. Clearly, you could litter your code with calls to w.sleep(.5), but that would be a lot of .5s to type, and at any rate, what if that magic value were to change? There are a number of ways to deal with this issue, but perhaps the simplest is to define a function that encapsulates the details of what you want to do. Assuming you’re not the author of Wombat, the function will necessarily have to be a non-member, and you’ll have to call it as such:

// might be inline, but it doesn't matter
void nap(Wombat& w) { w.sleep(.5); }

Wombat w;
...
nap(w);

And there you have it, your dreaded syntactic inconsistency. When you want to feed your wombats, you make member function calls, but when you want them to nap, you make non-member calls.

If you reflect a bit and are honest with yourself, you’ll admit that you have this alleged inconsistency with all the nontrivial classes you use, because no class has every function desired by every client. Every client adds at least a few convenience functions of their own, and these functions are always non-members. C++ programers are used to this, and they think nothing of it. Some calls use member syntax, and some use non-member syntax. People just look up which syntax is appropriate for the functions they want to call, then they call them. Life goes on. It goes on especially in the STL portion of the Standard C++ library, where some algorithms are member functions (e.g., size), some are non-member functions (e.g., unique), and some are both (e.g., find). Nobody blinks. Not even you.

Interfaces and Packaging

Herb Sutter has explained that the ‘‘interface’’ to a class (roughly speaking, the functionality provided by the class) includes the non-member functions related to the class, and he’s shown that the name lookup rules of C++ support this meaning of ‘‘interface’’ [7,8]. This is wonderful news for my ‘‘non-friend non-members are better than members’’ argument, because it means that the decision to make a class-related function a non-friend non-member instead of a member need not even change the interface to that class! Moreover, the liberation of the functions in a class’s interface from the confines of the class definition leads to some wonderful packaging flexibility that would otherwise be unavailable. In particular, it means that the interface to a class may be split across multiple header files.

Suppose the author of the Wombat class discovered that Wombat clients often need a number of convenience functions related to eating, sleeping, and breeding. Such convenience functions are by definition not strictly necessary. The same functionality could be obtained via other (albeit more cumbersome) member function calls. As a result, and in accord with my advice in this article, each convenience function should be a non-friend non-member. But suppose the clients of the convenience functions for eating rarely needed the convenience functions for sleeping or breeding. And suppose the clients of the sleeping and breeding convenience functions also rarely needed the convenience functions for eating and, respectively, breeding and sleeping.

Rather than putting all Wombat-related functions into a single header file, a preferable design would be to partition the Wombat interface across four separate headers, one for core Wombat functionality (primarily the class definition), and one each for convenience functions related to eating, sleeping, and breeding. Clients then include only the headers they need. The resulting software is not only clearer, it also contains fewer gratuitous compilation dependencies [4,9]. This multiple-header approach was adopted for the standard library. The contents of namespace std are spread across 50 different headers. Clients #include the headers declaring the parts of the library they care about, and they ignore everything else.

In addition, this approach is extensible. When the declarations for the functions making up a class’s interface are spread across multiple header files, it becomes natural for clients creating application-specific sets of convenience functions to cluster those functions into a new header file and to #include that file as appropriate. In other words, to treat the application-specific convenience functions just like they treat the convenience functions provided by the author of the class. This is as it should be. After all, they’re all just convenience functions.

Minimalness and Encapsulation

In Effective C++, I argued for class interfaces that are complete and minimal [10]. Such interfaces allow class clients to do anything they might reasonably want to do, but classes contain no more member functions than are absolutely necessary. Adding functions beyond the minimum necessary to let clients get their jobs done, I wrote, decreases the class’s comprehensibility and maintainability, plus it increases compilation times for clients. Jack Reeves has written that the addition of member functions beyond those truly required violates the open/closed principle, yields fat class interfaces, and ultimately leads to software rot [11]. That’s a fair number of arguments for minimizing the number of member functions in a class, but now we have an additional reason: failure to do so decreases a class’s encapsulation. Of course, a minimal class interface is not necessarily the best interface. I remarked in Effective C++ that adding functions beyond those truly necessary may be justifiable if it significantly improves the performance of the class, makes the class easier to use, or prevents likely client errors [10]. Based on his work with various string-like classes, Jack Reeves has observed that some functions just don’t ‘‘feel’’ right when made non-members, even if they could be non-friend non-members [12]. The ‘‘best’’ interface for a class can be found only by balancing many competing concerns, of which the degree of encapsulation is but one.

Still, the lesson of this article should be clear. Conventional wisdom notwithstanding, use of non-friend non-member functions improves a class’s encapsulation, and a preference for such functions over member functions makes it easier to design and develop classes with interfaces that are complete and minimal (or close to minimal). Arguments about the naturalness of the resulting calling syntax are generally unfounded, and adoption of a predilection for non-friend non-member functions leads to packaging strategies for a class’s interface that minimize client compilation dependencies while maximizing the number of convenience functions available to them.

It’s time to abandon the traditional, but inaccurate, ideas of what it means to be object-oriented. Are you a true encapsulation believer? If so, I know you’ll embrace non-friend non-member functions with the fervor they deserve.

In the main article, I argue that static member functions should be made non-members whenever that is possible, because that increases class encapsulation. I consider these two possible implementations for a factory function:

// the less encapsulated design
class Widget {
   ...  
public:
   static Widget* make(/* params */);
};     

// the more encapsulated design
namespace WidgetStuff {
   class Widget { ... };
   Widget* make( /* params */ );
};

Andrew Koenig pointed out that the first design (where make is static inside the class) enables one to write a template function that invokes make without knowing the type of what is being made:

template<typename T>
void doSomething( /* params */ )
{
   // invoke T’s factory function
   T *pt = T::make( /* params */ );    
   ...
}

This isn’t possible with the namespace-based design, because there’s no way from inside a template to identify the namespace in which a type parameter is located. That is, there’s no way to figure out what ??? is in the pseudocode below:

template<typename T>
void doSomething( /* params */ )
{
   // there’s no way to know T’s containing namespace!
   T *pt = ???::make( /* params */ );
   ...                               
}

For factory functions and similar functions which can be given uniform names, this means that maximal class encapsulation and maximal template utility are at odds. In such cases, you have to decide which is more important and cater to that. However, for static member functions with class-specific names, the template issue fails to arise, and encapsulation can again assume precedence.

Acknowledgements

Thanks to Arun Kundu for asking the question that led to this article. Thanks also to Jack Reeves, Herb Sutter, Dave Smallberg, Andrei Alexandrescu, Bruce Eckel, Bjarne Stroustrup, and Andrew Koenig for comments on pre-publication drafts that weren’t as good as they should have been. (That’s why they were drafts.) Finally, great thanks to Adela Novak for organizing the C++ seminars in Lucerne (Switzerland) that led to the many hours on planes and trains that allowed me to write the initial draft of this article.

Notes and References

[1] Scott Meyers. Effective C++: 50 Specific Ways to Improve Your Programs and Designs, First Edition (Addison-Wesley, 1991), Item 19.
[2] Scott Meyers. Effective C++, Second Edition (Addison-Wesley, 1998).
[3] The algorithm remains unchanged in current printings of Effective C++, because I’d have to also add the requisite reasoning (this article), and making such a substantial change to a book already in production simply isn’t practical.
[4] Effective C++, Item 34.
[5] Erich Gamma et al. Design Patterns, Elements of Reusable Object-Oriented Software (Addison-Wesley, 1995). Also known as the GOF or ‘‘Gang of Four’’ book.
[6] James O. Coplien. Advanced C++: Programming Styles and Idioms (Addison-Wesley, 1991).
[7] Herb Sutter. ‘‘Sutter’s Mill: What’s in a Class?’’ C++ Report,March 1998.
[8] Herb Sutter. Exceptional C++ (Addison-Wesley, 1999), Items 31–34.
[9] John Lakos. Large-Scale C++ Software Design (Addison-Wesley, 1996).
[10] Effective C++, Item 18.
[11] Jack Reeves. ‘‘(B)leading Edge: How About Namespaces?,’’ C++ Report, April 1999.
[12] Jack Reeves. Personal communication.

About the Author

Scott Meyers is a recognized authority on C++; he provides consulting services to clients worldwide. He is the author of Effective C++, Second Edition (Addison-Wesley, 1998), More Effective C++(Addison-Wesley, 1996), and Effective C++ CD (Addison-Wesley, 1999). Scott received his Ph.D. in Computer Science from Brown University in 1993.

Technique: Inheritance and Polymorphism in C

While C is not an object-oriented language, objects, inheritance, and polymorphism can be implemented in C.

Table of Contents:

  1. Inheritance
  2. Polymorphism
  3. Further Reading

Inheritance

The core idea behind inheritance relies on structures. A base class or API can be represented by a struct definition that includes all relevant interfaces and data members.

For example, consider this example of an event base “class” from an Electron Vector article:

typedef enum {
  EVENT_BUTTON_PRESSED,
  EVENT_BUTTON_RELEASED,
  EVENT_KNOB_SET,
  EVENT_SWITCH_SET,
} event_type_t;

typedef struct {
  event_type_t type;
} event_t;

To simulate inheritance in C, we define additional event structures which include the base structure as the first element. The base structure must always be the first element.

typedef struct {
  event_t event;
  button_t button;
} event_button_pressed_t;

typedef struct {
  event_t event;
  button_t button;
} event_button_released_t;

The key detail is that a pointer to the derived structures (event_button_pressed_t) is also a pointer to the base structure (event_t). You can safely cast between the two structures, and any derived type can be used with APIs that require only the base type.

Just remember this foundational pattern, outlined in Dmitry Frank’s article:

/* base class */
typedef struct S_Base {
   /* ... */
} T_Base;
 
/* derived class */
typedef struct S_Derived {
   T_Base base;
   /* ... */
} T_Derived

Polymorphism

Polymorphism is the ability to substitute objects using matching interfaces for one-another at runtime. With C++, polymorphism is handled through virtual interfaces and the virtual table (vtable). The example code used below is adapted from Miro Samek’s excellent overview.

For example, we can define a base “class” for a shape, along with a vtable which contains function pointers. Derived classes populate the vtable pointers with functions that are specific to the derived class.

struct ShapeVTable; // Forward declare the vtable

// Base class definition
typedef struct {
	struct ShapeVTable const *vptr;
	int32_t x; // x-coord of shape's position
	int32_t y; // y-coord of shape's position
} Shape; 

struct ShapeVTable
{
	uint32_t (*area)(Shape const * const me);
	void (*draw)(Shape const * const me);
};

Non-virtual functions for the base class are implemented as a standard functions which take a base class pointer as the first parameter:

/* Shape's operations (Shape's interface)... */
void Shape_ctor(Shape * const me, int16_t x, int16_t y); 
void Shape_moveBy(Shape * const me, int16_t dx, int16_t dy);

One such required non-virtual function is a constructor, whose most important job is to supply the proper vtable for the class:

void Shape_ctor(Shape * const me, int16_t x, int16_t y) {
	static struct ShapeVtbl const vtbl = { /
		* vtbl of the Shape class */ 
		&Shape_area_,
		&Shape_draw_ 
	};
	me->vptr = &vtbl; /* "hook" the vptr to the vtbl */ 
	me->x = x;
	me->y = y;

Any base class or derived function can reference the virtual functions using the vtable:

static inline uint32_t Shape_area(Shape const * const me) 
{ 
	return (*me->vptr->area)(me);
}

To derive from the base class, we simply use the base class Shape as the first element of the derived structure:

typedef struct {
	Shape base; /* <== inherits Shape */
	/* attributes added by this subclass... */ 
	uint16_t width;
	uint16	_t height;
} Rectangle;

As a result, we can cast Rectangle so it can be used with Shapeinterfaces. We can also create Rectangle-specific interfaces.

Of course, our derived class needs its own constructor:

void Rectangle_ctor(Rectangle * const me, int16_t x, int16_t y,
	uint16_t width, uint16_t height)
{
	static struct ShapeVtbl const vtbl = { 
		/* vtbl of the Rectangle class */ 
		&Rectangle_area_,
		&Rectangle_draw_
	};
	
	// We always call the base class constructor
	Shape_ctor(&me->base, x, y);
	
	// But we must override the base vtable with Rectangle's
	me->base.vptr = &vtbl; 
	
	// Rectangle-specific values
	me->width = width;
	me->height = height;
}

Because we defined a new class, we need Rectangle-specific implementations of the virtual functions. These are added to a new vtable.

Whenever a Rectangle is constructed, the base class constructor is called first. Then, the vtable is set to Rectangle’s table. Any virtual function calls will dereference the proper Rectangle-specific implementation, even when using base class interfaces.

This is the basic overview of implementing polymorphism and virtual functions in C. For a detailed overview of implementing full polymorphism, see the links in Further Reading.

As a final note, if you are relying on these concepts in your C programs, you will likely benefit from switching to C++, if only for the fact that the compiler handles so much of this overhead for you. Additionally, you can rely on optimizations from the C++ compiler that won’t work in the C implementation.

Further Reading

For more on Inheritance, Polymorphism, and OOP in C: