How Non-Member Functions Improve Encapsulation

Today's reading recommendation is on the subject of encapsulation.

I found this article very informative. I picked up new tips on evaluating encapsulation of a design and good metrics for determining whether adding an interface in a particular spot will increase or decrease the overall convenience for my end users. I also learned some new strategies for increasing encapsulation.

If you read nothing else, I recommend reviewing Meyers's algorithm for where to place a function relative to a class in my highlights below.

I'll leave Mr. Meyers to summarize his overall point:

Still, the lesson of this article should be clear. Conventional wisdom notwithstanding, use of non-friend non-member functions improves a class's encapsulation, and a preference for such functions over member functions makes it easier to design and develop classes with interfaces that are complete and minimal (or close to minimal). Arguments about the naturalness of the resulting calling syntax are generally unfounded, and adoption of a predilection for non-friend non-member functions leads to packaging strategies for a class's interface that minimize client compilation dependencies while maximizing the number of convenience functions available to them.

Read "How Non-Member Functions Improve Encapsulation" here.

My Highlights

Scott Meyers's algorithm for where to declare a function related to a class:

if (f needs to be virtual)
   make f a member function of C;
else if (f is operator>> or
         operator<<)
   {
   make f a non-member function;
   if (f needs access to non-public
       members of C)
      make f a friend of C;
   }
else if (f needs type conversions
         on its left-most argument)
   {
   make f a non-member function;
   if (f needs access to non-public
       members of C)
      make f a friend of C;
   }
else if (f can be implemented via C's
         public interface)
   make f a non-member function;
else
   make f a member function of C;

Meyers then launches into some analysis on how to judge whether your change will improve or reduce encapsulation. I found these ideas useful to ponder over and helpful when evaluating a new design.

Encapsulation is a means, not an end. There's nothing inherently desirable about encapsulation. Encapsulation is useful only because it yields other things in our software that we care about. In particular, it yields flexibility and robustness.

This is the real problem with poor encapsulation: it precludes future implementation changes. Unencapsulated software is inflexible, and as a result, it's not very robust. When the world changes, the software is unable to gracefully change with it. (Remember that we're talking here about what is practical, not what is possible. It's clearly possible to change struct Point, but if enough code is dependent on it in its current form, it's not practical.)

Encapsulated software is more flexible than unencapsulated software, and, all other things being equal, that flexibility makes it the superior design choice.

This leads to a reasonable approach to evaluating the relative encapsulations of two implementations: if changing one might lead to more broken code than would the corresponding change to the other, the former is less encapsulated than the latter. This definition is consistent with our intuition that if making a change is likely to break a lot of code, we're less likely to make that change than we would be to make a different change that affected less code. There is a direct relationship between encapsulation (how much code might be broken if something changes) and practical flexibility (the likelihood that we'll make a particular change).

An easy way to measure how much code might be broken is to count the functions that might be affected. That is, if changing one implementation leads to more potentially broken functions than does changing another implementation, the first implementation is less encapsulated than the second.

We've now seen that a reasonable way to gauge the amount of encapsulation in a class is to count the number of functions that might be broken if the class's implementation changes. That being the case, it becomes clear that a class with n member functions is more encapsulated than a class with n+1 member functions. And that observation is what justifies my argument for preferring non-member non-friend functions to member functions: if a function f could be implemented as a member function or as a non-friend non-member function, making it a member would decrease encapsulation, while making it a non-member wouldn't. Since functionality is not at issue here (the functionality of f is available to class clients regardless of where f is located), we naturally prefer the more encapsulated design.

It's important that we're trying to choose between member functions and non-friend non-member functions. Just like member functions, friend functions may be broken when a class's implementation changes, so the choice between member functions and friend functions is properly made on behavioral grounds. Furthermore, we now see that the common claim that "friend functions violate encapsulation" is not quite true. Friends don't violate encapsulation, they just decrease it — in exactly the same manner as member functions.

Adding a static member function to a class when its functionality could be implemented as a non-friend non-member decreases encapsulation by exactly the same amount as does adding a non-static member function. One implication of this is that it's generally a bad idea to move a free function into a class as a static member just to show that it's related to the class.

This encapsulation idea can backfire if you want to make a factory function and implement templated code to call factory functions generically:

For factory functions and similar functions which can be given uniform names, this means that maximal class encapsulation and maximal template utility are at odds. In such cases, you have to decide which is more important and cater to that. However, for static member functions with class-specific names, the template issue fails to arise, and encapsulation can again assume precedence.

If you reflect a bit and are honest with yourself, you'll admit that you have this alleged inconsistency with all the nontrivial classes you use, because no class has every function desired by every client. Every client adds at least a few convenience functions of their own, and these functions are always non-members. C++ programers are used to this, and they think nothing of it. Some calls use member syntax, and some use non-member syntax. People just look up which syntax is appropriate for the functions they want to call, then they call them. Life goes on. It goes on especially in the STL portion of the Standard C++ library, where some algorithms are member functions (e.g., size), some are non-member functions (e.g., unique), and some are both (e.g., find). Nobody blinks. Not even you.

Herb Sutter has explained that the "interface" to a class (roughly speaking, the functionality provided by the class) includes the non-member functions related to the class, and he's shown that the name lookup rules of C++ support this meaning of "interface." This is wonderful news for my "non-friend non-members are better than members" argument, because it means that the decision to make a class-related function a non-friend non-member instead of a member need not even change the interface to that class! Moreover, the liberation of the functions in a class's interface from the confines of the class definition leads to some wonderful packaging flexibility that would otherwise be unavailable. In particular, it means that the interface to a class may be split across multiple header files.

On Interfaces and Packaging:

Suppose the author of the Wombat class discovered that Wombat clients often need a number of convenience functions related to eating, sleeping, and breeding. Such convenience functions are by definition not strictly necessary. The same functionality could be obtained via other (albeit more cumbersome) member function calls. As a result, and in accord with my advice in this article, each convenience function should be a non-friend non-member. But suppose the clients of the convenience functions for eating rarely needed the convenience functions for sleeping or breeding. And suppose the clients of the sleeping and breeding convenience functions also rarely needed the convenience functions for eating and, respectively, breeding and sleeping.

Rather than putting all Wombat-related functions into a single header file, a preferable design would be to partition the Wombat interface across four separate headers, one for core Wombat functionality (primarily the class definition), and one each for convenience functions related to eating, sleeping, and breeding. Clients then include only the headers they need. The resulting software is not only clearer, it also contains fewer gratuitous compilation dependencies. This multiple-header approach was adopted for the standard library. The contents of namespace std are spread across 50 different headers. Clients #include the headers declaring the parts of the library they care about, and they ignore everything else.

When it comes to encapsulation, sometimes less is more.

In addition, this approach is extensible. When the declarations for the functions making up a class's interface are spread across multiple header files, it becomes natural for clients creating application-specific sets of convenience functions to cluster those functions into a new header file and to #include that file as appropriate. In other words, to treat the application-specific convenience functions just like they treat the convenience functions provided by the author of the class. This is as it should be. After all, they're all just convenience functions.

Minimalism and Encapsulation

In Effective C++, I argued for class interfaces that are complete and minimal. Such interfaces allow class clients to do anything they might reasonably want to do, but classes contain no more member functions than are absolutely necessary. Adding functions beyond the minimum necessary to let clients get their jobs done, I wrote, decreases the class's comprehensibility and maintainability, plus it increases compilation times for clients. Jack Reeves has written that the addition of member functions beyond those truly required violates the open/closed principle, yields fat class interfaces, and ultimately leads to software rot. That's a fair number of arguments for minimizing the number of member functions in a class, but now we have an additional reason: failure to do so decreases a class's encapsulation.

Of course, a minimal class interface is not necessarily the best interface. I remarked in Effective C++ that adding functions beyond those truly necessary may be justifiable if it significantly improves the performance of the class, makes the class easier to use, or prevents likely client errors. Based on his work with various string-like classes, Jack Reeves has observed that some functions just don't "feel" right when made non-members, even if they could be non-friend non-members. The "best" interface for a class can be found only by balancing many competing concerns, of which the degree of encapsulation is but one.

Still, the lesson of this article should be clear. Conventional wisdom notwithstanding, use of non-friend non-member functions improves a class's encapsulation, and a preference for such functions over member functions makes it easier to design and develop classes with interfaces that are complete and minimal (or close to minimal). Arguments about the naturalness of the resulting calling syntax are generally unfounded, and adoption of a predilection for non-friend non-member functions leads to packaging strategies for a class's interface that minimize client compilation dependencies while maximizing the number of convenience functions available to them.