Open-Closed Principle [OCP]

The Open-Closed Principle (OCP) is a software design principle that states:

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification.

The OCP can be restated in more familiar terms, such as in Patterns in the Machine: A Software Engineering Guide to Embedded Development:

The Open-Closed Principle (OCP) says that you want to design your software so that if you add new features or some new functionality, you only add new code; you do not rewrite existing code. A traditional example of the OCP is to introduce an abstract interface to decouple a “client” from the “server.”

We prefer an even more generalized form of the OCP: design your software components so that you can add new functionality or customize behavior without changing the source code of the component.

Note

The OCP is the “O” in the SOLID acronym.

Table of Contents:

  1. Evolution of the OCP
  2. Applying the OCP
  3. Benefits
  4. Balancing Points
  5. Examples
  6. Related Concepts
  7. References

Evolution of the OCP

Bertrand Meyer is the originator of the OCP, and he described it as follows:

A module will be said to be open if it is still available for extension. For example, it should be possible to add fields to the data structures it contains, or new elements to the set of functions it performs.

A module will be said to be closed if [it] is available for use by other modules. This assumes that the module has been given a well-defined, stable description (the interface in the sense of information hiding).

A class is closed, since it may be compiled, stored in a library, baselined, and used by client classes. But it is also open, since any new class may use it as parent, adding new features. When a descendant class is defined, there is no need to change the original or to disturb its clients.

What Meyer described is essentially what we would describe as “implementation inheritance”. Nowadays, the idea of implementation inheritance has fallen out of favor. As a result, some people look negatively at the OCP based on this interpretation of it.

Robert Martin took the idea of the OCP and refocused it on abstract interfaces and polymorphism.

In contrast to Meyer’s usage, this definition advocates inheritance from abstract base classes. Interface specifications can be reused through inheritance but implementation need not be. The existing interface is closed to modifications and new implementations must, at a minimum, implement that interface.

[…]

the implementations can be changed and multiple implementations could be created and polymorphically substituted for each other.

This restatement, focused on abstract interfaces that are closed to change, sounds quite like David Parnas’s information hiding principle. Martin himself notes that “on a module level, this idea is best applied in conjunction with information hiding”.

Martin also provided a helpful restatement of what it means to be “open” and “closed”:

Modules that conform to the open-closed principle have two primary attributes.

  1. They are “Open For Extension”.
    This means that the behavior of the module can be extended. That we can make the module behave in new and different ways as the requirements of the application change, or to meet the needs of new applications.
  2. They are “Closed for Modification”.
    The source code of such a module is inviolate. No one is allowed to make source code changes to it.

More recently, we (and others) generalize the idea even further. The authors of Patterns in the Machine: A Software Engineering Guide to Embedded Development provide the following restatement:

Adding new functionality should not be done by editing existing source code. That is the frame of mind you need to approach designing every module with, and you achieve it by putting together a loosely coupled design.

Our own restatement of the OCP advises: design your software components so that you can add new functionality or customize behavior without changing the source code of the component.

These two contemporary restatements of the OCP get to the heart of the matter: prevent changes from cascading throughout a system by a) putting up firewalls, b) making components “closed” to specific changes, and c) providing mechanisms to externally extend and control behaviors without modifying a component’s source code.

Applying the OCP

The OCP advises us that software components should be open for extension yet closed for modification. On the surface, this appears to be a conundrum: the typical way one would extend the behavior of a component is by changing it! If a component cannot be changed, how can it be extended?

This question can be attacked from multiple angles:

  1. Abstract Interfaces
  2. Providing Hooks for External Customization
  3. Implementation Decisions

Ensuring a component adheres to the OCP is an explicit design decision. Designers must choose what types of extensions and changes a component is closed against (or whether the OCP applies to the component at all). The component must provide mechanisms to support the target extensions. To provide actual benefit, the existence of these extension mechanisms is not enough – they must be documented so they can be used effectively to achieve the goals of the OCP. It is also wise to document which changes a component is and is not closed against for future maintainers.

Abstract Interfaces

The classical answer to resolving these two competing goals is abstraction. In general, this is a broad answer, since abstraction takes many forms. Martin’s OCP largely focuses on abstract interfaces combined with dynamic polymorphism (i.e., inheritance) or static polymorphism (e.g., templated parameters in C++ that expect a particular interface).

In C++, using the principles of object oriented design, it is possible to create abstractions that are fixed and yet represent an unbounded group of possible behaviors. The abstractions are abstract base classes, and the unbounded group of possible behaviors is represented by all the possible derivative classes. It is possible for a module to manipulate an abstraction. Such a module can be closed for modification since it depends upon an abstraction that is fixed. Yet the behavior of that module can be extended by creating new derivatives of the abstraction.

Abstract interfaces can be viewed as “specifications”, and these specifications can be reused through inheritance even though the implementation is not. The interface specifications will be closed to modifications. New implementations that satisfy the interface can be created, leaving the implementation of the interface open to extension. New behaviors and requirements are implemented by providing new implementations rather than by modifying existing implementations, since those are closed to modifications. In this sense, the OCP can be viewed simply as a restatement of the information hiding principle, and all of the associated advice will apply equally well here.
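As a minimal sketch of this idea (all names here are illustrative, not from any particular library), a client module can depend on a fixed abstract base class while the set of implementations remains open:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical example: a logging abstraction. The Logger interface is
// "closed": clients depend only on this fixed specification.
class Logger {
public:
    virtual ~Logger() = default;
    virtual void log(const std::string& msg) = 0;
};

// The set of implementations is "open": new behaviors are added by
// creating new derivatives, not by editing Logger or its clients.
class MemoryLogger : public Logger {
public:
    void log(const std::string& msg) override { entries.push_back(msg); }
    std::vector<std::string> entries;
};

// A client module that is closed for modification: it manipulates the
// abstraction and never needs to change when new Logger types appear.
void run_task(Logger& logger) {
    logger.log("task started");
    logger.log("task finished");
}
```

Adding a `UartLogger` or `FileLogger` later requires no changes to `Logger` or `run_task`; only new code is added.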

Further reading

For more on abstract interfaces, see:

Providing Hooks for External Customization

Abstract interfaces are a useful tool for achieving the goals of the OCP, but they are not the only tool. Here are techniques that enable your software components to be configured and extended by user applications:

  1. Configuration Parameters
  2. Template Method Pattern
  3. Callbacks
  4. Communicating Through Queues
  5. Table-Driven Behavioral Specifications

Configuration Parameters

One of the easiest ways to achieve the OCP is to provide configuration options for your software component. This way, user applications can control the behavior of your component without changing the component’s source code. You can supply configuration options in several ways:

  • Run-time configuration options, such as specifying desired values in a struct that is passed into a component through a constructor or initialization routine

  • C++ template parameters for classes and functions

  • Compile-time configuration using the preprocessor and #ifndef, with values supplied by a build system, a dedicated configuration system like KConfig, or a configuration header:

    #ifndef SCREEN_WIDTH_PX
    #error You must provide a definition for SCREEN_WIDTH_PX. 
    #endif
    
    #ifndef SCREEN_HEIGHT_PX
    #error You must provide a definition for SCREEN_HEIGHT_PX. 
    #endif
    
    #ifndef PIXELS_PER_BYTE
    #define PIXELS_PER_BYTE 8 
    #endif
    
    #ifndef SCREEN_BUFFER_SIZE_BYTES
    // Calculated: (width in px * height in px) / pixels per byte
    #define SCREEN_BUFFER_SIZE_BYTES ((SCREEN_WIDTH_PX * SCREEN_HEIGHT_PX) / PIXELS_PER_BYTE)
    #endif 
    

The more parameters that can be controlled from outside of your software component, the better. This reduces the likelihood that you will need to change the component in the future.
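The run-time variant can be sketched as follows. This is a hypothetical display component (all names are illustrative) that accepts a configuration struct through its constructor, mirroring the compile-time calculation shown above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical run-time configuration for a display driver. Applications
// fill in this struct and pass it to the component's initialization,
// controlling behavior without editing the component's source code.
struct DisplayConfig {
    uint16_t width_px = 128;
    uint16_t height_px = 64;
    uint8_t pixels_per_byte = 8;
};

class Display {
public:
    explicit Display(const DisplayConfig& cfg) : cfg_(cfg) {}

    // The buffer size derives from the supplied configuration, mirroring
    // the preprocessor-based SCREEN_BUFFER_SIZE_BYTES calculation above.
    size_t buffer_size_bytes() const {
        return (static_cast<size_t>(cfg_.width_px) * cfg_.height_px) /
               cfg_.pixels_per_byte;
    }

private:
    DisplayConfig cfg_;
};
```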

Template Method Pattern

The Template Method pattern can provide users with the ability to customize actions taken by your component. You can designate template methods that comprise one or more optional or required steps. These steps can be supplied or overridden by user programs, enabling user applications to change aspects of your component’s behavior without modifying the component source code.

Template methods are useful in the following scenarios:

  • Decoupling a component from platform-specific details. The application can specify those based on its target platform. This enables the component to work with any platform that can implement the required step(s).
  • Decoupling one component from another. Rather than hard-coding the dependency, a template method can be supplied, allowing the application to connect the two components together from the outside.
  • Allowing users to configure, extend, or override a component’s behavior to meet their application’s requirements.
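A minimal sketch of the pattern (the sensor driver and its steps are hypothetical): the component fixes the overall algorithm, while individual steps are supplied or overridden by the application.

```cpp
#include <cassert>

// Hypothetical component using the Template Method pattern. The skeleton
// algorithm in read_sample() is closed to modification; the individual
// steps are open to extension by the application.
class SensorDriver {
public:
    virtual ~SensorDriver() = default;

    // The template method: a fixed sequence of optional and required steps.
    int read_sample() {
        power_on();            // optional hook (default is a no-op)
        int raw = read_raw();  // required, platform-specific step
        return convert(raw);   // optional hook with a default conversion
    }

protected:
    virtual void power_on() {}                  // optional step
    virtual int read_raw() = 0;                 // required step
    virtual int convert(int raw) { return raw; }  // optional step
};

// The application extends behavior by supplying steps; the driver's
// source code is never edited.
class FakeSensor : public SensorDriver {
protected:
    int read_raw() override { return 21; }
    int convert(int raw) override { return raw * 2; }
};
```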

Callbacks

Similar to the Template Method pattern, callback functions provide user applications with customization points that can extend a component’s behavior without modifying its source code. Callback functions are typically invoked when a particular action occurs (e.g., transfer complete callback, error callback). User applications can implement callback handlers in order to connect components together from the outside or to take an application-specific action in response to the event.

The Observer Pattern can be used in the same way as a callback function. This pattern is useful when there are multiple subscribers who may be interested in an event.
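A sketch of the callback approach (using std::function for brevity; an embedded component might instead use a plain function pointer plus context argument). The transfer component and its names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical transfer component that exposes a completion callback.
// Applications register handlers to extend behavior from the outside,
// without modifying this component's source code.
class Transfer {
public:
    using Callback = std::function<void(size_t bytes)>;

    // Supporting multiple registrations gives observer-style behavior:
    // any number of subscribers can react to the same event.
    void on_complete(Callback cb) { callbacks_.push_back(std::move(cb)); }

    void run(size_t bytes) {
        // ... perform the transfer ...
        for (auto& cb : callbacks_) {
            cb(bytes);  // notify all registered handlers
        }
    }

private:
    std::vector<Callback> callbacks_;
};
```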

Communicating Through Queues

Rather than communicating through interfaces, components can instead communicate through queues. The data format (closed to modification) passed through the queue becomes the primary interface. Producers and consumers of information can be swapped out without the need to modify the component(s) on the other end.

Communication through queues can also be combined with other mechanisms, such as template methods or callbacks. This combination can prevent your component from becoming coupled to a specific queue implementation. It also gives your users the flexibility to decide whether or not a queue should be used at all.
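As an illustrative sketch (using std::queue as a stand-in for an RTOS message queue), the producer and consumer below share only a message format. Either side can be replaced without touching the other:

```cpp
#include <cassert>
#include <queue>

// Hypothetical message format: this is the "closed" interface between
// the two sides. Neither component knows anything else about the other.
struct SensorReading {
    int channel;
    int value;
};

using ReadingQueue = std::queue<SensorReading>;

// Producer: knows nothing about who consumes the readings.
void produce(ReadingQueue& q) {
    q.push({0, 42});
    q.push({1, 7});
}

// Consumer: knows nothing about the producer, only the message format.
int consume_sum(ReadingQueue& q) {
    int sum = 0;
    while (!q.empty()) {
        sum += q.front().value;
        q.pop();
    }
    return sum;
}
```

Swapping in a different producer (say, a new sensor driver) requires no changes to the consumer, because the data format is the only coupling point.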

Table-Driven Behavioral Specifications

Some aspects of a component that are likely to change over time can be defined in a table, and the application is made responsible for supplying the table implementation. The component itself can then become agnostic to the contents, simply understanding how to generically access the information present in the table for its purposes. Changes can be handled through the application’s definition of the table, leaving the source code of the component itself unchanged. Tables are also useful for specifying application-specific configuration details.
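A minimal sketch of the idea (the pin table and its fields are hypothetical): the component iterates generically over a table whose contents are owned by the application, so adding a pin means editing the application's table rather than the component.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical table entry format. The component understands this shape
// generically; it does not know or care what the table contains.
struct PinConfig {
    int pin;
    bool output;
};

// Component code: closed to modification. It only knows how to walk the
// table; behavior changes come from changing the table contents.
size_t count_outputs(const PinConfig* table, size_t count) {
    size_t outputs = 0;
    for (size_t i = 0; i < count; ++i) {
        if (table[i].output) {
            ++outputs;
        }
    }
    return outputs;
}

// Application-supplied table: the part that is expected to change as the
// board design evolves. The component source stays untouched.
constexpr PinConfig kBoardPins[] = {
    {13, true},
    {7, false},
    {2, true},
};
```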

Further reading

For more information, see:

Encapsulation

The OCP also depends on proper encapsulation. Within the OCP context, you should be particularly concerned about properly encapsulating implementation details so that they cannot be accessed or modified directly – components should only interact through the published interfaces. This means applying the following two design policies in your system:

  1. Eliminate Global Variables
  2. Make Member Variables Private

Eliminate Global Variables

Martin states this more strongly: “No global variables – ever”. He points out that the use of global variables makes the OCP impossible to achieve:

The argument against global variables is like the argument against public member variables. No module that depends upon a global variable can be closed against any other module that might write to that variable. Any module that uses the variable in a way that the other modules don’t expect will break those other modules. It is too risky to have many modules be subject to the whim of one badly behaved one.

Make Member Variables Private

Similar to the advice to eliminate global variables, all class member variables and “file global” member variables in a component should be made private so they cannot be accessed from outside of the component.

In OOD, we expect that the methods of a class are not closed to changes in the member variables of that class. However we do expect that any other class, including subclasses, are closed against changes to those variables. We have a name for this expectation, we call it: encapsulation.
– Robert Martin

Benefits

The OCP is an essential tool in designing software for change. Intentionally designing your software components to be easily extended and closed to change improves the ability of your systems to respond to change.

In the strictest sense, you are designing components that a) never change and b) are implemented against an abstract interface. Changes in requirements will then mean that you are going to a) create a new extension to existing behavior, or b) create a new component to implement the new requirements for the existing interface(s). In either case, you will not change old code to get your desired behavior.

Components that adhere to the OCP act as a “firewall” against change. Because you are adding new code instead of modifying existing code (and working through abstract interfaces), changes are prevented from cascading throughout a system. They are isolated to the creation of a new component and its integration into the system.

Quote

All systems change during their life cycles. This must be borne in mind when developing systems expected to last longer than the first version.
– Ivar Jacobson

Balancing Points

The OCP is best viewed as a goal or a guiding light. Software components cannot be 100% closed against all extensions or changes. Some changes will affect “closed” components by their nature. For example:

  • The methods of a class or component are not closed to changes in the private variables of that component, but external components interfacing with that component are closed to changes in the private variables.
  • Components are not closed to interface changes. They will cascade into all components that use the interface.
  • Components are not closed to changes resulting from discovering an implementation error, design error, or error in understanding.

Examples

The following examples show how different techniques can be used (and combined) to achieve the OCP in production code.

  • The AX5043 driver uses a template method to allow applications to configure the driver’s SPI interactions without modifying the driver source code. Several configuration parameters are also provided through “instance structures”, allowing applications to configure the radio for the intended use case. Multiple radio instances can be supported with the use of multiple instance structures.
  • embeddedartistry/libc provides common implementations for standard library functions and headers, while deferring architecture-specific implementation details to architecture-specific headers. New processor architectures can be supported by creating a new architecture-specific tree, defining the types appropriately for that platform, and supplying additional function implementations as needed. The base headers and function implementations require no modifications.
  • embeddedartistry/libmemory provides a single, common memory allocation interface (i.e., malloc and friends) with multiple implementations to the interface that can be selected by users. New allocation schemes are added by creating new implementations. This library also uses the template method pattern to enable user applications to externally specify locking behavior for thread safety without modifying the implementation source.
  • The embeddedartistry/printf library provides a template method that applications can use to configure the output for the printf family of functions (putchar_()), as well as multiple compile-time configuration options that can tune the library for a specific application and platform.
  • The Embedded Virtual Machine framework heavily uses the OCP by applying many of the techniques discussed above: abstract interfaces with multiple implementations, template methods, configuration options, and callbacks.
  • The Patterns in the Machine repository was designed with the OCP in mind. Using the OCP is also discussed in the corresponding book.

References

  • Information Hiding

  • Wikipedia: Open-Closed Principle

  • Wikipedia: SOLID

    The open–closed principle: “Software entities … should be open for extension, but closed for modification.”

  • Object Oriented Software Construction by Bertrand Meyer

    A class is closed, since it may be compiled, stored in a library, baselined, and used by client classes. But it is also open, since any new class may use it as parent, adding new features. When a descendant class is defined, there is no need to change the original or to disturb its clients.

  • Design Principles and Design Patterns by Robert Martin

    A module should be open for extension but closed for modification.

    Of all the principles of object oriented design, this is the most important. It originated from the work of Bertrand Meyer. It means simply this: We should write our modules so that they can be extended, without requiring them to be modified. In other words, we want to be able to change what the modules do, without changing the source code of the modules.

    This may sound contradictory, but there are several techniques for achieving the OCP on a large scale. All of these techniques are based upon abstraction. Abstraction is the key to the OCP. Several of these techniques are described below.

    The techniques Martin mentions to achieve the OCP are dynamic polymorphism (i.e., inheritance from an abstract interface) and static polymorphism (i.e., polymorphism achieved through templates and generics).

    Architectural Goals of the OCP. By using these techniques to conform to the OCP, we can create modules that are extensible, without being changed. This means that, with a little forethought, we can add new features to existing code, without changing the existing code and by only adding new code. This is an ideal that can be difficult to achieve, but you will see it achieved, several times, in the case studies later on in this book.

    Even if the OCP cannot be fully achieved, even partial OCP compliance can make dramatic improvements in the structure of an application. It is always better if changes do not propagate into existing code that already works. If you don’t have to change working code, you aren’t likely to break it.

  • “The Open-Closed Principle” by Robert Martin

    As Ivar Jacobson said: “All systems change during their life cycles. This must be borne in mind when developing systems expected to last longer than the first version.” How can we create designs that are stable in the face of change and that will last longer than the first version? Bertrand Meyer gave us guidance as long ago as 1988 when he coined the now famous open-closed principle. To paraphrase him:

    SOFTWARE ENTITIES (CLASSES, MODULES, FUNCTIONS, ETC.) SHOULD BE OPEN FOR EXTENSION, BUT CLOSED FOR MODIFICATION.

    When a single change to a program results in a cascade of changes to dependent modules, that program exhibits the undesirable attributes that we have come to associate with “bad” design. The program becomes fragile, rigid, unpredictable and unreusable. The open-closed principle attacks this in a very straightforward way. It says that you should design modules that never change. When requirements change, you extend the behavior of such modules by adding new code, not by changing old code that already works.

    Modules that conform to the open-closed principle have two primary attributes.

    1. They are “Open For Extension”.
      This means that the behavior of the module can be extended. That we can make the module behave in new and different ways as the requirements of the application change, or to meet the needs of new applications.
    2. They are “Closed for Modification”.
      The source code of such a module is inviolate. No one is allowed to make source code changes to it.

    It would seem that these two attributes are at odds with each other. The normal way to extend the behavior of a module is to make changes to that module. A module that cannot be changed is normally thought to have a fixed behavior. How can these two opposing attributes be resolved?

    Abstraction is the Key.

    In C++, using the principles of object oriented design, it is possible to create abstractions that are fixed and yet represent an unbounded group of possible behaviors. The abstractions are abstract base classes, and the unbounded group of possible behaviors is represented by all the possible derivative classes. It is possible for a module to manipulate an abstraction. Such a module can be closed for modification since it depends upon an abstraction that is fixed. Yet the behavior of that module can be extended by creating new derivatives of the abstraction.

    Since programs that conform to the open-closed principle are changed by adding new code, rather than by changing existing code, they do not experience the cascade of changes exhibited by non-conforming programs.

    It should be clear that no significant program can be 100% closed. […] In general, no matter how “closed” a module is, there will always be some kind of change against which it is not closed.

    Since closure cannot be complete, it must be strategic. That is, the designer must choose the kinds of changes against which to close his design. This takes a certain amount of prescience derived from experience. The experienced designer knows the users and the industry well enough to judge the probability of different kinds of changes. He then makes sure that the open-closed principle is invoked for the most probable changes.

    In OOD, we expect that the methods of a class are not closed to changes in the member variables of that class. However we do expect that any other class, including subclasses, are closed against changes to those variables. We have a name for this expectation, we call it: encapsulation.

    Make all Member Variables Private.

    No Global Variables — Ever.

    The argument against global variables is similar to the argument against public member variables. No module that depends upon a global variable can be closed against any other module that might write to that variable. Any module that uses the variable in a way that the other modules don’t expect, will break those other modules. It is too risky to have many modules be subject to the whim of one badly behaved one.

    On the other hand, in cases where a global variable has very few dependents, or cannot be used in an inconsistent way, they do little harm. The designer must assess how much closure is sacrificed to a global and determine if the convenience offered by the global is worth the cost.

    Again, there are issues of style that come into play. The alternatives to using globals are usually very inexpensive. In those cases it is bad style to use a technique that risks even a tiny amount of closure over one that does not carry such a risk. However, there are cases where the convenience of a global is significant. The global variables cout and cin are common examples. In such cases, if the open-closed principle is not violated, then the convenience may be worth the style violation.

    Conformance to this principle is what yields the greatest benefits claimed for object oriented technology; i.e. reusability and maintainability. Yet conformance to this principle is not achieved simply by using an object oriented programming language. Rather, it requires a dedication on the part of the designer to apply abstraction to those parts of the program that the designer feels are going to be subject to change.

  • Patterns in the Machine : A Software Engineering Guide to Embedded Development by John Taylor and Wayne Taylor

    The Open-Closed Principle (OCP) says that you want to design your software so that if you add new features or some new functionality, you only add new code; you do not rewrite existing code. A traditional example of the OCP is to introduce an abstract interface to decouple a “client” from the “server.”

    PIM’s interpretation of the OCP, then, is quite literally: Adding new functionality should not be done by editing existing source code. That is the frame of mind you need to approach designing every module with, and you achieve it by putting together a loosely coupled design.

    Strategic (OCP)—Think long term when designing and implementing modules.

    It is also important to recognize that a module cannot be 100% closed against all possible extensions. Furthermore, not every module needs to be OCP friendly. It is the responsibility of the designer, architect, and developer to choose what type of extensions a module is closed against. As with most things in life, good choices come with experience, and a lot of experience comes from bad choices.

  • The Open-Closed Principle. and what hides behind it | HackerNoon.com by Vadim Samokhin

Leaky Abstraction

An abstraction is “leaky” when it exposes details about the underlying implementation to the users that should ideally be hidden away.

The term was coined by Joel Spolsky in The Law of Leaky Abstractions, where he states:

All non-trivial abstractions, to some degree, are leaky.

Abstractions fail. Sometimes a little, sometimes a lot. There’s leakage. Things go wrong. It happens all over the place when you have abstractions.

The existence of leaky abstractions means that abstractions do not always simplify our work in the intended ways. While we can often operate with the abstractions, we are not free from understanding the implementation beyond the abstraction. Eventually, a problem will appear, and we will need to look behind the curtain and learn the underlying details.

As Spolsky points out, one implication of this is that abstractions save us time working, but they don’t save us time learning. One tradeoff here is that we can build more complex systems more quickly, but debugging problems that leak from the abstractions can be a lengthy process.

Examples of Leaky Abstractions

Spolsky cites a number of examples in his article. For embedded developers, the most relevant is the idea that memory is abstracted as a big flat address space, but this abstraction often leaks. As Spolsky points out, iterating over a large two-dimensional array can have different performance characteristics depending on how you iterate, due to page faults, cache behavior, and other underlying processor details.
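The effect can be shown in a small sketch. Both functions below compute the same result over the same array, yet on typical cached hardware the row-major version usually runs noticeably faster, because C++ arrays are laid out row-major in memory and contiguous accesses match the cache line layout. The flat-memory abstraction leaks through the timing:

```cpp
#include <cassert>
#include <cstddef>

constexpr size_t kRows = 256;
constexpr size_t kCols = 256;

// Row-major traversal: inner loop walks contiguous memory, which is
// friendly to cache lines and hardware prefetchers.
long sum_row_major(const int (&a)[kRows][kCols]) {
    long sum = 0;
    for (size_t r = 0; r < kRows; ++r) {
        for (size_t c = 0; c < kCols; ++c) {
            sum += a[r][c];  // contiguous accesses
        }
    }
    return sum;
}

// Column-major traversal: inner loop strides kCols ints at a time,
// which can evict cache lines before they are fully used.
long sum_col_major(const int (&a)[kRows][kCols]) {
    long sum = 0;
    for (size_t c = 0; c < kCols; ++c) {
        for (size_t r = 0; r < kRows; ++r) {
            sum += a[r][c];  // strided accesses
        }
    }
    return sum;
}
```

The results are identical; only the wall-clock behavior differs, and the difference depends on cache sizes and array dimensions that the abstraction was supposed to hide.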

Many embedded systems abstractions designed around hardware are leaky because they encode information in the abstraction that is not actually generally applicable across different hardware components. When a component is swapped, the designers may realize that their abstractions leaked some hardware details that do not apply to the new component, resulting in changes to the interface and/or application.

References

  • Wikipedia: Leaky Abstraction

    A leaky abstraction is an abstraction that exposes details and limitations of its underlying implementation to its users that should ideally be hidden away. Leaky abstractions are considered problematic, since the purpose of abstractions is to manage complexity by concealing unnecessary details from the user.

  • Towards a New Model of Abstraction in Software Engineering by Gregor Kiczales
  • The Law of Leaky Abstractions by Joel Spolsky

    That is, approximately, the magic of TCP. It is what computer scientists like to call an abstraction: a simplification of something much more complicated that is going on under the covers. As it turns out, a lot of computer programming consists of building abstractions. What is a string library? It’s a way to pretend that computers can manipulate strings just as easily as they can manipulate numbers. What is a file system? It’s a way to pretend that a hard drive isn’t really a bunch of spinning magnetic platters that can store bits at certain locations, but rather a hierarchical system of folders-within-folders containing individual files that in turn consist of one or more strings of bytes.

    Back to TCP. Earlier for the sake of simplicity I told a little fib, and some of you have steam coming out of your ears by now because this fib is driving you crazy. I said that TCP guarantees that your message will arrive. It doesn’t, actually. If your pet snake has chewed through the network cable leading to your computer, and no IP packets can get through, then TCP can’t do anything about it and your message doesn’t arrive. If you were curt with the system administrators in your company and they punished you by plugging you into an overloaded hub, only some of your IP packets will get through, and TCP will work, but everything will be really slow.

    This is what I call a leaky abstraction. TCP attempts to provide a complete abstraction of an underlying unreliable network, but sometimes, the network leaks through the abstraction and you feel the things that the abstraction can’t quite protect you from. This is but one example of what I’ve dubbed the Law of Leaky Abstractions:

    All non-trivial abstractions, to some degree, are leaky.

    Abstractions fail. Sometimes a little, sometimes a lot. There’s leakage. Things go wrong. It happens all over the place when you have abstractions.

    One reason the law of leaky abstractions is problematic is that it means that abstractions do not really simplify our lives as much as they were meant to. When I’m training someone to be a C++ programmer, it would be nice if I never had to teach them about char*’s and pointer arithmetic. It would be nice if I could go straight to STL strings. But one day they’ll write the code “foo” + “bar”, and truly bizarre things will happen, and then I’ll have to stop and teach them all about char*’s anyway. Or one day they’ll be trying to call a Windows API function that is documented as having an OUT LPTSTR argument and they won’t be able to understand how to call it until they learn about char*’s, and pointers, and Unicode, and wchar_t’s, and the TCHAR header files, and all that stuff that leaks up.

    The law of leaky abstractions means that whenever somebody comes up with a wizzy new code-generation tool that is supposed to make us all ever-so-efficient, you hear a lot of people saying “learn how to do it manually first, then use the wizzy tool to save time.” Code generation tools which pretend to abstract out something, like all abstractions, leak, and the only way to deal with the leaks competently is to learn about how the abstractions work and what they are abstracting. So the abstractions save us time working, but they don’t save us time learning.

    And all this means that paradoxically, even as we have higher and higher level programming tools with better and better abstractions, becoming a proficient programmer is getting harder and harder.

    Ten years ago, we might have imagined that new programming paradigms would have made programming easier by now. Indeed, the abstractions we’ve created over the years do allow us to deal with new orders of complexity in software development that we didn’t have to deal with ten or fifteen years ago, like GUI programming and network programming. And while these great tools, like modern OO forms-based languages, let us get a lot of work done incredibly quickly, suddenly one day we need to figure out a problem where the abstraction leaked, and it takes 2 weeks.

Information Hiding

Information hiding is a software design principle, where certain aspects of a program or module (the “secrets”) are inaccessible to clients. The primary goal is to prevent extensive modification to clients whenever the implementation details of a module or program are changed. This is done by hiding aspects of an implementation that might change behind a stable interface that protects clients from the implementation details. Users of that interface (whether it is a module, class, or function) will perform operations purely through its interface. This way, if the implementation changes, the clients do not have to change.

Information hiding serves as a criterion that can be used to decompose a system into modules. The principle is also useful for reducing coupling within a system.
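As a concrete illustration, consider an opaque-pointer module in C (a minimal sketch; the `Counter` module and its functions are hypothetical, not from the text above). Clients program purely against the interface; the struct layout is the hidden secret and can change freely:

```c
/* counter.h -- the stable interface; clients see only this. */
#include <stddef.h>

typedef struct Counter Counter;      /* opaque type: layout is a secret */

Counter *counter_create(void);
void     counter_increment(Counter *c);
long     counter_value(const Counter *c);
void     counter_destroy(Counter *c);

/* counter.c -- the hidden implementation; free to change. */
#include <stdlib.h>

struct Counter {
    long value;   /* could become a saturating or 128-bit count later */
};

Counter *counter_create(void)            { return calloc(1, sizeof(Counter)); }
void     counter_increment(Counter *c)   { c->value++; }
long     counter_value(const Counter *c) { return c->value; }
void     counter_destroy(Counter *c)     { free(c); }
```

Because clients hold only a `Counter *`, replacing the representation does not require recompiling or changing any caller.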

Applying Information Hiding

The key challenge when applying information hiding is determining what information should be hidden and what should be exposed. Parnas suggests that the heuristic we should use is hiding those details that are “likely to change”. This way our changes have only a local effect, since we have hidden the details to be changed behind a firewall of some kind (e.g., an abstract interface).

Using this heuristic, information hiding can be thought of as a three-step process:

  1. Identify all of the pieces of a design that are likely to change or other design details that you might want to hide
    • We call these our “secrets”, and we want to hide these details from external entities.
  2. Isolate each secret into its own module, class, or function
    • This is done so that changes to the secret are isolated and don’t affect the rest of the program
    • Parnas: “All data structures that reveal the presence or number of certain components should be included in separate information hiding modules with abstract interfaces.”
  3. Design intermediate interfaces that are insensitive to changes in the underlying secrets.
    • That is, make sure that your secrets aren’t leaked or revealed by your interfaces.
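The three steps above can be sketched in C. Here the secret is a hypothetical internal temperature representation (raw ADC counts, with an assumed conversion factor of 125 m°C per count); it is isolated in one module behind an interface that speaks only millidegrees, so the raw format never leaks:

```c
/* Step 1: the "secret" is the storage format, which is likely to change.
   Step 2: it is isolated in this module as a hidden static variable.
   Step 3: the interface is insensitive to the secret -- it exposes only
           millidegrees Celsius, never raw counts. */
#include <stdint.h>

static int32_t raw_adc_counts;   /* hidden representation */

/* Assumed conversion for this sketch: 1 count == 125 m-deg C. */
void temp_store_raw(int32_t counts)
{
    raw_adc_counts = counts;
}

int32_t temp_read_millideg(void)
{
    return raw_adc_counts * 125;
}
```

If the sensor or its scaling changes, only this module's internals change; every caller of `temp_read_millideg()` is untouched.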

As Parnas reminds us:

The interface must be general but the contents should not. Specialization is necessary for economy and flexibility.

For advice on practically implementing information hiding in C, please see this entry.

What to Hide?

Here are some ideas that are likely to change:

  • Hardware dependencies
  • Physical storage layout
  • Data formats
  • Data conversions
  • Data access/traversal
  • Algorithms
  • Implementation dependencies
  • Object creation (e.g., factories)
  • Transport layer mechanisms
  • Non-standard language features
  • Library routines
  • Complex data structures
  • Complex logic
  • Global variables (hide behind access routines if they are actually needed)
  • Data constraints, such as array sizes and loop limits
  • Business logic

Benefits

Information hiding helps us improve the ability to change our system while minimizing the impact on other parts of the system.

Since the implementation details are hidden behind a common interface, changes to the implementation do not require changes to the rest of the program (as long as the implementation sufficiently satisfies the specified interface). Changes are isolated to a single location in the ideal case.

For example, a common sensor API may be provided, and any sensor driver that satisfies the interface requirements can be used within the system. This enables a team to swap between components (and even processors or board designs) as necessary.
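One way to sketch such a common sensor API in C is a table of function pointers (the driver and field names here are hypothetical assumptions, not from the text). Any driver that fills in the table can be swapped in without touching client code:

```c
/* A minimal common sensor interface. Each board or processor supplies
   its own driver table; clients depend only on the interface. */
#include <stdint.h>

typedef struct {
    int (*init)(void);                /* returns 0 on success */
    int (*read)(int32_t *sample_out); /* returns 0 on success */
} SensorDriver;

/* One hypothetical implementation (e.g., a stubbed on-chip sensor). */
static int stub_init(void) { return 0; }
static int stub_read(int32_t *out) { *out = 42; return 0; }

static const SensorDriver stub_sensor = { stub_init, stub_read };

/* Client code is written once, against the interface. */
int sample_sensor(const SensorDriver *drv, int32_t *out)
{
    if (drv->init() != 0) {
        return -1;
    }
    return drv->read(out);
}
```

Swapping boards means supplying a different `SensorDriver` table; `sample_sensor()` and everything above it stay closed to modification.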

Areas of Concern

Critiques

Revisiting Information Hiding – Reflections on Classical and Nonclassical Modularity provides some critiques on the classical idea of information hiding, mainly:

  • Information hiding as described corresponds to classical logic, but we know that humans aren’t very good at (don’t default to) reasoning via classical logic; instead, we tend to rely on inductive reasoning

  • Some stakeholders of a system will eventually care about aspects of the system that are supposed to be “hidden”

    When information is hidden behind an abstraction barrier, there are potential stakeholders (or concerns), who are interested in that hidden information.

    If a stakeholder wants to reason about “nonfunctional” aspects of a system, such as time or space complexity or power consumption, he probably needs to reason about implementation details hidden behind abstraction barriers

    For example, different implementations have different time or space behavior of the operations, different rounding errors, different optimizations that the compiler will apply, or different power consumption. To some stakeholders, such concerns may well be important; while some require higher performance, others require higher precision.

  • For large systems, basic assumptions of information hiding (e.g., monotonicity and composability) may not seem to hold:

    Composing two programs which are each separately correct with respect to, say, lock-based concurrency or transactions, are in general no longer correct when composed. More importantly, the non-composability can in general not be deduced from the interfaces of these components

  • Information hiding and separation of concerns can be contradictory:

    For instance, in the canonical AOP example of updating a display when a figure element changes, a figure element module hides less information behind its interface when the display updating logic is separated from the figure element module. In that sense, and contrary to the common notion that information hiding and separation of concerns go hand in hand, information hiding and separation of concerns can actually be contradictory.

    There are many concerns that, when separated, need to expose implementation detail in such a way that information hiding is impaired. Developers have to decide what information to hide and what to separate. This is a fundamental problem of classical modularity

  • Information hiding is limited by “the tyranny of the dominant decomposition”

    What can be hidden behind an interface depends on the chosen decomposition, but there is no “best” decomposition; rather, from each point of view (such as the points of views of the different stakeholders) a different decomposition (and hence information hiding policy) would be most appropriate. What one stakeholder would hide as an implementation detail behind an interface is of primary importance to another stakeholder, who would hence choose a different decomposition that exposes that information.

  • Information hiding may still hinder modifiability

    Even if a software system is successfully modularized, and the information needs of all stakeholders and concerns are reflected in the interfaces of components, information hiding might still hinder software evolution. This might be surprising at first, because information hiding is supposed to facilitate software evolution by hiding design decisions behind interfaces, so that they can be changed at will. The problem is that the original developers have to anticipate change and to modularize the software accordingly.

    Unfortunately, it is not clear how to decide up-front which design decisions need to be hidden and which need to be exposed.

    One could argue that successful modularization just needs better planning to better assess what is likely to change, but we believe that this is an implausible assumption because large-scale software systems are assembled from many independently developed and independently evolving parts; hence, a big global “plan” is infeasible and unanticipated changes are unavoidable in long-living projects.

    If the design decision is hidden behind the interface, software evolution might bring a new stakeholder (or concern) into the system which needs to access that hidden information. So, to support the information need of this stakeholder (or concern), the design decision should not have been hidden in the first place.

General Response

In our view, these critiques are valid for some definitions or applications of information hiding – particularly, those which seem to be more far reaching and absolute than our own uses.

Our primary focus is using information hiding as a way to design for change. We do not view information hiding as strictly meaning that we never need to examine the implementation details of a module. Instead, we think that implementation details should be hidden from other software modules as much as possible. In fact, in our own model, we still dive into secrets often: our focus is on abstracting the common behaviors that other modules require for interaction, while still allowing custom APIs for non-common behaviors.

After all, we still need to initialize and configure many modules and properties for our system’s use cases, often in a non-generic and not-easily-abstractable way. For example, every IMU component has a different set of configuration options and setup procedures; however, our algorithms that operate on IMU samples can do so in a generic way that can be made common across devices. We do not have to be so absolute in the application of information hiding.
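The IMU example might be sketched in C like this (the sample layout and the part-specific function names are hypothetical): device-specific configuration stays custom, while the algorithm consumes samples through a common type:

```c
/* A common sample type that any IMU driver can produce. */
#include <stdint.h>

typedef struct {
    int32_t ax, ay, az;   /* acceleration, arbitrary common units */
} ImuSample;

/* Generic algorithm: works for samples from any IMU device. */
int64_t imu_magnitude_sq(const ImuSample *s)
{
    return (int64_t)s->ax * s->ax
         + (int64_t)s->ay * s->ay
         + (int64_t)s->az * s->az;
}

/* Device-specific, non-generic configuration lives elsewhere, e.g.:
     bmi270_configure(...);   -- hypothetical per-part setup API
     lsm6dsx_configure(...);  -- a different part, a different API
   Only the sample-processing path is made common. */
```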

Certainly it is true that not every implementation satisfying an interface is composable into a suitable system that meets our requirements; we must implement for our system specifically. We aren’t striving to implement general, idealized abstract components that are suitable in all cases.

Engineering is a balancing act: we cannot focus purely on any one desirable quality. Sometimes we may decide to trade off information hiding to improve separation of concerns. But it might also be true that we can still hide things from the rest of the system behind a single interface, allowing the deeper implementation to be tightly coupled.

Finally, we must remember that we are human. We will not create perfect artifacts, no matter how hard we try. Changes will come, some of our assumptions will be invalidated, and we will have to rework the system. This is inevitable, and it is no reason not to apply the techniques that we have at hand to improve our chances of success.

References

  • The original source for information hiding is David Parnas’s paper Designing Software for Ease of Extension and Contraction. Also recommended is his paper On the Criteria to Be Used in Decomposing Systems into Modules

  • The Secret History of Information Hiding by David Parnas

    Nonetheless, the paper did explain that it was information distribution that made systems “dirty” by establishing almost invisible connections between supposedly independent modules

    After some thought, it became clear to me that information distribution, and how to avoid it, had to be a big part of that course. I decided to do this by means of a project with limited information distribution and demonstrate the benefits of a “clean” design

    This program is still used as an example of the principle. Only once has anyone noticed that it contains a flaw caused by a failure to hide an important assumption. Every module in the system was written with the knowledge that the data comprised strings of strings. This led to a very inefficient sorting algorithm because comparing two strings, an operation that would be repeated many times, is relatively slow. Considerable speed-up could be obtained if the words were sorted once and replaced by a set of integers with the property that if two words are alphabetically ordered, the integers representing them will have the same order. Sorting strings of integers can be done much more quickly than sorting strings of strings. The module interfaces described in [9] do not allow this simple improvement to be confined to one module.

    My mistake illustrates how easy it is to distribute information unnecessarily. This is a very common error when people attempt to use the information-hiding principle. While the basic idea is to hide information that is likely to change, one can often profit by hiding other information as well because it allows the re-use of algorithms or the use of more efficient algorithms

    Several software design “experts” have suggested that one should reflect existing business structures and file structures in the structure of the software. In my experience, this speeds up the software development process (by making decisions quickly) but leads to software that is a burden on its owners should they try to update their data structures or change their organisation. Reflecting changeable facts in software structure is a violation of the information-hiding principle.

    In determining requirements it is very important to know about the environment but it is rarely the right “move” to reflect that environment in the program structure.

  • Missing in Action: Information Hiding by Steve McConnell

    In the 20th Anniversary edition of The Mythical Man-Month, Fred Brooks concludes that his criticism of information hiding was one of the few ways in which the first edition of his book was wrong. “Parnas was right, and I was wrong about information hiding,” he proclaims (Brooks 1995). Barry Boehm reported in 1987 that information hiding was a powerful technique for eliminating rework, and he pointed out that it was particularly effective during software evolution (“Improving Software Productivity,” IEEE Computer, September 1987). As incremental, evolutionary development styles become more popular, the value of information hiding can only increase.

    To use information hiding, begin your design by listing the design secrets that you want to hide. As the example suggested, the most common kind of secret is a design decision that you think might change. Separate each design secret by assigning it to its own class or subroutine or other design unit. Then isolate–encapsulate–each design secret so that if it does change, the change doesn’t affect the rest of the program.

    Aside from providing support for structured and object-oriented design, information hiding has unique heuristic power, a unique ability to inspire effective design solutions.

    Object design provides the heuristic power of modeling the world in objects, but object thinking wouldn’t help you avoid declaring the ID as an int instead of an IDTYPE in the example. The object designer would ask, “Should an ID be treated as an object?” Depending on his project’s coding standards, a “Yes” answer might mean that he has to create interface and implementation source-code files for the ID class; write a constructor, destructor, copy operator, and assignment operator; document it all; have it all reviewed; and place it under configuration control. Unless the designer is exceptionally motivated, he will decide, “No, it isn’t worth creating a whole class just for an ID. I’ll just use ints.”

    Note what just happened. A useful design alternative, that of simply hiding the ID’s data type, was not even considered. If, instead, the designer had asked, “What about the ID should be hidden?” he might well have decided to hide its type behind a simple type declaration that substitutes IDTYPE for int. The difference between object design and information hiding in this example is more subtle than a clash of explicit rules and regulations. Object design would approve of this design decision as much as information hiding would. Rather, the difference is one of heuristics–thinking about information hiding inspires and promotes design decisions that thinking about objects does not.

  • C2 Wiki: Information Hiding

  • Wikipedia: Information Hiding

  • Revisiting Information Hiding – Reflections on Classical and Nonclassical Modularity

    Information hiding is to distinguish the concrete implementation of a software component and its more abstract interface, so that details of the implementation are hidden behind the interface. This supports modular reasoning and independent evolution of the “hidden parts” of a component. If developers have carefully chosen to hide those parts ‘most likely to change’, most changes have only local effects: The interfaces act as a kind of firewall that prevents the propagation of change.

    A key question in information hiding is which information to hide and which information to expose. Parnas suggested the heuristic to hide what is ‘likely to change’.

    the programming research community, in which information hiding is nowadays such an undisputed dogma of modularity that Fred Brooks even felt that he had to apologize to Parnas for questioning it.

    Both information hiding and abstraction imply some notion of substitutability: A module’s implementation can be replaced by a different implementation adhering to the same interface, and since the implementation was hidden to other components in the system in the first place, these other components should not be disturbed by the change.

    The distinction between an interface and implementations of that interface, which is at the core of information hiding and abstraction, is related to logic. The interface corresponds to a set of axioms, and the implementation of the interface corresponds to a model of the axioms. Substitutability is reflected by the fact that the same theorems hold for all models of the axioms (by soundness of the logic), hence we cannot distinguish two different models within the theory. The heuristic of hiding what is most likely to change is reflected by the design of axiom systems (say, the axioms of a group in abstract algebra) in such a way that there are many interesting models of the axioms.

    As in the case of information hiding and abstraction, compositionality implies a strong notion of substitutability: If a subprogram is substituted by a different subprogram with the same meaning, the meaning of the whole program will still be the same. In other words, we can successfully reason more abstractly on an expression by thinking of its meaning rather than of the expression itself. When reasoning about the program, we can identify expressions having the same meaning. This process is typically called equational reasoning. Since the actual expression is hidden behind its meaning, compositionality can also be seen as a specific form of information hiding by considering the meaning of a program to be its interface.
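McConnell’s IDTYPE example above can be sketched in C (a hypothetical illustration, not his original code). Hiding the ID’s representation behind a simple typedef means a later change to the representation touches one line, not every declaration in the program:

```c
/* The "secret" is the ID's representation: today a 32-bit integer,
   perhaps a wider integer or a struct later. Clients declare IDTYPE
   everywhere and never name the underlying type directly. */
#include <stdint.h>

typedef int32_t IDTYPE;   /* change this one line to change every ID */

/* Hypothetical ID allocator using the hidden type. */
IDTYPE next_id(void)
{
    static IDTYPE counter = 0;
    return ++counter;
}
```

This is the heuristic McConnell describes: asking “what about the ID should be hidden?” produces a one-line typedef, a design alternative that pure object thinking never surfaces.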

Change Control Board [CCB]

In software development, a Change Control Board (CCB) or Software Change Control Board (SCCB) is a committee of Subject Matter Experts (SMEs) and Technical Chiefs who decide whether proposed changes to a software project should be implemented.

Work Breakdown Structure [WBS]

“A work-breakdown structure (WBS) in project management and systems engineering, is a deliverable-oriented breakdown of a project into smaller components. A work breakdown structure is a key project deliverable that organizes the team’s work into manageable sections.”

The Open Group Architecture Framework [TOGAF]

“A framework for enterprise architecture that provides an approach for designing, planning, implementing, and governing an enterprise information technology architecture.”