Memory Allocation

15 October 2025 by Phillip Johnston

Table of Contents:

  • Visualizing Memory Allocation
  • Allocation Schemes
  • Common System Policies
  • Problems Arising from Dynamic Memory Allocation
    • Runtime Problems
    • Structural Problems
  • Implementations
  • References

Visualizing Memory Allocation

Here are some useful articles that explain how memory allocation typically works, including helpful diagrams and other visualizations.

  • Visual overview of a custom malloc() implementation
  • Memory Allocation, an article that includes a visual overview of how dynamic memory allocation works

Allocation Schemes

  • First-fit Free List
  • Buddy Allocation
  • Slab allocation
  • Fixed-size block allocation (Memory pools)
  • Intrusive containers (lists, maps, etc.)
    • Individual elements put into the container store all the metadata …

To access this content, you must purchase a Membership - check out the different options here. If you're a member, log in.

Differentiating the Dependency Inversion Principle and Dependency Injection

10 February 2025 by Phillip Johnston • Last updated 20 February 2025

It’s quite common to think that the Dependency Inversion Principle and Dependency Injection are the same thing, or relatedly to wonder whether they are the same. For the most part, this is fine, and doesn’t really need to be corrected. However, they are not quite the same thing, and it can be useful to understand what distinguishes these two things. Fundamentally, you can distinguish these two in the following way: The Dependency Inversion Principle (DIP) is a software design guideline. This guideline recommends that classes should only have direct …

System Response Time Guidelines

5 August 2024 by Phillip Johnston

General response time targets:

  • A response time of ≤ 100ms is about the limit for having the user feel the system is reacting instantly
  • 250ms is around the threshold where a system starts to feel sluggish
  • 1 second is about the limit for the user’s flow of thought to stay uninterrupted, even though the user will notice the delay.
  • No special feedback is necessary when (100ms < delay ≤ 1 second), but the sluggishness of the system will be noticed. Beyond one second, start providing a busy or progress indicator.
  • 10 seconds is about the limit for keeping the user’s attention focused …
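
As a rough illustration, the targets above can be turned into a small helper that decides how a delay is likely to feel and whether feedback is called for. This is only a sketch: the function names and the labels ("instant", "noticeable", and so on) are hypothetical, and the thresholds are taken directly from the list above.

```python
def perceived_speed(delay_s: float) -> str:
    """Describe how a delay is likely to feel, using the targets above."""
    if delay_s <= 0.100:
        return "instant"
    if delay_s < 0.250:
        return "noticeable"
    if delay_s <= 1.0:
        return "sluggish"
    return "flow interrupted"


def needs_busy_indicator(delay_s: float) -> bool:
    """No special feedback up to 1 second; beyond that, show an indicator."""
    return delay_s > 1.0


for delay in (0.05, 0.3, 2.5):
    print(delay, perceived_speed(delay), needs_busy_indicator(delay))
```

A helper like this is most useful when an operation's expected duration can be estimated up front, so the UI can decide before starting whether a busy or progress indicator is worth showing.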

Software Design Principles

6 March 2024 by Phillip Johnston • Last updated 5 January 2026

This page serves as a top-level entry collecting the various software design principles that we have written about on the website.

Architectural Principles

The following principles apply to the internal architecture of a software system. They influence how code is organized, how responsibilities are distributed, and how the various components in a system interact.

  • Information Hiding
  • Coupling
  • Cohesion
  • Encapsulation
  • Separation of Concerns
  • Single Responsibility Principle
  • Open-Closed Principle [OCP]
  • Dependency Inversion Principle
  • Interface Segregation Principle
  • Avoid Mixed Metaphors
  • Conway’s Law

Usability Principles

The Software Usability Principles page collects design principles that …

Software Usability Principles

6 March 2024 by Phillip Johnston • Last updated 8 August 2025

The following is a collection of useful software design principles for designing usable software. While we are not usability experts, we find that reviewing these principles does help guide our development in a positive direction. Below, we present six high-level principles, with supporting principles/qualities grouped as sub-bullets. This is not a canonical list or organization scheme, but it is useful for our purposes.

  • Usefulness
    • The system should provide necessary utilities and address the real needs of users
    • The system should not impede efficient use by a skilled, experienced user …

Programming as Theory Building

Programming as Theory Building is a classic paper by Peter Naur. After reading the paper, I can see why it is so often recommended and has kept its staying power all these decades. The situation that Naur describes is just as accurate today.

Abstract

Here’s the intro to the PDF version:

In the article, which follows, note that the quality of the designing programmer’s work is related to the quality of the match between his theory of the problem and his theory of the solution. Note that the quality of a later programmer’s work is related to the match between his theories and the previous programmer’s theories.

Using Naur’s ideas, the designer’s job is not to pass along “the design” but to pass along “the theories” driving the design. The latter goal is more useful and more appropriate. It also highlights that knowledge of the theory is tacit in the owning, and so passing along the theory requires passing along both explicit and tacit knowledge.

Reading Club Discussion

This paper was selected for our members’ reading club. Follow this link to discuss the paper.

Summary

Naur’s hypothesis is that programming, which involves both the design and the actual coding of a software system, is more akin to an embodied skill like playing music or painting or writing. You can’t just blindly follow rules to produce excellent works. You need to learn, ideally with a teacher who can give you feedback, and you must apply a model/concept/theory that you develop in your head to your work.

Applied to software development, Naur points out that it takes more than the source code and documentation to be able to work on a system. You need to develop and maintain a cohesive theory of how the system works. This is developed among the team during the initial development and ongoing maintenance. This theory is rarely captured sufficiently in the documentation or the source code, and thus it cannot be easily created without individuals who maintain the theory in their minds. If you’re a new person and want to effectively work on the system, you must be able to interact with other developers who have the theory of operations in their minds. They will be able to answer your questions and to provide feedback on why (or why not) particular approaches are suitable for the system.

What matters most, in Naur’s view, is the theory of the system – not the code or the documentation. The system theory is what allows the programmer to explain what the system does and effectively make changes. The challenge is that this theory largely lives in the heads of the programmers. It is more than just “rules” to follow. It is embodied knowledge.

By extension, it matters that each member of the team has a shared theory of the system. The longer that a shared conceptual view can be maintained, the better the system can be maintained. If people have different theories, the changes they make will inevitably clash, until the system decays into a mess. Maintaining system quality and integrity is thus a problem of maintaining a cohesive mental model of the system and its operations. One way we can better maintain cohesion is by developing metaphors for the system (or its parts) that can be easily shared and understood by all who are working on the system.

Key Lessons

There are a couple of critical points that Naur makes at the end of the paper, and we should keep these in mind:

  • The death of a program happens when the team that has the system’s theory in their heads dissolves
  • The life of a program is extended by successfully passing on the theory to a new generation of programmers
  • Reviving a program after the team has dissolved is essentially impossible – you probably can’t recreate the original theory in your own mind, and a rewrite would be more effective as you’re actually developing a cohesive mental model of your own that can then be passed on. 
  • If you’re documenting a system from the development/maintenance perspective, what matters most is that you document the operational theory, connections between things, and metaphors for different parts of the system.

My Thoughts

I’m sure that many of us have felt this about code, even if we have not conceived of it in this way before. All you have to do is look at someone else’s code: what makes perfect sense to them can be completely unapproachable to you. Or maybe you submit a patch to an open source project, getting it rejected for some obscure reason you couldn’t predict just by looking at the project. Or join a new team and struggle to get up-to-speed with the system. Or get handed a legacy project where the original developer has long since moved on, and try to make sense out of the mess.

I feel the reality of this model deeply with my forays into the Medtronic PB560 source code. I had thought I would have made much more progress on the review than I have. But progress is slow: I cannot understand why the system works the way it does, how different pieces interact, and what the overall theory is just by looking at the source code. If I could run the code, it would be a different story – I could develop my own theory through experimentation, even if it wasn’t a match to the original developers’ theory.

I also think Naur’s idea explains the prevalence of “Not invented here” syndrome: it’s “easier” to rewrite something and develop your own conceptual model than it is to pick up someone else’s. Of course, being justifiable does not mean that it is the rational choice. I work with plenty of libraries for which I do not need to have a conceptual model of the implementation, because I am not maintaining it.

Highlights

You can see the annotated text here.

suggests that programming properly should be regarded as an activity by which the programmers form or achieve a certain kind of insight, a theory, of the matters at hand. This suggestion is in contrast to what appears to be a more common notion, that programming should be regarded as a production of a program and certain other texts.

I like the inclusion of design along with implementation, which tends to be the exclusive focus of “programming”:

shall use the word programming to denote the whole activity of design and implementation of programmed solutions.

Examples given by Naur: access to the team makes all the difference, even if you have full source code and documentation.

In the present context the significant issue is the importance of the personal advice from group A in the matters that concerned how to implement the extensions M to the language. During the design phase group B made suggestions for the manner in which the extensions should be accommodated and submitted them to group A for review. In several major cases it turned out that the solutions suggested by group B were found by group A to make no use of the facilities that were not only inherent in the structure of the existing compiler but were discussed at length in its documentation, and to be based instead on additions to that structure in the form of patches that effectively destroyed its power and simplicity. The members of group A were able to spot these cases instantly and could propose simple and effective solutions, framed entirely within the existing structure.

[…]

This is an example of how the full program text and additional documentation is insufficient in conveying to even the highly motivated group B the deeper insight into the design, that theory which is immediately present to the members of group A.

[…]

In the years following these events the compiler developed by group B was taken over by other programmers of the same organization, without guidance from group A. Information obtained by a member of group A about the compiler resulting from the further modification of it after about 10 years made it clear that at that later stage the original powerful structure was still visible, but made entirely ineffective by amorphous additions of many different kinds. Thus, again, the program text and its documentation has proved insufficient as a carrier of some of the most important design ideas.

The conclusion seems inescapable that at least with certain kinds of large programs, the continued adaption, modification, and correction of errors in them, is essentially dependent on a certain kind of knowledge possessed by a group of programmers who are closely and continuously connected with them.

Very briefly, a person who has or possesses a theory in this sense knows how to do certain things and in addition can support the actual doing with explanations, justifications, and answers to queries, about the activity of concern.

In intelligent behaviour the person displays, not any particular knowledge of facts, but the ability to do certain things, such as to make and appreciate jokes, to talk grammatically, or to fish. More particularly, the intelligent performance is characterized in part by the person’s doing them well, according to certain criteria, but further displays the person’s ability to apply the criteria so as to detect and correct lapses, to learn from the examples of others, and so forth. It may be noted that this notion of intelligence does not rely on any notion that the intelligent behaviour depends on the person’s following or adhering to rules, prescriptions, or methods.

In terms of Ryle’s notion of theory, what has to be built by the programmer is a theory of how certain affairs of the world will be handled by, or supported by, a computer program. On the Theory Building View of programming the theory built by the programmers has primacy over such other products as program texts, user documentation, and additional documentation such as specifications.

the programmer’s knowledge transcends that given in documentation in at least three essential areas:

  1. The programmer having the theory of the program can explain how the solution relates to the affairs of the world that it helps to handle. […]
  2. The programmer having the theory of the program can explain why each part of the program is what it is, in other words is able to support the actual program text with a justification of some sort. […]
  3. The programmer having the theory of the program is able to respond constructively to any demand for a modification of the program so as to support the affairs of the world in a new manner. Designing how a modification is best incorporated into an established program depends on the perception of the similarity of the new demand with the operational facilities already built into the program.

One thing seems to be agreed by everyone, that software will be modified. It is invariably the case that a program, once in operation, will be felt to be only part of the answer to the problems at hand. Also the very use of the program itself will inspire ideas for further useful services that the program ought to provide. Hence the need for ways to handle modifications.

The question of program modifications is closely tied to that of programming costs. In the face of a need for a changed manner of operation of the program, one hopes to achieve a saving of costs by making modifications of an existing program text, rather than by writing an entirely new program.

Naur provides some counterpoints to my ideas around design for change. But I have to note that “low cost” (his term) is different from the goal of reducing the cost. You can have reduced costs that are still high.

The expectation that program modifications at low cost ought to be possible is one that calls for closer analysis. First it should be noted that such an expectation cannot be supported by analogy with modifications of other complicated man–made constructions. Where modifications are occasionally put into action, for example in the case of buildings, they are well known to be expensive and in fact complete demolition of the existing building followed by new construction is often found to be preferable economically. Second, the expectation of the possibility of low cost program modifications conceivably finds support in the fact that a program is a text held in a medium allowing for easy editing. For this support to be valid it must clearly be assumed that the dominating cost is one of text manipulation. This would agree with a notion of programming as text production. On the Theory Building View this whole argument is false. This view gives no support to an expectation that program modifications at low cost are generally possible.

A further closely related issue is that of program flexibility. In including flexibility in a program we build into the program certain operational facilities that are not immediately demanded, but which are likely to turn out to be useful. Thus a flexible program is able to handle certain classes of changes of external circumstances without being modified.

On the point below, I totally disagree that this is the goal, and I think it is something of a strawman argument. There’s a difference between “easier to change” and “flexible enough to handle all foreseeable possible needs”. And the costs here are completely imagined according to the scenario. For example, is it really so expensive for me to abstract away the underlying SPI controller with a read/write API that needs to be supplied from the outside? Absolutely not.

It could be that we view “change” differently enough to make this conclusion obvious to him, but specious to me.
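
To make the SPI example concrete, here is a minimal sketch of a driver that receives its read/write functions from the outside instead of talking to a specific SPI controller. Everything in it is hypothetical and chosen for illustration: the `TemperatureSensor` class, the command byte, and the two-byte big-endian response are assumptions, not a real device protocol.

```python
from typing import Callable

# Hypothetical transport signatures: the caller supplies these functions.
SpiWrite = Callable[[bytes], None]
SpiRead = Callable[[int], bytes]


class TemperatureSensor:
    """Hypothetical driver that depends only on injected read/write functions."""

    READ_TEMP_CMD = b"\x01"  # made-up command byte, for illustration only

    def __init__(self, write: SpiWrite, read: SpiRead) -> None:
        self._write = write
        self._read = read

    def read_raw(self) -> int:
        """Send the command, then read a two-byte big-endian result."""
        self._write(self.READ_TEMP_CMD)
        return int.from_bytes(self._read(2), "big")


# A fake transport, such as one used in a host-side test.
sent = []
sensor = TemperatureSensor(write=sent.append, read=lambda n: b"\x01\x90"[:n])
```

The driver never names a particular controller, so moving to a different SPI peripheral (or to a test double, as here) only means supplying different functions.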

It is often stated that programs should be designed to include a lot of flexibility, so as to be readily adaptable to changing circumstances. Such advice may be reasonable as far as flexibility that can be easily achieved is concerned. However, flexibility can in general only be achieved at a substantial cost. Each item of it has to be designed, including what circumstances it has to cover and by what kind of parameters it should be controlled. Then it has to be implemented, tested, and described. This cost is incurred in achieving a program feature whose usefulness depends entirely on future events. It must be obvious that built–in program flexibility is no answer to the general demand for adapting programs to the changing circumstances of the world.

Now, on this point below, we totally agree. And perhaps this ties better into DfC points anyway: that when we make changes blindly or following the “most obvious route” without care to maintaining a theory of operations or design, then we get a tangled mess that is increasingly difficult to change.

What stands out to me is Naur’s point: “For a program to retain its quality it is mandatory that each modification is firmly grounded in the theory of it.” How easy it is to see when modifications clash with the theory!

On the basis of the Theory Building View the decay of a program text as a result of modifications made by programmers without a proper grasp of the underlying theory becomes understandable. As a matter of fact, if viewed merely as a change of the program text and of the external behaviour of the execution, a given desired modification may usually be realized in many different ways, all correct. At the same time, if viewed in relation to the theory of the program these ways may look very different, some of them perhaps conforming to that theory or extending it in a natural way, while others may be wholly inconsistent with that theory, perhaps having the character of unintegrated patches on the main part of the program. This difference of character of various changes is one that can only make sense to the programmer who possesses the theory of the program. At the same time the character of changes made in a program text is vital to the longer term viability of the program. For a program to retain its quality it is mandatory that each modification is firmly grounded in the theory of it. Indeed, the very notion of qualities such as simplicity and good structure can only be understood in terms of the theory of the program, since they characterize the actual program text in relation to such program texts that might have been written to achieve the same execution behaviour, but which exist only as possibilities in the programmer’s understanding.

The building of the program is the same as the building of the theory of it by and in the team of programmers. During the program life a programmer team possessing its theory remains in active control of the program, and in particular retains control over all modifications. The death of a program happens when the programmer team possessing its theory is dissolved. A dead program may continue to be used for execution in a computer and to produce useful results. The actual state of death becomes visible when demands for modifications of the program cannot be intelligently answered. Revival of a program is the rebuilding of its theory by a new programmer team.

The extended life of a program according to these notions depends on the taking over by new generations of programmers of the theory of the program. For a new programmer to come to possess an existing theory of a program it is insufficient that he or she has the opportunity to become familiar with the program text and other documentation. What is required is that the new programmer has the opportunity to work in close contact with the programmers who already possess the theory, so as to be able to become familiar with the place of the program in the wider context of the relevant real world situations and so as to acquire the knowledge of how the program works and how unusual program reactions and program modifications are handled within the program theory.

This problem of education of new programmers in an existing theory of a program is quite similar to that of the educational problem of other activities where the knowledge of how to do certain things dominates over the knowledge that certain things are the case, such as writing and playing a music instrument. The most important educational activity is the student’s doing the relevant things under suitable supervision and guidance.

Based on my attempts to review the PB560 code base, I think the point below is true. It’s hard for me to keep up progress precisely because I get stuck.

A very important consequence of the Theory Building View is that program revival, that is reestablishing the theory of a program merely from the documentation, is strictly impossible.

The complete rewrite is the only hope…

In preference to program revival, the Theory Building View suggests, the existing program text should be discarded and the new–formed programmer team should be given the opportunity to solve the given problem afresh.

Firmware Update Support

Firmware update support is an essential capability for contemporary embedded devices, regardless of whether or not they are connected to the internet on a regular basis. Firmware updates are used to add new capabilities after launch, correct errors, and address security vulnerabilities. Firmware update support also ensures that devices can remain useful for a longer period of time, as the development team can respond to changes in the operating environment and customer expectations.

Supporting firmware updates for your system requires a number of supporting device-side and infrastructure capabilities. Update reliability is also significantly aided by adopting supporting processes in your organization.

Firmware updates are a good example of Software Engineering as applied to embedded systems. We have to carefully design the update mechanism, account for failure modes, and make tradeoffs based on our system’s design goals and constraints.

Table of Contents:

  1. Device Capabilities
  2. Infrastructure Capabilities
  3. Supporting Processes
  4. Accounting for the Possibility of Failure
  5. Sub-topics and Variations
  6. Case Studies of Update-related Problems and Vulnerabilities
  7. Related Blog Posts
  8. References

Device Capabilities

Required

  • Device software is split into multiple images.
    • At a minimum, you will need to split the device into a Bootloader and Application.
    • Software may be further refined depending on reliability and update schemes, such as into a Loader, a distinct Updater, or a Fallback Image
  • Fail-safe support in case of an update failure or bad update
  • An integrity check, ensuring that the provided binary has not been corrupted during transfer
  • The device can report its software version
  • The update mechanism is resilient against power and network loss during the update process. This is necessary to avoid bricking devices!
    • Ideally, updates will be “atomic”: either the whole update is applied, or no update occurs at all.
  • A method for receiving firmware updates (whether via USB connection, SD card, or Over-the-Air)
  • Code signing support, which is used to verify both provenance and integrity of an update
  • Support for rolling back to a previous version on command (many implementations only allow you to increase the version)
  • Version data storage, schemas, and communication protocols to support data migrations in response to an update process
  • Ability to specify pre- and post-update actions (e.g., a script) in addition to the firmware update
    • This can be extremely useful for supporting actions like data migrations or file removals, which might need to be executed after an update has completed.
    • This can be useful for implementing post-update sanity checks to make sure that the update processes completed successfully. If the checks do not pass, roll back to the previous version.

Infrastructure Capabilities

Required

  • Minimally, produce a checksum that can be used to verify the binary was transferred without error. Ideally, code signing will be used instead, as you can also verify that the update is coming from an authorized source.
  • Cohort binning of devices enables you to control which devices receive specific firmware updates.
    • This is useful for deploying beta builds to an interested population of beta testers.
    • This is also a common way of implementing staged rollouts.
  • Staged rollouts of firmware updates provide a safer update mechanism than a “deploy to everyone” approach. You start with a small population of devices to make sure that the update succeeds and does not introduce significant new issues. If everything looks good, you continue to roll out the update to increasingly large segments of your population.
  • Ability to roll back firmware to a previous version in the event of a bad update
  • Check-in and heartbeat messages are useful for determining:
    • Whether or not an update was successful (a device will check in with a new firmware version)
    • The distribution of versions throughout the fleet
      • Often, teams are surprised to realize that there’s a distribution of versions, even when a new OTA update is released. Also, you will find that some devices never update.
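
One way to sketch cohort binning and staged rollouts on the server side is to hash each device into a stable bucket and compare it against the current rollout percentage. This is an illustrative sketch, not a description of any particular update service: the function names, the use of SHA-256 for bucketing, and the `beta_cohort` parameter are all assumptions.

```python
import hashlib


def rollout_bucket(device_id: str, release: str) -> int:
    """Map a device to a stable bucket in [0, 100) for a given release."""
    digest = hashlib.sha256(f"{release}:{device_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100


def should_offer_update(device_id: str, release: str, percent: int,
                        beta_cohort: frozenset = frozenset()) -> bool:
    """Offer the update to the beta cohort first, then to a growing percentage."""
    if device_id in beta_cohort:
        return True
    return rollout_bucket(device_id, release) < percent
```

Because the bucket is derived from the device ID and release name, a device that was offered the update at 10% is still offered it at 50%, and raising `percent` only ever adds devices.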

Supporting Processes

  • Exclusive use of the customer-facing firmware update mechanism to ensure its reliability
    • Many teams leave OTA updates to the end of the project, for example, which is far too late in the process to ensure reliability. A better approach is implementing OTA updates first, and then requiring all development and internal testing to use OTA updates rather than JTAG or USB. This way, the update mechanisms receive significant mileage, and the kinks are worked out before the product is released to customers.
  • Significant testing of the update process, especially with the use of fault injection to ensure that fallbacks and fail-safes work as intended
  • Version Data Storage, Protocols, and Schemas
    • Data migrations are a common challenge that you will need to deal with when updating devices. For example, you might update the “device settings” layout, change a communication protocol, or update an sqlite database schema.
    • Without versioning these items, you cannot safely perform a migration as part of the firmware update process.
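
As a minimal sketch of why the version field matters, the following migrates a stored "device settings" record one version at a time. The settings layout, field names, and version numbers are hypothetical, invented purely for illustration.

```python
# Hypothetical settings layouts; each migration upgrades exactly one version.
def _v1_to_v2(settings: dict) -> dict:
    updated = dict(settings)
    updated["brightness_pct"] = updated.pop("brightness")  # renamed in v2
    return updated


def _v2_to_v3(settings: dict) -> dict:
    updated = dict(settings)
    updated.setdefault("timezone", "UTC")  # new field in v3
    return updated


MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}
CURRENT_VERSION = 3


def migrate(settings: dict) -> dict:
    """Apply migrations in order until the settings reach CURRENT_VERSION."""
    version = settings.get("version")
    if version is None:
        raise ValueError("unversioned settings cannot be migrated safely")
    if version > CURRENT_VERSION:
        raise ValueError("settings are newer than this firmware understands")
    while version < CURRENT_VERSION:
        if version not in MIGRATIONS:
            raise ValueError(f"no migration defined from version {version}")
        settings = MIGRATIONS[version](settings)
        version += 1
        settings["version"] = version
    return settings
```

Chaining single-step migrations means a device that skipped several releases can still be brought up to date, and an unversioned record is rejected rather than guessed at.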

Accounting for the Possibility of Failure

Firmware updates can go wrong in many ways. It is important to ensure that your update system is resilient to all of these failures. After all, if you brick devices, you cannot remotely fix them.

To protect against data corruption during the transfer, you should compare the received contents against a checksum to ensure the integrity of the update. Ideally, however, you will use code signing to provide both an integrity check and an assurance that the build comes from an approved source.
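
The checksum comparison can be sketched as follows, assuming the update is published alongside a SHA-256 digest in hex form (the choice of SHA-256 and the function name are assumptions). Note that this only detects corruption; it does not establish who produced the image, which is what code signing adds and which is not shown here.

```python
import hashlib
import hmac


def verify_checksum(image: bytes, expected_sha256_hex: str) -> bool:
    """Return True if the received image matches the published SHA-256 digest."""
    actual = hashlib.sha256(image).hexdigest()
    # compare_digest avoids leaking where the two strings first differ
    return hmac.compare_digest(actual, expected_sha256_hex.lower())
```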

Updates should, ideally, be atomic: either the whole update is applied, or no update occurs at all. This is especially important in guarding against corruption due to loss of network connectivity or loss of power during an update.

The most common approach to atomic updates is to have dual application partitions in device storage, which we will call “A” and “B”. This approach is akin to the common “double buffering” pattern.

  • The bootloader will boot from partition A, which is currently active.
  • When an update is received, the contents will be placed into partition B.
    • If the update process fails for any reason, the bootloader will continue to boot from partition A.
    • If the update succeeds, the bootloader will boot from partition B.
      • If there is a problem identified during the boot process, this can be indicated to the bootloader, which can automatically fall back to partition A
  • When the next update is received, it will be placed into partition A.
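
The A/B flow above can be modeled in a few lines. This is a simplified simulation under stated assumptions, not bootloader code: `BootState`, the `pending` field, and the `image_ok` and `boots_ok` flags are hypothetical stand-ins for persistent boot metadata, image verification, and the application's boot-time health report.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BootState:
    """Hypothetical persistent state shared by bootloader and application."""
    active: str = "A"
    pending: Optional[str] = None  # slot awaiting its first successful boot
    slots: dict = field(default_factory=lambda: {"A": b"v1", "B": b""})


def inactive(state: BootState) -> str:
    return "B" if state.active == "A" else "A"


def stage_update(state: BootState, image: bytes, image_ok: bool) -> bool:
    """Write the update into the inactive slot; the active slot is untouched."""
    target = inactive(state)
    state.slots[target] = image
    if not image_ok:  # e.g. checksum or signature mismatch
        return False
    state.pending = target
    return True


def boot(state: BootState, boots_ok: bool) -> str:
    """Try a pending slot once; keep the old slot if the new image misbehaves."""
    if state.pending is not None:
        candidate, state.pending = state.pending, None
        if boots_ok:
            state.active = candidate
    return state.active
```

The key property is that a failed transfer or a failed first boot never changes `active`, so the device always has a known-good image to return to.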

However, memory and storage constraints can make atomic updates difficult to achieve with many embedded devices.

  • RAM may be sufficiently limited such that an entire update payload cannot be received before being applied, but must be streamed to flash instead.
    • This means that the contents of flash could be overwritten with data before it can be determined that the checksum or signature matches the expected value.
  • Flash may be sufficiently limited such that there is not space for a bootloader, two complete applications, and other artifacts.

In cases where the dual partition approach cannot be used, we will create a “fallback” application, which effectively takes the place of the second partition:

  • Updates will always be placed into the main application slot.
  • If an update fails, or some problem is identified during the application boot process, the bootloader will automatically boot into the fallback application.
  • The fallback application contains only the minimal amount of support to configure the processor and its components so that it can connect to the server and request a new update. This allows it to be much smaller than a complete second application.

This fallback firmware must be heavily tested so that it can be trusted to restore a system to a working state in the event of an update failure. Ideally, it will not need any updates once the device is deployed, as there is no fallback in place when the fallback firmware update fails.
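
The bootloader's choice between the main application and the fallback can be sketched as a single decision function. The failed-boot counter and its threshold are assumptions added for illustration; the text above only requires that a failed update or a problem during application boot leads to the fallback.

```python
def select_image(app_valid: bool, failed_boots: int,
                 max_failed_boots: int = 3) -> str:
    """Pick what the bootloader should run (hypothetical policy)."""
    # app_valid: result of the integrity/signature check on the main slot
    # failed_boots: consecutive boots in which the application reported a problem
    if not app_valid or failed_boots >= max_failed_boots:
        return "fallback"
    return "application"
```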

If you cannot support any of these schemes with your current resources, you will need to add more storage or avoid OTA updates completely. You run the risk of a power or network failure completely disabling your devices in the field. Wired updates are less sensitive here, as long as you provide a tool that can be used to re-flash the device from a corrupted state (e.g., a DFU utility).

Sub-topics and Variations

  • Another vulnerability in the LPC55S69 ROM / Oxide describes a problem in the LPC55S69 In-System Programming code for the signing mechanism, which allows an attacker to gain non-persistent code execution with a carefully crafted update regardless of whether the update is signed

References

Monorepo Development

This page collects articles, tools, and resources discussing our approach to internally developing code in a monorepo while distributing code to standalone repositories for consumption.

Warning

We are not monorepo advocates, and we will tolerate no arguments about what the “correct” approach is. We simply offer documentation and rationale for our current approach. How you organize your code is dependent on many factors that are unique to your own goals. There are reasons to keep your code in one place, and reasons to work with many distinct repositories. Additionally, we do not have a pure monorepo setup – there are a number of repositories that live completely outside of the monorepo environment.

Articles

Infrastructure

Alternatives

Tools exist that allow you to work with a set of repositories as if they were a monorepo, while actually keeping them separated. These might be more appealing to those facing similar problems.

  • josh-project/josh is a related effort to ours. It is focused on supporting sparse-checkouts and implements a caching proxy.
  • repo allows you to create a monorepo-like working environment by combining multiple repositories. It is similar in workflow to the monorepo-and-split workflow.
    • We did not go down this path because it assumes you use Gerrit, and our automated workflows are tied into PRs. It will be easier to deal with our existing quality enforcement automation in a monorepo setup.
  • Sourcegraph provides a tool that supports “batch changes”, allowing you to coordinate changes across repositories
    • We did not choose this because we would need to set up a Sourcegraph server. Scripting the use of git tools within a monorepo was much easier for us to develop, test, and automate.
  • Zephyr’s West uses the concept of workspaces and manifest files to allow you to work with multiple git repositories. You can define a manifest for your monorepo-to-be.
    • This was simply not appealing to us.

References

An Experiment: Develop in a Monorepo and Distribute to Standalone Repositories

I am ideologically aligned with distributing code in repositories that serve a single purpose. As a software consumer, I think that smaller, single purpose repositories and components are more approachable and easier to use. Of course, every decision comes with its tradeoffs, and this one has caused us significant pain. The overhead in developing and …

Software is a cost, not an asset

6 December 2022 by Phillip Johnston • Last updated 28 March 2024

Software companies often think of their code (or their software application) as an asset. Given one perspective, this makes sense: software has value, and you can buy it or sell it. This applies to organizations whose product is the code that they are selling. But for most teams, the code is not the product. Therefore, it is not an asset – it is a cost. The real asset is the business capability/value the software provides. Augmented by software, the business (or customer) can now do something they couldn’t do …
