Case Study: Boeing 737 MAX Crashes

18 August 2020 by Phillip Johnston • Last updated 10 June 2021

The Boeing 737 MAX-8 and MAX-9 aircraft were grounded after Ethiopian Airlines and Lion Air crashes both resulted in the deaths of everyone on board. The implicated system is the Maneuvering Characteristics Augmentation System (MCAS), which is part of the flight management computer software. The MCAS was designed to correct for an increased potential to stall the plane due to mechanical design changes. When fed an Angle-of-Attack reading from a bad sensor, the MCAS triggered at an improper time, forcing the plane nose-down and overriding pilot input. The …

To access this content, you must purchase a Membership - check out the different options here. If you're a member, log in.

Case Study: Toyota Unintended Acceleration

18 August 2020 by Phillip Johnston • Last updated 13 September 2022

Unintended acceleration, or the loss of driver control over engine power, in Toyota cars is suspected in the deaths of at least 89 people and injuries to at least 57 more (with hundreds of additional cases being settled). Toyota, the Department of Transportation, the U.S. National Highway Traffic Safety Administration, and journalists cited “driver error” or “stuck acceleration pedals” due to floor mats as the primary cause. The official finding of the joint NHTSA and NASA investigation confirmed this opinion. NASA’s team did find one theoretical way for Toyota’s …


Case Study: Therac-25

18 August 2020 by Phillip Johnston • Last updated 13 September 2022

The Therac-25 deaths are a canonical example of software accidents. Two different errors caused multiple patients to receive massive overdoses of radiation, resulting in serious injuries or death.

Case Studies

This is a well-documented accident, so we will refer you to the following sources for understanding what went wrong:

  • Summary Video by Phil Koopman
  • Wikipedia: Therac-25
  • IEEE: An Investigation of the Therac-25 Accidents, by Nancy Leveson and Clark Turner, is one of the original investigatory articles published on the topic
  • Medical Devices: The Therac-25, by Nancy Leveson, is an updated and …


nRF52 Security Vulnerability: APPROTECT Bypass

18 August 2020 by Phillip Johnston • Last updated 15 August 2023

nRF52 processors provide “access port protection” (APPROTECT register) to enable readback protection and to disable the debug interface. When this feature is enabled, a debugger’s read/write access to CPU registers and memory-mapped addresses is blocked. Access port protection is often set in production firmware to prevent access to debug interfaces on a device. According to Nordic, it cannot be disabled without erasing all RAM and Flash memory. LimitedResults has worked out a method for bypassing APPROTECT and reactivating the SWD interface using voltage glitching to inject a …


Case Study: GPS Rollover Problems (2019)

18 August 2020 by Phillip Johnston

6 April 2019 was “GPS Rollover day”. The existing GPS standard uses a 10-bit counter to keep track of the number of weeks elapsed since 6 January 1980. Since it’s a 10-bit counter, the max count is 2^10 - 1, or 1023, weeks. On the 1024th week, it will roll over to week 0. If you’re keeping score, you’ll be surprised that this was a big deal at all: we already experienced a GPS rollover on 21 August 1999. Many GPS equipment manufacturers understood the rollover requirements and designed their systems to accommodate it. We knew in advance …
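The week-counter arithmetic described above can be sketched in a few lines. This is only an illustration of the modulo behaviour, not how any particular receiver works; the function names, and the idea of resolving the ambiguity with a reference date known to be in the past (such as a firmware build date), are assumptions made for the example.

```python
from datetime import date, timedelta

GPS_EPOCH = date(1980, 1, 6)   # week 0 of the GPS week count
WEEK_MODULUS = 1 << 10         # a 10-bit counter holds 0..1023


def broadcast_week(day: date) -> int:
    """Return the 10-bit week number that corresponds to `day`."""
    full_weeks = (day - GPS_EPOCH).days // 7
    return full_weeks % WEEK_MODULUS


def resolve_week(week_10bit: int, not_before: date) -> int:
    """Recover the full week count from a 10-bit value.

    `not_before` is a hypothetical reference date known to be in the past.
    The result is the first week count at or after that reference whose
    low 10 bits match the received value.
    """
    reference_week = (not_before - GPS_EPOCH).days // 7
    weeks_ahead = (week_10bit - reference_week) % WEEK_MODULUS
    return reference_week + weeks_ahead


def week_start(full_week: int) -> date:
    """Convert a full (unwrapped) week count back to a calendar date."""
    return GPS_EPOCH + timedelta(weeks=full_week)
```

A receiver that feeds the raw 10-bit value straight into `week_start` lands back in 1980 after a rollover, which is the class of failure the case study is about. Resolving against a reference date only postpones the ambiguity by 1024 weeks from that reference.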


Paper: Designing Software for Ease of Extension and Contraction

5 June 2020 by Phillip Johnston • Last updated 16 August 2024

“Designing Software for Ease of Extension and Contraction” is a classic paper by David L. Parnas which was published in the IEEE Transactions on Software Engineering in 1979. Designing software to be extensible and easily contracted is discussed as a special case of design for change. A number of ways that extension and contraction problems manifest themselves in current software are explained. Four steps in the design of software that is more flexible are then discussed. The most critical step is the design of a software structure called …


Paper: On the Criteria to Be Used in Decomposing Systems into Modules

5 June 2020 by Phillip Johnston • Last updated 15 August 2023

“On the Criteria to Be Used in Decomposing Systems into Modules” is a classic paper published by David Parnas in the Communications of the ACM in 1972. This paper discusses modularization as a mechanism for improving the flexibility and comprehensibility of a system while allowing the shortening of its development time. The effectiveness of a “modularization” is dependent upon the criteria used in dividing the system into modules. A system design problem is presented and both a conventional and unconventional decomposition are described. It is shown that the …


Design Pattern Catalogue

Design patterns are valuable concepts to add to your mental library. These are non-obvious solutions to scenarios and problems frequently encountered in the programming world.

Exposure to patterns shows you how many of your problems have been previously solved by others, freeing you to focus your limited time and energy reserves on those problems which are truly novel for your product. You will also develop a mental library of patterns that you can work toward in refactoring efforts.

Table of Contents:

Architecture

Architectural patterns are general, reusable solutions to software architecture problems. They have broader scope than software design patterns and are typically intended to resolve software engineering issues.

General information about architectural patterns:

Common architectural patterns for embedded systems include:

General Software

Subsections:

  1. General Software Design Patterns
  2. Decoupling Patterns
  3. Asynchronous and Event-Driven Patterns
  4. Patterns for Displacing Legacy Systems
  5. Multi-threading Patterns
  6. Refactoring Patterns

General Software Design Patterns

For generally useful software architecture patterns, see:

Generally useful programming patterns:

Decoupling Patterns

  • Pointer Array Pattern – Write generic drivers by accessing registers through a table lookup.
  • Callback Function – Provide a reference to a function (or function-like object) that will be invoked when a specific condition holds true.
  • Facade Pattern – Provide a unified interface to a set of interfaces in a subsystem. Facade defines a higher-level interface that makes the subsystem easier to use.
  • Mediator Pattern – Define an object that encapsulates how a set of objects interact. Mediator promotes loose coupling by keeping objects from referring to each other explicitly, and it lets you vary their interaction independently.
    • Main Pattern – decouple modules (and keep as many as possible independent of the underlying platform) by making the application responsible for the connections between and configuration of each module. We consider this a variation of the Mediator pattern.
    • Managing Complexity with the Mediator and Facade Patterns provides descriptions and examples of those patterns
  • Template Method Pattern – Defines the skeleton of an algorithm or operation in high-level steps. Users or subclasses can override or implement the behavior of specific steps within the algorithm, but are not able to modify the general algorithm flow itself.
  • Generation Gap – Separate generated code from non-generated code through the Template Method pattern. This ensures that customizing the generated code does not require users to modify generated code.
  • Non-Virtual Interface – controls how methods in a base class are overridden. Base classes use public, non-virtual functions that can be called by clients. Overridable methods are defined as protected, virtual members.
  • Adapter – The adapter pattern allows the interface of an existing module to be used as/with another interface. This is done by adding in a thin “adapter” module that maps the desired interface onto the existing interface.
  • Configuration Table Pattern – store configuration and initialization information inside of a table, and pass the table to an initialization routine that iterates over the table entries.
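As a rough illustration of the Configuration Table Pattern from the list above, the sketch below keeps pin settings in a table and hands that table to a single initialization routine. The `PinConfig` fields, the mode names, and the `write_register` callback are all hypothetical names chosen for the example rather than part of any real driver API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class PinConfig:
    """One row of the table: everything needed to set up a single pin."""
    pin: int
    mode: str            # "input" or "output" (hypothetical mode names)
    initial_level: int


# The table is plain data; adding a pin means adding a row, not new code.
BOARD_PINS = (
    PinConfig(pin=3, mode="output", initial_level=1),
    PinConfig(pin=7, mode="input", initial_level=0),
)


def init_pins(table: Iterable[PinConfig],
              write_register: Callable[[int, str, int], None]) -> int:
    """Iterate over the table, apply each entry, and return the entry count."""
    count = 0
    for entry in table:
        write_register(entry.pin, entry.mode, entry.initial_level)
        count += 1
    return count
```

Because the routine receives both the table and the register-writing function as arguments, the same code can serve a different board by swapping the table, which also overlaps with the Callback Function entry above.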

Asynchronous and Event-Driven Patterns

  • Active Object – decouples method execution from method invocation for objects that each reside in their own thread of control. Typically, an active object is constructed using an internal thread and a queue of operations or events that will be executed on the active object’s thread.
  • Message Passing – invoke behavior by sending messages to communicate what you want done (in contrast to the default approach of directly invoking functions provided by a module/object).
  • Message Queue – components used for communication in software systems that represent a queue of messages or events that are awaiting processing.
  • Event Loop – waits for and processes (or dispatches) events or messages in a program.
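The Active Object description above combines the other three entries: callers pass messages into a queue, and a private event loop executes them on the object's own thread. The following is a minimal sketch under that reading; `ActiveCounter` and its methods are invented for the example, and a real implementation would need error handling and a shutdown policy suited to the system.

```python
import queue
import threading
from concurrent.futures import Future


class ActiveCounter:
    """Active object: callers enqueue requests; one private thread runs them."""

    def __init__(self) -> None:
        self._requests: queue.Queue = queue.Queue()
        self._total = 0  # only ever touched from the worker thread
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def add(self, amount: int) -> Future:
        """Invocation: returns at once with a Future for the eventual result."""
        result: Future = Future()
        self._requests.put((amount, result))
        return result

    def stop(self) -> None:
        """Ask the worker to exit after it has drained earlier requests."""
        self._requests.put(None)   # sentinel value
        self._worker.join()

    def _run(self) -> None:
        """Execution: process queued requests in order on the object's thread."""
        while True:
            item = self._requests.get()
            if item is None:
                break
            amount, result = item
            self._total += amount
            result.set_result(self._total)
```

Since `_total` is modified only by the worker thread, no lock is needed around it; the queue is the single synchronization point.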

Patterns for Displacing Legacy Systems

  • Patterns of Legacy Displacement, by Ian Cartwright, Rob Horn, and James Lewis, is a collection of patterns that can be used to modernize legacy systems. The primary link contains a general discussion on updating legacy systems, as well as an example for removing a Middleware
    • Critical Aggregator – Combine data from different parts of the business to support making critical decisions
    • Divert the Flow – First divert cross-organization activities away from legacy
    • Extract Product Lines – Identify and separate systems by product line.
    • Feature Parity – Replicate existing functionality of a legacy system using a new technology stack.
    • Legacy Mimic – New system interacts with legacy system in such a way that the old system is not aware of any changes.
    • Transitional Architecture – Software elements installed to ease the displacement of a legacy system that we intend to remove when the displacement is complete.

Multi-threading Patterns

Refactoring Patterns

Embedded Software

For a general discussion of embedded software patterns, see:

Subsections:

General

  • Monitor-Actuator Pair Design Pattern
    • How can we avoid a single point of failure in a safety-critical component?
    • Use two independent hardware components operating as a “monitor-actuator” CPU pair. The actuator CPU is the component that actually performs the computation or control function. An independent monitor chip is used to detect when the actuator has failed and mitigate the failure (e.g., by resetting the actuator or triggering an emergency stop).
  • Digital twin – a virtual representation that serves as the real-time digital counterpart of a physical object or process.

Watchdog Timers

For general watchdog timer usage patterns:

Hardware Interfaces

State Machines

  • Ultimate Hook
    • How can you provide a consistent policy for handling events, while allowing clients to override every aspect of the default behavior?
    • Hierarchical state nesting – a composite state can define the default behavior (the common look and feel) and supply an “outer shell” for nesting client substates
  • Reminder
    • How can you prevent loosely related functions of a system from becoming tightly coupled by a common event?
      • Consider, for example, periodic data acquisition, in which a sensor producing the data needs to be polled at a predetermined rate
      • Assume that a periodic TIMEOUT event is dispatched to the system at the desired rate to provide the stimulus for polling the sensor.
      • Because the system has only one external event (the TIMEOUT event), it seems that this event needs to trigger both the polling of the sensor and the processing of the data
      • We don’t want that
    • Invent a stimulus to propagate a reminder that data is ready for processing
    • Moreover, you can use state nesting to arrange these two functions in a hierarchical relation, which gives you even more control over the behavior
  • Deferred Event
    • How can we defer an event (which can be postponed within a certain limit) that is received at an inconvenient moment until the system is finished processing the current activity?
    • Defer the new request and handle it at a more convenient time, which effectively leads to altering the sequence of events presented to the state machine
  • Orthogonal Component
    • How can we model objects consisting of relatively independent parts with their own state behavior without using orthogonal regions?
    • Use object composition. Concurrency virtually always arises within objects by aggregation; that is, multiple states of the components can contribute to a single state of the composite object.
  • Transition to History
    • How can we handle a high-priority event during a state transition in a composite state, and then return to the most recent substate of the composite state after processing completes?
    • Store the most recently active leaf substate as a dedicated “history” data member, and use that member during a “transition to history” state
  • State-Local Storage
    • How can we reduce the runtime memory requirements for complex state machines, where we have multiple complex states that don’t require use of all of the extended state variables needed by other states?
    • Allow the state machine designer to reduce the memory footprint of a state machine by providing variables local to states. As the state machine transitions from one state to another, the SLS mechanism automatically overlays the extended-state variables for the target state configuration on top of the no-longer-needed variables for the source state configuration. This results in a lower memory footprint.
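To make the Deferred Event entry above more concrete, here is a small sketch of a two-state machine that parks requests arriving while it is busy and recalls them one at a time when it becomes idle. The `Server` class, the state names, and the `REQUEST` and `DONE` event names are assumptions for the example and do not come from a specific framework.

```python
from collections import deque


class Server:
    """Two-state machine: 'idle' accepts requests, 'busy' defers them."""

    def __init__(self) -> None:
        self.state = "idle"
        self.deferred = deque()   # requests parked while busy
        self.handled = []         # order in which requests were started

    def dispatch(self, event: str, payload=None) -> None:
        if self.state == "idle" and event == "REQUEST":
            self.handled.append(payload)
            self.state = "busy"
        elif self.state == "busy" and event == "REQUEST":
            self.deferred.append(payload)      # inconvenient now: defer it
        elif self.state == "busy" and event == "DONE":
            self.state = "idle"
            if self.deferred:                  # recall one deferred request
                self.dispatch("REQUEST", self.deferred.popleft())
```

The deferral queue here is unbounded for brevity. Since the pattern assumes events can only be postponed within a certain limit, a production version would likely cap the queue and decide what to do when it is full.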

Memory Allocation

  • How do you allocate memory to store your data structures?
  • Fixed Allocation
    • How can you ensure you will never run out of memory?
    • Pre-allocate objects during initialization.
  • Variable Allocation
    • How can you avoid unused empty space?
    • Allocate and deallocate variable-sized objects as and when you need them.
  • Memory Discard
    • How can you allocate temporary objects?
    • Allocate objects from a temporary workspace and discard it on completion.
  • Pooled Allocation
    • How can you allocate a large number of similar objects?
    • Pre-allocate a pool of objects, and recycle unused objects.
  • Compaction
    • How do you recover memory lost to fragmentation?
    • Move objects in memory to remove unused space between them.
  • Reference Counting
    • How do you know when to delete a shared object?
    • Keep a count of the references to each shared object, and delete each object when its count is zero.
  • Garbage Collection
    • How do you know when to delete shared objects?
    • Identify unreferenced objects, and de-allocate them.
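The Pooled Allocation entry above can be sketched as a fixed set of buffers created once and then recycled. `BufferPool` and its method names are hypothetical; in Python the example only models the bookkeeping, since the interpreter still manages the underlying memory.

```python
class BufferPool:
    """Fixed set of equally sized buffers created once, then recycled."""

    def __init__(self, count: int, size: int) -> None:
        # All allocation happens here, during initialization.
        self._free = [bytearray(size) for _ in range(count)]

    def acquire(self) -> bytearray:
        """Hand out a free buffer, or fail if the pool is exhausted."""
        if not self._free:
            raise MemoryError("pool exhausted")
        return self._free.pop()

    def release(self, buf: bytearray) -> None:
        """Clear a buffer in place and return it to the pool for reuse."""
        buf[:] = bytes(len(buf))
        self._free.append(buf)

    @property
    def available(self) -> int:
        return len(self._free)
```

Failing loudly when the pool is empty is one policy among several; blocking until a buffer is released or dropping the oldest buffer are alternatives that depend on the application.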

Managing Limited Memory

  • How can you manage memory use across a whole system?
  • Memory Limit
    • How can you share out memory between multiple competing components?
    • Set limits for each component and fail allocations that exceed the limits.
  • Small Interfaces
    • How can you reduce the memory overheads of component interfaces?
    • Design interfaces so that clients control data transfer.
  • Captain Oates
    • How can you fulfill the most important demands for memory?
    • Sacrifice memory used by less vital components rather than fail more important tasks.
  • Read-Only Memory
    • What can you do with read-only code and data?
    • Store read-only code and data in read-only memory.
  • Hooks
    • How can you change information in read-only storage?
    • Access read-only information through hooks in writable storage and change the hooks to give the illusion of changing the information.
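As an illustration of the Memory Limit entry above, the sketch below charges each allocation to a named component and refuses any request that would push that component past its limit. The `MemoryBudget` class and the component names used with it are assumptions for the example.

```python
class MemoryBudget:
    """Tracks bytes charged to each component against a fixed limit."""

    def __init__(self, limits: dict) -> None:
        self._limits = dict(limits)
        self._used = {name: 0 for name in limits}

    def allocate(self, component: str, size: int) -> bytearray:
        """Allocate `size` bytes for `component`, or fail if over its limit."""
        if self._used[component] + size > self._limits[component]:
            raise MemoryError(f"{component} would exceed its limit")
        self._used[component] += size
        return bytearray(size)

    def free(self, component: str, buf: bytearray) -> None:
        """Credit the buffer's size back to the component's budget."""
        self._used[component] -= len(buf)

    def used(self, component: str) -> int:
        return self._used[component]
```

One component exhausting its own budget does not affect the others, which is the point of the pattern: failures stay local to the component that caused them.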

Small Data Structures

  • How can you reduce the memory needed for your data?
  • Packed Data
    • How can you reduce the memory needed to store a data structure?
    • Pack data items within the structure so that they occupy the minimum space.
  • Sharing
    • How can you avoid multiple copies of the same information?
    • Store the information once, and share it everywhere it is needed.
  • Copy-on-Write
    • How can you change a shared object without affecting its other clients?
    • Share the object until you need to change it, then copy it and use the copy in future.
  • Embedded Pointer
    • How can you reduce the space used by a collection of objects?
    • Embed the pointers maintaining the collection into each object.
  • Multiple Representations
    • How can you support several different implementations of an object?
    • Make each implementation satisfy a common interface.
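The Copy-on-Write idea in this section (share the object until you need to change it, then copy it) can be sketched with a small list wrapper. `CowList` is a hypothetical class written for this example, and its sharing flag is deliberately conservative: a handle may occasionally make a copy it did not strictly need, but it never modifies storage that another handle can see.

```python
class CowList:
    """List-like handle that shares storage until one holder writes."""

    def __init__(self, items=()) -> None:
        self._items = list(items)
        self._shared = False   # True while another handle may alias _items

    def clone(self) -> "CowList":
        """Create a second handle that shares this one's storage."""
        other = CowList()
        other._items = self._items      # share, do not copy
        other._shared = self._shared = True
        return other

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value) -> None:
        if self._shared:
            self._items = list(self._items)   # private copy on first write
            self._shared = False
        self._items[index] = value

    def shares_storage_with(self, other: "CowList") -> bool:
        return self._items is other._items
```

Reads never trigger a copy, so clones that are only read cost no additional storage for the data itself.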

Compression

  • How can you fit a quart of data into a pint pot of memory?
  • Table Compression
    • How do you compress many short strings?
    • Encode each element in a variable number of bits so that the more common elements require fewer bits.
  • Difference Coding
    • How can you reduce the memory used by sequences of data?
    • Represent sequences according to the differences between each item.
  • Adaptive Compression
    • How can you reduce the memory needed to store a large amount of bulk data?
    • Use an adaptive compression algorithm.
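Difference Coding, as summarized above, stores each item as the change from its predecessor. The sketch below shows only that transformation; the function names are made up for the example, and no bit packing is performed.

```python
def delta_encode(samples):
    """Keep the first sample, then store only the change from the previous one."""
    if not samples:
        return []
    deltas = [samples[0]]
    for previous, current in zip(samples, samples[1:]):
        deltas.append(current - previous)
    return deltas


def delta_decode(deltas):
    """Rebuild the original sequence by accumulating the differences."""
    samples = []
    total = 0
    for delta in deltas:
        total += delta
        samples.append(total)
    return samples
```

The memory saving comes from a later step that this sketch leaves out: when neighbouring values are close together, the differences fit in fewer bits than the full values and can be stored in a narrower type.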

Secondary Storage

  • What can you do when you have run out of primary storage?
  • Application Switching
    • How can you reduce the memory requirements of a system that provides many different functions?
    • Split your system into independent executables, and run only one at a time.
  • Data File
    • What can you do when your data doesn’t fit into main memory?
    • Process the data a little at a time and keep the rest on secondary storage.
  • Resource Files
    • How can you manage lots of configuration data?
    • Keep configuration data on secondary storage, and load and discard each item as necessary.
  • Packages
    • How can you manage a large program with lots of optional pieces?
    • Split the program into packages, and load each package only when it’s needed.
  • Paging
    • How can you provide the illusion of infinite memory?
    • Keep a system’s code and data on secondary storage, and move them to and from main memory as required.

Fail-Safe Data Storage

User Interfaces

Distributed Systems

  • Distributed Embedded Scheduling
  • Distributed Time
  • Circuit Breaker Pattern

    The Circuit Breaker pattern is designed to detect failures and encapsulate the logic that prevents a system from executing an operation that is likely to fail. Instead of repeatedly making requests to a service that is likely unavailable or facing issues, the circuit breaker stops all attempts for a while, giving the troubled service time to recover.

  • Patterns of Distributed Systems, a project by Unmesh Joshi, documents a number of patterns. The primary page also describes general problem situations where these patterns can help, as well as how patterns can be sequenced together.
    • Clock-Bound Wait – Wait to cover the uncertainty in time across cluster nodes before reading and writing values so values can be correctly ordered across cluster nodes.
    • Consistent Core – Maintain a smaller cluster providing stronger consistency to allow a large data cluster to coordinate server activities without implementing quorum-based algorithms.
    • Emergent Leader – Order cluster nodes based on their age within the cluster to allow nodes to select a leader without running an explicit election.
    • Fixed Partitions – Keep the number of partitions fixed to keep the mapping of data to the partition unchanged when the size of a cluster changes.
    • Follower Reads – Serve read requests from followers to achieve better throughput and lower latency
    • Generation Clock – A monotonically increasing number indicating the generation of the server
    • Gossip Dissemination – Use random selection of nodes to pass on information to ensure it reaches all the nodes in the cluster without flooding the network
    • Heartbeat – Show a server is available by periodically sending a message to all the other servers.
    • High-water Mark – An index in the write ahead log showing the last successful replication.
    • Hybrid Clock – Use a combination of system timestamp and logical timestamp to have versions as date-time, which can be ordered
    • Idempotent Receiver – Identify requests from clients uniquely so they can ignore duplicate requests when client retries
    • Key-Range Partitions – Partition data in sorted key ranges to efficiently handle range queries.
    • Lamport Clock – Use logical timestamps as a version for a value to allow ordering of values across servers
    • Leader and Followers – Have a single server to coordinate replication across a set of servers.
    • Lease – Use time bound leases for cluster nodes to coordinate their activities.
    • Low-water Mark – An index in the write ahead log showing which portion of the log can be discarded.
    • Paxos – Use two consensus building phases to reach safe consensus even when nodes disconnect
    • Quorum – Avoid two groups of servers making independent decisions, by requiring majority for taking every decision.
    • Replicated Log – Keep the state of multiple nodes synchronized by using a write-ahead log that is replicated to all the cluster nodes.
    • Request Batch – Combine multiple requests to optimally utilise the network.
    • Request Pipeline – Improve latency by sending multiple requests on the connection without waiting for the response of the previous requests.
    • Request Waiting List – Track client requests which require responses after the criteria to respond is met based on responses from other cluster nodes.
    • Revert to Source – Identify the originating source of data and integrate to that
    • Segmented Log – Split log into multiple smaller files instead of a single large file for easier operations
    • Single Socket Channel – Maintain order of the requests sent to a server by using a single TCP connection
    • Singular Update Queue – Use a single thread to process requests asynchronously to maintain order without blocking the caller
    • State Watch – Notify clients when specific values change on the server
    • Two Phase Commit – Update resources on multiple nodes in one atomic operation.
    • Versioned Value – Store every update to a value with a new version, to allow reading historical values.
    • Version Vector – Maintain a list of counters, one per cluster node, to detect concurrent updates
    • Write-Ahead Log – Provide a durability guarantee without requiring the storage data structures to be flushed to disk, by persisting every state change as a command to an append-only log
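The Circuit Breaker pattern described earlier in this section can be sketched as a wrapper that counts consecutive failures and rejects calls for a fixed period once a threshold is reached. The class name, parameter names, and the policy of allowing a single trial call after the timeout are assumptions chosen for the example; real implementations vary in how they probe for recovery.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected without being attempted."""


class CircuitBreaker:
    def __init__(self, max_failures: int, reset_after: float,
                 clock=time.monotonic) -> None:
        self._max_failures = max_failures
        self._reset_after = reset_after   # seconds to stay open
        self._clock = clock               # injectable so tests can fake time
        self._failures = 0
        self._opened_at = None            # None means the circuit is closed

    def call(self, operation):
        """Run `operation` unless the circuit is open; track its outcome."""
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpenError("call rejected while circuit is open")
            # Timeout elapsed: allow a trial call. The failure count is left
            # at the threshold so a failed trial re-opens the circuit at once.
            self._opened_at = None
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self._failures >= self._max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0                # success resets the failure count
        return result
```

While the circuit is open the wrapped operation is never invoked, which is what gives the troubled service time to recover.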

Security

For general discussion of security patterns:

Security anti-patterns:

Safety

For a general discussion of safety architectural patterns:

Subsections:

Avoiding Single Points of Failure

For a general discussion on patterns for single points of failure:

Patterns:

  • Monitor-Actuator Pair Design Pattern
    • How can we avoid a single point of failure in a safety-critical component?
    • Use two independent hardware components operating as a “monitor-actuator” CPU pair. The actuator CPU is the component that actually performs the computation or control function. An independent monitor chip is used to detect when the actuator has failed and mitigate the failure (e.g., by resetting the actuator or triggering an emergency stop).

Critical System Isolation

For a general discussion on critical system isolation patterns:

Redundancy Management

For a general discussion on redundancy management patterns:

Data Integrity

For a general overview on data integrity patterns:

Testing

For a general overview of testing patterns, see:

Design for test patterns:

  • Dependency Injection
    • How do we design the system-under-test so that we can replace its dependencies at run time?
    • The client provides the depended-on object to the system-under-test
  • Dependency Lookup
    • How do we design the system-under-test so that we can replace its dependencies at run time?
    • The system-under-test asks another object to return the depended-on object before it uses it
  • Humble Object
    • How can we test a component that is tightly coupled to a framework which we cannot easily integrate into our test environment?
    • We extract all the logic from the hard-to-test component into a component that is testable via synchronous tests. As a result, the Humble Object component becomes a very thin adapter layer that contains very little code. Each time the Humble Object is called by the framework, it delegates to the testable component.
  • Test Hook
    • How can we make code testable when it is coupled with hard-coded dependencies?
    • We modify the behavior of the system-under-test or its dependencies to support testing by adding a functional hook or testing flag that can be changed at run time
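Dependency Injection, the first entry in the list above, can be shown with a very small example in which the client hands the depended-on object to the system-under-test. `Thermostat` and its `read_temperature` parameter are hypothetical names; in production the injected callable might wrap a real sensor driver, while a test passes a stub.

```python
class Thermostat:
    """System-under-test: its sensor dependency is supplied by the client."""

    def __init__(self, read_temperature, setpoint: float) -> None:
        self._read_temperature = read_temperature   # injected dependency
        self._setpoint = setpoint

    def heater_should_run(self) -> bool:
        """Decide using whatever temperature source was injected."""
        return self._read_temperature() < self._setpoint
```

A test can then construct `Thermostat(lambda: 15.0, setpoint=20.0)` and check the decision logic without any hardware attached.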

Cloud Computing

For a general overview of cloud computing patterns, see:
