Q&A: How Do You Reason About Flash Memory Size Overhead?

We received the following question from a reader:

I’m trying to decide on flash size for an MCU. The program size is currently 100kB. I’m considering 256 kB and 512 kB flash at the moment. How much “factor of safety” do you recommend?

This is a complex question, because there are so many factors which can change the recommendation.

At a basic level, there are two truths:

  1. If your product has an update capability, you will be asked to add new features
  2. If you can add new features, you will do so until there is no space left

Based on those truths, the foundational recommendation as a firmware engineer is to pick the largest storage space that your product can afford.

This is rarely a sufficient answer in practice, because firmware concerns are seldom the primary consideration. Teams are often pressured to use the cheapest parts possible to reduce the BOM cost and increase the profit margin on the device. Picking the part with the most memory without justification would be ill-advised.

Given that, I can’t recommend a magical overhead percentage or number that generally works. I can, however, give you pointers on what to think about when coming up with your margin of safety. By the end of this article, you will have additional tools to help you justify your recommendation. We will cover six areas:

  1. Intended Product Lifetime
  2. Feature Roadmap
  3. Debug Support
  4. Firmware Update Strategies
  5. Non-volatile Storage
  6. Don’t Forget Security

Intended Product Lifetime

As we mentioned above, if the product supports updates, we know that if it exists long enough we will eventually fill up the memory. The scenario plays out like this:

  1. Product A is released
  2. Product A is updated, space starts to fill up
  3. Product B comes out, which has more memory and processing power, and also supports new software features and/or is accompanied by a backend re-architecture
  4. The team decides that they want Product A to support the same new features or same new backend architecture that Product B supports
  5. Product A doesn’t have sufficient memory to make that work
  6. Company has to decide between a) spending the time and money to try to make it work on Product A, b) dropping support for Product A, c) supporting two different backends for the two different products, or d) dropping new features or the new backend

For the record, I have never seen a company choose option D.

The product lifetime was shortened here because memory ran out. Otherwise, the product still functions acceptably. Since we can’t predict the future, we don’t know if this will happen to your product. But we have seen this scenario play out repeatedly, and on a long enough timeline you are likely to encounter it as well.

If you’re lucky, it might be an acceptably long time, like what happened for Sonos. However, your customers will not be happy to learn that their hardware is deprecated due to memory constraints, even if you did support it for “a long time”.

This timeline can be extended with a larger memory safety net. Will that provide us with any guarantees? No. But it does provide room for future growth, especially if you depend on external services, integrations, or communication protocols.

We think those extra dollars are worth it from a sustainability and customer service perspective. It’s one factor that helps ensure your hardware will last as long as possible.

Feature Roadmap

It is important to involve yourself in product planning. What features are on the wish list? What are the new products coming down the line?

If you don’t know what’s coming down the line, there’s no way you can pick a reliable margin of safety. It’s also important to know that even if you do know what’s coming down the line, you still might pick a number that is wrong – but at least you made your guess with as much information as you could.

Once you have an idea of features, spend some time trying to model these features. Look for libraries that you will need to integrate with, and see how big they are when compiled. See if there is a demo project out there that implements basic support for what you need, and see how big that is when compiled. If you have the time, build a quick demo within your code base as a sanity check.
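One way to put numbers on "how big is it when compiled" is to read the output of the GNU `size` tool (for example, `arm-none-eabi-size` in its default Berkeley format) for each candidate build. The sketch below parses that output and approximates the flash footprint as `text + data`, which holds for typical linker scripts where initialized data is copied from flash at startup. The sample output and file names are invented for illustration.

```python
def flash_footprint(size_output: str) -> dict[str, int]:
    """Parse GNU `size` (Berkeley format) output; return flash bytes per file.

    Flash usage is approximated as text + data: code and constants live in
    `text`, and the initial values of `data` are also stored in flash.
    `bss` is zero-initialized RAM and takes no flash space.
    """
    footprints = {}
    for line in size_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        text, data = int(fields[0]), int(fields[1])
        footprints[fields[-1]] = text + data  # last column is the file name
    return footprints


# Illustrative output only; these numbers are invented for the example.
SAMPLE = """\
   text    data     bss     dec     hex filename
  98304    2048   16384  116736   1c800 app.elf
  45056     512    4096   49664    c200 demo_with_new_library.elf
"""

sizes = flash_footprint(SAMPLE)
for name, flash_bytes in sizes.items():
    print(f"{name}: {flash_bytes / 1024:.1f} kB of flash")
```

A standalone demo usually overstates the cost of a feature, because it duplicates startup code and libraries your application already contains. Treat the demo size as an upper bound and refine it with a quick build inside your own code base.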

Debug Support

You need to reserve margin for flashing debug software to your board. Otherwise, as space becomes constrained, you will eventually be unable to fit a debug build on your actual hardware. Then you’ll be stuck in the special hell that involves debugging -O3 microcontroller firmware.

We recommend reserving at least 20% overhead to support debugging. This might mean reducing or disabling optimizations, at least in one module, or adding logging support. For a more accurate estimate, compile your existing software in both debug mode (no optimizations or -Og, with -g specified for debug symbols) and release mode (e.g., -O2 or -O3). Compare the two numbers to come up with a “compression factor” (debug size divided by release size) and use that as a starting point for a safety margin. This is a very rough estimate: not all code is equally compressible. Add extra buffer to account for logging, print statements, or parametric data collection that may be added during validation.
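As a rough sketch of that comparison, the snippet below computes the debug-to-release ratio from two measured build sizes and uses it to project how much flash a debug build might need once the release build has grown. The function names and all byte counts are hypothetical; substitute the sizes you measure from your own builds.

```python
def debug_growth_factor(release_bytes: int, debug_bytes: int) -> float:
    """Ratio of debug build size to release build size."""
    if release_bytes <= 0:
        raise ValueError("release_bytes must be positive")
    return debug_bytes / release_bytes


def projected_debug_bytes(future_release_bytes: int, factor: float,
                          logging_bytes: int = 0) -> int:
    """Estimate the debug build size for a future release size.

    Assumes the measured ratio stays roughly constant, which is only an
    approximation: not all code shrinks equally under optimization.
    """
    return round(future_release_bytes * factor) + logging_bytes


KB = 1024
# Hypothetical measurements: 100 kB release build, 150 kB debug build.
factor = debug_growth_factor(100 * KB, 150 * KB)
# Hypothetical projection: release grows to 160 kB, plus 8 kB of logging.
needed = projected_debug_bytes(160 * KB, factor, logging_bytes=8 * KB)
print(f"debug build is {factor:.2f}x release; plan for about {needed // KB} kB")
```

If the projected debug size does not fit in the usable application space of a candidate part, you already know you will lose on-target debug builds before the roadmap is finished.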

If you disable assert calls in release mode, I recommend running the same experiment with a “release-mode-assert-enabled” build. That way, you can see the overhead you need to preserve just to support the use of assert statements to try to catch errors while debugging.

Firmware Update Strategies

Firmware update strategies are a major component of flash sizing, and one that is often not considered until after parts are selected. Figure out how your product is going to support updates over the long term, and factor that into how you calculate storage overhead.

There are three common update strategies:

  1. There is a bootloader and a firmware image, updates overwrite the firmware image. If something goes wrong, the bootloader waits for a new image.
  2. There is a bootloader and two “ping-pong” buffers. One buffer is active, the other is backup. When an update occurs, the backup buffer is overwritten, it becomes the active buffer, and the other buffer becomes backup. If something goes wrong, the bootloader switches to the backup firmware image.
  3. There is a bootloader, a firmware image, and a “failsafe” firmware image. If something goes wrong, the bootloader switches to the “failsafe” image, which knows how to call home and initiate an update.

Each of these approaches has a different memory layout.

All three of them require a bootloader. It would be good to have this bootloader ready before the final part selection is made. Then, you will have real memory sizes from which to calculate the requirements. The complexity of the bootloader will largely drive the size; a bootloader that can fall back to a failsafe image doesn’t need to know how to update itself, while a bootloader with no fallback option will need to be able to receive updates. A rough rule of thumb for bare metal or RTOS-based microcontroller projects is 64 kB, although much smaller and much larger bootloaders are easily created. Be sure to add some overhead for future growth if you have a current bootloader.

If you have a ping-pong buffer setup, each firmware image gets only half of the space that remains once the bootloader is accounted for. So, with a 32 kB bootloader and 512 kB flash size, you only have (512 kB - 32 kB) / 2 = 240 kB of usable space for a single firmware image. That’s less than it seems on the surface.

If you have a failsafe firmware image that will be flashed to the device, it is likewise good to create that before picking the final part. This firmware image just needs to know how to boot the device, turn on any components that are needed to perform an update, call home to ask for an update, receive the update, and apply it. Many of the standard firmware capabilities can be removed from the failsafe version, since they are not needed for the basic operations of calling home and initiating an update. With 512 kB flash, let’s say you have a 32 kB bootloader and 64 kB failsafe firmware. That leaves you 512 kB - 32 kB - 64 kB = 416 kB for your primary firmware image.
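The arithmetic for the three layouts can be captured in a few lines, which makes it easy to compare candidate parts side by side. This is a minimal sketch using the made-up sizes from this section; the function and strategy names are my own. It ignores erase-sector alignment, which on a real part will usually round each region up to a sector boundary.

```python
KB = 1024


def app_space(flash: int, bootloader: int, strategy: str,
              failsafe: int = 0) -> int:
    """Return the bytes available for one primary firmware image."""
    remaining = flash - bootloader
    if strategy == "single":        # bootloader + one image
        usable = remaining
    elif strategy == "ping_pong":   # bootloader + two equal image slots
        usable = remaining // 2
    elif strategy == "failsafe":    # bootloader + failsafe image + primary image
        usable = remaining - failsafe
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    if usable <= 0:
        raise ValueError("layout does not fit in flash")
    return usable


for flash_kb in (256, 512):
    single = app_space(flash_kb * KB, 32 * KB, "single")
    ping_pong = app_space(flash_kb * KB, 32 * KB, "ping_pong")
    failsafe = app_space(flash_kb * KB, 32 * KB, "failsafe", failsafe=64 * KB)
    print(f"{flash_kb} kB flash: single={single // KB} kB, "
          f"ping-pong={ping_pong // KB} kB, failsafe={failsafe // KB} kB")
```

With the same hypothetical 32 kB bootloader, a 256 kB part leaves 112 kB per image in a ping-pong layout. The reader's current 100 kB program would fit, but with only 12 kB to spare.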

Many product development efforts begin without a run-time firmware update scheme in place. The firmware image is flashed to the chip with a tool. Then, after the parts are already locked in, firmware update support is added. Sometimes, there isn’t sufficient leftover space to support the desired update mode. This is why it’s important to decide on an update strategy and to model it before the parts are locked.

Perhaps runtime updates should be the first feature we implement on any firmware project!

Note: All numbers used in this section are completely made up for demonstration purposes. Do not use them as estimates for program sizes; your application will be different.

Non-volatile Storage

Occasionally, products require some kind of non-volatile information storage. This is information that we want to write to the device at specific times. It should not be erased during normal operations, and it should persist across updates.

Some processors provide non-volatile storage in a separate memory region; others do not. Some teams handle this with a dedicated EEPROM or flash chip on the PCB; others do not. If you have neither option, you will likely want to allocate a reserved region in the internal flash memory for future non-volatile data storage needs.

This non-volatile storage can be used for any number of reasons:

  • Serial number
  • MAC addresses
  • Device-specific security keys
  • Run-time device settings
  • Factory calibration data:
    • IMU calibration & offset tables
    • Camera color offset table
    • Camera white balance offset table

Calibration and configuration data vary widely from one device to another, so you will need to understand what data must be stored. Check with the factory, electrical engineering, and application software teams at a minimum to figure out what configuration and calibration data might be needed. Then, double that number to account for future growth. If the doubled number seems truly excessive, you can reduce the overhead (maybe you’re planning for a big 8 kB table to be stored, but all other needs are in the 4–64 byte range).

At a minimum, reserve one 4 kB page for non-volatile storage. Ideally this will be in a region that supports “locking”, so the page cannot be modified accidentally.
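The rule above (sum the known needs, double the total, and reserve whole pages with a one-page minimum) can be sketched as follows. The item names and sizes are hypothetical placeholders, and the 4 kB page size is an assumption; check the erase granularity of your actual part.

```python
import math


def nvm_reservation(items: dict[str, int], page_size: int = 4096,
                    growth_factor: float = 2.0) -> int:
    """Return the bytes to reserve for non-volatile storage.

    Sums the known needs, multiplies by a growth factor, and rounds up to
    whole pages, with a minimum of one page.
    """
    budget = math.ceil(sum(items.values()) * growth_factor)
    pages = max(1, math.ceil(budget / page_size))
    return pages * page_size


# Hypothetical sizes in bytes; gather real numbers from your own teams.
small_needs = {
    "serial_number": 16,
    "mac_addresses": 12,
    "security_keys": 64,
    "device_settings": 256,
    "imu_calibration": 512,
}
print(nvm_reservation(small_needs))  # 860 bytes doubled still fits in one page

with_table = dict(small_needs, camera_color_table=8192)
print(nvm_reservation(with_table))   # doubling the large table adds several pages
```

The second case shows why one large table can dominate the budget. That is the situation where you may decide to apply the growth factor only to the small items.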

Don’t Forget Security

Many product development efforts that we have observed don’t even consider security in the initial development stages. It’s usually tacked on after the initial launch, sometimes even after an incident occurs. And some companies never secure their firmware.

If you haven’t thought of security yet, now is a good time. Those TLS and encryption libraries are some of the most costly you’ll come across in terms of RAM and ROM. You might be surprised to learn that wolfSSL alone requires 30–100 kB of ROM.

You might have much less overhead than you think once you factor security-related libraries into the equation.
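To see how quickly that margin shrinks, here is a small sketch using the reader's numbers (a 100 kB program and 256 kB or 512 kB of flash) and the 30–100 kB library range mentioned above. It deliberately ignores the bootloader, update layout, and non-volatile storage, so the real headroom would be smaller still. The function name is my own.

```python
def free_after_library(flash_kb: int, program_kb: int,
                       lib_min_kb: int, lib_max_kb: int) -> tuple[int, int]:
    """Return (best_case_free_kb, worst_case_free_kb) after adding a library."""
    free_now = flash_kb - program_kb
    return free_now - lib_min_kb, free_now - lib_max_kb


for flash_kb in (256, 512):
    best, worst = free_after_library(flash_kb, 100, 30, 100)
    print(f"{flash_kb} kB flash: {worst} to {best} kB free "
          f"({100 * worst / flash_kb:.0f}% to {100 * best / flash_kb:.0f}%)")
```

On the 256 kB part, the worst case leaves 56 kB free before any other reservation is made.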

What Did We Miss?

If you have other helpful rules of thumb for determining flash storage overhead for a new product, let us know in the comments below!

Further Reading

  1. Case Study: Sonos End-of-Life Strategy
  2. Embedded Systems Security Resources
  3. Field Atlas: Security
