Problems with Globbing in Build Systems

In our build systems courses (primarily Introduction to Build Systems using Make and Creating a Cross-Platform Build System for Embedded Projects with CMake), we explicitly recommend against file “globbing” (also called “wildcarding”). In part, this is due to our own experiences with Make and CMake problems caused by globbing. We also think it is better to be explicit, and the cost of updating a file list in a build file when a new file is added is minimal. Beyond that, the Meson team agrees to the point that Meson does not allow wildcarding at all, and the CMake team explicitly recommends against it (see References).

However, we are often challenged on this point. Many people say that the problems are overblown. With CMake in particular, people often list workarounds (which tend to incur a repeated time penalty that quickly outstrips the one-time cost of updating a file list) or mention how recent changes in the language eliminate the risk (even though those changes do not work uniformly across generators and platforms).

Because this comes up so often, we decided to gather our reasoning, along with the problems people have actually encountered, in a single location.

The General Flaw

The general problem is that globbing can interfere with dependency tracking, with subtleties that change from one build tool to another. New files can be created without the build system noticing, which leaves the build in an inconsistent state.
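
As a minimal CMake sketch of this failure mode (the project name, target names, and file names are hypothetical): the glob below is evaluated only when CMake runs its configure step, so a file added to `src/` afterwards is not part of the generated build until something causes CMake to run again. With the explicit list, adding a file means editing `CMakeLists.txt`, and the generated build system already treats that file as an input.

```cmake
cmake_minimum_required(VERSION 3.12)
project(glob_example LANGUAGES C)

# Globbed approach: the file list is computed once, at configure time.
# A new src/uart.c created later is invisible to the generated build
# until CMake is re-run.
file(GLOB APP_SOURCES "${CMAKE_CURRENT_SOURCE_DIR}/src/*.c")
add_executable(app_globbed ${APP_SOURCES})

# Explicit approach: adding a file requires editing this list, and the
# generated build re-runs CMake when CMakeLists.txt changes.
add_executable(app_explicit
    src/main.c
    src/gpio.c
)
```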

Beyond that, there is an inefficiency: it costs time to perform the check on every configuration run and/or rebuild. While the cost of performing the glob step depends on the platform and is often negligible from a human perspective, you (and every other developer) incur this time cost on each build, so even a minimal cost will add up over time. That recurring cost can be replaced by the one-time cost of creating a list of files (each later update to the list is also a one-time cost).
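
In CMake, the usual way to make a glob notice new files is the `CONFIGURE_DEPENDS` flag, and that flag is where this recurring check comes from. The sketch below uses a hypothetical project and directory layout. The CMake documentation itself cautions that the flag may not work reliably on all generators, and that even when it works there is a cost to perform the check on every rebuild.

```cmake
cmake_minimum_required(VERSION 3.12)
project(glob_recheck_example LANGUAGES C)

# CONFIGURE_DEPENDS (available since CMake 3.12) asks the generated build
# system to re-evaluate this glob at build time and to re-run CMake if the
# resulting file list has changed. Every build pays for that check, whether
# or not any file was added or removed.
file(GLOB APP_SOURCES CONFIGURE_DEPENDS
    "${CMAKE_CURRENT_SOURCE_DIR}/src/*.c"
)
add_executable(app ${APP_SOURCES})
```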

Another problem arises when you only want to compile specific files in a directory, which can happen for any number of common reasons:

  • Different build configurations or hardware variations compile different .c files
  • Different toolchains or OSes require different files to be used
  • You want to exclude a specific file when compiling for a particular configuration
  • The HAL you’re using comes with example code that you want to keep but not compile
  • The library/HAL you’re using provides multiple implementations that you need to choose between
  • You want to start a development effort by getting 1-2 files working before bringing in others
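
Several of the cases above can be expressed directly with explicit lists. The following is a sketch only: the file names, the `firmware` target, and the `BOARD_REV_B` option are all hypothetical, and the compiler check assumes that `CMAKE_C_COMPILER_ID` reports `GNU` or `IAR` for the toolchains in use.

```cmake
cmake_minimum_required(VERSION 3.13)
project(firmware LANGUAGES C ASM)

# Files common to every configuration (hypothetical names).
add_executable(firmware
    src/main.c
    src/gpio.c
)

# Toolchain-specific startup code: only the matching variant is compiled,
# even though both files live in the same directory.
if(CMAKE_C_COMPILER_ID STREQUAL "GNU")
    target_sources(firmware PRIVATE startup/gcc_startup.S)
elseif(CMAKE_C_COMPILER_ID STREQUAL "IAR")
    target_sources(firmware PRIVATE startup/iar_startup.s)
endif()

# Hypothetical option selecting between two hardware variants.
option(BOARD_REV_B "Build for the (hypothetical) rev B board" OFF)
if(BOARD_REV_B)
    target_sources(firmware PRIVATE src/board_rev_b.c)
else()
    target_sources(firmware PRIVATE src/board_rev_a.c)
endif()
```

Files that should not be built yet, such as vendor example code or modules that are still being brought up, are simply left out of the lists.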

Certainly, these restrictions can often be worked around by pruning your source tree, managing directory structures in a different way, or adding build logic to filter out files. We note that each of these workarounds requires at least as much work as simply maintaining a list of files in the first place (e.g., now you are maintaining an explicit list of files to exclude). There may also be impediments to working around these issues in your system:

  • You may not be able to prune out files or change the directory structure if the dependency is provided via some method you don’t fully control, such as a submodule
  • You may receive a zip archive of source files from a vendor, and you would need to prune/reorganize the source tree for each delivery (error prone and time consuming)
    • You could certainly script the file deletion/relocation process when a new SDK zip archive is received, but this is still significantly more work than maintaining a manual listing of files you want to build
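
To illustrate the "filter out files" workaround, here is a sketch in CMake. The `vendor_sdk` directory, the file names, and the `example_` naming pattern are assumptions made purely for illustration. Note that the exclusion logic is itself an explicit list that has to be maintained, and it has to be revisited whenever a new SDK delivery adds or renames files.

```cmake
cmake_minimum_required(VERSION 3.12)
project(sdk_example LANGUAGES C)

# Glob everything in the (hypothetical) vendor SDK directory...
file(GLOB SDK_SOURCES "${CMAKE_CURRENT_SOURCE_DIR}/vendor_sdk/*.c")

# ...then maintain an explicit description of what NOT to build.
# Drop anything that looks like example code (assumed naming pattern).
list(FILTER SDK_SOURCES EXCLUDE REGEX "/example_[^/]*\\.c$")

# Drop an alternative implementation that is not wanted in this build.
list(REMOVE_ITEM SDK_SOURCES
    "${CMAKE_CURRENT_SOURCE_DIR}/vendor_sdk/uart_polling.c"
)

add_library(sdk STATIC ${SDK_SOURCES})
```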

Finally, globbing proponents focus on its convenience. But we have to ask: how often are you adding files to or removing files from your build? Is it really often enough that adjusting a file list is a major burden? Does this convenience offset the required workarounds and performance penalties?

Actual Problems

Personally observed problems:

Problems reported by our members and students:

  • Feedback and Questions on “Introduction to Build Systems Using Make” – Course Support – Embedded Artistry Forum

    If I have to say all the truth, initially I used wildcarding also for the app source files… but that directory contains an assembly file, that of course was not included by the function. The crazy thing was that everything was compiled successfully! (I do not remember if there were some warnings, but for sure no error was issued)

    While testing on the hardware, of course I would realize that something was missing because probably nothing would have worked, but I noticed the mistake only by comparing my file with your solution. This demonstrates that it is probably better following what you and the CMake maintainers suggest 🤣

  • nRF52840 Blinky from Scratch – Software – Embedded Artistry Forum

    Also, I fell again into the file globbing trap: I added too many files to the target_sources command. In fact, in the first instance, I added all the arm_startup*.S, iar_startup*.s, which of course they were complaining when compiled with armgcc since they are for different toolchains. The same happened with system_nrf5*.c files.

References