Stack buffer overflows are a category of error that can wreak havoc on our programs, resulting in sporadic crashes or strange and unexpected program behaviors. A stack buffer overflow occurs when a program writes to a memory address on the stack which is outside of its current stack frame, often triggered by a buffer overflow on a local stack variable.
These errors also create potential security vulnerabilities in our programs. An attacker can deliberately trigger a stack buffer overflow as part of a “stack smashing” attack. When conditions are right, an attacker can insert executable instructions into the program and gain unintended access to a system.
Clang, GCC, and other related compilers provide built-in support for detecting stack buffer overflows and aborting a program if one is detected. This support can be enabled in our programs by setting a few compiler flags.
In this article, we’ll discuss stack smashing protection (SSP), related compiler flags, and an implementation that is suitable for use on microcontrollers.
Table of Contents:
- About Stack Smashing Protection
- The Clang and GCC Approach
- Our Implementation
- Putting it all Together
- Further Reading
About Stack Smashing Protection
Clang, GCC, and related compilers implement stack smashing protection (SSP) using the StackGuard approach. The fundamental assumption is that most stack buffer overflows occur by writing past the end of a function’s stack frame. To detect this case, a “canary value” is placed on the stack before the function’s other local values. Before the function returns, the canary value stored on the stack is checked. If it has been modified, a stack buffer overflow has occurred, and a failure callback is invoked, which prints an error message and terminates the program. These additions are made by the compiler without your involvement. Compiler flags control the degree to which your program is checked.
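To make the canary idea concrete, here is a minimal model in plain C. It is only a sketch: the struct stands in for a stack frame, and the names demo_frame, DEMO_CANARY, and demo_copy_keeps_canary are invented for illustration. This is not the code the compiler emits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical canary value, used only by this model. */
#define DEMO_CANARY ((uintptr_t)0xa5f3cc8d)

/* Hypothetical model of a stack frame: a buffer followed by a canary. */
struct demo_frame
{
	char buffer[12];
	uintptr_t canary;
};

/*
 * Copies len bytes to the start of the modelled frame without checking
 * len against the size of buffer, then reports whether the canary
 * survived. The copy stays inside the struct, so the model itself is
 * well-defined C.
 */
bool demo_copy_keeps_canary(const char* input, size_t len)
{
	struct demo_frame frame;

	assert(len <= sizeof frame);
	memset(&frame, 0, sizeof frame);
	frame.canary = DEMO_CANARY;

	/* buffer is the first member, so this write starts at buffer[0]. */
	memcpy(&frame, input, len);

	return frame.canary == DEMO_CANARY;
}
```

A copy that fits in buffer leaves the canary alone, while a copy that runs the full length of the frame overwrites it. That mismatch is the signal the real check relies on.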
The Clang and GCC Approach
Consider a trivial function written just for demonstration purposes:
void a_function(const char* input)
{
char buffer[12];
strcpy(buffer, input);
}
If SSP is enabled for your program, GCC and Clang will automatically transform the function into something resembling:
extern uintptr_t __stack_chk_guard;
noreturn void __stack_chk_fail(void);
void a_function(const char* input)
{
uintptr_t canary = __stack_chk_guard;
char buffer[12];
strcpy(buffer, input);
if ((canary = canary ^ __stack_chk_guard) != 0 )
{
__stack_chk_fail();
}
}
The two symbols inserted by the compiler are:
- __stack_chk_guard, which contains the value of the stack protection canary word
- __stack_chk_fail, a callback function that is invoked when a stack buffer overflow is detected
  - This function shall abort the function that called it with a message that a stack buffer overflow has been detected, and then halt the program via exit, abort, or a custom panic handler.
  - This function must not return!
If a stack buffer overflow is detected, the canary value will no longer match the value of __stack_chk_guard, and __stack_chk_fail() will be called, aborting the program.
GCC and Clang provide a set of related flags that control how stack protection is used in a program:
- -fstack-protector: add stack protection to functions with local char buffers of 8 bytes or more, or calls to alloca
  - The buffer size threshold can be configured by specifying --param=ssp-buffer-size=X, where X=8 by default
- -fstack-protector-strong: increases coverage beyond -fstack-protector by inserting stack protection under the following conditions:
  - A local variable’s address is used as part of the right hand side of an assignment or function argument
  - Local variable is an array, regardless of array type or length
  - Has a struct or union containing an array, regardless of array type or length
  - Uses register local variables (although the register keyword is outdated)
- -fstack-protector-all: add stack protection to all functions
- -fno-stack-protector: disable stack protection
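As a rough illustration of how these flags differ, consider the three functions below. The function names are made up for this sketch, and the comments restate the rules listed above. Whether a canary is actually emitted depends on your compiler version and optimization settings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * No arrays and no address-taken locals: going by the rules above,
 * only -fstack-protector-all would instrument this function.
 */
int demo_add(int a, int b)
{
	return a + b;
}

/*
 * A local int array: expected to be covered by -fstack-protector-strong
 * and -fstack-protector-all, but not by plain -fstack-protector, which
 * looks for char buffers.
 */
int demo_sum_pair(int a, int b)
{
	int values[2] = {a, b};
	return values[0] + values[1];
}

/*
 * A 16-byte local char buffer: above the default ssp-buffer-size of 8,
 * so all three protection levels are expected to cover it.
 */
size_t demo_copy_len(const char* text)
{
	char buffer[16];

	assert(text != NULL);
	strncpy(buffer, text, sizeof buffer - 1);
	buffer[sizeof buffer - 1] = '\0';

	return strlen(buffer);
}
```

Compiling a file like this with each flag and inspecting the disassembly for references to __stack_chk_guard is a quick way to confirm what your toolchain actually does.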
Some compilers are built with -fstack-protector enabled by default, but the more common default is -fno-stack-protector.
The __stack_chk_guard and __stack_chk_fail symbols are normally supplied by a GCC library called libssp. On macOS, they are implemented in libSystem. If you are linking with the compiler’s standard libraries, you will normally have these symbols. However, some library variants may not supply them for your target architecture, and some compiler flags, such as -nostdlib or -nodefaultlibs, prevent the libraries from being linked at all. Under these conditions, you will need to supply your own implementation.
Next, we’ll show you how to implement stack checking support on your systems.
Our Implementation
At Embedded Artistry, we maintain a libc implementation for use on microcontroller-based embedded projects. The implementation is incomplete, as we’ve focused primarily on the parts we believe to be suitable for microcontroller projects.
In our case, when we compile programs with -nostdlib and/or -nodefaultlibs, we exclude libssp. If we try to link an application that has stack protection calls inserted, the linker will fail because these symbols are missing:
undefined reference to `__stack_chk_guard'
collect2: error: ld returned 1 exit status
To resolve these failures, we’ll implement support for these functions in our [libc](https://embeddedartistry.com/fieldmanual-terms/libc/). Because libc is consumed by other libraries, the goal is to provide default implementations that can be customized by the end-user.
Verifying the Failure Case
Before we begin implementing stack smashing support, we should take a moment to reproduce the linker error. This way, we can be sure that our changes actually do something.
By default, our libc build tells the compiler not to insert stack protection calls in the library and dependent programs. We will change the option’s default value so that stack protection is no longer disabled by default.
option('disable-stack-protection', type: 'boolean', value: false,
description: 'Tell the compiler not to insert stack protection calls.', yield: true)
Next, we’ll add -fstack-protector-all to make sure that the compiler inserts stack protection symbols:
sample_app = executable('sample_app',
'app/main.c',
dependencies: libc_dep,
# Added flag below
c_args: '-fstack-protector-all',
link_args: [
linker_script_flags,
map_file.format(meson.current_build_dir()+'/sample_app'),
],
link_depends: linker_script_deps,
native: false
)
Running the build, I see the expected linker error for a missing symbol:
[2/4] Linking target test/sample_app
FAILED: test/sample_app
arm-none-eabi-gcc -o test/sample_app 'test/9f86d08@@sample_app@exe/app_main.c.o' -Wl,--as-needed -Wl,--no-undefined -Wl,--gc-sections -mcpu=cortex-m3 -mfloat-abi=soft -mabi=aapcs -mthumb -Wl,--start-group src/libc.a -Wl,--end-group -Wl,-Map,/Users/pjohnston/src/ea/framework/libc/buildresults/test/sample_app.map -nolibc -nostartfiles '-Wl,-rpath,$ORIGIN/../src' -Wl,-rpath-link,/Users/pjohnston/src/ea/framework/libc/buildresults/src
/Users/pjohnston/toolchain/gcc-arm-none-eabi/bin/../lib/gcc/arm-none-eabi/9.2.1/../../../../arm-none-eabi/bin/ld: test/9f86d08@@sample_app@exe/app_main.c.o:(.rodata.main.cst4+0x0): undefined reference to `__stack_chk_guard'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
Now that we’ve reproduced the failure case, it’s time to provide an implementation for __stack_chk_guard and __stack_chk_fail.
Initial Approach
In src/crt, we’ll create a new file: stack_protection.c. We’ll need to include standard library headers:
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
The first piece we’ll implement is __stack_chk_guard. We’ll declare the variable and provide a default value. Our libc is designed to support both 32-bit and 64-bit targets, so we will provide 32-bit and 64-bit canary values. Hardcoding a value in this way is the simplest approach to implementing __stack_chk_guard.
#if UINTPTR_MAX == UINT32_MAX
#define STACK_CHK_GUARD_VALUE 0xa5f3cc8d
#else
#define STACK_CHK_GUARD_VALUE 0xdeadbeefa55a857
#endif
uintptr_t __stack_chk_guard = STACK_CHK_GUARD_VALUE;
Next, we need to implement __stack_chk_fail. Our default implementation will print a message and call abort().
Because we’re providing a library that other systems will consume, we want to have a way for users to override the default operation. We’ll define this function using __attribute__((weak)). Since the function is weakly linked, a user can override the default implementation by defining their own __stack_chk_fail in their program. One reason to do this would be to have __stack_chk_fail call a custom panic handler and await debugging.
We also mark this function as noreturn, which will tell the compiler to generate a warning if this function does end up returning.
__attribute__((weak,noreturn)) void __stack_chk_fail(void)
{
	printf("Stack overflow detected! Aborting program.\n");
	abort();
}
Here’s the full file:
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#pragma mark - Prototypes -
void __stack_chk_fail(void);
#pragma mark - Declarations -
#if UINTPTR_MAX == UINT32_MAX
#define STACK_CHK_GUARD 0xa5f3cc8d
#else
#define STACK_CHK_GUARD 0xdeadbeefa55a857
#endif
uintptr_t __stack_chk_guard = STACK_CHK_GUARD;
#pragma mark - Implementations -
__attribute__((weak,noreturn)) void __stack_chk_fail(void)
{
	printf("Stack smashing detected! Aborting program.\n");
	abort();
}
The final step is to add this to the list of library files, in src/meson.build:
crt_files = files(
'crt/_Exit.c',
'crt/abort.c',
'crt/at_exit.c',
'crt/at_quick_exit.c',
'crt/crt.c',
'crt/cxa_atexit.c',
'crt/exit.c',
'crt/quick_exit.c',
# added below
'crt/stack_protection.c'
)
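Because the library’s __stack_chk_fail is weakly linked, an application can replace it with a strong definition. The sketch below shows one possible shape for such an override. The panic helpers (demo_panic_record and demo_panic_last_reason) are hypothetical stand-ins for whatever panic handler your project already has.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical application state: the most recent panic reason. */
static const char* demo_panic_reason = NULL;

/* Hypothetical panic hook: records why the system is halting. */
void demo_panic_record(const char* reason)
{
	assert(reason != NULL);
	demo_panic_reason = reason;
}

/* Hypothetical accessor, e.g. for a debugger script or a crash log. */
const char* demo_panic_last_reason(void)
{
	return demo_panic_reason;
}

/*
 * Strong definition in the application: at link time this replaces the
 * weak default supplied by the library. It must never return.
 */
__attribute__((noreturn)) void __stack_chk_fail(void)
{
	demo_panic_record("stack smashing detected");

	for(;;)
	{
		/* Park here so a debugger can attach and inspect the stack. */
	}
}
```

Spinning forever is just one choice. On a real device you might prefer to trigger a breakpoint or reset after logging, depending on how you want failures to surface in the field.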
Testing
Of course, we need a test for our library to make sure our function works as expected.
With our library’s structure, C runtime (“CRT”) files are not included in the library variant that is used on a build machine. These files are only used in “standalone” implementations. On a build machine, we leverage the system’s libraries to supply missing libc symbols that are needed to run the test programs.
To aid our testing, we will declare a standalone variable that can be consumed by a test program:
stack_protection_file = files('crt/stack_protection.c')
We’ll create a new program that can be used to trigger a stack buffer overflow. To do this, we’ll create a new file: test/app/stackcheck_main.c. The main function for this program will call a function that triggers a stack buffer overflow:
#include <stdio.h>
#include <string.h>

void stack_overflows_here(void);

const char* buffer_long = "This is a long long string";

void stack_overflows_here(void)
{
	// buffer_short is deliberately too small for buffer_long
	char buffer_short[20];
	strcpy(buffer_short, buffer_long);
	printf("Overflow case run.\n");
}

int main(void)
{
	printf("Running stack overflow test program.\n");
	stack_overflows_here();
	return 0;
}
To add this program to the build, we’ll define a new build target in test/meson.build that builds this test program. We’ll pull in the stack_protection_file variable as well as the related compilation flags variable for the stack protection code:
stackprotect_test = executable('stackprotect_test',
['app/stackcheck_main.c', stack_protection_file],
c_args: ['-fstack-protector-all', libc_native_stack_protect_flags],
link_args: native_map_file.format(meson.current_build_dir() + '/stackprotect_test'),
dependencies: libc_hosted_native_dep,
native: true,
build_by_default: meson.is_subproject() == false
)
Then we’ll register the test with Meson’s test runner. Meson allows us to note that this test should fail, since a stack buffer overflow will trigger an abort() and a non-zero exit value.
test('stackprotect_test',
stackprotect_test,
should_fail: true
)
Before we do anything else, we need to verify that our program will not abort when stack protection is disabled. To test that, I’ll set the disable-stack-protection option to true and remove -fstack-protector-all from our test executable’s build rule.
Compiling and running the program, we can see that the program runs without aborting:
./buildresults/test/stackprotect_test
Running stack overflow test program.
Overflow case run.
And Meson reports an unexpected pass (which triggers a “test failure”):
make test
ninja: Entering directory `buildresults'
[0/1] Running external command clear-test-results
ninja: Entering directory `buildresults'
[2/3] Running all tests.
1/3 printf_tests OK 0.39s
2/3 libc_tests OK 0.05s
3/3 stackprotect_test UNEXPECTEDPASS 0.02s
Now we’ll restore the flags so that SSP will be used.
Running the program, we see our output is included, the stack buffer overflow is detected, and we abort:
./buildresults/test/stackprotect_test
Running stack overflow test program.
Overflow case run.
Stack overflow detected! Aborting program.
Abort trap: 6
Meson test also catches the expected fail:
make test
ninja: Entering directory `buildresults'
[0/1] Running all tests.
1/3 printf_tests OK 0.34s
2/3 libc_tests OK 0.05s
3/3 stackprotect_test EXPECTEDFAIL 0.02s
Ok: 2
Expected Fail: 1
Fail: 0
Unexpected Pass: 0
Skipped: 0
Timeout: 0
Full log written to /Users/pjohnston/src/ea/framework/libc/buildresults/meson-logs/testlog.txt
We can confirm that our file is being included by the message itself. We can also remove our stack_protection.c file from the build and fall back to the OS X libSystem version. The program still aborts, but our message isn’t shown.
./buildresults/test/stackprotect_test
Running stack overflow test program.
Overflow case run.
Abort trap: 6
User-configurable Canary Value
Now, our implementation is relatively simple, but it is not as good as it could be. For one, we use a common fixed value for all programs that use our libc.
One improvement we can make is to allow users to define their own canary value:
#ifndef STACK_CHK_GUARD_VALUE
#if UINTPTR_MAX == UINT32_MAX
#define STACK_CHK_GUARD_VALUE 0xa5f3cc8d
#else
#define STACK_CHK_GUARD_VALUE 0xdeadbeefa55a857
#endif
#endif
To override the built-ins, the compiler needs to define STACK_CHK_GUARD_VALUE. With our build system, we can connect this to a new build configuration option:
option('stack-canary-value', type: 'string', value: '',
description: 'Override the default canary value. Supply a hexadecimal value, such as 0xdeadbeef. The canary length should match the word size of your processor.',
yield: true)
In our top-level meson.build file, we’ll read this option:
stack_canary = get_option('stack-canary-value')
In src/meson.build, we’ll check the value of the option. We can tell that a custom value is supplied because it will be a non-empty string.
libc_canary_compile_flag = []
if stack_canary != ''
libc_native_stack_protect_flags += '-DSTACK_CHK_GUARD_VALUE=' + stack_canary
libc_host_stack_protect_flags += '-DSTACK_CHK_GUARD_VALUE=' + stack_canary
endif
These flags are passed to the library targets, but not dependencies.
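Since the override arrives as a raw -D flag, it is easy to pass a value that is too wide for the target. One way to catch that at compile time is a static assertion, sketched below under the assumption that the toolchain supports C11. The demo_guard_default helper is hypothetical and exists only so the value can be inspected.

```c
#include <assert.h>
#include <stdint.h>

/* Same default selection as in the article. */
#ifndef STACK_CHK_GUARD_VALUE
#if UINTPTR_MAX == UINT32_MAX
#define STACK_CHK_GUARD_VALUE 0xa5f3cc8d
#else
#define STACK_CHK_GUARD_VALUE 0xdeadbeefa55a857
#endif
#endif

/* Fail the build if the configured canary cannot fit in a pointer-sized word. */
_Static_assert((STACK_CHK_GUARD_VALUE) <= UINTPTR_MAX,
			   "STACK_CHK_GUARD_VALUE does not fit in uintptr_t");

/* Hypothetical helper that exposes the configured value for inspection. */
uintptr_t demo_guard_default(void)
{
	return (uintptr_t)(STACK_CHK_GUARD_VALUE);
}
```

With this in place, building a 32-bit target with a 64-bit canary value fails at compile time with a readable message, rather than having the value silently truncated.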
Generating the Guard Word at Runtime
Allowing a user to specify a custom canary value is a step in the right direction, but we can still do better. From a security perspective, we are better served by randomizing the guard word rather than providing a fixed value.
Embedded platforms vary widely, so it is best to provide a user configurable hook that can be used to generate a value for __stack_chk_guard. We’ll provide library users with an overridable function that can be used to set __stack_chk_guard during the program startup process.
We can declare a function that is marked with __attribute__((constructor)), ensuring that the code is run during the library startup process (before main() is called):
static void __attribute__((constructor)) __construct_stk_chk_guard()
{
if(__stack_chk_guard == 0)
{
__stack_chk_guard = __stack_chk_guard_init();
}
}
The __stack_chk_guard_init function is a new weakly linked function that can be overridden by a user to set __stack_chk_guard during boot. Our default implementation simply returns the pre-configured guard value.
__attribute__((weak)) uintptr_t __stack_chk_guard_init(void)
{
return STACK_CHK_GUARD_VALUE;
}
Now, the if(__stack_chk_guard == 0) conditional logic doesn’t work with our current setup, because __stack_chk_guard is already initialized to a non-zero value. The simplest approach is to change the initialization of __stack_chk_guard:
uintptr_t __stack_chk_guard = 0;
However, this is not a suitable default behavior in all situations: it must be carefully used. Running our test program, for instance, I notice that the program now aborts prior to main, since we do not see the normal print statements:
./buildresults/test/stackprotect_test
Stack overflow detected! Aborting program.
Abort trap: 6
Using lldb, I can see that the offending function is __construct_stk_chk_guard:
bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
* frame #0: 0x00007fff5d8ab2c2 libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x00007fff5d966bf1 libsystem_pthread.dylib`pthread_kill + 284
frame #2: 0x00007fff5d8156a6 libsystem_c.dylib`abort + 127
frame #3: 0x0000000100001147 stackprotect_test`__stack_chk_fail + 23
frame #4: 0x00000001000010e7 stackprotect_test`__construct_stk_chk_guard + 71
The reason for this particular failure is the use of -fstack-protector-all in the test program combined with the fact that we are changing the value of __stack_chk_guard in that function. Because we’ve changed the value, the check at the end of this function fails: the canary stored in the function preamble no longer matches __stack_chk_guard.
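The failure can be modelled in a few lines of ordinary C. This is a simplified sketch, not the compiler’s generated code: demo_guard stands in for __stack_chk_guard, and demo_init_guard_check_passes is a hypothetical function that mimics an instrumented constructor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stands in for __stack_chk_guard, starting out unset. */
static uintptr_t demo_guard = 0;

/*
 * Mimics an instrumented constructor: the "prologue" saves the guard,
 * the body may change it, and the "epilogue" compares the two.
 * Returns true if the modelled canary check would pass.
 */
bool demo_init_guard_check_passes(uintptr_t new_value)
{
	assert(new_value != 0);

	/* Prologue: copy the current guard into the frame. */
	uintptr_t saved_canary = demo_guard;

	/* Body: the same logic as the article's constructor. */
	if(demo_guard == 0)
	{
		demo_guard = new_value;
	}

	/* Epilogue: the check the compiler inserts before returning. */
	return saved_canary == demo_guard;
}
```

The first call changes the guard after the canary was saved, so the modelled check fails. A second call leaves the guard alone and passes, which is why only the constructor is affected.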
If we change the test program’s flag from -fstack-protector-all to -fstack-protector-strong:
stackprotect_test = executable('stackprotect_test',
['app/stackcheck_main.c', stack_protection_file],
# Modified line below
c_args: ['-fstack-protector-strong', libc_native_stack_protect_flags],
link_args: native_map_file.format(meson.current_build_dir() + '/stackprotect_test'),
dependencies: libc_hosted_native_dep,
native: true,
build_by_default: meson.is_subproject() == false
)
We’ll see the behavior we expect:
./buildresults/test/stackprotect_test
Overflow case run.
Stack overflow detected! Aborting program.
Abort trap: 6
Luckily, GCC and Clang provide us with a way to disable stack protection for a single function: __attribute__((no_stack_protector)). We’ll update our constructor to include this attribute:
/*
* Stack protection *must* be disabled for this function. In the case of
* -fstack-protector-all, this function will fail the check because it
* changes the value of __stack_chk_guard.
*/
static void __attribute__((constructor,no_stack_protector)) __construct_stk_chk_guard()
{
if(__stack_chk_guard == 0)
{
__stack_chk_guard = __stack_chk_guard_init();
}
}
We can restore the -fstack-protector-all flag, and now the test program behaves as we expect:
./buildresults/test/stackprotect_test
Overflow case run.
Stack overflow detected! Aborting program.
Abort trap: 6
Now, not everyone will want to take advantage of this. Perhaps it will cause problems in some setups, or perhaps constructors aren’t generated properly or linked properly. We can provide an option for disabling this behavior, if necessary:
option('disable-stk-guard-runtime-config', type: 'boolean', value: false,
description: 'Disables runtime configuration option for __stack_chk_guard. The program will use a fixed value.', yield: true)
We’ll read the value of this option:
disable_stk_guard_runtime_config = get_option('disable-stk-guard-runtime-config')
In src/meson.build, we’ll read the variable’s value and set the appropriate flags:
if disable_stk_guard_runtime_config
libc_native_stack_protect_flags += '-DDISABLE_STACK_CHK_GUARD_RUNTIME_CONFIG'
libc_host_stack_protect_flags += '-DDISABLE_STACK_CHK_GUARD_RUNTIME_CONFIG'
endif
Then, in stack_protection.c, we’ll use this define for conditional logic:
#ifdef DISABLE_STACK_CHK_GUARD_RUNTIME_CONFIG
uintptr_t __stack_chk_guard = STACK_CHK_GUARD_VALUE;
#else
uintptr_t __stack_chk_guard = 0;

// Declared before the constructor so the call below has a visible prototype
uintptr_t __stack_chk_guard_init(void);

static void __attribute__((constructor,no_stack_protector)) __construct_stk_chk_guard(void)
{
	if(__stack_chk_guard == 0)
	{
		__stack_chk_guard = __stack_chk_guard_init();
	}
}

__attribute__((weak)) uintptr_t __stack_chk_guard_init(void)
{
	return STACK_CHK_GUARD_VALUE;
}
#endif // ! DISABLE_STACK_CHK_GUARD_RUNTIME_CONFIG
To test this functionality out, we’ll expand our test program. In main(), we will print the value of the canary:
int main(void)
{
	extern uintptr_t __stack_chk_guard;
	// %p expects a void pointer, so cast the canary before printing
	printf("Running stack overflow test program. Canary value: 0x%p\n", (void*)__stack_chk_guard);
	stack_overflows_here();
	return 0;
}
We’ll also provide a strong definition for __stack_chk_guard_init(), supplying a new value:
uintptr_t __stack_chk_guard_init(void)
{
	// 64-bit value: this test program runs on a 64-bit build machine
	return 0xbeeffeeda5a5a5a5;
}
We can see that our custom value overrides the default:
./buildresults/test/stackprotect_test
Running stack overflow test program. Canary value: 0xBEEFFEEDA5A5A5A5
Overflow case run.
Stack overflow detected! Aborting program.
Abort trap: 6
Then we can set disable-stk-guard-runtime-config to true, recompile the program, and see that the default value is used instead:
./buildresults/test/stackprotect_test
Running stack overflow test program. Canary value: 0x0DEADBEEFA55A857
Overflow case run.
Stack overflow detected! Aborting program.
Abort trap: 6
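The override used in this test returns a fixed value, but the same hook is where a randomized canary would come from. Here is one way that might look. Everything prefixed with demo_ is hypothetical: in a real system, demo_entropy_word would be filled from a hardware RNG or another platform-specific entropy source before constructors run.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fallback, used if no entropy is available. */
#define DEMO_FALLBACK_GUARD ((uintptr_t)0xa5f3cc8d)

/* Hypothetical storage that platform startup code fills with entropy. */
static uintptr_t demo_entropy_word = 0;

/* Hypothetical setter, called by platform code early in boot. */
void demo_set_entropy(uintptr_t value)
{
	demo_entropy_word = value;
}

/*
 * Turns an entropy word into a guard value. Zero is avoided because the
 * constructor treats zero as "not yet initialized".
 */
uintptr_t demo_make_guard(uintptr_t entropy)
{
	uintptr_t guard = (entropy != 0) ? entropy : DEMO_FALLBACK_GUARD;

	assert(guard != 0);
	return guard;
}

/* Strong definition that replaces the library's weak default. */
uintptr_t __stack_chk_guard_init(void)
{
	return demo_make_guard(demo_entropy_word);
}
```

Keeping the zero check in a separate helper makes the policy easy to test on a build machine, even when the entropy source itself only exists on the target.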
Putting it all Together
Here is the full implementation in Embedded Artistry’s libc:

Great article but it confuses stack overflows and stack smashing. They are completely unrelated. Stack overflow occurs when a thread uses more stack space than was allocated to it. The stack frames typically remain intact in this case but are overwriting some unrelated information that will cause down-stream errors. Stack smashing (typically) occurs when a program error causes OOB writes to a stack-allocated array thus trashing the function frame on the stack. SSP often, but not always, helps with stack smashing. It rarely helps with overflows.
Thank you for pointing that out, I will clarify the terminology used in the article.
Thanks for the information. I have observed something strange with arm-none-eabi-gcc 9.3.1 20200408 release building code for cortex-m4. When I tested SSP, looked at objdump output it appeared the generated code was storing the address of __stack_chk_guard on the function’s stack and comparing the address in the function epilog, not the value. I also used a debugger stepped through the assembly code to verify and it is saving and comparing the address not the value of __stack_chk_guard. I compiled the code with arm-none-eabi 8.2.1 and GCC emits code that stores and compares the value of __stack_chk_guard. Seems to be a bug.
Wow, nice find. I don’t see any existing bugs for that in the bug tracker: https://bugs.launchpad.net/gcc-arm-embedded . That’s definitely worth reporting!
Watch out, as GCC versions 9.2.1 to 10.2 are hit by a bug. Safe seem to be up to 8.3.1, and after 10.2.1 (both ranges inclusive).
More details below:
– https://blog.inhq.net/posts/faulty-stack-canary-arm-systems/