Stacking and Overflowing

Ever since I got my first programming book at age 10, I loved programming. But it wasn’t until last year that I got interested in low-level programming – this fascinating world of bits and bytes, assembly and other untamed rawness. If I had to sum up what I’ve learned so far, it would be something like this:

Great minds have spent decades inventing clever abstractions that high-level programmers like me almost never see, allowing us to write a Hello World in one line.

Trying to understand how these abstractions work is an exciting journey and maybe you want to accompany me a little while I try to get a deeper understanding of what a stack overflow is.

Hello Fibonacci

Let’s start by looking at a naive fibonacci implementation in C:

int fib(int n) {  
    if (n == 0) {
        return 0;
    } else if (n == 1) {
        return 1;
    } else {
        return fib(n - 1) + fib(n - 2);
    }
}

Did you spot the bug? While the above code works fine for any $n \ge 0$, it will fail miserably if you pass a negative number:

$ ./fib -1
Segmentation fault (core dumped)  

Ouch! Here’s the backtrace that gdb shows us:

(gdb) bt
#0  0x000000000040053a in fib (n=<error reading variable:
    Cannot access memory at address 0x7fffa16c0fec>) at fib.c:3
#1  0x0000000000400569 in fib (n=-174611) at fib.c:9
#2  0x0000000000400569 in fib (n=-174610) at fib.c:9
#3  0x0000000000400569 in fib (n=-174609) at fib.c:9
...

The problem is obvious: Because $n$ is less than $0$, we’re locked in the else branch, recursing until the stack overflows. A stupid, albeit easily fixable mistake. But let’s pause for a moment. What does it mean that the stack has overflown? And, what’s the stack anyway?

The stack: where variables live

There is this high-level concept of a variable. You take a value and assign a name to it so you can pass it around or change it. But computers don’t know variables. What they know is machine code and memory.1

You may have learned that a variable is “a storage location paired with an associated symbolic name”:2

That’s a beautiful abstraction, but let’s have a closer look at it. If every variable is a place in the memory, how does a program determine where exactly to put it? In C, the lingua franca of low-level programming, variables are placed in one of two locations: the stack and the heap. For now we’ll focus on the stack.

Every time you assign a variable in C, it will be placed on the stack, or more precisely in the function’s stack frame. This piece of memory basically stores information about the function’s variables, arguments and its caller (so it can continue there when finished). For our broken fibonacci implementation the stack looked like this:

Stack grew until it overflowed.3

Stack overflow

Why does a program crash when the stack overflows? There are two possibilities:

  1. On older OSs or embedded systems the program’s stack grows until it gets clobbered by the program’s heap.4 That’s because usually the stack grows downwards whereas the heap grows upwards. When these two meet, the program will go crazy and eventually crash.

  2. On modern OSs the stack size is fixed. Whenever a program’s stack grows beyond that size, the OS immediately kills it.

So, have we finally demystified the stack overflow? Almost! What’s still left is to find out how the OS knows when a stack overflow has occured. There are multiple approaches to this.

The first idea is to let the OS periodically check the current stack usage. For example it could simply check the current value of the stack pointer (the memory location of the topmost stack frame).5 The problem with this approach is that it only detects that the stack has overflown but doesn’t actively prevent the process from entering a corrupted state. To solve that we have to dig even deeper.

Virtual reality

In our models above, we’ve always assumed a flat, contiguous memory space. Any process can use as much memory as it can address. As soon as you want to implement an OS with multitasking, this will raise trouble. How do you share the physical memory with multiple processes? How do you enforce that a process modifies only its own memory?

One solution is called Virtual Memory. The process thinks it has a flat memory space, but in reality this virtual memory space is split into small contiguous segments called pages. Every page is mapped to a physical memory address (or is left unmapped).

Whenever a process tries to access some memory location, the CPU looks up the physical address of the associated page in the page table. If the page is unmapped, the CPU will generate a page fault so the OS can either load the page from disk if it was swapped, or kill the process with a segmentation fault.

Despite its complexity regarding both hardware and software support this design turns out to be very clever. First of all: while every process can use as much memory as it can address (e.g. $2^{32}$ bytes on a 32 bit computer), it won’t eat up memory it doesn’t actually use. To use more memory pages, it has to ask the OS.6 The OS on the other hand can swap out seldomly used pages to the disk to increase available RAM.

Another neat advantage is that processes can share memory. That may be the OS kernel or shared libraries. They all can be part of the virtual memory space of different processes without taking up extra space. Also, that technique can be used for inter-process communication.

And finally, what’s important for our case: the OS can use Virtual Memory to enforce memory protection. For example, it can mark heap and stack as not executable to prevent some types of exploits. How does that help with preventing stack overflows? When creating the stack, the OS will allocate an empty page right below the end of the stack. It will set up memory protection on that page so any access results in a segmentation fault which will cause the OS to kill the process.

Wrapping it up

Whew! That was a lot of stuff to just understand what a stack overflow is. Let’s sum it up:

In short, every function call adds a new stack frame to the stack. As the stack frame grows too big, it will either get clobbered by the heap, or hit a guard page triggering a segmentation fault.

Further reading

If you’re interested in more details, here are some great resources you should read:

  1. And I/O, but this doesn’t bother us here.

  2. From Wikipedia: Variable (computer science)

  3. Actually, this image is somewhat wrong. On x86 CPUs the stack grows downwards, see this Stack Overflow answer for details.

  4. Also called dynamic memory. It contains variables that live longer than the current function call.

  5. FreeRTOS does this.

  6. For example VirtualAlloc on Windows or mmap with MAP_ANONYMOUS on Linux.


Loading comments...