[prog] guide to memory in C, the stack, heap?

Thu Feb 16 13:08:15 EST 2006

> and it's a classical example, but you can think of a stack of
> dishes..normally, you begin building the stack from the bottom to the top,
> and then when you need a dish, you take the topmost one..and when you put
> another dish, you just put it on the top of your current stack..

This is exactly correct. A stack is a FILO data structure - the metaphor is
usually explained as a reference to the stacks of dishes at diners where they
sink into the counter. You're "stacking" things up - and you can only
remove/add things from the top. The name can be blamed on Edsger Dijkstra who
was quite proud of it.

The stack is a special part of memory that is used for local variables within
your functions. Every time you invoke a function, it gets a new "frame" on
the stack, with a new place to put each of the local variables. This has
implications for memory layout (the variables are always at the same 'place'
within the frame, but since the frame is at a different place...) but the
important thing is that when you exit the function, that 'frame' is removed
from the stack. Its space is thus immediately used for the next function
call, and all those variables are immediately inaccessible to you (after all,
you wouldn't want to access your variable after some other function had
stomped all over it).

In C/C++, any construct which introduces a new scope gets the equivalent of a
stack frame. In most cases this means that any time you use an open-brace
you are adding stack space; a close-brace is the end of a scope, so the stack
frame is popped off and the variables declared within that block go 'out of
scope'.

The same concept is used when programming in assembly right on the metal, and
there you actually use the stack memory byte by byte, saving off the values
of CPU registers as you add stack frames via subroutine calls. x86 CPUs in
particular are very limited for registers - so if you look at Intel assembly
code it's PUSH POP PUSH POP all day long.

The heap, on the other hand, is a huge block of memory that's managed by the
'heap allocator', which is the routine behind malloc() and free() that keeps
the data structure consistent. You ask it for a block of memory of a certain
size, it often rounds up for efficiency reasons, and it hands it back. It's
responsible for allocating memory in a consistent manner to all the processes
running, and trying to do it in such a way that a minimum of memory is
wasted. When you get a block of memory off the heap, you have to free it up
later because the heap won't know when you're done with it (unlike the stack,
where it's freed up when the current block ends). On the other hand you can
pass around a pointer to it, knowing it's all yours and won't disappear on
you when the block ends.

Typically you use heap allocation for data structures that are longer-term
rather than temporary - those that need to stick around outside the current
function call. But you have to remember to clean them up or you'll leak
memory. Stack variables are convenient because they're automatically cleaned
up, but you're limited in when you can use them.

This all gets a good deal more complicated in C++, where objects can be
allocated on the stack or the heap just like any other variable, but
internally they have something that amounts to their own stack (member
variables that are cleaned up when the object is destroyed).

The names for the stack and the heap derive from the names of the data
structures used to implement them - a stack is obviously implemented by a
stack, and the data structure used to track heap allocations is,
unsurprisingly, a heap.

http://en.wikipedia.org/wiki/Stack_%28data_structure%29 and
http://en.wikipedia.org/wiki/Heap are good reference points for more
information.

- John