[prog] C++ object creation

Kathryn Hogg kjh at flyballdogs.com
Wed May 28 14:07:51 EST 2003


> OK, so that's the stack. The heap is only a slightly different beast. It's
> called that because it's best visualised as an unstructured heap of memory
> in the middle of the system, from which all programs grab bits as and when
> they need it. You make a call to malloc() (or new, which is a C++ wrapper
> around malloc()), and the memory management system marks a small stretch
> of that heap as yours. You then muck around with it, doing whatever you
> like there, confident that no other program will use that piece of memory.
> When you're done with it, you call free() (or its wrapper, delete). This
> signals that you're done with that memory, and that other programs are now
> free to grab it (with malloc()) and use it for themselves. Of course, it
> is possible - indeed, very easy to accidentally read from or write to a
> section of memory you haven't been allocated.

On most architectures, each process has its own heap space within its
private address space.  We get enough problems caused by poor memory
management within a process, let alone if the heap were shared amongst
processes.




> An interesting artefact of the way that the heap works is that you can
> only ever have pointers to heap memory.

You can have pointers to objects on the stack; you just need to make sure
that you don't use the pointers beyond the lifetime of the objects they
point to:

void f()
{
   Some_Gigantic_Object sgo;

   load(&sgo);   // Since sgo is huge, we pass a pointer (in C++ we could
                 // use a reference)
}

We can also have durable pointers to objects created in the global data
section, shared memory, or memory-mapped files.
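
As a rough sketch of both cases (BigObject, g_cache and the function names
are made up for illustration, and the 4096-element size is arbitrary), a
pointer to a stack object is fine while the object is alive, and a pointer
to a global stays valid after the function returns:

```cpp
#include <cstddef>

// Hypothetical stand-in for a large object; name and size are illustrative.
const std::size_t kSize = 4096;

struct BigObject {
    int data[kSize];
};

// A global has static storage duration, so it exists for the whole run.
BigObject g_cache;

// Fills the object through a pointer; the callee never owns the memory.
void load(BigObject* obj) {
    for (std::size_t i = 0; i < kSize; ++i) {
        obj->data[i] = static_cast<int>(i);
    }
}

// Safe: the pointer to `local` is only used while `local` is still alive.
int last_element_of_stack_object() {
    BigObject local;
    load(&local);
    return local.data[kSize - 1];   // copied out by value before returning
}

// Safe: the returned pointer refers to the global, which outlives the call.
BigObject* durable_pointer() {
    load(&g_cache);
    return &g_cache;
}
```

The first function only hands the pointer downwards, to a callee that
finishes before the object dies. The second can hand it upwards because the
object it points to never goes away.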

> Any variable declared in the normal way is put on the stack (even global
> ones, right down there at the bottom).

Global variables (and C++ class static members) are typically put in the
data or bss section, not on a stack. A process can have multiple stacks
(usually if it's multi-threaded) and all threads need to be able to access
global data.  Additionally, stacks can be quite small; we allocated 64K
stacks per thread by default in our software.

The BSS section initializes pages to zero, which conveniently handles the
C/C++ guarantee that global variables without an explicit initializer are
initialized to 0.
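
A small sketch of that guarantee, with invented names (g_counter, g_table,
Config, Widget). Whether a given toolchain actually places these in BSS is
an implementation detail; the zero values are what the language promises:

```cpp
// None of these objects has an initializer, yet all of them start at zero
// because they have static storage duration.
int g_counter;                 // starts as 0
double g_table[8];             // every element starts as 0.0

struct Config {
    int verbose;
    const char* name;
};
Config g_config;               // verbose is 0, name is a null pointer

struct Widget {
    static int instances;      // declaration of a class static member
};
int Widget::instances;         // its definition, also zero-initialized

// Checks every object above; returns true if all of them are still zero.
bool globals_start_zeroed() {
    if (g_counter != 0 || Widget::instances != 0) return false;
    if (g_config.verbose != 0 || g_config.name != 0) return false;
    for (int i = 0; i < 8; ++i) {
        if (g_table[i] != 0.0) return false;
    }
    return true;
}
```

A local variable declared the same way inside a function gets no such
treatment and holds an indeterminate value until it is assigned.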



> So - back to ed's original question: Why do you choose one over the other?
> The answer is that a variable on the stack is popped off when the function
> in which it is defined returns.

And that is the source of many coding problems.  We've all done it.  If
you haven't, then you are either a liar or extremely lucky that some latent
bugs haven't killed you yet:

int *f()
{
   int i = 0;
   return &i;
}

This compiles as perfectly valid C/C++. But as Merydydd pointed out, you are
returning a pointer to a variable whose lifetime ended when the function
returned.  Dereferencing this pointer later on is undefined behaviour. If
you're lucky, your compiler will give you a warning.  Turn on as many
warning options as you can and pay attention to them!
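
For comparison, here is a sketch of three ways to get the value out without
a dangling pointer. The function names are invented, and std::unique_ptr
assumes a C++11 or later compiler, which postdates this thread; a plain
new/delete pair would do the same job with more room for error:

```cpp
#include <memory>

// Option 1: return by value. The caller gets its own copy.
int make_value() {
    int i = 42;
    return i;
}

// Option 2: allocate on the heap and hand ownership to the caller.
// The int outlives this function and is freed when the unique_ptr dies.
std::unique_ptr<int> make_heap_value() {
    return std::unique_ptr<int>(new int(42));
}

// Option 3: let the caller supply the storage and just fill it in.
void fill_value(int* out) {
    *out = 42;
}
```

For something as small as an int, returning by value is the obvious choice;
the other two matter more when the object is large or must outlive the
caller's own scope.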

> One last observation - not all languages are as flexible as C. Java, for
> example, stores *everything* on the heap, and just relies on its garbage
> collector to clean things up - even local variables within a function.
> This is, I am given to understand, one of the big reasons for its
> slowness.

Plenty of studies have shown that garbage collection is not necessarily a
road to slowness.  It frees programmers from having to write code to
(mis)manage memory, so the program is free to spend its CPU cycles doing
what it's supposed to.  Just as with virtual functions, a modern garbage
collector is going to do a better job than something simulated by the user,
like reference counting.

-- 
Kathryn

