[Courses] [C] Current topic: variables in C.

Mon Jun 10 23:39:30 EST 2002

On Sun, Jun 09, 2002 at 02:02:56AM -0400, Charlotte Oliver wrote:
> On Sat, Jun 08, 2002 at 11:36:14PM -0600, Val Henson wrote:
> > Oo, here's a good question:
> > 
> > "Why does C have this nasty int, char, long stuff?  Why didn't they
> > just say 8_bit_type and 16_bit_type instead of all these complicated
> > rules about int being at least 2 bytes but less than or equal to long,
> > etc., etc., etc.?"
> > 
> > If anyone is interested, pipe up.
> 
> Sure!  Why didn't they just make them more simplistic, or does that
> have to do with features being added over time?

See, this is how I felt for a long time.  I thought it would be much
simpler to say, chars are always 8 bits, integers are always 32 bits,
etc.

But we want to write platform-independent code, right?  While
architectures are no longer quite so gloriously varied as they were in
the days of 7-bit bytes and 21-bit words, they do still vary quite a
lot.  As complicated and bizarre as the C type rules are, they make
writing platform-independent code much easier.  Instead of having to
explicitly say, "If I'm on this architecture, use this size variable,"
you are just saying, "Give me a variable big enough to hold an
integer," or an address, or a character.
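
To make the "at least this big" idea concrete, here is a minimal
sketch (assuming a C11 compiler for static_assert).  It checks a few
of the guarantees the C standard gives on every conforming platform.
The helper names int_bits() and print_type_sizes() are made up for
this example; they are not from any library.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Guarantees that hold on every conforming C implementation. */
static_assert(CHAR_BIT >= 8, "a char has at least 8 bits");
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(SHRT_MAX <= INT_MAX, "int covers the range of short");
static_assert(INT_MAX <= LONG_MAX, "long covers the range of int");
static_assert(INT_MAX >= 32767, "int holds at least 16 bits");
static_assert(LONG_MAX >= 2147483647L, "long holds at least 32 bits");

/* Hypothetical helper: storage width of int on this platform, in bits. */
int int_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}

/* Hypothetical helper: print what this particular platform chose. */
void print_type_sizes(void)
{
    printf("bits per char: %d\n", CHAR_BIT);
    printf("sizeof(short): %zu\n", sizeof(short));
    printf("sizeof(int):   %zu\n", sizeof(int));
    printf("sizeof(long):  %zu\n", sizeof(long));
    printf("sizeof(void*): %zu\n", sizeof(void *));
}
```

Calling print_type_sizes() from main() on two different machines may
well print different numbers, but the static_assert lines should
compile unchanged on both.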

In the Linux kernel, this is mostly a good thing but occasionally very
inconvenient.  It's good in that most of the source code is entirely
shared, without needing weird macros for variable declarations.  For
example, if you need to store a kernel address, you know you just have
to declare the variable as:

unsigned long tmp_address;

And it will work both on machines with 32-bit addresses and 64-bit
addresses (x86 and Alpha, for example).  But kernel code needs to know
exactly how big a variable is more often than user-level code does.
If you see declarations like this:

u8 thing_1;
u8 pad;
u16 thing_2;

Then you're probably seeing a part of the code that actually needs to
know the exact size of the variables. ("u8" is an
architecture-dependent typedef - someone figures out which type is 8
bits long on that architecture and then writes a header file that
defines "u8" to be "unsigned char" or whatever the proper type
is.)  The place where I usually see this is where some PCI card or
other device needs to write into the kernel's memory.  The way these
cards work is you say, "Hey, card, here's an address you can write
to," and the card says, "Okay, I wrote the data you wanted in my
special hard-wired format."  If that special hard-wired format is 16
bytes long, we'd better be sure we're allocating 16 bytes and then
reading back those 16 bytes in the right way.
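
Here is a rough user-space sketch of that idea, reusing the three
declarations above.  Everything in it is an assumption for
illustration: the struct name card_record, the 4-byte layout, and the
read_record() helper are invented, and the u8/u16 typedefs stand in
for the kernel's own definitions by borrowing the fixed-width types
from <stdint.h>.  It assumes a C11 compiler for static_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* User-space stand-ins for the kernel's u8/u16 (assumption). */
typedef uint8_t  u8;
typedef uint16_t u16;

/* Hypothetical 4-byte record that a device writes into memory. */
struct card_record {
    u8  thing_1;
    u8  pad;      /* explicit padding so thing_2 starts at byte 2 */
    u16 thing_2;
};

/* Fail the build if the compiler lays the struct out differently. */
static_assert(sizeof(struct card_record) == 4,
              "record must be exactly 4 bytes");
static_assert(offsetof(struct card_record, thing_2) == 2,
              "thing_2 must start at byte 2");

/* Copy the raw bytes the device wrote into a struct we can read. */
struct card_record read_record(const unsigned char *raw)
{
    struct card_record rec;
    memcpy(&rec, raw, sizeof rec);
    return rec;
}
```

Note that thing_2 comes back in the host's byte order; a real driver
would also have to deal with the card's endianness, which this sketch
leaves out.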

I have seen people who hate the variable size rules so much that they
always use "u8" and friends, but that's a very bad habit to get into.

This could be a lot longer, but I think I'll quit here...

Please feel free to ask questions about the various posts on C types.
We tend to get a little abstruse. :)

-VAL