[prog] normalizing tables

Rachel McConnell rachel at xtreme.com
Tue Oct 18 05:30:27 EST 2005


Kenneth Gonsalves wrote:
 > On Monday 17 Oct 2005 9:21 am, Wolf Rising wrote:
 >
 >>Open to any advice on how not to skip over 2nd normal form :-)
 >
 >
 > why do you want to not skip over the 2nd normal form? My understanding
 > is that at the start you try for the highest possible normal form, and
 > then make compromises for the sake of speed, ease of use etc. Maybe
 > textbooks do it your way, but in the real world it is done the other
 > way.

I expect Wolf is interested in the knowledge for its own sake, for her 
deeper understanding.

My take on it, which is almost entirely untutored, is that 2nd normal 
form is kind of a hybrid of 1st and 3rd, such that you have SOME foreign 
key references, but SOME tables with duplicate data.  I guess it's 
harder to do with a very small database design.  So Wolf, a possible 2nd 
normal form for your example, I think, might be this:

Donations(*donorNumber, *fundNumber, checkNumber, dateReceived, amount)

Donors (*donorNumber, donorName, donorPhone, classificationName,
     exemptStatus)

Fund(*fundNumber, fundName, fundDirector)

Anyone want to jump in and correct me?


I would also say, regarding your 3NF example, are you sure that 
Donations and Checks are different?  If your hypothetical organization 
accepts cash and/or credit card donations as well, then this makes 
sense, but in that case you need a PaymentType table, and some way of 
storing credit card & cash transaction data, and referencing it from the 
Donations table.  Otherwise, I think you might find that each line in 
Donations matches exactly one line in Checks, and vice versa.  In other 
words, they have a one-to-one relationship, and that means they should 
be in the same table.

FWIW, I've been designing databases for 3-4 years now without ever 
having properly learnt the normalization forms, so I can understand 
Kenneth's point too.  A lot of the design decisions are tradoffs 
depending on how you will be using the data, such as will table X be 
used for lots of lookups or will you have lots of inserts?  The tradeoff 
to consider is between storage space and performance.  It's impossible 
to design a theoretically perfect database without a usage context.

Rachel


More information about the Programming mailing list