[Techtalk] Efficiently modeling sets and subsets in lattice-like structure

mizioumt at netscape.net mizioumt at netscape.net
Wed May 30 21:40:37 UTC 2007


Hi Kelly,

I'm pretty sure there's no universal efficient solution to this 
problem, this must be the well known answer
you are looking for. The pure mathematics of 'very hard' could be very 
hard to express, though.
So before you choose a solution you need to find out more about the 
actual graphs you're working
On the other hand there are plenty of possible approaches to the graph 
storage problem.
They do seem to concentrate on trees (XML is one of the better ones). 
So you could approach the graph representation
problem as a tree of trees, or as a set of separately stored trees 
joined logically over common nodes.
I doubt you could find many solutions in that direction.
Then since a graph is a binary relation over a set of nodes you could 
represent it as a 2 field relational table in an obvious way.
SQl isn't very good with trees, though.
In any case on model 'a Bunch Of Regular Guys' could code themselves (I 
did it once!) is a representation
of the binary relation as a bit matrix. You'll need N*N bits where N is 
the number of nodes so it's not going to scale well
but if you do happen to have fewer than 90K nodes (1G matrix) you could 
use it directly. If you have more nodes but
the number of ones in the matrix is not too high you could compress the 
matrix somewhat at the expense of extraction/update performance.
If you have many nodes with lots of connections you're out of luck 
anyway.
For persistent storage in this case you could use a relational table in 
few obvious ways, or just a file.
Example:

7 11 15 X Y 15 20 25 Z 14 30
1 2 3 4 5 6 7 8 9 10 11 12

maintaining this array is a separate issue (or non issue), here a two 
field table in an RDBMS is definitely appropriate.

1 2 3 4 5 6 7 8 9 10 11 12
1 all zeroes
2 all zeroes
3 all zeroes
4 1 1 1 0 1 0 ... # 4 is X which is 7 11 15 Y
5 0 0 0 0 0 0 1 1 1 1 0 0 # 5 is Y which is 15 20 25 Z
6 and so on
7
8
9
10
11
12

now if your sets are ordered and {7,11} is not the same as {11,7} this 
isn't going to work at all...

Thanks,
Michael
-----Original Message-----
From: Kelly Jones <kelly.terry.jones at gmail.com>Bunch Of Regular Guys
To: techtalk at linuxchix.org; nmlug at nmlug.org; nmosug-l at mailman.swcp.com; 
cschat at cs.unm.edu; mysql at lists.mysql.com
Sent: Wed, 30 May 2007 2:53 pm
Subject: Efficiently modeling sets and subsets in lattice-like structure


Apologies for the mass cross-posting: I haven't been able to find a 
single 
answer or reference for the problem below (googling didn't help), and 
was hoping someone could point me to something helpful. I'm convinced 
there's a well-known answer here that I just can't find :( 
 
We're modeling a collection of (finite mathematical) sets, where sets 
may contain each other as subsets. 
 
For example, set X may be defined as "{7,11,15} + Y" meaning X 
contains 7, 11, 15, and all the members of set Y. Set Y could then be 
"{15,20,25} + Z", and Z might be "{7,14,30}" with no subsets. 
 
Of course, no set includes itself, directly or indirectly. 
 
However, one set may include many other subsets (eg, X may include 
both Y and Z), and one set may be included in many others (eg, X may 
be included in sets V and W). 
 
If the sets were nodes in a directed graph, the graph would be 
acyclic, but not a tree. I believe this is called a "lattice"(?). 
 
How can we efficiently model this "lattice" of sets either using SQL 
or some other technique? Specifically: 
 
% Add or remove members/subsets from a set and have the changes 
"bubble up" the lattice efficiently. 
 
% Find all members of a set X, efficiently traversing subsets. In 
other words, find all nodes/leaves of the subtree rooted at X 
 
% For a given set X, find all sets that contain X as a subset, 
directly or indirectly (eg, if Y contains X, and Z contains Y, return 
both Y and Z). In other words, find all the nodes that have X as a 
sub-node. 
 
The most promising lead I'd found was: 
 
http://dev.mysql.com/tech-resources/articles/hierarchical-data.html 
 
(which is why I'm cc'ing the MySQL list), but this only works with 
trees, and I couldn't figure out how to modify it for lattices. 
 
-- We're just a Bunch Of Regular Guys, a collective group that's trying 
to understand and assimilate technology. We feel that resistance to 
new ideas and technology is unwise and ultimately futile. 
 
-- MySQL General Mailing List 
For list archives: http://lists.mysql.com/mysql 
To unsubscribe: 
http://lists.mysql.com/mysql?unsub=mizioumt@netscape.net 
 


________________________________________________________________________
Check Out the new free AIM(R) Mail -- 2 GB of storage and 
industry-leading spam and email virus protection.
=0


More information about the Techtalk mailing list