[Techtalk] Efficiently modeling sets and subsets in lattice-like structure
mizioumt at netscape.net
mizioumt at netscape.net
Wed May 30 21:40:37 UTC 2007
Hi Kelly,
I'm pretty sure there's no universal efficient solution to this
problem, this must be the well known answer
you are looking for. The pure mathematics of 'very hard' could be very
hard to express, though.
So before you choose a solution you need to find out more about the
actual graphs you're working
On the other hand there are plenty of possible approaches to the graph
storage problem.
They do seem to concentrate on trees (XML is one of the better ones).
So you could approach the graph representation
problem as a tree of trees, or as a set of separately stored trees
joined logically over common nodes.
I doubt you could find many solutions in that direction.
Then since a graph is a binary relation over a set of nodes you could
represent it as a 2 field relational table in an obvious way.
SQl isn't very good with trees, though.
In any case on model 'a Bunch Of Regular Guys' could code themselves (I
did it once!) is a representation
of the binary relation as a bit matrix. You'll need N*N bits where N is
the number of nodes so it's not going to scale well
but if you do happen to have fewer than 90K nodes (1G matrix) you could
use it directly. If you have more nodes but
the number of ones in the matrix is not too high you could compress the
matrix somewhat at the expense of extraction/update performance.
If you have many nodes with lots of connections you're out of luck
anyway.
For persistent storage in this case you could use a relational table in
few obvious ways, or just a file.
Example:
7 11 15 X Y 15 20 25 Z 14 30
1 2 3 4 5 6 7 8 9 10 11 12
maintaining this array is a separate issue (or non issue), here a two
field table in an RDBMS is definitely appropriate.
1 2 3 4 5 6 7 8 9 10 11 12
1 all zeroes
2 all zeroes
3 all zeroes
4 1 1 1 0 1 0 ... # 4 is X which is 7 11 15 Y
5 0 0 0 0 0 0 1 1 1 1 0 0 # 5 is Y which is 15 20 25 Z
6 and so on
7
8
9
10
11
12
now if your sets are ordered and {7,11} is not the same as {11,7} this
isn't going to work at all...
Thanks,
Michael
-----Original Message-----
From: Kelly Jones <kelly.terry.jones at gmail.com>Bunch Of Regular Guys
To: techtalk at linuxchix.org; nmlug at nmlug.org; nmosug-l at mailman.swcp.com;
cschat at cs.unm.edu; mysql at lists.mysql.com
Sent: Wed, 30 May 2007 2:53 pm
Subject: Efficiently modeling sets and subsets in lattice-like structure
Apologies for the mass cross-posting: I haven't been able to find a
single
answer or reference for the problem below (googling didn't help), and
was hoping someone could point me to something helpful. I'm convinced
there's a well-known answer here that I just can't find :(
We're modeling a collection of (finite mathematical) sets, where sets
may contain each other as subsets.
For example, set X may be defined as "{7,11,15} + Y" meaning X
contains 7, 11, 15, and all the members of set Y. Set Y could then be
"{15,20,25} + Z", and Z might be "{7,14,30}" with no subsets.
Of course, no set includes itself, directly or indirectly.
However, one set may include many other subsets (eg, X may include
both Y and Z), and one set may be included in many others (eg, X may
be included in sets V and W).
If the sets were nodes in a directed graph, the graph would be
acyclic, but not a tree. I believe this is called a "lattice"(?).
How can we efficiently model this "lattice" of sets either using SQL
or some other technique? Specifically:
% Add or remove members/subsets from a set and have the changes
"bubble up" the lattice efficiently.
% Find all members of a set X, efficiently traversing subsets. In
other words, find all nodes/leaves of the subtree rooted at X
% For a given set X, find all sets that contain X as a subset,
directly or indirectly (eg, if Y contains X, and Z contains Y, return
both Y and Z). In other words, find all the nodes that have X as a
sub-node.
The most promising lead I'd found was:
http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
(which is why I'm cc'ing the MySQL list), but this only works with
trees, and I couldn't figure out how to modify it for lattices.
-- We're just a Bunch Of Regular Guys, a collective group that's trying
to understand and assimilate technology. We feel that resistance to
new ideas and technology is unwise and ultimately futile.
-- MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/mysql?unsub=mizioumt@netscape.net
________________________________________________________________________
Check Out the new free AIM(R) Mail -- 2 GB of storage and
industry-leading spam and email virus protection.
=0
More information about the Techtalk
mailing list