[Techtalk] Efficiently modeling sets and subsets in lattice-like structure

Fri Jun 1 07:21:59 UTC 2007

Hello!

Kelly Jones:
> We're modeling a collection of (finite mathematical) sets, where sets
> may contain each other as subsets.
> [...]
> Of course, no set includes itself, directly or indirectly.
> [...]
> If the sets were nodes in a directed graph, the graph would be
> acyclic, but not a tree. I believe this is called a "lattice"(?).

Data structures for directed acyclic graphs (or DAGs) seem
(to me) to boil down to adjacency matrices or adjacency lists.

> How can we efficiently model this "lattice" of sets either using SQL
> or some other technique? 

My first try would be kind of a two-step process:

Have one table for direct set membership -- let's call it
"members", i.e. a table that lists set names and the
elements belonging to them.

Have a second table "subsets" that maintains a adjacency
matrix of the DAG representing the subset relations;
something like listing all pairs of set names where set A is
known to be subset (or superset) of set B.

Finding all members of set X would then mean, you first have
to find all subsets of X by looking at the "subset" table
and then find all members of all of those sets out of the
"members" table.

Adding elements to sets should be easy; removing elements
seems to raise more detail questions. For instance, if

X = {1, 2, 3} + Y
Y = {1, 4}

so that finding all members of X would return {1, 2, 3, 4}
and we would like to remove element "1" from X, what is
supposed to happen? Should "1" be removed from Y, too? I
guess, that's up to you ...

Hope this helps,
Regards,
Wolfgang