[Courses] Course idea/request (a second one)

Daniel. cristofd at hevanet.com
Fri Feb 20 02:42:31 EST 2004


>how to use regulare expressions, I can draw the finite state automa up on
>paper, but never manage to translate them to regulare expressions that are
>being used on cli apps

How to change a finite-state automaton into a regular expression? 
Here's a technique, though it's a bit slow to do by hand and there's 
probably a program to do it...
We're going to modify the graph gradually, and the edges are allowed 
to be labeled with regular expressions, not just single symbols. 
Special notations for the purpose of this email are:
-zero or more copies of string A, concatenated together: A*
-string A followed by string B: A.B
-string A OR string B: A|B
And that's the order of precedence too. Now, the technique.

-add a new start node to the graph, and make an edge from it to the 
old start node, labeled with the empty string
-add a new final node to the graph, and make edges to it from all the 
old final nodes, labeled with the empty string
-everywhere in the graph where there are two edges from node X to 
node Y, one labeled with string A and one with string B, replace them 
with a single edge from X to Y, labeled A|B
-from now on, the notation <XY> will denote the regular expression 
that labels the edge from node X to node Y

-now we eliminate nodes one at a time. Pick a node at random--call it 
X. Now for every pair of nodes other than X--call them Y and Z--we 
make the regular expression <YZ>|<YX>.<XX>*.<XZ> and that expression 
becomes the new <YZ>, replacing the old one. Basically, all the paths 
through node X are being replaced by direct paths that do not go 
through X. After we do this, we can remove X from the graph entirely.

Note that we must do this for every pair of nodes other than X in 
BOTH directions, I.e. we not only replace <MN> with 
<MN>|<MX>.<XX>*.<XN> but we also replace <NM> with 
<NM>|<NX>.<XX>*.<XM> AND for that matter we do it for each node other 
than X and ITSELF, i.e. we replace <MM> with <MM>|<MX>.<XX>*.<XM>

Okay. So after we made all these new direct paths to replace the old 
paths through X, we remove X from the graph. Then we pick a new node, 
do the same thing with all the paths that go through it, and remove 
it. We go on doing this until the only two nodes left are the new 
start node and the new final node that we made for the purposes of 
this procedure. The edge from the new start node to the new final 
node, at that point, will be labeled with a regular expression that 
is exactly the one recognized by the finite-state automaton we just 
destroyed.

Two more little notes. One, in making the new regular expressions, 
the components should be parenthesized where necessary, i.e. we 
really want to replace <YZ> with <YZ>|(<YX>).(<XX>)*.(<XZ>), though 
some of those parentheses may be superfluous. Two, the final regular 
expression may be tidier if we clean up the regular expressions a bit 
along the way, using some of the regular-expression equivalence rules.

If this is is unclear, or isn't vaguely what you were interested in, sorry.
-Daniel.
-- 
If at first you don't succeed, cut your losses.


More information about the Courses mailing list