[Courses] Course idea/request (a second one)
Daniel.
cristofd at hevanet.com
Fri Feb 20 02:42:31 EST 2004
>how to use regulare expressions, I can draw the finite state automa up on
>paper, but never manage to translate them to regulare expressions that are
>being used on cli apps
How to change a finite-state automaton into a regular expression?
Here's a technique, though it's a bit slow to do by hand and there's
probably a program to do it...
We're going to modify the graph gradually, and the edges are allowed
to be labeled with regular expressions, not just single symbols.
Special notations for the purpose of this email are:
-zero or more copies of string A, concatenated together: A*
-string A followed by string B: A.B
-string A OR string B: A|B
And that's the order of precedence too. Now, the technique.
-add a new start node to the graph, and make an edge from it to the
old start node, labeled with the empty string
-add a new final node to the graph, and make edges to it from all the
old final nodes, labeled with the empty string
-everywhere in the graph where there are two edges from node X to
node Y, one labeled with string A and one with string B, replace them
with a single edge from X to Y, labeled A|B
-from now on, the notation <XY> will denote the regular expression
that labels the edge from node X to node Y
-now we eliminate nodes one at a time. Pick a node at random--call it
X. Now for every pair of nodes other than X--call them Y and Z--we
make the regular expression <YZ>|<YX>.<XX>*.<XZ> and that expression
becomes the new <YZ>, replacing the old one. Basically, all the paths
through node X are being replaced by direct paths that do not go
through X. After we do this, we can remove X from the graph entirely.
Note that we must do this for every pair of nodes other than X in
BOTH directions, I.e. we not only replace <MN> with
<MN>|<MX>.<XX>*.<XN> but we also replace <NM> with
<NM>|<NX>.<XX>*.<XM> AND for that matter we do it for each node other
than X and ITSELF, i.e. we replace <MM> with <MM>|<MX>.<XX>*.<XM>
Okay. So after we made all these new direct paths to replace the old
paths through X, we remove X from the graph. Then we pick a new node,
do the same thing with all the paths that go through it, and remove
it. We go on doing this until the only two nodes left are the new
start node and the new final node that we made for the purposes of
this procedure. The edge from the new start node to the new final
node, at that point, will be labeled with a regular expression that
is exactly the one recognized by the finite-state automaton we just
destroyed.
Two more little notes. One, in making the new regular expressions,
the components should be parenthesized where necessary, i.e. we
really want to replace <YZ> with <YZ>|(<YX>).(<XX>)*.(<XZ>), though
some of those parentheses may be superfluous. Two, the final regular
expression may be tidier if we clean up the regular expressions a bit
along the way, using some of the regular-expression equivalence rules.
If this is is unclear, or isn't vaguely what you were interested in, sorry.
-Daniel.
--
If at first you don't succeed, cut your losses.
More information about the Courses
mailing list