The.Algorithm.Design.Manual.Springer-Verlag.1998
The.Algorithm.Design.Manual.Springer-Verlag.1998 The.Algorithm.Design.Manual.Springer-Verlag.1998
Finite State Machine Minimization ● Constructing machines from regular expressions - There are two approaches to converting a regular expression to an equivalent finite automaton, the difference being whether the output automaton is to be a nondeterministic or deterministic machine. The former is easier to construct but less efficient to simulate. The nondeterministic construction uses -moves, which are optional transitions that require no input to fire. On reaching a state with an -move, we must assume that the machine can be in either state. Using such -moves, it is straightforward to construct an automaton from a depth-first traversal of the parse tree of the regular expression. This machine will have O(m) states, if m is the length of the regular expression. Further, simulating this machine on a string of length n takes O(m n) time, since we need consider each state/prefix pair only once. The deterministic construction starts with the parse tree for the regular expression, observing that each leaf represents one of the alphabet symbols in the pattern. After recognizing a prefix of the text, we can be left in some subset of these possible positions, which would correspond to a state in the finite automaton. The derivatives method builds up this automaton state by state as it is needed. Even so, some regular expressions of length m require states in any DFA implementing them, such as . There is no way to avoid this exponential blowup in the space required. Note, however, that it takes linear time to simulate an input string on any automaton, regardless of the size of the automaton. Implementations: FIRE Engine is a finite automaton toolkit, written in C++ by Bruce Watson. It provides production-quality implementations of finite automata and regular expression algorithms. Several finite automaton minimization algorithms have been implemented, including Hopcroft's algorithm. Both deterministic and nondeterministic automata are supported. FIRE Engine has been used for compiler construction, hardware modeling, and computational biology applications. It is strictly a computing engine and does not provide a graphical user interface. FIRE Engine is available by anonymous ftp from ftp.win.tue.nl in the directory /pub/techreports/pi/watson.phd/. A greatly improved commercial version is available from www.RibbitSoft.com. Grail is a C++ package for symbolic computation with finite automata and regular expressions, from Darrell Raymond and Derrick Wood. Grail enables one to convert between different machine representations and to minimize automata. It can handle machines with 100,000 states and dictionaries of 20,000 words. All code and documentation are accessible from the WWW site http://www.csd.uwo.ca/research/grail, as well as pointers to a variety of other automaton packages. Commercial use of Grail is not allowed without approval, although it is freely available to students and educators. An implementation in C of a regular-expression matching algorithm appears in [BR95]. The source code for this program is printed in the text and is available on disk for a modest fee. A bare bones implementation in C of a regular-expression pattern matching algorithm appears in [GBY91]. See file:///E|/BOOK/BOOK5/NODE207.HTM (3 of 4) [19/1/2003 1:32:15]
Finite State Machine Minimization Section . XTango (see Section ) includes a simulation of a DFA. Many of the other animations (but not this one) are interesting and quite informative to watch. FLAP (Formal Languages and Automata Package) is a tool by Susan Rodger for drawing and simulating finite automata, pushdown automata, and Turing machines. Using FLAP, one can draw a graphical representation (transition diagram) of an automaton, edit it, and simulate the automaton on some input. FLAP was developed in C++ for X-Windows. See http://www.cs.duke.edu:80/ rodger/tools/tools.html. Notes: Aho [Aho90] provides a good survey on algorithms for pattern matching, and a particularly clear exposition for the case where the patterns are regular expressions. The technique for regular expression pattern matching with -moves is due to Thompson [Tho68]. Other expositions on finite automaton pattern matching include [AHU74]. Hopcroft [Hop71] gave an optimal algorithm for minimizing the number of states in DFAs. The derivatives method of constructing a finite state machine from a regular expression is due to Brzozowski [Brz64] and has been expanded upon in [BS86]. Expositions on the derivatives method includes Conway []. Testing the equivalence of two nondeterministic finite state machines is PSPACE-complete [SM73]. Related Problems: Satisfiability (see page ). string matching (see page ). Next: Longest Common Substring Up: Set and String Problems Previous: Cryptography Algorithms Mon Jun 2 23:33:50 EDT 1997 file:///E|/BOOK/BOOK5/NODE207.HTM (4 of 4) [19/1/2003 1:32:15]
- Page 595 and 596: Medial-Axis Transformation Implemen
- Page 597 and 598: Polygon Partitioning number of piec
- Page 599 and 600: Simplifying Polygons Next: Shape Si
- Page 601 and 602: Simplifying Polygons vertices and o
- Page 603 and 604: Shape Similarity Next: Motion Plann
- Page 605 and 606: Shape Similarity or how close it is
- Page 607 and 608: Motion Planning There is a wide ran
- Page 609 and 610: Motion Planning often arise in the
- Page 611 and 612: Maintaining Line Arrangements Think
- Page 613 and 614: Maintaining Line Arrangements Next:
- Page 615 and 616: Minkowski Sum where x+y is the vect
- Page 617 and 618: Set and String Problems Next: Set C
- Page 619 and 620: Set Cover Next: Set Packing Up: Set
- Page 621 and 622: Set Cover Figure: Hitting set is du
- Page 623 and 624: Set Packing Next: String Matching U
- Page 625 and 626: Set Packing Notes: An excellent exp
- Page 627 and 628: String Matching shouldn't try. Furt
- Page 629 and 630: String Matching and texts, I recomm
- Page 631 and 632: Approximate String Matching This sa
- Page 633 and 634: Approximate String Matching http://
- Page 635 and 636: Text Compression Next: Cryptography
- Page 637 and 638: Text Compression code string. ASCII
- Page 639 and 640: Cryptography Next: Finite State Mac
- Page 641 and 642: Cryptography ● How can I validate
- Page 643 and 644: Cryptography MD5 [Riv92] is the sec
- Page 645: Finite State Machine Minimization F
- Page 649 and 650: Longest Common Substring than edit
- Page 651 and 652: Longest Common Substring include [A
- Page 653 and 654: Shortest Common Superstring Finding
- Page 655 and 656: Software systems Next: LEDA Up: Alg
- Page 657 and 658: LEDA Next: Netlib Up: Software syst
- Page 659 and 660: Netlib Algorithms Mon Jun 2 23:33:5
- Page 661 and 662: The Stanford GraphBase Next: Combin
- Page 663 and 664: Algorithm Animations with XTango Ne
- Page 665 and 666: Programs from Books Next: Discrete
- Page 667 and 668: Handbook of Data Structures and Alg
- Page 669 and 670: Algorithms from P to NP Next: Compu
- Page 671 and 672: Algorithms in C++ Next: Data Source
- Page 673 and 674: Textbooks Next: On-Line Resources U
- Page 675 and 676: On-Line Resources Next: Literature
- Page 677 and 678: People Next: Software Up: On-Line R
- Page 679 and 680: Professional Consulting Services Ne
- Page 681 and 682: Index A Up: Index - All Index: A ab
- Page 683 and 684: Index A artists steal ASA ASCII asp
- Page 685 and 686: Index B binary representation - sub
- Page 687 and 688: Index C Up: Index - All Index: C C+
- Page 689 and 690: Index C clustering , , co-NP coding
- Page 691 and 692: Index C consulting services , conta
- Page 693 and 694: Index D Up: Index - All Index: D DA
- Page 695 and 696: Index D Dictionaries dictionaries -
Finite State Machine Minimization<br />
● Constructing machines from regular expressions - <strong>The</strong>re are two approaches to converting a<br />
regular expression to an equivalent finite automaton, the difference being whether the output<br />
automaton is to be a nondeterministic or deterministic machine. <strong>The</strong> former is easier to construct<br />
but less efficient to simulate.<br />
<strong>The</strong> nondeterministic construction uses -moves, which are optional transitions that require no<br />
input to fire. On reaching a state with an -move, we must assume that the machine can be in<br />
either state. Using such -moves, it is straightforward to construct an automaton from a depth-first<br />
traversal of the parse tree of the regular expression. This machine will have O(m) states, if m is<br />
the length of the regular expression. Further, simulating this machine on a string of length n takes<br />
O(m n) time, since we need consider each state/prefix pair only once.<br />
<strong>The</strong> deterministic construction starts with the parse tree for the regular expression, observing that<br />
each leaf represents one of the alphabet symbols in the pattern. After recognizing a prefix of the<br />
text, we can be left in some subset of these possible positions, which would correspond to a state<br />
in the finite automaton. <strong>The</strong> derivatives method builds up this automaton state by state as it is<br />
needed. Even so, some regular expressions of length m require states in any DFA<br />
implementing them, such as . <strong>The</strong>re is no way to avoid this<br />
exponential blowup in the space required. Note, however, that it takes linear time to simulate an<br />
input string on any automaton, regardless of the size of the automaton.<br />
Implementations: FIRE Engine is a finite automaton toolkit, written in C++ by Bruce Watson. It<br />
provides production-quality implementations of finite automata and regular expression algorithms.<br />
Several finite automaton minimization algorithms have been implemented, including Hopcroft's<br />
algorithm. Both deterministic and nondeterministic automata are supported. FIRE Engine has been used<br />
for compiler construction, hardware modeling, and computational biology applications. It is strictly a<br />
computing engine and does not provide a graphical user interface. FIRE Engine is available by<br />
anonymous ftp from ftp.win.tue.nl in the directory /pub/techreports/pi/watson.phd/. A greatly improved<br />
commercial version is available from www.RibbitSoft.com.<br />
Grail is a C++ package for symbolic computation with finite automata and regular expressions, from<br />
Darrell Raymond and Derrick Wood. Grail enables one to convert between different machine<br />
representations and to minimize automata. It can handle machines with 100,000 states and dictionaries of<br />
20,000 words. All code and documentation are accessible from the WWW site<br />
http://www.csd.uwo.ca/research/grail, as well as pointers to a variety of other automaton packages.<br />
Commercial use of Grail is not allowed without approval, although it is freely available to students and<br />
educators.<br />
An implementation in C of a regular-expression matching algorithm appears in [BR95]. <strong>The</strong> source code<br />
for this program is printed in the text and is available on disk for a modest fee. A bare bones<br />
implementation in C of a regular-expression pattern matching algorithm appears in [GBY91]. See<br />
file:///E|/BOOK/BOOK5/NODE207.HTM (3 of 4) [19/1/2003 1:32:15]