(but is slow in Java, Perl, PHP, Python, Ruby, ...) Introduction This is a tale of two approaches to regular expression matching. One of them is in widespread use in the standard interpreters for many languages, including Perl. The other is used only in a few places, notably most implementations of awk and grep. The two approaches have wildly different performance characteristics: Time to match a?3a3 a?a?a?aaa . The two graphs plot the time required by each approach to match the regular expression a? n a a n. Notice that Perl requires over sixty seconds to match a 29-character string.
Finite Automata
A finite automaton is always in one of its states, represented in the diagram by circles. As it reads the string, it switches from state to state. This machine has two special states: the start state s0 and the matching state s4. Start states are depicted with lone arrowheads pointing at them, and matching states are drawn as a double circle.
The machine we have been considering is called a deterministic finite automaton (DFA), because in any state, each possible input letter leads to at most one new state.
Converting Regular Expressions to NFAs
Regular Expression Search Algorithms
Now we have a way to test whether a regular expression matches a string: convert the regular expression to an NFA and then run the NFA using the string as input.
Read full article from Regular Expression Matching Can Be Simple And Fast
Finite Automata
A finite automaton is always in one of its states, represented in the diagram by circles. As it reads the string, it switches from state to state. This machine has two special states: the start state s0 and the matching state s4. Start states are depicted with lone arrowheads pointing at them, and matching states are drawn as a double circle.
The machine we have been considering is called a deterministic finite automaton (DFA), because in any state, each possible input letter leads to at most one new state.
Converting Regular Expressions to NFAs
Regular Expression Search Algorithms
Now we have a way to test whether a regular expression matches a string: convert the regular expression to an NFA and then run the NFA using the string as input.
Read full article from Regular Expression Matching Can Be Simple And Fast
No comments:
Post a Comment