The library works by building the syntax for regular expressions out of Java syntax. A regular expression pattern r
can be:
- A literal string:
"string"
- A literal character:
'char'
- A sequence of patterns:
s(r1,...,rn)
- A choice of patterns:
or(r1,...,rn)
- A zero-or-more repetition of a pattern:
rep(r)
For example, the regular expression:
(foo|bar|baz)*xoo
becomes:
s(rep(or("foo","bar","baz")),"xoo")
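To make the combinators concrete, here is a minimal sketch of how s, or and rep could be built on top of a small NFA with epsilon edges. It is not the library's actual code: the class names RegexSketch, State and NFA, the helpers lit, toNFA and closure, and the matches method are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: names and structure are assumptions, not the library's code.
class RegexSketch {
    /** One NFA state: character-labeled edges plus epsilon (no-input) edges. */
    static final class State {
        final List<Character> labels = new ArrayList<>();
        final List<State> targets = new ArrayList<>();
        final List<State> epsilons = new ArrayList<>();
    }

    /** An NFA fragment with a single entry state and a single exit state. */
    static final class NFA {
        final State entry = new State();
        final State exit = new State();

        /** Simulates the NFA by tracking the set of states reachable so far. */
        boolean matches(String input) {
            Set<State> current = closure(Set.of(entry));
            for (char c : input.toCharArray()) {
                Set<State> next = new HashSet<>();
                for (State st : current)
                    for (int i = 0; i < st.labels.size(); i++)
                        if (st.labels.get(i) == c) next.add(st.targets.get(i));
                current = closure(next);
            }
            return current.contains(exit);
        }
    }

    /** Adds every state reachable through epsilon edges alone. */
    static Set<State> closure(Set<State> seed) {
        Set<State> seen = new HashSet<>(seed);
        Deque<State> work = new ArrayDeque<>(seed);
        while (!work.isEmpty())
            for (State t : work.pop().epsilons)
                if (seen.add(t)) work.push(t);
        return seen;
    }

    /** Accepts a String, a Character, or an already-built NFA as a pattern. */
    static NFA toNFA(Object o) {
        if (o instanceof NFA) return (NFA) o;
        if (o instanceof Character) return lit(String.valueOf(o));
        if (o instanceof String) return lit((String) o);
        throw new IllegalArgumentException("not a pattern: " + o);
    }

    /** Literal: a chain of states, one labeled edge per character. */
    static NFA lit(String text) {
        NFA n = new NFA();
        State cur = n.entry;
        for (char c : text.toCharArray()) {
            State next = new State();
            cur.labels.add(c);
            cur.targets.add(next);
            cur = next;
        }
        cur.epsilons.add(n.exit);
        return n;
    }

    /** Sequence: each part's exit is wired to the next part's entry. */
    static NFA s(Object... parts) {
        NFA n = new NFA();
        State cur = n.entry;
        for (Object p : parts) {
            NFA sub = toNFA(p);
            cur.epsilons.add(sub.entry);
            cur = sub.exit;
        }
        cur.epsilons.add(n.exit);
        return n;
    }

    /** Choice: the entry branches to every alternative. */
    static NFA or(Object... parts) {
        NFA n = new NFA();
        for (Object p : parts) {
            NFA sub = toNFA(p);
            n.entry.epsilons.add(sub.entry);
            sub.exit.epsilons.add(n.exit);
        }
        return n;
    }

    /** Zero-or-more repetition: skip the body, or run it and loop back. */
    static NFA rep(Object part) {
        NFA n = new NFA();
        NFA sub = toNFA(part);
        n.entry.epsilons.add(n.exit);
        n.entry.epsilons.add(sub.entry);
        sub.exit.epsilons.add(sub.entry);
        sub.exit.epsilons.add(n.exit);
        return n;
    }

    public static void main(String[] args) {
        NFA pattern = s(rep(or("foo", "bar", "baz")), "xoo");
        System.out.println(pattern.matches("xoo"));       // true
        System.out.println(pattern.matches("foobazxoo")); // true
        System.out.println(pattern.matches("foo"));       // false
    }
}
```

Taking Object arguments lets string and character literals sit directly inside s, or and rep, as in the example above. The cost is that a badly typed argument is only caught at run time.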
The library works by converting each regular expression into an NFA on the fly. For whatever reason, I didn't implement it in a functional fashion, so you can't re-use a sub-expression in more than one regular expression. That is, the following breaks:
NFA foo = s("foo") ;
NFA pattern = s(foo,foo) ;
but the following works:
NFA pattern = s(s("foo"),s("foo")) ; // or s("foo","foo")
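The sketch below shows one plausible way that sharing a sub-expression can go wrong when NFA fragments are mutable objects wired together in place. It is an illustration under assumptions, not a diagnosis of the library: SharingSketch, State, NFA, lit, s and matches are hypothetical names, and the library's real failure mode may differ. Here, using the same fragment twice makes its exit loop back to its own entry, so the sequence wrongly accepts a single "foo". Building a fresh fragment for each use (here through a Supplier) avoids the aliasing.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch of the aliasing problem; not the library's code.
class SharingSketch {
    static final class State {
        final List<Character> labels = new ArrayList<>();
        final List<State> targets = new ArrayList<>();
        final List<State> epsilons = new ArrayList<>();
    }

    static final class NFA {
        final State entry = new State();
        final State exit = new State();
    }

    /** Literal: a chain of states, one labeled edge per character. */
    static NFA lit(String text) {
        NFA n = new NFA();
        State cur = n.entry;
        for (char c : text.toCharArray()) {
            State next = new State();
            cur.labels.add(c);
            cur.targets.add(next);
            cur = next;
        }
        cur.epsilons.add(n.exit);
        return n;
    }

    /** Sequence: mutates the parts by adding epsilon edges between them. */
    static NFA s(NFA... parts) {
        NFA n = new NFA();
        State cur = n.entry;
        for (NFA sub : parts) {
            cur.epsilons.add(sub.entry);
            cur = sub.exit;
        }
        cur.epsilons.add(n.exit);
        return n;
    }

    static Set<State> closure(Set<State> seed) {
        Set<State> seen = new HashSet<>(seed);
        Deque<State> work = new ArrayDeque<>(seed);
        while (!work.isEmpty())
            for (State t : work.pop().epsilons)
                if (seen.add(t)) work.push(t);
        return seen;
    }

    static boolean matches(NFA nfa, String input) {
        Set<State> current = closure(Set.of(nfa.entry));
        for (char c : input.toCharArray()) {
            Set<State> next = new HashSet<>();
            for (State st : current)
                for (int i = 0; i < st.labels.size(); i++)
                    if (st.labels.get(i) == c) next.add(st.targets.get(i));
            current = closure(next);
        }
        return current.contains(nfa.exit);
    }

    public static void main(String[] args) {
        NFA foo = lit("foo");
        NFA shared = s(foo, foo); // foo.exit now points back at foo.entry
        System.out.println(matches(shared, "foo")); // true, although "foofoo" was intended

        Supplier<NFA> freshFoo = () -> lit("foo"); // a new fragment per use
        NFA ok = s(freshFoo.get(), freshFoo.get());
        System.out.println(matches(ok, "foo"));    // false
        System.out.println(matches(ok, "foofoo")); // true
    }
}
```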
Read full article from Converting regular expressions into nondeterministic finite automata (NFAs): An implementation in Java