This is a (proof-of-concept) regular expression parser implemntation in LISP.
It was created as automatically generated deterministic finite automata and should be very fast in searching. State machine uses lookup table for the states, so effectively it is O(number of states), it could be compressed if range of symbols is huge.
For example for
[a|b]*abb regexp there are only 4 states.
That’s how it works:
$ ./regexp.lisp "[a|b]*abb" "qweabbzxcaabbasd" [a|b]*abb in qweabbzxcaabbasd -> abb [a|b]*abb in qweabbzxcaabbasd -> aabb
Regexp parser supports braces, ‘or’, ‘star’ and concatenation operations, the latter does not require a special symbol. It is rather simple to add another ones like dot for any single symbol or dash for the range of symbols. It is also not too complex to replace character-oriented comparison operations with different types and functions.
One can download sources from archive.