Consider the following automaton, fed with the stream `a b b b`:
{
  "states": {
    "start": {
      "transitions": [
        { "condition": "a", "target": "s1" }
      ]
    },
    "s1": {
      "terminal": true,
      "transitions": [
        { "condition": "b", "target": "s1" }
      ]
    }
  }
}
It will yield a single match, namely `a b b b`, with or without `allow_overlap` set.
Now consider the following automaton:
{
  "states": {
    "start": {
      "transitions": [
        { "condition": "a", "target": "s2" },
        { "condition": "a", "target": "s1" }
      ]
    },
    "s1": {
      "transitions": [
        { "condition": "b", "target": "s1" },
        { "condition": "b", "target": "s2" }
      ]
    },
    "s2": {
      "terminal": true
    }
  }
}
It is equivalent to the first one (both accept an `a` followed by any number of `b`), but will yield different results when fed with the same events. Namely, with `allow_overlap` set to `false`, it will only yield `a` as a match. On the other hand, when `allow_overlap` is set to `true`, it will yield four matches: `a`, `a b`, `a b b` and `a b b b`.
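The equivalence of the two automata as acceptors can be checked by brute force. The following is a minimal sketch, assuming that conditions are compared literally against events; the compact transition tables and the helpers `accepts` and `same_language` are hypothetical names introduced for illustration, not part of any library.

```python
from itertools import product

# Hypothetical compact encoding of the two automata:
# state name -> list of (condition, target) pairs.
FSA1 = {"start": [("a", "s1")], "s1": [("b", "s1")]}
TERMINAL1 = {"s1"}

FSA2 = {
    "start": [("a", "s2"), ("a", "s1")],
    "s1": [("b", "s1"), ("b", "s2")],
    "s2": [],
}
TERMINAL2 = {"s2"}


def accepts(transitions, terminal, events):
    """Return True if the whole event sequence ends in a terminal state."""
    current = {"start"}
    for event in events:
        # Follow every transition whose condition equals the event.
        current = {
            target
            for state in current
            for condition, target in transitions[state]
            if condition == event
        }
    return bool(current & terminal)


def same_language(max_len=6):
    """Compare both automata on every sequence over {a, b} up to max_len."""
    for length in range(max_len + 1):
        for word in product("ab", repeat=length):
            if accepts(FSA1, TERMINAL1, word) != accepts(FSA2, TERMINAL2, word):
                return False
    return True
```

Such a check only covers sequences up to the chosen length, so it illustrates the equivalence rather than proving it.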
The reason for this difference is the following.
- In the first FSA, each time a token reaches state `s1` (which is terminal), it does not yield a match immediately, because the state has outgoing transitions; instead, it waits to see whether the next events can take it further. More precisely, it spawns a new token that inhibits its parent: if the child token reaches a longer match, the parent is discarded; if, on the other hand, the child token is discarded, the parent yields a match.
- In the second FSA, the terminal state `s2` has no outgoing transitions, so every token reaching it yields a match immediately. This contradicts the intention that automata work greedily, so it is arguably a bug.
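The behaviour described in the two points above can be mimicked by a small toy model. The sketch below is not the actual implementation: `toy_matches` is a hypothetical helper that only tracks matches starting at the first event, treats conditions as literal event values, and collapses the parent/child token mechanism into a single `pending` match that each longer match overwrites.

```python
FSA1 = {
    "states": {
        "start": {"transitions": [{"condition": "a", "target": "s1"}]},
        "s1": {
            "terminal": True,
            "transitions": [{"condition": "b", "target": "s1"}],
        },
    }
}

FSA2 = {
    "states": {
        "start": {
            "transitions": [
                {"condition": "a", "target": "s2"},
                {"condition": "a", "target": "s1"},
            ]
        },
        "s1": {
            "transitions": [
                {"condition": "b", "target": "s1"},
                {"condition": "b", "target": "s2"},
            ]
        },
        "s2": {"terminal": True},
    }
}


def toy_matches(fsa, events, allow_overlap):
    """Toy model (assumption): matches anchored at the first event only."""
    states = fsa["states"]
    active = {"start"}
    consumed = []
    pending = None  # deferred match, replaced whenever a longer one appears
    matches = []
    for event in events:
        targets = {
            t["target"]
            for name in active
            for t in states[name].get("transitions", [])
            if t["condition"] == event
        }
        if not targets:
            break  # no token can advance any further
        consumed.append(event)
        active = set()
        for name in sorted(targets):
            state = states[name]
            if state.get("terminal") and not state.get("transitions"):
                # Terminal state without outgoing transitions: match at once.
                matches.append(" ".join(consumed))
                if not allow_overlap:
                    return matches
            else:
                if state.get("terminal"):
                    # Terminal state with outgoing transitions: defer.
                    pending = " ".join(consumed)
                active.add(name)
    if pending is not None:
        matches.append(pending)
    return matches
```

Under these assumptions, feeding `a b b b` to the first automaton gives `["a b b b"]` for either value of `allow_overlap`, while the second automaton gives `["a"]` without overlap and `["a", "a b", "a b b", "a b b b"]` with it.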