pchampin / fsa4streams

Finite State Automata for streams

Python 100.00%

fsa4streams's Introduction

Finite State Automata for streams

This is a library for running Finite State Automata (FSA) on bounded and unbounded streams of events.

Online documentation is available at: http://fsa4streams.readthedocs.org/en/latest/?badge=latest

Ideas for future extensions

- export to dot (for graphical presentation)
- toolkit for handling FSAs
  - optimizing
  - merging
- more built-in matchers?
  - prefix
  - more flexible regexp (i.e. not enforcing ^ and $)
- converter from regexp to automata

fsa4streams's Issues

``silent`` is not correctly implemented

There are actually two problems:

- silent on the first transition traversed by a token is simply ignored;
- in non-deterministic automata, only the silent attribute of the first considered transition is correctly applied; all other transitions matching the event wrongly "inherit" the attribute of that transition.

For example, feed the FSA below with abcd, and it will produce two matches abc and abd, while they should be bc and d. On the other hand, if you move the silent attribute from the transition s0→s2 to s0→s1, the matches become ac and ad, which is equally incorrect.

{
    "allow_overlap": true,
    "states": {
        "start": {
            "transitions": [
                {
                    "condition": "a",
                    "silent": true,
                    "target": "s0"
                }
            ]
        },
        "s0": {
            "transitions": [
                {
                    "condition": "b",
                    "target": "s1"
                },
                {
                    "condition": "b",
                    "silent": true,
                    "target": "s2"
                }
            ]
        },
        "s1": {
            "transitions": [
                {
                    "condition": "c",
                    "target": "success"
                }
            ]
        },
        "s2": {
            "max_noise": 1,
            "transitions": [
                {
                    "condition": "d",
                    "target": "success"
                }
            ]
        },
        "success": {
            "terminal": true
        }
    }
}
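To make the expected behaviour concrete, here is a minimal sketch of a simulator that reads silent from the transition actually taken by each token. It is not the fsa4streams implementation: the table layout, the `run` function and the token tuples are all hypothetical, and the automaton is hard-coded from the JSON above.

```python
# Hypothetical mini-simulator, NOT the fsa4streams code.
# Each transition is (condition, silent, target), copied from the JSON above.
FSA = {
    "start": [("a", True, "s0")],
    "s0": [("b", False, "s1"), ("b", True, "s2")],
    "s1": [("c", False, "success")],
    "s2": [("d", False, "success")],
    "success": [],
}
MAX_NOISE = {"s2": 1}  # per-state noise budget
TERMINAL = {"success"}


def run(events):
    """Return the matches obtained when silent is applied per transition."""
    matches = []
    tokens = []  # each token is (state, kept_events, noise_in_state)
    for event in events:
        # allow_overlap: a fresh token may start on every event
        candidates = tokens + [("start", "", 0)]
        tokens = []
        for state, kept, noise in candidates:
            fired = False
            for condition, silent, target in FSA[state]:
                if condition != event:
                    continue
                fired = True
                # silent comes from THIS transition, not from a sibling
                new_kept = kept if silent else kept + event
                if target in TERMINAL:
                    matches.append(new_kept)
                else:
                    tokens.append((target, new_kept, 0))
            if not fired and state != "start" and noise < MAX_NOISE.get(state, 0):
                # tolerate the event as noise; it is not kept in the match
                tokens.append((state, kept, noise + 1))
    return matches


print(run("abcd"))
```

Under these assumptions, feeding abcd gives bc and d, which is the behaviour described above as correct.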

Allow terminal states to match immediately

The default behaviour of the FSA is to be greedy, i.e. to yield the longest possible match.

It might be useful to be able to change this behaviour, either locally or globally.

Add an option ``max_noise_ratio``

Rather than limiting the absolute value of noise at the FSA level, it could be useful to limit the noise ratio, i.e. the number of noisy events divided by the number of events taken into account.
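A rough sketch of the check this option would perform. The function names are hypothetical, and it assumes that "events taken into account" counts both the kept events and the noisy ones; the issue does not settle that point.

```python
def noise_ratio(noisy_events, counted_events):
    """Noisy events divided by all events taken into account."""
    if counted_events == 0:
        return 0.0  # assumption: an empty token has no noise
    return noisy_events / counted_events


def within_ratio(noisy_events, counted_events, max_noise_ratio):
    """True if the token may still be kept under max_noise_ratio."""
    return noise_ratio(noisy_events, counted_events) <= max_noise_ratio
```

With max_noise_ratio set to 0.25, a token with 1 noisy event out of 4 would survive, while one with 2 out of 4 would be dropped.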

Add an option ``noisy`` to transitions

For the moment, silent transitions can be used to accept events without keeping them in a match. However, such silent events are not considered as noise (hence they are not taken into account by the global max_noise limit).

It could be interesting to declare a transition as noisy rather than silent, meaning that the event would not be added to the match, but it would be counted as global noise.

Note that it would make no sense to have a transition with both noisy and silent set to true.
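The intended difference between the two flags could be sketched as follows. `apply_transition` is a hypothetical helper, not part of the library; it only shows how the match and the global noise counter would evolve.

```python
def apply_transition(match, global_noise, event, silent=False, noisy=False):
    """Return the updated (match, global_noise) after accepting event."""
    if silent and noisy:
        raise ValueError("a transition cannot be both silent and noisy")
    if noisy:
        # not kept in the match, but counted against the global max_noise
        return match, global_noise + 1
    if silent:
        # not kept in the match, and not counted as noise
        return match, global_noise
    return match + [event], global_noise
```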

Add an option ``max_time_span``

This could be an option on the FSA itself or on states (or maybe just on terminal states...).

The idea is to drop tokens whenever they span over too long a period of time. Note that a token reaching the max_time_span on a terminal state should then be considered as a match.

Note that, for this option to be very useful, it should be possible to assign a specific timestamp to each event. This could be an optional parameter of the method feed, which would default to [last timestamp+1], and would be expected to only increase.
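A possible shape for the timestamp handling, as a sketch only. `EventClock` and `expired` are hypothetical names, the first default timestamp is assumed to be 0, and equal consecutive timestamps are assumed to be acceptable; only decreasing ones are rejected.

```python
class EventClock:
    """Assigns a timestamp to each event fed to the FSA."""

    def __init__(self):
        self.last = None  # timestamp of the previous event, if any

    def stamp(self, timestamp=None):
        if timestamp is None:
            # default: last timestamp + 1 (0 for the very first event)
            timestamp = 0 if self.last is None else self.last + 1
        elif self.last is not None and timestamp < self.last:
            raise ValueError("timestamps must not decrease")
        self.last = timestamp
        return timestamp


def expired(token_start, now, max_time_span):
    """True if a token started at token_start spans too long a period."""
    return now - token_start > max_time_span
```

A token found to be expired on a terminal state would then be reported as a match rather than silently dropped.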

Equivalent automata do not behave identically

Consider the following automaton, fed with the stream a b b b:

{
    "states": {
        "start": {
            "transitions": [
                { "condition": "a", "target": "s1" } ] },
        "s1": {
            "terminal": true,
            "transitions": [
                { "condition": "b", "target": "s1" } ] } } }

It will yield a single match, namely a b b b, with or without allow_overlap set.

Now consider the following automaton:

{   
    "states": {
        "start": {
            "transitions": [
                { "condition": "a", "target": "s2" },
                { "condition": "a", "target": "s1" } ] },
        "s1": {
            "transitions": [
                { "condition": "b", "target": "s1" },
                { "condition": "b", "target": "s2" } ] },
        "s2": {
            "terminal": true } } }

It is equivalent to the first one, but will yield different results when fed with the same events.
Namely, with allow_overlap set to false, it will only yield a as a match.
On the other hand, when allow_overlap is set to true,
it will yield four matches a, a b, a b b and a b b b.

The reason is the following.

In the first FSA, each time a token reaches state s1 (which is terminal), it does not yield a match immediately, because the state has outgoing transitions; instead, it waits to see whether the next events can take it further. More precisely, it spawns a new token that inhibits its parent: if the child token reaches a longer match, the parent is discarded; if the child token is discarded, the parent yields a match.

In the second FSA, the terminal state has no outgoing transition, so every token reaching it yields a match immediately. This contradicts the intention that automata work greedily, so it is arguably a bug.
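The greedy behaviour of the first FSA can be illustrated with a simplified sketch. It hard-codes the language a b*, ignores overlap, and replaces the parent/child token mechanism with a single pending match; flushing the pending match at the end of the stream is an assumption made for the example. `greedy_matches` is a hypothetical name.

```python
def greedy_matches(events):
    """Longest non-overlapping matches of `a b*` in events."""
    matches = []
    pending = None  # match that reached the terminal state but may still grow
    for event in events:
        if pending is not None and event == "b":
            pending = pending + [event]  # the "child" extends the match
            continue
        if pending is not None:
            matches.append(pending)  # the "child" failed: the parent yields
            pending = None
        if event == "a":
            pending = [event]
    if pending is not None:
        matches.append(pending)  # assumption: end of stream flushes the match
    return matches
```

Two equivalent automata should both behave like this sketch, i.e. report only a b b b for the stream a b b b.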

``default_transition`` on the ``start`` state does not work.

The following FSA should match any pair of consecutive events:

{
    "allow_overlap": true,
    "states": {
        "start": {
            "default_transition": {
                "target": "s1"
            }
        },
        "s1": {
            "default_transition": {
                "target": "s2"
            }
        },
        "s2": {
            "terminal": true
        }
    }
}

but it matches nothing, while the following works well:

{
    "allow_overlap": true,
    "states": {
        "start": {
            "transitions": [{
                "condition": ".*", "matcher": "regexp",
                "target": "s1"
            }]
        },
        "s1": {
            "default_transition": {
                "target": "s2"
            }
        },
        "s2": {
            "terminal": true
        }
    }
}
