fsa4streams's Introduction

Finite State Automata for streams

This is a library for running Finite State Automata (FSA) on bounded and unbounded streams of events.

Online documentation is available at: http://fsa4streams.readthedocs.org/en/latest/?badge=latest

Ideas for future extensions

  • export to dot (for graphical presentation)
  • toolkit for handling FSAs
    • optimizing
    • merging
  • more built-in matchers?
    • prefix
    • more flexible regexp (i.e. not enforcing ^ and $)
  • converter of regexp to automata
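
As an illustration of the first idea, here is a minimal sketch of a DOT export that works directly on the JSON structure used in the issues on this page. The function name fsa_to_dot is hypothetical and not part of the library; the sketch only assumes the keys that appear in the examples here (states, transitions, condition, target, silent, default_transition, terminal) and does not escape quotes in conditions.

```python
def fsa_to_dot(fsa_dict):
    """Render an FSA description (a dict with a "states" key) as Graphviz DOT.

    Hypothetical helper, not part of fsa4streams.
    """
    lines = ["digraph fsa {", "    rankdir=LR;"]
    # One node per state; terminal states are drawn with a double circle.
    for name, state in sorted(fsa_dict["states"].items()):
        shape = "doublecircle" if state.get("terminal") else "circle"
        lines.append('    "%s" [shape=%s];' % (name, shape))
    # One edge per transition; silent transitions are drawn dashed.
    for name, state in sorted(fsa_dict["states"].items()):
        for tr in state.get("transitions", []):
            style = ", style=dashed" if tr.get("silent") else ""
            lines.append('    "%s" -> "%s" [label="%s"%s];'
                         % (name, tr["target"], tr["condition"], style))
        default = state.get("default_transition")
        if default is not None:
            lines.append('    "%s" -> "%s" [label="*"];'
                         % (name, default["target"]))
    lines.append("}")
    return "\n".join(lines)
```

The resulting string can be saved to a file and rendered with the Graphviz dot command.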

fsa4streams's People

Contributors

pchampin

fsa4streams's Issues

``silent`` is not correctly implemented

There are actually two problems:

  • silent on the first transition traversed by a token is simply ignored;
  • in non-deterministic automata, only the silent attribute of the first considered transition is correctly applied; all other transitions matching the event wrongly "inherit" the attribute of that transition.

For example, when fed with abcd, the FSA below produces two matches, abc and abd, whereas they should be bc and d. If the silent attribute is moved from the transition s0→s2 to s0→s1, the matches become ac and ad, which is equally incorrect.

{
    "allow_overlap": true,
    "states": {
        "start": {
            "transitions": [
                {
                    "condition": "a",
                    "silent": true,
                    "target": "s0"
                }
             ]
         },
         "s0": {
             "transitions": [
                {
                    "condition": "b",
                    "target": "s1"
                },
                {
                    "condition": "b",
                    "silent": true,
                    "target": "s2"
                }
            ]
        },
        "s1": {
            "transitions": [
                {
                    "condition": "c",
                    "target": "success"
                }
           ]
        },    
        "s2": {
            "max_noise": 1,
            "transitions": [
                {
                    "condition": "d",
                    "target": "success"
                }
           ]
        },    
        "success": {
            "terminal": true
        }
    }
}
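
To make the expected behaviour concrete, here is a minimal sketch of a non-deterministic simulator in which silent is read from each traversed transition, including the first one. It is not the library's implementation: the names run and FSA are hypothetical, allow_overlap is always assumed to be true, conditions are compared by plain equality, and terminal states are assumed to have no outgoing transitions.

```python
FSA = {
    "states": {
        "start": {"transitions": [
            {"condition": "a", "silent": True, "target": "s0"}]},
        "s0": {"transitions": [
            {"condition": "b", "target": "s1"},
            {"condition": "b", "silent": True, "target": "s2"}]},
        "s1": {"transitions": [{"condition": "c", "target": "success"}]},
        "s2": {"max_noise": 1,
               "transitions": [{"condition": "d", "target": "success"}]},
        "success": {"terminal": True},
    }
}

def run(fsa, events):
    """Feed events to a tiny non-deterministic simulator; return the matches."""
    states = fsa["states"]
    tokens = []    # each token: (state name, kept events, noise seen in state)
    matches = []
    for event in events:
        tokens.append(("start", [], 0))   # overlap: try a match from every event
        next_tokens = []
        for state_name, kept, noise in tokens:
            state = states[state_name]
            fired = False
            for tr in state.get("transitions", []):
                if tr["condition"] != event:
                    continue
                fired = True
                # 'silent' belongs to the transition being traversed,
                # never to a sibling transition matching the same event.
                new_kept = kept if tr.get("silent") else kept + [event]
                if states[tr["target"]].get("terminal"):
                    matches.append("".join(new_kept))
                else:
                    next_tokens.append((tr["target"], new_kept, 0))
            # No transition fired: tolerate the event as noise if allowed.
            if not fired and noise < state.get("max_noise", 0):
                next_tokens.append((state_name, kept, noise + 1))
        tokens = next_tokens
    return matches

print(run(FSA, "abcd"))  # ['bc', 'd']
```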

Allow terminal states to match immediately

The default behaviour of the FSA is to be greedy, i.e. to yield the longest possible match.

It might be useful to be able to change this behaviour, either locally or globally.

Add an option ``max_noise_ratio``

Rather than limiting the absolute value of noise at the FSA level, it could be useful to limit the noise ratio, i.e. the number of noisy events divided by the number of events taken into account.
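
A minimal sketch of such a check is given here. The function name within_noise_ratio is hypothetical, and the sketch assumes that "events taken into account" means noisy events plus matched events for a given token; the issue does not settle that point.

```python
def within_noise_ratio(noisy_events, matched_events, max_noise_ratio):
    """Return True if the token's noise ratio does not exceed the limit.

    Assumption: the denominator counts both noisy and matched events.
    """
    total = noisy_events + matched_events
    if total == 0:
        return True   # nothing seen yet, so nothing to reject
    return noisy_events / total <= max_noise_ratio
```

With this definition, one noisy event among four events gives a ratio of 0.25, which is accepted by a limit of 0.25 and rejected by a limit of 0.2.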

Add an option ``noisy`` to transitions

For the moment, silent transitions can be used to accept events without keeping them in a match. However, such silent events are not considered as noise (hence they are not taken into account by the global max_noise limit).

It could be interesting to declare a transition as noisy rather than silent, meaning that the event would not be added to the match, but it would be counted as global noise.

Note that it would make no sense to have a transition with both noisy and silent set to true.
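
The difference between the three kinds of transition could be sketched as follows. The function apply_transition is hypothetical, and noisy is the proposed option, not an existing one; the sketch only shows how the match and the global noise counter would be updated.

```python
def apply_transition(transition, event, kept, global_noise):
    """Return the updated (kept, global_noise) after traversing a transition."""
    silent = transition.get("silent", False)
    noisy = transition.get("noisy", False)   # proposed option, not implemented
    if silent and noisy:
        raise ValueError("a transition cannot be both silent and noisy")
    if noisy:
        return kept, global_noise + 1    # dropped from the match, counted as noise
    if silent:
        return kept, global_noise        # dropped from the match, not counted
    return kept + [event], global_noise  # ordinary transition: event is kept
```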

Add an option ``max_time_span``

This could be an option on the FSA itself or on states (or maybe just on terminal states...).

The idea is to drop tokens whenever they span over too long a period of time. Note that a token reaching the max_time_span on a terminal state should then be considered as a match.

Note that, for this option to be very useful, it should be possible to assign a specific timestamp to each event. This could be an optional parameter of the method feed, which would default to [last timestamp+1], and would be expected to only increase.
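
The timestamp handling described above could look like the sketch below. The names Clock, stamp and expired are hypothetical and do not exist in the library; the sketch assumes that the counter starts at 0 and that equal timestamps are tolerated while decreasing ones are rejected.

```python
class Clock:
    """Assign a timestamp to each event, defaulting to last timestamp + 1."""

    def __init__(self):
        self.last = 0   # assumption: the first default timestamp is 1

    def stamp(self, timestamp=None):
        if timestamp is None:
            timestamp = self.last + 1
        elif timestamp < self.last:
            raise ValueError("timestamps must not decrease")
        self.last = timestamp
        return timestamp


def expired(token_start, now, max_time_span):
    """Return True if a token started at token_start spans too long a period."""
    return now - token_start > max_time_span
```

A token for which expired returns True would be dropped, or yielded as a match if it sits on a terminal state.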

Equivalent automata do not behave identically

Consider the following automaton, fed with the stream a b b b:

{
    "states": {
        "start": {
            "transitions": [
                { "condition": "a", "target": "s1" } ] },
        "s1": {
            "terminal": true,
            "transitions": [
                { "condition": "b", "target": "s1" } ] } } }

It will yield a single match, namely a b b b, with or without allow_overlap set.

Now consider the following automaton:

{   
    "states": {
        "start": {
            "transitions": [
                { "condition": "a", "target": "s2" },
                { "condition": "a", "target": "s1" } ] },
        "s1": {
            "transitions": [
                { "condition": "b", "target": "s1" },
                { "condition": "b", "target": "s2" } ] },
        "s2": {
            "terminal": true } } }

It is equivalent to the first one, but will yield different results when fed with the same events.
Namely, with allow_overlap set to false, it will only yield a as a match.
On the other hand, when allow_overlap is set to true,
it will yield four matches a, a b, a b b and a b b b.

The reason is the following.

  • In the first FSA, each time a token reaches state s1 (which is terminal), it does not yield a match immediately, because the state has outgoing transitions; so it waits to see if the next events can get it further. In fact, it then spawns a new token that inhibits its parent; if the child token reaches a longer match, the parent is discarded; on the other hand, if the child token is discarded, the parent yields a match.
  • In the second FSA, the final state has no outgoing transition, so every token reaching it yields a match immediately. This contradicts the intention that automata work greedily, so it is arguably a bug.
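
The greedy behaviour described in the first point can be sketched for the deterministic case as follows. This is a simplified illustration, not the library's token mechanism: the names step and greedy_matches are hypothetical, overlap, noise and non-determinism are ignored, and a pending match plays the role of the inhibited parent token.

```python
def step(states, state, event):
    """Return the target of the first transition matching event, or None."""
    for tr in states[state].get("transitions", []):
        if tr["condition"] == event:
            return tr["target"]
    return None


def greedy_matches(fsa, events):
    """Yield only the longest match reached by each token (no overlap)."""
    states = fsa["states"]
    state, kept, pending = "start", [], None
    matches = []
    for event in events:
        target = step(states, state, event)
        if target is None:
            # The token is stuck: emit the longest match it reached, then
            # restart a fresh token from "start" on the same event.
            if pending is not None:
                matches.append(pending)
            state, kept, pending = "start", [], None
            target = step(states, state, event)
            if target is None:
                continue
        state, kept = target, kept + [event]
        if states[state].get("terminal"):
            pending = kept   # remember it, but keep waiting for a longer match
    if pending is not None:
        matches.append(pending)   # end of stream: flush the pending match
    return matches


FIRST_FSA = {"states": {
    "start": {"transitions": [{"condition": "a", "target": "s1"}]},
    "s1": {"terminal": True,
           "transitions": [{"condition": "b", "target": "s1"}]}}}

print(greedy_matches(FIRST_FSA, "a b b b".split()))  # [['a', 'b', 'b', 'b']]
```

A fix for the second FSA would need the same "wait for a longer match" rule to apply across sibling tokens, which this deterministic sketch does not attempt.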

``default_transition`` on the ``start`` state does not work

The following FSA should match any pair of consecutive events:

{
    "allow_overlap": true,
    "states": {
        "start": {
            "default_transition": {
                "target": "s1"
            }
        },
        "s1": {
            "default_transition": {
                "target": "s2"
            }
        },
        "s2": {
            "terminal": true
        }
    }
}

but it matches nothing, while the following works well:

{
    "allow_overlap": true,
    "states": {
        "start": {
            "transitions": [{
                "condition": ".*", "matcher": "regexp",
                "target": "s1"
            }]
        },
        "s1": {
            "default_transition": {
                "target": "s2"
            }
        },
        "s2": {
            "terminal": true
        }
    }
}
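
The expected behaviour of the first FSA can be sketched with a small simulator in which default_transition is honoured on every state, including start. The names next_state, overlapping_matches and PAIRS are hypothetical; the sketch assumes allow_overlap is true, compares conditions by plain equality, and ignores noise.

```python
def next_state(state, event):
    """Follow the first matching transition, else the default one, else None."""
    for tr in state.get("transitions", []):
        if tr["condition"] == event:
            return tr["target"]
    default = state.get("default_transition")
    return None if default is None else default["target"]


def overlapping_matches(fsa, events):
    states = fsa["states"]
    tokens, matches = [], []
    for event in events:
        tokens.append(("start", []))   # overlap: one new token per event
        survivors = []
        for name, kept in tokens:
            target = next_state(states[name], event)
            if target is None:
                continue               # the token is dropped
            kept = kept + [event]
            if states[target].get("terminal"):
                matches.append(kept)
            else:
                survivors.append((target, kept))
        tokens = survivors
    return matches


PAIRS = {"states": {
    "start": {"default_transition": {"target": "s1"}},
    "s1": {"default_transition": {"target": "s2"}},
    "s2": {"terminal": True}}}

print(overlapping_matches(PAIRS, "xyz"))  # [['x', 'y'], ['y', 'z']]
```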
