pombredanne / aho-corasick-string-replacement

This project forked from averykhoo/aho-corasick-string-replacement

replace multiple strings with multiple other strings in a single pass

Python 100.00%

aho-corasick-string-replacement

replace multiple strings with multiple other strings in a single pass.

uses the Aho-Corasick string search algorithm, but only considers the first longest match.
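The idea can be illustrated with a minimal sketch. This is not the project's code: `build_trie` and `replace_all` are hypothetical names, and the sketch assumes "first longest match" means the leftmost match, extended to the longest pattern starting at that position. For brevity it restarts the trie walk at each position instead of following failure links.

```python
def build_trie(replacements):
    """Build a nested-dict trie; the key None marks the end of a pattern."""
    root = {}
    for pattern, replacement in replacements.items():
        if not pattern:
            raise ValueError("empty patterns are not supported")
        node = root
        for char in pattern:
            node = node.setdefault(char, {})
        node[None] = replacement  # store the replacement at the terminal node
    return root


def replace_all(text, replacements):
    """Replace each leftmost-longest match while scanning left to right."""
    root = build_trie(replacements)
    output = []
    i = 0
    while i < len(text):
        node = root
        match_end = None
        match_value = None
        j = i
        # walk the trie as far as the text allows, remembering the longest match
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if None in node:
                match_end = j
                match_value = node[None]
        if match_end is None:
            # no pattern starts here: copy one character through unchanged
            output.append(text[i])
            i += 1
        else:
            # emit the replacement and skip past the matched text
            output.append(match_value)
            i = match_end
    return "".join(output)
```

Because replaced text is never rescanned, `replace_all("ab", {"a": "b", "b": "a"})` swaps the two letters rather than undoing its own work, which chained calls to `str.replace` cannot do.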

usage

  • process_text <-- replace stuff in a string
  • process_file <-- replace stuff in a text file
  • find_all <-- find stuff in a string
  • to_regex <-- convert the entire trie into a regex
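The signatures of these functions are not documented here, so the following is only a rough sketch of what converting a trie into a regex can look like. `trie_to_regex` and `_node_to_regex` are hypothetical helpers, not the project's `to_regex`; they assume a nested-dict trie in which an empty-string key marks the end of a word.

```python
import re


def trie_to_regex(words):
    """Build a trie from words and return an equivalent regex pattern string."""
    trie = {}
    for word in words:
        node = trie
        for char in word:
            node = node.setdefault(char, {})
        node[""] = {}  # empty-string key marks the end of a word
    return _node_to_regex(trie)


def _node_to_regex(node):
    """Recursively turn one trie node into a regex fragment."""
    is_end = "" in node
    branches = [
        re.escape(char) + _node_to_regex(child)
        for char, child in sorted(node.items())
        if char != ""
    ]
    if not branches:
        return ""
    if len(branches) == 1 and not is_end:
        # a single continuation needs no grouping
        return branches[0]
    pattern = "(?:" + "|".join(branches) + ")"
    # a word may end here, so everything after this point is optional
    return pattern + "?" if is_end else pattern
```

Shared prefixes are factored out, so the resulting pattern tends to be much shorter than a plain `a|b|c` alternation of every word.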

todo

  • make readme
  • neaten and refactor code into multiple files
  • convert trie to DFA by computing suffix/failure links
  • parallel work sharing to make processing faster
  • add code history from 2016
  • split out the tokenizer maybe
  • cleanup and deconflict interfaces
    • list-style interface (with slices)
    • set-style interface (add, remove)
    • dict-style interface (like a set, but with keys)
    • iterator-style interface
    • regex interface (improve regex optimization, more like re.findall/match/etc)
  • __del__
  • disable fromkeys() once created
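For the failure-link item above, a minimal sketch of the textbook Aho-Corasick construction may help. It is not taken from this project: `build_automaton` and `find_matches` are hypothetical names, the state tables are plain lists indexed by state number, and empty patterns are assumed not to occur. Unlike the replacement behaviour described earlier, this sketch reports every match, including overlapping ones.

```python
from collections import deque


def build_automaton(patterns):
    """Return (goto, fail, output) tables for the given patterns."""
    goto = [{}]    # goto[state][char] -> next state
    fail = [0]     # fail[state] -> state of the longest proper suffix in the trie
    output = [[]]  # output[state] -> patterns that end at this state

    # phase 1: build the trie
    for pattern in patterns:
        state = 0
        for char in pattern:
            if char not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][char] = len(goto) - 1
            state = goto[state][char]
        output[state].append(pattern)

    # phase 2: breadth-first traversal to compute failure links
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for char, child in goto[state].items():
            queue.append(child)
            fallback = fail[state]
            while fallback and char not in goto[fallback]:
                fallback = fail[fallback]
            fail[child] = goto[fallback].get(char, 0)
            # inherit patterns that end at the suffix state
            output[child] = output[child] + output[fail[child]]
    return goto, fail, output


def find_matches(text, patterns):
    """Return (start_index, pattern) for every match, scanning text once."""
    goto, fail, output = build_automaton(patterns)
    state = 0
    matches = []
    for index, char in enumerate(text):
        while state and char not in goto[state]:
            state = fail[state]
        state = goto[state].get(char, 0)
        for pattern in output[state]:
            matches.append((index - len(pattern) + 1, pattern))
    return matches
```

With failure links in place the scan never moves backwards in the text, which is what makes a true single pass possible.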
