gmh5225 / suffix_array

This project forked from justinvanwinkle/suffix_array

License: MIT License

C++ 52.02% Python 42.86% Emacs Lisp 0.42% Makefile 4.69%

Suffix Array

A fork of divsufsort. It builds suffix arrays and supports related operations from Python, such as substring membership tests and occurrence counts.

In [1]: s = 'ab' * 1000000000

In [2]: %timeit 'abababx' in s
1 loops, best of 3: 5.5 s per loop

In [3]: import suffix_array

In [4]: %time sa = suffix_array.SuffixArray(s)
CPU times: user 19.6 s, sys: 1.79 s, total: 21.3 s
Wall time: 21.4 s

In [5]: %timeit 'abababx' in sa
1000000 loops, best of 3: 540 ns per loop

In [6]: s.count('ab')
Out[6]: 1000000000

In [7]: sa.count('ab')
Out[7]: 1000000000

In [8]: %timeit s.count('ab')
1 loops, best of 3: 2.83 s per loop

In [9]: %timeit sa.count('ab')
1000000 loops, best of 3: 577 ns per loop

In [10]: s.count('x')
Out[10]: 0

In [11]: sa.count('x')
Out[11]: 0

In [12]: %timeit s.count('x')
1 loops, best of 3: 1.1 s per loop

In [13]: %timeit sa.count('x')
1000000 loops, best of 3: 309 ns per loop
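The sub-microsecond lookups above are what one would expect from binary search over sorted suffixes. The following is a minimal pure-Python sketch of that idea, not this library's implementation: the function names (`build_suffix_array`, `count_occurrences`, `contains`) are hypothetical, and the construction is a naive sort rather than the divsufsort algorithm, so it is only suitable for small inputs.

```python
def build_suffix_array(text):
    # Naive construction for illustration only: sort the suffix start
    # positions by the suffix text itself. This is far slower than
    # divsufsort and copies every suffix, so keep inputs small.
    return sorted(range(len(text)), key=lambda i: text[i:])


def _bound(text, sa, pattern, upper):
    # Binary search over the sorted suffixes, comparing only the first
    # len(pattern) characters of each suffix against the pattern.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        prefix = text[sa[mid]:sa[mid] + m]
        if prefix < pattern or (upper and prefix == pattern):
            lo = mid + 1
        else:
            hi = mid
    return lo


def count_occurrences(text, sa, pattern):
    # Every suffix that starts with the pattern sits in one contiguous
    # run of the suffix array; the run length is the number of
    # (possibly overlapping) occurrences.
    first = _bound(text, sa, pattern, upper=False)
    last = _bound(text, sa, pattern, upper=True)
    return last - first


def contains(text, sa, pattern):
    return count_occurrences(text, sa, pattern) > 0


text = 'ab' * 10
sa = build_suffix_array(text)
print(contains(text, sa, 'abababx'))    # False
print(count_occurrences(text, sa, 'ab'))  # 10
print(count_occurrences(text, sa, 'x'))   # 0
```

Each query costs a logarithmic number of prefix comparisons in the length of the text, which is why the query time barely depends on how large `s` is once the array has been built.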

This library counts overlapping matches, whereas Python's str.count counts only non-overlapping ones.

In [14]: s.count('ab' * 50)
Out[14]: 20000000

In [15]: sa.count('ab' * 50)
Out[15]: 999999951

In [16]: %timeit s.count('ab' * 50)
1 loops, best of 3: 1.32 s per loop

In [17]: %timeit sa.count('ab' * 50)
100000 loops, best of 3: 6.02 µs per loop
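To see where the two numbers differ on a small input, here is a short sketch that counts overlapping matches with plain `str.find`. The helper name `count_overlapping` is hypothetical and is not part of this library; it scans the whole string, so it illustrates the semantics rather than the speed.

```python
def count_overlapping(text, pattern):
    # Count every start position at which pattern occurs, including
    # positions that overlap an earlier match.
    if not pattern:
        raise ValueError("pattern must be non-empty")
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Advance by one character (not by len(pattern)) so that
        # overlapping matches are found too.
        start = text.find(pattern, start + 1)
    return count


s = 'ab' * 100
print(s.count('ab' * 50))              # 2  (non-overlapping)
print(count_overlapping(s, 'ab' * 50))  # 51 (overlapping)
```

With 100 repeats of 'ab', a 50-repeat pattern can start at any of 100 - 50 + 1 = 51 even offsets, which is the same arithmetic that gives 999999951 for the billion-repeat string above.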

Setup for development:

pip install -r requirements.txt
python setup.py develop

Testing:

py.test

Contributors

art99, justinvanwinkle
