
saxpy's People

Contributors

nphoff, snim2

saxpy's Issues

[Question] substring search

Thanks for sharing this.
I'm not an expert, so I would like to ask: would it be possible to use your code to search for the top-k most similar substrings?

For instance, given

target = [ sin(x)*2 for x in range(0, 100) ]
query  = [ sin(x)*2 for x in range(2, 5) ]

to retrieve the windows from target that are most similar to query?

I guess I should use s.batch_compare(tStrings,qString) somehow, but I'm not sure how.
Thanks a lot!
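
One way to approach this question without relying on any particular saxpy call is a brute-force scan: slide a window of the query's length over the target, z-normalise both, and keep the k windows with the smallest Euclidean distance. The sketch below is an assumption-laden illustration rather than the library's method; `znorm` and `top_k_windows` are hypothetical helper names, and plain Euclidean distance stands in for a SAX string comparison.

```python
import math

def znorm(values, eps=1e-6):
    """Z-normalise a list; a (near-)constant window maps to all zeros."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std < eps:
        return [0.0] * n
    return [(v - mean) / std for v in values]

def top_k_windows(target, query, k=3):
    """Return the k (distance, start_index) pairs closest to the query."""
    q = znorm(query)
    m = len(q)
    scored = []
    for start in range(len(target) - m + 1):
        window = znorm(target[start:start + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, q)))
        scored.append((dist, start))
    scored.sort()  # smallest distance first, ties broken by start index
    return scored[:k]

target = [math.sin(x) * 2 for x in range(0, 100)]
query = [math.sin(x) * 2 for x in range(2, 5)]
best = top_k_windows(target, query, k=3)
```

Here `best[0]` is the window starting at index 2, which is where the query was cut from. Because each window is z-normalised, the search matches shape rather than absolute level; dropping `znorm` would compare raw values instead.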

About scalingFactor

Hi,

I read the original SAX paper and am confused about the meaning of scalingFactor in your implementation.

My use case has 7 template signals of different lengths, and the signal being compared has yet another length. When I compare that signal with the templates, scalingFactor does not seem to affect the final result, which is the template with the minimum distance.

Would you please give some hints about how to use the library for my case?

Thanks in advance
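
For context, the SAX paper defines MINDIST between two words as sqrt(n/w) times the square root of the summed squared letter distances, where n is the original series length and w the word length. Assuming scalingFactor plays the role of sqrt(n/w) (this is an assumption about the implementation, not something confirmed here), it is a positive constant shared by every comparison, so it changes the size of the distances but not which template is closest. A minimal sketch with hypothetical names (`LETTER_DIST`, `mindist`, `best_template`):

```python
import math

# Letter distances for an alphabet of 3, using the 0.86 value quoted in the
# compareDict elsewhere in these issues: only a/c pairs are non-zero.
LETTER_DIST = {"ac": 0.86, "ca": 0.86}

def mindist(word_a, word_b, n):
    """MINDIST between two equal-length SAX words for series length n."""
    w = len(word_a)
    total = sum(LETTER_DIST.get(p + q, 0.0) ** 2 for p, q in zip(word_a, word_b))
    return math.sqrt(n / w) * math.sqrt(total)

# Hypothetical template words and query word, all with word length 6.
templates = {"t1": "cccaaa", "t2": "aaaacc", "t3": "aaaccc"}
query = "aaaccc"

def best_template(n):
    """Name of the template with the smallest MINDIST to the query."""
    return min(templates, key=lambda name: mindist(templates[name], query, n))

# The winner is the same whatever series length n feeds the scaling factor.
winners = {n: best_template(n) for n in (6, 60, 600)}
```

The scaling only matters when distances computed with different n or w are compared against each other or against a fixed threshold, which is the situation to watch for with templates of unequal length.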

import error

ImportError Traceback (most recent call last)
in ()
----> 1 from saxpy import SAX

ImportError: cannot import name 'SAX'

Missing return in normalize

Hi. In the normalize function, if np.nanstd(X) < self.eps: the code goes and constructs the list res with 0's and NaN's as appropriate. However, it doesn't return res, but falls through to the computation which should only be done if np.nanstd(X) >= self.eps. I think this is a bug. I can submit a patch if you like, but it's so simple, I figured it should just be added. Thanks.

Jeff Becker
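
The fix described above amounts to returning early from the low-variance branch. A minimal standalone sketch, assuming a free function with an `eps` argument in place of the method and `self.eps` (the exact shape of the library's code is not shown in this issue):

```python
import numpy as np

def normalize(x, eps=1e-6):
    """Z-normalise x, ignoring NaNs; near-constant input maps to zeros."""
    X = np.asarray(x, dtype=float)
    std = np.nanstd(X)
    if std < eps:
        # Keep NaNs where they were, put 0 everywhere else, and return
        # immediately so the division below is never reached.
        return np.where(np.isnan(X), np.nan, 0.0)
    return (X - np.nanmean(X)) / std

flat = normalize([5.0, 5.0, float("nan")])
scaled = normalize([1.0, 2.0, 3.0])
```

Without the early return, a flat signal would fall through to a division by a standard deviation close to zero.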

Bug in implementation?

Hi Nathan,

I was trying to use your implementation, but I guess it contains some bugs, as far as I can figure it out.

The problem:

I have a sax_sequence A,
A: aaaaccccbbaa

and a longer sequence C ("sequence" below). Each output line shows the match score between the subsequence and A, then the window indexes, then the subsequence in C:
W: 0.000 (0, 24) bbbbbbbbbbbb
W: 0.000 (2, 26) bbbbbbbbbbbb
W: 0.000 (18, 42) aaaaccbbbbba
W: 0.000 (48, 72) aaaaccccbbaa
W: 0.000 (70, 94) bbbbbbbbbbbb
W: 0.000 (72, 96) bbbbbbbbbbbb
W: 0.000 (74, 98) bbbbbbbbbbbb
W: 0.000 (76, 100) bbbbbbbbbbbb

I think it's a bug that bbbbbbbbbbbb is considered equal to aaaaccccbbaa, no?
The problem is that compareDict does not make sense (e.g. difference(a,b)=0 and difference(b,c)=0),
e.g. print s.compareDict
{'aa': 0, 'ac': 0.86, 'ab': 0, 'ba': 0, 'bb': 0, 'bc': 0, 'cc': 0, 'cb': 0, 'ca': 0.86}
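
For reference, in the SAX paper's definition the letter distance is 0 for identical or adjacent letters and the gap between the enclosing breakpoints otherwise, so with an alphabet of 3 only a/c pairs are non-zero. The sketch below is independent of saxpy (the function names are hypothetical) and derives the breakpoints from the standard normal quantiles; it reproduces the 0.86 value in the dictionary above.

```python
from itertools import product
from statistics import NormalDist

def sax_breakpoints(alphabet_size):
    """Cut points splitting N(0, 1) into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]

def letter_distance_table(alphabet_size):
    """Map each letter pair, e.g. 'ac', to its SAX letter distance."""
    beta = sax_breakpoints(alphabet_size)
    letters = [chr(ord("a") + i) for i in range(alphabet_size)]
    table = {}
    for (i, p), (j, q) in product(enumerate(letters), repeat=2):
        if abs(i - j) <= 1:
            # Identical or neighbouring letters are treated as distance 0.
            table[p + q] = 0.0
        else:
            table[p + q] = beta[max(i, j) - 1] - beta[min(i, j)]
    return table

table = letter_distance_table(3)
```

With only three letters, a word made entirely of b is adjacent to every other letter at every position, so its distance to any word is 0 under this definition. A larger alphabet makes the table more discriminating.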

Full case that triggers bug:

sequence = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 23.0, 73.0, 73.0, 75.0, 30.0, 16.0, 19.0, 27.0, 33.0, 19.0, 5.0, 20.0, 19.0, 13.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 10.0, 18.0, 16.0, 11.0, 30.0, 10.0, 39.0, 12.0, 2.0, 15.0, 16.0, 4.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 19.0, 6.0, 39.0, 27.0, 18.0, 20.0, 38.0, 34.0, 33.0, 10.0, 10.0, 15.0, 10.0, 8.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 9.0, 10.0, 10.0, 35.0, 25.0, 24.0, 18.0, 28.0, 18.0, 16.0, 18.0, 31.0, 10.0, 10.0, 15.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 8.0, 30.0, 25.0, 13.0, 13.0, 28.0, 27.0, 20.0, 13.0, 9.0, 11.0, 5.0, 8.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 11.0, 4.0, 18.0, 26.0, 13.0, 23.0, 16.0, 13.0, 15.0, 12.0, 17.0, 15.0, 24.0, 9.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 16.0, 0.0, 0.0, 10.0, 9.0, 3.0, 27.0, 15.0, 18.0, 23.0, 25.0, 16.0, 12.0, 23.0, 13.0, 16.0, 10.0, 8.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 13.0, 8.0, 28.0, 25.0, 19.0, 15.0, 23.0, 8.0, 23.0, 30.0, 28.0, 20.0, 25.0, 16.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 16.0, 10.0, 27.0, 24.0, 30.0, 27.0, 28.0, 41.0, 31.0, 25.0, 6.0, 25.0, 9.0, 9.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 15.0, 11.0, 25.0, 28.0, 15.0, 15.0, 23.0, 15.0, 23.0, 26.0, 15.0, 17.0, 12.0, 9.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 18.0, 12.0, 36.0, 28.0, 13.0, 21.0, 15.0, 19.0, 33.0, 36.0, 9.0, 6.0, 10.0, 6.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 12.0, 12.0, 42.0, 13.0, 23.0, 23.0, 49.0, 5.0, 6.0, 15.0, 13.0, 13.0, 11.0, 16.0, 2.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 4.0, 4.0, 2.0, 16.0, 
25.0, 17.0, 16.0, 25.0, 18.0, 18.0, 25.0, 17.0, 13.0, 12.0, 4.0, 4.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 11.0, 3.0, 28.0, 20.0, 24.0, 21.0, 21.0, 21.0, 16.0, 32.0, 28.0, 15.0, 18.0, 15.0, 2.0, 11.0, 23.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 17.0, 13.0, 26.0, 15.0, 18.0, 15.0, 3.0, 0.0, 11.0, 19.0, 11.0, 17.0, 12.0, 4.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 18.0, 16.0, 26.0, 15.0, 19.0, 18.0, 20.0, 26.0, 11.0, 12.0, 10.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 12.0, 10.0, 45.0, 20.0, 15.0, 28.0, 20.0, 24.0, 16.0, 19.0, 20.0, 13.0, 19.0, 15.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 9.0, 4.0, 11.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, 2.0, 5.0, 1.0, 15.0, 8.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 15.0, 12.0, 21.0, 25.0, 15.0, 15.0, 26.0, 2.0, 0.0, 2.0, 0.0, 4.0, 12.0, 16.0, 18.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 16.0, 0.0, 10.0, 12.0, 6.0, 20.0, 0.0, 0.0, 1.0, 27.0, 19.0, 25.0, 3.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 10.0, 24.0, 11.0, 25.0, 17.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 9.0, 16.0, 6.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 26.0, 26.0, 17.0, 6.0, 18.0, 17.0, 8.0, 17.0, 4.0, 21.0, 12.0, 16.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 12.0, 0.0, 6.0, 5.0, 16.0, 18.0, 23.0, 32.0, 17.0, 25.0, 5.0, 12.0, 13.0, 0.0, 0.0, 0.0]

The input is assumed to be 30 days of 24 points each.

from saxpy import SAX

motif = sequence[48:72]

s = SAX(12, 3)
(a_sax, a_indexes) = s.to_letter_rep(motif)
print("a_sax: %s" % a_sax)
# Integer division keeps the second argument an int on Python 3 as well.
(sequence_strings, sequence_indexes) = s.sliding_window(sequence, len(sequence) // len(motif))
x3x2ComparisonScores = s.batch_compare(sequence_strings, a_sax)

count = 0.0
threshold = 0.1
print("A:\t\t_\t%s" % a_sax)
# Report every window whose score against the motif falls below the threshold.
for i, score in enumerate(x3x2ComparisonScores):
    if score < threshold:
        print("W: %.3f\t%s\t%s" % (score, sequence_indexes[i], sequence_strings[i]))

Other questions:

The choice of input parameters seems somewhat strange to me. I would expect to pass just the gap size to sliding_window, and the PAA interval size (e.g. 2 for averaging pairs of numbers in the input sequence) to the SAX constructor.

I'm not sure if you still maintain this code, but it would be nice to have a correct implementation in Python! However, if you choose not to maintain it, I think it would be better to remove it, as it does not seem to work properly.

Kind regards!
Len Feremans
