
oist-ncbc / spykesim


Extended edit similarity measurement for high dimensional discrete-time series signal (e.g., multi-unit spike-train).

Home Page: https://pypi.org/project/spykesim

License: MIT License

Makefile 0.10% Python 98.50% C 1.31% Shell 0.10%
neuroscience spike-trains editdistance neuroinformatics theoretical-neuroscience similarity-measures python

spykesim's People

Contributors

092975, keitaw


spykesim's Issues

Sequence detection from Profiles.

Hi, this is an interesting method. I am trying it on a set of data and I was wondering if you have plans to include detection/extraction of sequences; or do you have any suggestion of the approach?

I have plotted the windows that belong to the same cluster, along with the spikes in those windows, but they do not seem to match well with each other or with the cluster profiles in es.profiles. Do you have any suggestions?

Originally posted by @tuanpham96 in #1 (comment)

Barton-Sternberg always returns the first alignment of the first 2 windows

I noticed that the returned profile always closely resembles one of the first two windows (in temporal order) in a cluster, so I checked the code.

For some reason, barton_sternberg returns mats[i] instead of the final alignment, where i is fixed at 1 and is not changed by the for loop that follows. The returned mats[i] therefore always corresponds to al1, the alignment between windows 1 and 2, so it is NOT representative of the entire cluster. I assume this is a simple coding mistake, but it should be corrected before it affects potential users.

def barton_sternberg(mats_, sim_bp, niter):
    ..........
    i, j = 1, 2 # for test
    dp_max, dp_max_x, dp_max_y, bp, flip = sim_bp(
        mats[i].astype(np.double), mats[j].astype(np.double))
    al1, al2 = clocal_exp_editsim_align_alt(bp, dp_max_x, dp_max_y, mats[i], mats[j], flip)
    mats[i] = al1
    mats[j] = al2
    al = (al1 + al2) / 2
    processed[i] = True
    processed[j] = True
    ..........
    return mats[i]

Dimension mismatch between `mat` in `gen_profile`

I have occasionally run into problems with certain values of min_cluster_size during clustering, which then broke the gen_profile step. There appears to be a dimension mismatch between the matrices when gen_profile runs this line:
https://github.com/KeitaW/spykesim/blob/a18ddc4680f893b20d4c2c214228e5912649d646/spykesim/editsim.pyx#L204

I believe it has to do with how mats are produced in the line below, which does not take the sliding width into account, or assumes that slide=window:
https://github.com/KeitaW/spykesim/blob/a18ddc4680f893b20d4c2c214228e5912649d646/spykesim/editsim.pyx#L193

Is that a bug? If so, can it be fixed with something like this:

for idx in indices:
    mat = self.binarray_csc[:, self.times[idx]:(self.times[idx]+self.window)].toarray()
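
A minimal sketch of the slicing idea above, kept dependency-free by using a list of rows in place of self.binarray_csc. The names window_starts and extract_windows are hypothetical and not part of spykesim; the only point is that slicing [t : t + window] yields equally wide matrices as long as every start index leaves room for a full window.

```python
def window_starts(n_bins, window, slide):
    """Start indices of every window that fits entirely inside n_bins."""
    return list(range(0, n_bins - window + 1, slide))


def extract_windows(binarray, starts, window):
    """Cut fixed-width windows (neurons x window) out of a neurons x n_bins array."""
    return [[row[t:t + window] for row in binarray] for t in starts]


# Toy data: 2 neurons, 10 time bins (hypothetical values).
binarray = [
    [1, 0, 0, 1, 0, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0, 0, 1, 0, 0],
]
starts = window_starts(n_bins=10, window=4, slide=3)
mats = extract_windows(binarray, starts, window=4)
widths = {len(mat[0]) for mat in mats}  # a single width means no mismatch
```

A start index such as 8 would produce a window only 2 bins wide in this toy data, which is one way matrices of different shapes could arise. Whether that is the cause of the mismatch reported here is not confirmed.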

Computing the profile twice

In editsim.pyx:

def gen_profile(self, th_=5, sigma=5):
    ...................................................
    for uidx, mats in zip(uidxs, mats_list):
        profile = regularize_profile(barton_sternberg(mats, self._sim_bp, 2*len(mats)))
        if profile.sum() >= th_:
            self.profiles[uidx] = regularize_profile(barton_sternberg(mats, self._sim_bp, 2*len(mats)))
    return self 

I think you could just use:
    if profile.sum() >= th_:
        self.profiles[uidx] = profile

because you are computing the same thing twice. This would save some time.

Question regarding Ternary operator in `clocal_exp_editsim`

I got the following question from https://github.com/rcojocaru.

"""
I have a short question about the code. In editsim.py --> clocal_exp_editsim(_withbp) you have this:

for col1 in range(nrow):
    for col2 in range(ncol):
        match = 0
        for row in range(nneuron):
            match += mat1[row, col1] * mat2[row, col2]
        match = -10 if match == 0 else match
        ...
        dp[col1+1, col2+1] = max4(
            0,
            down_score,
            right_score,
            dp[col1, col2] + match
        )

Do you remember why you introduced this if clause for the case in which match is 0? I think it can have radical effects on the edit similarity score. For example, even if comparing identical sequences, instead of getting the maximum edit similarity score, the result would be alpha dependent because of this if clause. I would really appreciate your input before modifying core things.
"""
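
The effect described in the question can be reproduced on a toy example. The sketch below is not the spykesim implementation: it uses a constant gap penalty (gap) in place of the library's down_score and right_score terms, which are elided above, and the function name local_editsim and the zero_match_penalty parameter are hypothetical. It compares a small matrix with itself, with and without the "replace a zero match by -10" rule.

```python
def local_editsim(mat1, mat2, gap=1.0, zero_match_penalty=None):
    """Smith-Waterman-style local similarity between two binned spike matrices.

    mat1, mat2: lists of rows (neurons), each a list of time bins.
    gap: constant gap penalty (an assumption made for this sketch).
    zero_match_penalty: if given, replaces a column match of exactly 0.
    """
    nneuron = len(mat1)
    ncol1, ncol2 = len(mat1[0]), len(mat2[0])
    dp = [[0.0] * (ncol2 + 1) for _ in range(ncol1 + 1)]
    best = 0.0
    for col1 in range(ncol1):
        for col2 in range(ncol2):
            # Number of neurons active in both columns.
            match = sum(mat1[r][col1] * mat2[r][col2] for r in range(nneuron))
            if match == 0 and zero_match_penalty is not None:
                match = zero_match_penalty
            dp[col1 + 1][col2 + 1] = max(
                0.0,
                dp[col1][col2 + 1] - gap,   # gap in one sequence
                dp[col1 + 1][col2] - gap,   # gap in the other sequence
                dp[col1][col2] + match,     # align the two columns
            )
            best = max(best, dp[col1 + 1][col2 + 1])
    return best


# Two neurons, three time bins; the middle bin has no spikes.
mat = [
    [1, 0, 1],
    [0, 0, 1],
]
plain = [local_editsim(mat, mat, gap=g) for g in (1.0, 0.25)]
penalized = [local_editsim(mat, mat, gap=g, zero_match_penalty=-10)
             for g in (1.0, 0.25)]
```

In this toy setting the plain score of a matrix against itself equals the total number of coincident spikes regardless of gap, whereas the penalized score is lower and changes with gap, because the empty bin either breaks the alignment or has to be bridged with gaps.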

Clustering on similarity matrix

Hi.

In spykesim/editsim.pyx you perform the clustering directly on the similarity matrix, like this:

def clustering(self, min_cluster_size=5):
    """
    Perform HDBSCAN clustering algorithm on the similarity matrix calculated by gensimmat
    """
    self.clusterer = HDBSCAN(min_cluster_size=min_cluster_size)
    self.cluster_labels = self.clusterer.fit_predict(self.simmat)

Given that self.simmat is a similarity matrix, I think it should first be converted to a distance matrix. The clustering should then be performed using the metric='precomputed' option of HDBSCAN. Something like this (assuming distmat is the distance matrix obtained from self.simmat):
self.clusterer = HDBSCAN(min_cluster_size=min_cluster_size, metric='precomputed')
self.cluster_labels = self.clusterer.fit_predict(self.distmat)

I tried the code in its current state in the tutorial, and I get 4 'valid' clusters (index>=0) and 5 windows classified as noise. If I implement these modifications, I get 3 equal 'valid' clusters and 45 windows classified as noise, which makes more sense to me.

Thanks!
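
The issue does not say how distmat is obtained from self.simmat. One possible conversion, shown here as a dependency-free sketch on nested lists, is to subtract every similarity from the largest similarity in the matrix. The function name similarity_to_distance and the choice of max(sim) - sim are assumptions, not something spykesim provides.

```python
def similarity_to_distance(simmat):
    """Turn a square similarity matrix into a distance matrix.

    Assumption: larger similarity means closer, so distance = max(sim) - sim.
    The result is symmetrized and its diagonal is set to 0.
    """
    n = len(simmat)
    top = max(max(row) for row in simmat)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Average the two triangles in case simmat is not exactly symmetric.
            sim = (simmat[i][j] + simmat[j][i]) / 2
            dist[i][j] = dist[j][i] = top - sim
    return dist


# Toy similarity matrix for three windows (hypothetical values).
simmat = [
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 0.0],
    [1.0, 0.0, 5.0],
]
distmat = similarity_to_distance(simmat)
```

The resulting distmat would take the place of self.distmat in the metric='precomputed' snippet above. Other monotone conversions are equally plausible and may give different clusterings.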

Deprecated module warnings

Modules related to sklearn / six give deprecation warnings upon importing editsim. Are they necessary for general computation in spykesim? Should the lib be updated?

Error:

>>> from spykesim import editsim
/usr/lib/python3.7/site-packages/sklearn/externals/six.py:31: DeprecationWarning: The module is deprecated in version 0.21 and will be removed in version 0.23 since we've dropped support for Python 2.7. Please rely on the official version of six (https://pypi.org/project/six/).
  "(https://pypi.org/project/six/).", DeprecationWarning)
/usr/lib/python3.7/site-packages/sklearn/externals/joblib/__init__.py:15: DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
  warnings.warn(msg, category=DeprecationWarning)

System info:

Python 3.7.4 (default, Jul 16 2019, 07:12:58)
[GCC 9.1.0] on linux
Linux 5.2.9-arch1-1-ARCH #1 SMP PREEMPT x86_64 GNU/Linux
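
Until the imports are updated in the library, a user-side workaround is to silence DeprecationWarning only while the import runs. This is a sketch using the standard warnings module; import_quietly and noisy_import are hypothetical names, and noisy_import merely stands in for `from spykesim import editsim` so that the example runs without spykesim installed.

```python
import warnings


def import_quietly(importer):
    """Call importer() with DeprecationWarning silenced for the duration of the call."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return importer()


def noisy_import():
    """Stand-in for `from spykesim import editsim` (hypothetical)."""
    warnings.warn("sklearn.externals.six is deprecated", DeprecationWarning)
    return "editsim"


module = import_quietly(noisy_import)
```

The filter is restored when the with block exits, so deprecation warnings raised elsewhere in the program are still shown.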

Write test for the similarity calculation.

  • Create a branch for this issue.
  • Copy https://github.com/KeitaW/Chaldea/blob/master/chaldea/edit_sim.jl to this repo.
  • Write tests based on edit_sim.jl.
  • Write a draft function using numpy.
  • Write a faster version using Cython.

Missing dependency? (hdbscan)

I tried to import editsim using the following expression:

from spykesim import editsim

and hdbscan was requested, even though previous dependency checks were successful. Maybe it should be added to the dependency list?

System info:

Python 3.7.4 (default, Jul 16 2019, 07:12:58)
[GCC 9.1.0] on linux
Linux 5.2.9-arch1-1-ARCH #1 SMP PREEMPT x86_64 GNU/Linux
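
A quick way to check for this kind of missing dependency before importing is importlib.util.find_spec from the standard library. The sketch below is only a diagnostic aid; missing_modules is a hypothetical helper, and the list of names passed to it is an example rather than the package's actual dependency list.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# `hdbscan` is the module named in the report; `json` is in the standard library.
missing = missing_modules(["json", "hdbscan"])
if missing:
    print("Install before importing editsim:", ", ".join(missing))
```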
