moldyn / mosaic

Correlation-based feature selection of Molecular Dynamics simulations

Home Page: https://moldyn.github.io/MoSAIC/

License: MIT License

Python 100.00%

mosaic's Issues

similarity fit overflow

Hi everyone,

First of all congratulations on all your work on this exciting tool.

I have an input numpy array of shape (5852, 1297) and this is in the format (n_samples, n_features).
If I do:

import numpy as np
import mosaic

d_array = np.load("data")
sim = mosaic.Similarity(metric='correlation')
sim.fit(d_array)

I get:
ValueError: Correlation matrix is not symmetric. This should not occur and is probably caused by an overflow error.

The procedure only works if I reduce the input matrix considerably, down to about 200 features.
The funny thing is that, by mistake, I transposed the matrix, so that d_array had shape (1297, 5852). In that case it worked perfectly, but of course it was conceptually wrong.

I tried to work around the problem by computing the correlation matrix with standard numpy:
R1 = np.corrcoef(d_array.T)
With this, I get the correlation matrix, but if I then feed this to the clustering function I get an error:
AssertionError: False not tri-state boolean.

As suggested by Georg Diez, I checked the format of my input data and it is np.float32.
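
For anyone hitting the same error, here is a minimal NumPy-only sketch of how the dtype and the symmetry of the correlation matrix could be checked before handing the data to MoSAIC. The array below is synthetic stand-in data, not the data from this report, and the names d64, max_asym and corr_sym are just illustrative. Whether MoSAIC accepts the result afterwards is not confirmed here.

```python
import numpy as np

# Synthetic stand-in for the real data: (n_samples, n_features) in float32
rng = np.random.default_rng(42)
d_array = rng.normal(size=(500, 20)).astype(np.float32)

# Cast to double precision before computing correlations
d64 = d_array.astype(np.float64)

# rowvar=False tells NumPy that columns are the features
corr = np.corrcoef(d64, rowvar=False)

# Largest deviation from symmetry; tiny values are plain round-off
max_asym = float(np.max(np.abs(corr - corr.T)))
print("dtype:", d64.dtype, "shape:", corr.shape, "max asymmetry:", max_asym)

# Averaging with the transpose makes the matrix exactly symmetric
corr_sym = 0.5 * (corr + corr.T)
```

If max_asym comes out far above round-off level, or the matrix contains NaN, the input itself is worth inspecting before blaming precision.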

Could you help me with this problem?

Thank you,
Elena

[bug] missing dependency `decorit`

Thx @dieJaegerIn for reporting this bug.

Traceback (most recent call last):
  File "/home/user/anaconda3/bin/mosaic", line 5, in <module>
    from mosaic.__main__ import main
  File "/home/user/anaconda3/lib/python3.9/site-packages/mosaic/__init__.py", line 13, in <module>
    from .umap_similarity import UMAPSimilarity
  File "/home/user/anaconda3/lib/python3.9/site-packages/mosaic/umap_similarity.py", line 16, in <module>
    from decorit import deprecated
ModuleNotFoundError: No module named 'decorit'

When using prettypyplot < 0.8.0, decorit is missing. Either add decorit to the dependencies or require prettypyplot>=0.8.0.
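
A quick way to check whether an environment is affected is to test if the modules can be found at all. This is only a sketch using the standard library; missing_modules is a hypothetical helper name, and the module names are the ones that appear in the traceback and the note above.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Module names taken from the traceback and the note above
required = ["decorit", "prettypyplot"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules are importable.")
```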

default value for resolution_parameter for CPM clustering

Taking the mean/median usually does not result in the desired number of clusters (e.g., in the AMINO example).

"ValueError: Correlation matrix is not symmetric. This should not occur and is probably caused by an overflow error or too low dtype precision."

I have installed MoSAIC into a python 3.8 environment, and have run the following command:

python -m mosaic similarity -i test.dat -o output_similarity --metric correlation -v

My output looks like this:

MoSAIC SIMILARITY
~~~ Initialize similarity class
~~~ Load file test.dat
~~~ Fit input.
/home/austin/miniconda3/envs/mosaic/lib/python3.8/site-packages/numpy/lib/function_base.py:2854: RuntimeWarning: invalid value encountered in divide
  c /= stddev[:, None]
/home/austin/miniconda3/envs/mosaic/lib/python3.8/site-packages/numpy/lib/function_base.py:2855: RuntimeWarning: invalid value encountered in divide
  c /= stddev[None, :]
Traceback (most recent call last):
  File "/home/austin/miniconda3/envs/mosaic/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/austin/miniconda3/envs/mosaic/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/austin/miniconda3/envs/mosaic/lib/python3.8/site-packages/mosaic/__main__.py", line 363, in <module>
    main()
  File "/home/austin/miniconda3/envs/mosaic/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/austin/miniconda3/envs/mosaic/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/austin/miniconda3/envs/mosaic/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/austin/miniconda3/envs/mosaic/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/austin/miniconda3/envs/mosaic/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/austin/miniconda3/envs/mosaic/lib/python3.8/site-packages/mosaic/__main__.py", line 153, in similarity
    sim.fit(X)
  File "/home/austin/miniconda3/envs/mosaic/lib/python3.8/functools.py", line 912, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/home/austin/miniconda3/envs/mosaic/lib/python3.8/site-packages/mosaic/similarity.py", line 208, in _
    corr = _correlation(X)
  File "<@beartype(mosaic._correlation_utils._correlation) at 0x7fdb541b9d30>", line 50, in _correlation
  File "/home/austin/miniconda3/envs/mosaic/lib/python3.8/site-packages/mosaic/_correlation_utils.py", line 109, in _correlation
    raise ValueError(
ValueError: Correlation matrix is not symmetric. This should not occur and is probably caused by an overflow error or too low dtype precision.

I have tried changing the shape of my input test.dat, but the result is always the same. I convert my npy file to dat format using a command like this: np.savetxt('test.dat', npy_array, fmt='%.4f')
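
One possible cause worth ruling out (an assumption, not something confirmed in this thread): the RuntimeWarning about an invalid value in the division by stddev is what NumPy emits when a feature has zero variance. Such a feature produces NaN entries in the correlation matrix, and since NaN never compares equal to NaN, a symmetry check would then fail. Rounding with fmt='%.4f' could in principle turn a nearly constant feature into an exactly constant one. The sketch below uses synthetic data and a hypothetical helper constant_columns to show the effect and one way to drop such features.

```python
import numpy as np


def constant_columns(X):
    """Return indices of features (columns) whose values never change."""
    return np.flatnonzero(X.max(axis=0) == X.min(axis=0))


# Synthetic (n_samples, n_features) data with one constant feature
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
X[:, 2] = 1.0

# Silence the divide warning only for this demonstration
with np.errstate(invalid="ignore", divide="ignore"):
    corr = np.corrcoef(X, rowvar=False)

bad = constant_columns(X)
has_nan = bool(np.isnan(corr).any())
symmetric = bool(np.allclose(corr, corr.T))  # NaN entries make this False
print("constant columns:", bad, "NaN present:", has_nan, "symmetric:", symmetric)

# Dropping the constant features gives a well-defined correlation matrix
X_clean = np.delete(X, bad, axis=1)
corr_clean = np.corrcoef(X_clean, rowvar=False)
```

If constant_columns returns an empty array for the real input, this explanation does not apply and the cause lies elsewhere.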

I am not sure how I can circumvent this issue. I would like to use MoSAIC to reduce my feature set from 2070 to something more reasonable. There is no tutorial file available for determining similarity; the only example file that I see is MoSAIC/example/toy_matrix_paper, which is meant for testing the MoSAIC clustering function rather than the MoSAIC similarity function.

Let me know if there is anything else I can do or provide input files.

Thanks,
Austin

