yalchik / vader

This project forked from johanndejong/vader

2.0 2.0 1.0 265 KB

Deep learning for clustering of multivariate short time series with potentially many missing values

License: GNU Lesser General Public License v2.1

Python 99.92% Dockerfile 0.08%

vader's Introduction

Hi there 👋

  • 🔭 I'm currently working on a deep learning model called "VaDER" (https://github.com/yalchik/VaDER) that can cluster patient data.
  • 👯 I'm looking to collaborate on AI projects related to the bio-, med-, pharma-, and chemo- industries.

vader's People

Contributors

cojabi, johanndejong, yalchik

Forkers

cojabi

vader's Issues

vader.cluster method fails with ValueError: array must not contain infs or NaNs

Some jobs fail with the following error:

Job failed: 0a1c51a1-2ce1-423e-92ca-0f04db8cbf3e with err=array must not contain infs or NaNs, Traceback: Traceback (most recent call last):
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/hp_opt/vader_hyperparameters_optimizer.py", line 198, in run_cv_full_job
    result = job.run()
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/hp_opt/job/abstract_optimization_job.py", line 88, in run
    cv_fold_result = self._cv_fold_step(X_train, X_val, W_train, W_val)
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/hp_opt/job/full_optimization_job.py", line 42, in _cv_fold_step
    test_latent_loss) = clustering_func(X_train, X_val, W_train, W_val)
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/hp_opt/job/full_optimization_job.py", line 126, in _single_clustering
    effective_k = len(Counter(vader.cluster(X_train, W_train)))
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/vader.py", line 562, in cluster
    return self._cluster(mu_tilde, mu_c, sigma2_c, phi_c)
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/vader.py", line 425, in _cluster
    p = np.array([f(mu_t, mu[i], sigma2[i], phi[i]) for i in np.arange(mu.shape[0])])
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/vader.py", line 425, in <listcomp>
    p = np.array([f(mu_t, mu[i], sigma2[i], phi[i]) for i in np.arange(mu.shape[0])])
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/vader.py", line 424, in f
    return np.log(self.eps + phi) + np.log(self.eps + multivariate_normal.pdf(mu_t, mean=mu, cov=np.diag(sigma2)))
  File "/home/iyalchyk/vader_venv/lib/python3.7/site-packages/scipy/stats/_multivariate.py", line 525, in pdf
    psd = _PSD(cov, allow_singular=allow_singular)
  File "/home/iyalchyk/vader_venv/lib/python3.7/site-packages/scipy/stats/_multivariate.py", line 158, in __init__
    s, u = scipy.linalg.eigh(M, lower=lower, check_finite=check_finite)
  File "/home/iyalchyk/vader_venv/lib/python3.7/site-packages/scipy/linalg/decomp.py", line 445, in eigh
    a1 = _asarray_validated(a, check_finite=check_finite)
  File "/home/iyalchyk/vader_venv/lib/python3.7/site-packages/scipy/_lib/_util.py", line 263, in _asarray_validated
    a = toarray(a)
  File "/home/iyalchyk/vader_venv/lib/python3.7/site-packages/numpy/lib/function_base.py", line 499, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs

This error might be caused by intermediate calculations, since the input data set does not contain infs or NaNs.
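
One way to narrow down where the non-finite values come from is to check the mixture parameters just before they reach multivariate_normal.pdf. The traceback shows the finiteness check failing on the cov argument, which is built from sigma2, so that looks like the likeliest place to start. A minimal sketch, assuming the parameters are NumPy arrays; report_nonfinite and the sample values are hypothetical and not part of VaDER:

```python
import numpy as np

def report_nonfinite(**arrays):
    """Return {name: number of inf/NaN entries} for every array that has any."""
    report = {}
    for name, arr in arrays.items():
        bad = ~np.isfinite(np.asarray(arr, dtype=float))
        if bad.any():
            report[name] = int(bad.sum())
    return report

# Hypothetical stand-ins for the values passed to vader._cluster
mu_c = np.zeros((2, 3))
sigma2_c = np.array([[1.0, np.inf, 0.5], [0.2, 0.3, np.nan]])
phi_c = np.array([0.5, 0.5])

print(report_nonfinite(mu_c=mu_c, sigma2_c=sigma2_c, phi_c=phi_c))
# {'sigma2_c': 2}
```

Calling such a check on mu_tilde, mu_c, sigma2_c and phi_c before self._cluster(...) would show which of them goes bad first, without changing the clustering itself.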

vader.cluster method fails with numpy.linalg.LinAlgError: singular matrix

Sometimes the optimization jobs fail with the following error:

Job failed: ca88fd3f-f30c-4fcc-9580-a1943282c5ee and job_params_dict={'k': 2, 'n_hidden': [128, 8], 'learning_rate': 0.01, 'batch_size': 64, 'alpha': 1.0}
Traceback (most recent call last):
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/hp_opt/vader_hyperparameters_optimizer.py", line 199, in run_cv_full_job
    result = job.run()
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/hp_opt/job/abstract_optimization_job.py", line 88, in run
    cv_fold_result = self._cv_fold_step(X_train, X_val, W_train, W_val)
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/hp_opt/job/full_optimization_job.py", line 52, in _cv_fold_step
    y_true = vader.cluster(X_val, W_val)
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/vader.py", line 616, in cluster
    return self._cluster(mu_tilde, mu_c, sigma2_c, phi_c)
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/vader.py", line 475, in _cluster
    p = np.array([f(mu_t, mu[i], sigma2[i], phi[i]) for i in np.arange(mu.shape[0])])
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/vader.py", line 475, in <listcomp>
    p = np.array([f(mu_t, mu[i], sigma2[i], phi[i]) for i in np.arange(mu.shape[0])])
  File "/home/iyalchyk/my_repo/VaDER/tensorflow2/vader/vader.py", line 474, in f
    return np.log(self.eps + phi) + np.log(self.eps + multivariate_normal.pdf(mu_t, mean=mu, cov=np.diag(sigma2)))
  File "/home/iyalchyk/vader_venv/lib/python3.7/site-packages/scipy/stats/_multivariate.py", line 525, in pdf
    psd = _PSD(cov, allow_singular=allow_singular)
  File "/home/iyalchyk/vader_venv/lib/python3.7/site-packages/scipy/stats/_multivariate.py", line 165, in __init__
    raise np.linalg.LinAlgError('singular matrix')
numpy.linalg.LinAlgError: singular matrix

The root cause is probably the same as in #4, and Johann's commit 00d2f84 has not fixed the problem.

Johann mentioned that the problem starts with very high variance in some columns of "z", which is calculated during the "pre_fit" stage by the statement "z = self.map_to_latent(self.X, self.W, n_samp=10)".
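
As far as I can tell from the SciPy source, the singularity check in _PSD is relative to the largest eigenvalue of the covariance, so a single very large variance can make the ordinary variances count as zero. Because the covariance here is diagonal, the log-density can also be computed directly from sigma2 without building the matrix at all. The following is only a sketch of that alternative, not the change made in commit 00d2f84; diag_gaussian_logpdf, cluster_log_scores, var_floor and the sample values are hypothetical names and numbers chosen for illustration:

```python
import numpy as np

def diag_gaussian_logpdf(x, mean, var, var_floor=1e-10):
    """Log-density of N(mean, diag(var)) for each row of x.

    Variances are floored at var_floor (an assumed value) so that zeros
    do not produce log(0) or a division by zero. NaNs still propagate.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    mean = np.asarray(mean, dtype=float)
    var = np.maximum(np.asarray(var, dtype=float), var_floor)
    d = mean.shape[0]
    quad = np.sum((x - mean) ** 2 / var, axis=1)
    return -0.5 * (d * np.log(2.0 * np.pi) + np.sum(np.log(var)) + quad)

def cluster_log_scores(mu_t, mu, sigma2, phi, eps=1e-10):
    """Unnormalized log-scores, shape (n_clusters, n_samples)."""
    return np.array([
        np.log(eps + phi[i]) + diag_gaussian_logpdf(mu_t, mu[i], sigma2[i])
        for i in range(mu.shape[0])
    ])

# Hypothetical parameters: one latent dimension has a huge variance
mu_t = np.array([[0.0, 0.0], [1.0, 1.0]])
mu = np.array([[0.0, 0.0], [1.0, 1.0]])
sigma2 = np.array([[1.0, 1e12], [1.0, 1.0]])
phi = np.array([0.5, 0.5])

scores = cluster_log_scores(mu_t, mu, sigma2, phi)
labels = scores.argmax(axis=0)  # most likely cluster per sample
```

Note that this is not numerically identical to the expression in vader.py: there, eps is added to the pdf before taking the log, which bounds the result from below by log(eps), while the sketch returns the actual log-density. It also does not address why the variance in "z" becomes so large in the first place.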
