beringresearch / ivis
Dimensionality reduction in very large datasets using Siamese Networks
Home Page: https://beringresearch.github.io/ivis/
License: Apache License 2.0
Hello,
This issue is filed as part of the JOSS review.
Is your feature request related to a problem? Please describe.
The R package lacks documentation of an application to a real-life dataset.
Describe the solution you'd like
Please add a vignette in the R package demonstrating at least an example application to a single-cell dataset. Basically, the equivalent of the scanpy workflow here.
A convenient way to use the pbmc3k dataset for demonstration purposes is the Bioconductor TENxPBMCData package.
Suggested code:
library(TENxPBMCData)
tenx_pbmc3k <- TENxPBMCData(dataset = "pbmc3k")
Ideally, consider using the vignette (or a separate one) to also give an introduction to the functionality of the R package.
It is not necessary to duplicate information already described in the documentation of the Python package (DRY principle); you may simply include a link to the main page.
Describe alternatives you've considered
A working example of an R workflow could also be included in the documentation of the Python package, although this is probably unnecessarily difficult to maintain.
Ideally, that example would be run and tested for every new release of the Python and R source code.
Additional context
Once you have an R vignette written, you should also consider using pkgdown to automatically create a GitHub website including the full package documentation.
A fitted Ivis instance is not adequately preserved when joblib.dump() is used to save it. Consequently, errors occur when Ivis is used as part of a sklearn.pipeline.Pipeline object with memory != None.
Two examples are provided herein: one with sklearn.pipeline.Pipeline, and another with joblib only (sklearn uses joblib internally in sklearn.pipeline.Pipeline, so the second example may help narrow down the issue).
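Pipeline's memory caching pickles the fitted transformer and unpickles it on reuse. A plausible mechanism for the bug, sketched below with plain pickle and a hypothetical stand-in class (not the actual ivis code), is fitted state that does not survive that round trip:

```python
import pickle


class BrokenTransformer:
    """Hypothetical stand-in for an estimator whose fitted model is lost on pickling."""

    def fit(self, X):
        # Stand-in for a fitted Keras model; lambdas, like Keras models,
        # cannot be pickled directly.
        self.model_ = lambda x: x
        return self

    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop("model_", None)  # the fitted model is silently dropped
        return state

    def transform(self, X):
        if not hasattr(self, "model_"):
            raise RuntimeError("Model was not fitted yet.")
        return [self.model_(x) for x in X]


# A pickle round trip (which is what joblib-based caching amounts to) loses
# the fitted model, so transform() fails even though fit() was called.
restored = pickle.loads(pickle.dumps(BrokenTransformer().fit([1, 2])))
```

After this round trip, `restored.transform([1, 2])` raises, mirroring the `NotFittedError` in the logs below.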
A virtual environment was created specifically for this project, wherein all modules specified in requirements.txt were installed. My setup runs an up-to-date version of Windows 10 (no WSL).
python=3.9.5
ivis=2.0.4
tensorflow=2.5.0
sklearn.pipeline.Pipeline
import tempfile

import ivis
from sklearn import datasets, ensemble, model_selection, pipeline, preprocessing, svm

X, y = datasets.load_iris(return_X_y=True)

pipeline_with_ivis = pipeline.Pipeline([
    ("normalize", preprocessing.MinMaxScaler()),
    ("project", None),
    ("classify", None),
], memory=tempfile.mkdtemp())

parameter_grid = {
    "project": (ivis.Ivis(verbose=0),),
    "project__k": (15,),
    "classify": (ensemble.RandomForestClassifier(), svm.SVC()),
    "classify__random_state": (2021,),
}

grid_search = model_selection.GridSearchCV(
    pipeline_with_ivis, parameter_grid, scoring="accuracy", cv=10, verbose=3,
    return_train_score=True,
).fit(X, y)  # should fail
Fitting 10 folds for each of 2 candidates, totalling 20 fits
[CV 1/10] END classify=RandomForestClassifier(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=1.000, test=1.000) total time= 11.3s
[CV 2/10] END classify=RandomForestClassifier(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=1.000, test=1.000) total time= 4.3s
[CV 3/10] END classify=RandomForestClassifier(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=1.000, test=1.000) total time= 8.6s
[CV 4/10] END classify=RandomForestClassifier(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=1.000, test=1.000) total time= 3.9s
[CV 5/10] END classify=RandomForestClassifier(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=1.000, test=0.800) total time= 6.4s
[CV 6/10] END classify=RandomForestClassifier(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=1.000, test=0.800) total time= 5.8s
[CV 7/10] END classify=RandomForestClassifier(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=1.000, test=1.000) total time= 4.5s
[CV 8/10] END classify=RandomForestClassifier(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=1.000, test=0.800) total time= 5.3s
[CV 9/10] END classify=RandomForestClassifier(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=1.000, test=0.667) total time= 4.3s
[CV 10/10] END classify=RandomForestClassifier(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=1.000, test=0.800) total time= 3.8s
<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\model_selection\_validation.py:696: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\model_selection\_validation.py", line 687, in _score
scores = scorer(estimator, X_test, y_test)
File "<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\metrics\_scorer.py", line 199, in __call__
return self._score(partial(_cached_call, None), estimator, X, y_true,
File "<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\metrics\_scorer.py", line 236, in _score
y_pred = method_caller(estimator, "predict", X)
File "<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\metrics\_scorer.py", line 53, in _cached_call
return getattr(estimator, method)(*args, **kwargs)
File "<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\utils\metaestimators.py", line 120, in <lambda>
out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
File "<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\pipeline.py", line 418, in predict
Xt = transform.transform(Xt)
File "<REPOSITORY_ROOT>\ivis\ivis.py", line 331, in transform
raise NotFittedError("Model was not fitted yet. Call `fit` before calling `transform`.")
sklearn.exceptions.NotFittedError: Model was not fitted yet. Call `fit` before calling `transform`.
warnings.warn(
[CV 1/10] END classify=SVC(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=nan, test=nan) total time= 0.0s
[CV 2/10] END classify=SVC(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=nan, test=nan) total time= 0.0s
[CV 3/10] END classify=SVC(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=nan, test=nan) total time= 0.0s
[CV 4/10] END classify=SVC(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=nan, test=nan) total time= 0.0s
[CV 5/10] END classify=SVC(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=nan, test=nan) total time= 0.0s
[CV 6/10] END classify=SVC(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=nan, test=nan) total time= 0.0s
[CV 7/10] END classify=SVC(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=nan, test=nan) total time= 0.0s
[CV 8/10] END classify=SVC(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=nan, test=nan) total time= 0.0s
[CV 9/10] END classify=SVC(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=nan, test=nan) total time= 0.0s
[identical UserWarning and traceback repeated for the remaining folds]
[CV 10/10] END classify=SVC(), classify__random_state=2021, project=Ivis(verbose=0), project__k=15;, score=(train=nan, test=nan) total time= 0.0s
<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\model_selection\_search.py:922: UserWarning: One or more of the test scores are non-finite: [0.88666667 nan]
warnings.warn(
<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\model_selection\_search.py:922: UserWarning: One or more of the train scores are non-finite: [ 1. nan]
warnings.warn(
The example above used `sklearn.pipeline.Pipeline`; the second example uses `joblib` directly:
import ivis
import joblib
from sklearn import datasets, model_selection
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.33, random_state=42)
model = ivis.Ivis(k=15, batch_size=15, verbose=0).fit(X_train, y_train)
joblib.dump(model, "ivis.pkl")
new_model = joblib.load("ivis.pkl")
model.transform(X_test) # should work
new_model.transform(X_test) # should fail
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "<USER_FOLDER>\AppData\Local\JetBrains\Toolbox\apps\PyCharm-P\ch-0\211.7142.13\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "<USER_FOLDER>\AppData\Local\JetBrains\Toolbox\apps\PyCharm-P\ch-0\211.7142.13\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "<REPOSITORY_ROOT>/playground3.py", line 20, in <module>
new_model.transform(X_test) # should fail
File "<REPOSITORY_ROOT>\venv\lib\site-packages\ivis\ivis.py", line 329, in transform
raise NotFittedError("Model was not fitted yet. Call `fit` before calling `transform`.")
sklearn.exceptions.NotFittedError: Model was not fitted yet. Call `fit` before calling `transform`.
As seen in the example with `sklearn.pipeline.Pipeline` and `sklearn.model_selection.GridSearchCV`, everything runs smoothly when `Ivis` is fitted the first time for all folds. When the model is cached and retrieved for subsequent runs, however, errors happen because at least `Ivis.encoder` is missing. Upon experimentation, it was found that errors happened with the reloaded model even after loading `Ivis.encoder`, indicating that other important attributes were not properly pickled.
Although I never tested these functions, it seems that saving and loading capabilities were already developed for `Ivis` in `Ivis.save_model()` and `Ivis.load_model()`. However, to ensure that `Ivis` is pickleable, it would be ideal to transfer and adapt this functionality to `Ivis.__getstate__()` and `Ivis.__setstate__()` (the latter of which does not exist, AFAIK) so that `pickle` and `joblib` know how to pickle an `Ivis` instance. This would enable its use in `Pipeline` objects with `memory != None`, thus significantly speeding up the hyper-parameter fine-tuning performed by `GridSearchCV`.
Running `ivis.transform` on a pre-built model across different TensorFlow sessions produces different embeddings. Embeddings are consistent within a session, but once the TF session is restarted and the model is reloaded, the embeddings change.
This was introduced in the 2.0 upgrade, as earlier versions of ivis behaved as expected. Additionally, this only seems to affect larger datasets: I don't see it with the Iris dataset, but it is present in a 500k+ row dataset.
Things I checked that seem to be ok:
model loading: model weights and optimizer weights all appear to be consistent between sessions, so this isn't an issue with incorrect initialisation
toggling GPU training: the bug seems to be present when running both CPU and GPU transformations
data normalization: data normalization stays consistent, i.e. input data is not altered between sessions
Hello Folks,
thank you for all the work on this lib. I have a question about reproducibility: Is there a way to set a random seed or random state and get stable results?
I'm trying to achieve this with:
import random
import numpy
random.seed(42)
numpy.random.seed(42)
I'm aware that these are not thread-safe, so this may be the reason for the non-reproducible results. Anyway, is there any way to enforce this?
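For what it's worth, here is a sketch of seeding every RNG that might be involved. Whether ivis itself exposes a seed parameter is not confirmed here, and the TensorFlow global seed is an assumption about what the underlying Keras model consumes:

```python
import os
import random

import numpy as np

def set_global_seeds(seed=42):
    """Seed Python, NumPy, and (if available) TensorFlow RNGs."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)  # assumed relevant for the Keras model inside ivis
    except ImportError:
        pass  # TensorFlow not installed in this environment

set_global_seeds(42)
a = np.random.rand(3)
set_global_seeds(42)
b = np.random.rand(3)
assert np.allclose(a, b)  # identical seeds give identical draws
```

Even with all seeds set, multi-threaded Annoy index construction and nondeterministic GPU ops can still introduce variation, which may be part of what you are seeing.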
To reproduce:
pip3 install tensorflow --upgrade
from sklearn.datasets import load_iris
from ivis import Ivis
X = load_iris()['data']
y = load_iris()['target']
# Supervised and unsupervised modes result in the same error
ivis = Ivis(k=5, batch_size=8).fit(X, y)
ivis.save_model('tmp.ivis')
model = Ivis()
model.load_model('tmp.ivis')
This results in AttributeError: 'Model' object has no attribute '_make_predict_function'
In https://bering-ivis.readthedocs.io/en/latest/oom_datasets.html, for out-of-memory datasets, you say to train on h5 files that exist on disk.
In my case, I can't use h5 files, but I could use a custom generator which yields numpy array batched data.
Is there a way to provide batched data through a custom generator function? Something like Keras' `fit_generator`.
Thank you
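A minimal pure-NumPy sketch of what such a generator might look like; whether the current ivis `fit` accepts one directly is exactly the open question (the shape below mirrors what Keras' `fit_generator` used to consume):

```python
import numpy as np

def batch_generator(X, batch_size):
    """Yield consecutive NumPy batches of at most batch_size rows."""
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size]

X = np.arange(20, dtype=np.float32).reshape(10, 2)
batches = list(batch_generator(X, batch_size=4))
assert [b.shape[0] for b in batches] == [4, 4, 2]
```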
Propose to encode multi-label response variables using sklearn's `MultiLabelBinarizer`.
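A short sketch of the proposed encoding (the subsequent supervised `Ivis(...).fit(X, Y)` step is assumed, not shown):

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Each sample can carry several labels at once.
labels = [{"cell", "immune"}, {"cell"}, {"immune", "rare"}]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)  # binary indicator matrix, one column per class

assert list(mlb.classes_) == ["cell", "immune", "rare"]
assert Y.tolist() == [[1, 1, 0], [1, 0, 0], [0, 1, 1]]
```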
Describe the bug
It seems `chunk_size` in `ivis.data.neighbour_retrieval.knn` is set to 0 for my dataset, which has shape `(6, 784)`.
Stack trace
Building KNN index
100%|█████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 1318.07it/s]
Extracting KNN neighbours
Traceback (most recent call last):
File "main.py", line 17, in <module>
viz.visualize(*embeddings, dpi=150)
File "/Users/ryedida/Desktop/CSC522/userdata_mining/visualization/embeddings.py", line 76, in visualize
x = self._reduce_dims(arg)
File "/Users/ryedida/Desktop/CSC522/userdata_mining/visualization/embeddings.py", line 50, in _reduce_dims
return ivis.fit_transform(arg)
File "/usr/local/lib/python3.8/site-packages/ivis/ivis.py", line 336, in fit_transform
self.fit(X, Y, shuffle_mode)
File "/usr/local/lib/python3.8/site-packages/ivis/ivis.py", line 314, in fit
self._fit(X, Y, shuffle_mode)
File "/usr/local/lib/python3.8/site-packages/ivis/ivis.py", line 179, in _fit
self.neighbour_matrix = AnnoyKnnMatrix.build(X, path=self.annoy_index_path,
File "/usr/local/lib/python3.8/site-packages/ivis/data/neighbour_retrieval/knn.py", line 60, in build
return cls(index, X.shape, path, k, search_k, precompute, include_distances, verbose)
File "/usr/local/lib/python3.8/site-packages/ivis/data/neighbour_retrieval/knn.py", line 47, in __init__
self.precomputed_neighbours = self.get_neighbour_indices()
File "/usr/local/lib/python3.8/site-packages/ivis/data/neighbour_retrieval/knn.py", line 93, in get_neighbour_indices
return extract_knn(
File "/usr/local/lib/python3.8/site-packages/ivis/data/neighbour_retrieval/knn.py", line 189, in extract_knn
for i in range(0, data_shape[0], chunk_size):
ValueError: range() arg 3 must not be zero
Additional context
Python 3.8. `embedding_dims` was set to 2, `k` was set to 3.
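The failure mode is plain integer division; a short sketch (the divisor ivis actually uses is an assumption here, but any chunk count larger than the row count reproduces the zero):

```python
n_rows = 6      # the (6, 784) dataset from the report
n_chunks = 8    # hypothetical number of chunks/workers

chunk_size = n_rows // n_chunks
assert chunk_size == 0  # range(0, n_rows, 0) would raise ValueError

# The usual guard: never let the chunk size drop below 1.
chunk_size = max(1, n_rows // n_chunks)
assert list(range(0, n_rows, chunk_size)) == [0, 1, 2, 3, 4, 5]
```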
I would like to apply ivis to high-dimensional time series/sequence data. Is there a way to achieve this with the current version?
This will enable ivis to be used in sklearn's `GridSearchCV` and to utilise various scoring functions.
In Python, `Ivis.__init__` accepts a `distance: str` keyword argument, which selects from a dictionary a predefined triplet loss function for that distance metric. Currently, one of the ways to provide a custom distance function is to monkeypatch `ivis.nn.losses.get_loss_functions`. Other ways to accomplish the same are even messier from the perspectives of usage and implementation.
The nature of dimensionality reduction, especially when dealing with one-hot-encoded categorical features, sometimes requires custom ways to calculate loss. Under the hood, `ivis` has the ability to enable custom loss functions, but any such offering needs to be implemented in a clean and API-idiomatic manner.
A custom distance function requires its own triplet loss implementation. `Ivis.__init__` could support an additional keyword argument (e.g. `triplet_loss: Callable[..., ...] = ...`) so users can pass their own.
Alternatively, it could simply be passed via the existing `distance` kwarg, with its signature changing to `distance: Union[str, Callable[..., ...]]`.
Another way would be to make the losses dictionary built by `ivis.nn.losses.get_loss_functions` a module-level loss function registry.
Additionally, docs and examples need to be updated on how to correctly implement a custom loss function. With all currently available distance metrics, the triplet loss implementations follow a very similar pattern, so it should not be too daunting to implement.
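As an illustration of the registry idea, here is a minimal sketch. All names (`register_loss`, `LOSSES`) are hypothetical, not the ivis API, and the loss body is a plain-Python scalar placeholder rather than real tensor ops:

```python
LOSSES = {}

def register_loss(name):
    """Decorator registering a triplet-loss implementation under a metric name."""
    def decorator(fn):
        LOSSES[name] = fn
        return fn
    return decorator

@register_loss("manhattan")
def manhattan_triplet_loss(anchor, positive, negative, margin=1.0):
    # Placeholder scalar version; a real implementation would use tensor ops.
    d_pos = sum(abs(a - p) for a, p in zip(anchor, positive))
    d_neg = sum(abs(a - n) for a, n in zip(anchor, negative))
    return max(d_pos - d_neg + margin, 0.0)

assert "manhattan" in LOSSES
assert LOSSES["manhattan"]([0, 0], [1, 1], [3, 3], margin=1.0) == 0.0
```

Ivis could then resolve `distance` strings through this registry, while users register their own metrics without monkeypatching.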
General questions about algorithm design and usage.
Hi,
It is a great new method to learn low-dimensional embeddings of high-dimensional single-cell data. But how does it compare to other scRNA-seq embedding methods? There are many methods for scRNA-seq dimension reduction, for example ZIFA [1], ZINB-WaVE [2], DCA [3], scVI [4], scvis [5], and scScope [6]. Most of them are zero-inflated matrix factorization methods or denoising/zero-inflated autoencoders. Thanks.
[1] Pierson, Emma, and Christopher Yau. "ZIFA: Dimensionality reduction for zero-inflated single-cell gene expression analysis." Genome biology 16.1 (2015): 241.
[2] Risso, Davide, et al. "A general and flexible method for signal extraction from single-cell RNA-seq data." Nature communications 9.1 (2018): 284.
[3] Eraslan, Gökcen, et al. "Single-cell RNA-seq denoising using a deep count autoencoder." Nature communications 10.1 (2019): 390.
[4] Lopez, Romain, et al. "Deep generative modeling for single-cell transcriptomics." Nature methods 15.12 (2018): 1053.
[5] Ding, Jiarui, Anne Condon, and Sohrab P. Shah. "Interpretable dimensionality reduction of single cell transcriptome data with deep generative models." Nature communications 9.1 (2018): 2002.
[6] Deng, Yue, et al. "Scalable analysis of cell-type composition from single-cell transcriptomics using deep recurrent learning." Nature methods 16.4 (2019): 311.
I read the paper for this project and find the method similar to UMAP (both are KNN-based). So, what are the differences between these methods?
Current behaviour when using `distance='cosine'` throws an error:
`AttributeError: module 'tensorflow.keras.losses' has no attribute 'cosine_distance'`
A potential fix is changing the module import to `tf.compat.v1.losses.cosine_distance`.
Hello,
For JOSS review.
I am running into the following issue when running the example R code (given in this README.md) in an R console in my terminal.
The package was successfully installed in an R console in my terminal as described in #28
The main error below seems to be: `UnboundLocalError: local variable 'a' referenced before assignment`.
> library(ivis)
>
> model <- ivis(k = 3, batch_size = 3)
Using TensorFlow backend.
/Users/kevin/miniconda3/envs/ivis/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.6 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.7
return f(*args, **kwds)
>
> X = data.matrix(iris[, 1:4])
> model = model$fit(X)
Building KNN index
/Users/kevin/miniconda3/envs/ivis/lib/python3.7/site-packages/ivis/data/knn.py:15: FutureWarning: The default argument for metric will be removed in future version of Annoy. Please pass metric='angular' explicitly.
index = AnnoyIndex(X.shape[1])
100%|███████████████████████████████████████████████████████████████████████| 150/150 [00:00<00:00, 127667.53it/s]
Extracting KNN from index
/Users/kevin/miniconda3/envs/ivis/lib/python3.7/site-packages/ivis/data/knn.py:92: FutureWarning: The default argument for metric will be removed in future version of Annoy. Please pass metric='angular' explicitly.
self.index = AnnoyIndex(n_dims)
11%|████████▌ | 17/150 [00:00<00:00, 141.19it/s]
WARNING:tensorflow:From /Users/kevin/miniconda3/envs/ivis/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Error in py_call_impl(callable, dots$args, dots$keywords) :
UnboundLocalError: local variable 'a' referenced before assignment
Detailed traceback:
File "/Users/kevin/miniconda3/envs/ivis/lib/python3.7/site-packages/ivis/ivis.py", line 209, in fit
self._fit(X, Y, shuffle_mode)
File "/Users/kevin/miniconda3/envs/ivis/lib/python3.7/site-packages/ivis/ivis.py", line 147, in _fit
triplet_network(base_network(self.model_def, input_size),
File "/Users/kevin/miniconda3/envs/ivis/lib/python3.7/site-packages/ivis/nn/network.py", line 41, in base_network
return default_base_network(input_shape)
File "/Users/kevin/miniconda3/envs/ivis/lib/python3.7/site-packages/ivis/nn/network.py", line 61, in default_base_network
x = AlphaDropout(0.1)(x)
File "/Users/kevin/miniconda3/envs/ivis/lib/python3.7/site-packages/keras/engine/base_layer.py", line 457, in __call__
output = self.call(inputs, **kwargs)
File "/Users/kevin/miniconda3/envs/ivis/lib/python3.7/site-packages/keras/layers/noise.py", line 165, in call
return K
>
> xy = model$transform(X)
Error in py_call_impl(callable, dots$args, dots$keywords) :
AttributeError: 'NoneType' object has no attribute 'predict'
Detailed traceback:
File "/Users/kevin/miniconda3/envs/ivis/lib/python3.7/site-packages/ivis/ivis.py", line 248, in transform
embedding = self.encoder.predict(X, verbose=self.verbose)
Session info reported below
> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin18.5.0 (64-bit)
Running under: macOS Mojave 10.14.5
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ivis_1.1.3
loaded via a namespace (and not attached):
[1] compiler_3.6.0 BiocManager_1.30.4 Matrix_1.2-17 tools_3.6.0
[5] Rcpp_1.0.2 reticulate_1.13 grid_3.6.0 jsonlite_1.6
[9] lattice_0.20-38
Minor bug that I noticed while working on a conda-forge recipe for `ivis`. In the setup script, the `README.md` file is referenced, but it is not being packaged with the source (`README.txt` is).
with open('README.md', encoding='utf-8') as f:
long_description = f.read()
When attempting to use `save_model` after fitting a supervised Ivis instance, I get an error when trying to save. It looks like part of the optimizer cannot be pickled with Python.
Replicate:
import ivis
i = ivis.Ivis(embedding_dims=10, n_epochs_without_progress=5)
i.fit(X, y)
i.save_model("model.ivis")
Traceback (most recent call last):
File "src/ivis_persist.py", line 69, in <module>
ivises[output].save_model(f"models/{output}.ivis")
File "/Users/pbaumgartner/anaconda3/envs/env/lib/python3.7/site-packages/ivis/ivis.py", line 404, in save_model
pkl.dump(self.model_.optimizer, f)
AttributeError: Can't pickle local object 'make_gradient_clipnorm_fn.<locals>.<lambda>'
System Info:
Running `ivis==2.0.0` on macOS with Python 3.7.
Before each epoch, tensorflow fills up a shuffle buffer:
Filling up shuffle buffer (this may take a while): 69 of 79
This is not optimal behaviour for large datasets. Potential solution here: tensorflow/tensorflow#30646 (comment)
Potentially use .toarray() within batch generator.
I'm trying Ivis for dimensionality reduction on Iris dataset. My code is as follows:
from ivis import Ivis
ivis_model = Ivis(embedding_dims=3, k=5, verbose=False, model='hinton')
ivis_data = ivis_model.fit_transform(data.drop(["species"], axis=1).values)
The problem is that each time I run the above code chunk, I get different results. What could cause this instability?
Describe the bug
Because `tensorflow` is listed in `setup.py`, installing `ivis` results in the installation of the `tensorflow` package, which is the CPU-only version of TensorFlow, even if the user already has `tensorflow-gpu` installed. When the user subsequently uses Ivis (or anything else dependent on TensorFlow), the CPU version will be used. To utilize the GPU version of TensorFlow, the user must uninstall `tensorflow` and reinstall `tensorflow-gpu` after installing `ivis`.
To Reproduce
pip install ivis
Expected behavior
The user most likely does not expect the installation of a package that depends on Tensorflow to install the CPU version of Tensorflow when they already have the GPU version installed.
Additional context
This has been discussed in tensorflow/tensorflow#7166. One way around this problem is to remove `tensorflow` from the list of requirements in `setup.py` and include it in `extras_require` (see Edward developer @dustinvtran's comment).
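A sketch of what the `extras_require` layout could look like (the exact dependency lists here are assumptions; the point is only the TensorFlow split). Users would then pick a flavour explicitly, e.g. `pip install ivis[cpu]` or `pip install ivis[gpu]`:

```python
# Fragment of a hypothetical setup.py; these would be passed to
# setuptools.setup(..., install_requires=install_requires,
#                  extras_require=extras_require).
install_requires = ["numpy", "tqdm", "annoy"]  # note: no tensorflow here
extras_require = {
    "cpu": ["tensorflow"],
    "gpu": ["tensorflow-gpu"],
}

assert "tensorflow" not in install_requires
assert extras_require["gpu"] == ["tensorflow-gpu"]
```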
Also part of the JOSS-Review.
Please consider adding automated tests for the R package.
Hello,
How can we get reproducible results with respect to the seed?
Is there an argument, such as an initial state, that we can pass?
Thanks,
Regards
Really excited to compare Ivis to UMAP on a project I am currently working on.
The server I have access to is a Windows 10 machine, with a Python 3.7 Anaconda environment.
Following the install instructions and trying to run the MNIST example, I am seeing the following error: `TypeError: can't pickle annoy.Annoy objects`
I have a dataset consisting of 1200-D concatenated average-pooled FastText vectors (2 separate documents going through 2 FastText models, for a total of 4 × 300-D vectors per example).
This dataset reduces in dimensionality and projects properly when run through UMAP, but when run through Ivis (installed through pip), the loss always tends towards 1 and the embeddings for every example are exactly (well, nearly) the same.
This seems like a bug. I can't provide you the exact data to reproduce with but it is easy to generate some fasttext word vectors and try ivis on them.
Example of what I am seeing for the same data (top is Ivis, bottom is UMAP)
I noticed that when Ivis composes a `sklearn.pipeline.Pipeline` which is passed to `sklearn.model_selection.GridSearchCV` to fine-tune hyper-parameters across all estimators/transformers, and `GridSearchCV` has `n_jobs=-1` (i.e., when executions within `GridSearchCV` are parallel), errors are thrown. This does not happen with `n_jobs=1` (i.e., when the executions within `GridSearchCV` are sequential).
Since `Pipeline` regulates the `n_jobs` parameter globally, and thus does not support parallelizing only specific steps, this problem forces the global use of `n_jobs=1`, which considerably slows down the fine-tuning process by underusing the computational power of the machine on which the script is executed (even in parts where `n_jobs=-1` would work).
A virtual environment was created specifically for this repository, in which all modules described in `requirements.txt` were installed. My setup runs an up-to-date version of Windows 10 (no WSL).
python=3.8.4
ivis=2.0.3
tensorflow=2.5.0
if __name__ == "__main__":
import tempfile
import ivis
from sklearn import datasets, ensemble, model_selection, pipeline, preprocessing
from os import environ
environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
X, y = datasets.load_iris(return_X_y=True)
pipeline_with_ivis = pipeline.Pipeline([
("normalize", preprocessing.MinMaxScaler()),
("project", ivis.Ivis()),
("classify", ensemble.RandomForestClassifier()),
], memory=tempfile.mkdtemp())
parameter_grid = {
"project__k": (15,),
"project__verbose": (True,),
"classify__random_state": (2021,)
}
grid_search = model_selection.GridSearchCV(pipeline_with_ivis, parameter_grid, scoring="accuracy", cv=10, n_jobs=-1,
return_train_score=True, verbose=3).fit(X, y)
<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\model_selection\_validation.py:615: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "<REPOSITORY_ROOT>\ivis\data\neighbour_retrieval\knn.py", line 212, in extract_knn
process.start()
File "C:\Python38\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\Python38\lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "<REPOSITORY_ROOT>\venv\lib\site-packages\joblib\externals\loky\backend\process.py", line 39, in _Popen
return Popen(process_obj)
File "<REPOSITORY_ROOT>\venv\lib\site-packages\joblib\externals\loky\backend\popen_loky_win32.py", line 70, in __init__
child_env.update(process_obj.env)
AttributeError: 'KnnWorker' object has no attribute 'env'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\model_selection\_validation.py", line 598, in _fit_and_score
estimator.fit(X_train, y_train, **fit_params)
File "<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\pipeline.py", line 341, in fit
Xt = self._fit(X, y, **fit_params_steps)
File "<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\pipeline.py", line 303, in _fit
X, fitted_transformer = fit_transform_one_cached(
File "<REPOSITORY_ROOT>\venv\lib\site-packages\joblib\memory.py", line 591, in __call__
return self._cached_call(args, kwargs)[0]
File "<REPOSITORY_ROOT>\venv\lib\site-packages\joblib\memory.py", line 534, in _cached_call
out, metadata = self.call(*args, **kwargs)
File "<REPOSITORY_ROOT>\venv\lib\site-packages\joblib\memory.py", line 761, in call
output = self.func(*args, **kwargs)
File "<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\pipeline.py", line 754, in _fit_transform_one
res = transformer.fit_transform(X, y, **fit_params)
File "<REPOSITORY_ROOT>\ivis\ivis.py", line 350, in fit_transform
self.fit(X, Y, shuffle_mode)
File "<REPOSITORY_ROOT>\ivis\ivis.py", line 328, in fit
self._fit(X, Y, shuffle_mode)
File "<REPOSITORY_ROOT>\ivis\ivis.py", line 190, in _fit
self.neighbour_matrix = AnnoyKnnMatrix.build(X, path=self.annoy_index_path,
File "<REPOSITORY_ROOT>\ivis\data\neighbour_retrieval\knn.py", line 63, in build
return cls(index, X.shape, path, k, search_k, precompute, include_distances, verbose)
File "<REPOSITORY_ROOT>\ivis\data\neighbour_retrieval\knn.py", line 48, in __init__
self.precomputed_neighbours = self.get_neighbour_indices()
File "<REPOSITORY_ROOT>\ivis\data\neighbour_retrieval\knn.py", line 96, in get_neighbour_indices
return extract_knn(
File "<REPOSITORY_ROOT>\ivis\data\neighbour_retrieval\knn.py", line 236, in extract_knn
process.terminate()
File "C:\Python38\lib\multiprocessing\process.py", line 133, in terminate
self._popen.terminate()
AttributeError: 'NoneType' object has no attribute 'terminate'
warnings.warn("Estimator fit failed. The score on this train-test"
[...]
<REPOSITORY_ROOT>\venv\lib\site-packages\sklearn\model_selection\_search.py:922: UserWarning: One or more of the test scores are non-finite: [nan]
warnings.warn(
By coding and playing with the example above, I came to the understanding that, since `sklearn` uses `joblib` and `ivis` uses `multiprocessing`, these modules might not be playing well with each other for some reason.
I would rule out the idea that nested estimators/transformers with parallel routines are the problem: estimators like `sklearn.ensemble.RandomForestClassifier` can be set to `n_jobs=-1` without problems within the `Pipeline` passed to `GridSearchCV`.
I am particularly affected by this issue because I want to employ `ivis` in projects that involve hyper-parameter fine-tuning using cross-validation via `GridSearchCV` with concurrent executions. I attempted to diagnose the problem, but to no avail, which is why I bring this issue to your attention.
Observation: another part of this problem is a design choice that does not adhere to the `sklearn` API guidelines, for which I propose and detail a solution in #95. That issue does not cause the error above, but might cause other errors that could affect the same use scenario (`Pipeline` in `GridSearchCV` running in parallel).
Hello,
For JOSS review.
The installation instructions fail when run in the RStudio environment:
> devtools::install_github("beringresearch/ivis/R-package", force=TRUE)
Downloading GitHub repo beringresearch/ivis@master
✔ checking for file ‘/private/var/folders/cp/8rn2cs_x79zcbp_yb75ychg80000gq/T/Rtmpud6pnU/remotesbe4d59017fdb/beringresearch-ivis-bbccdb7/R-package/DESCRIPTION’ ...
─ preparing ‘ivis’:
✔ checking DESCRIPTION meta-information ...
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ building ‘ivis_1.1.3.tar.gz’
* installing *source* package ‘ivis’ ...
** using staged installation
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘ivis’:
.onLoad failed in loadNamespace() for 'ivis', details:
call: path.expand(path)
error: invalid 'path' argument
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/Library/Frameworks/R.framework/Versions/3.6/Resources/library/ivis’
* restoring previous ‘/Library/Frameworks/R.framework/Versions/3.6/Resources/library/ivis’
Error: Failed to install 'ivis' from GitHub:
(converted from warning) installation of package ‘/var/folders/cp/8rn2cs_x79zcbp_yb75ychg80000gq/T//Rtmpud6pnU/filebe4d71713083/ivis_1.1.3.tar.gz’ had non-zero exit status
However, it does work fine when run in the console (Darwin Kernel Version 18.6.0: Thu Apr 25 23:16:27 PDT 2019; root:xnu-4903.261.4~2/RELEASE_X86_64 x86_64):
> devtools::install_github("beringresearch/ivis/R-package", force=TRUE)
Downloading GitHub repo beringresearch/ivis@master
checking for file ‘/private/var/folders/cp/8rn2cs_x79zcbp_yb75ychg80000gq/T/Rtmpvj2CT3/remotesc3827327cfb8/beringresearch-ivis-bbccdb7/R-package/DESCRIPTION’✔ checking for file ‘/private/var/folders/cp/8rn2cs_x79zcbp_yb75ychg80000gq/T/Rtmpvj2CT3/remotesc3827327cfb8/beringresearch-ivis-bbccdb7/R-package/DESCRIPTION’
─ preparing ‘ivis’:
✔ checking DESCRIPTION meta-information ...
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ building ‘ivis_1.1.3.tar.gz’
* installing *source* package ‘ivis’ ...
** using staged installation
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (ivis)
Moreover, the `ivis` package (installed from the terminal) can be loaded from an R console in a terminal, but throws the following error when loaded in RStudio:
> library(ivis)
Error: package or namespace load failed for ‘ivis’:
.onLoad failed in loadNamespace() for 'ivis', details:
call: path.expand(path)
error: invalid 'path' argument
This is most likely due to `conda` not being on the `PATH` in RStudio:
# RStudio
> system("echo $PATH")
/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/local/ncbi/igblast/bin:/Library/TeX/texbin:/opt/X11/bin:/opt/local/bin
# Console
> system("echo $PATH")
/Users/kevin/miniconda3/bin:/Users/kevin/miniconda3/condabin:/usr/local/opt/imagemagick@6/bin:/Users/kevin/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/ncbi/igblast/bin:/Library/TeX/texbin:/opt/X11/bin
Is there a recommended way to set up an environment to run `ivis` in RStudio, or are users only expected to run it from a terminal R console?
Thanks!
Describe the bug
Setting `embedding_dims` to 1 leads to:
`ValueError: Invalid reduction dimension 1 for input with 1 dimensions. for 'loss/stacked_triplets_loss/Sum' (op: 'Sum') with input shapes: [?], [] and with computed input tensors: input[1] = <1>.`
To Reproduce
import numpy as np
from tensorflow.keras.datasets import mnist
from ivis import Ivis
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
X_train = np.reshape(X_train, (len(X_train), 28 * 28))
X_test = np.reshape(X_test, (len(X_test), 28 * 28))
model = Ivis(embedding_dims=1)
model.fit(X_train, Y_train)
Expected behavior
It should work when reducing to one dimension.
I am benchmarking this method on a cluster that uses a shared file system. The problem is that the `Ivis` class creates a file, `annoy.index`, without first checking whether it exists. This file is overwritten without any warning or error, presumably causing issues for a previously running program.
I see that this could be remedied by setting `annoy_index_path`; however, this limitation is not obvious from the documentation and is likely to cause confusion.
Actually, it appears this cannot be done. That argument tells the program to load from the given file, so there is no option to actually change the name of the index file...
Edit: also, the `annoy.index` file is still created even if I set `build_index_on_disk=False`
Hi,
I am reviewing for JOSS-Reviews.
I think I was able to install the package. Unfortunately, when I run the example code, I run into the following error:
model <- ivis(k = 3)
Error in ivis_object$Ivis(embedding_dims = embedding_dims, k = k, distance = distance, :
attempt to apply non-function
My session info:
> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.4
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ivis_1.2.3
loaded via a namespace (and not attached):
[1] compiler_3.6.0 Matrix_1.2-17 tools_3.6.0 yaml_2.2.0 Rcpp_1.0.2 reticulate_1.13
[7] grid_3.6.0 jsonlite_1.6 lattice_0.20-38
X = data.matrix(iris[:, 1:4])
is not valid R; omit the colon, i.e. data.matrix(iris[, 1:4]).
This is part of JOSS-Review
It would be nice to showcase the computed ivis visualisation of the data in the R package. Consider adding the following code:
library(ggplot2)
# xy holds the two-dimensional ivis embedding
dat <- data.frame(x = xy[, 1],
                  y = xy[, 2],
                  species = iris$Species)
ggplot(dat, aes(x = x, y = y)) + geom_point(aes(color = species)) + theme_classic()
From the README file:
both categorical and continuous features are handled well
How do we handle categorical features? Is one-hot-encoding enough?
In UMAP you can use different distances for one-hot-encoded categorical features (e.g. dice, jaccard etc.) and continuous features, then you perform an "intersection" (see lmcinnes/umap#58).
How can we handle mixed type datasets in ivis? Can we just use it on a dataset with continuous features and one-hot-encoded categorical features mixed together?
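For what it's worth, a common baseline for mixed-type data (my sketch, not an official ivis recipe) is to scale the continuous columns, one-hot encode the categorical ones, and concatenate before fitting:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical mixed-type data: two continuous columns, one categorical.
df = pd.DataFrame({
    "a": [0.1, 0.4, 0.9, 0.2],
    "b": [5.0, 3.2, 7.8, 4.4],
    "color": ["red", "green", "blue", "red"],
})

cont = MinMaxScaler().fit_transform(df[["a", "b"]])       # continuous -> [0, 1]
cats = pd.get_dummies(df["color"]).to_numpy(dtype=float)  # categorical -> one-hot
X = np.hstack([cont, cats])                               # feed X to Ivis(...).fit(X)
print(X.shape)  # (4, 5)
```

Whether a single Euclidean-style distance behaves sensibly on such a concatenation is exactly the question raised above; as far as I know there is no direct ivis equivalent of UMAP's per-type metric intersection.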
Thank you very much
Hi.
I'm trying out Ivis after reading your paper. Nice work and excellent documentation.
Could you clarify what you mean by "margin – The distance that is enforced between points by the triplet loss functions"? (Emphasis added.) This sounds like "all distances are set to this value." From a quick read of your code, it seems this factor is added to the distances during loss calculations.
Also, your examples generally include a call to MinMaxScaler. Is this required by ivis, i.e. does the model make assumptions about the scale of the distances between points, whether in the loss calculations or elsewhere?
Thanks!
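For intuition, here is a generic Euclidean triplet-loss sketch (not ivis's actual implementation, which defines its own loss variants): the margin is an additive slack, not a fixed target distance.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet loss: zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(d_pos - d_neg + margin, 0.0)

a = np.zeros(2)
print(triplet_loss(a, np.array([1.0, 0.0]), np.array([5.0, 0.0])))  # 0.0
print(triplet_loss(a, np.array([1.0, 0.0]), np.array([1.5, 0.0])))  # 0.5
```

So the margin sets how much farther the negative must sit than the positive before a triplet stops contributing to the loss.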
Describe the bug
The install_ivis()
command throws an error if an "ivis" environment already exists because the reticulate::virtualenv_remove()
function is not imported in the NAMESPACE.
To Reproduce
Steps to reproduce the behavior:
> install_ivis()
[...successful installation...]
> install_ivis()
Creating a virtual environment (ivis)
Error in virtualenv_remove("ivis") :
could not find function "virtualenv_remove"
Expected behavior
Running install_ivis()
when it is already installed should not throw an error.
https://github.com/beringresearch/ivis/blob/master/R-package/R/install_ivis.R#L2
Additional context
I can see that the import is declared here.
It is just a matter of roxygenizing the package to update the NAMESPACE file.
Hello,
I'm trying to run the ivis examples (both the simple iris one and the mnist one), and I keep getting this error whenever model fitting is called (running this on Debian). Any thoughts?
In [7]: embeddings = ivis.fit_transform(mnist.data)
Error truncating file: Invalid argument
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
<ipython-input-7-d5f1692c2b85> in <module>
----> 1 embeddings = ivis.fit_transform(mnist.data)
/opt/conda/envs/ivisumap/lib/python3.7/site-packages/ivis/ivis.py in fit_transform(self, X, Y, shuffle_mode)
289 """
290
--> 291 self.fit(X, Y, shuffle_mode)
292 return self.transform(X)
293
/opt/conda/envs/ivisumap/lib/python3.7/site-packages/ivis/ivis.py in fit(self, X, Y, shuffle_mode)
269 """
270
--> 271 self._fit(X, Y, shuffle_mode)
272 return self
273
/opt/conda/envs/ivisumap/lib/python3.7/site-packages/ivis/ivis.py in _fit(self, X, Y, shuffle_mode)
146 print('Building KNN index')
147 build_annoy_index(X, self.annoy_index_path,
--> 148 ntrees=self.ntrees, verbose=self.verbose)
149
150 datagen = generator_from_index(X, Y,
/opt/conda/envs/ivisumap/lib/python3.7/site-packages/ivis/data/knn.py in build_annoy_index(X, path, ntrees, verbose)
28
29 # Build n trees
---> 30 index.build(ntrees)
31 if platform.system() == 'Windows':
32 index.save(path)
Exception: Invalid argument
I was just looking at the hyperparameters section of the Ivis docs (https://bering-ivis.readthedocs.io/en/latest/hyperparameters.html). I'm a little confused about what exactly "observations" means there, so any help would be appreciated! Thanks!
Hi,
The section in the docs about metric learning here talks about a classification_weight parameter. However, the Ivis class does not have such a parameter.
Could anyone explain why that is?
Not a fully-baked feature request, just a directional hunch. I've found the conclusions from this paper Sampling Matters in Deep Embedding Learning pretty intuitive -- (1) the method for choosing negative samples is critical to the overall embedding, maybe more than the specific loss function, and (2) a distance-weighted sampling of negatives had some nice properties during training and better results compared to uniform random sampling or oversampling hard cases.
I'm brand-new to Annoy and not confident about the implementation details or performance changes here, but I suspect that the prebuilt index could be used for both positive and negative sampling. An example: the current approach draws random negatives in sequence and chooses the first index not in a neighbor list. A distance-weighted approach for choosing a negative for each triplet might instead weight each candidate by 1/f(dist(i, j)), where f(dist) could be just 1/dist, 1/sqrt(dist), etc. Annoy gives us dist(i, j) without much of a performance hit. Weighted choice of the candidate negatives puts a (tunable) thumb on the scale for triplets that contain closer/harder-negative matches.
This idea probably does increase some hyperparameter-selection headaches. I think the impactful choices here are the size of the initial set of candidate negatives and (especially) f(dist).
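To make the sampling step concrete, here is a minimal sketch (a hypothetical helper, not ivis code; in practice candidate_ids and dists would come from an Annoy query such as get_nns_by_item with include_distances=True):

```python
import numpy as np

def sample_negative(candidate_ids, dists, f=np.sqrt, rng=None):
    """Pick one negative from a candidate pool, weighting each
    candidate by 1 / f(dist) so closer (harder) negatives are
    chosen more often than under uniform sampling."""
    if rng is None:
        rng = np.random.default_rng()
    weights = 1.0 / f(np.asarray(dists, dtype=float))
    probs = weights / weights.sum()
    return int(rng.choice(candidate_ids, p=probs))

# The candidate at distance 1 should be drawn far more often
# than the one at distance 100.
rng = np.random.default_rng(0)
picks = [sample_negative([7, 8], [1.0, 100.0], rng=rng) for _ in range(1000)]
print(picks.count(7) > picks.count(8))  # True
```

Swapping f (identity, sqrt, log, ...) is exactly the tunable thumb-on-the-scale described above.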
This is part of the JOSS-Review
Please consider adding clear guidelines for third parties wishing to 1) contribute to the software, 2) report issues or problems with the software, and 3) seek support.
This model consistently feels like a magic trick, thanks for contributing!
Bug
I'm running the ivis R package (v1.7.1) (more system details below). I can get model$fit() and model$transform() working just fine and producing substantive results. However, after the R process finishes and returns the fitted model, I'm seeing continued sky-high CPU usage. The R process calling ivis has definitely completed and returned to a command prompt, but in htop I can see the RStudio GUI process (parent of the rsession process) occupying at least two full cores. Some process further down is not stopping when the R process gets the returned value. (Restarting the R session does kill it.)
I don't understand enough of the ivis-through-reticulate toolchain to provide more helpful diagnostics in this first report, but happy to run experiments and document further.
Environment
platform x86_64-apple-darwin15.6.0
arch x86_64
os darwin15.6.0
system x86_64, darwin15.6.0
status
major 3
minor 6.2
year 2019
month 12
day 12
svn rev 77560
language R
version.string R version 3.6.2 (2019-12-12)
nickname Dark and Stormy Night
To reproduce:
from sklearn.datasets import fetch_rcv1
from sklearn.utils import resample
from ivis import Ivis
rcv1 = fetch_rcv1()
rcv1.data.shape
X, y = resample(rcv1.data, rcv1.target, replace=False, n_samples=1000, random_state=1234)
ivis = Ivis(epochs=1)
ivis.fit(X)
embeddings = ivis.transform(X)