theislab / scgen

Single cell perturbation prediction

Home Page: https://scgen.readthedocs.io

License: GNU General Public License v3.0

Python 100.00%

transcriptomics deep-learning generative-model bioinformatics single-cell scrna-seq single-cell-genomics

scgen's Introduction

scGen

Introduction

scGen is a generative model to predict single-cell perturbation response across cell types, studies and species (Nature Methods, 2019). scGen is implemented using the scvi-tools framework.

Getting Started

What you can do with scGen:

Train on a dataset with multiple cell types and conditions and predict the perturbation effect on a cell type that you have in only one condition. This scenario extends to multiple species, where you predict the effect in one species using another species or all the others.
Train on a dataset with two conditions (e.g. control and perturbed) and predict on a second dataset with similar genes.
Remove batch effects from labeled data. In this scenario you need to provide cell_type and batch labels to the method. Note that batch_removal does not require all cell types to be present in all datasets (batches). A dataset-specific cell type is preserved as it was.
We assume there are two conditions in your dataset (e.g. control and perturbed). You can train the model with your data and predict the perturbation for the cell type or species of interest.
We recommend using normalized data for training. A simple example of a normalization pipeline using scanpy:

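import scanpy as sc

adata = sc.read(data)  # `data` is the path to your dataset file
sc.pp.normalize_total(adata)  # scale every cell to the same total count
sc.pp.log1p(adata)  # apply the log(1 + x) transform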

We further recommend using highly variable genes (HVG). For most examples in the paper we used the top ~7000 HVG. However, this is optional and depends heavily on your application and available computational power.
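The preprocessing steps above can be sketched in plain Python to show what they compute. This is an illustration only, not the scanpy implementation: the toy count matrix, the gene names, and the `target_sum` of 10,000 are assumptions, and genes are ranked by plain variance as a stand-in. In scanpy itself the selection is normally done with `sc.pp.highly_variable_genes(adata, n_top_genes=...)`, which by default ranks genes by a dispersion-based score rather than raw variance.

```python
import math
from statistics import pvariance


def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell (row) to `target_sum` total counts, then apply log(1 + x)."""
    normalized = []
    for cell in counts:
        total = sum(cell)
        # Cells with no counts stay all-zero instead of dividing by zero.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([math.log1p(value * scale) for value in cell])
    return normalized


def top_variable_genes(matrix, gene_names, n_top):
    """Return the `n_top` gene names with the largest variance across cells."""
    variances = [pvariance(column) for column in zip(*matrix)]
    ranked = sorted(range(len(gene_names)), key=lambda i: variances[i], reverse=True)
    return [gene_names[i] for i in ranked[:n_top]]


# Toy data (hypothetical): 3 cells x 3 genes of raw counts.
counts = [
    [10, 0, 5],
    [0, 0, 5],
    [20, 1, 5],
]
genes = ["geneA", "geneB", "geneC"]

log_norm = normalize_log1p(counts)
selected = top_variable_genes(log_norm, genes, n_top=2)
print(selected)
```

Restricting the matrix to the selected genes before training reduces the input dimension of the model, which is the main reason the choice of HVG count trades off against computational power.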

Installation

Installation with pip

To install the latest release of scGen via pip:

pip install scgen

or install the development version via pip:

pip install git+https://github.com/theislab/scgen.git

Examples

See examples at our documentation site.

Reproducing paper results

To reproduce the paper results, visit here.

References

Lotfollahi, Mohammad, Wolf, F. Alexander, and Theis, Fabian J. "scGen predicts single-cell perturbation responses." Nature Methods, 2019.

scgen's Issues

Batch effect removal with only one cell type

Hello,

Is it possible to remove batch effects when giving all the cells the same label?

Will this prevent the loss function from properly optimizing the parameter values in the neural network?

We have isolated the same cell type from many batches and hope to remove the batch effect to study subtypes.

Thank you in advance for your assistance!

Question about saving the scgen model

Hello,

I am trying to figure out how to save and load the models generated by scgen. I compared saving the model right after creating it using scgen.SCGEN(adata), to saving the model right after creating the model and saving once more after training the model. The resulting prediction seems very similar in both cases, so I am curious whether the model.train() function automatically saves the state of the model? If so, would I just be able to call scgen.SCGEN.load() to retrieve the trained model in both cases? Or am I doing something incorrectly?
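A minimal sketch of the save-after-train pattern, assuming the scvi-tools-based `scgen.SCGEN` follows the usual scvi-tools convention of `model.save(dir_path, overwrite=...)` and `SCGEN.load(dir_path, adata)`, and that `train()` itself writes nothing to disk. The helper name, the stub class, and the directory name are hypothetical and exist only to make the example runnable without scgen installed.

```python
def train_and_save(model, save_dir, **train_kwargs):
    """Train `model`, then persist it explicitly with `save`."""
    model.train(**train_kwargs)
    # Saving happens only here; a model saved before training holds untrained weights.
    model.save(save_dir, overwrite=True)
    return save_dir


class _StubModel:
    """Stand-in exposing the same method names, used only to exercise the helper."""

    def __init__(self):
        self.calls = []

    def train(self, **kwargs):
        self.calls.append(("train", kwargs))

    def save(self, dir_path, overwrite=False):
        self.calls.append(("save", dir_path, overwrite))


stub = _StubModel()
train_and_save(stub, "saved_models/scgen_demo", max_epochs=5)
print(stub.calls)

# With the real library the same helper would be used roughly as follows
# (assumes adata was already registered with scgen.SCGEN.setup_anndata):
#   model = scgen.SCGEN(adata)
#   train_and_save(model, "saved_models/scgen_demo", max_epochs=100)
#   model = scgen.SCGEN.load("saved_models/scgen_demo", adata)
```

If predictions from a model saved before training and one saved after training look alike, it is worth checking that the second save really overwrote the first directory and that the loaded object is the one being used for prediction.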

Thank you!

AttributeError: module 'tensorflow' has no attribute 'placeholder'

Hi guys,

I had to upgrade my tensorflow to re-install diffxpy and this seems to have broken my scGen: I now get AttributeError: module 'tensorflow' has no attribute 'placeholder'.

I'm trying to find a workaround for this, but it seems to be something that scGen will have to fix at some point. Do you have any recommendations?
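Some background that may help here: `tf.placeholder` was removed from the top-level namespace in TensorFlow 2.x and survives only under `tf.compat.v1`. The sketch below shows a version check and, in comments, the generic compat import. It is an illustration, not a confirmed fix for scGen: the traceback further down this page shows scgen's `_vae.py` also calling `tf.contrib.layers.xavier_initializer()`, and `tf.contrib` has no counterpart in 2.x, so pinning TensorFlow to a 1.x release or moving to a newer scGen release is more likely to work. The helper name is hypothetical.

```python
def has_tf1_top_level_api(tf_version):
    """Return True when `tf.placeholder` is expected at the top level (TensorFlow 1.x)."""
    major = int(tf_version.split(".")[0])
    return major < 2


for version in ("1.15.0", "2.3.1"):
    print(version, has_tf1_top_level_api(version))

# Generic pattern for your own TF1-style code on a TF2 install:
#   import tensorflow as tf
#   if not has_tf1_top_level_api(tf.__version__):
#       import tensorflow.compat.v1 as tf
#       tf.disable_v2_behavior()
#   x = tf.placeholder(tf.float32, shape=[None, 10])
```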

Thanks in advance!

Add annotations to corrected object

Algorithm not actually using selected TA indicators

I've been trying to implement some of my own custom TA indicators but couldn't see when they are actually being used.

Is my understanding correct that the selected TA aren't being treated? Or is it used elsewhere?

Thanks,
Yoaz

Please help with the train_pbmc.h5ad

hello,

I want to reproduce the paper's analysis by following this notebook:
https://nbviewer.jupyter.org/github/M0hammadL/scGen_reproducibility/blob/master/Jupyter%20Notebooks/Fig2.ipynb

pbmc = sc.read("../data/train_pbmc.h5ad")

But I cannot find the train_pbmc.h5ad file. Could you help with this problem?

Thanks!!!

batch_removal() function does not return cell names, only gene names, in the normalized matrix

Dear Naghipourfar and M0hammadL,
I encountered another bug in scGen's batch_removal function. I tested this function one month ago and am not sure whether you have fixed it since, so I just want to let you know here.
The batch_removal() function in utils.py returns a normalized matrix with gene names but no cell names. This is no problem for t-SNE or UMAP visualization or for clustering of the normalized matrix, but downstream analysis of these cells needs the cell names.

I tested the program with 2 batches. I fixed this bug by adding an observation column, adata.obs['cell_name'], to keep the cell names. But I think you can do it better.
adata_latent.obs["cell_name"] = adata.obs["cell_name"].tolist()
corrected.obs["cell_name"] = all_shared_ann.obs["cell_name"].tolist()
corrected.obs["cell_name"] = all_shared_ann.obs["cell_name"].tolist() + all_not_shared_ann.obs[
"cell_name"].tolist()
corrected.obs_names = corrected.obs['cell_name']

Thanks,
Best,
Hoa Tran

Wrong path to save the model in the keras model

Hi. It seems the path where the model is saved is wrong. It should be saved under the path self.model_to_use.

scgen/scgen/models/_vae_keras.py

Lines 481 to 483 in 8c6b052

self.vae_model.save(os.path.join("vae.h5"), overwrite=True)
self.encoder_model.save(os.path.join("encoder.h5"), overwrite=True)
self.decoder_model.save(os.path.join("decoder.h5"), overwrite=True)
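The fix suggested in this issue amounts to joining each file name onto the model directory, since `os.path.join` with a single argument just returns that argument and the files therefore land in the current working directory. A small sketch of the path construction follows; the helper name and the example directory are hypothetical, and in the class itself the directory would be `self.model_to_use`.

```python
import os


def model_file_paths(model_dir, names=("vae", "encoder", "decoder")):
    """Build one .h5 path per sub-model inside `model_dir`."""
    return {name: os.path.join(model_dir, f"{name}.h5") for name in names}


paths = model_file_paths("models/batch")
for name, path in paths.items():
    print(name, path)

# Inside the class, each save call would then take the joined path, e.g.:
#   self.vae_model.save(os.path.join(self.model_to_use, "vae.h5"), overwrite=True)
```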

Is the cell type label a must?

Hi,

I am confused about why scgen needs to be fed data with cell type labels. Does that mean scgen is a supervised learning method? I hope to hear from you. Thank you.

Example scGen issues?

Hi, I am having issues getting the Kang example to work. I am currently using scGen 1.1.5 with Python 3.6 in PyCharm on Windows 10, and when I run scGen, I find differences between my results and what is shown in the Jupyter Notebook. I used the same code as the Notebook. Here are the graphs I get when I run the example. Do you know what would cause this? Thank you so much for your help!

Edit: I realize that, because this is a machine learning model, runs will not be identical, but after rerunning the example I am still not able to get the same degree of spread in the predicted cells as the example in the Notebook. Thanks!

predict() argument missing for 'cell_type_key' and 'condition_key'

Hi guys
I am not sure what arguments to pass to 'cell_type_key' and 'condition_key' following this error:

>>> pred, delta = scg.predict(adata=train_new, adata_to_predict=unperturbed_cd4t,
...                           conditions={"ctrl": "control", "stim": "stimulated"})
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: predict() missing 2 required positional arguments: 'cell_type_key' and 'condition_key'
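The error names the two missing arguments. A plausible reading, stated here as an assumption, is that each one should be the name of the `adata.obs` column holding the cell type labels and the condition labels respectively. The sketch below assembles the keyword arguments and checks that both columns exist before the call; the helper and the column names `"cell_type"` and `"condition"` are hypothetical and should be replaced with the columns in your own data.

```python
def build_predict_kwargs(obs_columns, cell_type_key, condition_key, conditions):
    """Check that both keys name existing obs columns and collect the predict() arguments."""
    missing = [key for key in (cell_type_key, condition_key) if key not in obs_columns]
    if missing:
        raise KeyError(f"columns not found in adata.obs: {missing}")
    return {
        "cell_type_key": cell_type_key,
        "condition_key": condition_key,
        "conditions": conditions,
    }


kwargs = build_predict_kwargs(
    obs_columns=["cell_type", "condition"],  # e.g. list(train_new.obs.columns)
    cell_type_key="cell_type",
    condition_key="condition",
    conditions={"ctrl": "control", "stim": "stimulated"},
)
print(kwargs)

# The call from the question would then become:
#   pred, delta = scg.predict(adata=train_new, adata_to_predict=unperturbed_cd4t, **kwargs)
```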

Thanks

scgen running on GPU?

Hi,

I wonder whether it is possible to run scgen on a GPU and, if so, whether I need to set any specific parameters for that.

Thanks,
Veronika

missing dependencies

keras seems to be missing when installing scgen for the first time

How to save scgen trained model?

Hello. How can I save the scgen objects and the scgen model? I have run scgen for 30 hours three times because the Jupyter notebook crashed after training, during prediction.

I tried using the pickle module to save the model (hoping that I could just load it again at any time instead of re-running scgen many times), but it doesn't seem to work and returns an error. Please advise. Thank you very much!

Update LossRecorder return

No longer necessary to have the kl_global part as we have the appropriate default in scvi-tools

scgen/scgen/_scgenvae.py

Line 131 in 4d633da

return LossRecorder(loss, rl, kld, kl_global=0.0)

batch_removal AssertionError

Hi,
I have a similar issue to #30 with batch_removal. The training of the network works, but running batch_removal throws an error:

adata = sc.read(path+"Aggr.adata.h5ad", cache=True)
network = scgen.VAEArithKeras(x_dimension= adata.shape[1], model_path="./models/batch")
network.train(train_data=adata, n_epochs=100)
corrected_adata =  scgen.batch_removal(network=network, adata=adata, batch_key="batch", cell_label_key="cell_type")

 ---------------------------------------------------------------------------
 AssertionError                            Traceback (most recent call last)
<ipython-input-20-83ab6f5c386a> in <module>
      1 t = time.process_time()
----> 2 corrected_adata =  scgen.batch_removal(network=network, adata=adata, batch_key="batch", cell_label_key="cell_type")
      3 elapsed_time = time.process_time() - t

~/anaconda3/lib/python3.7/site-packages/scgen/models/util.py in batch_removal(network, adata, batch_key, cell_label_key)
    287             temp_cell[batch_ind[study]].X = batch_list[study].X
    288         shared_ct.append(temp_cell)
--> 289     all_shared_ann = anndata.AnnData.concatenate(*shared_ct, batch_key="concat_batch", index_unique=None)
    290     if "concat_batch" in all_shared_ann.obs.columns:
    291         del all_shared_ann.obs["concat_batch"]

~/anaconda3/lib/python3.7/site-packages/anndata/_core/anndata.py in concatenate(self, join, batch_key, batch_categories, uns_merge, index_unique, fill_value, *adatas)
   1703             uns_merge=uns_merge,
   1704             fill_value=fill_value,
-> 1705             index_unique=index_unique,
   1706         )
   1707 

~/anaconda3/lib/python3.7/site-packages/anndata/_core/merge.py in concat(adatas, join, batch_key, batch_categories, uns_merge, index_unique, fill_value)
    470         # Current behaviour is mostly for backwards compat. It's like make_names_unique, but
    471         # unfortunately the behaviour is different.
--> 472         partial(merge_outer, batch_keys=batch_categories, merge=merge_same),
    473     )
    474 

~/anaconda3/lib/python3.7/site-packages/anndata/_core/merge.py in merge_dataframes(dfs, new_index, merge_strategy)
    403 
    404 def merge_dataframes(dfs, new_index, merge_strategy=merge_unique):
--> 405     dfs = [df.reindex(index=new_index) for df in dfs]
    406     # New dataframe with all shared data
    407     new_df = pd.DataFrame(merge_strategy(dfs), index=new_index)

~/anaconda3/lib/python3.7/site-packages/anndata/_core/merge.py in <listcomp>(.0)
    403 
    404 def merge_dataframes(dfs, new_index, merge_strategy=merge_unique):
--> 405     dfs = [df.reindex(index=new_index) for df in dfs]
    406     # New dataframe with all shared data
    407     new_df = pd.DataFrame(merge_strategy(dfs), index=new_index)

~/anaconda3/lib/python3.7/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    225         @wraps(func)
    226         def wrapper(*args, **kwargs) -> Callable[..., Any]:
--> 227             return func(*args, **kwargs)
    228 
    229         kind = inspect.Parameter.POSITIONAL_OR_KEYWORD

~/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in reindex(self, *args, **kwargs)
   3854         kwargs.pop("axis", None)
   3855         kwargs.pop("labels", None)
-> 3856         return self._ensure_type(super().reindex(**kwargs))
   3857 
   3858     def drop(

~/anaconda3/lib/python3.7/site-packages/pandas/core/base.py in _ensure_type(self, obj)
     91         Used by type checkers.
     92         """
---> 93         assert isinstance(obj, type(self)), type(obj)
     94         return obj
     95 

AssertionError: <class 'pandas.core.frame.DataFrame'>

I will be very grateful for some help!

anndata==0.7.3
scanpy==1.5.2.dev7+ge33a2f33
scgen===1.1.5.dev2-3004bc0
scipy==1.4.1
scikit-learn==0.22.1
pandas==1.0.1

Installation of current master branch fails

pip install git+https://github.com/theislab/scgen.git

Collecting git+https://github.com/theislab/scgen.git
  Cloning https://github.com/theislab/scgen.git to /tmp/pip-req-build-r42psd9m
  Running command git clone -q https://github.com/theislab/scgen.git /tmp/pip-req-build-r42psd9m
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
    Preparing wheel metadata ... error
    ERROR: Command errored out with exit status 1:
     command: /home/user/software/vens/scgen/bin/python /home/user/software/vens/scgen/lib64/python3.6/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmpvl4em11d
         cwd: /tmp/pip-req-build-r42psd9m
    Complete output (31 lines):
    Using TensorFlow backend.
    Traceback (most recent call last):
      File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/pip/_vendor/pep517/_in_process.py", line 207, in <module>
        main()
      File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/pip/_vendor/pep517/_in_process.py", line 197, in main
        json_out['return_val'] = hook(**hook_input['kwargs'])
      File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/pip/_vendor/pep517/_in_process.py", line 69, in prepare_metadata_for_build_wheel
        return hook(metadata_directory, config_settings)
      File "/tmp/pip-build-env-8ypsywiv/overlay/lib/python3.6/site-packages/flit/buildapi.py", line 27, in prepare_metadata_for_build_wheel
        metadata = make_metadata(module, ini_info)
      File "/tmp/pip-build-env-8ypsywiv/overlay/lib/python3.6/site-packages/flit/common.py", line 302, in make_metadata
        md_dict.update(get_info_from_module(module))
      File "/tmp/pip-build-env-8ypsywiv/overlay/lib/python3.6/site-packages/flit/common.py", line 113, in get_info_from_module
        docstring, version = get_docstring_and_version_via_import(target)
      File "/tmp/pip-build-env-8ypsywiv/overlay/lib/python3.6/site-packages/flit/common.py", line 97, in get_docstring_and_version_via_import
        m = sl.load_module()
      File "<frozen importlib._bootstrap_external>", line 399, in _check_name_wrapper
      File "<frozen importlib._bootstrap_external>", line 823, in load_module
      File "<frozen importlib._bootstrap_external>", line 682, in load_module
      File "<frozen importlib._bootstrap>", line 265, in _load_module_shim
      File "<frozen importlib._bootstrap>", line 684, in _load
      File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 678, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/tmp/pip-req-build-r42psd9m/scgen/__init__.py", line 3, in <module>
        from .models import *
      File "/tmp/pip-req-build-r42psd9m/scgen/models/__init__.py", line 1, in <module>
        from ._vae_keras import VAEArithKeras
      File "/tmp/pip-req-build-r42psd9m/scgen/models/_vae_keras.py", line 13, in <module>
        from reptrvae.utils import balancer, extractor, shuffle_data, remove_sparsity
    ModuleNotFoundError: No module named 'reptrvae'
    ----------------------------------------
ERROR: Command errored out with exit status 1: /home/user/software/vens/scgen/bin/python /home/user/software/vens/scgen/lib64/python3.6/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmpvl4em11d Check the logs for full command output.

error at scgen.VAEArith()

I created a conda environment for scgen and followed the installation instructions. It seemed successful.
In Kang's example, I got this error message at
scg = scgen.VAEArith(x_dimension= train.shape[1], model_path="./models/test" )

WARNING:tensorflow:Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f38cc5ae410>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f38cc5ae410>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING: Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f38cc5ae410>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f38cc5ae410>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method BatchNormalization.call of <tensorflow.python.layers.normalization.BatchNormalization object at 0x7f38cc5ae410>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BatchNormalization.call of <tensorflow.python.layers.normalization.BatchNormalization object at 0x7f38cc5ae410>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING: Entity <bound method BatchNormalization.call of <tensorflow.python.layers.normalization.BatchNormalization object at 0x7f38cc5ae410>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BatchNormalization.call of <tensorflow.python.layers.normalization.BatchNormalization object at 0x7f38cc5ae410>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method Dropout.call of <tensorflow.python.layers.core.Dropout object at 0x7f38cc5ae410>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dropout.call of <tensorflow.python.layers.core.Dropout object at 0x7f38cc5ae410>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING: Entity <bound method Dropout.call of <tensorflow.python.layers.core.Dropout object at 0x7f38cc5ae410>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dropout.call of <tensorflow.python.layers.core.Dropout object at 0x7f38cc5ae410>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f38d06c9090>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f38d06c9090>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING: Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f38d06c9090>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f38d06c9090>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method BatchNormalization.call of <tensorflow.python.layers.normalization.BatchNormalization object at 0x7f38d06c9090>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BatchNormalization.call of <tensorflow.python.layers.normalization.BatchNormalization object at 0x7f38d06c9090>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING: Entity <bound method BatchNormalization.call of <tensorflow.python.layers.normalization.BatchNormalization object at 0x7f38d06c9090>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BatchNormalization.call of <tensorflow.python.layers.normalization.BatchNormalization object at 0x7f38d06c9090>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method Dropout.call of <tensorflow.python.layers.core.Dropout object at 0x7f38d06c9090>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dropout.call of <tensorflow.python.layers.core.Dropout object at 0x7f38d06c9090>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING: Entity <bound method Dropout.call of <tensorflow.python.layers.core.Dropout object at 0x7f38d06c9090>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dropout.call of <tensorflow.python.layers.core.Dropout object at 0x7f38d06c9090>>: AssertionError: Bad argument number for Name: 3, expecting 4

ValueError Traceback (most recent call last)
in
----> 1 scg = scgen.VAEArith(x_dimension= train.shape[1], model_path="./models/test" )

~/.local/lib/python3.7/site-packages/scgen/models/_vae.py in __init__(self, x_dimension, z_dimension, **kwargs)
44 self.init_w = tf.contrib.layers.xavier_initializer()
45 self._create_network()
---> 46 self._loss_function()
47 self.sess = tf.Session()
48 self.saver = tf.train.Saver(max_to_keep=1)

~/.local/lib/python3.7/site-packages/scgen/models/_vae.py in _loss_function(self)
156 self.vae_loss = tf.reduce_mean(recon_loss + self.alpha * kl_loss)
157 with tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS)):
--> 158 self.solver = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.vae_loss)
159
160 def to_latent(self, data):

~/.local/lib/python3.7/site-packages/tensorflow/python/training/optimizer.py in minimize(self, loss, global_step, var_list, gate_gradients, aggregation_method, colocate_gradients_with_ops, name, grad_loss)
411
412 return self.apply_gradients(grads_and_vars, global_step=global_step,
--> 413 name=name)
414
415 def compute_gradients(self, loss, var_list=None,

~/.local/lib/python3.7/site-packages/tensorflow/python/training/optimizer.py in apply_gradients(self, grads_and_vars, global_step, name)
595 ([str(v) for _, v, _ in converted_grads_and_vars],))
596 with ops.init_scope():
--> 597 self._create_slots(var_list)
598 update_ops = []
599 with ops.name_scope(name, self._name) as name:

~/.local/lib/python3.7/site-packages/tensorflow/python/training/adam.py in _create_slots(self, var_list)
129 # Create slots for the first and second moments.
130 for v in var_list:
--> 131 self._zeros_slot(v, "m", self._name)
132 self._zeros_slot(v, "v", self._name)
133

~/.local/lib/python3.7/site-packages/tensorflow/python/training/optimizer.py in _zeros_slot(self, var, slot_name, op_name)
1153 named_slots = self._slot_dict(slot_name)
1154 if _var_key(var) not in named_slots:
-> 1155 new_slot_variable = slot_creator.create_zeros_slot(var, op_name)
1156 self._restore_slot_variable(
1157 slot_name=slot_name, variable=var,

~/.local/lib/python3.7/site-packages/tensorflow/python/training/slot_creator.py in create_zeros_slot(primary, name, dtype, colocate_with_primary)
188 return create_slot_with_initializer(
189 primary, initializer, slot_shape, dtype, name,
--> 190 colocate_with_primary=colocate_with_primary)
191 else:
192 if isinstance(primary, variables.Variable):

~/.local/lib/python3.7/site-packages/tensorflow/python/training/slot_creator.py in create_slot_with_initializer(primary, initializer, shape, dtype, name, colocate_with_primary)
162 with distribution_strategy.extended.colocate_vars_with(primary):
163 return _create_slot_var(primary, initializer, "", validate_shape, shape,
--> 164 dtype)
165 else:
166 return _create_slot_var(primary, initializer, "", validate_shape, shape,

~/.local/lib/python3.7/site-packages/tensorflow/python/training/slot_creator.py in _create_slot_var(primary, val, scope, validate_shape, shape, dtype)
72 shape=shape,
73 dtype=dtype,
---> 74 validate_shape=validate_shape)
75 variable_scope.get_variable_scope().set_partitioner(current_partitioner)
76

~/.local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py in get_variable(name, shape, dtype, initializer, regularizer, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter, constraint, synchronization, aggregation)
1494 constraint=constraint,
1495 synchronization=synchronization,
-> 1496 aggregation=aggregation)
1497
1498

~/.local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py in get_variable(self, var_store, name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter, constraint, synchronization, aggregation)
1237 constraint=constraint,
1238 synchronization=synchronization,
-> 1239 aggregation=aggregation)
1240
1241 def _get_partitioned_variable(self,

~/.local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py in get_variable(self, name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter, constraint, synchronization, aggregation)
560 constraint=constraint,
561 synchronization=synchronization,
--> 562 aggregation=aggregation)
563
564 def _get_partitioned_variable(self,

~/.local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py in _true_getter(name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, constraint, synchronization, aggregation)
512 constraint=constraint,
513 synchronization=synchronization,
--> 514 aggregation=aggregation)
515
516 synchronization, aggregation, trainable = (

~/.local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py in _get_single_variable(self, name, shape, dtype, initializer, regularizer, partition_info, reuse, trainable, collections, caching_device, validate_shape, use_resource, constraint, synchronization, aggregation)
862 tb = [x for x in tb if "tensorflow/python" not in x[0]][:5]
863 raise ValueError("%s Originally defined at:\n\n%s" %
--> 864 (err_msg, "".join(traceback.format_list(tb))))
865 found_var = self._vars[name]
866 if not shape.is_compatible_with(found_var.get_shape()):

ValueError: Variable encoder/dense/kernel/Adam/ already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

File "/.local/lib/python3.7/site-packages/scgen/models/_vae.py", line 158, in _loss_function
self.solver = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.vae_loss)
File "/.local/lib/python3.7/site-packages/scgen/models/_vae.py", line 46, in __init__
self._loss_function()
File "", line 1, in
scg = scgen.VAEArith(x_dimension= train.shape[1], model_path="./models/test" )
File "<...>/miniconda3/envs/scgen/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<...>/miniconda3/envs/scgen/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3249, in run_ast_nodes
if (await self.run_code(code, result, async=asy)):

Give tutorials an adapted first cell from the scvi-tools tutorials so they run in Colab

See first cell here: https://docs.scvi-tools.org/en/stable/user_guide/notebooks/api_overview.html

Error in "pred, delta" part. KeyError: 'condition'

Hello. I encountered the error shown below. Does it mean that I cannot use h5ad data unless the condition column in my data is literally named "condition"? The experimental condition label in my data is "response", not "condition". Or does the error mean something else?

SCRIPT:

pred, delta = scg.predict(adata=train_new, adata_to_predict=NR_CD4T,
                          conditions={"ctrl": "NR", "stim": "R"}, cell_type_key="celltype", condition_key="response")

ERROR:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/anaconda3/envs/scgen_test/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2890             try:
-> 2891                 return self._engine.get_loc(casted_key)
   2892             except KeyError as err:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'condition'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-12-f1fd2259e7a8> in <module>
      1 pred, delta = scg.predict(adata=train_new, adata_to_predict=NR_CD4T,
----> 2                           conditions={"ctrl": "NR", "stim": "R"}, cell_type_key="celltype", condition_key="response")

~/anaconda3/envs/scgen_test/lib/python3.7/site-packages/scgen/models/_vae_keras.py in predict(self, adata, conditions, cell_type_key, condition_key, adata_to_predict, celltype_to_predict, obs_key)
    320         """
    321         if obs_key == "all":
--> 322             ctrl_x = adata[adata.obs["condition"] == conditions["ctrl"], :]
    323             stim_x = adata[adata.obs["condition"] == conditions["stim"], :]
    324             ctrl_x = balancer(ctrl_x, cell_type_key=cell_type_key, condition_key=condition_key)

~/anaconda3/envs/scgen_test/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2900             if self.columns.nlevels > 1:
   2901                 return self._getitem_multilevel(key)
-> 2902             indexer = self.columns.get_loc(key)
   2903             if is_integer(indexer):
   2904                 indexer = [indexer]

~/anaconda3/envs/scgen_test/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2891                 return self._engine.get_loc(casted_key)
   2892             except KeyError as err:
-> 2893                 raise KeyError(key) from err
   2894 
   2895         if tolerance is not None:

KeyError: 'condition'
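
The traceback shows `predict` in `_vae_keras.py` (line 322) indexing `adata.obs["condition"]` literally instead of using the `condition_key` argument, so the lookup fails whenever the column has another name. A minimal workaround sketch follows, assuming only pandas; the `obs` frame and the helper `add_condition_alias` are hypothetical stand-ins for `train_new.obs`, not part of scGen.

```python
import pandas as pd

# Hypothetical stand-in for `train_new.obs`; the values are made up.
obs = pd.DataFrame(
    {
        "celltype": ["CD4T", "CD4T", "B", "B"],
        "response": ["NR", "R", "NR", "R"],
    }
)


def add_condition_alias(obs: pd.DataFrame, condition_key: str) -> pd.DataFrame:
    """Return a copy of `obs` whose labels are also available as 'condition'."""
    out = obs.copy()
    if "condition" not in out.columns:
        out["condition"] = out[condition_key]
    return out


obs_fixed = add_condition_alias(obs, condition_key="response")

# The same kind of selection that failed in the traceback now succeeds.
ctrl_mask = obs_fixed["condition"] == "NR"
print(int(ctrl_mask.sum()))
```

On the real object this would probably amount to `train_new.obs["condition"] = train_new.obs["response"]` before calling `scg.predict(...)`. Whether other scGen releases honor `condition_key` at this line is not something the traceback shows.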

Metadata shuffled when applying batch correction

Description:
When running batch_removal, the metadata (anndata.obs) gets shuffled and no longer corresponds to the obs_names.
It is probably related to/ a consequence of: #7

Version:
scgen: '1.1.4'

Code to reproduce:

import scgen
import scanpy as sc
import numpy as np  # needed for np.zeros below
import pandas as pd
import anndata

class DummyNet: # To not require full training for reproducing the issue

    def to_latent(self, data, *unused_args, **unused_kwargs):
        return data

    def reconstruct(self, data, *unused_args, **unused_kwargs):
        return data

metadata = pd.DataFrame({
    'cell_type': [
        'celltyp1', 'celltyp1', 'celltyp1', 'celltyp2', "celltyp2", "celltyp2",
        "celltype3"
    ],
    'batch': [
        'batch1', 'batch1', 'batch2', 'batch1', 'batch2', 'batch2', 'batch2'
    ]
})

metadata.index = metadata.cell_type + "_" + metadata.batch + '_' + metadata.index.astype(str)

metadata=metadata.sample(frac=1) # To shuffle the dataframe

test_data  = anndata.AnnData(np.zeros((metadata.index.size,100)),obs=metadata)

scgen_results = scgen.batch_removal(DummyNet(),test_data) 

scgen_metadata = scgen_results.obs.copy()

comparison = pd.merge(metadata,scgen_metadata,left_index=True,right_index=True, suffixes=('_original','_scgen'))

print(comparison)

Thanks in advance for your help.
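
The merged `comparison` frame built by the script above carries paired `_original` and `_scgen` columns, so the shuffling can be counted rather than read off by eye. The sketch below assumes only pandas and uses small hypothetical frames in place of a real `scgen.batch_removal` result.

```python
import pandas as pd

# Hypothetical stand-ins: `metadata` is the input .obs, `scgen_metadata`
# plays the role of the .obs returned after batch removal.
metadata = pd.DataFrame(
    {
        "cell_type": ["celltyp1", "celltyp2", "celltype3"],
        "batch": ["batch1", "batch2", "batch2"],
    },
    index=["cell_a", "cell_b", "cell_c"],
)
scgen_metadata = pd.DataFrame(
    {
        "cell_type": ["celltyp2", "celltyp1", "celltype3"],  # first two rows swapped
        "batch": ["batch2", "batch1", "batch2"],
    },
    index=["cell_a", "cell_b", "cell_c"],
)

comparison = pd.merge(
    metadata,
    scgen_metadata,
    left_index=True,
    right_index=True,
    suffixes=("_original", "_scgen"),
)

# A row is inconsistent if any annotation differs for the same obs_name.
mismatch = (
    (comparison["cell_type_original"] != comparison["cell_type_scgen"])
    | (comparison["batch_original"] != comparison["batch_scgen"])
)
print(comparison.index[mismatch].tolist())
```

An empty list would indicate that every obs_name kept its own annotations.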

AttributeError: 'SCGEN' object has no attribute 'summary_stats'

Dear developers,

Many thanks for creating this exciting method. I experienced an issue while following the SCGEN: Batch-Removal tutorial. I used the example data and followed every step as is in the tutorial.

When I run model = scgen.SCGEN(train), the following error occurred:

AttributeError Traceback (most recent call last)
/tmp/ipykernel_1027254/2139969373.py in
----> 1 model = scgen.SCGEN(train)

/nfs/research/ysong/anaconda3/envs/scgen-env/lib/python3.7/site-packages/scgen/_scgen.py in __init__(self, adata, n_hidden, n_latent, n_layers, dropout_rate, **model_kwargs)
59
60 self.module = SCGENVAE(
---> 61 n_input=self.summary_stats.n_vars,
62 n_hidden=n_hidden,
63 n_latent=n_latent,

AttributeError: 'SCGEN' object has no attribute 'summary_stats'

Checking the source code, I indeed cannot find a definition of the attribute summary_stats for the SCGEN object.

Python version: Python 3.7.12
scGen version: 2.1.0
Means to install scGen: pip install git+https://github.com/theislab/scgen.git

May I kindly ask how to resolve this issue?

Cheers,
Yuyao Song

Bimodal data?

Hello! I would like to ask for your advice/opinion about my results.

I integrated 2 independent scRNA-seq datasets from different studies using Seurat and then ran scGen. In the figure attached below, the R condition appears to have 2 modes, which places the prediction somewhere in between them.

Could you advise me on how to work around this? For instance, does this figure suggest that the two datasets are not compatible enough to be integrated in the first place? Or are there other tools I could use to investigate and resolve the prediction inaccuracy?

AttributeError: module 'tensorflow' has no attribute 'placeholder'

loss.py", line 209, in getRealObservables
loadRealObservs = tf.placeholder(tf.bool, name='OBSERVS_GATE')
AttributeError: module 'tensorflow' has no attribute 'placeholder'

OS: Ubuntu
Tensorflow Version : 1.15
python version : 3.6.1

Update main documentation page

https://github.com/theislab/scgen/blob/pytorch/docs/index.rst

This should have much of the info of the readme.

install error: ERROR: No matching distribution found for tensorflow==1.13

Hi,

I get this error when trying to install scGen:
ERROR: No matching distribution found for tensorflow==1.13

Additionally is there a conda package I could install? That would make installation easier.

Thanks!

Number of epochs

Hello. Do you have any recommendations on how to decide how many epochs to run to get the best prediction? Thank you!

TypeError: Can't instantiate abstract class SCGEN with abstract methods setup_anndata

Hi,
Thanks for your excellent contribution. I recently encountered a bug when I ran "scgen.SCGEN(train)". The error is:

TypeError Traceback (most recent call last)
in ()
----> 1 model = scgen.SCGEN(train)
2 model.save("../model_batch_removal.pt", overwrite=True)

TypeError: Can't instantiate abstract class SCGEN with abstract methods setup_anndata

However, this code worked well one week ago with the same data. Could you please help me with this issue? Thank you.

Brian

ValueError: could not convert integer scalar

I am trying to replicate the results of the perturbation experiment with some of my own data but am running into an error

>>> pred, delta = model.predict(
    ctrl_key='18h',
    stim_key='24h',
    celltype_to_predict='tbx16'
)
Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "/net/vol1/home/.local/lib/python3.7/site-packages/scgen/_scgen.py", line 155, in predict
    ctrl_adata.X = ctrl_adata.X.A
  File "/net/gs/vol3/software/modules-sw-python/3.7.7/scvi-tools/0.10.1/Linux/CentOS7/x86_64/lib/python3.7/site-packages/anndata/_core/anndata.py", line 684, in X
    self._adata_ref._X[oidx, vidx] = value
  File "/net/gs/vol3/software/modules-sw-python/3.7.7/scvi-tools/0.10.1/Linux/CentOS7/x86_64/lib/python3.7/site-packages/scipy/sparse/_index.py", line 116, in __setitem__
    self._set_arrayXarray_sparse(i, j, x)
  File "/net/gs/vol3/software/modules-sw-python/3.7.7/scvi-tools/0.10.1/Linux/CentOS7/x86_64/lib/python3.7/site-packages/scipy/sparse/compressed.py", line 808, in _set_arrayXarray_sparse
    self._zero_many(*self._swap((row, col)))
  File "/net/gs/vol3/software/modules-sw-python/3.7.7/scvi-tools/0.10.1/Linux/CentOS7/x86_64/lib/python3.7/site-packages/scipy/sparse/compressed.py", line 929, in _zero_many
    i, j, offsets)
ValueError: could not convert integer scalar

I found scverse/anndata#339 and tried

>>> model.adata = model.adata.copy()
>>> pred, delta = model.predict(
    ctrl_key='18h',
    stim_key='24h',
    celltype_to_predict='tbx16'
)

but got the same result.

I had no issues with training. Strangely, I don't seem to have this issue if I subset the data and only predict on 5% of data or if I use the sample data provided. Is there a size limit for the number of cells that this package will work with?

I am not using conda.

Relevant packages:
sc.__version__
'1.7.2'
scipy.__version__
'1.6.3'
np.__version__
'1.20.3'
anndata.__version__
'0.7.6'
scgen.__version__
'2.0.0'

I have tried to update my libraries to the most current versions. Any guidance you could provide would be great
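
That a 5% subset works while the full data fails would be consistent with an integer overflow in the sparse assignment, where a count or offset no longer fits a 32-bit signed index. This is an assumption, not something the traceback confirms. The sketch below only shows the arithmetic for checking whether a dense block of a given shape crosses that limit; the helper names and the example sizes are hypothetical.

```python
INT32_MAX = 2**31 - 1  # largest value a 32-bit signed index can hold


def dense_elements(n_obs: int, n_vars: int) -> int:
    """Number of entries written when a dense block replaces a sparse one."""
    return n_obs * n_vars


def exceeds_int32(n_obs: int, n_vars: int) -> bool:
    """True if the element count no longer fits in a 32-bit signed integer."""
    return dense_elements(n_obs, n_vars) > INT32_MAX


# Hypothetical sizes, not taken from the report above.
small = exceeds_int32(50_000, 7_000)    # 350 million entries
large = exceeds_int32(400_000, 7_000)   # 2.8 billion entries
print(small, large)
```

If the real `n_obs * n_vars` of the control subset is above this threshold, reducing the gene set (for example to highly variable genes) or predicting on a subset of cells would be the obvious things to try.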

Normalization question for batch_removal

Hello, thanks for such a great tool that's been ranked as a high performer!

I am hoping to understand the normalization process.
In the readme, it is mentioned to normalize the data as follows:
import scanpy as sc

adata = sc.read(data)
sc.pp.normalize_total(adata)
sc.pp.log1p(adata)

However, in the tutorial, we see warning message:
corrected_adata = model.batch_removal()
WARNING Make sure the registered X field in anndata contains unnormalized count data.

Should the data be normalized or contain raw counts when batch_removal() is run? Where should the fix be made if unnormalized counts are used: in model.X or train.X, before or after training?

There's also the warning about filtering; if that could be addressed too, I think it would make the tutorial much clearer.

Thanks!

Questions about balancer function

Hi, I found that in the Python file "util.py" in the models folder, the statement in the balancer function (line 190) that assigns values to balanced_data.obs[condition_key] is "balanced_data.obs[condition_key] = np.concatenate(all_data_label)". Is this a mistake, or am I misunderstanding? I think it should be "balanced_data.obs[condition_key] = np.concatenate(all_data_condition)". Does this have any effect on the results?

ImportError: cannot import name 'shuffle_data'

There seems to be a small issue regarding the name of the function shuffle_adata from utils. I guess the import in the VAE class should be fixed?

      File "/tmp/pip-req-build-b426j88u/scgen/models/_vae.py", line 8, in <module>
        from .util import balancer, extractor, shuffle_data
    ImportError: cannot import name 'shuffle_data'

batch_removal() throws TypeError: concatenate() missing 1 argument: 'self'

Hi!

I think I have an issue similar to #1 (which should be fixed). I have a merged object with 3 batches and 35 cell types. Training of the network works like a charm, but the batch_removal step throws the following error:

adata_sc = adata.copy()
network = scgen.VAEArith(x_dimension= adata_sc.shape[1], model_path="../../data/output/models/batch")
network.train(train_data=adata_sc, n_epochs=20)
corrected_adata = scgen.batch_removal(network, adata_sc, batch_key="orig.ident", cell_label_key="orig.celltype")   

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-31-9b3ce39d7217> in <module>
----> 1 corrected_adata = scgen.batch_removal(network, adata_sc, batch_key="orig.ident", cell_label_key="orig.celltype")

~\Anaconda3\envs\UMCU\lib\site-packages\scgen\models\util.py in batch_removal(network, adata, batch_key, cell_label_key)
    293             temp_cell[batch_ind[study]].X = batch_list[study].X
    294         shared_ct.append(temp_cell)
--> 295     all_shared_ann = anndata.AnnData.concatenate(*shared_ct, batch_key="concat_batch")
    296     if "concat_batch" in all_shared_ann.obs.columns:
    297         del all_shared_ann.obs["concat_batch"]

TypeError: concatenate() missing 1 required positional argument: 'self'

Any help would be much appreciated!

scanpy==1.4.6 anndata==0.7.1 umap==0.3.10 numpy==1.18.1 scipy==1.4.1 pandas==1.0.1 scikit-learn==0.22.1 statsmodels==0.11.1 python-igraph==0.8.0 louvain==0.6.1 scgen==1.1.4
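
One way this message can arise is when the list unpacked into `anndata.AnnData.concatenate(*shared_ct, ...)` is empty, because the first unpacked element is what gets bound to `self`. Whether `shared_ct` was actually empty here (for example, because no cell type label occurs in more than one batch) is an assumption. The sketch below reproduces the mechanism in plain Python with a toy `Table` class and a hypothetical `concat_all` guard; it does not use anndata.

```python
class Table:
    """Toy stand-in for a class with an instance method `concatenate`."""

    def __init__(self, rows):
        self.rows = list(rows)

    def concatenate(self, *others, batch_key="batch"):
        merged = list(self.rows)
        for other in others:
            merged.extend(other.rows)
        return Table(merged)


def concat_all(tables):
    """Concatenate a list of tables, tolerating an empty list."""
    if not tables:
        return None  # nothing shared, so nothing to merge
    first, *rest = tables
    return first.concatenate(*rest)


# Calling the method through the class with an empty list leaves `self` unfilled.
message = None
try:
    Table.concatenate(*[], batch_key="concat_batch")
except TypeError as err:
    message = str(err)
print(message)

merged = concat_all([Table([1, 2]), Table([3])])
print(merged.rows)
```

Printing `len(shared_ct)` just before the failing line in `util.py` would confirm or rule out this explanation.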

to_latent problem

Hi!
I tried to use the to_latent method on the trained model and got the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-18-38e7b4c7913b> in <module>
----> 1 sc18z = network.to_latent(sc18)

~/.local/lib/python3.6/site-packages/scgen/models/_vae.py in to_latent(self, data)
    175                     Returns array containing latent space encoding of 'data'
    176         """
--> 177         latent = self.sess.run(self.z_mean, feed_dict={self.x: data, self.size: data.shape[0], self.is_training: False})
    178         return latent
    179 

~/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    948     try:
    949       result = self._run(None, fetches, feed_dict, options_ptr,
--> 950                          run_metadata_ptr)
    951       if run_metadata:
    952         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1140             feed_handles[subfeed_t] = subfeed_val
   1141           else:
-> 1142             np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
   1143 
   1144           if (not is_tensor_handle_feed and

~/.local/lib/python3.6/site-packages/numpy/core/numeric.py in asarray(a, dtype, order)
    536 
    537     """
--> 538     return array(a, dtype, copy=False, order=order)
    539 
    540 

ValueError: setting an array element with a sequence.

What am I doing wrong?

I'm using pip scgen=1.1.1

Test all functionality in tests

Tutorial Data gets not found?

Hi,

Can you provide a new link or path for the training datasets in this tutorial
https://scgen.readthedocs.io/en/latest/tutorials/scgen_perturbation_prediction.html
and this tutorial?
https://nbviewer.jupyter.org/github/M0hammadL/scGen_reproducibility/blob/master/Jupyter%20Notebooks/Fig2.ipynb

I got a 404 for
train = sc.read("./tests/data/train_kang.h5ad",
                backup_url="https://goo.gl/33HtVh")
and couldn't find the corresponding dataset named "train_pbmc.h5ad".

Thanks!

Predict function should have `ctrl_key` and `stim_key` params instead of `conditions` dict.

It should also have proper typing and documentation.

Keras 2.4.3 incompatibility with TensorFlow 2.1?

Hi Theis Lab,

After creating a conda environment and installing only the scGen package, I tried to run "import scgen" as a test. However, I receive an error that seems like a package incompatibility: Keras requires TensorFlow 2.2 or higher. After installing TensorFlow 2.2, multiple other issues kept popping up so I reinstalled scgen. I then tried to downgrade only Keras to 2.3.0, which seemed to work and allowed me to run "import scgen" without compile errors. Will this still allow full scGen functionality? I am using Windows 10. Thank you!

Add annotations to corrected object

Hi @falexwolf and @M0hammadL ,

Thanks very much for developing this tool. It is giving me amazing results!

However, right now I have an issue. The tool takes batch and cell_type as its main labels; when I run it as in your notebook, everything works nicely. However, whenever I try to add a new set of labels as follows:

adata_hvg_scGen = adata_hvg.copy()
adata_hvg_scGen.obs["batch"] = adata_hvg.obs["source"].tolist()
adata_hvg_scGen.obs["cell_type"] = adata_hvg.obs["leiden"].tolist()
adata_hvg_scGen.obs["location"] = adata_hvg.obs["location"].tolist()
adata_hvg_scGen.obs["method"] = adata_hvg.obs["method"].tolist()

the labels are all jumbled and no longer correspond in the UMAP. I also noticed that the index is removed after applying adata_corrected = scgen.batch_removal(adata_network, adata_hvg_scGen)

Do you know how to solve this?

Thanks

FutureWarning: is_categorical is deprecated and will be removed in a future version. Use is_categorical_dtype instead if not is_categorical(df_full[k]):

Hello. I am running scGen on my project in a Jupyter notebook, and when I run the prediction part I get a "FutureWarning". What does it mean, does it affect my scGen run, and if so, how can I resolve it? Thank you!

SCRIPT:
unperturbed_cd4t = train[((train.obs["cell_type"] == "CD4T") & (train.obs["condition"] == "control"))]
PROMPT:

/home/levinbioinformatics/anaconda3/envs/scgen_test/lib/python3.7/site-packages/anndata/_core/anndata.py:1094: FutureWarning: is_categorical is deprecated and will be removed in a future version.  Use is_categorical_dtype instead
  if not is_categorical(df_full[k]):

Can I use scGen for Drug Responder and Non-responder prediction?

Hello. I have a Seurat object where all cells are each labeled as "R" for responder and "NR" for non-responder under the Seurat object metadata column "response". I wonder if I can use scGen to predict a patient's scRNAseq data as: R or NR. I only have responder and non-responder labels on the cells after drug treatment, not a "before and after" scRNAseq. scGen can predict the single-cell gene expression, but can I use it to identify people as drug responder and non-responder? Any insights on how the implementation will be?

Issue with scgen.batch_removal function

Dear Naghipourfar and M0hammadL,

Thanks for the very nice work. It will be very useful for my work.
I have tested scGen and found a small bug in the "batch_removal" function, so I just want to let you know here.

I tested scGen with 2 batches (batch 0 and batch 1) and only one cell type, with index 1. The training process goes well, but the batch removal step fails.
The bug is at line 291 of util.py:
all_shared_ann = sc.AnnData.concatenate(*shared_ct, batch_key="concat_batch")
// After the concatenate call we are supposed to have a column all_shared_ann.obs["concat_batch"], but the program does not return it; it only returns all_shared_ann.obs["batch"] and all_shared_ann.obs["cell_type"].

So at the next line:
del all_shared_ann.obs["concat_batch"]
// this column does not exist, so the program throws an error here.

The same applies to line 301 of util.py:
del all_corrected_data.obs["concat_batch"]

When I comment out the two lines above, the program works well and gives me a corrected matrix.
The package versions I use are:
scgen: 1.0.0.dev25+347e176
anndata: 0.6.18
scanpy: 1.4
numpy: 1.14.5

Thanks,
Hoa Tran
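
Instead of commenting out the two `del` statements quoted above, the deletion can be made conditional so that it is a no-op when the column is absent. The sketch below assumes only pandas and uses a hypothetical frame in place of `all_shared_ann.obs`; a `util.py` excerpt quoted in another issue on this page shows an `if "concat_batch" in all_shared_ann.obs.columns:` guard with the same intent.

```python
import pandas as pd

# Hypothetical stand-in for `all_shared_ann.obs` after concatenation.
obs = pd.DataFrame({"batch": ["0", "1"], "cell_type": ["1", "1"]})

# `del obs["concat_batch"]` raises KeyError when the column is absent;
# dropping with errors="ignore" does nothing in that case.
cleaned = obs.drop(columns=["concat_batch"], errors="ignore")

# The same call removes the column when it does exist.
with_key = obs.assign(concat_batch=["0", "1"])
cleaned_again = with_key.drop(columns=["concat_batch"], errors="ignore")

print(list(cleaned.columns), list(cleaned_again.columns))
```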

`scgen.batch_removal` doesn't save `adata.raw`

Hi @M0hammadL and @Naghipourfar ,

I have been working with scGen for a while now and it's giving great results.

There is one issue though, that is causing me trouble. Once you apply the scgen.batch_removal function, the newly reconstructed adata doesn't have the raw. After inspecting this in the code, I realise that this is not included.

Do you think you could fix it?

The results from the main HVGs we use for the batch removal give amazing results, but sometimes you want to explore the expression of other genes that are not included in the HVG set. And just adding the adata1.raw to adata2.raw results in a jumbled expression matrix.
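
A jumbled matrix after copying `.raw` across objects is what one would expect if the rows of the two objects are in different cell orders. The sketch below shows the general idea of reordering a full-gene table by cell name before attaching it. It assumes only pandas, uses hypothetical data, and assumes the corrected object still carries the original obs_names, which another issue on this page suggests may not always hold.

```python
import pandas as pd

# Hypothetical full-gene expression table, indexed by cell name (obs_names).
raw = pd.DataFrame(
    {"geneA": [1.0, 2.0, 3.0], "geneB": [0.0, 5.0, 7.0]},
    index=["cell_1", "cell_2", "cell_3"],
)

# Hypothetical order of cells in the corrected object.
corrected_obs_names = pd.Index(["cell_3", "cell_1", "cell_2"])

# Refuse to align if any cell is missing rather than silently filling NaN.
missing = corrected_obs_names.difference(raw.index)
if len(missing) > 0:
    raise ValueError(f"cells missing from raw: {list(missing)}")

# Reorder raw rows by name so that row i of `raw_aligned` is the same cell
# as row i of the corrected matrix.
raw_aligned = raw.loc[corrected_obs_names]
print(raw_aligned["geneA"].tolist())
```

With AnnData objects, the analogous step would be to index the original object by the corrected object's obs_names before taking its `.raw`, but that has not been tested here.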

Questions on batch removal tutorial

Hi,
I'm running the batch removal tutorial provided in:
https://scgen.readthedocs.io/en/latest/tutorials/scgen_batch_removal.html

I am interested in your software's ability to produce a corrected expression matrix after batch removal.
I need some clarification before moving on to my own dataset. In the dataset you provide, pancreas.h5ad, which is loaded in Python as train, I find:

>>> train
AnnData object with n_obs × n_vars = 2448 × 14693
    obs: 'n_cells-0', 'n_cells-1', 'n_cells-2', 'n_cells-3'
    var: 'celltype', 'sample', 'n_genes', 'batch', 'n_counts', 'louvain'
    uns: 'celltype_colors', 'louvain', 'neighbors', 'pca', 'sample_colors'
    obsm: 'PCs'
    varm: 'X_pca', 'X_umap'
    varp: 'distances', 'connectivities'

I don't usually work with AnnData objects, but if I understand correctly we have expression values for 2448 genes across 14693 cells.
In the same object I have:

>>> train.raw.X
<14693x24516 sparse matrix of type '<class 'numpy.float32'>'
    with 55503411 stored elements in Compressed Sparse Row format>

Here we have expression values for 24516 genes across the 14693 cells.
After the step

corrected_adata = model.batch_removal()

we have the same situation, but corrected_adata.X contains different values from train.X.
So I assume that a subset of the genes was selected from the starting dataset and that the corrected expression matrix is the one in corrected_adata.X. I wonder whether this filtering was done only to reduce the computational load in the tutorial, by retaining a subset of significant genes, or because a preprocessing step of this kind is mandatory.

Sorry if this is trivial, but it was not clear to me.
As a supplementary comment, I want to tell you about the code in the preprocessing step:

train = scgen.setup_anndata(train, batch_key="batch", labels_key="cell_type", copy=True)
I get the error

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'scgen' has no attribute 'setup_anndata'

Using instead:
train = scgen.SCGEN.setup_anndata(train, batch_key="batch", labels_key="cell_type", copy=True)
I get no error.

Is this the correct way to do it?

pip install scanpy needs version bump

Following the scvi-tools bump a couple of days ago, it looks like the version installed by pip install scgen does not work. The nightly build still works fine.

Steps to reproduce:

- Create a new environment
- pip install scgen
- Running import scgen gives ImportError: cannot import name 'setup_anndata' from 'scvi.data'

ValueError: Variable encoder/dense/kernel/Adam/ already exists if model is trained in a loop

Hello

First of all congratulations for this impressive work! I am very curious to find out how far we can go with VAEs and single cell data.

Unfortunately, I encountered an error when I tried to train a model for different data sets in a loop. It seems the variable for the optimizer is not reset when a new instance of VAEArith is created. Here is a short snippet to demonstrate the problem (tested against scgen 1.1.3 from PyPI).

import scgen
import numpy as np
import pandas
from anndata import AnnData

for i in range(1, 3):

    data = pandas.DataFrame(np.random.random((i * 100, 50)))
    obs = pandas.DataFrame(index=data.index)
    obs['batch'] = (['BatchA'] * i * 50) + (['BatchB'] * i * 50)
    obs['cell_type'] = ['TypeA'] * i * 100
    adata = AnnData(data, obs)

    network = scgen.VAEArith(x_dimension=adata.shape[1],
                             model_path="./models/batch-" + str(i))
    network.train(train_data=adata, n_epochs=100)
    corrected_adata = scgen.batch_removal(network, adata)

And here is the traceback:

Traceback (most recent call last):
  File "loop-bug.py", line 17, in <module>
    network = scgen.VAEArith(x_dimension=adata.shape[1], model_path="./models/batch-" + str(i))
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/scgen/models/_vae.py", line 46, in __init__
    self._loss_function()
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/scgen/models/_vae.py", line 158, in _loss_function
    self.solver = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.vae_loss)
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 413, in minimize
    name=name)
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 597, in apply_gradients
    self._create_slots(var_list)
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/tensorflow/python/training/adam.py", line 131, in _create_slots
    self._zeros_slot(v, "m", self._name)
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 1155, in _zeros_slot
    new_slot_variable = slot_creator.create_zeros_slot(var, op_name)
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/tensorflow/python/training/slot_creator.py", line 190, in create_zeros_slot
    colocate_with_primary=colocate_with_primary)
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/tensorflow/python/training/slot_creator.py", line 164, in create_slot_with_initializer
    dtype)
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/tensorflow/python/training/slot_creator.py", line 74, in _create_slot_var
    validate_shape=validate_shape)
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1496, in get_variable
    aggregation=aggregation)
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1239, in get_variable
    aggregation=aggregation)
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 562, in get_variable
    aggregation=aggregation)
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 514, in _true_getter
    aggregation=aggregation)
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 864, in _get_single_variable
    (err_msg, "".join(traceback.format_list(tb))))
ValueError: Variable encoder/dense/kernel/Adam/ already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/scgen/models/_vae.py", line 158, in _loss_function
    self.solver = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.vae_loss)
  File "/home/user/software/vens/scgen/lib64/python3.6/site-packages/scgen/models/_vae.py", line 46, in __init__
    self._loss_function()
  File "loop-bug.py", line 17, in <module>
    network = scgen.VAEArith(x_dimension=adata.shape[1], model_path="./models/batch-" + str(i))

This seems to be related to issue #10. I tried to test this code against the current master branch, but the installation of scgen fails; I created another issue (#12) for this.

Questions about redesign for latent space

With scGen, can I supply a specific latent space? For example, I want to use only the latent-space movement step and the decoder. Is that OK? Thanks.

Only runs on single core?

Hi,
I'm trying to run the scgen_batch_removal.ipynb example notebook, but it currently runs on only a single core. In another environment it used to run on multiple cores, but I can no longer reproduce that.

Is there something you could recommend?

Current logging.print_versions:
scanpy==1.4.6 anndata==0.7.1 umap==0.3.10 numpy==1.17.0 scipy==1.3.0 pandas==0.24.2 scikit-learn==0.21.2 statsmodels==0.10.0 tensorflow==1.13.0-rc2 scgen==1.1.4

Thank you!
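
A possible first check for the single-core behaviour described above is to compare what the Python process is allowed to use in each environment. The sketch below uses only the standard library; the helper name `cpu_report` is hypothetical, and the listed thread-limit variables are ones commonly read by OpenMP/BLAS backends. Whether they are what limits TensorFlow in this particular setup is an assumption, not something the report confirms.

```python
import os

# Environment variables commonly read by OpenMP/BLAS backends to cap thread counts.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def cpu_report():
    """Collect the CPU and thread limits visible to this Python process."""
    report = {"cpu_count": os.cpu_count()}
    # sched_getaffinity is only available on some platforms (e.g. Linux).
    if hasattr(os, "sched_getaffinity"):
        report["usable_cpus"] = len(os.sched_getaffinity(0))
    for name in THREAD_ENV_VARS:
        report[name] = os.environ.get(name)  # None means the variable is unset
    return report


if __name__ == "__main__":
    for key, value in cpu_report().items():
        print(f"{key}: {value}")
```

Running this in both the working and the non-working environment and comparing the output may show whether a CPU affinity mask or a thread-limit variable differs between them.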

ImportError: cannot import name 'setup_anndata' from 'scvi.data'

Hi,
Thanks for your nice work!
I want to run the batch-removal tutorial, but some errors appear.
I followed the installation tutorial to complete the installation, but did not install PyTorch.
When I import scgen, I get this ImportError.

There is also an error when downloading the data: a server exception occurs just before the download completes.

Best wishes,
Shang

ImportError: cannot import name 'IndexMixin' from 'scipy.sparse.sputils'

Hi. I'm getting an import error. When I try to import scgen, I encounter the error below (also attached as an image). The same "IndexMixin" error occurs whether I import scgen or scanpy alone. My scipy version is 1.5.2. I tried versions 1.4 and 1.2.1 (other online issues suggested this should resolve it), but that didn't work in my case; the error is still the same. Help, please!

ImportError: cannot import name 'IndexMixin' from 'scipy.sparse.sputils' (/home/levinbioinformatics/anaconda3/envs/scgen-env/lib/python3.7/site-packages/scipy/sparse/sputils.py)
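
Import errors like this one often depend on which package versions are actually installed in the active environment, so it can help to list them before changing anything. The following is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+; on Python 3.7 the `importlib_metadata` backport offers the same API). The helper name `installed_versions` and the package list are illustrative assumptions, not part of scgen.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # The package list is an illustrative guess at the relevant dependencies.
    for name, version in installed_versions(["scipy", "anndata", "scanpy", "scgen"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Because it reads package metadata rather than importing the packages, this reports versions even when `import scanpy` itself fails.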

	self.vae_model.save(os.path.join("vae.h5"), overwrite=True)
	self.encoder_model.save(os.path.join("encoder.h5"), overwrite=True)
	self.decoder_model.save(os.path.join("decoder.h5"), overwrite=True)

I created a conda environment for scgen and followed the installation instructions. It seemed successful.
In Kang's example, I got this error message at
scg = scgen.VAEArith(x_dimension= train.shape[1], model_path="./models/test" )

Hi,
Thanks for your excellent contribution. I recently encountered a bug when I ran "scgen.SCGEN(train)". The error information is: