Dear scDRS devs, Hi, I tried following the tutorial […]


AssertionError when running quick test after installation (scDRS issue, closed)

hoholee commented on August 22, 2024



Comments (7)

martinjzhang commented on August 22, 2024

Fixed. The issue is due to a small discrepancy between different pandas versions.
#85


martinjzhang commented on August 22, 2024

Hi, v1.0.3 is in the main branch. We may have updated the test data. Can you install from the main branch and run the tests again?


hoholee commented on August 22, 2024

Same error with v1.0.3:

$ python -m pytest tests/test_CLI.py -p no:warnings
============================================================================================================ test session starts =============================================================================================================
platform linux -- Python 3.12.2, pytest-8.1.1, pluggy-1.4.0
rootdir: /home/jul307/software/scDRS
configfile: pyproject.toml
collected 3 items

tests/test_CLI.py F..                                                                                                                                                                                                                  [100%]

================================================================================================================== FAILURES ==================================================================================================================
____________________________________________________________________________________________________________ test_score_cell_cli _____________________________________________________________________________________________________________

    def test_score_cell_cli():
        """
        Test CLI `scdrs compute-score`
        """
        # Load toy data
        ROOT_DIR = scdrs.__path__[0]
        H5AD_FILE = os.path.join(ROOT_DIR, "data/toydata_mouse.h5ad")
        COV_FILE = os.path.join(ROOT_DIR, "data/toydata_mouse.cov")
        assert os.path.exists(H5AD_FILE), "built-in data toydata_mouse.h5ad missing"
        assert os.path.exists(COV_FILE), "built-in data toydata_mouse.cov missing"

        tmp_dir = tempfile.TemporaryDirectory()
        tmp_dir_path = tmp_dir.name
        dict_df_score = {}
        for gs_species in ["human", "mouse"]:
            gs_file = os.path.join(ROOT_DIR, f"data/toydata_{gs_species}.gs")
            # call compute_score.py
            cmds = [
                f"scdrs compute-score",
                f"--h5ad_file {H5AD_FILE}",
                "--h5ad_species mouse",
                f"--gs_file {gs_file}",
                f"--gs_species {gs_species}",
                f"--cov_file {COV_FILE}",
                "--ctrl_match_opt mean_var",
                "--n_ctrl 20",
                "--flag_filter_data False",
                "--weight_opt vs",
                "--flag_raw_count False",
                "--flag_return_ctrl_raw_score False",
                "--flag_return_ctrl_norm_score False",
                f"--out_folder {tmp_dir_path}",
            ]
            subprocess.check_call(" ".join(cmds), shell=True)
            dict_df_score[gs_species] = pd.read_csv(
                os.path.join(tmp_dir_path, f"toydata_gs_{gs_species}.score.gz"),
                sep="\t",
                index_col=0,
            )
        # consistency between human and mouse
        assert np.all(dict_df_score["mouse"].pval == dict_df_score["human"].pval)

        df_res = dict_df_score["mouse"]

        REF_COV_FILE = os.path.join(
            ROOT_DIR, "data/toydata_gs_mouse.ref_Ctrl20_CovConstCovariate.score.gz"
        )
        df_ref_res = pd.read_csv(REF_COV_FILE, sep="\t", index_col=0)
>       compare_score_file(df_res, df_ref_res)

tests/test_CLI.py:58:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

df_res =                                         raw_score  norm_score   mc_pval      pval  nlog10_pval     zscore
index       ...00 -10.000000
J10_B003899_S130.mus-7-0-1               4.460493   -1.627243  1.000000  0.956739     0.019207  -1.714034
df_res_ref =                                         raw_score  norm_score   mc_pval      pval  nlog10_pval     zscore
index       ...00 -10.000000
J10_B003899_S130.mus-7-0-1               4.460493   -2.305674  1.000000  0.991680     0.003628  -2.394591

    def compare_score_file(df_res, df_res_ref):
        """
        Compare df_res
        """

        col_list = ["raw_score", "norm_score", "mc_pval", "pval"]
        for col in col_list:
            v_ = df_res[col].values
            v_ref = df_res_ref[col].values
            err_msg = "Inconsistent values: {}\n".format(col)
            err_msg += "|{:^15}|{:^15}|{:^15}|{:^15}|\n".format(
                "OBS", "REF", "DIF", "REL_DIF"
            )
            for i in range(v_.shape[0]):
                err_msg += "|{:^15.3e}|{:^15.3e}|{:^15.3e}|{:^15.3e}|\n".format(
                    v_[i],
                    v_ref[i],
                    v_[i] - v_ref[i],
                    np.absolute((v_[i] - v_ref[i]) / v_ref[i]),
                )
>           assert np.allclose(v_, v_ref, rtol=1e-2, equal_nan=True), err_msg
E           AssertionError: Inconsistent values: norm_score
E             |      OBS      |      REF      |      DIF      |    REL_DIF    |
E             |   4.445e+00   |   6.326e+00   |  -1.881e+00   |   2.973e-01   |
E             |   6.038e+00   |   5.916e+00   |   1.216e-01   |   2.056e-02   |
E             |   4.697e+00   |   5.552e+00   |  -8.552e-01   |   1.540e-01   |
E             |   5.186e+00   |   7.299e+00   |  -2.112e+00   |   2.894e-01   |
E             |   6.072e+00   |   5.779e+00   |   2.927e-01   |   5.065e-02   |
E             |  -6.976e-01   |  -5.614e-01   |  -1.362e-01   |   2.427e-01   |
E             |  -1.192e+00   |  -1.582e+00   |   3.897e-01   |   2.463e-01   |
E             |  -2.219e+00   |  -2.312e+00   |   9.325e-02   |   4.033e-02   |
E             |   1.216e+00   |   1.157e+00   |   5.952e-02   |   5.146e-02   |
E             |  -4.155e+00   |  -3.166e+00   |  -9.896e-01   |   3.126e-01   |
E             |   2.262e+00   |   1.505e+00   |   7.576e-01   |   5.035e-01   |
E             |  -2.240e+00   |  -3.798e+00   |   1.558e+00   |   4.102e-01   |
E             |   7.692e-01   |   1.052e+00   |  -2.824e-01   |   2.686e-01   |
E             |   2.888e-01   |  -1.237e-01   |   4.126e-01   |   3.334e+00   |
E             |  -4.752e-01   |  -8.706e-01   |   3.954e-01   |   4.541e-01   |
E             |  -3.281e+00   |  -3.768e+00   |   4.869e-01   |   1.292e-01   |
E             |  -1.792e+00   |  -2.232e+00   |   4.397e-01   |   1.970e-01   |
E             |  -7.435e-01   |  -6.558e-01   |  -8.775e-02   |   1.338e-01   |
E             |  -3.577e-01   |  -4.232e-01   |   6.545e-02   |   1.547e-01   |
E             |  -1.968e+00   |  -2.191e+00   |   2.235e-01   |   1.020e-01   |
E             |  -3.799e-01   |  -2.172e-01   |  -1.626e-01   |   7.487e-01   |
E             |   7.900e-02   |  -1.761e-01   |   2.551e-01   |   1.449e+00   |
E             |   8.555e-01   |   7.654e-01   |   9.011e-02   |   1.177e-01   |
E             |  -2.135e-01   |  -3.305e-01   |   1.170e-01   |   3.541e-01   |
E             |  -1.905e+00   |  -2.228e+00   |   3.232e-01   |   1.451e-01   |
E             |  -3.454e+00   |  -2.705e+00   |  -7.495e-01   |   2.771e-01   |
E             |  -2.037e+00   |  -2.207e+00   |   1.692e-01   |   7.670e-02   |
E             |  -4.795e-01   |  -3.563e-01   |  -1.232e-01   |   3.458e-01   |
E             |  -2.691e+00   |  -3.141e+00   |   4.506e-01   |   1.434e-01   |
E             |  -1.627e+00   |  -2.306e+00   |   6.784e-01   |   2.942e-01   |
E
E           assert False
E            +  where False = <function allclose at 0x7f4a28366270>(array([ 4.4454584 ,  6.037902  ,  4.6971283 ,  5.186194  ,  6.071957  ,\n       -0.6976079 , -1.1924832 , -2.2186813 , ...900415,  0.8554982 , -0.21349816, -1.9051081 ,\n       -3.4541266 , -2.037314  , -0.47953042, -2.690723  , -1.6272427 ]), array([ 6.3260064 ,  5.916272  ,  5.5523157 ,  7.2986684 ,  5.7792473 ,\n       -0.5613674 , -1.5821338 , -2.3119287 , ...612725,  0.7653889 , -0.33054087, -2.228345  ,\n       -2.7046354 , -2.2065454 , -0.35630605, -3.1413238 , -2.3056736 ]), rtol=0.01, equal_nan=True)
E            +    where <function allclose at 0x7f4a28366270> = np.allclose

tests/test_method_score_cell_main.py:76: AssertionError
------------------------------------------------------------------------------------------------------------ Captured stdout call ------------------------------------------------------------------------------------------------------------
******************************************************************************
* Single-cell disease relevance score (scDRS)
* Version 1.0.3
* Martin Jinye Zhang and Kangcheng Hou
* HSPH / Broad Institute / UCLA
* MIT License
******************************************************************************
Call: scdrs compute-score \
--h5ad-file /home/jul307/software/scDRS/scdrs/data/toydata_mouse.h5ad \
--h5ad-species mmusculus \
--cov-file /home/jul307/software/scDRS/scdrs/data/toydata_mouse.cov \
--gs-file /home/jul307/software/scDRS/scdrs/data/toydata_human.gs \
--gs-species hsapiens \
--ctrl-match-opt mean_var \
--weight-opt vs \
--adj-prop None \
--flag-filter-data False \
--flag-raw-count False \
--n-ctrl 20 \
--min-genes 250 \
--min-cells 50 \
--flag-return-ctrl-raw-score False \
--flag-return-ctrl-norm-score False \
--out-folder /scratch/tmpggtt845u

Loading data:
--h5ad-file loaded: n_cell=30, n_gene=2500 (sys_time=0.1s)
First 3 cells: ['N1.MAA000586.3_8_M.1.1-1-1', 'F10.D041911.3_8_M.1.1-1-1', 'A17_B002755_B007347_S17.mm10-plus-7-0']
First 5 genes: ['Pip4k2a', 'Chd7', 'Atp6v0c', 'Exoc3', 'Pex5']
--cov-file loaded: covariates=['covariate'] (sys_time=0.1s)
n_cell=30 (30 in .h5ad)
First 3 cells: ['N1.MAA000586.3_8_M.1.1-1-1', 'F10.D041911.3_8_M.1.1-1-1', 'A17_B002755_B007347_S17.mm10-plus-7-0']
First 5 values for 'covariate': [10, 10, 10, 10, 10]
--gs-file loaded: n_trait=1 (sys_time=0.1s)
Print info for first 3 traits:
First 3 elements for 'toydata_gs_human': ['Mrps33', 'Cyp4f13', 'Kazald1'], [1.0, 1.0, 1.0]

Preprocessing:
Too few genes for 20*20 bins, setting n_mean_bin=n_var_bin=15

Computing scDRS score:
Trait=toydata_gs_human, n_gene=250: 6/30 FDR<0.1 cells, 6/30 FDR<0.2 cells (sys_time=0.4s)
******************************************************************************
* Single-cell disease relevance score (scDRS)
* Version 1.0.3
* Martin Jinye Zhang and Kangcheng Hou
* HSPH / Broad Institute / UCLA
* MIT License
******************************************************************************
Call: scdrs compute-score \
--h5ad-file /home/jul307/software/scDRS/scdrs/data/toydata_mouse.h5ad \
--h5ad-species mouse \
--cov-file /home/jul307/software/scDRS/scdrs/data/toydata_mouse.cov \
--gs-file /home/jul307/software/scDRS/scdrs/data/toydata_mouse.gs \
--gs-species mouse \
--ctrl-match-opt mean_var \
--weight-opt vs \
--adj-prop None \
--flag-filter-data False \
--flag-raw-count False \
--n-ctrl 20 \
--min-genes 250 \
--min-cells 50 \
--flag-return-ctrl-raw-score False \
--flag-return-ctrl-norm-score False \
--out-folder /scratch/tmpggtt845u

Loading data:
--h5ad-file loaded: n_cell=30, n_gene=2500 (sys_time=0.0s)
First 3 cells: ['N1.MAA000586.3_8_M.1.1-1-1', 'F10.D041911.3_8_M.1.1-1-1', 'A17_B002755_B007347_S17.mm10-plus-7-0']
First 5 genes: ['Pip4k2a', 'Chd7', 'Atp6v0c', 'Exoc3', 'Pex5']
--cov-file loaded: covariates=['covariate'] (sys_time=0.0s)
n_cell=30 (30 in .h5ad)
First 3 cells: ['N1.MAA000586.3_8_M.1.1-1-1', 'F10.D041911.3_8_M.1.1-1-1', 'A17_B002755_B007347_S17.mm10-plus-7-0']
First 5 values for 'covariate': [10, 10, 10, 10, 10]
--gs-file loaded: n_trait=1 (sys_time=0.0s)
Print info for first 3 traits:
First 3 elements for 'toydata_gs_mouse': ['Mrps33', 'Cyp4f13', 'Kazald1'], [1.0, 1.0, 1.0]

Preprocessing:
Too few genes for 20*20 bins, setting n_mean_bin=n_var_bin=15

Computing scDRS score:
Trait=toydata_gs_mouse, n_gene=250: 6/30 FDR<0.1 cells, 6/30 FDR<0.2 cells (sys_time=0.3s)
------------------------------------------------------------------------------------------------------------ Captured stderr call ------------------------------------------------------------------------------------------------------------
Computing control scores: 100%|██████████| 20/20 [00:00<00:00, 272.68it/s]
Computing control scores: 100%|██████████| 20/20 [00:00<00:00, 286.57it/s]
========================================================================================================== short test summary info ===========================================================================================================
FAILED tests/test_CLI.py::test_score_cell_cli - AssertionError: Inconsistent values: norm_score
======================================================================================================== 1 failed, 2 passed in 37.78s ========================================================================================================
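
For reference, the failing assertion relies on `np.allclose`, which NumPy documents as checking `|a - b| <= atol + rtol * |b|` element-wise. The sketch below re-implements that criterion in plain Python and applies it to the first two OBS/REF pairs from the `norm_score` table above. The helper name `within_tolerance` is hypothetical (it is not part of scDRS or NumPy), and `atol=1e-8` is assumed to be NumPy's default since the test does not set it.

```python
def within_tolerance(obs, ref, rtol=1e-2, atol=1e-8):
    """Mirror NumPy's documented allclose criterion for a single pair.

    Hypothetical helper for illustration; the scDRS test calls np.allclose.
    """
    return abs(obs - ref) <= atol + rtol * abs(ref)


# First two norm_score rows (OBS, REF) copied from the pytest output above.
pairs = [(4.445, 6.326), (6.038, 5.916)]
results = [within_tolerance(obs, ref) for obs, ref in pairs]
print(results)  # [False, False]
```

Both pairs differ by more than 1% of the reference value, so the comparison fails even for the row whose relative difference is only about 2%.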


hoholee commented on August 22, 2024

I've also tried scDRS v.1.0.3 with multiple versions of Python (3.8-3.12), and the test only passed with Python 3.8 for some reason:

python -m pytest tests/test_CLI.py -p no:warnings
============================================================================================================ test session starts =============================================================================================================
platform linux -- Python 3.8.19, pytest-8.1.1, pluggy-1.4.0
rootdir: /home/jul307/software/scDRS
configfile: pyproject.toml
plugins: anyio-3.7.1
collected 3 items

tests/test_CLI.py ...                                                                                                                                                                                                                  [100%]

============================================================================================================= 3 passed in 46.72s =============================================================================================================
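
Since the result differs between interpreters, and a maintainer's comment in this thread attributes the failure to a pandas version discrepancy, it can help to record package versions alongside the Python version in each environment. The following is a minimal sketch using only the standard library; the function name `report_versions` and the package list are assumptions chosen for illustration, not something scDRS provides.

```python
import platform
from importlib import metadata


def report_versions(packages=("scdrs", "pandas", "numpy", "scipy", "anndata", "scanpy")):
    """Collect the Python version and installed versions of selected packages.

    The default package list is an assumption; adjust it to your environment.
    """
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


for name, version in report_versions().items():
    print(f"{name}: {version}")
```

Running this in both the passing and the failing environment gives two short listings that can be compared line by line.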


KangchengHou commented on August 22, 2024

Somewhat strangely, I couldn't replicate this error using either Python 3.9 or 3.10.

For example, in Google Colab (https://colab.google/, Python 3.10):

!python --version
!pip install git+https://github.com/martinjzhang/scDRS.git

import os
import pandas as pd
import scdrs

DATA_PATH = scdrs.__path__[0]
H5AD_FILE = os.path.join(DATA_PATH, "data/toydata_mouse.h5ad")
COV_FILE = os.path.join(DATA_PATH, "data/toydata_mouse.cov")
GS_FILE = os.path.join(DATA_PATH, "data/toydata_mouse.gs")

# Load .h5ad file, .cov file, and .gs file
adata = scdrs.util.load_h5ad(H5AD_FILE, flag_filter_data=False, flag_raw_count=False)
df_cov = pd.read_csv(COV_FILE, sep="\t", index_col=0)
df_gs = scdrs.util.load_gs(GS_FILE)

# Preprocess the .h5ad data and compute the scDRS score
scdrs.preprocess(adata, cov=df_cov)
gene_list = df_gs['toydata_gs_mouse'][0]
gene_weight = df_gs['toydata_gs_mouse'][1]
df_res = scdrs.score_cell(adata, gene_list, gene_weight=gene_weight, n_ctrl=20)

print(df_res.iloc[:4])
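
To check a run like this against a reference score file without rerunning the test suite, one option is to compare two tab-separated score files column by column. The sketch below uses only the standard library and reports the largest relative difference per column. The function name `max_rel_diff` is hypothetical, the column names are taken from the test shown in this thread, and rows are assumed to be in the same order in both files (the test likewise compares values by position).

```python
import csv
import gzip


def max_rel_diff(res_path, ref_path,
                 cols=("raw_score", "norm_score", "mc_pval", "pval")):
    """Return the largest relative difference per column between two score files.

    Hypothetical helper: both files are assumed to be tab-separated with a
    header row and the same row order; ".gz" files are read with gzip.
    """
    def load(path):
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt", newline="") as handle:
            return list(csv.DictReader(handle, delimiter="\t"))

    res_rows, ref_rows = load(res_path), load(ref_path)
    if len(res_rows) != len(ref_rows):
        raise ValueError("score files have different numbers of rows")

    summary = {}
    for col in cols:
        worst = 0.0
        for res, ref in zip(res_rows, ref_rows):
            obs, exp = float(res[col]), float(ref[col])
            # Fall back to the absolute difference when the reference is zero.
            denom = abs(exp) if exp != 0 else 1.0
            worst = max(worst, abs(obs - exp) / denom)
        summary[col] = worst
    return summary
```

A column whose value exceeds the test's `rtol` of 1e-2 is a likely candidate for the kind of failure reported above.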


hoholee commented on August 22, 2024

Strange indeed... Maybe something is wrong with my conda. But I can't think of any reason why only the norm_score is affected and why this is Python version-dependent.

Thanks for the efforts in pinpointing the issue. I'm closing this for now unless someone else runs into this. But I'd recommend updating the installation instructions in the tutorial to v.1.0.3.


martinjzhang commented on August 22, 2024

I replicated this issue (with the exact norm_score values as @hoholee's) using conda + py39 on a local HPC. This might be a Python version issue. I will look into this matter further.
