
License: MIT License

Python 98.16% Dockerfile 0.12% R 1.61% Shell 0.11%

cellphonedb's Introduction

CellPhoneDB

⚠️ We have updated CellPhoneDB to v3 and migrated CellPhoneDB to a new repository. Please refer to ventolab/CellphoneDB for the most up-to-date version.

What is CellPhoneDB?

CellPhoneDB is a publicly available repository of curated receptors, ligands and their interactions. Subunit architecture is included for both ligands and receptors, representing heteromeric complexes accurately. This is crucial, as cell-cell communication relies on multi-subunit protein complexes that go beyond the binary representation used in most databases and studies.

CellPhoneDB integrates existing datasets that pertain to cellular communication and new manually reviewed information. Databases from which CellPhoneDB gets information are: UniProt, Ensembl, PDB, the IMEx consortium, IUPHAR.

CellPhoneDB can be used to search for a particular ligand/receptor, or interrogate your own single-cell transcriptomics data.

Starting to use CellPhoneDB

To start using CellPhoneDB, you can use our interactive web application (cellphonedb.org) and run the analysis in our private cloud, or run CellPhoneDB with your own computational resources. The latter is preferable if you are going to work with large datasets.

Installing CellPhoneDB

NOTE: Works with Python v3.6 or greater. If your default Python interpreter is for v2.x (you can check it with python --version), calls to python/pip should be substituted by python3/pip3.

We highly recommend using an isolated Python environment (as described in steps 1 and 2), created with conda or virtualenv, but you can omit these steps and install via pip directly.

1. Create a Python >= 3.6 environment

Using conda: conda create -n cpdb python=3.7
Using virtualenv: python -m venv cpdb

2. Activate the environment

Using conda: source activate cpdb
Using virtualenv: source cpdb/bin/activate

3. Install CellPhoneDB

pip install cellphonedb

Running CellPhoneDB Methods

Please activate your environment if you have not already done so:

Using conda: source activate cpdb
Using virtualenv: source cpdb/bin/activate

To use the example data, download the meta and counts test files, e.g.:

curl https://raw.githubusercontent.com/Teichlab/cellphonedb/master/in/example_data/test_counts.txt --output test_counts.txt
curl https://raw.githubusercontent.com/Teichlab/cellphonedb/master/in/example_data/test_meta.txt --output test_meta.txt

Note: the counts file can be a text file, an h5ad file (recommended), an h5 file, or a path to a folder containing mtx/barcode/features files.
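For orientation, here is a minimal sketch that writes a tiny pair of tab-separated input files using only the Python standard library. The layout (a metadata table with `Cell` and `cell_type` columns, and a gene-by-cell counts table with a `Gene` column) is an assumption modelled on the example data. The cell names, gene identifiers and values are placeholders invented for illustration.

```python
import csv
import os
import tempfile


def write_tsv(path, header, rows):
    """Write a header line and data rows to a tab-separated text file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)


# Hypothetical toy data: two cells, two genes (identifiers are placeholders).
cells = ["cell_1", "cell_2"]
meta_rows = [("cell_1", "T_cell"), ("cell_2", "B_cell")]
counts_rows = [("ENSG00000000001", 0.0, 2.5), ("ENSG00000000002", 1.0, 0.0)]

out_dir = tempfile.mkdtemp()
meta_path = os.path.join(out_dir, "toy_meta.txt")
counts_path = os.path.join(out_dir, "toy_counts.txt")

# Assumed column names: "Cell"/"cell_type" for meta, "Gene" + one column per cell for counts.
write_tsv(meta_path, ["Cell", "cell_type"], meta_rows)
write_tsv(counts_path, ["Gene"] + cells, counts_rows)
```

The two paths can then be passed as the meta and counts arguments of a method call. Real data needs many more cells and genes than this toy example.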

Example with running the statistical method

cellphonedb method statistical_analysis test_meta.txt test_counts.txt

Example without using the statistical method

Using text files

cellphonedb method analysis test_meta.txt test_counts.txt

Using h5ad count file

cellphonedb method analysis test_meta.txt test_counts.h5ad

Please check the results documentation in order to understand the results.

Optional Parameters

~ Optional Method parameters:

--counts-data: [ensembl | gene_name | hgnc_symbol] Type of gene identifiers in the counts data
--project-name: Name of the project. A subfolder with this name is created in the output folder
--iterations: Number of iterations for the statistical analysis [1000]
--threshold: Minimum percentage of cells expressing the specific ligand/receptor, given as a fraction [0.1]
--result-precision: Number of decimal digits in results [3]
--output-path: Directory where the results will be allocated (the directory must exist) [out]
--output-format: Output format of the results files (extension will be added to filename if not present) [txt]
--means-result-name: Means result filename [means]
--significant-means-result-name: Significant mean result filename [significant_means]
--deconvoluted-result-name: Deconvoluted result filename [deconvoluted]
--verbose/--quiet: Print or hide CellPhoneDB logs [verbose]
--subsampling: Enable subsampling
--subsampling-log: Enable log1p transformation during subsampling for inputs that are not log-transformed (mandatory when --subsampling is used)
--subsampling-num-pc: Subsampling NumPC argument (number of PCs to use) [100]
--subsampling-num-cells: Number of cells to subsample to [1/3 of cells]
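To make the `--threshold` option more concrete, here is a rough sketch of one plausible reading: a gene counts as expressed in a cluster when the fraction of cells with a non-zero count reaches the threshold. The function names and the `>=` comparison are assumptions made for illustration, not the CellPhoneDB implementation.

```python
def fraction_expressing(values):
    """Fraction of cells with a non-zero count for one gene in one cluster."""
    if not values:
        return 0.0
    return sum(1 for value in values if value > 0) / len(values)


def passes_threshold(values, threshold=0.1):
    """Assumed rule: keep the gene if enough cells in the cluster express it."""
    return fraction_expressing(values) >= threshold


# Hypothetical counts of one gene in a cluster of ten cells (two express it).
cluster_counts = [0, 0, 3.2, 0, 0, 0, 0, 0, 0, 1.1]

print(fraction_expressing(cluster_counts))
print(passes_threshold(cluster_counts, threshold=0.1))
print(passes_threshold(cluster_counts, threshold=0.5))
```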

~ Optional Method Statistical parameters

--pvalues-result-name: P-values result filename [pvalues]
--pvalue: P-value threshold [0.05]
--debug-seed: Random seed for debugging. A value of -1 disables it; use a value >=0 to set the seed [-1]
--threads: Number of threads to use. >=1 [4]
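The `--iterations`, `--pvalue` and `--debug-seed` options point to a permutation-style test. The sketch below is a generic label-shuffling test, not CellPhoneDB's own code. The statistic (the average of the ligand mean in cluster A and the receptor mean in cluster B) and all function names are assumptions chosen for illustration.

```python
import random


def cluster_mean(values, labels, cluster):
    """Mean expression of one gene over the cells labelled `cluster`."""
    selected = [v for v, label in zip(values, labels) if label == cluster]
    return sum(selected) / len(selected) if selected else 0.0


def pair_score(ligand, receptor, labels, cluster_a, cluster_b):
    """Assumed statistic: average of ligand mean in A and receptor mean in B."""
    return (cluster_mean(ligand, labels, cluster_a)
            + cluster_mean(receptor, labels, cluster_b)) / 2


def permutation_pvalue(ligand, receptor, labels, cluster_a, cluster_b,
                       iterations=1000, seed=0):
    """Fraction of label shuffles whose score is at least the observed score."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    observed = pair_score(ligand, receptor, labels, cluster_a, cluster_b)
    shuffled = list(labels)
    hits = 0
    for _ in range(iterations):
        rng.shuffle(shuffled)
        if pair_score(ligand, receptor, shuffled, cluster_a, cluster_b) >= observed:
            hits += 1
    return hits / iterations


# Hypothetical toy data: ligand expressed only in cluster A, receptor only in B.
labels = ["A"] * 6 + ["B"] * 6
ligand = [10.0] * 6 + [0.0] * 6
receptor = [0.0] * 6 + [10.0] * 6
p_value = permutation_pvalue(ligand, receptor, labels, "A", "B")
print(p_value)
```

With few iterations, the smallest non-zero p-value such a test can report is 1/iterations. A p-value of 0 only means that no shuffle reached the observed score.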

Usage Examples

Set number of iterations and threads

cellphonedb method statistical_analysis yourmetafile.txt yourcountsfile.txt --iterations=10 --threads=2

Set project subfolder

cellphonedb method analysis yourmetafile.txt yourcountsfile.txt --project-name=new_project

Set output path

mkdir custom_folder
cellphonedb method statistical_analysis yourmetafile.txt yourcountsfile.txt --output-path=custom_folder

Subsampling

cellphonedb method analysis yourmetafile.txt yourcountsfile.txt --subsampling --subsampling-log false --subsampling-num-cells 3000
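When the same calls are scripted, the command line can be assembled programmatically. The following is a small sketch, assuming the `cellphonedb` executable is on the PATH. `build_method_command` is a hypothetical helper written for this example and is not part of the package.

```python
import shlex


def build_method_command(method, meta_path, counts_path, **options):
    """Build the argument list for a `cellphonedb method ...` call.

    Keyword names use underscores and become --dashed-flags; a value of
    True emits a bare flag (for example --subsampling).
    """
    command = ["cellphonedb", "method", method, meta_path, counts_path]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            command.append(flag)
        else:
            command.append("{}={}".format(flag, value))
    return command


command = build_method_command(
    "statistical_analysis", "yourmetafile.txt", "yourcountsfile.txt",
    iterations=10, threads=2,
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` to execute it. Building a list rather than a single string avoids quoting problems with paths that contain spaces.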

Plotting statistical method results

To plot results from the statistical method, you need to run it first.

Currently there are two plot types available: dot_plot & heatmap_plot

Once you have the needed files (means & pvalues) you can proceed as follows:

cellphonedb plot dot_plot

cellphonedb plot heatmap_plot yourmeta.txt

`dot_plot`

This plot type requires the ggplot2 R package to be installed and working.

You can tweak the options for the plot with these arguments:

--means-path: The means output file [./out/means.txt]
--pvalues-path: The pvalues output file [./out/pvalues.txt]
--output-path: Output folder [./out]
--output-name: Filename of the output plot [plot.pdf]
--rows: File with a list of rows to plot, one per line [all available]
--columns: File with a list of columns to plot, one per line [all available]
--verbose / --quiet: Print or hide CellPhoneDB logs [verbose]

Available output formats are those supported by R's ggplot2 package, including:

pdf
png
jpeg

This format will be inferred from the --output-name argument

To plot only the desired rows/columns (the sample rows and columns files are based on the example data files):

cellphonedb plot dot_plot --rows in/rows.txt --columns in/columns.txt

`heatmap_plot`

This plot type requires the pheatmap R package to be installed and working.

This plot type includes two features: count & log_count.

You can tweak the options for the plot with these arguments:

--pvalues-path: The pvalues output file [./out/pvalues.txt]
--output-path: Output folder [./out]
--count-name: Filename of the output plot [heatmap_count.pdf]
--log-name: Filename of the output plot using log-count of interactions [heatmap_log_count.pdf]
--count-network-name: Filename of the output network file [count_network.txt]
--interaction-count-name: Filename of the output interactions-count file [interactions_count.txt]
--pvalue: pvalue threshold to consider when plotting [0.05]
--verbose / --quiet: Print or hide cellphonedb logs [verbose]

Available output formats are those supported by R's pheatmap package, including:

pdf
png
jpeg

This format will be inferred from the --count-name & --log-name arguments.
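As a rough illustration of what an interaction count per cluster pair could look like, the sketch below counts how many interactions fall below the p-value cutoff in each cluster-pair column. The table layout, the `A|B` column names, the strict `<` comparison and the use of `log1p` for the log count are all assumptions. This is not the code behind `heatmap_plot`.

```python
import csv
import io
import math


def count_significant(pvalues_text, pair_columns, pvalue=0.05):
    """Count, per cluster-pair column, the rows with a p-value below the cutoff."""
    reader = csv.DictReader(io.StringIO(pvalues_text), delimiter="\t")
    counts = {column: 0 for column in pair_columns}
    for row in reader:
        for column in pair_columns:
            if float(row[column]) < pvalue:
                counts[column] += 1
    return counts


# Hypothetical p-values table: three interactions, two cluster pairs.
toy_pvalues = (
    "interacting_pair\tA|B\tB|A\n"
    "L1_R1\t0.001\t1.0\n"
    "L2_R2\t0.04\t0.2\n"
    "L3_R3\t0.5\t0.03\n"
)

counts = count_significant(toy_pvalues, ["A|B", "B|A"])
# Assumed log transform; log1p keeps pairs with zero interactions finite.
log_counts = {pair: math.log1p(n) for pair, n in counts.items()}
print(counts)
```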

Using different database versions

CellPhoneDB databases can be updated from the remote repository through our tool. Furthermore, available versions can be listed and downloaded for use.

To use one of those versions, a user must provide the argument --database <version_or_file> to the method to be executed.

If the given parameter is a readable database file, it will be used as is. Otherwise, a version matching the selected version will be used.

If the selected version does not exist in the local environment it will be downloaded from the remote repository. (See below.) If no --database argument is given in methods execution, it will use the latest local version available.

Downloaded versions will be stored in a user folder under ~/.cpdb/releases
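The selection rules above can be summarised in a short decision function. This is only a sketch of the described behaviour under stated assumptions: exact version matching, numeric comparison of `vX.Y.Z` names, and a download of the latest version when nothing is available locally are guesses, and `resolve_database` is a hypothetical name.

```python
import os


def version_key(version):
    """Turn 'v2.0.0' into (2, 0, 0) so that versions compare numerically."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def resolve_database(argument, local_versions):
    """Decide how a --database argument would be handled.

    Returns a (kind, value) pair where kind is "file", "local" or "download".
    """
    if argument is None:
        if not local_versions:
            return ("download", "latest")  # assumption: nothing local yet
        return ("local", max(local_versions, key=version_key))
    if os.path.isfile(argument) and os.access(argument, os.R_OK):
        return ("file", argument)  # a readable database file is used as is
    if argument in local_versions:
        return ("local", argument)
    return ("download", argument)  # not available locally, fetch it remotely


print(resolve_database(None, ["v2.0.0", "v10.0.0"]))
print(resolve_database("v1.0.0", ["v2.0.0"]))
```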

Listing remote available versions

The command to list available versions from the remote repository is:

cellphonedb database list_remote

Listing local available versions

The command to list available versions from the local repository is:

cellphonedb database list_local

Download version

The command to download a version from the remote repository is:

cellphonedb database download

cellphonedb database download --version <version_spec|latest>

where version_spec must be one of the versions listed by the database list_remote command. If no version is specified, or latest is used as the version_spec, the newest available version will be downloaded.

Generating user-specific custom database

A user can generate custom databases and use them. In order to generate a new database, a user can provide their own lists.

These lists can be: genes, proteins, complexes and/or interactions. In the generation process they will get merged with the ones from the CellPhoneDB release sources. The user lists have higher precedence than the ones included in the CellPhoneDB package.

To generate such a database the user has to issue this command:

cellphonedb database generate

Generate specific parameters:

--user-protein: Protein input file
--user-gene: Gene input file
--user-complex: Complex input file
--user-interactions: Interactions input file
--fetch: Some lists can be downloaded from original sources while creating the database, eg: uniprot, ensembl. By default, the snapshots included in the CellPhoneDB package will be used; to enable a fresh copy --fetch must be appended to the command
--result-path: Output folder
--log-file: Log file
--user-interactions-only: Use only interactions provided.

The resulting database file is generated in the folder out with the name cellphonedb_user_{datetime}.db. The user-defined input tables will be merged with the current CellPhoneDB input tables. To use this database, pass it to the methods with the --database parameter, e.g.:

 cellphonedb method statistical_analysis in/example_data/test_meta.txt in/example_data/test_counts.txt --database out/cellphonedb_user_2019-05-10-11_10.db
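The precedence rule described above (user lists override the packaged ones, and `--user-interactions-only` drops the packaged interactions) can be pictured as a keyed merge. This is a minimal sketch of that idea, not the generator's code. The `partner_a`/`partner_b` key columns, the `source` field and the row values are assumptions for illustration.

```python
def merge_interactions(release_rows, user_rows, user_only=False):
    """Merge interaction rows keyed by (partner_a, partner_b); user rows win."""
    merged = {}
    if not user_only:
        for row in release_rows:
            merged[(row["partner_a"], row["partner_b"])] = row
    for row in user_rows:
        # A user row with the same key replaces the packaged row.
        merged[(row["partner_a"], row["partner_b"])] = row
    return list(merged.values())


# Hypothetical rows; the identifiers are placeholders.
release = [
    {"partner_a": "P1", "partner_b": "P2", "source": "release"},
    {"partner_a": "P3", "partner_b": "P4", "source": "release"},
]
user = [
    {"partner_a": "P1", "partner_b": "P2", "source": "user"},
    {"partner_a": "P5", "partner_b": "P6", "source": "user"},
]

merged = merge_interactions(release, user)
only_user = merge_interactions(release, user, user_only=True)
print(len(merged), len(only_user))
```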

Examples for user-specific custom database

To add or correct some interactions

Input:
- your_custom_interaction_file.csv: Comma separated file (use mandatory columns!) with interactions to add/correct.
Command:
```
cellphonedb database generate --user-interactions your_custom_interaction_file.csv 
```
Result:

New database file with CellPhoneDB interactions + user custom interactions. For duplicated interactions, user lists overwrite the CellPhoneDB original data.

To use only user-specific interactions

Input:
- your_custom_interaction_file.csv: Comma separated file (use mandatory columns!) with interactions to use.
Command:
```
cellphonedb database generate --user-interactions your_custom_interaction_file.csv --user-interactions-only
```
Result:

New database file with only user custom interactions.

To correct any protein data

Input:
- your_custom_protein_file.csv: Comma separated file (use mandatory columns!) with proteins to overwrite.
Command:
```
cellphonedb database generate --user-protein your_custom_protein_file.csv 
```
Result:

New database file with CellPhoneDB data + user custom proteins. For duplicated proteins, user lists overwrite the CellPhoneDB original data.

To add some interactions and correct any protein data

Input:
- your_custom_interaction_file.csv: Comma separated file (use mandatory columns!) with interactions to add/correct.
- your_custom_protein_file.csv: Comma separated file (use mandatory columns!) with proteins to overwrite.
Command:
```
cellphonedb database generate --user-interactions your_custom_interaction_file.csv --user-protein your_custom_protein_file.csv 
```
Result:

New database file with CellPhoneDB interactions + user custom interactions. For duplicated interactions or proteins, user lists overwrite the CellPhoneDB original data.

To update remote sources (UniProt, IMEx, Ensembl, etc.)

IMPORTANT

This command uses external resources hosted on external servers. The command may not finish correctly if the external servers are unavailable. The duration of this step depends on the external servers and the user's internet connection, and it can take a long time.

Input:
- your_custom_interaction_file.csv: Comma separated file (use mandatory columns!) with interactions to add/correct.
- your_custom_protein_file.csv: Comma separated file (use mandatory columns!) with proteins to overwrite.
Command:
```
cellphonedb database generate --fetch 
```
Result:

New database file with CellPhoneDB interactions + user custom interactions. For duplicated interactions or proteins, user lists overwrite the CellPhoneDB original data.

Some lists can be downloaded from original sources while creating the database, e.g. UniProt, Ensembl. By default, the snapshots included in the CellPhoneDB package will be used; to fetch a fresh copy, --fetch must be appended to the command.

To use specific lists, specify them with --user-protein, --user-gene, --user-complex or --user-interactions, each followed by the corresponding file path. The --user-interactions-only flag takes no value.

The database file can then be used as explained above. The intermediate lists used for the generation will be saved alongside the database itself.

As the lists are processed, then filtered, and lastly collected, two versions may exist: _generated is the unfiltered one, whereas _input is the final state prior to being inserted in the database.

Contributing to CellPhoneDB

CellPhoneDB is an open-source project. If you are interested in contributing to this project, please let us know.

You can check all project documentation in the Docs section


cellphonedb's Issues

Results rounded to 1 decimal place

Hi, thanks for a great package! I ran the code and the results look good, but all of the p-values are rounded to a single decimal place, so it's impossible to tell the difference between significant hits and noise. Is there a way to get the full p-values?

Unsupported operand types

cellphonedb method statistical_analysis meta.txt celldb_counts.txt --project-name=CellPhoneDB10 --iterations=10
I tried the above line of code. It was running for a few seconds and I got this error
TypeError: ("unsupported operand type(s) for +: 'float' and 'str'", 'occurred at index ENSG00000107562').

Example file did not have an issue and was working fine.
I double checked my file formats and the example file. There seems to be no difference.

Any help is appreciated.
Thank you very much!

Running cellphonedb analysis on multiple samples

Hello,

Thanks for the updated version of cellphonedb, I am going to try out the new features.

I have a question about running this analysis on 10 samples. Should I run the analysis 10 times by taking cells from 1 sample in one run ?

If I combine all the cells from all 10 samples in the same analysis, will the interactions also include inter-sample interactions ? What is the best way to avoid that ?

Thank you.

receptor_a and receptor_b are the same True/False

Hi, thanks for your great database and tool. When I was using cellphonedb to predict functional interactions, I found that some interacting pairs have the same annotation (True True or False False) in the receptor_a and receptor_b columns of the significant_means.txt file.

Just like that:
CPI-SC0C3E2D267 F10_aMb2 complex simple:P00742 complex:aMb2 complex ENSG00000126218 True False False curated ,,,

Would you please tell me what they mean? Thank you very much.

should the input data be raw data or normalized data?

When I use normalized data as input (data from Seurat in log format), I find that the output contains very few genes (only 18 genes). I think it might be caused by negative values in the normalized input. So I changed to raw data as input and got 1873 genes in deconvoluted.txt, but I wonder whether cellphonedb will normalize it or not. Thank you for your work; I look forward to your reply ^_^

ValueError: not enough values to unpack (expected 2, got 0)

Hi,
I have been trying to run cellphonedb but I am having an error.
I am adding below the command line, outputs and details of Input.
Please let me know what to change.
Thanks in advance,
Julie

cellphonedb method analysis pah.subset.metada.cpdb.txt pah.subset.count.cpdb.txt --counts-data gene_name

error

/home/jrodor2/cpdb-venv/lib/python3.5/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.cluster.k_means_ module is deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.cluster. Anything that cannot be imported from sklearn.cluster is now part of the private API.
warnings.warn(message, FutureWarning)
[ ][APP][21/01/20-15:24:00][WARNING] Latest local available version is v2.0.0, using it
[ ][APP][21/01/20-15:24:00][WARNING] User selected downloaded database v2.0.0 is available, using it
[ ][CORE][21/01/20-15:24:00][INFO] Initializing SqlAlchemy CellPhoneDB Core
[ ][CORE][21/01/20-15:24:00][INFO] Using custom database at /home/jrodor2/.cpdb/releases/v2.0.0/cellphone.db
[ ][APP][21/01/20-15:24:00][INFO] Launching Method cpdb_analysis_local_method_launcher
[ ][APP][21/01/20-15:24:00][INFO] Launching Method _set_paths
[ ][APP][21/01/20-15:24:00][INFO] Launching Method _load_meta_counts
[ ][CORE][21/01/20-15:24:01][INFO] Launching Method cpdb_method_analysis_launcher
[ ][CORE][21/01/20-15:24:01][INFO] Launching Method _counts_validations
[ ][CORE][21/01/20-15:24:01][INFO] [Non Statistical Method] Threshold:0.1 Precission:3
[ ][CORE][21/01/20-15:24:01][INFO] Running Simple Prefilters
[ ][CORE][21/01/20-15:24:02][INFO] [Non Statistical Method] Threshold:0.1 Precision:3
[ ][CORE][21/01/20-15:24:02][INFO] Running Complex Prefilters
[ ][APP][21/01/20-15:24:02][ERROR] Unexpected error
Traceback (most recent call last):
File "/home/jrodor2/cpdb-venv/lib/python3.5/site-packages/cellphonedb/src/api_endpoints/terminal_api/method_terminal_api_endpoints/method_terminal_commands.py", line 207, in analysis
subsampler,
File "/home/jrodor2/cpdb-venv/lib/python3.5/site-packages/cellphonedb/src/local_launchers/local_method_launcher.py", line 98, in cpdb_analysis_local_method_launcher
subsampler)
File "/home/jrodor2/cpdb-venv/lib/python3.5/site-packages/cellphonedb/src/core/methods/method_launcher.py", line 113, in cpdb_method_analysis_launcher
result_precision)
File "/home/jrodor2/cpdb-venv/lib/python3.5/site-packages/cellphonedb/src/core/methods/cpdb_analysis_method.py", line 41, in call
deconvoluted.drop_duplicates(inplace=True)
File "/home/jrodor2/cpdb-venv/lib/python3.5/site-packages/pandas/core/frame.py", line 4331, in drop_duplicates
duplicated = self.duplicated(subset, keep=keep)
File "/home/jrodor2/cpdb-venv/lib/python3.5/site-packages/pandas/core/frame.py", line 4385, in duplicated
labels, shape = map(list, zip(*map(f, vals)))
ValueError: not enough values to unpack (expected 2, got 0)

input #my metadata file looks like this

Cell cluster
pah3_CGGACACGTTGAGTTC c3
contA_CACACTCTCTGTTTGT c1
contA_CTCGTCATCCGTCAAA c7
pah3_CTCGAGGGTTTGACAC c0
pah2_TCTCATAAGCGTTCCG c0
pah2_GAGCAGATCCAAGCCG c3
pah2_GGCTGGTGTTACAGAA c0
contB_TCAGCAACATATGGTC c1
pah1_GCACATATCAAGGTAA c3

count table (showing subset 10 rows/5 columns)

Gene pah3_CGGACACGTTGAGTTC contA_CACACTCTCTGTTTGT contA_CTCGTCATCCGTCAAA pah3_CTCGAGGGTTTGACAC
Rp1 0 0 0 0
Sox17 0 0 0 2.74082443266566
Mrpl15 0 0 0 0
Lypla1 0 0 0 0
Gm37988 0 0 0 0
Tcea1 1.65068087096815 0 1.2849200801299 0
Atp6v1h 0 0 0 1.76357478287235
Rb1cc1 0 0 0 0
4732440D04Rik 0 0 0 0

fail to run plot

Hi, all
I tried to use the example data to test if the cellphonedb has been installed correctly. I can successfully use the tool "method",but failed with the "plot". The error info is as follows. Can anyone help me?

R[write to console]: Error: package or namespace load failed for ‘methods’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/software/R-3.6.1/lib64/R/library/methods/libs/methods.so':
/software/R-3.6.1/lib64/R/library/methods/libs/methods.so: undefined symbol: Rf_allocS4Object

R[write to console]: Error: package or namespace load failed for ‘utils’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/software/R-3.6.1/lib64/R/library/utils/libs/utils.so':
/software/R-3.6.1/lib64/R/library/utils/libs/utils.so: undefined symbol: R_NilValue

R[write to console]: Error: package or namespace load failed for ‘grDevices’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/software/R-3.6.1/lib64/R/library/grDevices/libs/grDevices.so':
/software/R-3.6.1/lib64/R/library/grDevices/libs/grDevices.so: undefined symbol: R_NilValue

R[write to console]: Error: package or namespace load failed for ‘graphics’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/software/R-3.6.1/lib64/R/library/grDevices/libs/grDevices.so':
/software/R-3.6.1/lib64/R/library/grDevices/libs/grDevices.so: undefined symbol: R_NilValue

R[write to console]: Error: package or namespace load failed for ‘stats’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/software/R-3.6.1/lib64/R/library/grDevices/libs/grDevices.so':
/software/R-3.6.1/lib64/R/library/grDevices/libs/grDevices.so: undefined symbol: R_NilValue

R[write to console]: During startup -
R[write to console]: Warning messages:

R[write to console]: 1: package "methods" in options("defaultPackages") was not found

R[write to console]: 2: package ‘utils’ in options("defaultPackages") was not found

R[write to console]: 3: package ‘grDevices’ in options("defaultPackages") was not found

R[write to console]: 4: package ‘graphics’ in options("defaultPackages") was not found

R[write to console]: 5: package ‘stats’ in options("defaultPackages") was not found

R[write to console]: 6: package ‘methods’ in options("defaultPackages") was not found

R[write to console]: Error in dyn.load(file, DLLpath = DLLpath, ...) :
unable to load shared object '/software/R-3.6.1/lib64/R/library/methods/libs/methods.so':
/software/R-3.6.1/lib64/R/library/methods/libs/methods.so: undefined symbol: Rf_allocS4Object

[ ][APP][05/09/19-08:20:40][ERROR] Unexpected error
Traceback (most recent call last):
File "/software/Python-3.7.3/lib/python3.7/site-packages/cellphonedb/src/api_endpoints/terminal_api/plot_terminal_api_endpoints/plot_terminal_commands.py", line 72, in heatmap_plot
pvalue=pvalue)
File "/software/Python-3.7.3/lib/python3.7/site-packages/cellphonedb/src/plotters/r_plotter.py", line 36, in wrapper
from rpy2 import robjects
File "/software/Python-3.7.3/lib/python3.7/site-packages/rpy2/robjects/__init__.py", line 17, in <module>
from rpy2.robjects.robject import RObjectMixin, RObject
File "/software/Python-3.7.3/lib/python3.7/site-packages/rpy2/robjects/robject.py", line 58, in <module>
class RObjectMixin(object):
File "/software/Python-3.7.3/lib/python3.7/site-packages/rpy2/robjects/robject.py", line 70, in RObjectMixin
__show = _get_exported_value('methods', 'show')
File "/software/Python-3.7.3/lib/python3.7/site-packages/rpy2/rinterface_lib/conversion.py", line 28, in _
cdata = function(*args, **kwargs)
File "/software/Python-3.7.3/lib/python3.7/site-packages/rpy2/rinterface.py", line 773, in call
raise embedded.RRuntimeError(_rinterface._geterrmessage())
rpy2.rinterface_lib.embedded.RRuntimeError: Error in dyn.load(file, DLLpath = DLLpath, ...) :
unable to load shared object '/software/R-3.6.1/lib64/R/library/methods/libs/methods.so':
/software/R-3.6.1/lib64/R/library/methods/libs/methods.so: undefined symbol: Rf_allocS4Object

Results visualization

P values >0.05 not being filtered

Hey,

Great application, I'm just worried that many of the interactions in my results files have P values over 0.05 for all cluster-cluster comparisons (I've used the default statistical pipeline).

Why might this be the case? Also, could you outline the parameters an interaction needs to meet before being included in the output (the high P value interactions don't necessarily have high mean values, so the cutoffs used don't seem intuitive).

Thank you for any information!

heatmap_plot

Hi,

thanks for the great and very useful tool!

However, I have a problem when I try to generate the heatmap_plot. I get the following error:

Error: Missing argument "meta-path".

Thanks for your help!

Best

Stephan

large sparse matrix as input

Hi, thanks for the great tool!
would it be possible to accept large matrices in the mtx format instead of txt? or did you think of a fast routine to convert sparse matrices to an input that the tool would accept?
I am writing my large matrix to txt using a mix of R and numpy/pandas but it takes forever so maybe I'm doing something wrong.
thank you!
biola

--result-path in cellphonedb database generate not taken properly

Hi,

When running cellphonedb database generate --result-path, result-path is appended to the working directory, even if an absolute path is given.

Inconsistent receptor-ligand order

Thank you for curating a list of protein receptor-ligand interactions, very helpful!

I noticed that the majority of interactions are listed as LIGAND_RECEPTOR in the database, but every now and then you run across the opposite.

Ex: Most of the Wnt interactions are WNT_FZD, but a few like FZD1_WNT3A are reversed.

Is it possible to make these consistent in a future release? Or Is there a simple way to pass a custom interaction list within the CLI?

Kyle

p-values of 0.0

TypeError: cannot do label indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [nan] of <class 'float'>

Dear all,
I ran cellphonedb with my own data but encountered this error . I have checked the format of my input data but found no difference with the test example.
Thank you!

output gene name format

Hi!

Awesome software! Thank you!

I was wondering what is the format of the output gene name. I noticed that some are gene symbols some are not (i.e. FcRn complex)

Some details about the CellphoneDB

Hi,all
I have used the cellphonedb and I think it is a great tool!
But I have some question about the cellphonedb.
1, How do you use the permutation test to calculate the significant intercellular interactions?
2, In the pvalue.txt , Why is interaction clusterA_clusterB different from interaction clusterB_clusterA?
3, In the interacting_pair of pvalue.txt, can I know which one is the ligand and which one is the receptor? And how can I distinguish
the receptor – ligand?
4, Could you build a database of intercellular interactions in mice?
Thanks!

How to calculate the forceNetwork between cells based on the results of cellphonedb?

How to calculate the forceNetwork between cells based on the results of cellphonedb

Filtering the results and visualization

Hello,
I used CellPhoneDB for my data and I want to visualize it in Cytoscape. I prepared a proper network file but it is too big to have a proper visualization. I used significant_means file as the significant results and used it. Network has 29 nodes and 12310 edges and the node with the lowest degree has 258 degrees. How can I filter these results? I've seen in the publication you said "edges with more than 30 interactions" so basically you removed the interacting pairs below 30?
I have looked for documentation for results but could not find any information about this value. What does it represent actually?

Thank you in advance

different results between subsampled data and the complete data

Hi,
I recently tried cellphone DB on my subsampled data and the complete data and the results are different.

Here are the commands I used for each:
cellphonedb method statistical_analysis /home/yxiao832/SeuratObject/NC_NASH_nuclei_3_9merge/metadata.txt /home/yxiao832/SeuratObject/NC_NASH_nuclei_3_9merge/Count_HumanID.txt --threads 12 --subsampling --subsampling-log false --subsampling-num-cells 3000

cellphonedb method statistical_analysis /home/yxiao832/SeuratObject/NC_NASH_nuclei_3_9merge/metadata.txt /home/yxiao832/SeuratObject/NC_NASH_nuclei_3_9merge/Count_HumanID.txt --threads 12

I can also send you the result if necessary.

Thank you!
Yang

ValueError: You are trying to merge on int64 and object columns

Hello, I am getting this error when I am trying to run cellphonedb with our dataset. I am not sure what this error means and what is triggering it. Any comments/help would be appreciated. Thanks.

[ ][CORE][21/03/19-16:39:15][INFO] Initializing SqlAlchemy CellPhoneDB Core
[ ][APP][21/03/19-16:39:15][INFO] Launching Method cpdb_analysis_local_method_launcher
[ ][APP][21/03/19-16:39:15][INFO] Launching Method _set_paths
[ ][APP][21/03/19-16:39:15][INFO] Launching Method _load_meta_counts
[ ][CORE][21/03/19-16:53:50][INFO] Launching Method cpdb_method_analysis_launcher
[ ][CORE][21/03/19-16:53:50][INFO] Launching Method _counts_validations
[ ][CORE][21/03/19-16:54:41][INFO] [Non Statistical Method] Threshold:0.1 Precission:3
[ ][CORE][21/03/19-16:54:41][INFO] Running Simple Prefilters
[ ][CORE][21/03/19-16:54:57][INFO] [Non Statistical Method] Threshold:0.1 Precision:3
[ ][CORE][21/03/19-16:54:57][INFO] Running Complex Prefilters
[ ][APP][21/03/19-16:54:57][ERROR] Unexpected error
Traceback (most recent call last):
File "/ihome/crc/install/python/anaconda3.7-5.3.1_genomics/lib/python3.7/site-packages/cellphonedb/src/api_endpoints/terminal_api/method_terminal_api_endpoints/method_terminal_commands.py", li$
result_precision,
File "/ihome/crc/install/python/anaconda3.7-5.3.1_genomics/lib/python3.7/site-packages/cellphonedb/src/local_launchers/local_method_launcher.py", line 84, in cpdb_analysis_local_method_launcher
result_precision)
File "/ihome/crc/install/python/anaconda3.7-5.3.1_genomics/lib/python3.7/site-packages/cellphonedb/src/core/methods/method_launcher.py", line 92, in cpdb_method_analysis_launcher
result_precision)
File "/ihome/crc/install/python/anaconda3.7-5.3.1_genomics/lib/python3.7/site-packages/cellphonedb/src/core/methods/cpdb_analysis_method.py", line 29, in call
result_precision)
File "/ihome/crc/install/python/anaconda3.7-5.3.1_genomics/lib/python3.7/site-packages/cellphonedb/src/core/methods/cpdb_analysis_complex_method.py", line 25, in call
complex_compositions)
File "/ihome/crc/install/python/anaconda3.7-5.3.1_genomics/lib/python3.7/site-packages/cellphonedb/src/core/methods/cpdb_analysis_complex_method.py", line 290, in prefilters
counts_multidata = cluster_counts_filter.filter_by_gene(counts, genes)
File "/ihome/crc/install/python/anaconda3.7-5.3.1_genomics/lib/python3.7/site-packages/cellphonedb/src/core/models/cluster_counts/cluster_counts_filter.py", line 11, in filter_by_gene
clusters_filtered = pd.merge(cluster_counts, genes, left_on='gene', right_on=right_column)
File "/ihome/crc/install/python/anaconda3.7-5.3.1_genomics/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 61, in merge
validate=validate)
File "/ihome/crc/install/python/anaconda3.7-5.3.1_genomics/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 555, in init
self._maybe_coerce_merge_keys()
File "/ihome/crc/install/python/anaconda3.7-5.3.1_genomics/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 983, in _maybe_coerce_merge_keys
raise ValueError(msg)
ValueError: You are trying to merge on int64 and object columns. If you wish to proceed you should use pd.concat
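
The final ValueError is raised by pandas itself when the two key columns of a merge have incompatible dtypes. Below is a minimal sketch of that situation, assuming (not confirmed in this thread) that the gene identifiers in the counts file were parsed as integers while the database side holds strings. The frames and column names are made up for illustration and are not CellPhoneDB's own tables.

```python
import pandas as pd

# Hypothetical stand-ins: counts whose gene ids were parsed as integers,
# and a gene table whose ids are strings.
cluster_counts = pd.DataFrame({"gene": [1, 2, 3], "cluster_a": [0.5, 0.0, 1.2]})
genes = pd.DataFrame({"ensembl": ["1", "2", "ENSG00000000003"], "name": ["g1", "g2", "g3"]})

try:
    pd.merge(cluster_counts, genes, left_on="gene", right_on="ensembl")
    merge_failed = False
except ValueError as err:
    # Same family of error as in the traceback above.
    merge_failed = True
    print(err)

# Casting the key to str makes both sides comparable.
cluster_counts["gene"] = cluster_counts["gene"].astype(str)
merged = pd.merge(cluster_counts, genes, left_on="gene", right_on="ensembl")
print(merged)
```

If this is the cause, checking the dtype of the gene column right after loading the counts file should show int64 (or float64) instead of object.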

Generate customized database

Dear developers:
I was wondering whether it is possible to generate my own database if I only have a gene interaction table.
It seems that cellphonedb database generate needs four data files.
I think I can infer the protein table and gene table by querying a database such as UniProt.
So the question is how to generate the complex table for cellphonedb, or whether that table is necessary at all.

Heatmap order of cells

Hi!

Is there a way to tell cellphoneDB a specific way to order the heatmap rows and columns? I'm trying to compare two datasets and it always generates heatmaps that have some hierarchical clustering done on them.

Similarly, what is the output file being parsed into pheatmap? I can also just plot that object onto pheatmap to get the order of the columns and rows in a specific way.

Thanks!

Lorenz

Output format

All the documentation claims that the output files are in .csv format, yet all the defaults are set to .txt and the actual contents of all the output files are tab-separated.
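
For anyone loading the results programmatically, here is a minimal sketch of reading a tab-separated output with pandas regardless of its extension. The in-memory text stands in for a real output file, and its columns and values are made up.

```python
import io
import pandas as pd

# In-memory stand-in for a tab-separated output file; columns and values are made up.
output_txt = "id_cp_interaction\tclusterA|clusterB\nCPI-EXAMPLE1\t0.012\n"

# sep="\t" is the important part; the file extension does not matter to pandas.
table = pd.read_csv(io.StringIO(output_txt), sep="\t")
print(table.shape)
```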

[ERROR] Invalid Counts data

Hey there! I am new to cellphonedb. I exported my normalised count data in the same format as your example and ran the statistical_analysis method on my meta and count files, but I keep getting the error "Invalid Counts data". I extracted the count data from my Seurat object and then wrote it to a txt table:

human_count <- human.Seurat@assays$RNA@counts %>% as.data.frame() %>% rownames_to_column() %>% rename(Gene = rowname)

write.table(human_count,"human_count_integrated_30PC.txt",sep="\t")

Is there anything I did wrong? Could you suggest a solution?

input cell labels as integers

When the meta and counts file have cell labels that are integers e.g. range(0, 2535), I get the following error:

[ERROR] Invalid Counts data: Some cells in meta didnt exist in counts columns. Maybe incorrect file format.

I checked that the labels in meta exactly match the labels in the counts columns. Replacing the labels with strings, e.g. cell0, cell1, fixes this issue. Might there be a type problem in parsing the input?
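
One way such a mismatch can arise is sketched below with plain pandas. This is a guess at the mechanism, not CellPhoneDB's actual loading code: header labels are always read as strings, while a column of integer-like values is inferred as int64, so the two never compare equal. The file contents are tiny made-up stand-ins.

```python
import io
import pandas as pd

# Tiny in-memory stand-ins for a meta file and a counts file (tab-separated).
meta_txt = "Cell\tcell_type\n0\tT\n1\tB\n"
counts_txt = "Gene\t0\t1\nENSG00000000003\t3\t0\n"

meta = pd.read_csv(io.StringIO(meta_txt), sep="\t")
counts = pd.read_csv(io.StringIO(counts_txt), sep="\t", index_col=0)

# Header labels stay strings, but the Cell column is inferred as int64.
print(meta["Cell"].dtype, list(counts.columns))
matches_default = meta["Cell"].isin(counts.columns).all()

# Reading the labels as strings makes the two sides comparable again.
meta_str = pd.read_csv(io.StringIO(meta_txt), sep="\t", dtype={"Cell": str})
matches_as_str = meta_str["Cell"].isin(counts.columns).all()
print(matches_default, matches_as_str)
```

This would be consistent with the workaround above: labels such as cell0 can only be parsed as strings on both sides.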

Error when running cellphonedb on too many cells

Hi,

I'm receiving the following unexpected error in the 'Running Statistical Analysis' step:

File "/Linux/redhat_7_x86_64/pkgs/anaconda3_5.3.1/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result:
...
...
[1364 rows x 3481 columns]]'. Reason: 'error("'i' format requires -2147483648 <= number <= 2147483647")'

This error only occurs when I am running cellphonedb on larger datasets.

Thank you for your help!
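
The 'i' format message comes from Python's struct module. As far as I know, Python versions before 3.8 prefix each message sent between worker processes with a signed 32-bit length, so a single result larger than about 2 GiB cannot be sent back to the parent. The sketch below only demonstrates that limit. The helper name is made up and this is not CellPhoneDB code.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit length header can hold


def fits_in_int32_header(n_bytes: int) -> bool:
    """Return True if n_bytes can be packed as a signed 32-bit integer."""
    try:
        struct.pack("!i", n_bytes)
        return True
    except struct.error:
        return False


print(fits_in_int32_header(INT32_MAX))      # a payload just under 2 GiB
print(fits_in_int32_header(INT32_MAX + 1))  # one byte more overflows the header
```

If this is the cause, reducing what each worker returns (for example by analysing fewer cells per run) or running on Python 3.8 or later may avoid the error. Neither is confirmed in this thread.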

index and meta process error

Thanks for your amazing work. I was using cellphonedb to analyze my dataset and encountered the errors below:
File "/data5/lijiaming/tools/cpdb-venv/lib/python3.7/site-packages/pandas/core/indexing.py", line 2009, in _valie_integer
raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds

During handling of the above exception, another exception occurred:

...
File "/data5/lijiaming/tools/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/core/preprocessors/method_precessors.py", line 35, in meta_preprocessor
raise ProcessMetaException
cellphonedb.src.core.exceptions.ProcessMetaException.ProcessMetaException: Error processing Meta data

I think my files have the same format as the test files, and the number of rows in the meta file matches the number of columns in the count file.
Is there any solution?

How to use user defined ligand-receptor datasets

Hi CellphoneDB team,

I want to infer cell-cell communication with my own database. I tried the command below.

cellphonedb database generate --user-interactions your_custom_interaction_file.csv --user-interactions-only

So what kind of csv file should I input? Could you show me an example of this csv file? Thank you very much!

Best

complex prefilter error

Hi,

Thanks for creating the tool! I am trying to pipe my expression data and metadata from R to cellphonedb. But no matter how I transform the expression data (raw, downsampled, downsampled and log-transformed), it seems to get stuck at the prefilter step of the cellphonedb analysis. Could you help me pin down where the error is coming from?

Thank you!
Yuqi
###################
This is how I format the data for output in R:
dim(sampTab_db)
[1] 266 2

dim(expDat)
[1] 56391 266

write.table(sampTab_db[,c("Cell", "cell_type")], file = "sampTab.txt",row.names=FALSE, sep = "\t")
write.table(data.frame("Gene" = rownames(expDat), expDat), file = "expDat.txt", row.names = FALSE, sep = "\t")

###################
here is my cellphonedb output
(cpdb-venv) MacBook-Pro:test YT$ cellphonedb method analysis sampTab.txt expDat.txt
[ ][APP][24/09/19-10:25:52][WARNING] Latest local available version is v2.0.0, using it
[ ][APP][24/09/19-10:25:52][WARNING] User selected downloaded database v2.0.0 is available, using it
[ ][CORE][24/09/19-10:25:52][INFO] Initializing SqlAlchemy CellPhoneDB Core
[ ][CORE][24/09/19-10:25:52][INFO] Using custom database at /Users/YT/.cpdb/releases/v2.0.0/cellphone.db
[ ][APP][24/09/19-10:25:52][INFO] Launching Method cpdb_analysis_local_method_launcher
[ ][APP][24/09/19-10:25:52][INFO] Launching Method _set_paths
[ ][APP][24/09/19-10:25:52][INFO] Launching Method _load_meta_counts
[ ][CORE][24/09/19-10:25:54][INFO] Launching Method cpdb_method_analysis_launcher
[ ][CORE][24/09/19-10:25:54][INFO] Launching Method _counts_validations
[ ][CORE][24/09/19-10:25:54][INFO] [Non Statistical Method] Threshold:0.1 Precission:3
[ ][CORE][24/09/19-10:25:54][INFO] Running Simple Prefilters
[ ][CORE][24/09/19-10:25:54][INFO] [Non Statistical Method] Threshold:0.1 Precision:3
[ ][CORE][24/09/19-10:25:54][INFO] Running Complex Prefilters
[ ][APP][24/09/19-10:25:54][ERROR] Unexpected error
Traceback (most recent call last):
File "/Users/YT/Desktop/test/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/api_endpoints/terminal_api/method_terminal_api_endpoints/method_terminal_commands.py", line 207, in analysis
subsampler,
File "/Users/YT/Desktop/test/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/local_launchers/local_method_launcher.py", line 98, in cpdb_analysis_local_method_launcher
subsampler)
File "/Users/YT/Desktop/test/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/core/methods/method_launcher.py", line 113, in cpdb_method_analysis_launcher
result_precision)
File "/Users/YT/Desktop/test/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/core/methods/cpdb_analysis_method.py", line 35, in call
result_precision)
File "/Users/YT/Desktop/test/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/core/methods/cpdb_analysis_complex_method.py", line 29, in call
complex_compositions, counts_data)
File "/Users/YT/Desktop/test/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/core/methods/cpdb_analysis_complex_method.py", line 329, in prefilters
complexes, complex_compositions)
File "/Users/YT/Desktop/test/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/core/methods/cpdb_analysis_complex_method.py", line 391, in get_involved_complex_from_counts
drop_duplicates=False)
File "/Users/YT/Desktop/test/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/core/models/complex/complex_helper.py", line 11, in get_involved_complex_from_protein
right_on='id_multidata')
File "/Users/YT/Desktop/test/cpdb-venv/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 61, in merge
validate=validate)
File "/Users/YT/Desktop/test/cpdb-venv/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 551, in init
self.join_names) = self._get_merge_keys()
File "/Users/YT/Desktop/test/cpdb-venv/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 857, in _get_merge_keys
rk, stacklevel=stacklevel))
File "/Users/YT/Desktop/test/cpdb-venv/lib/python3.7/site-packages/pandas/core/generic.py", line 1382, in _get_label_or_level_values
raise KeyError(key)
KeyError: 'id_multidata'

Input format error??

Hi ,
Thank you very much for this package. I converted my anndata (scanpy) object to the count file and meta file using your script at https://www.cellphonedb.org/faq-and-troubleshooting. I then ran cellphonedb as shown below and got an error that I can't understand. Any help would be greatly appreciated.

Thanks

$ cellphonedb method statistical_analysis meta_8w_I_ES.txt count_8w_I_ES.txt
/Users/tommy/cpdb-venv/lib/python3.7/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.cluster.k_means_ module is deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.cluster. Anything that cannot be imported from sklearn.cluster is now part of the private API.
warnings.warn(message, FutureWarning)
[ ][APP][28/01/20-10:27:22][WARNING] Latest local available version is v2.0.0, using it
[ ][APP][28/01/20-10:27:22][WARNING] User selected downloaded database v2.0.0 is available, using it
[ ][CORE][28/01/20-10:27:22][INFO] Initializing SqlAlchemy CellPhoneDB Core
[ ][CORE][28/01/20-10:27:22][INFO] Using custom database at /Users/tommy/.cpdb/releases/v2.0.0/cellphone.db
[ ][APP][28/01/20-10:27:22][INFO] Launching Method cpdb_statistical_analysis_local_method_launcher
[ ][APP][28/01/20-10:27:22][INFO] Launching Method _set_paths
[ ][APP][28/01/20-10:27:22][INFO] Launching Method _load_meta_counts
[ ][CORE][28/01/20-10:27:23][INFO] Launching Method cpdb_statistical_analysis_launcher
[ ][CORE][28/01/20-10:27:23][INFO] Launching Method _counts_validations
[ ][CORE][28/01/20-10:27:23][INFO] [Cluster Statistical Analysis Simple] Threshold:0.1 Iterations:1000 Debug-seed:-1 Threads:4 Precision:3
[ ][CORE][28/01/20-10:27:23][INFO] Running Simple Prefilters
[ ][CORE][28/01/20-10:27:24][INFO] Running Real Simple Analysis
[ ][APP][28/01/20-10:27:24][ERROR] Unexpected error
Traceback (most recent call last):
File "/Users/tommy/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/api_endpoints/terminal_api/method_terminal_api_endpoints/method_terminal_commands.py", line 144, in statistical_analysis
subsampler,
File "/Users/tommy/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/local_launchers/local_method_launcher.py", line 64, in cpdb_statistical_analysis_local_method_launcher
subsampler
File "/Users/tommy/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/core/methods/method_launcher.py", line 75, in cpdb_statistical_analysis_launcher
self.separator)
File "/Users/tommy/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/core/methods/cpdb_statistical_analysis_method.py", line 34, in call
result_precision,
File "/Users/tommy/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/core/methods/cpdb_statistical_analysis_simple_method.py", line 38, in call
cluster_interactions = cpdb_statistical_analysis_helper.get_cluster_combinations(clusters['names'])
File "/Users/tommy/cpdb-venv/lib/python3.7/site-packages/cellphonedb/src/core/methods/cpdb_statistical_analysis_helper.py", line 134, in get_cluster_combinations
return sorted(itertools.product(cluster_names, repeat=2))
TypeError: '<' not supported between instances of 'float' and 'str'
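
The last frame sorts pairs of cluster names, and Python cannot order a float against a str. One plausible trigger, which is an assumption here and not confirmed by the poster, is a meta file in which some cells have an empty cluster label: pandas reads an empty field as NaN, which is a float. A minimal sketch with a made-up meta table:

```python
import io
import itertools
import pandas as pd

# Hypothetical meta file in which cell c2 has no cluster label.
meta_txt = "Cell\tcell_type\nc1\tT\nc2\t\nc3\tB\n"
meta = pd.read_csv(io.StringIO(meta_txt), sep="\t")
cluster_names = meta["cell_type"].unique()  # contains 'T', NaN and 'B'

try:
    sorted(itertools.product(cluster_names, repeat=2))
    sort_failed = False
except TypeError as err:
    sort_failed = True
    print(err)

# Dropping (or relabelling) cells without a label leaves only strings to sort.
clean = meta.dropna(subset=["cell_type"])
pairs = sorted(itertools.product(clean["cell_type"].unique(), repeat=2))
print(pairs)
```

Checking the meta file for missing labels with meta["cell_type"].isna().sum() would confirm or rule this out.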

dot plot

Dear All,
I am facing some problems producing the dot plot after running cellphonedb on my single-cell data.

1) Default option --> pdf
The file opens in Adobe, but it warns that the file is too big and so it is truncated.

2) Options jpeg and png
Error in grDevices::png(..., res = dpi, units = "in") :
unable to start device 'png'

R[write to console]: Warning message:
R[write to console]: In grDevices::png(..., res = dpi, units = "in") :
R[write to console]:
R[write to console]: cairo error 'invalid value (typically too big) for the size of the input (surface, pattern, etc.)'
[ ][APP][04/12/19-10:53:50][ERROR] R Runtime Exception: Error in grDevices::png(..., res = dpi, units = "in") :
unable to start device 'png'

How can I resolve these issues?
Is there a way to save to ggplot2 file?

Many Thanks
Paolo

Having trouble using the function cellphonedb plot

It's my first time using cellphonedb. I have no problem using cellphonedb method.
But when I ran cellphonedb plot with the command "cellphonedb plot dot-plot",

it showed the following errors:

R version 3.6.1 (2019-07-05) -- "Action of the Toes"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.

As there is no R environment set up, some functionalities will be disabled, e.g. plot
R version 3.6.1 (2019-07-05) -- "Action of the Toes"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

You cannot perform this plot command unless there is a working R setup according to CellPhoneDB specs

So what can I do with this situation? I have already installed R (version 3.6.1)
Thanks~

ValueError: cannot reindex from a duplicate axis

Dear all,
I am trying to use the package with my own data, but I keep encountering this error while running "cellphonedb method statistical_analysis clusters.txt matrice.txt".

I checked my files by running head file.txt in the terminal, and they match the example files. The only difference is that my cell types are numeric, corresponding to the cluster numbers. Could that be the issue?

It happens at the Building Simple results step.

Thank you!
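
pandas raises "cannot reindex from a duplicate axis" when an index with repeated labels is reindexed, so repeated gene or cell names in the inputs may be worth ruling out. That this is the cause here is an assumption. A minimal sketch with a made-up counts table:

```python
import pandas as pd

# Hypothetical counts table in which one gene id appears twice.
counts = pd.DataFrame(
    {"cell1": [1, 0, 2], "cell2": [0, 3, 1]},
    index=pd.Index(["GENE_A", "GENE_B", "GENE_A"], name="Gene"),
)

# List the labels that occur more than once on each axis.
dup_genes = counts.index[counts.index.duplicated()].unique().tolist()
dup_cells = counts.columns[counts.columns.duplicated()].unique().tolist()
print(dup_genes, dup_cells)

# One possible way to collapse duplicates: sum the rows that share a gene id.
deduplicated = counts.groupby(level=0).sum()
print(deduplicated)
```

Whether summing is appropriate depends on how the counts were normalised. Keeping only the first occurrence is another option.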

Document 3: significant mean. (significant_mean.csv)

Hi all,
What does 'rank' mean in output file 3 (Document 3: significant mean. (significant_mean.csv))?
Thank you.

Mouse databases?

Hi there!

I found your manuscript and the database you've designed to be truly amazing!

However, I'm confused. Is the current database available only for human data? Should one build a similar database following your instructions in order to analyze mouse data?

id_cp_interaction not unique

When I run the cellphonedb program locally, the id_cp_interaction term CPI-SS027556893 is present twice in the output results.

Input data

Hi,

Thanks for the awesome tool!

I noticed that the text input data is counts.
I was wondering whether TPM data would work the same way?

Thanks!

fail to install cellphonedb

When I use "pip" to install cellphonedb on Win7 or Win10, I encounter a problem related to the "rpy2" package.

WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.org', port=443): Read timed out. (read timeout=15)")': /simple/sqlalchemy/
ERROR: Command errored out with exit status 1:
command: 'D:\Users\Ming\PycharmProjects\untitled\venv\Scripts\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Ming\AppData\Local\Temp\pycharm-packaging\rpy2\setup.py'"'"'; file='"'"'C:\Users\Ming\AppData\Local\Temp\pycharm-packaging\rpy2\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\Ming\AppData\Local\Temp\pycharm-packaging\rpy2\pip-egg-info'
cwd: C:\Users\Ming\AppData\Local\Temp\pycharm-packaging\rpy2
Complete output (92 lines):
warning: no previously-included files found matching 'setup.pyc'
warning: no previously-included files matching 'yacctab.' found under directory 'tests'
warning: no previously-included files matching 'lextab.' found under directory 'tests'
warning: no previously-included files matching 'yacctab.' found under directory 'examples'
warning: no previously-included files matching 'lextab.' found under directory 'examples'
warning: build_py: byte-compiling is disabled, skipping.

warning: install_lib: byte-compiling is disabled, skipping.

zip_safe flag not set; analyzing archive contents...

Installed c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\pycparser-2.19-py3.7.egg
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_NaN' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_NaReal' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_NaInt' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_NaString' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_BlankString' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_BlankScalarString' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_GlobalEnv' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_EmptyEnv' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_BaseEnv' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_BaseNamespace' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_BaseNamespaceRegistry' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_NilValue' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_UnboundValue' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_MissingArg' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_ClassSymbol' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_NameSymbol' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py:164: UserWarning: Global variable 'R_DimSymbol' in cdef(): for consistency with C it should have a storage class specifier (usually 'extern')
  "(usually 'extern')" % (decl.name,))
Traceback (most recent call last):
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py", line 305, in _parse
    ast = _get_parser().parse(fullcsource)
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\pycparser-2.19-py3.7.egg\pycparser\c_parser.py", line 152, in parse
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\pycparser-2.19-py3.7.egg\pycparser\ply\yacc.py", line 331, in parse
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\pycparser-2.19-py3.7.egg\pycparser\ply\yacc.py", line 1199, in parseopt_notrack
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\pycparser-2.19-py3.7.egg\pycparser\ply\yacc.py", line 193, in call_errorfunc
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\pycparser-2.19-py3.7.egg\pycparser\c_parser.py", line 1848, in p_error
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\pycparser-2.19-py3.7.egg\pycparser\plyparser.py", line 67, in _parse_error
pycparser.plyparser.ParseError: <cdef source string>:23:5: before: blah1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Ming\AppData\Local\Temp\pycharm-packaging\rpy2\setup.py", line 184, in <module>
    'rpy2': ['doc/source/rpy2_logo.png', ]}
  File "D:\Users\Ming\PycharmProjects\untitled\venv\lib\site-packages\setuptools-39.1.0-py3.7.egg\setuptools\__init__.py", line 129, in setup
  File "D:\Users\Ming\Anaconda3\lib\distutils\core.py", line 108, in setup
    _setup_distribution = dist = klass(attrs)
  File "D:\Users\Ming\PycharmProjects\untitled\venv\lib\site-packages\setuptools-39.1.0-py3.7.egg\setuptools\dist.py", line 363, in __init__
  File "D:\Users\Ming\Anaconda3\lib\distutils\dist.py", line 292, in __init__
    self.finalize_options()
  File "D:\Users\Ming\PycharmProjects\untitled\venv\lib\site-packages\setuptools-39.1.0-py3.7.egg\setuptools\dist.py", line 519, in finalize_options
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\setuptools_ext.py", line 217, in cffi_modules
    add_cffi_module(dist, cffi_module)
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\setuptools_ext.py", line 49, in add_cffi_module
    execfile(build_file_name, mod_vars)
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\setuptools_ext.py", line 25, in execfile
    exec(code, glob, glob)
  File "rpy/_rinterface_cffi_build.py", line 546, in <module>
    """ if os.name == 'nt' else ''
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\api.py", line 112, in cdef
    self._cdef(csource, override=override, packed=packed, pack=pack)
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\api.py", line 126, in _cdef
    self._parser.parse(csource, override=override, **options)
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py", line 358, in parse
    self._internal_parse(csource)
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py", line 363, in _internal_parse
    ast, macros, csource = self._parse(csource)
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py", line 307, in _parse
    self.convert_pycparser_error(e, csource)
  File "c:\users\ming\appdata\local\temp\pycharm-packaging\rpy2\.eggs\cffi-1.13.2-py3.7-win-amd64.egg\cffi\cparser.py", line 336, in convert_pycparser_error
    raise CDefError(msg)
cffi.CDefError: cannot parse "blah1 ReadConsole;"
<cdef source string>:23:5: before: blah1
----------------------------------------

ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

What should I do? Thank you so much!

All lines from the Counts file were filtered

Dear Developer:
I am trying "cellphonedb" on my single cell data. I ran into a problem when testing the small dataset provided on your webpage: "All the lines were removed". I cannot figure out what the problem is, because I just put the test data into the two slots and chose "ensembl". It is probably something simple that I have not noticed. I also tried my own data (many more cells), and the problem was the same.
Looking forward to your reply. Thank you.

All Genes Filtered

Hello everyone,
I am using single-cell RNA-Seq data (16388 genes and 2887 cells). My data is from mouse, my genes are in GeneSymbol format, and the counts were normalized with the Scater/Scran packages.

When I use CellPhoneDB locally and in the web application, I get the error: "All counts filtered: Are you using human data?"

What could be the main reason for this error? Does CellPhoneDB use only ENSEMBL symbols?

Thank you

Discrepancy between documentation and code, option --threshold

I really like the paper and this solution that you have done.
I found a discrepancy between the 'Readme.md' file, documentation (https://www.cellphonedb.org/documentation) and the code regarding the setting of the threshold of cells expressing a gene.

The threshold is described in the documentation as the percentage (%) of cells, but the code only accepts a value between 0 and 1.

Changing the documentation to say "fraction" would fix this.

Thank you.


class ThresholdValueException(Exception):
    def __init__(self, threshold_value):
        super(ThresholdValueException, self).__init__(
            'Threshold value ({}) is not valid. Accepted range: 0<=threshold<=1'.format(threshold_value))

https://github.com/Teichlab/cellphonedb/blob/master/cellphonedb/src/core/exceptions/ThresholdValueException.py

--- ERROR ---

--threshold: % of cells expressing a gene
[ ][APP][18/07/19-11:20:37][ERROR] Threshold value (20.0) is not valid. Accepted range: 0<=threshold<=1
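
Until the documentation is corrected, a percentage can be converted to the fraction the check accepts before it is passed on. The helper below is a hypothetical sketch, not part of CellPhoneDB. It mirrors the 0<=threshold<=1 range quoted in the exception above.

```python
def percent_to_fraction(percent: float) -> float:
    """Convert a percentage of cells (0-100) to a fraction in the range 0-1."""
    fraction = percent / 100.0
    if not 0 <= fraction <= 1:
        # Hypothetical check using the same accepted range as the exception above.
        raise ValueError(
            "Threshold value ({}) is not valid. Accepted range: 0<=threshold<=1".format(fraction)
        )
    return fraction


# 20% of cells corresponds to a threshold of 0.2.
print(percent_to_fraction(20.0))
```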

Error: Result is empty

Hi,
when I run the statistical analysis in cellphonedb, I get the following error message: Result is empty. My data comes from an anndata object out of scanpy. Many thanks in advance for your help.

Technical data:
Windows 10
Python 3.6.8
Cellphonedb v.2.0.0

Minimum number of cells in a cluster

Hi. I appreciate this great tool that you have developed.
I understand that the threshold can be set so that ligand-receptor interactions are only taken into account if they are expressed in a set % of cells.
I was wondering what the minimum number of cells in a cluster is (if there is one) for an enriched ligand-receptor pair to be meaningful. Hypothetically, if a cluster has 5 cells, would an enriched ligand-receptor pair involving that cluster be valid at all? What number of cells in a cluster would you consider acceptable to give confidence in the validity of the results? Thank you very much!

Unable to save dotplot

Hello,

I am trying to use the dot_plot function but I am unable to save the plot. The plot appears for a quick second on my screen but disappears after that. The output directory gets created with the specified name but the directory has nothing in it.

Please let me know how to fix this and if I am missing anything from the documentation.

Thanks.

analysis on imputed data

Hi,

Really nice and useful tool. I was wondering whether you have tried, or have any recommendations about, using imputed/denoised data as input.
If not, what is the count threshold that cellphonedb uses when filtering by % of cells expressing a gene? Is it just 0 (anything above is considered expressed) or something else (e.g. log(2))?

Thank you!
Giovanni

Document 1: p-value (pvalues.csv)

Hi all,
Great tool!
Why are the p-values in the paper "Single-cell reconstruction of the early maternal–fetal interface in humans" (Supplementary Table 4 - List of interactions in the placenta dataset) only 0, 1, or 0-0.9? There are no values such as 0.02 or 0.001. But in Fig. 5A, the legend "-log10(P-value)" includes 0, 1, 2, >=3.
https://www.nature.com/articles/s41586-018-0698-6
Thanks

ValueError: You are trying to merge on float64 and object columns.

Hi, I'm also getting:

ValueError: You are trying to merge on float64 and object columns. If you wish to proceed you should use pd.concat

I'm using cellphonedb==2.1.1 and I get the same error after following your instructions above; any ideas?

Originally posted by @cartal in #21 (comment)

Prioritize results

Hi all,

I ran the cellphoneDB analysis with the default parameters and there were a lot of interactions with a p-value equal to 0. My question is how to further prioritize this list. Should I run more than 1000 permutations to get p-values with greater precision, or should I use the mean values to sort the list of significant interactions?

Any thoughts?

Thank you very much.
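
Some arithmetic on how the number of permutations limits p-value resolution may help. The sketch below is a generic permutation test on made-up numbers, not CellPhoneDB's exact procedure. With N shuffles the smallest non-zero empirical p-value is 1/N, and a p-value of 0 only says that no shuffled mean reached the observed one.

```python
import numpy as np


def empirical_p_value(observed: float, null_means: np.ndarray) -> float:
    """Fraction of shuffled means that are at least as large as the observed mean."""
    return float(np.mean(null_means >= observed))


rng = np.random.default_rng(0)
null_means = rng.normal(loc=1.0, scale=0.2, size=1000)  # 1000 made-up shuffled means

# An observed mean above every shuffled value gives exactly 0.
p_extreme = empirical_p_value(null_means.max() + 1.0, null_means)
# The smallest non-zero p-value is 1 / number of permutations.
p_smallest = empirical_p_value(null_means.max(), null_means)
print(p_extreme, p_smallest)
```

Under these assumptions, more permutations would refine the p-values but would not by themselves rank interactions that are all far above the null. Sorting by the mean values is a separate choice.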