jacoblevine / phenograph

Subpopulation detection in high-dimensional single-cell data

Home Page: http://www.c2b2.columbia.edu/danapeerlab/html/phenograph.html

License: MIT License

Python 100.00%

phenograph's People

Contributors

jacoblevine

phenograph's Issues

python2 version?

Any plans for a python2 version? If not, any thoughts on the best way to integrate with python2 code? Thanks!

Louvain sources unavailable

Would it be possible to include the sources of the Louvain subdirectory? Also, what is the license of the modified Louvain sources? Will they be released under a free software license like the rest of your code?

As it stands your software cannot be built from source, so it cannot be packaged for GNU Guix, which we use at our institute.

resolution parameter

In the R implementation, there is a resolution parameter for "Value of the resolution parameter, use a value above (below) 1.0 if you want to obtain a larger (smaller) number of communities (used only for leiden and louvian 2 or 3 methods)". I wonder if there is a parameter here with similar function? Thanks.

Upload to PyPI

It would be more accessible if phenograph were available on PyPI, so that we could install it with

pip install --user phenograph

New computer, new error message

I recently got the newest Mac, and ever since I have had an issue opening the heat map in the Shiny app. I get the error: "cannot open file 'cytofkit_shinyAPP_marker_heatmap_plot.pdf'". I have tried loading the data into the Shiny app on other computers, and it worked. This has also occurred with three different data sets. Thanks!

GPU-boosted implementation of PhenoGraph

I'm writing to share my GPU-boosted implementation of PhenoGraph. Instead of using the CPU-bound libraries numpy, scipy.sparse, and sklearn as in the legacy implementation, I use the GPU-bound libraries cupy, cupyx.sparse, and cudf/cuml from NVIDIA's RAPIDS library to reduce execution time by orders of magnitude for large datasets. For especially large datasets or dataset compilations (~3 million cells x 50 features), the kNN search can be distributed to multiple GPUs, if they are available. For a synthetic dataset of 1 million cells x 30 features, the CPU implementation executes in ~6 hours, whereas the GPU implementation run on a single V100 GPU executes in ~40 seconds (~500-fold speed-up):

[Figure: benchmark]

Modularity is comparable between GPU and CPU implementations:

[Figure: tsne]

Please feel free to link to the repo if interested: https://gitlab.com/eburling/grapheno

Thanks and sorry for the spam! I hope the community finds it useful.

Source code for modified Louvain?

Hello,

Thanks a lot for your work! As a Python user, I have found it much easier to call the Louvain code through your code base than to use the Python package they provide (which is quite slow).

The phenograph readme mentions that you use a modified version of Louvain community detection. Any chance you can make the source code for this modified version available? I need to make a small change to make the results deterministic, but I am not a C++ programmer so it would really help to start with what you already have rather than try to replicate your described modification from scratch.

Reproducibility problem (cluster number)

Hi,

I ran PhenoGraph 4 times using the same input matrix and noticed that the results differ in the number of output clusters.

How can I set the parameters (or the seed) to reproduce the clustering results of PhenoGraph, or at least to obtain similar results?
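One possible approach, as a minimal sketch only: the `cluster` signature visible in the traceback quoted elsewhere on this page lists `clustering_algo` and `seed` parameters, so an installed version with that signature may let you fix the seed. Whether `seed` has any effect on the Louvain path is not confirmed here, so the sketch assumes the Leiden option and that its dependencies are installed. `cluster_with_seed` and `same_partition` are hypothetical helper names, not part of PhenoGraph.

```python
def cluster_with_seed(data, seed=42):
    """Run PhenoGraph with a fixed seed (assumes a version exposing `seed`)."""
    # Imported lazily so that same_partition() is usable without PhenoGraph.
    import phenograph

    # Assumption: `clustering_algo="leiden"` and `seed` are accepted, as in the
    # signature shown in the traceback on this page.
    communities, graph, Q = phenograph.cluster(
        data, clustering_algo="leiden", seed=seed
    )
    return communities


def same_partition(labels_a, labels_b):
    """Return True if two labelings group the cells identically.

    Cluster IDs may be numbered differently between runs, so the labels are
    compared up to renaming rather than element by element.
    """
    if len(labels_a) != len(labels_b):
        return False
    a_to_b, b_to_a = {}, {}
    for a, b in zip(labels_a, labels_b):
        # Each label in one run must map to exactly one label in the other.
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True
```

Calling `cluster_with_seed` twice on the same matrix and passing both results to `same_partition` would show whether the runs agree, independent of how the clusters are numbered.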

Permission denied at Louvain

Hi I think there's an error in .
When I run a test:

import numpy as np
import phenograph

tmp = np.random.rand(100,10)
communities, graph, Q = phenograph.cluster(tmp)

I get a permission error. It might be due to this: the error can occur if you are trying to open a file, but your path is a folder.

PermissionError Traceback (most recent call last)
in
6
7 tmp = np.random.rand(100,10)
----> 8 communities, graph, Q = phenograph.cluster(tmp)

~/.conda/envs/pypandaenv1/lib/python3.8/site-packages/phenograph/cluster.py in cluster(data, clustering_algo, k, directed, prune, min_cluster_size, jaccard, primary_metric, n_jobs, q_tol, louvain_time_limit, nn_method, partition_type, resolution_parameter, n_iterations, use_weights, seed, **kargs)
348 communities, Q = "", ""
349 if clustering_algo == "louvain":
--> 350 communities, Q = run_louvain(graph, q_tol, louvain_time_limit)
351
352 elif clustering_algo == "leiden":

~/.conda/envs/pypandaenv1/lib/python3.8/site-packages/phenograph/cluster.py in run_louvain(graph, q_tol, louvain_time_limit)
162 uid = uuid.uuid1().hex
163 graph2binary(uid, graph)
--> 164 communities, Q = runlouvain(uid, tol=q_tol, time_limit=louvain_time_limit)
165
166 # clean up

~/.conda/envs/pypandaenv1/lib/python3.8/site-packages/phenograph/core.py in runlouvain(filename, max_runs, time_limit, tol)
259 filename + "_graph.weights",
260 ]
--> 261 p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
262 out, err = p.communicate()
263 # check for errors from convert

~/.conda/envs/pypandaenv1/lib/python3.8/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
856 encoding=encoding, errors=errors)
857
--> 858 self._execute_child(args, executable, preexec_fn, close_fds,
859 pass_fds, cwd, env,
860 startupinfo, creationflags, shell,

~/.conda/envs/pypandaenv1/lib/python3.8/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
1704 if errno_num != 0:
1705 err_msg = os.strerror(errno_num)
-> 1706 raise child_exception_type(errno_num, err_msg, err_filename)
1707 raise child_exception_type(err_msg)
1708

PermissionError: [Errno 13] Permission denied: '/udd/remge/.conda/envs/pypandaenv1/lib/python3.8/site-packages/phenograph/louvain/linux-convert'
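The traceback shows `subprocess.Popen` failing on the bundled `linux-convert` binary, which would be consistent with the file lacking its execute bit (an assumption; the report does not confirm the cause). A minimal sketch of checking and repairing that on a POSIX system follows; `ensure_executable` is a hypothetical helper, not part of PhenoGraph.

```python
import os
import stat


def ensure_executable(path):
    """Add execute permission wherever read permission is already granted.

    Returns True if the mode was changed, False if it was already executable
    for every class that can read the file.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Shift each read bit (r--) two places right to get the matching x bit.
    read_bits = mode & (stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    new_mode = mode | (read_bits >> 2)
    if new_mode == mode:
        return False
    os.chmod(path, new_mode)
    return True
```

It could be pointed at the path from the error message, for example `ensure_executable(os.path.join(os.path.dirname(phenograph.__file__), "louvain", "linux-convert"))`, provided you have write access to that environment.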

single cell RNAseq data

Does PhenoGraph only work for CyTOF or flow cytometry data? Have you tested it on single-cell RNA-seq data? Thank you

Windows requires Visual Studio installation?

We have noticed that PhenoGraph only runs once Visual Studio is installed. Unfortunately, it is not clear which packages or DLLs are required for it to work.

Do you have any feedback on this problem? Is there an easy work-around?

We assume that it is somehow linked to Louvain.

run on linux

Hello,

I would like to run PhenoGraph on Linux, but the executable files in the louvain folder are not compatible.
I also tried compiling them from the source code downloaded from https://sites.google.com/site/findcommunities/.
However, the compiled files do not work, and the process gets stuck at "Running Louvain modularity optimization". Am I using the wrong source code for this library?
Thanks!

Best/ Yang

Syntax error during import

A syntax error occurs while importing the library:

 File "/usr/local/lib/python3.6/site-packages/phenograph/classify.py", line 45
   print("Warning: iterative solver failed to converge in at least one case", flush=True)
                                                                                   ^
SyntaxError: invalid syntax

Using Python 2.7.15 on Mac OS
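For context, `flush=True` is a keyword argument of the Python 3 `print` function; under Python 2.7 `print` is a statement, so that line cannot be parsed. A minimal sketch of a version check that a calling script could run before attempting the import is shown below; `is_supported` is a hypothetical name, and the minimum of Python 3 is an assumption based only on this error.

```python
import sys


def is_supported(version_info=sys.version_info, minimum=(3,)):
    """Return True if the interpreter version is at least `minimum`."""
    # Compare only as many components as `minimum` specifies.
    return tuple(version_info[: len(minimum)]) >= tuple(minimum)


def require_supported():
    """Raise a readable error instead of a SyntaxError deep inside an import."""
    if not is_supported():
        raise RuntimeError(
            "This code needs Python 3; running %d.%d" % tuple(sys.version_info[:2])
        )
```

Calling `require_supported()` before `import phenograph` would then report the interpreter mismatch directly.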

Is this repo deprecated?

It appears this repository is outdated compared to the fork https://github.com/dpeerlab/PhenoGraph. The fork contains more up-to-date installation information (addressing, for example, #20) and has an updated codebase.

@armMSKCC @hisplan is your fork now the preferred access point for this repository? In that case it would be best if this one contains a clear marker in the readme that it is out of date.

Having an issue saving heatmaps as PDFs

Hello!

I'm having an issue with the "save heatmap as PDF" feature in the Shiny app. I have loaded the R file on a different Mac and been able to download successfully, but it will not work on our main desktop. Any ideas? Perhaps I need a newer version of R or PhenoGraph?

parallel computing issue

Hello,

I am trying to use the parallel computing function.
I called phenograph.cluster in an IPython notebook. It gives the following error when use_parallel is set to True:

Launching new cluster with 8 workers
Cluster launched successfully
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<string> in <module>()
TypeError: 'CannedFunction' object is not callable

I also tried to set dview. But it gives another error:

Neighbors computed in 0.06748223304748535 seconds
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<string> in <module>()
/envs/py34/lib/python3.4/site-packages/ipyparallel/util.py in _pull(keys)
    264         return [eval(key, globals()) for key in keys]
    265     else:
--> 266         return eval(keys, globals())
    267 
    268 @interactive
<string> in <module>()
NameError: name 's' is not defined

Any suggestion?

Thank you in advance!

Best,
Yang

Parallel computation of Jaccard index uses up server resources. Proposed solution.

When I run the algorithm with n_jobs set, more processes are started than expected. I believe the cause is that the process pool is created without a size, so it does not honor n_jobs:
File: PhenoGraph/phenograph/core.py
135: with closing(Pool()) as pool: # replace with: with closing(Pool(n_jobs)) as pool:
136: jaccard_values = pool.starmap(calc_jaccard, zip(range(n), repeat(idx)))

I can correct this in my clone of the library. Is there a reason why you can't set the maximum number of jobs for this process pool?
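The proposed change can be sketched in isolation as follows. This is a minimal illustration, not the library's code: `calc_jaccard` here is a simplified stand-in that works on a list of neighbor lists, and `resolve_n_jobs` is a hypothetical helper that assumes `-1` means "use all cores".

```python
from contextlib import closing
from itertools import repeat
from multiprocessing import Pool, cpu_count


def resolve_n_jobs(n_jobs):
    """Translate n_jobs into a worker count (assumes -1 means all cores)."""
    if n_jobs is None or n_jobs == -1:
        return cpu_count()
    if n_jobs < 1:
        raise ValueError("n_jobs must be -1 or a positive integer")
    return n_jobs


def calc_jaccard(i, idx):
    """Stand-in: Jaccard similarity between cell i and each of its neighbors."""
    own = set(idx[i])
    return [len(own & set(idx[j])) / len(own | set(idx[j])) for j in idx[i]]


def parallel_jaccard(idx, n_jobs=1):
    """Compute Jaccard values with a pool limited to the requested size."""
    n = len(idx)
    # Pool() with no argument starts one worker per CPU; passing a size
    # bounds the number of worker processes.
    with closing(Pool(resolve_n_jobs(n_jobs))) as pool:
        return pool.starmap(calc_jaccard, zip(range(n), repeat(idx)))
```

With this shape, `parallel_jaccard(idx, n_jobs=2)` would start two worker processes regardless of how many cores the server has. On platforms that spawn rather than fork worker processes, the call should sit under an `if __name__ == "__main__":` guard.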

PhenoGraph Shiny App issue

I have run into a new issue when using the ShinyApp. I tried loading in some old PhenoGraph R data like usual, however this time I got this error message:

Warning: Error in load: empty (zero-byte) input file
Stack trace (innermost first):
67: load
66: observeEventHandler [/Library/Frameworks/R.framework/Versions/3.4/Resources/library/cytofkit/shiny/server.R#61]
2: shiny::runApp
1: cytofkitShinyAPP

Any ideas?

Phenograph seed no. and csv export

Hi,
I have been having some issues with the Windows versions 1.75 and 1.76.

  1. Phenograph: the option to set a seed number does not always show up, and it goes straight to running Phenograph after the number of neighbours is selected.
  2. Exporting gates as CSV rarely works, and when it does it takes a very long time.

I only use underscores in my folder names, so I hope that is not causing any of these issues.

Any feedback is much appreciated!

M.
