DendroSplit

This repository provides the full source code for the DendroSplit framework described in the paper "An Interpretable Framework for Clustering Single-Cell RNA-Seq Datasets" by Zhang, Fan, Fan, Rosenfeld, and Tse. It also contains the scripts necessary for reproducing the results in the paper. Please see this Bitbucket repository for the version of the package used and maintained by BD Genomics.

Overview

In our paper we analyzed 9 publicly available single-cell RNA-Seq datasets:

  1. Biase et al.: paper, data
  2. Yan et al.: paper, data
  3. Pollen et al.: paper, data
  4. Kolodziejczyk et al.: paper, data
  5. Patel et al.: paper, data
  6. Zeisel et al.: paper, data
  7. Macosko et al.: paper, data
  8. Birey et al.: paper, data
  9. Zheng et al.: paper, data

We also analyzed some synthetic datasets. The Jupyter notebooks in the Figures directory contain the code used to reproduce all the figures in the paper, and the wrapper code those notebooks rely on is provided as well.

For each dataset, processing requires four inputs, saved in the directory DATAPREFIX/ as:

  1. DATAPREFIX_expr.txt (or DATAPREFIX_expr.h5 for larger datasets): a matrix of gene/transcript expression values where the rows correspond to cells and the columns correspond to features
  2. DATAPREFIX_labels.txt: a set of labels for all the cells
  3. DATAPREFIX_features.txt: a set of feature names
  4. DATAPREFIX_reducedim_coor.txt: a 2D representation of the data for visualizing results
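The layout above can be checked with a short loader. The following is a minimal sketch, not code from the repository: the function name `load_dataset` is hypothetical, and it assumes the .txt files are whitespace-delimited numeric matrices with one label or feature name per line. The .h5 variant is not handled.

```python
import os

import numpy as np


def read_lines(path):
    """Read a text file as a list of non-empty lines (one entry per line)."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def load_dataset(prefix, root="."):
    """Load the four text inputs under root/prefix/ and check that they agree.

    Hypothetical helper: assumes whitespace-delimited numeric text files.
    """
    base = os.path.join(root, prefix, prefix)

    # Expression matrix: rows are cells, columns are features.
    expr = np.loadtxt(base + "_expr.txt", ndmin=2)
    # One label per cell and one name per feature.
    labels = read_lines(base + "_labels.txt")
    features = read_lines(base + "_features.txt")
    # 2D coordinates, used only for visualizing results.
    coords = np.loadtxt(base + "_reducedim_coor.txt", ndmin=2)

    n_cells, n_features = expr.shape
    if len(labels) != n_cells:
        raise ValueError("expected %d labels, found %d" % (n_cells, len(labels)))
    if len(features) != n_features:
        raise ValueError("expected %d feature names, found %d"
                         % (n_features, len(features)))
    if coords.shape != (n_cells, 2):
        raise ValueError("expected coordinates of shape (%d, 2), found %s"
                         % (n_cells, coords.shape))
    return expr, labels, features, coords
```

Calling `load_dataset("DATAPREFIX")` from the directory that contains DATAPREFIX/ would return the matrix, labels, feature names, and 2D coordinates, or raise a `ValueError` if the files disagree on the number of cells or features.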

Dependencies

DendroSplit is written in Python 2.7 and has the following dependencies (Python modules):

  • numpy (1.12.1)
  • scipy (0.19.0)
  • matplotlib (1.5.3)
  • sklearn (0.18.1)
  • networkx (1.11)
  • community

The tutorial Jupyter notebook also uses tsne (0.1.7) and pandas (0.20.1) for preparing the example data.
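To compare an environment against the versions listed above, a small report can help. This is an illustrative sketch only: `report_dependencies` and `LISTED_VERSIONS` are hypothetical names, and the check relies on each module exposing a `__version__` attribute, which not every module does.

```python
import importlib

# Versions as listed in the Dependencies section; None means no version is given.
LISTED_VERSIONS = {
    "numpy": "1.12.1",
    "scipy": "0.19.0",
    "matplotlib": "1.5.3",
    "sklearn": "0.18.1",
    "networkx": "1.11",
    "community": None,
}


def report_dependencies(listed=None):
    """Return {module name: (listed version, installed version)}.

    The installed version is None when the module cannot be imported and
    "unknown" when the module does not expose __version__.
    """
    if listed is None:
        listed = LISTED_VERSIONS
    report = {}
    for name, wanted in listed.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            installed = None
        else:
            installed = getattr(module, "__version__", "unknown")
        report[name] = (wanted, installed)
    return report
```

Calling `report_dependencies()` with no arguments checks the six modules listed above. A mismatch between the listed and installed version is not necessarily a problem, but it is a reasonable first thing to look at if the notebooks fail to run.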

Instructions

DendroSplit can be installed via pip:

pip install dendrosplit

Import DendroSplit by adding the following line of code to your Python script:

from dendrosplit import split, merge, utils

A tutorial for using the main DendroSplit functions is given in the tutorial Jupyter notebook. Please refer to the Jupyter notebooks used to generate the figures in the paper for more examples.

License

DendroSplit is licensed and distributed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.

Method

[Figure: the DendroSplit method]
