sqjin / scepath

An energy landscape-based approach for measuring developmental states and inferring cellular trajectories from single cell RNA-seq data

MATLAB 28.48% R 0.77% C++ 69.63% M 0.04% Makefile 0.22% Fortran 0.86%

scepath's Introduction

scEpath

Package of scEpath (a novel tool for analyzing single cell RNA-seq data)

Overview

This is a MATLAB Package of scEpath ("single-cell Energy path"). scEpath is a novel computational method for quantitatively measuring developmental potency and plasticity of single cells and transition probabilities between cell states, and inferring lineage relationships and pseudotemporal ordering from single-cell gene expression data. In addition, scEpath performs many downstream analyses including identification of the most important marker genes or transcription factors for given cell clusters or over pseudotime.

The rationale of scEpath for inferring cellular trajectories is based on Waddington's famous landscape metaphor for describing cellular dynamics during development. Below is a conceptual illustration from a paper (Takahashi et al. Development, 2015).

Check out our paper (Jin et al. Bioinformatics, 2018) for the detailed methods and applications. Below is the overview of scEpath.

Overview of scEpath

System Requirements

scEpath is written in MATLAB and is therefore independent of the operating system. Running scEpath requires MATLAB and the Statistics toolbox. The pseudotime estimation step also requires the R package "princurve" for principal curve analysis, so both R and MATLAB are needed when that step is run.

This package has been tested with MATLAB 2016a, 2016b, and 2017a on Mac OS and 64-bit Windows.
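
A quick way to confirm these prerequisites before a run is sketched below. It is not part of the package. It assumes that `Rscript` is on the system PATH, which the package itself may not require, and the variable names are illustrative only.

```matlab
% Sketch only: check the prerequisites listed above (not part of scEpath).

% license('test', ...) returns 1 when a license for the Statistics toolbox exists.
hasStats = license('test', 'Statistics_Toolbox');

% Assumption: Rscript is on the system PATH.
% Ask R whether the princurve package can be loaded.
cmd = 'Rscript -e "cat(requireNamespace(''princurve'', quietly = TRUE))"';
[status, out] = system(cmd);
hasPrincurve = (status == 0) && ~isempty(strfind(out, 'TRUE'));

fprintf('Statistics toolbox licensed: %d\n', hasStats);
fprintf('R package princurve found:   %d\n', hasPrincurve);
```

Note that `license('test', ...)` reports whether a license exists, not whether the toolbox is installed. Running `ver` at the MATLAB prompt lists the installed toolboxes.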

Usage

Unzip the package. In MATLAB, change the current directory to the folder containing the scripts.
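
As a minimal sketch of this step, the snippet below changes into the unzipped folder and adds it to the MATLAB search path. The folder name `scEpath-master` is a hypothetical placeholder, and adding subfolders with `genpath` is an assumption about where helper functions live, not something the package documents.

```matlab
% Sketch only: make the unzipped package visible to MATLAB.
% 'scEpath-master' is a hypothetical folder name; adjust it to your system.
pkgDir = fullfile(pwd, 'scEpath-master');

if exist(pkgDir, 'dir') == 7
    cd(pkgDir);
    % Assumption: helper functions may sit in subfolders, so add them all.
    addpath(genpath(pkgDir));
    % Print the location of the demo script if MATLAB can now find it.
    disp(which('scEpath_demo'));
else
    warning('Folder not found: %s', pkgDir);
end
```

If `which` prints an empty line, the script is not on the search path and the folder name probably needs adjusting.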

This directory includes the following main scripts:

  1. scEpath_demo.m -- an example run of scEpath on a specific dataset
  2. preprocessing.m -- preprocess the input data (if applicable)
  3. constructingNetwork.m -- construct a gene-gene co-expression network
  4. estimatingscEnergy.m -- estimate the single cell energy (scEnergy) for each cell
  5. ECA.m -- principal component analysis of the energy matrix
  6. clusteringCells.m -- perform unsupervised clustering of single cell data
  7. addClusterInfo.m -- integrate clustering information
  8. inferingLineage.m -- infer the cell lineage hierarchy
  9. FindMDST.m -- find the minimal directed spanning tree in a directed graph
  10. inferingPseudotime.m -- reconstruct pseudotime
  11. smootheningExpr.m -- calculate a smoothed version of the expression levels based on pseudotime
  12. identify_pseudotime_dependent_genes.m -- identify pseudotime dependent marker genes
  13. identify_keyTF.m -- identify key transcription factors responsible for cell fate decision

  1. cluster_visualization.m -- visualize cells on two-dimensional space
  2. lineage_visualization.m -- display cell lineage hierarchy with transition probability
  3. scEnergy_comparison_visualization.m -- comparison of scEnergy among different clusters
  4. landscape_visualization -- display energy landscape in 2-D contour plot and 3-D surface
  5. plot_genes_in_pseudotime.m -- plot the temporal dynamics of individual genes along pseudotime
  6. plot_rolling_wave.m -- create "rolling wave" showing the temporal pattern of pseudotime-dependent genes and display gene clusters showing similar patterns
  7. plot_rolling_wave_TF.m -- create "rolling wave" showing the temporal pattern of key transcription factors

For each run, the final results of the analysis are deposited in the "results" directory:

  1. results/figures, containing PDF figures of the analysis.
  2. results/PDG_in_each_cluster, containing the identified pseudotime-dependent marker genes in each cluster
  3. results/temporalfiles, containing intermediate results from the analysis.

Please refer to scEpath_demo.m for instructions on how to use this code. The input data is a gene expression matrix (rows are genes and columns are cells).
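
The expected layout can be illustrated with a toy matrix. This is a sketch under stated assumptions: the variable names (`exprData`, `geneNames`, `cellNames`) and the values are hypothetical, and scEpath_demo.m may expect different names or a different container.

```matlab
% Toy example: 4 genes (rows) x 3 cells (columns). All names are hypothetical.
geneNames = {'GeneA'; 'GeneB'; 'GeneC'; 'GeneD'};
cellNames = {'cell1', 'cell2', 'cell3'};
exprData  = [5 0 2; 1 3 0; 0 0 7; 4 4 4];

% Check that the orientation matches the annotation.
[nGenes, nCells] = size(exprData);
assert(nGenes == numel(geneNames), 'Rows must correspond to genes.');
assert(nCells == numel(cellNames), 'Columns must correspond to cells.');

% A matrix stored as cells x genes can be transposed into the expected layout.
exprCellsByGenes = exprData';      % 3 x 4, the wrong orientation
exprFixed = exprCellsByGenes';     % back to genes x cells (4 x 3)
fprintf('%d genes x %d cells\n', size(exprFixed, 1), size(exprFixed, 2));
```

Checking the dimensions against the gene and cell annotations before a run catches a transposed matrix early.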

If you have any problems or questions about using the package, please contact [email protected]

scepath's Issues

Error in FindMDST function

Whenever I try running the scePathDemo file, I get an error when running the FindMDST function.

The error is: Undefined function 'incidence_to_3n' for input arguments of type 'double'.

Any ideas why this might be happening?

asking for running SIMLR

Hi,
I am sorry, I am not good at MATLAB. Because my dataset has more than 3,000 cells, I want to use the function called "SIMLR_LARGE". I switched the call in clusteringCells.m to "SIMLR_LARGE", but when I run scEpath_demo.m I get an error saying that the function "mex_top_eig_svd" is not recognized. I would be very grateful for any help you could give me.

scEpath R package

Hello there,

scEpath looks like a really nice package to go hand-in-hand with a pseudotime package like Monocle 2. I came across your package while reading a recent paper:

https://www.nature.com/articles/s41467-018-08247-x

I'm really interested in trying out scEpath on my data. However, since the package is in MATLAB, it seems hard to try out quickly, and I'm concerned that it may not integrate well with the rest of my R-based single-cell pipeline.

By any chance, is this available in R? Or, do you plan to bring it out in R anytime soon?

Thanks!
