This project forked from junhaobearxiong/velocity

RNA velocity on (time-course) single-cell expression data

Intro

See the slides here for the workflow and results of an example analysis.

Preparation

First, follow the scanpy (link) and scvelo (link) documentation to install the packages and their dependencies. You can also install cellrank (link), but it is not needed to run most of the scripts.

This repository was built with the following versions of the main packages:

cellrank==1.2.0
scanpy==1.7.0
scvelo==0.2.3

Then, create the following directories under the main directory: data/ (where both the main input data and the preprocessed data are saved) and figures/ (where the figures are saved).
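
The two directories can of course be created by hand. As a small sketch, the same can be done from Python; the function name make_project_dirs is hypothetical and not part of this repository, only the directory names data/ and figures/ come from the text above.

```python
from pathlib import Path


def make_project_dirs(root="."):
    """Create data/ and figures/ under root if they do not exist yet."""
    created = []
    for name in ("data", "figures"):
        path = Path(root) / name
        # exist_ok=True makes the call safe to repeat on an existing layout.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling make_project_dirs() from the main directory of the repository sets up both folders in one go.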

Workflow

Below is a standard workflow for preprocessing, estimating and analyzing RNA velocity on (time-course) single-cell expression data. It is largely based on the tutorial on the scvelo website.

Run velocyto

Starting from the BAM files, we first need to run the command line interface (CLI) of velocyto to generate the unspliced and spliced count matrices (see the tutorial for many more details). For the example analysis, I first used samtools to sort each BAM file by cell barcode (as required by velocyto), then used the run command of the CLI. Users can modify sort_bam.sh and run_velocyto.sh for this step. Note that you may need to create additional directories, change file names, download additional files, etc.
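
For orientation, here is a rough sketch of the two commands involved, built as argument lists in Python. It is not a copy of sort_bam.sh or run_velocyto.sh. The helper names, the thread count, and all file paths are assumptions for illustration. To my understanding, velocyto looks for a barcode-sorted file named cellsorted_<original name> next to the input BAM, which is why the sorted file is named that way here.

```python
from pathlib import Path


def sort_bam_command(bam, threads=4):
    """samtools command that sorts a BAM file by the CB (cell barcode) tag."""
    bam_path = Path(bam)
    # Assumed convention: the sorted copy sits next to the input BAM
    # and is called cellsorted_<original file name>.
    sorted_bam = bam_path.with_name(f"cellsorted_{bam_path.name}")
    return ["samtools", "sort", "-t", "CB", "-O", "BAM",
            "-@", str(threads), "-o", str(sorted_bam), str(bam)]


def velocyto_run_command(bam, gtf, barcodes, outdir, mask_gtf=None):
    """velocyto run command for one BAM file and one genome annotation."""
    cmd = ["velocyto", "run", "-b", str(barcodes), "-o", str(outdir)]
    if mask_gtf is not None:
        # Optional GTF of repetitive elements to mask.
        cmd += ["-m", str(mask_gtf)]
    # The BAM file and the annotation GTF are positional arguments.
    return cmd + [str(bam), str(gtf)]
```

Each list can be passed to subprocess.run(cmd, check=True) once the paths point at real files.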

Preprocess data before running scvelo

For the example analysis, I have one loom file for each round, collection, and collection day, so I need to combine them first. This is easily done with the loompy package. Users can modify combine_loom.py to combine all the relevant loom files into a single loom file.
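
A minimal sketch of the combining step is shown below. It is not the content of combine_loom.py. The function names and the idea of collecting every .loom file in one directory are assumptions, and key="Accession" assumes the loom files carry an Accession row attribute (as velocyto output usually does) so that genes are aligned across files.

```python
from pathlib import Path


def find_loom_files(loom_dir):
    """Return the .loom files directly under loom_dir, sorted by name."""
    return sorted(str(p) for p in Path(loom_dir).glob("*.loom"))


def combine_loom_files(loom_dir, output_file):
    """Merge all loom files in loom_dir into a single loom file."""
    files = find_loom_files(loom_dir)
    if not files:
        raise FileNotFoundError(f"no .loom files found in {loom_dir}")
    # Imported here so the path helper works without loompy installed.
    import loompy

    # key="Accession" aligns rows (genes) across files by that attribute.
    loompy.combine(files, output_file, key="Accession")
    return files
```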

Then, I filtered out certain cells by taking the intersection between the loom file and an already-filtered, annotated expression file. This also adds annotations of cell types and time points to the loom files. I also filtered out genes, normalized the data, computed neighbours and moments, estimated PCA and UMAP, etc. Users can modify filter_cells_genes.py for this step.
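
The steps in this paragraph can be sketched roughly as follows. This is an illustration, not the content of filter_cells_genes.py. It assumes that the annotation table is a pandas DataFrame indexed by the same cell names as the loom data, and the thresholds (min_shared_counts, n_top_genes, n_pcs, n_neighbors) are simply the values used in the scvelo tutorial, not the ones used in the example analysis.

```python
def shared_cells(loom_cells, annotated_cells):
    """Cells present in both sources, kept in the order of the loom file."""
    annotated = set(annotated_cells)
    return [cell for cell in loom_cells if cell in annotated]


def filter_and_preprocess(adata, annotations):
    """Keep annotated cells, copy annotations over, then preprocess."""
    # Imported here so shared_cells can be used without scvelo installed.
    import scvelo as scv

    keep = shared_cells(adata.obs_names, annotations.index)
    adata = adata[keep].copy()
    # Copy every annotation column (e.g. cell type, time point) into obs.
    for column in annotations.columns:
        adata.obs[column] = annotations.loc[keep, column].values

    # Gene filtering and normalization, then neighbours and moments.
    scv.pp.filter_and_normalize(adata, min_shared_counts=20, n_top_genes=2000)
    scv.pp.moments(adata, n_pcs=30, n_neighbors=30)
    scv.tl.umap(adata)
    return adata
```

If the two sources name cells differently (for example a sample prefix or a barcode suffix), the names have to be harmonized before taking the intersection.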

Estimate velocity with scvelo

Taking the output from filter_cells_genes.py, users can run estimate_velocity.py to estimate RNA velocity.
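
For reference, the core scvelo calls for this step look roughly like the sketch below. It is not the content of estimate_velocity.py. The wrapper name and the default mode are assumptions; the README does not say which velocity mode the example analysis used.

```python
VALID_MODES = ("deterministic", "stochastic", "dynamical")


def estimate_velocity(adata, mode="stochastic"):
    """Estimate RNA velocity and the velocity graph on preprocessed data."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # Imported here so the argument check works without scvelo installed.
    import scvelo as scv

    if mode == "dynamical":
        # The dynamical model needs fitted kinetics before velocities.
        scv.tl.recover_dynamics(adata)
    scv.tl.velocity(adata, mode=mode)
    # Transition graph used by the downstream projections and plots.
    scv.tl.velocity_graph(adata)
    return adata
```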

Analyze velocity

It's often useful to project and visualize velocity on a lower-dimensional embedding. project_velocity.py generates such figures.

There are many other analyses one can perform with the velocity estimates using scvelo. Some of these visualizations can be done with intepret_velocity.py by feeding in the appropriate command line arguments.
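
As an illustration of how such a script can be driven from the command line, here is a small sketch. The flag names (--plot, --basis, --color) and the output file naming are hypothetical and are not the actual arguments of project_velocity.py or intepret_velocity.py; the three scvelo plotting functions it dispatches to do exist in scv.pl.

```python
import argparse

# Maps a short command line name to a plotting function in scvelo.pl.
PLOT_FUNCTIONS = {
    "stream": "velocity_embedding_stream",
    "grid": "velocity_embedding_grid",
    "cell": "velocity_embedding",
}


def build_parser():
    parser = argparse.ArgumentParser(description="Plot velocity on an embedding.")
    parser.add_argument("input", help="file with velocity estimates, e.g. .h5ad")
    parser.add_argument("--plot", choices=sorted(PLOT_FUNCTIONS), default="stream")
    parser.add_argument("--basis", default="umap", help="embedding to project onto")
    parser.add_argument("--color", default=None, help="obs column used for coloring")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Imported here so the parser can be inspected without scvelo installed.
    import scvelo as scv

    adata = scv.read(args.input)
    plot = getattr(scv.pl, PLOT_FUNCTIONS[args.plot])
    # scvelo saves the figure under its configured figure directory.
    plot(adata, basis=args.basis, color=args.color,
         save=f"{args.plot}_{args.basis}.png")
```

With this layout, main(["data/velocity.h5ad", "--plot", "grid"]) would draw the grid version of the projection for a file at that (made up) path.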

Additional analyses

When estimating the velocity graph, scvelo accepts an additional argument, tkey, for time-course data. It's unclear whether it does the right thing (see the GitHub issue), but users can get velocity estimates under the tkey prior using velocity_graph_tkey.py.
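
In code, the only change relative to the plain velocity graph is the extra tkey argument. The sketch below is not the content of velocity_graph_tkey.py. The wrapper name is hypothetical, and the up-front check is my own addition to catch a mistyped column name.

```python
def velocity_graph_with_time_prior(adata, tkey):
    """Compute the velocity graph using the obs column tkey as a time prior."""
    if tkey not in adata.obs.columns:
        raise KeyError(f"{tkey!r} is not a column of adata.obs")
    # Imported here so the column check works without scvelo installed.
    import scvelo as scv

    # tkey names the obs column holding the time point of each cell.
    scv.tl.velocity_graph(adata, tkey=tkey)
    return adata
```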

In the example analysis, we have multiple samples / cell lines. After running the analysis on all the data, I split the loom files by cell line and generated plots for each cell line using create_file_by_id.py and project_velocity_by_line.py.
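
The splitting itself can be sketched as below. This is not the content of create_file_by_id.py. It works on an AnnData object already loaded in memory rather than on the loom files directly, and both function names and the line_key parameter are assumptions.

```python
from collections import defaultdict


def group_cells_by_line(cell_names, line_labels):
    """Map each cell line label to the list of cells that carry it."""
    groups = defaultdict(list)
    for cell, line in zip(cell_names, line_labels):
        groups[line].append(cell)
    return dict(groups)


def split_adata_by_line(adata, line_key):
    """Return one AnnData copy per cell line found in adata.obs[line_key]."""
    groups = group_cells_by_line(adata.obs_names, adata.obs[line_key])
    # Indexing by a list of cell names selects those observations.
    return {line: adata[cells].copy() for line, cells in groups.items()}
```

Each subset can then be written to its own file or passed to the plotting step separately.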

Run cellrank

I did not succeed in getting interpretable results with cellrank, but users can use run_cellrank.py as a starting point for an analysis with cellrank.
