GithubHelp home page GithubHelp logo

jcdaneshmand / dropseqpipe Goto Github PK

View Code? Open in Web Editor NEW

This project forked from hoohm/dropseqpipe

0.0 0.0 0.0 6.98 MB

A SingleCell RNASeq pre-processing snakemake workflow

License: Creative Commons Attribution Share Alike 4.0 International

Python 35.62% R 64.38%

dropseqpipe's Introduction

Snakemake Build Status

Description

This pipeline is based on snakemake and the dropseq tools provided by the McCarroll Lab. It allows to go from raw data of your Single Cell RNA seq experiment until the final count matrix with QC plots along the way.

This is the tool we use in our lab to improve our wetlab protocol as well as provide an easy framework to reproduce and compare different experiments with different parameters.

It uses STAR to map the reads. It is usable for any single cell protocol using two reads where the first one holds the Cell and UMI barcodes and the second read holds the RNA. Here is a non-exhausitve list of compatible protocols/brands:

This package is trying to be as user friendly as possible. One of the hopes is that non-bioinformatician can make use of it without too much hassle. It will still require some command line execution, this is not going to be fully interactive package.

Authors

Usage

Step 1: Install workflow

If you simply want to use this workflow, download and extract the latest release. If you intend to modify and further develop this workflow, fork this reposity. Please consider providing any generally applicable modifications via a pull request.

In any case, if you use this workflow in a paper, don't forget to give credits to the authors by citing the URL of this repository and, once available, its DOI.

Step 2: Configure workflow

Configure the workflow according to your needs via editing the file config.yaml and the samples.tsv following those instructions

Step 3: Execute workflow

All you need to execute this workflow is to install Snakemake via the Conda package manager. Software needed by this workflow is automatically deployed into isolated environments by Snakemake.

Test your configuration by performing a dry-run via

snakemake --use-conda -n --directory $WORKING_DIR

Execute the workflow locally via

snakemake --use-conda --cores $N --directory $WORKING_DIR

using $N cores on the $WORKING_DIR. Alternatively, it can be run in cluster or cloud environments (see the docs for details).

If you not only want to fix the software stack but also the underlying OS, use

snakemake --use-conda --use-singularity

in combination with any of the modes above.

Step 4: Investigate results

After successful execution, you can create a self-contained report with all results via:

snakemake --report report.html

Documentation

You can find the documentation here

Future implementations

I'm actively seeking help to implement the points listed bellow. Don't hesitate to contact me if you wish to contribute.

  • Create a sharing platform where quality plots/logs can be discussed and troubleshooted.
  • Create a full html report for the whole pipeline
  • Multiqc module for drop-seq-tools
  • Implement an elegant "preview" mode where the pipeline would only run on a couple of millions of reads and allow you to have an approximated view before running all of the data. This would dramatically reduce the time needed to get an idea of what filters whould be used.

I hope it can help you out in your single cell experiments!

Feel free to comment and point out potential improvements via issues

TODO

  • Add a mixed reference reference for testing purposes
  • Finalize the parameters validation schema
  • Make the debug feature a bit "cleaner". Deal with automatic naming of the debug variables
  • Implement ddseq barcoding strategies

dropseqpipe's People

Contributors

hoohm avatar seb-mueller avatar cgirardot avatar duyck avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.