GithubHelp home page GithubHelp logo

bnediction / profile_binr Goto Github PK

View Code? Open in Web Editor NEW
4.0 2.0 1.0 205 KB

PROFILE RNA-seq binarization technique.

License: BSD 3-Clause "New" or "Revised" License

Python 52.91% R 45.98% Dockerfile 1.11%
single-cell-rna-seq rna-seq-analysis binarization python3 rpy2 pandas bioinformatics computational-biology normalisation

profile_binr's Introduction

profile_binr

DEPRECATION/MIGRATION WARNING:

This package has been replaced by scBoolSeq which implements the generation of synthetic scRNA-Seq data biased by Boolean activation states as well as rich CLI. If any of your work relies on profile_binr, we invite you to migrate to scBoolSeq which has retained a similar API but has a richer set of features.

Legacy README:

Join the chat at https://gitter.im/bnediction/profile_binr

The PROFILE methodology for the binarisation and normalisation of RNA-seq data.

This is a Python interface to a set of normalisation and binarisation functions for RNA-seq data originally written in R.

This software package is based on the methodology developed by Beal, Jonas; Montagud, Arnau; Traynard, Pauline; Barillot, Emmanuel; and Calzone, Laurence at Computational Systems Biology of Cancer team at Institut Curie ([email protected]). It generalizes and offers a Python interface of the original implementation in Rmarkdown notebooks available at https://github.com/sysbio-curie/PROFILE.

Installation

Using conda

The tool can be installed using the Conda package profile_binr in the colomoto channel. Note that some of its dependencies requires the conda-forge channel.

conda install -c conda-forge colomoto::profile_binr

Using pip

Requirements

  • R (≥4.0)
  • R packages:
    • mclust
    • diptest
    • moments
    • magrittr
    • tidyr
    • dplyr
    • tibble
    • bigmemory
    • doSNOW
    • foreach
    • glue
pip install profile_binr

Usage

A minimal example of the binarization suite. Take a look at notebooks/ for more details.

from profile_binr import ProfileBin
import pandas as pd

# your data is assumed to contain observations as
# rows and genes as columns
data = pd.read_csv("path/to/your/data.csv")
data.head()

# create the binarisation instance using the dataframe
# with the index containing the cell identifier
# and the columns being the gene names
probin = ProfileBin(data)

# compute the criteria used to binarise/normalise the data :
# This method uses a parallel implementation, you can specify the 
# number of workers with an integer
probin.fit(8) # train using 8 threads

# Look at the computed criteria
probin.criteria

# get binarised data (alternatively .binarise()):
my_bin = probin.binarize()
my_bin.head()

# idem for normalised data :
my_norm = probin.normalize()
my_norm.head()

References

  • Béal J, Montagud A, Traynard P, Barillot E and Calzone L (2019) Personalization of Logical Models With Multi-Omics Data Allows Clinical Stratification of Patients. Front. Physiol. 9:1965. doi:10.3389/fphys.2018.01965

profile_binr's People

Contributors

gitter-badger avatar gmagannadevelop avatar pauleve avatar

Stargazers

 avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Forkers

gitter-badger

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.