zifa's Introduction

ZIFA

Zero-inflated dimensionality reduction algorithm for single-cell data. Created by Emma Pierson and Christopher Yau.

Citation

@article{pierson2015zifa, title={ZIFA: Dimensionality reduction for zero-inflated single-cell gene expression analysis}, author={Pierson, Emma and Yau, Christopher}, journal={Genome biology}, volume={16}, number={1}, pages={1}, year={2015}, publisher={BioMed Central} }

Instructions

If you are using count data, we recommend taking the log (i.e., Y = log2(1 + count_data)) prior to using ZIFA.
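The transform above can be sketched with NumPy as follows. The small count matrix is made up for illustration, and the layout (rows as samples, columns as genes) is an assumption of this sketch.

```python
import numpy as np

# Hypothetical raw count matrix: rows are samples (cells), columns are genes.
count_data = np.array([[0, 3, 15],
                       [7, 0, 0],
                       [1, 1, 1023]], dtype=float)

# log2(1 + count) maps zero counts to exactly zero and compresses large counts,
# so the zero-inflation structure of the data is preserved.
Y = np.log2(1 + count_data)
```

Because zero counts stay at zero after the transform, the dropout pattern ZIFA models is unchanged.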

Algorithm code is contained in ZIFA.py and block_ZIFA.py. For datasets with more than a few thousand genes, we recommend using block_ZIFA, which subsamples genes in blocks to increase efficiency; it should yield similar results to ZIFA. Runtime for block ZIFA on the full single-cell dataset from Pollen et al., 2014 (~250 samples, ~20,000 genes) is approximately 15 minutes on a quad-core Mac Pro.

Runtime for block ZIFA is roughly linear in the number of samples and the number of genes, and quadratic in the block size. Decreasing the block size may decrease runtime but will also produce less reliable results.

See example.py for a full example demonstrating superior performance over factor analysis.

See read_in_real_data_example.py for an example demonstrating how to read in real data using pandas and run ZIFA on it.
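As a rough illustration of that workflow (not the contents of read_in_real_data_example.py), the sketch below loads a small made-up count table with pandas and prepares it for ZIFA. The CSV layout, the gene and cell names, and the step that drops all-zero genes are assumptions of this sketch.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV of raw counts: one row per cell, one column per gene.
csv_text = """cell,geneA,geneB,geneC
cell1,0,3,0
cell2,7,0,0
cell3,1,1,0
"""
counts = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Drop genes that are zero in every cell (an assumption of this sketch,
# not a documented ZIFA requirement).
counts = counts.loc[:, (counts != 0).any(axis=0)]

# Log-transform and convert to a plain NumPy array (samples x genes).
Y = np.log2(1 + counts.to_numpy(dtype=float))

# With ZIFA installed, the model could then be fit with the documented call:
# from ZIFA import ZIFA
# Z, model_params = ZIFA.fitModel(Y, k)
```

For real data, `io.StringIO(csv_text)` would be replaced by the path to your own file.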

ZIFA requires pylab (part of matplotlib), scipy, numpy, and scikit-learn for full functionality.

Prior to issuing pull requests, please confirm that your code passes the tests by running unitTests.py. (If you are using a different version of scipy, numpy, or sklearn, the results may be slightly different -- in that case, please report how much you have to increase the absolute_tolerance parameter to get them to pass. If your package versions are very different, the unit tests may fail entirely even though the main code will still run.) The tests take about 30 seconds to run.

Please contact [email protected] with any questions or comments.

Installation

Download the code: git clone https://github.com/epierson9/ZIFA

Install the package: cd ZIFA then python setup.py install

Sample usage

from ZIFA import ZIFA
from ZIFA import block_ZIFA

To fit ZIFA:

Z, model_params = ZIFA.fitModel(Y, k)

To fit with the block algorithm:

Z, model_params = block_ZIFA.fitModel(Y, k)

or

Z, model_params = block_ZIFA.fitModel(Y, k, n_blocks = desired_n_blocks)

where Y is the observed zero-inflated data, k is the desired number of latent dimensions, Z is the low-dimensional projection, and desired_n_blocks is the number of blocks to divide the genes into. By default, the number of blocks is set to n_genes / 500 (yielding a block size of approximately 500).
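The arithmetic behind the default block count, and the runtime scaling described in the instructions, can be sketched as below. The helper names `default_n_blocks` and `relative_runtime` are hypothetical and not part of the package. The integer division and the floor of one block are assumptions; block_ZIFA may round differently.

```python
def default_n_blocks(n_genes, target_block_size=500):
    """Number of blocks implied by the n_genes / 500 rule (at least one)."""
    return max(1, n_genes // target_block_size)


def relative_runtime(n_samples, n_genes, block_size):
    """Unitless cost proxy: linear in samples and genes, quadratic in block size."""
    return n_samples * n_genes * block_size ** 2


n_genes = 20000
n_blocks = default_n_blocks(n_genes)   # 40 blocks for 20,000 genes
block_size = n_genes / n_blocks        # 500.0 genes per block

# Halving the block size cuts the cost proxy to a quarter.
speedup_ratio = relative_runtime(250, n_genes, 250) / relative_runtime(250, n_genes, 500)
```

Under this proxy, doubling desired_n_blocks (and so halving the block size) would cut runtime to roughly a quarter, at the cost of less reliable results as noted in the instructions.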

zifa's People

Contributors

ahwillia, dakota-hawkins, epierson9, jlumpe, yue-jiang

zifa's Issues

Variance explained by zifa factors

Hi,
I am trying to find a principled way to select k, the number of factors, when using ZIFA. Is there a straightforward way to obtain the percentage of variance explained by each component/factor?

Thanks!

ZIFA in R?

Hello, could you advise how to run ZIFA in R? Thanks!

Access to x

Would it be possible to also retrieve the entries in the expression matrix without the zeros, that is, x before passing through h?

Working with sparse matrix

Hi,

I am interested in using ZIFA, but my data set is huge (I think all zero-inflated data would be) and needs to be stored as a sparse matrix. I don't see a sparse matrix implementation of this method.

confused about the magical_matrix

The matrixToInvert computed by the function computeMatrixInLastStep is the key term in the mean and covariance of the conditional distribution (i.e., equations (28) and (29) in the Supplementary Information). So why do you compute magical_matrix when obtaining sigma_xz and mu_xz, and what does it mean? I am eager to know. Thank you.
