aanet's Introduction

Archetypal Analysis network (AAnet)

Quick start

Introduction

AAnet is a tool for scalable Archetypal Analysis of large and potentially non-linear datasets. A full description of the algorithm is available in our manuscript on ArXiv.

D. van Dijk, D. Burkhardt, et al. Finding Archetypal Spaces for Data Using Neural Networks. 2019. arXiv

Archetypal analysis is a data decomposition method that describes each observation in a dataset as a convex combination of "pure types" or archetypes. Existing methods for archetypal analysis work well when a linear relationship exists between the feature space and the archetypal space. However, such methods are not applicable to systems where the feature space is generated non-linearly from the combination of archetypes, such as in biological systems or image transformations. Here, we propose a reformulation of the problem such that the goal is to learn a non-linear transformation of the data into a latent archetypal space. To solve this problem, we introduce Archetypal Analysis network (AAnet), which is a deep neural network framework for learning and generating from a latent archetypal representation of data. We demonstrate state-of-the-art recovery of ground-truth archetypes in non-linear data domains, show that AAnet can generate from data geometry rather than from data density, and use AAnet to identify biologically meaningful archetypes in single-cell gene expression data.
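
As a minimal sketch of the linear case described above (this is not AAnet itself), the following NumPy snippet builds observations as convex combinations of three made-up archetypes. The archetype coordinates, the Dirichlet sampling of the weights, and the element-wise square used to mimic a non-linear feature map are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three hypothetical archetypes in a 2-D feature space (one archetype per row).
archetypes = np.array([[0.0, 0.0],
                       [1.0, 0.0],
                       [0.0, 1.0]])

# Convex weights: every row is non-negative and sums to 1.
# Sampling from a Dirichlet distribution is one convenient way to get them.
weights = rng.dirichlet(alpha=np.ones(3), size=5)

# Linear archetypal model: each observation is a weighted average of archetypes.
observations = weights @ archetypes

# A non-linear feature space applies some map on top of the convex combination.
# The element-wise square is only an illustration of such a map.
nonlinear_observations = observations ** 2
```

In the linear case the rows of `weights` are the archetypal coordinates of each observation. In the non-linear case they can no longer be recovered by a linear decomposition, which is the setting the paragraph above describes.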

Using this repository

Currently, AAnet is not a full Python package, but all of the code you need to run the algorithm is in this repo. For convenience, we've organized code into folders for AAnet proper and for other algorithms to which we compared AAnet in our manuscript.

The most recent version of AAnet (08/01/2022) is implemented in PyTorch.

File list

  • AAnet_torch/models - Includes the most recent iteration of the AAnet model
  • AAnet_torch/data - Includes helper functions for generating curved simplices for testing
  • AAnet_torch/simulated_data_example - Includes a Jupyter notebook and helper functions to run AAnet

aanet's People

Contributors

dburkhardt, dvdijk, aarthivenkat, andrew-benz

aanet's Issues

from tensorflow.examples.tutorials.mnist import input_data deprecated

Hi, thank you for developing AAnet!

While running the tutorial I encountered the following error:

ModuleNotFoundError: No module named 'tensorflow.examples.tutorials'

Apparently, that module containing MNIST is no longer supported.

I was able to solve this with:

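import tensorflow as tf

# load MNIST from Keras instead of the removed tutorials module
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

data_all = x_train

data_all = data_all / 255  # scale pixel values from [0, 255] to [0, 1]
data_all = (data_all * 2) - 1  # rescale to [-1, 1] for tanh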
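
One difference that may matter here: `tf.keras.datasets.mnist.load_data()` returns `uint8` images of shape `(n, 28, 28)`, whereas the old `input_data` loader returned flattened float vectors by default. The sketch below wraps the rescaling above in a helper that also flattens each image. The helper name `scale_for_tanh` is hypothetical, and whether the tutorial expects flattened input is an assumption.

```python
import numpy as np

def scale_for_tanh(images):
    """Flatten uint8 images of shape (n, h, w) and rescale them to [-1, 1]."""
    flat = np.asarray(images, dtype=np.float32).reshape(len(images), -1)
    return flat / 255.0 * 2.0 - 1.0

# Stand-in for x_train: four random 28x28 uint8 images.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Each image becomes one row of 784 values in [-1, 1].
data_all = scale_for_tanh(fake_images)
```

With real data, `data_all = scale_for_tanh(x_train)` would replace the three assignments above.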

I hope this might be of use,
Best

Versions of dependencies

Hi,

I read your AAnet paper and found it really interesting. I have tried to run it on my computer, but calling methods from TensorFlow gives some errors, and I guess that the version I am using is different from the one the code was written for.

Could you tell me which version of TensorFlow (and other dependencies, if possible) AAnet uses?

Thanks.

adding setup.py & pypi release

Hi David,

Thanks for your effort to improve archetypal analysis. I would like to use this package, so I am wondering if you plan to release it via PyPI. If not any time soon, would it be possible for you to add a setup.py so that AAnet can be installed via pip (e.g. pip install git+https://github.com/KrishnaswamyLab/AAnet.git)?

predicting archetype membership of new points?

Thanks for publishing your implementation, I've really enjoyed your work. One addition to the tutorial notebook that I'd love to see is the application of the trained model to new data to predict their membership to each of the identified archetypes. I'm not sure how to do this and it would be really helpful to see it demonstrated. Currently my idea is to use the get distance function you've included to measure the distance between the new data point and each of the archetypes, then express membership as a ratio, the numerator of which would be the distance between the new point and a given archetype, and denominator the sum of distances between the new point and each archetype. That way the membership for each point across all archetypes should sum to 1. Is this the correct approach?
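
To make the proposed ratio concrete, here is a small NumPy sketch with made-up distances. Both function names are hypothetical and neither is part of AAnet. Note that with the distance in the numerator, the nearest archetype receives the smallest value. The inverse-distance variant shown next to it is one possible alternative, included purely as an assumption for comparison.

```python
import numpy as np

def distance_ratio(distances):
    """Ratio described above: each distance divided by the row sum."""
    distances = np.asarray(distances, dtype=float)
    return distances / distances.sum(axis=1, keepdims=True)

def inverse_distance_membership(distances, eps=1e-12):
    """Alternative (an assumption, not from AAnet): weight by 1 / distance."""
    inv = 1.0 / (np.asarray(distances, dtype=float) + eps)
    return inv / inv.sum(axis=1, keepdims=True)

# One hypothetical point with its distances to three archetypes.
d = np.array([[1.0, 3.0, 6.0]])

# Rows of both results sum to 1, but they rank the archetypes in opposite order.
ratio = distance_ratio(d)
inverse = inverse_distance_membership(d)
```

Here `ratio` is `[0.1, 0.3, 0.6]`, so the closest archetype (distance 1.0) gets the lowest score, while `inverse` assigns it the highest.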

EDIT: I took a look in the AAnet.py script itself, and in fairness to you, I think from your annotation I was able to figure out a prediction method, which I'll paste below. Is this the correct procedure? If so, perhaps this could be added, along with a little more explanation, to the end of the tutorial?

import numpy as np

# get testing data and apply model to predict membership
# (mnist and model are the objects created earlier in the tutorial)
data_test = mnist.test.images
data_test = (data_test * 2) - 1 # norm for tanh

# get only digit 4, single digit
digit = 4
idx_digit = mnist.test.labels == digit
data_test = data_test[idx_digit]

# predict membership
new_archetypal_coords = model.data2at(data_test)

# get the index of best membership for each point
labels = np.argmax(new_archetypal_coords, axis=1)

# plot points in PCA space, colored by archetype
model.plot_pca_ats_data(data_test, c=labels)
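
The argmax step above can be tried without a trained model. The sketch below uses a made-up array in place of the output of `model.data2at(data_test)`. That each row is non-negative and sums to 1 is an assumption about that output, not something confirmed here.

```python
import numpy as np

# Hypothetical archetypal coordinates for 3 points and 4 archetypes,
# standing in for the array returned by model.data2at(data_test).
coords = np.array([[0.70, 0.10, 0.15, 0.05],
                   [0.05, 0.05, 0.10, 0.80],
                   [0.30, 0.40, 0.20, 0.10]])

# Hard assignment: index of the archetype with the largest coordinate.
labels = np.argmax(coords, axis=1)

# Number of points assigned to each archetype (zeros are kept).
counts = np.bincount(labels, minlength=coords.shape[1])

# Weight of the dominant archetype, a rough indicator of how "pure" a point is.
dominant_weight = coords.max(axis=1)
```

Keeping `coords` alongside `labels` preserves the soft membership, which the hard labels discard for points such as the third row that sit between several archetypes.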
