mfbalin / concrete-autoencoders

Jupyter Notebook 67.44% Python 32.56%

machine-learning unsupervised-learning feature-selection concrete-autoencoders icml-2019 icml supervised-learning

concrete-autoencoders's Introduction

Concrete Autoencoders

The concrete autoencoder is an end-to-end differentiable method for global feature selection, which efficiently identifies a subset of the most informative features and simultaneously learns a neural network to reconstruct the input data from the selected features. The method can be applied to unsupervised and supervised settings, and is a modification of the standard autoencoder.
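As an illustration of the selection mechanism, here is a minimal NumPy sketch of sampling from a concrete (Gumbel-softmax) relaxation, the kind of relaxed one-hot selection the method's name refers to. It is not the package's implementation: the function name sample_concrete, the shapes, and the temperatures are assumptions chosen for the example.

```python
import numpy as np

def sample_concrete(logits, temperature, rng):
    """Draw one relaxed one-hot sample per row of `logits` (shape: K x D)."""
    # Gumbel(0, 1) noise via the inverse-CDF trick; the lower bound avoids log(0).
    u = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    scaled = (logits + gumbel) / temperature
    scaled -= scaled.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scaled)
    return weights / weights.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 8))  # hypothetical: K = 3 selector nodes, D = 8 features
x = rng.normal(size=(5, 8))       # hypothetical batch of 5 samples

soft = sample_concrete(logits, temperature=10.0, rng=rng)   # diffuse weights
sharp = sample_concrete(logits, temperature=0.01, rng=rng)  # close to one-hot
selected = x @ sharp.T  # each output column is a weighted mix of input columns
```

At a high temperature each row spreads its weight over many features. As the temperature approaches zero each row approaches a one-hot vector, so x @ sharp.T approaches picking K individual input columns.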

For more details, see the accompanying paper: "Concrete Autoencoders for Differentiable Feature Selection and Reconstruction", ICML 2019, and please use the citation below.

@article{abid2019concrete,
  title={Concrete Autoencoders for Differentiable Feature Selection and Reconstruction},
  author={Abid, Abubakar and Balin, Muhammed Fatih and Zou, James},
  journal={arXiv preprint arXiv:1901.09346},
  year={2019}
}

Installation

To install, use pip install concrete-autoencoder

Usage

Here's an example of using Concrete Autoencoders to select the 20 most important features (pixels) across the entire MNIST dataset:

from concrete_autoencoder import ConcreteAutoencoderFeatureSelector
from keras.datasets import mnist
from keras.utils import to_categorical
from keras.layers import Dense, Dropout, LeakyReLU
import numpy as np

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = np.reshape(x_train, (len(x_train), -1))
x_test = np.reshape(x_test, (len(x_test), -1))
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)

def decoder(x):
    x = Dense(320)(x)
    x = LeakyReLU(0.2)(x)
    x = Dropout(0.1)(x)
    x = Dense(320)(x)
    x = LeakyReLU(0.2)(x)
    x = Dropout(0.1)(x)
    x = Dense(784)(x)
    return x

selector = ConcreteAutoencoderFeatureSelector(K = 20, output_function = decoder, num_epochs = 800)

selector.fit(x_train, x_train, x_test, x_test)

Then, to get the indices of the selected pixels, run:

selector.get_support(indices = True)

Run this code inside a colab notebook: https://colab.research.google.com/drive/11NMLrmToq4bo6WQ_4WX5G4uIjBHyrzXd

Documentation:

class ConcreteAutoencoderFeatureSelector:

The constructor takes the following parameters:

K: the number of features one wants to select

output_function: the decoder function

num_epochs: the number of epochs for the first training attempt (see tryout_limit)

batch_size: the batch size during training

learning_rate: the learning rate of the Adam optimizer used during training

start_temp: the starting temperature of the concrete select layer

min_temp: the ending temperature of the concrete select layer

tryout_limit: number of times to double the number of epochs and try again in case it doesn't converge

fit(X, Y = None): trains the concrete autoencoder

X: the data for which you want to do feature selection

Y: labels (optional). If labels are given, supervised feature selection is performed; otherwise, unsupervised feature selection is performed.

transform(X): filters X's features after fit has been called

X: the data to be filtered

fit_transform(X): calls fit and then transform

X: the data to do feature selection on and filter

get_support(indices = False): returns the indices of the selected features if indices is True; otherwise returns a boolean mask

indices: boolean flag to determine whether to return a list of indices or a boolean mask

get_params(): returns the underlying Keras model for the concrete autoencoder
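The start_temp and min_temp parameters suggest a temperature that decays during training. A minimal sketch of one such schedule follows, assuming exponential decay from start_temp to min_temp over a fixed number of steps. The function name, the default values, and the step count are illustrative assumptions, not taken from the package.

```python
def temperature(step, total_steps, start_temp=10.0, min_temp=0.1):
    """Exponentially interpolate from start_temp (step 0) to min_temp (last step)."""
    # Clamp progress to [0, 1] so the temperature never drops below min_temp.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return start_temp * (min_temp / start_temp) ** progress

# Hypothetical run of 100 steps: print the temperature at a few checkpoints.
for step in (0, 25, 50, 75, 100):
    print(step, round(temperature(step, total_steps=100), 4))
```

With these example values the temperature is at its geometric midpoint (1.0) halfway through training, so most of the sharpening happens early.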

concrete-autoencoders's Issues

Reconstruct data from extracted features

Hi, thanks for your interesting paper.

From my understanding, the concrete autoencoder is an autoencoder whose latent space is constrained by the concrete distribution, so that it works as a feature selector.

I want to know whether it's possible to reconstruct the data when I only have the extracted features. For example, I have 2 populations, A and B:

Collect n1 features on population A -> yield dataset A -> train concrete autoencoder on it -> get n2 features of A
Collect n2 features on population B -> yield dataset B -> reconstruct n1 features by using only the decoder -> get n1 features of population B

Could you suggest how to do this? Thank you.

Hyperparameter tuning

Hello,
Can anybody help me find a way to apply hyperparameter tuning to this algorithm?

Error in colab demo

Running the second block raises an error:
cannot import name 'Adam' from 'keras.optimizers' (/usr/local/lib/python3.7/dist-packages/keras/optimizers.py)
Maybe you can fix it.😀

About the dataset

Could you provide the download link for the GEO data?

Repeated features?

Thank you for your great feature selection method. However, I found that the same feature can be selected multiple times. For example, after training, get_support() might return [3, 4, 6, 2, 2, 5, 1, 0]. In this case, the feature with index 2 has been selected twice. Is it possible to avoid such cases when using concrete autoencoders? Thank you!
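A quick way to check a result for repeats is to count how often each index appears. The sketch below uses plain NumPy on the example array from this issue; it only detects duplicates and does not change how the model selects features.

```python
import numpy as np

# Example index array quoted in the issue above.
indices = np.array([3, 4, 6, 2, 2, 5, 1, 0])

# Count occurrences of each distinct index.
unique, counts = np.unique(indices, return_counts=True)
duplicates = unique[counts > 1]  # indices selected more than once
n_distinct = unique.size         # number of distinct features actually selected

print("duplicated indices:", duplicates.tolist())
print("distinct features:", n_distinct, "of", indices.size)
```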

GPU usage

Does Concrete Autoencoder have GPU support?

PyTorch implementation

Hello,
I'm really thankful for this amazing work. I would like to know whether it is possible to implement it in PyTorch.
Thanks again.

Error in calculating mean_max

Not sure if this is the right place to put it, but I think you have a small error when calculating mean_max.

Line 107 reads:
if K.get_value(K.mean(K.max(K.softmax(self.concrete_select.logits, axis = -1)))) >= stopper_callback.mean_max_target:
Line 52 reads:
monitor_value = K.get_value(K.mean(K.max(K.softmax(self.model.get_layer('concrete_select').logits), axis = -1)))

The difference is that in Line 107 you have axis=-1 in the K.softmax function, while in Line 52 you have it in the K.max function. From what I can tell, K.max without an axis argument takes the maximum of the entire matrix instead of the row-wise maximum. For this reason, I have been seeing the program finish and exit while the monitored value is still only ~0.15.
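The effect of the axis placement can be reproduced outside Keras. The sketch below uses NumPy stand-ins for the backend calls on a made-up 2 x 3 logits matrix. It illustrates the behavior described in this issue and is not the package's code.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along `axis`."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Made-up logits: row 0 is confident, row 1 is completely undecided.
logits = np.array([[4.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
probs = softmax(logits, axis=-1)

# No axis on max: a single maximum over the whole matrix, so the mean is a no-op.
global_max = np.mean(np.max(probs))
# axis=-1 on max: one maximum per row, then averaged over rows.
row_max = np.mean(np.max(probs, axis=-1))

print(global_max, row_max)
```

Because the global maximum only reflects the single most confident row, a stopping test based on it can pass while the row-wise average is still low.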

Could you provide the code for the L1000 datasets?

Hello, I've read your paper and I would like to run an experiment on the L1000 datasets.
I checked this site, but there are seven datasets.

Which datasets did you use?
And could you provide the code for processing the datasets?

Thank you.

Question on epochs

Hello,
thank you for your significant contribution.
I have a question about the model's epochs: if I set, say, 1 epoch in the constructor, why does the verbose output show 1, 2, 4, 8, and then 16 epochs being trained?
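One reading consistent with the tryout_limit parameter documented above is that each failed attempt is followed by a new attempt with twice as many epochs. The sketch below reproduces the sequence reported here under that reading. The helper name is hypothetical, and tryout_limit=5 is an assumption chosen to match the five runs observed.

```python
def attempt_epochs(num_epochs, tryout_limit):
    """Epoch counts for successive attempts, doubling after each failed attempt."""
    return [num_epochs * 2 ** attempt for attempt in range(tryout_limit)]

# Starting from 1 epoch with five attempts (assumed) gives the reported sequence.
schedule = attempt_epochs(num_epochs=1, tryout_limit=5)
print(schedule)
```

If training converges on an earlier attempt, the later entries of the schedule would simply never run.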

Is it suitable for a large amount of data? What is mean_max_target?

Hi, thank you for providing the paper and the code.

I wanted to ask whether you recommend this approach for a large amount of data, say around 2000 features. I tried the code with that many features and K=600 selected features, but it took a very long time, and the number of epochs kept doubling, apparently because training was not converging. Does this approach usually take this long?

I also wanted to ask about convergence. From the code, my understanding is that the selection is finished once we reach 'mean_max_target=0.998'; otherwise the number of epochs keeps doubling until that value is reached. Is that right? I set the tryout limit to only 4 and the number of epochs to 50, so training stopped after 400 epochs, but the selected features were redundant, i.e. I got an array of indices such as [ 305, 310, 822, 310, 310, 310, 310, 305, 305, 310, 310, 310, 310, 305, 305, 310, 305, 310, 305, 305, 310, 305, 310, 310, 310, 771, …. ]

The array did contain 600 entries, though. Is there any guidance on how to set the tryout limit and the number of epochs for this many features, and what does mean_max_target mean?

What exactly does the transform(X) method do?

I applied it to my dataset and the shape changed to (K, N), where K is the number of features passed to your model and N is the number of features in my X_train data. I expected the transform method to behave like the transform methods of dimensionality-reduction techniques.
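For reference, feature selection on a data matrix is usually a column selection, so a sketch of the expected shapes may help. This uses plain NumPy with made-up indices standing in for the output of get_support. It does not call the package, and whether transform returns exactly this layout is not confirmed here.

```python
import numpy as np

n_features = 10
X = np.arange(40).reshape(4, n_features)  # 4 samples, 10 features

# Hypothetical result of get_support(indices=True).
indices = np.array([7, 2, 5])

# Equivalent boolean mask, as get_support(indices=False) is documented to return.
mask = np.zeros(n_features, dtype=bool)
mask[indices] = True

X_by_index = X[:, indices]  # keeps the order given in `indices`
X_by_mask = X[:, mask]      # columns come out in ascending order

print(X_by_index.shape, X_by_mask.shape)
```

Both results have shape (samples, K). A result of shape (K, N) would instead suggest that a selection matrix, not the filtered data, was returned.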

Confusion about step loss and value loss

Hello,
after each epoch the code prints: step - loss: 0.0043 - val_loss: 0.0043. What do the step loss and val_loss mean for each epoch?
Also, at the start of each epoch it shows: mean max of probabilities: 6.54364e-05 - temperature 0.1. What is the mean max of probabilities here?
I am selecting 50 features using different numbers of epochs and trying to find the reconstruction error at each epoch. Which value is the reconstruction error?
Thank you.
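If the model is trained with a mean squared error objective (an assumption here; another issue on this page reports that the code uses MSE), the reconstruction error is the mean squared difference between the input and its reconstruction. A small NumPy sketch with made-up arrays:

```python
import numpy as np

def reconstruction_mse(x, x_hat):
    """Mean squared error over all samples and features."""
    return float(np.mean((x - x_hat) ** 2))

# Made-up inputs and reconstructions: only one entry differs, by 0.5.
x = np.array([[0.0, 1.0],
              [1.0, 0.0]])
x_hat = np.array([[0.0, 0.5],
                  [1.0, 0.0]])

err = reconstruction_mse(x, x_hat)
print(err)
```

Computed on held-out data, this is the same kind of quantity a val_loss would report under an MSE objective.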

Supervised concrete autoencoder

Hi,

I am looking through your code and I see that the supervised implementation uses the MSE loss instead of the cross-entropy loss mentioned in the paper. There is no condition that switches to the cross-entropy loss in the supervised case, so the MSE loss is always used. Is this a mistake, or is it done on purpose?
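To make the distinction in this question concrete, the sketch below computes both losses on the same made-up one-hot labels and predicted probabilities. It is plain NumPy, independent of the package, and the function name is chosen for the example.

```python
import numpy as np

def categorical_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean over samples of -sum(y_true * log(y_prob))."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_prob), axis=-1)))

# Made-up one-hot labels and predicted class probabilities.
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
y_prob = np.array([[0.50, 0.25, 0.25],
                   [0.25, 0.50, 0.25]])

ce = categorical_cross_entropy(y_true, y_prob)
mse = float(np.mean((y_true - y_prob) ** 2))
print(ce, mse)
```

Cross-entropy only looks at the probability assigned to the correct class, while MSE also penalizes how the remaining probability is spread over the wrong classes.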

set_learning_phase is deprecated

set_learning_phase is deprecated in Keras, so ConcreteAutoencoderFeatureSelector cannot be run. It throws the error: "AttributeError: module 'keras.api.backend' has no attribute 'set_learning_phase' "

License

Hi,

Could you consider adding a permissive license such as MIT to the repository?
GitHub makes that really easy: https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/adding-a-license-to-a-repository

Thanks in advance.

Unsupervised feature selection on structured CSV data: the program can't run

Hello! I want to use this autoencoder for unsupervised feature selection, but it always fails.

If I use fit(X), it reports the error 'fit() missing 1 required positional argument: 'Y''.

If I use fit(X, Y=None), it reports 'have no len()'.

If I comment out the line 'assert len(X) == len(Y)', an error is reported when the epochs begin.

So how do we start unsupervised training?
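The usage example in the README passes the same array as both input and target (selector.fit(x_train, x_train, x_test, x_test)), which suggests one way to run unsupervised selection. The helper below sketches that call pattern. fit_unsupervised and RecordingSelector are hypothetical names, and the stand-in class exists only so the snippet runs without the package installed.

```python
import numpy as np

def fit_unsupervised(selector, X, X_val=None):
    """Train `selector` to reconstruct X from itself (inputs double as targets)."""
    if X_val is None:
        return selector.fit(X, X)
    return selector.fit(X, X, X_val, X_val)

class RecordingSelector:
    """Stand-in that records its arguments; not the real selector class."""
    def fit(self, X, Y, val_X=None, val_Y=None):
        # Mirrors the length check quoted in the issue above.
        assert len(X) == len(Y)
        self.called_with = (X, Y, val_X, val_Y)
        return self

# Made-up numeric data standing in for a loaded CSV table.
X = np.random.default_rng(0).random((6, 4))
stub = fit_unsupervised(RecordingSelector(), X)
```

With CSV data, X would first need to be loaded into a numeric NumPy array, for example with np.loadtxt(path, delimiter=","), where path is a placeholder for the file location.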
