
mac-network-pytorch's Introduction

mac-network-pytorch

Memory, Attention and Composition (MAC) Network for CLEVR, from Compositional Attention Networks for Machine Reasoning (https://arxiv.org/abs/1803.03067), implemented in PyTorch.

Requirements:

  • Python 3.6
  • PyTorch 0.4
  • torchvision
  • Pillow
  • nltk
  • tqdm

To train:

  1. Download and extract CLEVR v1.0 dataset from http://cs.stanford.edu/people/jcjohns/clevr/
  2. Preprocess question data and extract image features using ResNet-101:
python preprocess.py [CLEVR directory]
python image_feature.py [CLEVR directory]

!CAUTION! The file created by image_feature.py is very large (~70 GiB). You may use HDF5 compression (see the sketch after these steps), but it will slow down feature extraction.

  3. Run train.py:
python train.py [CLEVR directory]
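
As a reference for the compression mentioned in the caution above, here is a minimal sketch of enabling gzip compression when writing the features with h5py. The dataset name, shape, and chunk layout below are assumptions for illustration, not necessarily what image_feature.py uses:

```
import h5py

# Hypothetical sketch: create a compressed HDF5 dataset for ResNet-101 features.
# 'data', the shape, and the chunking are assumed values, not the repo's exact ones.
n_images = 70000  # CLEVR train split size
with h5py.File('train_features.hdf5', 'w') as f:
    dset = f.create_dataset(
        'data',
        shape=(n_images, 1024, 14, 14),
        dtype='float32',
        chunks=(1, 1024, 14, 14),   # one chunk per image keeps random reads fast
        compression='gzip',
        compression_opts=4,         # moderate compression level (1-9)
    )
    # write features batch by batch, e.g.:
    # dset[i * batch_size:(i + 1) * batch_size] = features_batch
```

Gzip is applied per chunk, which is why writing (and later reading) becomes slower than with an uncompressed file.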

This implementation reaches 95.75% accuracy at epoch 10 and 96.5% accuracy at epoch 20.

mac-network-pytorch's People

Contributors

  • rosinality


mac-network-pytorch's Issues

image_feature.py output shape?

Hi! This is probably a gap in my understanding of PyTorch or h5py, but I wanted to bring it to your attention just in case it's not.

The output of image_feature.py is a batch_size * (num images in split) x 1024 x 14 x 14 numpy array. You assign the features associated with each image to batch_size contiguous indices in a slice along the first dimension. I don't understand why it's necessary to store batch_size copies of each image's features.

Later, when you load the data from the h5 file in the CLEVR dataset's __getitem__ method in dataset.py, you index the array as if img[i] gives the features of the i-th image. But based on how you initialized the h5 file, these would actually be stored at [batch_size*i : batch_size*(i+1)], not at i.

What am I missing here?
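
For what it's worth, here is a small self-contained sketch of the indexing the writer and the reader have to agree on: one (1024, 14, 14) feature map per image, written batch by batch and read back by plain index. The dataset name and toy shapes are assumptions for illustration, not the repo's exact code:

```
import h5py
import numpy as np

n_images, batch_size = 16, 4  # toy sizes for illustration

with h5py.File('features_toy.hdf5', 'w') as f:
    dset = f.create_dataset('data', shape=(n_images, 1024, 14, 14), dtype='float32')
    for b in range(n_images // batch_size):
        feats = np.random.rand(batch_size, 1024, 14, 14).astype('float32')
        # each image in the batch occupies exactly one row along the first axis
        dset[b * batch_size:(b + 1) * batch_size] = feats

with h5py.File('features_toy.hdf5', 'r') as f:
    img = f['data']
    # consistent with the writer above, img[i] is the i-th image's features
    assert img[7].shape == (1024, 14, 14)
```

If the writer and __getitem__ both follow this convention, img[i] does give the i-th image without duplication; the question above is whether the repo's writer actually does.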

Padded token is not masked when calculating attention in control unit

Thanks for sharing the implementation, it's really nice.

I noticed that when you call the MAC unit with the LSTM output and the image representation, question_len is not passed in, and the attention calculation in the control unit seems unaware of the padding tokens. Am I missing something here?

Specifically I'm referring to this line:

attn = F.softmax(attn_weight, 1)

With sentences of varying lengths, the attention should be restricted to the actual sentence length rather than covering the padding tokens. Is that right?
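
For illustration, a minimal sketch of one way to mask padded positions before the softmax, assuming attn_weight has shape (batch, seq_len, 1) and question_len holds the true lengths; the names and shapes are assumptions, not the repo's exact code:

```
import torch
import torch.nn.functional as F

def masked_softmax(attn_weight, question_len):
    # attn_weight: (batch, seq_len, 1), question_len: (batch,) true lengths
    batch, seq_len, _ = attn_weight.shape
    positions = torch.arange(seq_len, device=attn_weight.device).unsqueeze(0)  # (1, seq_len)
    valid = positions < question_len.unsqueeze(1)                              # (batch, seq_len)
    # set padded positions to -inf so softmax assigns them zero weight
    attn_weight = attn_weight.masked_fill(valid.unsqueeze(2) == 0, float('-inf'))
    return F.softmax(attn_weight, 1)

# toy usage
attn_weight = torch.randn(2, 5, 1)
question_len = torch.tensor([3, 5])
attn = masked_softmax(attn_weight, question_len)
print(attn[0, 3:, 0])  # the two padded positions of the first example are 0
```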

MACNet configuration

@rosinality Hi, since the original MACnet code supports multiple configurations, I wanted to make sure: does your code offer full support for the original MACnet's functionality or not?

Why keep two copies of the network (net and net_running)?

Hi, I'm looking at the code and couldn't understand this part...
Two MACNetworks are created: "net" and "net_running". During the training stage, only "net" is trained, and 0.01% of "net"'s parameters are injected into the "net_running" model, while at the testing stage "net_running" is the one evaluated.

I'm wondering why it's necessary to keep two copies here, instead of using a single "net" model directly?
Thanks!
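
This looks like the usual exponential moving average (Polyak averaging) of the weights: net_running is a smoothed copy of net that is often more stable to evaluate than the raw, just-updated net. A hedged sketch of the idea; the function name and decay value below are illustrative, not necessarily the repo's exact ones:

```
import torch

def accumulate(model_ema, model, decay=0.999):
    # exponential moving average of parameters:
    #   ema_param <- decay * ema_param + (1 - decay) * param
    ema_params = dict(model_ema.named_parameters())
    with torch.no_grad():
        for name, param in model.named_parameters():
            ema_params[name].mul_(decay).add_(param, alpha=1 - decay)

# per training step (sketch):
#   loss.backward(); optimizer.step()   # updates net
#   accumulate(net_running, net)        # smooths the update into net_running
# at validation time, evaluate net_running in eval mode
```

Each step only a small fraction (1 - decay) of net's parameters is mixed into net_running, which is the "injection" described above; the exact decay used in the repo may differ.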

Resume Training using saved checkpoints.

Hi again,

I want to know how we can use the saved checkpoints to resume training.

I used the following code for this purpose, but it gave me a few warnings and I am not sure if it was loading the weights correctly:

```
import glob
import torch

if opt.resume >= 0:
    # find the checkpoint saved at the requested epoch
    model_param_file = glob.glob('%s/checkpoint_%s*.model' % (opt.path_to_chkpt_folder, opt.resume))
    net = torch.load(model_param_file[0])
```

opt.resume is the epoch number I want to resume training from.
Thanks!
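
In case it helps, a hedged sketch of a resume helper: if the checkpoint was saved with torch.save(net.state_dict(), ...), loading it into an already-constructed model usually avoids the warnings that come from unpickling a whole module. The checkpoint naming scheme is copied from the snippet above; everything else is an assumption, not the repo's exact API:

```
import glob
import torch

def load_checkpoint(net, checkpoint_dir, epoch):
    # hypothetical helper; reuses the 'checkpoint_<epoch>*.model' naming from the snippet above
    path = sorted(glob.glob('%s/checkpoint_%s*.model' % (checkpoint_dir, epoch)))[0]
    checkpoint = torch.load(path, map_location='cpu')
    if isinstance(checkpoint, dict):
        # checkpoint is a state_dict: copy the weights into the existing model
        net.load_state_dict(checkpoint)
        return net
    # otherwise the whole module was pickled; use the loaded object directly
    return checkpoint
```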

Accuracy on validation set

Hi @rosinality, great job!
Is the accuracy you reported on the training set?

I got 96.xx% accuracy on the train data, and the avg. accuracy on the validation set is 85.9%.
