SaoYan / LearnToPayAttention
PyTorch implementation of the ICLR 2018 paper "Learn To Pay Attention" (with some modifications)
License: GNU General Public License v3.0
Hello,
I want to introduce your attention model into a ResNet network. My model is already done and I just want to integrate the attention. I am not sure whether I can add a learning step focusing on C1, C2, and C3, so I want to ask whether I can train my model without the compatibility scores.
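For reference, the compatibility scores from the paper can be sketched in a few lines of NumPy. This is a minimal dot-product variant; the shapes and names here are illustrative and not the repository's API:

```python
import numpy as np

def compatibility_attention(local_feats, g):
    """Dot-product compatibility attention (Learn To Pay Attention style).

    local_feats: (n, c) array of n local feature vectors
                 (e.g. flattened C1/C2/C3 spatial locations)
    g:           (c,) global feature vector
    Returns the attention weights and the attention-weighted descriptor.
    """
    scores = local_feats @ g                    # c_i = <l_i, g>
    scores -= scores.max()                      # numerical stability
    a = np.exp(scores) / np.exp(scores).sum()   # softmax over locations
    g_a = a @ local_feats                       # g_a = sum_i a_i * l_i
    return a, g_a

# toy example: 4 spatial locations, 3 channels
rng = np.random.default_rng(0)
l = rng.normal(size=(4, 3))
g = rng.normal(size=3)
a, g_a = compatibility_attention(l, g)
```

Training without these scores would mean the network has no learned weighting over C1–C3, so the attended descriptor degenerates to an unweighted pooling of the local features.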
python train.py --attn_mode before --outf logs_before --normalize_attn --log_images
loading the dataset ...
Downloading https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz to CIFAR100_data/cifar-100-python.tar.gz
100.0%
Extracting CIFAR100_data/cifar-100-python.tar.gz to CIFAR100_data
Traceback (most recent call last):
File "train.py", line 199, in <module>
main()
File "train.py", line 51, in main
trainloader = torch.utils.data.DataLoader(trainset, batch_size=opt.batch_size, shuffle=True, num_workers=8, worker_init_fn=_init_fn)
NameError: name '_init_fn' is not defined
The code crashes after a few epochs.
First, great work @SaoYan, congrats
I have a question about the attention mechanism you implemented: can I reuse this approach in TensorFlow?
Best,
model1.py
x = self.conv_block1(x)
x = self.conv_block2(x)
l1 = self.conv_block3(x) # /1
x = tnf.max_pool2d(l1, kernel_size=2, stride=2, padding=0) # /2
l2 = self.conv_block4(x) # /2
x = tnf.max_pool2d(l2, kernel_size=2, stride=2, padding=0) # /4
l3 = self.conv_block5(x) # /4
x = tnf.max_pool2d(l3, kernel_size=2, stride=2, padding=0) # /8
x = self.conv_block6(x) # /32
g = self.dense(x) # batch_sizex512x1x1
### can be modified to:
x = self.conv_block1(x)
x = self.conv_block2(x)
l1 = self.conv_block3(x) # /1
l1 = tnf.max_pool2d(l1, kernel_size=2, stride=2, padding=0) # /2
l2 = self.conv_block4(l1) # /2
l2 = tnf.max_pool2d(l2, kernel_size=2, stride=2, padding=0) # /4
l3 = self.conv_block5(l2) # /4
x = tnf.max_pool2d(l3, kernel_size=2, stride=2, padding=0) # /8
x = self.conv_block6(x) # /32
g = self.dense(x) # batch_sizex512x1x1
##################################################
model2.py
x = self.conv_block1(x)
x = self.conv_block2(x)
x = self.conv_block3(x)
l1 = tnf.max_pool2d(x, kernel_size=2, stride=2, padding=0) # /2
l2 = tnf.max_pool2d(self.conv_block4(l1), kernel_size=2, stride=2, padding=0) # /4
l3 = tnf.max_pool2d(self.conv_block5(l2), kernel_size=2, stride=2, padding=0) # /8
x = self.conv_block6(l3) # /32
g = self.dense(x) # batch_sizex512x1x1
### can be modified to:
x = self.conv_block1(x)
x = self.conv_block2(x)
l1 = self.conv_block3(x) # /1
l1 = tnf.max_pool2d(l1, kernel_size=2, stride=2, padding=0) # /2
l2 = self.conv_block4(l1) # /2
l2 = tnf.max_pool2d(l2, kernel_size=2, stride=2, padding=0) # /4
l3 = self.conv_block5(l2) # /4
x = tnf.max_pool2d(l3, kernel_size=2, stride=2, padding=0) # /8
x = self.conv_block6(x) # /32
g = self.dense(x) # batch_sizex512x1x1
The two versions compute the same values for x and g; note, though, that the modified form rebinds l1–l3 to the pooled tensors, so any attention layer reading them would see the post-pooling resolution.
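The equivalence of the two styles (and the caveat about what l1 ends up holding) can be checked with a toy NumPy stand-in for the conv/pool blocks. The functions below are placeholders, not the repository's layers:

```python
import numpy as np

def pool(x):
    # 2x2 max pooling with stride 2 (toy stand-in for F.max_pool2d)
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def conv(x):
    # toy stand-in for a conv block: any deterministic map works here
    return x * 2.0 + 1.0

x = np.arange(64, dtype=float).reshape(8, 8)

# style 1: keep the pre-pool activations in l1, pool into a separate x
l1 = conv(x)
y1 = conv(pool(l1))

# style 2: rebind l1 to the pooled tensor before the next block
l1b = conv(x)
l1b = pool(l1b)
y2 = conv(l1b)
```

y1 and y2 are identical, but l1 keeps the full 8x8 resolution while l1b holds the pooled 4x4 tensor; that difference matters if the attention layers read these names.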
Why use num_aug=3 within one epoch? Do the repeated augmented passes increase robustness?
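For context, num_aug effectively repeats the dataset with fresh random augmentations inside a single epoch. A minimal sketch of that pattern (the flip here is a toy placeholder for the script's actual augmentations):

```python
import random

def augment(sample, rng):
    # toy augmentation: random "horizontal flip" of a 1-D sample
    return sample[::-1] if rng.random() < 0.5 else sample

def one_epoch(dataset, num_aug=3, seed=0):
    """Iterate the dataset num_aug times, re-augmenting on every pass."""
    rng = random.Random(seed)
    seen = []
    for _ in range(num_aug):
        for sample in dataset:
            seen.append(augment(sample, rng))
    return seen

data = [[1, 2, 3], [4, 5, 6]]
out = one_epoch(data, num_aug=3)
```

Each epoch therefore sees num_aug differently-augmented copies of every sample, which trades longer epochs for more augmentation diversity per optimizer step schedule.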
Hello, thank you very much for posting this implementation of the 'LearnToPayAttention' paper. I was hoping you could help me with an issue I am having when running the code. When I run model 1 with (or without) attention, using the default hyperparameter settings from the GitHub page (LR = 0.1, etc.) on CIFAR-100, the training loss and train/test accuracy do not change: training loss is stuck at around 4.6 and test accuracy at 1%. I tried PyTorch 0.4.1 and 1.0.0. Any help would be greatly appreciated. Thanks.
Can you include a notebook showing how to extract the attention map for a prediction given the trained model and a sample image?
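Until such a notebook exists, the post-processing side can be sketched as follows. This assumes you already have the raw per-image compatibility scores from a forward pass; the function name and upsampling factor are illustrative:

```python
import numpy as np

def attention_map(scores, upsample=8):
    """Turn a low-resolution compatibility map into a viewable attention map.

    scores:   (h, w) raw compatibility scores for one image
    upsample: factor back to input resolution (e.g. 8 for a /8 feature map)
    """
    a = np.exp(scores - scores.max())
    a /= a.sum()                 # softmax over spatial locations
    a /= a.max()                 # rescale to [0, 1] for visualization
    # nearest-neighbour upsampling back to the input size
    return np.kron(a, np.ones((upsample, upsample)))

scores = np.random.default_rng(1).normal(size=(4, 4))
amap = attention_map(scores, upsample=8)
```

The resulting map can then be overlaid on the input image (e.g. as a heatmap with matplotlib's `imshow` and an alpha channel).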
First of all, thank you very much for the code.
I have a few questions and hope you can help.
Thanks in advance for your help!