Implementing Attention Augmented Convolutional Networks using PyTorch
- The paper's implementation is in TensorFlow, so I reimplemented it in PyTorch.
- I posted two versions of the "Attention-Augmented Conv" layer.
Reference
Paper
- Attention Augmented Convolutional Networks
- Authors: Irwan Bello, Barret Zoph, Ashish Vaswani, Jonathon Shlens, Quoc V. Le (Google Brain)
Wide-ResNet
- GitHub URL
- Thank you :)
Method
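The core idea of the paper is to concatenate the output of a standard convolution with the output of multi-head self-attention computed over all spatial positions. Below is a minimal, simplified sketch of that idea; it is an illustration, not either of the posted versions, it omits the paper's relative position embeddings, and all names (e.g. `AAConvSketch`) are hypothetical:

```python
# A minimal sketch of an attention-augmented convolution, assuming dk and dv
# are divisible by Nh and kernel_size is odd. Not the repo's actual module.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AAConvSketch(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, dk, dv, Nh):
        super().__init__()
        self.dk, self.dv, self.Nh = dk, dv, Nh
        # Convolutional branch produces the remaining out_channels - dv channels.
        self.conv = nn.Conv2d(in_channels, out_channels - dv, kernel_size,
                              padding=kernel_size // 2)
        # One 1x1 conv computes queries, keys, and values for all heads at once.
        self.qkv = nn.Conv2d(in_channels, 2 * dk + dv, kernel_size=1)
        # Output projection for the attention branch.
        self.attn_out = nn.Conv2d(dv, dv, kernel_size=1)

    def forward(self, x):
        B, _, H, W = x.shape
        q, k, v = torch.split(self.qkv(x), [self.dk, self.dk, self.dv], dim=1)

        def heads(t, d):
            # (B, d, H, W) -> (B, Nh, d // Nh, H * W)
            return t.reshape(B, self.Nh, d // self.Nh, H * W)

        q = heads(q, self.dk) * (self.dk // self.Nh) ** -0.5  # scaled queries
        k, v = heads(k, self.dk), heads(v, self.dv)
        logits = torch.einsum("bncq,bnck->bnqk", q, k)   # pairwise logits
        weights = F.softmax(logits, dim=-1)
        attn = torch.einsum("bnqk,bnck->bncq", weights, v)
        attn = self.attn_out(attn.reshape(B, self.dv, H, W))
        # Concatenate convolutional and attention feature maps along channels.
        return torch.cat([self.conv(x), attn], dim=1)
```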
Input Parameters
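The parameter names below follow the paper's notation: dk is the total depth of keys, dv the total depth of values, and Nh the number of attention heads (dk and dv must both be divisible by Nh). A hypothetical usage of the sketch above:

```python
# Hypothetical usage of the AAConvSketch sketch; not the repo's real API.
import torch

layer = AAConvSketch(in_channels=3, out_channels=64, kernel_size=3,
                     dk=40, dv=4, Nh=4)
x = torch.randn(16, 3, 32, 32)  # a batch of CIFAR-sized images
print(layer(x).shape)           # torch.Size([16, 64, 32, 32])
```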
Experiments
Datasets | Model | Accuracy | Epochs | Training Time
---|---|---|---|---
CIFAR-10 | WORK IN PROGRESS | | |
CIFAR-100 | 3 plain Conv layers (channels: 64, 128, 192) | 61.6% | 100 | 22m
CIFAR-100 | 3 Attention-Augmented Conv layers (channels: 64, 128, 192) | 59.82% | 35 | 2h 23m
- I just wanted to check the feasibility of this method (the Attention-Augmented Conv layer); next I'll try it with ResNet.
- The results above show a large difference in training time. I will look into this part a bit more.
- I have seen an issue reporting that the torch.einsum function is slow. Link (a small timing sketch follows this list)
- When I executed the example code from the link, the result was:
- using cuda
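Since the output above is truncated, here is a small timing sketch (an assumed stand-in for the linked script, not a copy of it) contrasting torch.einsum with an equivalent torch.matmul on the attention-logits contraction:

```python
# Hedged timing sketch: einsum vs. matmul for q @ k^T, with hypothetical shapes.
import time
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("using", device)

B, Nh, HW, dkh = 32, 8, 1024, 40  # e.g. a 32x32 feature map, 8 heads
q = torch.randn(B, Nh, HW, dkh, device=device)
k = torch.randn(B, Nh, HW, dkh, device=device)

def bench(fn, iters=100):
    # Warm up once, then synchronize around the loop so GPU time is counted.
    fn()
    if device.type == "cuda":
        torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        fn()
    if device.type == "cuda":
        torch.cuda.synchronize()
    return (time.time() - start) / iters

t_einsum = bench(lambda: torch.einsum("bnqd,bnkd->bnqk", q, k))
t_matmul = bench(lambda: torch.matmul(q, k.transpose(-2, -1)))
print(f"einsum: {t_einsum * 1e3:.3f} ms, matmul: {t_matmul * 1e3:.3f} ms")
```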