lonepatient / lookahead_pytorch

PyTorch implementation of the Lookahead optimizer

License: MIT License

pytorch lookahead optimizer resnet-18 cifar10

lookahead_pytorch's Introduction

Lookahead Pytorch

This repository contains a PyTorch implementation of the Lookahead Optimizer from the paper

Lookahead Optimizer: k steps forward, 1 step back

by Michael R. Zhang, James Lucas, Geoffrey Hinton and Jimmy Ba.

Dependencies

PyTorch
torchvision
matplotlib

Usage

The code in this repository implements both Lookahead and Adam training, with examples on the CIFAR-10 dataset.

To use Lookahead, wrap a base optimizer as follows:

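import torch.optim as optim
from optimizer import Lookahead

# Wrap a base optimizer: k fast steps per slow update, alpha is the slow step size.
optimizer = optim.Adam(model.parameters(), lr=0.001)
optimizer = Lookahead(optimizer=optimizer, k=5, alpha=0.5)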
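
For intuition, the update rule from the paper ("k steps forward, 1 step back") can be sketched in plain Python. This is not the code in this repository: the function name `lookahead_minimize`, the toy quadratic objective, and the use of plain gradient descent as the inner optimizer (instead of Adam) are assumptions made purely for illustration. Only `k` and `alpha` correspond to the arguments shown above.

```python
def lookahead_minimize(grad, x0, lr=0.1, k=5, alpha=0.5, outer_steps=50):
    """Toy Lookahead loop on a scalar; all names here are illustrative."""
    slow = x0                          # slow weights
    for _ in range(outer_steps):
        fast = slow                    # fast weights restart from the slow weights
        for _ in range(k):             # k steps forward with the inner optimizer
            fast -= lr * grad(fast)
        slow += alpha * (fast - slow)  # 1 step back: move slow weights toward fast
    return slow


# Minimise f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x = lookahead_minimize(lambda v: 2.0 * (v - 3.0), x0=0.0)
print(x)
```

With `k=1` and `alpha=1.0` the sketch reduces to the inner optimizer alone, which is a quick way to sanity-check it.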

We found that evaluation performance is typically better using the slow weights. This can be done in PyTorch with something like this in your eval loop:

if args.lookahead:
    optimizer._backup_and_load_cache()
    val_loss = eval_func(model)
    optimizer._clear_and_load_backup()

Example

To produce the results, we train ResNet18 on the CIFAR-10 dataset.

# use adam
python run.py --optimizer=adam

# use lookahead 
python run.py --optimizer=lookahead

Results

Train loss of adam and lookahead with ResNet18 on CIFAR-10.

Valid loss of adam and lookahead with ResNet18 on CIFAR-10.

Valid accuracy of adam and lookahead with ResNet18 on CIFAR-10.

lookahead_pytorch's Issues

step_counter not defined?

optimizer.step()

File "/home/thomas/HELIX/superpoint-graph-job/superpointgraph2/learning/refactor/models/optimizers.py", line 33, in step
group['step_counter'] += 1
KeyError: 'step_counter'

Do you have any idea how this can be solved? I would like to try it out with RAdam:
base_optim = RAdam(model.parameters())
return Lookahead(base_optim, k=5, alpha=0.5)

My model is unusual because it is built on PyTorch Geometric.

RuntimeError when part of the model is frozen

Hi, as mentioned in the title, some of the model's parameters are frozen (requires_grad=False) and they are on the CPU. In this case, I get a RuntimeError on this line saying that tensors must be on the same device. How should I modify the code to handle my case?

License

Could you please add a license (e.g., MIT), so I can use your code without legal troubles? :)

Cuda Error under self.alpha,p.data - q.data

Hi, I have tried wrapping both torch.optim.Adam and RAdam, but I get an error when I run this on the GPU:

File "../optimizer.py", line 35, in step
q.data.add_(self.alpha,p.data - q.data)
RuntimeError: expected device cuda:0 and dtype Float but got device cpu and dtype Float

Using

base_optimizer = RAdam(model.parameters(), lr = 0.001)
#base_optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
#optimizer = Lookahead(base_optimizer, k=5, alpha=0.5)

Do I have to change the way I load the model and data onto the GPU? I am using the standard .to(device) method.

Lookahead has no attribute 'state'

When trying to save optimizer_state_dict, I get an error saying that Lookahead has no attribute 'state'. I thought this was because the parent class is not initialized inside the Lookahead class, so I added the line

super(Lookahead).__init__()

However, this still does not solve the issue. Other than saving, there seem to be no problems. Do you have any idea how to solve this?