samsungsailmontreal / ghn3

Code for "Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?" [ICML 2023]

Home Page: https://arxiv.org/abs/2303.04143

License: MIT License

Python 8.57% Shell 91.43%

computational-graphs deep-learning graphs hypernetworks imagenet large-scale pytorch transformers

ghn3's Issues

Need additional detail regarding few-shot learning experiment

Could you explain the few-shot setting whose results are reported in "Table 7. Transfer learning from ImageNet to few-shot CIFAR-10 and CIFAR-100 with 1000 training labels with 3 networks: ResNet50 (R-50), ConvNext-B (C-B) and Swin-T (S-T)"?

How many shots are used in the training set, and how many in the test (query) set?
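
For orientation while reading this question: if the 1000 labelled training images were sampled in a class-balanced way (an assumption; the quoted caption does not say how they were drawn), the number of shots per class follows from a simple division. The sketch below is plain Python, and `shots_per_class` and `balanced_subset` are hypothetical helper names, not part of ghn3.

```python
import random
from collections import defaultdict


def shots_per_class(num_labels, num_classes):
    """Labels per class under the ASSUMPTION of class-balanced sampling."""
    if num_labels % num_classes != 0:
        raise ValueError("labels do not divide evenly across classes")
    return num_labels // num_classes


def balanced_subset(labels, num_labels, num_classes, seed=0):
    """Pick sample indices so that every class contributes equally."""
    k = shots_per_class(num_labels, num_classes)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    rng = random.Random(seed)
    subset = []
    for y in range(num_classes):
        # Raises ValueError if a class has fewer than k samples.
        subset.extend(rng.sample(by_class[y], k))
    return sorted(subset)


print(shots_per_class(1000, 10))   # CIFAR-10: 100 per class
print(shots_per_class(1000, 100))  # CIFAR-100: 10 per class
```

Under that assumption, 1000 labels correspond to 100 shots per class on CIFAR-10 and 10 shots per class on CIFAR-100. The actual protocol may differ, which is what the question asks the authors to confirm.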

Question regarding evaluation on Swin-T

This is the code I use to evaluate ghn3clm16 on CIFAR-10, but the top-1 result was 0.41 and the top-5 result was 1.68.

Did I do something wrong? Here is my test code:

`import torch
import torchvision
from ppuda.config import init_config
from ppuda.utils import infer, AvgrageMeter, adjust_net
from ppuda.vision.loader import image_loader
from ghn3 import from_pretrained, norm_check, Graph, Logger

args = init_config(mode='eval', debug=0, arch='resnet50', split='torch')  # load arguments from the command line
assert args.arch is not None, ('architecture must be specified using, e.g. --arch resnet50', args.arch)


def bn_set_train(module):
    # switch BatchNorm2d layers to training mode so they normalize with batch statistics
    if isinstance(module, torch.nn.BatchNorm2d):
        module.track_running_stats = False
        module.training = True


ghn = from_pretrained(args.ckpt, debug_level=args.debug).to(args.device)  # get a pretrained GHN

is_imagenet = args.dataset.startswith('imagenet')
is_torch = args.split == 'torch'
print('loading the %s dataset...' % args.dataset)
val_loader, num_classes = image_loader(args.dataset,
                                       args.data_dir,
                                       test=True,
                                       test_batch_size=args.test_batch_size,
                                       num_workers=args.num_workers,
                                       noise=args.noise,
                                       im_size=224,  # args.imsize
                                       seed=args.seed)[1:]

model = getattr(torchvision.models, args.arch)().to(args.device)  # create a PyTorch model

if is_torch and not is_imagenet:
    model = adjust_net(model, large_input=False)  # adjust the model for small images such as 32x32 in CIFAR-10

with torch.no_grad():  # to improve efficiency
    model = ghn(model)  # predict parameters of the model
model.eval()  # set to the eval mode to disable dropout, etc.

model.apply(bn_set_train)
top1, top5 = infer(model.to(args.device), val_loader, verbose=False)
print(f'top1: {top1} and top5: {top5}')`
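
A quick sanity check when reading top-1 and top-5 numbers like these is to compare them with chance level, which for a balanced problem with C classes is k/C. The sketch below is not part of ghn3 or ppuda: `chance_topk` and `topk_accuracy` are hypothetical helpers written in plain Python, and it assumes the reported numbers are percentages.

```python
def chance_topk(num_classes, k):
    """Expected top-k accuracy (in percent) of a uniformly random guess."""
    return 100.0 * min(k, num_classes) / num_classes


def topk_accuracy(scores, targets, k):
    """Percentage of samples whose target is among the k highest scores."""
    hits = 0
    for row, target in zip(scores, targets):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += target in ranked[:k]
    return 100.0 * hits / len(targets)


for num_classes in (10, 1000):
    print(num_classes, chance_topk(num_classes, 1), chance_topk(num_classes, 5))
```

For 10 classes, chance is 10% top-1 and 50% top-5; for 1000 classes it is 0.1% and 0.5%. Comparing measured accuracy with both values may help check whether the model's output label space matches that of the dataset being evaluated.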

Questions about the paper

Thank you for your interesting research. I have some questions regarding the paper:

I'm curious about the adaptability of GHNs to other standard-sized datasets, particularly in different tasks such as image segmentation. The Penn-Fudan dataset discussed in your paper seems relatively small. Could you share your thoughts on this?
Do you remember what accuracy the models in DeepNets-1M achieved during GHN training? How does it compare with the accuracy of models with predicted parameters, with and without fine-tuning?

Thank you.

Question regarding random init performance too good

Hello, and thank you for your work.
Why is the random init performance on CIFAR-10 in Table 7 so high if the model has not been trained before?
Does random init mean that the model is initialized with Xavier initialization?
When I use Swin-T on CIFAR-10 with random initialization and no pretrained weights, the performance is around 5 to 10, so I do not understand how the random init baseline reaches such high performance without pretraining.
Thank you.
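
For readers unfamiliar with the term used in this question: Xavier (Glorot) uniform initialization draws each weight from U(-a, a) with a = gain * sqrt(6 / (fan_in + fan_out)). The snippet below only illustrates that formula in plain Python. Whether the random init baseline in Table 7 uses this scheme is exactly what the question asks and is not assumed here; the function names are hypothetical.

```python
import math
import random


def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Half-width a of the uniform range U(-a, a) used by Xavier init."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


def xavier_uniform(fan_in, fan_out, gain=1.0, seed=0):
    """Return a fan_out x fan_in weight matrix sampled from U(-a, a)."""
    a = xavier_uniform_bound(fan_in, fan_out, gain)
    rng = random.Random(seed)
    return [[rng.uniform(-a, a) for _ in range(fan_in)] for _ in range(fan_out)]


# Example: a linear layer mapping 512 features to 10 classes.
bound = xavier_uniform_bound(512, 10)
weights = xavier_uniform(fan_in=512, fan_out=10)
print(len(weights), len(weights[0]), round(bound, 4))
```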
