samsungsailmontreal / ghn3

Code for "Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?" [ICML 2023]

Home Page: https://arxiv.org/abs/2303.04143

License: MIT License

Python 8.57% Shell 91.43%
computational-graphs deep-learning graphs hypernetworks imagenet large-scale pytorch transformers

ghn3's People

Contributors

bknyaz

ghn3's Issues

Need additional detail regarding few-shot learning experiment

Can you explain the few-shot setting whose results are reported in Table 7 ("Transfer learning from ImageNet to few-shot CIFAR-10 and CIFAR-100 with 1000 training labels with 3 networks: ResNet50 (R-50), ConvNext-B (C-B) and Swin-T (S-T)")?

How many shots are there in the training set, and how many in the test set or query set?
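To make the question concrete, here is a minimal sketch of one possible reading of "1000 training labels": a class-balanced subset, which would mean 100 labeled images per class on CIFAR-10 and 10 per class on CIFAR-100. This is an assumption, not the protocol confirmed by the authors, and the function name `sample_balanced_subset` and the toy label list are hypothetical.

```python
import random
from collections import defaultdict


def sample_balanced_subset(labels, total_labels, seed=0):
    """Return sorted indices of a class-balanced subset with `total_labels` examples.

    Assumption: the labeled budget is split evenly across classes.
    """
    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    num_classes = len(by_class)
    if total_labels % num_classes != 0:
        raise ValueError("total_labels must be divisible by the number of classes")
    per_class = total_labels // num_classes

    rng = random.Random(seed)
    subset = []
    for label in sorted(by_class):
        if len(by_class[label]) < per_class:
            raise ValueError(f"class {label} has fewer than {per_class} examples")
        # Draw `per_class` distinct indices from this class.
        subset.extend(rng.sample(by_class[label], per_class))
    return sorted(subset)


# Toy stand-in for the CIFAR-10 training labels: 10 classes, 5000 images each.
toy_labels = [i % 10 for i in range(50_000)]
subset = sample_balanced_subset(toy_labels, total_labels=1000)
print(len(subset))  # 1000 indices, 100 per class
```

With a real dataset, the returned indices could be passed to something like `torch.utils.data.Subset` to build the few-shot training set, while evaluation would still use the full test split.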

Question regarding evaluation on Swin-T

This is the code I use to evaluate ghn3clm16 on CIFAR-10, but the top-1 result was 0.41 and the top-5 result was 1.68.

Did I do something wrong? Here is my test code:

```python
import torch
import torchvision
from ghn3 import from_pretrained
from ppuda.config import init_config
from ppuda.utils import infer, adjust_net
from ppuda.vision.loader import image_loader

args = init_config(mode='eval', debug=0, arch='resnet50', split='torch')  # load arguments from the command line
assert args.arch is not None, ('architecture must be specified using, e.g. --arch resnet50', args.arch)


def bn_set_train(module):
    # Put BatchNorm layers in training mode so they normalize with batch statistics
    if isinstance(module, torch.nn.BatchNorm2d):
        module.track_running_stats = False
        module.training = True


ghn = from_pretrained(args.ckpt, debug_level=args.debug).to(args.device)  # get a pretrained GHN

is_imagenet = args.dataset.startswith('imagenet')
is_torch = args.split == 'torch'
print('loading the %s dataset...' % args.dataset)
val_loader, num_classes = image_loader(args.dataset,
                                       args.data_dir,
                                       test=True,
                                       test_batch_size=args.test_batch_size,
                                       num_workers=args.num_workers,
                                       noise=args.noise,
                                       im_size=224,  # args.imsize,
                                       seed=args.seed)[1:]

model = getattr(torchvision.models, args.arch)().to(args.device)  # create a PyTorch model

if is_torch and not is_imagenet:
    model = adjust_net(model, large_input=False)  # adjust the model for small images such as 32x32 in CIFAR-10

with torch.no_grad():  # to improve efficiency
    model = ghn(model)  # predict parameters of the model

model.eval()  # set to the eval mode to disable dropout, etc.
model.apply(bn_set_train)  # then switch BatchNorm layers back to batch statistics
top1, top5 = infer(model.to(args.device), val_loader, verbose=False)
print(f'top1: {top1} and top5: {top5}')
```
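As a sanity check on the reported numbers, it can help to compare them against chance level. The sketch below computes the expected top-k accuracy of uniform random guessing, assuming the printed values are percentages (the thread does not state the unit). The helper name `chance_topk_accuracy` is hypothetical.

```python
def chance_topk_accuracy(num_classes, k=1):
    """Expected top-k accuracy (in percent) when the k guesses are
    k distinct classes chosen uniformly at random."""
    k = min(k, num_classes)
    return 100.0 * k / num_classes


for name, num_classes in [("CIFAR-10", 10), ("ImageNet-1k", 1000)]:
    top1 = chance_topk_accuracy(num_classes, k=1)
    top5 = chance_topk_accuracy(num_classes, k=5)
    print(f"{name}: chance top-1 = {top1:.2f}%, top-5 = {top5:.2f}%")
```

Under that assumption, 0.41 and 1.68 are far below the 10-class chance level (10% and 50%) and much closer in magnitude to the 1000-class one (0.1% and 0.5%). One possible explanation, which the thread does not confirm, is a mismatch between the number of model outputs and the number of dataset classes, so comparing the model's output dimension with `num_classes` may be a useful first step.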

Questions about the paper

Thank you for your interesting research. I have some questions regarding the paper:

  1. I'm curious about the adaptability of GHNs to other standard-sized datasets, particularly in different tasks such as image segmentation. The Penn-Fudan dataset discussed in your paper seems relatively small. Could you share your thoughts on this?
  2. Do you remember what accuracy the models in DeepNets-1M achieved during GHN training? How does it compare to the accuracy of models with predicted parameters, with and without fine-tuning?

Thank you.

Question regarding random init performance being too good

Hello, thank you for your work.
Why is the random init performance so high in Table 7 on CIFAR-10 if the model has not been trained before?
Does random init mean the model is initialized with Xavier initialization?
When I use Swin-T on CIFAR-10 with random initialization and no pretrained weights, the performance is around 5 to 10, so I do not understand how the random init baseline reaches such high performance with no pretraining.
Thank you.
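For readers unfamiliar with the term, here is a minimal sketch of what Xavier (Glorot) uniform initialization does: each weight is drawn from U(-a, a) with a = gain * sqrt(6 / (fan_in + fan_out)). Whether the random init baseline in Table 7 uses this scheme is not stated in the thread, so this only illustrates the scheme the question refers to. The layer sizes (768 inputs, 10 outputs) and function names are hypothetical.

```python
import math
import random


def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Bound `a` of the uniform distribution U(-a, a) used by Xavier/Glorot init."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


def xavier_uniform(fan_in, fan_out, gain=1.0, seed=0):
    """Return a fan_out x fan_in weight matrix (list of rows) sampled from U(-a, a)."""
    rng = random.Random(seed)
    bound = xavier_uniform_bound(fan_in, fan_out, gain)
    return [[rng.uniform(-bound, bound) for _ in range(fan_in)]
            for _ in range(fan_out)]


# Hypothetical classifier head: 768 input features, 10 output classes.
weights = xavier_uniform(fan_in=768, fan_out=10)
print(len(weights), len(weights[0]))  # 10 768
```

In PyTorch the same scheme is available as `torch.nn.init.xavier_uniform_`. Note that an initialization scheme alone only sets the starting weights; any accuracy number still depends on how, and for how long, the model is subsequently trained on the labeled data.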
