facebookresearch / luckmatters

Understanding Training Dynamics of Deep ReLU Networks

License: Other

Python 99.92% Shell 0.08%

luckmatters's Issues

Some files are missing in “ssl/common_utils”

I tried to run the bn_gen.py script, but common_utils.MultiRunUtil seems to be missing, which leads to errors. I managed to write my own version of one of its functions, common_utils.MultiRunUtil.load_full_cfg, to make the code usable. However, common_utils.MultiRunUtil.load_omega_conf, common_utils.print_info, and possibly others are still missing. Please update the "ssl/common_utils" folder.

torch.eig has long been deprecated and is being removed

PyTorch's torch.eig has been deprecated since version 1.9 and is being removed by pytorch/pytorch#70982. Please use the torch.linalg.eig function instead if you want your code to keep working with the latest PyTorch.

Affected files:
https://github.com/facebookresearch/luckmatters/blob/main/luckmatter/test_multilayer.py
https://github.com/facebookresearch/luckmatters/blob/main/ssl/real-dataset/byol_trainer.py
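
A minimal migration sketch, assuming the input is a square real matrix. This is not the repository's actual fix: `eig_legacy_layout` is a hypothetical helper name. The old torch.eig returned eigenvalues as a real (n, 2) tensor of real and imaginary parts, whereas torch.linalg.eig returns a complex tensor, so torch.view_as_real is used here to recover the old layout. Note that the eigenvectors returned by torch.linalg.eig are complex as well.

```python
import torch


def eig_legacy_layout(matrix):
    """Hypothetical helper: call torch.linalg.eig and return the eigenvalues
    in the (n, 2) real/imaginary layout that torch.eig used to produce."""
    eigenvalues, eigenvectors = torch.linalg.eig(matrix)  # complex tensors
    # view_as_real turns a complex tensor of shape (n,) into a real (n, 2) tensor.
    return torch.view_as_real(eigenvalues), eigenvectors


# Illustrative input: a diagonal matrix whose eigenvalues are 2 and 3.
example = torch.tensor([[2.0, 0.0], [0.0, 3.0]])
pairs, vectors = eig_legacy_layout(example)
print(pairs.shape)  # one (real, imag) row per eigenvalue
```

If the matrix is known to be symmetric, torch.linalg.eigh is an alternative that returns real eigenvalues directly. Whether that applies to the affected files is an assumption to check against the code.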

Did you try training on multiple GPUs?

I tried to train your code on multiple GPUs with STL10.
On a single GPU, I get the same score as in your paper.
However, the score is lower on multiple GPUs when using "torch.nn.parallel.DataParallel".
How can I get the same results?

why is F the input to the predictor?

May I know why F is the input to the predictor? Shouldn't it be WX?
@yuandong-tian

STL10 setting in alpha-CL

Hi, nice work!
When I tried to reproduce the alpha-CL results on STL10, it did not work. Is it correct to set lr=1e-3, p=4, batch_size=256, and \tau=0.5 for ResNet50 (cf. Table 3)?

Theorem 1 proof: Understanding Self-Supervised Learning Dynamics without Contrastive Pairs

May I know how you get equation 44? What is t here?
My understanding is that W_p is the parameter of the predictor, not a function of t. How can you take the derivative of W_p with respect to t?
Thanks for your work. It's a nice paper.

"Init Teacher" never ends

I tried to run the recon_multilayer.py script with MNIST dataset. However, it got stuck at the "Init teacher" step for more than 2 hrs and didn't go to the next step. I wonder if the code enters an infinite loop?

input:
python recon_multilayer.py --dataset mnist
output:
cuda: Namespace(batchsize=64, bn=False, bn_affine=False, bn_before_relu=False, cmdline='recon_multilayer.py --dataset mnist', cross_entropy=False, d_output=0, data_d=20, data_std=10.0, dataset='mnist', eval_batchsize=64, init_multi=4, json_output=False, ks=[10, 15, 20, 25], load_teacher=None, lr={0: 0.01}, momentum=0.0, no_bias=False, no_sep=False, node_multi=10, normalize=False, num_epoch=40, num_iter=30000, num_trial=10, perturb=None, regen_dataset_each_epoch=False, same_dir=False, same_sign=False, save_dir='./', seed=1, signature='070519_161701_245908', stats_H=False, stats_w=False, teacher_bn=False, teacher_bn_affine=False, use_cnn=False, weight_decay=0) ks: [10, 15, 20, 25] d_output: 10 Init teacher..

Confusion about dyn_eps and dyn_reg in DirectPred

Hi authors,

First of all, this issue is about the DirectPred paper, which I really like :)

Here is my confusion:
In the README, dyn_reg is set to 0.01 and is referred to as eps in Eq. 18 of the paper.
However, in the corresponding code line,

luckmatters/ssl/real-dataset/byol_trainer.py

Line 257 in e621d9d

eigen_values = eigen_values.pow(1/self.dyn_convert) + self.dyn_eps

, eps in Eq. 18 is represented by dyn_eps.

So which parameter, dyn_eps or dyn_reg, refers to the eps in Eq.18?

Thanks!
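
For reference, the quoted line can be read as a simple element-wise operation. The following is a minimal sketch in plain Python, not the repository's code: `shift_eigenvalues` is a hypothetical name and the sample values are made up for illustration. It only shows what the expression computes and does not settle which of dyn_eps or dyn_reg corresponds to eps in Eq. 18.

```python
def shift_eigenvalues(eigen_values, dyn_convert, dyn_eps):
    """Raise each eigenvalue to the power 1/dyn_convert, then add dyn_eps.

    Mirrors the arithmetic of the quoted line on a plain list of floats.
    """
    return [ev ** (1.0 / dyn_convert) + dyn_eps for ev in eigen_values]


# Illustrative values only (not taken from the README or the paper).
result = shift_eigenvalues([0.0, 0.25, 1.0], dyn_convert=2, dyn_eps=0.01)
print(result)
```

With dyn_convert=2 the power is a square root, so dyn_eps acts as a constant offset added after the root, which keeps even a zero eigenvalue strictly positive.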
