facebookresearch / luckmatters
Understanding Training Dynamics of Deep ReLU Networks
License: Other
I tried to run the bn_gen.py script, but common_utils.MultiRunUtil seems to be missing, which leads to errors. I managed to modify one of its functions, common_utils.MultiRunUtil.load_full_cfg, myself to make the code usable. However, common_utils.MultiRunUtil.load_omega_conf, common_utils.print_info, and possibly others are still missing. Please update the "ssl/common_utils" folder.
PyTorch's torch.eig has been deprecated since version 1.9 and is being removed by pytorch/pytorch#70982. Please use the torch.linalg.eig function instead if you want your code to continue to work with the latest PyTorch.
Affected files:
https://github.com/facebookresearch/luckmatters/blob/main/luckmatter/test_multilayer.py
https://github.com/facebookresearch/luckmatters/blob/main/ssl/real-dataset/byol_trainer.py
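A minimal sketch of what such a migration could look like, assuming the calling code relied on the old return layout of torch.eig (an n x 2 real tensor holding real and imaginary parts). The helper name eig_legacy_format is hypothetical and does not appear in the repository; how the affected files actually consume the result has not been checked here.

```python
import torch


def eig_legacy_format(a: torch.Tensor):
    """Hypothetical helper: mimic the eigenvalue layout of the removed torch.eig.

    torch.linalg.eig always returns complex tensors. torch.view_as_real turns
    the complex eigenvalue vector of shape (n,) into a real tensor of shape
    (n, 2) whose columns are the real and imaginary parts.
    """
    eigenvalues, eigenvectors = torch.linalg.eig(a)
    return torch.view_as_real(eigenvalues), eigenvectors


# Small example matrix chosen for illustration only.
a = torch.tensor([[2.0, 0.0], [0.0, 3.0]])
vals, vecs = eig_legacy_format(a)

# Check the defining relation A V = V diag(lambda) in complex arithmetic.
lam = torch.view_as_complex(vals.contiguous())
residual = a.to(vecs.dtype) @ vecs - vecs * lam
```

Note that the eigenvectors stay complex in this sketch, unlike the old torch.eig output. If the matrix is known to be symmetric, torch.linalg.eigh may be a better fit, since it returns real eigenvalues directly.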
I tried to train your code on multiple GPUs with STL10.
On a single GPU, it gets the same score as in your paper.
But it gets a lower score on multiple GPUs when using "torch.nn.parallel.DataParallel".
How can I get the same results?
May I know why F is the input to the predictor? Shouldn't it be WX?
@yuandong-tian
Hi, nice work!
When I try to reproduce the alpha-CL performance on STL10, it doesn't work. Is it correct to set lr=1e-3, p=4, batch_size=256, and \tau=0.5 for ResNet50 (cf. Table 3)?
I tried to run the recon_multilayer.py script with the MNIST dataset. However, it got stuck at the "Init teacher" step for more than 2 hours and didn't proceed to the next step. I wonder if the code enters an infinite loop?
input:
python recon_multilayer.py --dataset mnist
output:
cuda: Namespace(batchsize=64, bn=False, bn_affine=False, bn_before_relu=False, cmdline='recon_multilayer.py --dataset mnist', cross_entropy=False, d_output=0, data_d=20, data_std=10.0, dataset='mnist', eval_batchsize=64, init_multi=4, json_output=False, ks=[10, 15, 20, 25], load_teacher=None, lr={0: 0.01}, momentum=0.0, no_bias=False, no_sep=False, node_multi=10, normalize=False, num_epoch=40, num_iter=30000, num_trial=10, perturb=None, regen_dataset_each_epoch=False, same_dir=False, same_sign=False, save_dir='./', seed=1, signature='070519_161701_245908', stats_H=False, stats_w=False, teacher_bn=False, teacher_bn_affine=False, use_cnn=False, weight_decay=0)
ks: [10, 15, 20, 25]
d_output: 10
Init teacher..
Hi authors,
First of all, this is an issue about the DirectPred paper, which I really like :)
I am confused about the following:
In the README, dyn_reg is set to 0.01 and is referred to as the eps in Eq. 18 of the paper.
However, in the corresponding code line,
luckmatters/ssl/real-dataset/byol_trainer.py
Line 257 in e621d9d
So which parameter, dyn_eps or dyn_reg, refers to the eps in Eq.18?
Thanks!