
luckmatters's Introduction

This repo contains the code for the following 6 papers:

Analysis of Self-supervised learning (./ssl)

Understanding the Role of Nonlinearity in Training Dynamics of Contrastive Learning

Yuandong Tian

arXiv

Understanding Deep Contrastive Learning via Coordinate-wise Optimization

Yuandong Tian

NeurIPS'22 Oral

Understanding self-supervised Learning Dynamics without Contrastive Pairs

Yuandong Tian, Xinlei Chen, Surya Ganguli

ICML 2021 link, Outstanding Paper Honorable Mention

Understanding Self-supervised Learning with Dual Deep Networks

Yuandong Tian, Lantao Yu, Xinlei Chen, Surya Ganguli

arXiv link

Teacher-student setting in supervised learning

Student Specialization in Deep ReLU Networks With Finite Width and Input Dimension (./student_specialization)

Yuandong Tian

ICML 2020 link

Luck Matters: Understanding Training Dynamics of Deep ReLU Networks (./luckmatter)

Yuandong Tian, Tina Jiang, Qucheng Gong, Ari Morcos

arXiv link

luckmatters's People

Contributors

yuandong-tian


luckmatters's Issues

Confusion about dyn_eps and dyn_reg in DirectPred

Hi authors,

First of all, this issue concerns the DirectPred paper, and I really like it :)

Here is my confusion:
In the README, dyn_reg is set to 0.01 and is referred to as eps in Eq. 18 of the paper.
However, in the corresponding code line,

eigen_values = eigen_values.pow(1/self.dyn_convert) + self.dyn_eps

eps in Eq. 18 is represented by dyn_eps.

So which parameter, dyn_eps or dyn_reg, refers to eps in Eq. 18?

Thanks!
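For context, a minimal NumPy sketch of the computation the quoted line belongs to: building a symmetric predictor from the eigen-decomposition of a feature-correlation matrix, with the eigenvalues raised to 1/dyn_convert and shifted by an additive epsilon. The function name and the argument defaults here are illustrative assumptions (the repo itself uses PyTorch); which of dyn_eps/dyn_reg corresponds to eps in Eq. 18 is exactly the open question above.

```python
import numpy as np

def spectral_predictor(corr, dyn_convert=2.0, dyn_eps=0.01):
    """Hypothetical sketch of a DirectPred-style predictor update.

    corr is assumed to be a symmetric PSD correlation matrix of the
    online-network features. Parameter names mirror the quoted code line.
    """
    eigen_values, eigen_vectors = np.linalg.eigh(corr)
    eigen_values = np.clip(eigen_values, 0.0, None)
    # Mirrors the quoted PyTorch line:
    #   eigen_values = eigen_values.pow(1/self.dyn_convert) + self.dyn_eps
    eigen_values = eigen_values ** (1.0 / dyn_convert) + dyn_eps
    # Reassemble the predictor matrix W_p = U diag(s) U^T.
    return eigen_vectors @ np.diag(eigen_values) @ eigen_vectors.T

# For corr = 0.5 * I, every eigenvalue becomes sqrt(0.5) + dyn_eps.
w_p = spectral_predictor(np.eye(4) * 0.5)
```

The additive epsilon keeps small eigenvalues from vanishing in the predictor, which is why pinning down whether it is dyn_eps or dyn_reg matters for reproducing Eq. 18.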

"Init Teacher" never ends

I tried to run the recon_multilayer.py script with the MNIST dataset. However, it got stuck at the "Init teacher" step for more than 2 hours and never proceeded to the next step. I wonder whether the code enters an infinite loop?

input:
python recon_multilayer.py --dataset mnist
output:
cuda: Namespace(batchsize=64, bn=False, bn_affine=False, bn_before_relu=False, cmdline='recon_multilayer.py --dataset mnist', cross_entropy=False, d_output=0, data_d=20, data_std=10.0, dataset='mnist', eval_batchsize=64, init_multi=4, json_output=False, ks=[10, 15, 20, 25], load_teacher=None, lr={0: 0.01}, momentum=0.0, no_bias=False, no_sep=False, node_multi=10, normalize=False, num_epoch=40, num_iter=30000, num_trial=10, perturb=None, regen_dataset_each_epoch=False, same_dir=False, same_sign=False, save_dir='./', seed=1, signature='070519_161701_245908', stats_H=False, stats_w=False, teacher_bn=False, teacher_bn_affine=False, use_cnn=False, weight_decay=0) ks: [10, 15, 20, 25] d_output: 10 Init teacher..

Some files are missing in “ssl/common_utils”

I tried to run the bn_gen.py script, but common_utils.MultiRunUtil seems to be missing, which leads to errors. I managed to modify one of its functions, common_utils.MultiRunUtil.load_full_cfg, myself to make the code usable. However, common_utils.MultiRunUtil.load_omega_conf, common_utils.print_info, and possibly more are still missing. Please update the "ssl/common_utils" folder.

Did you try training on multiple GPUs?

  • I tried to train your code on multiple GPUs with STL10.

  • On a single GPU, it achieves the same score as in your paper.

  • However, it achieves a lower score on multiple GPUs using "torch.nn.parallel.DataParallel".

  • How can I get the same results?
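One possible explanation (an assumption, not a confirmed diagnosis) is that DataParallel scatters each batch across the GPUs, so BatchNorm statistics and any within-batch pairwise terms of a contrastive loss are computed over batch_size / num_gpus samples per device. The helper below is hypothetical, just to make the per-device arithmetic concrete:

```python
def per_device_stats(batch_size, num_gpus):
    """Hypothetical helper: per-GPU batch size and ordered within-device
    pair count when DataParallel splits a global batch evenly."""
    per_gpu = batch_size // num_gpus
    pairs = per_gpu * (per_gpu - 1)  # ordered sample pairs seen on one device
    return per_gpu, pairs

# A global batch of 256 on 1 GPU vs 4 GPUs:
single = per_device_stats(256, 1)  # (256, 65280)
multi = per_device_stats(256, 4)   # (64, 4032)
```

If this is the cause, the usual mitigations are torch.nn.SyncBatchNorm.convert_sync_batchnorm with DistributedDataParallel, or gathering features across devices before computing the loss; the authors would need to confirm which applies to this codebase.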

STL10 setting in alpha-CL

Hi, nice work!
When I try to reproduce the alpha-CL performance on STL10, I cannot match the reported numbers. Is it correct to set lr=1e-3, p=4, batch_size=256, and \tau=0.5 for ResNet50 (cf. Table 3)?
