
Comments (5)

izmailovpavel commented on July 19, 2024

Hi @roderickObrist, good to hear SWA is working well for you :) To make the figures, we treat all the parameters of the network, weights and biases alike, as one large vector (say, 10 million-dimensional). If the network has D parameters in total, the parameter space is simply R^D. For each visualization we pick three vectors v1, v2 and v3 in R^D. These typically correspond to the weights of particular networks, such as SGD iterates from different iterations. We then construct the unique 2-d plane (affine subspace) that passes through these three points, which is well defined as long as they are not collinear, and plot the loss restricted to this plane.
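As a rough illustration of the construction described above (not the code used for the paper's figures), the plane through three flattened weight vectors can be given an orthonormal basis with one Gram-Schmidt step, and the loss can then be evaluated on a grid of plane coordinates. The names `plane_basis`, `loss_on_plane` and `toy_loss` are hypothetical, and the quadratic loss is only a stand-in for evaluating a real network.

```python
import numpy as np

def plane_basis(v1, v2, v3):
    """Return (origin, e1, e2): an orthonormal basis of the plane through v1, v2, v3."""
    e1 = v2 - v1
    e1_norm = np.linalg.norm(e1)
    if e1_norm == 0.0:
        raise ValueError("v1 and v2 coincide")
    e1 = e1 / e1_norm
    d = v3 - v1
    e2 = d - np.dot(e1, d) * e1          # Gram-Schmidt: drop the component along e1
    e2_norm = np.linalg.norm(e2)
    if e2_norm <= 1e-12 * max(1.0, np.linalg.norm(d)):
        raise ValueError("the three points are collinear, so the plane is not unique")
    return v1, e1, e2 / e2_norm

def loss_on_plane(loss_fn, origin, e1, e2, xs, ys):
    """Evaluate loss_fn at origin + x * e1 + y * e2 for every (x, y) on the grid."""
    surface = np.empty((len(xs), len(ys)))
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            surface[i, j] = loss_fn(origin + x * e1 + y * e2)
    return surface

# Toy example in R^5 with a hypothetical quadratic "loss".
rng = np.random.default_rng(0)
v1, v2, v3 = rng.normal(size=(3, 5))
origin, e1, e2 = plane_basis(v1, v2, v3)

def toy_loss(w):
    return float(np.sum(w ** 2))

xs = np.linspace(-1.0, 2.0, 7)
ys = np.linspace(-1.0, 2.0, 7)
surface = loss_on_plane(toy_loss, origin, e1, e2, xs, ys)
```

In these coordinates v1 sits at (0, 0) and v2 at (||v2 - v1||, 0). For a real network, `loss_fn` would load the vector back into the model and compute the train loss or test error, and `surface` could be passed to a contour plot.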

To answer your questions:

  1. The weight vectors include both weights and biases, and they are not taken from a single layer. Each one is the full vector of all the network's parameters.
  2. We have a public implementation of a very similar visualization for our other paper here: https://github.com/timgaripov/dnn-mode-connectivity/blob/master/plane.py. I believe you would need to change this part: https://github.com/timgaripov/dnn-mode-connectivity/blob/master/plane.py#L96-L101, loading the weights of the three networks v1, v2, v3 into the w list.
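To make the first answer concrete, here is a minimal sketch of how every parameter of a model, weights and biases from all layers, can be concatenated into one vector and split back again. It uses plain NumPy arrays in a dictionary as a stand-in for a real model; the layer names and shapes and the helpers `flatten_params` and `unflatten_params` are assumptions for illustration only.

```python
import numpy as np

def flatten_params(params):
    """Concatenate every array in params (name -> ndarray) into one 1-D vector."""
    return np.concatenate([p.ravel() for p in params.values()])

def unflatten_params(vector, template):
    """Split a flat vector back into arrays shaped like those in template."""
    out, offset = {}, 0
    for name, p in template.items():
        out[name] = vector[offset:offset + p.size].reshape(p.shape)
        offset += p.size
    if offset != vector.size:
        raise ValueError("vector length does not match the template")
    return out

# Hypothetical two-layer network: weights and biases of every layer are included.
params = {
    "fc1.weight": np.ones((4, 3)),
    "fc1.bias": np.zeros(4),
    "fc2.weight": np.full((2, 4), 2.0),
    "fc2.bias": np.zeros(2),
}
w = flatten_params(params)           # D = 12 + 4 + 8 + 2 = 26 parameters
restored = unflatten_params(w, params)
```

Three such vectors, one per network, would play the role of v1, v2 and v3. In PyTorch, `torch.nn.utils.parameters_to_vector` and `torch.nn.utils.vector_to_parameters` offer a comparable round trip over `model.parameters()`.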

from swa.

roderickObrist commented on July 19, 2024

Thank you kindly, I will implement this in my own project over the next few days.


izmailovpavel commented on July 19, 2024

Please see footnote 1 on page 2 of the camera-ready version of the paper:
http://auai.org/uai2018/proceedings/papers/313.pdf
There we tried to clarify the exact procedure for making the loss and test error surface visualizations.


izmailovpavel commented on July 19, 2024

I will close the issue for now, but I will be happy to answer if you have further questions about those figures.


roderickObrist commented on July 19, 2024

@izmailovpavel Hi, and thank you for the great work. I've been implementing SWA in my research project and the results are great. I just have a few questions regarding the illustrations.

  1. Are the weight vectors literally the weights (not biases) from a single linear layer of a network, or are they the concatenation of all the parameters of the entire model?
  2. Would you be comfortable providing the snippet of code you used to make the figures? (It does not need to be functional, polished, or commented.) I just want to double-check my own implementation.

Thank you for what you have done for the community.

