
Comparing minima sharpness for Frank-Wolfe, SGD and Adam on small nets using TF


OptMLProject

Frank-Wolfe Optimization method for NNs with minima sharpness analysis.

Mariam Hakobyan, Sergei Volodin. Swiss Federal Institute of Technology in Lausanne (EPFL)

We train small fully-connected networks on MNIST using Frank-Wolfe, Adam, and SGD, and measure the sharpness of the resulting minima via Hessian eigenvalues.
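To illustrate what "sharpness via Hessian eigenvalues" can mean, here is a minimal sketch that estimates the largest Hessian eigenvalue of a toy loss. It is not the project's code: it swaps in finite differences and power iteration for the TensorFlow Hessian computation, and the names toy_loss, hessian, and top_eigenvalue are hypothetical. Which eigenvalue statistic the report actually uses is not stated here, so the largest eigenvalue is an assumption.

```python
import math


def toy_loss(x):
    """Hypothetical stand-in for a training loss, minimized at the origin."""
    return 2.0 * x[0] ** 2 + 0.5 * x[1] ** 2


def _shifted(x, i, di, j, dj):
    """Return a copy of x with di added to entry i and dj added to entry j."""
    y = list(x)
    y[i] += di
    y[j] += dj
    return y


def hessian(f, x, eps=1e-3):
    """Central-difference estimate of the Hessian of f at x (list of floats)."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            H[i][j] = (
                f(_shifted(x, i, eps, j, eps))
                - f(_shifted(x, i, eps, j, -eps))
                - f(_shifted(x, i, -eps, j, eps))
                + f(_shifted(x, i, -eps, j, -eps))
            ) / (4.0 * eps * eps)
    return H


def top_eigenvalue(H, iters=200):
    """Power iteration; assumes the dominant eigenvalue of H is non-negative,
    as is the case for the Hessian at a local minimum."""
    n = len(H)
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(H[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm == 0.0:
            return 0.0
        v = [wi / norm for wi in w]
        # Rayleigh quotient of the normalized vector.
        lam = sum(v[i] * sum(H[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam


sharpness = top_eigenvalue(hessian(toy_loss, [0.0, 0.0]))
```

For this toy loss the Hessian is diagonal with entries 4 and 1, so the estimate should be close to 4. A larger top eigenvalue corresponds to a sharper minimum along the steepest curvature direction.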

Mini-Project for Optimization for Machine Learning CS-439 at EPFL, 2019

Optimizers

We consider SGD, Adam, and Frank-Wolfe, with and without averaging. See our report for more details.
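For readers unfamiliar with Frank-Wolfe, the sketch below shows one common form of the method. Everything specific in it is an assumption rather than a description of StochasticFrankWolfe() in helpers.py: the constraint set (an L-infinity ball), the step size 2/(t+2), and the reading of "averaging" as a uniform running average of the iterates. The function name frank_wolfe_linf and its parameters are hypothetical.

```python
def frank_wolfe_linf(grad_fn, x0, radius, steps, average=False):
    """Frank-Wolfe over the box {x : |x_i| <= radius for all i}.

    grad_fn(x) returns the gradient at x as a list of floats; passing a
    minibatch gradient estimate would make the method stochastic.
    """
    x = list(x0)
    avg = list(x0)  # running average of iterates, starting from x0
    for t in range(steps):
        g = grad_fn(x)
        # Linear minimization oracle for the box: the corner that
        # minimizes the inner product with the gradient.
        s = [-radius if gi > 0 else radius for gi in g]
        gamma = 2.0 / (t + 2.0)
        # Convex combination keeps the iterate inside the box.
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
        # After this step the average covers t + 2 points (x0 included).
        avg = [a + (xi - a) / (t + 2.0) for a, xi in zip(avg, x)]
    return avg if average else x


# Toy problem: minimize 0.5 * ||x - target||^2 with the target inside the box.
target = [0.3, -0.5]


def toy_grad(x):
    return [xi - ci for xi, ci in zip(x, target)]


x_last = frank_wolfe_linf(toy_grad, [0.0, 0.0], radius=1.0, steps=500)
x_avg = frank_wolfe_linf(toy_grad, [0.0, 0.0], radius=1.0, steps=500, average=True)
```

The method needs no projection step: each update is a convex combination of feasible points, so both the last iterate and the averaged iterate stay inside the constraint set.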

How to run experiments

Tested on Ubuntu 16.04.5 LTS with 12 CPUs, 60 GB of RAM, and 2 NVIDIA GeForce 1080 GPUs.

  1. Install Anaconda (Python 3.7 option)
  2. Create and activate an environment
  3. Clone/download: git clone https://github.com/sergeivolodin/OptMLProject.git; cd OptMLProject
  4. Install the requirements: pip install -r requirements.txt. Then install tensorflow-gpu: conda install -c anaconda tensorflow-gpu
  5. Run all settings by calling run_all.sh
  6. The run produces output/*.output files and output/figures/*.pdf files, and writes run information to run_*.txt
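After a run, it can be handy to check that the expected files were actually produced. The following is a small sketch, not part of the repository: the function name collect_outputs is hypothetical, and it assumes that the run_*.txt files land in the repository root, which the steps above do not state.

```python
import glob
import os


def collect_outputs(root="."):
    """Group the files a run is expected to produce by kind."""
    patterns = {
        "experiment_outputs": os.path.join(root, "output", "*.output"),
        "figures": os.path.join(root, "output", "figures", "*.pdf"),
        # Assumption: run_*.txt is written to the repository root.
        "run_logs": os.path.join(root, "run_*.txt"),
    }
    return {kind: sorted(glob.glob(pattern)) for kind, pattern in patterns.items()}


if __name__ == "__main__":
    for kind, paths in collect_outputs().items():
        print(kind, len(paths))
```

An empty list for any kind suggests that the corresponding stage did not finish or wrote its files elsewhere.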

Project structure

  1. experiment.py is the main file; it runs one experiment (loading the optimizer, training, computing the Hessian, computing metrics)
  2. helpers.py defines a fully-connected network, FCModelConcat(), whose variables are stored as a single tensor (needed to compute the Hessian). It also contains our own implementation of the Stochastic Frank-Wolfe method, StochasticFrankWolfe(), and the helper functions required in the experiments, such as training code, dataset loaders, and the Hessian calculation
  3. create_run.py creates the .sh script from config.py
  4. analyze_run.py analyzes output produced by training (the .sh script) and writes output to run_*.txt and figures to output/figures
  5. create_analyze_runs_helpers.py is the helper file for the previous item; it contains code that tidies up the presentation of the results
  6. output/*.sh files consist of many lines of the form python ../experiment.py --param1 v1 --param2 v2 ..., running at most 4 processes in total (2 per GPU)
  7. output/*.output files contain outputs of experiment.py (one run corresponds to one file)
  8. output/figures contains generated figures
  9. run_setting.sh runs a particular setting (creates the .sh script, runs it, analyzes the output) and writes data to a file
  10. run_all.sh runs all settings
  11. Other files are not used
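The generated .sh files described above consist of lines of the form python ../experiment.py --param1 v1 --param2 v2 .... As a rough illustration of how such lines can be produced from a parameter grid, here is a sketch. It is not create_run.py: the function name build_commands and the dictionary format of the grid are hypothetical, the layout of config.py is not shown here, and the limit of 4 concurrent processes is not handled.

```python
import itertools
import shlex


def build_commands(param_grid, script="../experiment.py"):
    """Return one command line per combination of parameter values.

    param_grid maps a parameter name to the list of values to try.
    """
    names = sorted(param_grid)
    commands = []
    for values in itertools.product(*(param_grid[name] for name in names)):
        args = []
        for name, value in zip(names, values):
            args += ["--" + name, str(value)]
        # Quote each argument so that unusual values survive the shell.
        quoted = [shlex.quote(a) for a in args]
        commands.append(" ".join(["python", script] + quoted))
    return commands


# Placeholder names and values, mirroring the form quoted above.
lines = build_commands({"param1": ["v1", "v2"], "param2": ["w1"]})
```

Each entry of lines is one run of experiment.py, which matches the one-run-per-output-file convention of output/*.output.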
