
Awesome Distributed Deep Learning

A curated list of awesome Distributed Deep Learning resources.

Table of Contents

  • Frameworks
  • Blogs
  • Papers

Frameworks

  1. MXNet - Lightweight, portable, flexible distributed/mobile deep learning framework with a dynamic, mutation-aware dataflow dependency scheduler; for Python, R, Julia, Go, JavaScript and more.
  2. go-mxnet-predictor - Go binding for MXNet c_predict_api to do inference with pre-trained model.
  3. deeplearning4j - Distributed Deep Learning Platform for Java, Clojure, Scala.
  4. Distributed Machine Learning Toolkit (DMTK) - A distributed machine learning (parameter server) framework from Microsoft that enables training models on large data sets across multiple machines. Tools currently bundled with it include LightLDA and Distributed (Multisense) Word Embedding.
  5. Elephas - An extension of Keras that lets you run distributed deep learning models at scale with Spark.
  6. Horovod - Distributed training framework for TensorFlow. (A minimal sketch of the ring-allreduce pattern it builds on follows this list.)
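
Several of these frameworks, Horovod in particular, average gradients across workers with a ring-allreduce. The snippet below is a minimal, single-process sketch of that communication pattern, assuming the usual reduce-scatter/all-gather formulation; the function name `ring_allreduce` and the in-memory "workers" are purely illustrative and do not correspond to any framework API.

```python
import numpy as np

def ring_allreduce(grads):
    """Average equally shaped gradient vectors as if each lived on its own worker."""
    n = len(grads)
    # Each "worker" splits its gradient into n chunks; chunk j ends up fully
    # reduced on worker (j - 1) % n and is then circulated to everyone.
    chunks = [list(np.array_split(g.astype(float), n)) for g in grads]

    # Phase 1: reduce-scatter. At every step each worker passes one chunk
    # to its right-hand neighbour, which adds it to its own copy.
    for step in range(n - 1):
        for i in range(n):
            idx = (i - step) % n
            chunks[(i + 1) % n][idx] = chunks[(i + 1) % n][idx] + chunks[i][idx]

    # Phase 2: all-gather. The fully reduced chunks are circulated around
    # the ring so that every worker ends up with the complete sum.
    for step in range(n - 1):
        for i in range(n):
            idx = (i + 1 - step) % n
            chunks[(i + 1) % n][idx] = chunks[i][idx].copy()

    # Divide by the number of workers to turn the sum into an average.
    return [np.concatenate(c) / n for c in chunks]

# Toy usage: four "workers", each with its own local gradient.
rng = np.random.default_rng(0)
local_grads = [rng.standard_normal(8) for _ in range(4)]
averaged = ring_allreduce(local_grads)
assert all(np.allclose(g, np.mean(local_grads, axis=0)) for g in averaged)
print("all workers hold the same averaged gradient:", np.round(averaged[0], 3))
```

The appeal of the ring layout is that each worker sends and receives roughly 2(n-1)/n of the gradient size in total, independent of the number of workers, which is why the pattern scales well as clusters grow.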

Blogs

  1. Keras + Horovod = Distributed Deep Learning on Steroids
  2. Meet Horovod: Uber’s Open Source Distributed Deep Learning Framework for TensorFlow
  3. Distributed Deep Learning, Part 1: An Introduction to Distributed Training of Neural Networks

Papers

General:

  1. Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis: discusses the different types of concurrency in DNNs, synchronous and asynchronous stochastic gradient descent, distributed system architectures, communication schemes, and performance modeling. Based on these approaches, it also extrapolates potential directions for parallelism in deep learning.

Synchronization:

Synchronous techniques:

  1. Deep learning with COTS HPC systems: Uses Commodity Off-The-Shelf High Performance Computing (COTS HPC) technology: a cluster of GPU servers with InfiniBand interconnects and MPI.
  2. FireCaffe: near-linear acceleration of deep neural network training on compute clusters: The speed and scalability of distributed algorithms are almost always limited by the overhead of communication between servers, and DNN training is no exception. The key consideration in this paper is therefore to reduce communication overhead wherever possible without degrading the accuracy of the trained DNN models.
  3. SparkNet: Training Deep Networks in Spark. In Proceedings of the International Conference on Learning Representations (ICLR).
  4. 1-Bit SGD: 1-Bit Stochastic Gradient Descent and Application to Data-Parallel Distributed Training of Speech DNNs. In Interspeech 2014. (A sketch of the 1-bit quantization idea follows this list.)
  5. Scalable Distributed DNN Training Using Commodity GPU Cloud Computing: Introduces a new method for scaling up distributed Stochastic Gradient Descent (SGD) training of Deep Neural Networks (DNNs). The method addresses the well-known communication bottleneck that arises in data-parallel SGD because compute nodes must frequently synchronize a replica of the model.
  6. Multi-GPU Training of ConvNets: Training of convolutional networks on multiple GPUs.
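
As a concrete illustration of the communication-reduction theme above, here is a minimal single-process sketch of 1-bit gradient quantization with error feedback in the spirit of the 1-Bit SGD paper (item 4). Scaling by the mean absolute value is one common choice and an assumption on our part, as are the names `quantize_1bit` and `residual`; the original paper derives its reconstruction values differently.

```python
import numpy as np

def quantize_1bit(grad, residual):
    """Quantize one gradient tensor to sign * scale, carrying the error forward."""
    corrected = grad + residual              # error feedback from the previous step
    scale = np.mean(np.abs(corrected))       # one scalar per tensor (an assumed choice)
    quantized = scale * np.sign(corrected)   # effectively 1 bit per element + the scale
    new_residual = corrected - quantized     # what the 1-bit code failed to transmit
    return quantized, new_residual

# Toy run: the quantized updates track the true gradient sum over many steps.
rng = np.random.default_rng(1)
residual = np.zeros(4)
true_sum = np.zeros(4)
quant_sum = np.zeros(4)
for _ in range(200):
    g = rng.standard_normal(4)
    q, residual = quantize_1bit(g, residual)
    true_sum += g
    quant_sum += q
print("true gradient sum:     ", np.round(true_sum, 2))
print("quantized gradient sum:", np.round(quant_sum, 2))
```

The key point is the error feedback: whatever the 1-bit code loses at one step is added back at the next, so the quantized updates track the true gradient sum over time while each worker transmits only a sign bit per parameter plus one scale.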

Stale-Synchronous techniques:

  1. Model Accuracy and Runtime Tradeoff in Distributed Deep Learning: A Systematic Study.
  2. A Fast Learning Algorithm for Deep Belief Nets: G. E. Hinton, S. Osindero, and Y.-W. Teh. 2006. A Fast Learning Algorithm for Deep Belief Nets. Neural Computation 18, 7. 1527–1554.
  3. Heterogeneity-aware Distributed Parameter Servers.: J. Jiang, B. Cui, C. Zhang, and L. Yu. 2017. Heterogeneity-aware Distributed Parameter Servers. In Proc. 2017 ACM International Conference on Management of Data (SIGMOD ’17). 463–478.
  4. Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization: X. Lian, Y. Huang, Y. Li, and J. Liu. 2015. Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization. In Proc. 28th Int’l Conf. on NIPS - Volume 2. 2737–2745.
  5. Staleness-Aware Async-SGD for Distributed Deep Learning: W. Zhang, S. Gupta, X. Lian, and J. Liu. 2016. Staleness-aware async-SGD for Distributed Deep Learning. In Proc. Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI’16). 2350–2356. (A sketch of the bounded-staleness rule follows this list.)
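
The stale-synchronous (bounded-staleness) model behind several of these papers is easy to state: a fast worker may run ahead of the slowest worker by at most a fixed number of iterations. Below is a minimal conceptual sketch of that admission rule; `may_proceed`, `worker_clocks`, and `staleness` are illustrative names, not any framework's API.

```python
def may_proceed(worker_clocks, worker_id, staleness):
    """Return True if worker `worker_id` may start its next iteration under SSP."""
    next_clock = worker_clocks[worker_id] + 1
    slowest = min(worker_clocks)
    # staleness == 0 degenerates to fully synchronous SGD; an unbounded
    # staleness degenerates to fully asynchronous SGD. SSP sits in between.
    return next_clock - slowest <= staleness

# Example: worker 0 has raced ahead to clock 7 while worker 2 sits at clock 3.
clocks = [7, 5, 3]
print(may_proceed(clocks, worker_id=0, staleness=3))  # False: must wait for the straggler
print(may_proceed(clocks, worker_id=2, staleness=3))  # True: the slowest worker may always run
```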

Asynchronous techniques:

  1. Taming the Wild: A Unified Analysis of HOGWILD!-style Algorithms: C. De Sa, C. Zhang, K. Olukotun, and C. Ré. 2015. Taming the Wild: A Unified Analysis of HOGWILD!-style Algorithms. In Proc. 28th Int’l Conf. on NIPS - Volume 2. 2674–2682.
  2. Large Scale Distributed Deep Networks: J. Dean et al. 2012. Large Scale Distributed Deep Networks. In Proc. 25th International Conference on Neural Information Processing Systems - Volume 1 (NIPS’12). 1223–1231.
  3. Asynchronous Parallel Stochastic Gradient Descent: J. Keuper and F. Pfreundt. 2015. Asynchronous Parallel Stochastic Gradient Descent: A Numeric Core for Scalable Distributed Machine Learning Algorithms. In Proc. Workshop on MLHPC. 1:1–1:11.
  4. Dogwild!-Distributed Hogwild for CPU & GPU: C. Noel and S. Osindero. 2014. Dogwild!-Distributed Hogwild for CPU & GPU. In NIPS Workshop on Distributed Machine Learning and Matrix Computations.
  5. GPU Asynchronous Stochastic Gradient Descent to Speed Up Neural Network Training: T. Paine, H. Jin, J. Yang, Z. Lin, and T. S. Huang. 2013. GPU Asynchronous Stochastic Gradient Descent to Speed Up Neural Network Training. arXiv:1312.6186.
  6. HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent: B. Recht, C. Re, S. Wright, and F. Niu. 2011. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent. In Advances in Neural Information Processing Systems 24. 693–701. (A lock-free update sketch follows this list.)
  7. Asynchronous stochastic gradient descent for DNN training: S. Zhang, C. Zhang, Z. You, R. Zheng, and B. Xu. 2013. Asynchronous stochastic gradient descent for DNN training. In IEEE International Conference on Acoustics, Speech and Signal Processing. 6660–6663.
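
To make the lock-free idea concrete, here is a minimal single-process sketch of HOGWILD!-style updates: several threads apply SGD steps to one shared parameter vector with no locking at all. Because of Python's GIL (and because the toy problem is dense rather than sparse), this illustrates only the update pattern, not the speedup; the regression problem, the learning rate, and every name in it are our own assumptions.

```python
import threading
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -3.0, 0.5, 4.0])
X = rng.standard_normal((4000, 4))
y = X @ true_w + 0.01 * rng.standard_normal(4000)

w = np.zeros(4)   # shared parameters; every thread reads and writes these with no lock
lr = 0.01

def sgd_worker(rows):
    """Run plain SGD on a slice of the data, updating the shared w in place."""
    global w
    for i in rows:
        xi, yi = X[i], y[i]
        grad = (xi @ w - yi) * xi      # gradient of 0.5 * (x.w - y)^2 w.r.t. w
        w -= lr * grad                 # lock-free read-modify-write on shared state

threads = [threading.Thread(target=sgd_worker, args=(range(t, 4000, 4),))
           for t in range(4)]          # 4 threads striding over the data
for t in threads:
    t.start()
for t in threads:
    t.join()

print("recovered weights:", np.round(w, 2))   # should be close to [2, -3, 0.5, 4]
```

Occasionally two threads overwrite each other's updates; the HOGWILD! paper shows that when updates are sparse and mostly non-overlapping this has little effect on convergence, which is what makes the lock-free approach viable.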

Feedback: If you have ideas for improving this list or want other content added, feel free to contribute.
