tensorflow2-tutorial's Introduction

TensorFlow2-tutorial

Installation

git clone https://github.com/lambdal/TensorFlow2-tutorial.git
cd TensorFlow2-tutorial
virtualenv venv-tf2
. venv-tf2/bin/activate
pip install tf-nightly-gpu-2.0-preview==2.0.0.dev20190526

Tutorials Summary

See each tutorial's README for details.

01 Basic Image Classification

A tutorial on image classification with ResNet.

  • Data pipeline with TensorFlow Dataset API
  • Model pipeline with Keras (TensorFlow 2's official high-level API)
  • Multi-GPU with distributed strategy
  • Customized training with callbacks (TensorBoard, custom learning-rate schedule)

02 Transfer Learning

This tutorial explains how to do transfer learning with TensorFlow 2. We will cover:

  • Handling Customized Dataset
  • Restore Backbone with Keras's application API
  • Restore backbone from disk

03 Checkpoint

This tutorial explains how to use checkpoints to save and restore a model during training.

  • Use tf.keras.callbacks.ModelCheckpoint to save checkpoints
  • Resume training from a pre-saved checkpoint
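
The two steps above can be sketched as follows. This is a minimal illustration, not the tutorial's own code: the tiny model, the random data, the temporary directory and the file-name pattern are all assumptions made for the example. It saves weights only, using a `.weights.h5` suffix so that the same path should work across Keras versions.

```python
import glob
import os
import tempfile

import numpy as np
import tensorflow as tf


def build_model():
    # Hypothetical toy model; the tutorial itself trains a ResNet.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="sgd", loss="mse")
    return model


# Random stand-in data (assumption: any (x, y) pair of matching shapes works).
x = np.random.rand(32, 4).astype("float32")
y = np.random.rand(32, 1).astype("float32")

# One weights file per epoch, written to a throwaway directory.
ckpt_dir = tempfile.mkdtemp()
ckpt_path = os.path.join(ckpt_dir, "epoch-{epoch:02d}.weights.h5")
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    filepath=ckpt_path, save_weights_only=True)

model = build_model()
model.fit(x, y, epochs=2, batch_size=8, callbacks=[checkpoint_cb], verbose=0)

# Resume: rebuild the same architecture, load the latest weights,
# and continue counting epochs from where the first run stopped.
saved = sorted(glob.glob(os.path.join(ckpt_dir, "*.weights.h5")))
resumed = build_model()
resumed.load_weights(saved[-1])
resumed.fit(x, y, epochs=4, initial_epoch=2, batch_size=8, verbose=0)
```

Because only the weights are saved here, the optimizer state starts fresh after the reload; saving the full model instead would be one way to keep it.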

04 Early Stopping

This tutorial explains how to implement early stopping in TensorFlow 2.

  • Use the tf.keras.callbacks.EarlyStopping callback to achieve early stopping.
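
The Keras callback lives at `tf.keras.callbacks.EarlyStopping` and is usually configured with a monitored metric and a `patience` value. The rule it applies can be sketched in plain Python. This is an illustration of the patience idea only, not Keras's implementation; the function name and the sample loss history are made up for the example.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the monitored loss has failed to improve by more
    than `min_delta` for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Improvement: remember it and reset the patience counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical validation-loss history: the best value is at epoch 3,
# then three epochs pass without improvement.
history = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.60]
print(early_stopping_epoch(history, patience=3))  # → 6
```

Note that the last value (0.60) would have been an improvement, which is the usual trade-off: a small `patience` saves time but can stop just before a late gain.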

05 Distributed Training Across Multi-Nodes

This tutorial explains how to do distributed training across multiple nodes:

  • Code boilerplate for multi-node distributed training
  • Run code across multiple machines
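
TensorFlow's multi-worker strategies read the cluster layout from the `TF_CONFIG` environment variable, which each machine sets before training starts. A minimal sketch of building that variable is shown below; the helper name and the host addresses are hypothetical, and the tutorial's own boilerplate may differ.

```python
import json
import os


def make_tf_config(worker_hosts, task_index):
    """Build the TF_CONFIG dictionary for one worker in the cluster."""
    if not 0 <= task_index < len(worker_hosts):
        raise ValueError("task_index must point at one of worker_hosts")
    return {
        # Every machine lists the same workers in the same order.
        "cluster": {"worker": list(worker_hosts)},
        # Only the index differs from machine to machine.
        "task": {"type": "worker", "index": task_index},
    }


# Hypothetical addresses; replace with the real host:port of each node.
workers = ["10.0.0.1:12345", "10.0.0.2:12345"]

# On the first machine use task_index=0, on the second task_index=1.
os.environ["TF_CONFIG"] = json.dumps(make_tf_config(workers, task_index=0))
print(os.environ["TF_CONFIG"])
```

The variable has to be set before the distribution strategy is created, and the same training script is then launched on every node.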

tensorflow2-tutorial's People

Contributors

chuanli11, stephenbalaban

tensorflow2-tutorial's Issues

Maintenance of repo

Hi,

Thanks for the tutorials. They are really great!
I would like to contribute to the repo, probably with new tutorials, and I already have a pull request with a new tutorial on training with large datasets.

Do you maintain the repo, and are you open to pull requests? If so, are there any guidelines for preparing them?

Best,
Ilker

tensorflow2 version required no longer available through pip

pip install tf-nightly-gpu-2.0-preview==2.0.0.dev20190526

ERROR: Could not find a version that satisfies the requirement tf-nightly-gpu-2.0-preview==2.0.0.dev20190526 (from versions: none)
ERROR: No matching distribution found for tf-nightly-gpu-2.0-preview==2.0.0.dev20190526

Resnet56 Training Results, compared to another Tensorflow 2 Model

I'm trying to learn TF2+Keras, and came across your great examples. I'm working my way through the image classification example. I'm comparing this to what I think is the official TF 2.0 Keras-based approach(?), here, which I'll call MOVIR (after the first initials in the path of the python file, within tensorflow/models).

When I run MOVIR (vanilla, no command line arguments), I get the following training & validation results (Val Acc=93.22%):

390/390 - 19s - loss: 0.1213 - categorical_accuracy: 0.9995
78/78 - 2s - loss: 0.4276 - categorical_accuracy: 0.9322

I've modified your resnet_cifar.py example, setting the number of GPUs to 1 and changing the number of epochs and the LR schedule to match MOVIR:

182, [(0.1, 91), (0.01, 136), (0.001, 182)]
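
One way to read these values, assuming each tuple is (multiplier, start epoch) applied to a base learning rate, is sketched below. The base rate of 0.1 and the function name are assumptions for illustration; the scripts being compared may scale the rate differently.

```python
# Assumption: tuples are (multiplier, start_epoch), and the multiplier is
# applied to a base learning rate. The base value here is hypothetical.
BASE_LEARNING_RATE = 0.1
NUM_EPOCHS = 182
LR_SCHEDULE = [(0.1, 91), (0.01, 136), (0.001, 182)]


def learning_rate_at(epoch):
    """Return the learning rate for a zero-based epoch index."""
    learning_rate = BASE_LEARNING_RATE
    for multiplier, start_epoch in LR_SCHEDULE:
        if epoch >= start_epoch:
            # This stage has started, so its multiplier takes over.
            learning_rate = BASE_LEARNING_RATE * multiplier
        else:
            break
    return learning_rate


# Print the rate just before and just after each boundary.
for epoch in (0, 90, 91, 135, 136, NUM_EPOCHS - 1):
    print(epoch, learning_rate_at(epoch))
```

Under this reading the last tuple never takes effect during a 182-epoch run, because zero-based epoch indices stop at 181.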

The final results I get using the modified resnet_cifar.py are (Val Acc=78.32%):

Epoch 182/182
390/390 [==============================] - 30s 78ms/step - loss: 0.1410 - sparse_categorical_accuracy: 1.0000 - val_loss: 1.1042 - val_sparse_categorical_accuracy: 0.7832
78/78 [==============================] - 1s 17ms/step - loss: 1.1042 - sparse_categorical_accuracy: 0.7832

Both are using Resnet56 and BS=128.
[I'm using Windows10, TitanXP, Python 3.7, tf-nightly-gpu-2.1.0.dev20191028.]

Questions

  • Do you get similar results? If so, any idea on why the val acc results are low?
  • Thought: I don't see how MOVIR does augmentation, so I don't see how they reach a relatively decent result.

Additional Information
I noticed that the Resnet56 model that MOVIR uses is slightly different than the one that you use. So, I thought I'd try switching them. Here is the summary (LL = Lambda Labs). So, I think the performance difference is not due to the slightly different models.

Training File   Resnet56 Model   Final Validation Accuracy   ~s/Epoch
LL              LL               78.32%                      31
LL              MOVIR            78.52%                      27
MOVIR           LL               93.46%                      21
MOVIR           MOVIR            93.22%                      19

Doubt in augmentation

The augmentation section says that train_dataset.map(augmentation) will provide an inflated training dataset. I suspect that this only transforms the original inputs rather than adding to the dataset size. Could you please confirm?
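
A small check along these lines can be sketched with `tf.data`. `Dataset.map` applies its function element by element, so the number of elements stays the same; what changes is that a random transformation is drawn again on every pass over the data. The `augmentation` function below is a hypothetical stand-in (random noise on scalars), not the tutorial's image augmentation.

```python
import tensorflow as tf


def augmentation(x):
    # Stand-in for a real image augmentation: add a small random offset.
    return x + tf.random.uniform([], minval=-0.1, maxval=0.1)


train_dataset = tf.data.Dataset.from_tensor_slices([1.0, 2.0, 3.0, 4.0])
augmented = train_dataset.map(augmentation)

# map is one-to-one, so both datasets yield the same number of elements.
original_count = sum(1 for _ in train_dataset)
augmented_count = sum(1 for _ in augmented)
print(original_count, augmented_count)

# Each pass re-runs the random op, so the values vary between epochs.
first_pass = [float(v) for v in augmented]
second_pass = [float(v) for v in augmented]
print(first_pass)
print(second_pass)
```

So the dataset is not larger per epoch, but over many epochs the model sees many different variants of each input, which is the sense in which augmentation "inflates" the data.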

Distributed training - Parameter Server Strategy

Dear Chuan Li,

Thank you very much for sharing the distributed-training example for TensorFlow's MultiWorkerMirroredStrategy. Do you think it would be possible to apply ParameterServerStrategy with the rest of your code unchanged?

Thank you!
