vahidk / effectivetensorflow

TensorFlow tutorials and best practices.
Home Page: https://twitter.com/VahidK
Thank you for writing such a great TensorFlow tutorial. Do you plan to discuss best practices for effectively feeding data while training? I really hope you do, as this seems to be something a lot of beginners really struggle with.
The material is great. I can understand how TensorFlow works and go deeper. But some special arguments lack commentary. For example, in the stochastic gradient descent code there are axis parameters that I can't understand. It would be great if you explained those too. I know that's a little too much to ask, but for a beginner like me it's crucial. I've also tried reading the TensorFlow docs, but they don't explain the axis arguments in a simple, understandable way.
Thanks!
Hello!
Thank you for the wonderful guide! There's one thing I'm confused about: the recipe for entropy uses a manually-defined softmax function instead of tf.nn.softmax. Is there a particular reason for this, or was it just to demonstrate how to implement both numerically-stable softmax and entropy?
Cheers!
I've tried to run the penultimate example of the "Scopes and when to use them" section like so:
image1 = tf.placeholder(tf.float32, shape=[None, 100, 100, 3])
image2 = tf.placeholder(tf.float32, shape=[None, 100, 100, 3])
features1 = tf.layers.conv2d(image1, filters=32, kernel_size=3)
# Use the same convolution weights to process the second image
with tf.variable_scope(tf.get_variable_scope(), reuse=True):
    features2 = tf.layers.conv2d(image2, filters=32, kernel_size=3)
but I got:
ValueError: Variable conv2d_1/kernel does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=tf.AUTO_REUSE in VarScope?
So I tried:
conv1_weights = tf.get_variable('conv1_w', [3, 3, 3, 64])
features1 = tf.nn.conv2d(image1, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
# Use the same convolution weights to process the second image
with tf.variable_scope(tf.get_variable_scope(), reuse=True):
    conv1_weights = tf.get_variable('conv1_w')
    features2 = tf.nn.conv2d(image2, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
and this does work (at least in the sense that it raises no errors), but it doesn't provide the segue to the final example (which uses tf.layers.conv2d).
Perhaps you know a way of modifying the current example so that it's runnable?
EDIT:
I should have added: my tf.__version__ is '1.6.0-rc0'.
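For intuition, the error can be mimicked without TensorFlow at all: tf.layers.conv2d generates a fresh variable name per call, so under reuse=True the second call looks up a name ("conv2d_1/kernel") that was never created. A toy, purely illustrative registry sketch (all names below are hypothetical, not TF internals):

```python
# Toy model of TF1 variable scoping: a global registry keyed by name.
_registry = {}

def get_variable(name, initializer=None, reuse=False):
    """Mimic tf.get_variable: create on first call, look up under reuse."""
    if reuse:
        if name not in _registry:
            raise ValueError(
                "Variable %s does not exist, or was not created with "
                "tf.get_variable()." % name)
        return _registry[name]
    if name in _registry:
        raise ValueError("Variable %s already exists." % name)
    _registry[name] = initializer
    return _registry[name]

# Explicit naming (the second snippet): create once, reuse by name.
w1 = get_variable("conv1_w", initializer=[3, 3, 3, 64])
w2 = get_variable("conv1_w", reuse=True)
assert w1 is w2

# tf.layers.conv2d auto-generates a new name per call, so under
# reuse=True the lookup fails -- the ValueError in the issue.
try:
    get_variable("conv2d_1/kernel", reuse=True)
    raised = False
except ValueError:
    raised = True
assert raised
```

This is why the explicit tf.get_variable version works while the unnamed tf.layers version does not: reuse can only find variables created under the same name.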
The code relevant to this issue can be found here
Situation of the problem
I am using tf.contrib.staging.StagingArea for efficient GPU usage via prefetching.
To explain the issue better, here is a small part of the snippet from the above code:
with tf.device("/gpu:0"):
    runningcorrect = tf.get_variable("runningcorrect", [], dtype=tf.float32, initializer=tf.zeros_initializer(), trainable=False)
    runningnum = tf.get_variable("runningnum", [], dtype=tf.float32, initializer=tf.zeros_initializer(), trainable=False)
for i in range(numgpus):
    with tf.variable_scope(tf.get_variable_scope(), reuse=i > 0) as vscope:
        with tf.device('/gpu:{}'.format(i)):
            with tf.name_scope('GPU-Tower-{}'.format(i)) as scope:
                stagingarea = tf.contrib.staging.StagingArea([tf.float32, tf.int32], shapes=[[trainbatchsize, 3, 221, 221], [trainbatchsize]], capacity=20)
                stagingclarify.append(stagingarea.clear())
                putop = stagingarea.put(input_iterator.get_next())
                train_put_list.append(putop)
                getop = stagingarea.get()
                train_get_list.append(getop)
                elem = train_get_list[i]
                net, networksummaries = overfeataccurate(elem[0], numclasses=1000)
So I am using a tf.contrib.staging.StagingArea on each GPU. Each StagingArea takes its input from a tf.contrib.data.Dataset via a tf.contrib.data.Iterator. For each GPU, the input is taken from the StagingArea using a StagingArea.get() op.
The Problem
Initially the training works fine. Towards the end of an epoch, however, when a StagingArea does not receive trainbatchsize tensors and the tf.contrib.data.Iterator has raised a tf.errors.OutOfRangeError, the training blocks. It is clear why this happens, but I am not able to think of a clean way to correct it.
Can I get insights into this issue?
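A framework-free sketch of one workaround, using a plain Python queue in place of the StagingArea (all names here are hypothetical, not the repo's API): catch iterator exhaustion while refilling, then drain whatever is already staged instead of issuing further blocking get() calls.

```python
from queue import Queue

def run_epoch(batches, staging_capacity=2):
    """Prefetch batches through a small staging queue; on iterator
    exhaustion, drain what is already staged instead of blocking."""
    staging = Queue(maxsize=staging_capacity)
    it = iter(batches)
    results = []
    exhausted = False
    # Warm up the staging area (mirrors the initial StagingArea.put calls).
    while not staging.full() and not exhausted:
        try:
            staging.put(next(it))
        except StopIteration:
            exhausted = True
    while not staging.empty():
        batch = staging.get()          # mirrors StagingArea.get()
        results.append(sum(batch))     # stand-in for the train step
        if not exhausted:
            try:
                staging.put(next(it))  # keep the pipeline full
            except StopIteration:
                exhausted = True       # stop refilling, keep draining
    return results
```

In TF1 terms, the analogous move is to catch tf.errors.OutOfRangeError on the put op's session run and then run only as many get-based train steps as there are batches still staged.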
I tried the code example b = tf.placeholder(tf.float32, [None, 10, 32]); shape = get_shape(b), but when I print out the shape, it shows tensor objects rather than the dynamic/static shape I expected.
How can I use this get_shape function properly in a session to get a placeholder's shape?
Thx!
a = tf.random_uniform([5, 3, 5])
b = tf.random_uniform([5, 1, 6])
# concat a and b and apply nonlinearity
tiled_b = tf.tile(b, [1, 3, 1])
c = tf.concat([a, tiled_b], 2)
Should be a = tf.random_uniform([5, 3, 6]) or b = tf.random_uniform([5, 1, 5])?
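For what it's worth, the original shapes do concatenate: tf.concat (like np.concatenate below) only requires the axes not being joined to match, which is exactly what the tile fixes. A NumPy check of the shape arithmetic:

```python
import numpy as np

# Mirror the TF example with NumPy to check the shape arithmetic.
a = np.random.uniform(size=(5, 3, 5))
b = np.random.uniform(size=(5, 1, 6))

# Tile b along axis 1 so the non-concat axes match a's.
tiled_b = np.tile(b, (1, 3, 1))
assert tiled_b.shape == (5, 3, 6)

# Concatenation only requires agreement on the axes NOT being joined,
# so the differing last dims (5 vs 6) are fine on axis=2.
c = np.concatenate([a, tiled_b], axis=2)
assert c.shape == (5, 3, 11)
```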
I ran the code in the README on a single GPU and on multiple GPUs, and I find that multi-GPU is slower than single GPU.
Why?
import tensorflow as tf
a = tf.constant([[1., 2.], [3., 4.]])
b = tf.constant([[1.], [2.]])
# c = a + tf.tile(a, [1, 2])
c = a + b
Should the comment be # c = a + tf.tile(b, [1, 2])?
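The suggestion looks right: a already has the full shape, so the explicit-tiling comment should tile b, not a. A quick NumPy stand-in for the TF snippet confirms that broadcasting and explicit tiling agree:

```python
import numpy as np

a = np.array([[1., 2.], [3., 4.]])
b = np.array([[1.], [2.]])

# Implicit broadcasting: b's size-1 second axis stretches to match a.
c = a + b

# The explicit version the comment suggests: tile b, not a.
c_explicit = a + np.tile(b, (1, 2))

assert np.array_equal(c, c_explicit)
assert np.array_equal(c, np.array([[2., 3.], [5., 6.]]))
```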
Hello!
I have a question regarding the multi-gpu recipe.
Shouldn't it be something along the lines of
out_split.append(fn(**{k : v[i] for k,v in in_splits.iteritems()}))
instead of just
out_split.append(fn(**kwargs))
in the make_parallel
function?
Awesome work, by the way!
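For reference, the split/apply/concatenate pattern the comment describes can be sketched without devices or TensorFlow at all. This NumPy version (signature assumed, not the repo's exact code) shows why fn must receive the i-th split of each input rather than the full kwargs:

```python
import numpy as np

def make_parallel(fn, num_splits, **kwargs):
    """Split every keyword array into num_splits chunks along axis 0,
    apply fn to each group of chunks, then concatenate the outputs."""
    in_splits = {k: np.array_split(v, num_splits) for k, v in kwargs.items()}
    out_split = []
    for i in range(num_splits):
        # The line the comment proposes: index into each split, rather
        # than passing the full, unsplit kwargs to fn.
        out_split.append(fn(**{k: v[i] for k, v in in_splits.items()}))
    return np.concatenate(out_split, axis=0)

# Example: a "model" that doubles its input, run over 2 splits.
out = make_parallel(lambda x: x * 2, 2, x=np.arange(8))
assert np.array_equal(out, np.arange(8) * 2)
```

In the real recipe each loop iteration would additionally sit under its own tf.device; the indexing logic is the same.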
Could you please add an explicit LICENSE
file to the repo so that it's clear under what terms the content is provided, and under what terms user contributions are licensed?
[...] without a license, the default copyright laws apply, meaning that you retain all rights to your source code and no one may reproduce, distribute, or create derivative works from your work. If you're creating an open source project, we strongly encourage you to include an open source license.
Thanks!
In the example code there is the softmax
function activation just before tf.nn.softmax_cross_entropy_with_logits
:
def non_differentiable_entropy(logits):
probs = tf.nn.softmax(logits)
return tf.nn.softmax_cross_entropy_with_logits(labels=probs, logits=logits)
The TensorFlow documentation explicitly states that this is incorrect, since it might lead to a vanishing gradient problem:
WARNING: This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.
TL;DR - There should be no softmax before tf.nn.softmax_cross_entropy_with_logits.
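A small NumPy check of what the quoted warning is about: the op applies softmax to its logits argument internally, so feeding already-softmaxed values there amounts to applying softmax twice, which changes the distribution (the softmax below is a hand-rolled stand-in for tf.nn.softmax):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax along the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)

# Passing probs where logits are expected re-applies softmax:
double = softmax(probs)

# The distributions differ, so a cross-entropy computed from them
# would be wrong.
assert not np.allclose(probs, double)
```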
TensorFlow introduces two different context managers to alter the name of tensors and variables. The first is tf.name_scope which modifies the name of tensors:
This sentence might imply that tf.name_scope does not modify the name of variables. However, when the tf.Variable constructor is called, the Variable's name is also altered.
with tf.name_scope("scope"):
    a = tf.get_variable(name="a", shape=[])
    print(a.name)  # prints "a:0"
    b = tf.constant(1, name="b")
    print(b.name)  # prints "scope/b:0"
    c = tf.Variable(tf.zeros([]), name="c")
    print(c.name)  # prints "scope/c:0"
What do you think, is this point worth including in your wonderful guide?
In the example converting a Tensor of rank 3 to rank 2, a combination of static and dynamic shapes is used (via the get_shape function). Isn't it enough to use dynamic shapes for this purpose, as follows? What is the merit of using static shapes?
b = tf.placeholder(tf.float32, [None, 10, 32])
shape = tf.shape(b)
b = tf.reshape(b, [shape[0], shape[1] * shape[2]])
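Using tf.shape alone does work for the reshape itself, but every output dim then becomes a runtime tensor, so TensorFlow can no longer infer the static 10 * 32 = 320 at graph-construction time. The repo's get_shape mixes static and dynamic dims; a plain-Python sketch of that merging idea (merged_shape is a hypothetical name, not the repo's function):

```python
import numpy as np

def merged_shape(static_shape, array):
    """Prefer statically known dims (ints); fall back to the runtime
    (dynamic) dim only where the static shape has None."""
    return [s if s is not None else d
            for s, d in zip(static_shape, array.shape)]

# A placeholder-like spec: batch dim unknown, trailing dims known.
static = [None, 10, 32]
b = np.random.uniform(size=(4, 10, 32))

shape = merged_shape(static, b)
assert shape == [4, 10, 32]

# Collapse the last two axes; 10 * 32 = 320 is computable without
# running anything, because those dims are static.
flat = b.reshape(shape[0], shape[1] * shape[2])
assert flat.shape == (4, 320)
```

The merit, then, is better shape inference for downstream ops (e.g. a dense layer that needs to know its input width at build time), not correctness of the reshape.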
In README.md
For example, consider the tf.matmul op: it can multiply two matrices:
a = tf.random_uniform([2, 3])
b = tf.random_uniform([3, 4])
c = tf.matmul(a, b)  # c is a tensor of shape [2, 4]
But the same function also does batch matrix multiplication:
a = tf.random_uniform([10, 2, 3])
b = tf.random_uniform([10, 3, 4])
c = tf.matmul(a, b)  # c is a tensor of shape [10, 2, 4]
Shouldn't the first c be a tensor of shape [6, 12], and the second c a tensor of shape [100, 6, 12]?
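For context, matmul contracts only the shared inner dimension and maps over leading batch dimensions; it is not an elementwise or outer product, so the shapes in the README are consistent with NumPy's matmul:

```python
import numpy as np

# Plain matrix multiply: [2, 3] @ [3, 4] contracts the shared 3.
a = np.random.uniform(size=(2, 3))
b = np.random.uniform(size=(3, 4))
c = np.matmul(a, b)
assert c.shape == (2, 4)   # not [6, 12]

# Batched: the leading 10 is mapped over; inner dims contract as before.
a = np.random.uniform(size=(10, 2, 3))
b = np.random.uniform(size=(10, 3, 4))
c = np.matmul(a, b)
assert c.shape == (10, 2, 4)
```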
Suppose there are 4 GPUs available for data parallelism. Where are the variables placed? Are all variables placed on gpu:0, or allocated in some other way?
If all the variables are placed on gpu:0, it seems possible to hit an OOM (Out of Memory) issue.
Waiting for your reply, thanks!
dataset = tf.contrib.data.Dataset.TFRecordDataset(path_to_data)
->
dataset = tf.contrib.data.TFRecordDataset(path_to_data)
In the 'Scopes and when to use them' section:
with tf.variable_scope("scope", reuse=tf.AUTO_REUSE):
    features1 = tf.layers.conv2d(image1, filters=32, kernel_size=3)
    features2 = tf.layers.conv2d(image2, filters=32, kernel_size=3)
The conv2d layers above won't share weights; it seems we have to explicitly specify the name attribute of tf.layers.conv2d to share weights, like:
with tf.variable_scope("scope", reuse=tf.AUTO_REUSE):
    features1 = tf.layers.conv2d(image1, filters=32, kernel_size=3, name='conv2d')
    features2 = tf.layers.conv2d(image2, filters=32, kernel_size=3, name='conv2d')
tf.__version__ = 1.8.0
Thanks