vahidk / effectivetensorflow

TensorFlow tutorials and best practices.
Home Page: https://twitter.com/VahidK
Thank you for writing such a great TensorFlow tutorial. Do you plan to discuss best practices for effectively feeding data while training? I really hope you do, as this seems to be something a lot of beginners really struggle with.
The material is great. I can understand how TensorFlow works and go deeper. But some special arguments lack commentary. For example, in the stochastic gradient descent code there are axis parameters that I can't understand. It would be great if you explained those too. I know that's a little too much to ask, but for a beginner like me it's crucial. I've also tried reading the TensorFlow docs, but they don't explain the axis arguments in a simple, understandable way.
Thanks!
Hello!
Thank you for the wonderful guide! There's one thing I'm confused about: the recipe for entropy uses a manually-defined softmax function instead of tf.nn.softmax. Is there a particular reason for this, or was it just to demonstrate how to implement both numerically-stable softmax and entropy?
Cheers!
I've tried to run the penultimate example of the "Scopes and when to use them" section like so:
image1 = tf.placeholder(tf.float32, shape=[None, 100, 100, 3])
image2 = tf.placeholder(tf.float32, shape=[None, 100, 100, 3])
features1 = tf.layers.conv2d(image1, filters=32, kernel_size=3)
# Use the same convolution weights to process the second image
with tf.variable_scope(tf.get_variable_scope(), reuse=True):
    features2 = tf.layers.conv2d(image2, filters=32, kernel_size=3)
but I got:
ValueError: Variable conv2d_1/kernel does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=tf.AUTO_REUSE in VarScope?
So I tried:
conv1_weights = tf.get_variable('conv1_w', [3, 3, 3, 64])
features1 = tf.nn.conv2d(image1, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
# Use the same convolution weights to process the second image
with tf.variable_scope(tf.get_variable_scope(), reuse=True):
    conv1_weights = tf.get_variable('conv1_w')
    features2 = tf.nn.conv2d(image2, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
and this does work (at least in the sense that it raises no errors), but it doesn't provide the segue to the final example (which uses tf.layers.conv2d).
Perhaps you know a way of modifying the current example so that it's runnable?
EDIT:
I should have added: my tf.__version__ is '1.6.0-rc0'.
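For intuition, the error can be mimicked without TensorFlow at all: tf.layers.conv2d generates a fresh variable name per call, so under reuse=True the second call looks up a name ("conv2d_1/kernel") that was never created. A toy, purely illustrative registry sketch (all names below are hypothetical, not TF internals):

```python
# Toy model of TF1 variable scoping: a global registry keyed by name.
_registry = {}

def get_variable(name, initializer=None, reuse=False):
    """Mimic tf.get_variable: create on first call, look up under reuse."""
    if reuse:
        if name not in _registry:
            raise ValueError(
                "Variable %s does not exist, or was not created with "
                "tf.get_variable()." % name)
        return _registry[name]
    if name in _registry:
        raise ValueError("Variable %s already exists." % name)
    _registry[name] = initializer
    return _registry[name]

# Explicit naming (the second snippet): create once, reuse by name.
w1 = get_variable("conv1_w", initializer=[3, 3, 3, 64])
w2 = get_variable("conv1_w", reuse=True)
assert w1 is w2

# tf.layers.conv2d auto-generates a new name per call, so under
# reuse=True the lookup fails -- the ValueError in the issue.
try:
    get_variable("conv2d_1/kernel", reuse=True)
    raised = False
except ValueError:
    raised = True
assert raised
```

This is why the explicit tf.get_variable version works while the unnamed tf.layers version does not: reuse can only find variables created under the same name.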
The code relevant to this issue can be found here
Situation of the problem
I am using tf.contrib.staging.StagingArea for efficient GPU usage via prefetching.
To explain the issue better, here is a small part of the snippet from the above code:
with tf.device("/gpu:0"):
    runningcorrect = tf.get_variable("runningcorrect", [], dtype=tf.float32, initializer=tf.zeros_initializer(), trainable=False)
    runningnum = tf.get_variable("runningnum", [], dtype=tf.float32, initializer=tf.zeros_initializer(), trainable=False)
for i in range(numgpus):
    with tf.variable_scope(tf.get_variable_scope(), reuse=i > 0) as vscope:
        with tf.device('/gpu:{}'.format(i)):
            with tf.name_scope('GPU-Tower-{}'.format(i)) as scope:
                stagingarea = tf.contrib.staging.StagingArea([tf.float32, tf.int32], shapes=[[trainbatchsize, 3, 221, 221], [trainbatchsize]], capacity=20)
                stagingclarify.append(stagingarea.clear())
                putop = stagingarea.put(input_iterator.get_next())
                train_put_list.append(putop)
                getop = stagingarea.get()
                train_get_list.append(getop)
                elem = train_get_list[i]
                net, networksummaries = overfeataccurate(elem[0], numclasses=1000)
So I am using a tf.contrib.staging.StagingArea on each GPU. Each StagingArea takes its input from a tf.contrib.data.Dataset via a tf.contrib.data.Iterator. For each GPU, the input is taken from the StagingArea using a StagingArea.get() op.
The Problem
Initially the training works fine. Towards the end of an epoch, however, when a StagingArea does not receive trainbatchsize tensors and the tf.contrib.data.Iterator has raised a tf.errors.OutOfRangeError, the training blocks. It is clear why this happens, but I am not able to think of a clean way to correct it.
Can I get insights into this issue?
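A framework-free sketch of one workaround, using a plain Python queue in place of the StagingArea (all names here are hypothetical, not the repo's API): catch iterator exhaustion while refilling, then drain whatever is already staged instead of issuing further blocking get() calls.

```python
from queue import Queue

def run_epoch(batches, staging_capacity=2):
    """Prefetch batches through a small staging queue; on iterator
    exhaustion, drain what is already staged instead of blocking."""
    staging = Queue(maxsize=staging_capacity)
    it = iter(batches)
    results = []
    exhausted = False
    # Warm up the staging area (mirrors the initial StagingArea.put calls).
    while not staging.full() and not exhausted:
        try:
            staging.put(next(it))
        except StopIteration:
            exhausted = True
    while not staging.empty():
        batch = staging.get()          # mirrors StagingArea.get()
        results.append(sum(batch))     # stand-in for the train step
        if not exhausted:
            try:
                staging.put(next(it))  # keep the pipeline full
            except StopIteration:
                exhausted = True       # stop refilling, keep draining
    return results
```

In TF1 terms, the analogous move is to catch tf.errors.OutOfRangeError on the put op's session run and then run only as many get-based train steps as there are batches still staged.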
I tried the code example b = tf.placeholder(tf.float32, [None, 10, 32]); shape = get_shape(b), but when I print out the shape, it shows tensor objects rather than the dynamic/static shape I expected.
How can I use this get_shape function properly in a session to get a placeholder's shape?
Thx!
a = tf.random_uniform([5, 3, 5])
b = tf.random_uniform([5, 1, 6])
# concat a and b and apply nonlinearity
tiled_b = tf.tile(b, [1, 3, 1])
c = tf.concat([a, tiled_b], 2)
Should be a = tf.random_uniform([5, 3, 6]) or b = tf.random_uniform([5, 1, 5])?
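For what it's worth, the original shapes do concatenate: tf.concat (like np.concatenate below) only requires the axes not being joined to match, which is exactly what the tile fixes. A NumPy check of the shape arithmetic:

```python
import numpy as np

# Mirror the TF example with NumPy to check the shape arithmetic.
a = np.random.uniform(size=(5, 3, 5))
b = np.random.uniform(size=(5, 1, 6))

# Tile b along axis 1 so the non-concat axes match a's.
tiled_b = np.tile(b, (1, 3, 1))
assert tiled_b.shape == (5, 3, 6)

# Concatenation only requires agreement on the axes NOT being joined,
# so the differing last dims (5 vs 6) are fine on axis=2.
c = np.concatenate([a, tiled_b], axis=2)
assert c.shape == (5, 3, 11)
```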
I ran the code in the README on a single GPU and on multiple GPUs, and I find that multi-GPU is slower than single GPU.
Why?
import tensorflow as tf
a = tf.constant([[1., 2.], [3., 4.]])
b = tf.constant([[1.], [2.]])
# c = a + tf.tile(a, [1, 2])
c = a + b
Should the comment be # c = a + tf.tile(b, [1, 2])?
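The suggestion looks right: a already has the full shape, so the explicit-tiling comment should tile b, not a. A quick NumPy stand-in for the TF snippet confirms that broadcasting and explicit tiling agree:

```python
import numpy as np

a = np.array([[1., 2.], [3., 4.]])
b = np.array([[1.], [2.]])

# Implicit broadcasting: b's size-1 second axis stretches to match a.
c = a + b

# The explicit version the comment suggests: tile b, not a.
c_explicit = a + np.tile(b, (1, 2))

assert np.array_equal(c, c_explicit)
assert np.array_equal(c, np.array([[2., 3.], [5., 6.]]))
```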
Hello!
I have a question regarding the multi-gpu recipe.
Shouldn't it be something along the lines of
out_split.append(fn(**{k : v[i] for k,v in in_splits.iteritems()}))
instead of just
out_split.append(fn(**kwargs))
in the make_parallel
function?
Awesome work, by the way!
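For reference, the split/apply/concatenate pattern the comment describes can be sketched without devices or TensorFlow at all. This NumPy version (signature assumed, not the repo's exact code) shows why fn must receive the i-th split of each input rather than the full kwargs:

```python
import numpy as np

def make_parallel(fn, num_splits, **kwargs):
    """Split every keyword array into num_splits chunks along axis 0,
    apply fn to each group of chunks, then concatenate the outputs."""
    in_splits = {k: np.array_split(v, num_splits) for k, v in kwargs.items()}
    out_split = []
    for i in range(num_splits):
        # The line the comment proposes: index into each split, rather
        # than passing the full, unsplit kwargs to fn.
        out_split.append(fn(**{k: v[i] for k, v in in_splits.items()}))
    return np.concatenate(out_split, axis=0)

# Example: a "model" that doubles its input, run over 2 splits.
out = make_parallel(lambda x: x * 2, 2, x=np.arange(8))
assert np.array_equal(out, np.arange(8) * 2)
```

In the real recipe each loop iteration would additionally sit under its own tf.device; the indexing logic is the same.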
Could you please add an explicit LICENSE
file to the repo so that it's clear under what terms the content is provided, and under what terms user contributions are licensed?
[...] without a license, the default copyright laws apply, meaning that you retain all rights to your source code and no one may reproduce, distribute, or create derivative works from your work. If you're creating an open source project, we strongly encourage you to include an open source license.
Thanks!
In the example code there is the softmax
function activation just before tf.nn.softmax_cross_entropy_with_logits
:
def non_differentiable_entropy(logits):
probs = tf.nn.softmax(logits)
return tf.nn.softmax_cross_entropy_with_logits(labels=probs, logits=logits)
The TensorFlow documentation explicitly states that this is incorrect, since it might lead to a vanishing gradient problem:
WARNING: This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.
TL;DR - There should be no softmax before tf.nn.softmax_cross_entropy_with_logits.
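A small NumPy check of what the quoted warning is about: the op applies softmax to its logits argument internally, so feeding already-softmaxed values there amounts to applying softmax twice, which changes the distribution (the softmax below is a hand-rolled stand-in for tf.nn.softmax):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax along the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)

# Passing probs where logits are expected re-applies softmax:
double = softmax(probs)

# The distributions differ, so a cross-entropy computed from them
# would be wrong.
assert not np.allclose(probs, double)
```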
TensorFlow introduces two different context managers to alter the name of tensors and variables. The first is tf.name_scope which modifies the name of tensors:
This sentence might imply that tf.name_scope does not modify the name of variables. However, when the tf.Variable constructor is called, the Variable's name is also altered.
with tf.name_scope("scope"):
    a = tf.get_variable(name="a", shape=[])
    print(a.name)  # prints "a:0"
    b = tf.constant(1, name="b")
    print(b.name)  # prints "scope/b:0"
    c = tf.Variable(tf.zeros([]), name="c")
    print(c.name)  # prints "scope/c:0"
What do you think, is this point worth including in your wonderful guide?
In the example converting a Tensor of rank 3 to rank 2, a combination of static and dynamic shapes is used (via the get_shape function). Isn't it enough to use dynamic shapes for this purpose, as follows? What is the merit of using static shapes?
b = tf.placeholder(tf.float32, [None, 10, 32])
shape = tf.shape(b)
b = tf.reshape(b, [shape[0], shape[1] * shape[2]])
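Using tf.shape alone does work for the reshape itself, but every output dim then becomes a runtime tensor, so TensorFlow can no longer infer the static 10 * 32 = 320 at graph-construction time. The repo's get_shape mixes static and dynamic dims; a plain-Python sketch of that merging idea (merged_shape is a hypothetical name, not the repo's function):

```python
import numpy as np

def merged_shape(static_shape, array):
    """Prefer statically known dims (ints); fall back to the runtime
    (dynamic) dim only where the static shape has None."""
    return [s if s is not None else d
            for s, d in zip(static_shape, array.shape)]

# A placeholder-like spec: batch dim unknown, trailing dims known.
static = [None, 10, 32]
b = np.random.uniform(size=(4, 10, 32))

shape = merged_shape(static, b)
assert shape == [4, 10, 32]

# Collapse the last two axes; 10 * 32 = 320 is computable without
# running anything, because those dims are static.
flat = b.reshape(shape[0], shape[1] * shape[2])
assert flat.shape == (4, 320)
```

The merit, then, is better shape inference for downstream ops (e.g. a dense layer that needs to know its input width at build time), not correctness of the reshape.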
In README.md
For example, consider the tf.matmul op: it can multiply two matrices:
a = tf.random_uniform([2, 3])
b = tf.random_uniform([3, 4])
c = tf.matmul(a, b)  # c is a tensor of shape [2, 4]
But the same function also does batch matrix multiplication:
a = tf.random_uniform([10, 2, 3])
b = tf.random_uniform([10, 3, 4])
c = tf.matmul(a, b)  # c is a tensor of shape [10, 2, 4]
Shouldn't the first c be a tensor of shape [6, 12], and the second c a tensor of shape [100, 6, 12]?
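For context, matmul contracts only the shared inner dimension and maps over leading batch dimensions; it is not an elementwise or outer product, so the shapes in the README are consistent with NumPy's matmul:

```python
import numpy as np

# Plain matrix multiply: [2, 3] @ [3, 4] contracts the shared 3.
a = np.random.uniform(size=(2, 3))
b = np.random.uniform(size=(3, 4))
c = np.matmul(a, b)
assert c.shape == (2, 4)   # not [6, 12]

# Batched: the leading 10 is mapped over; inner dims contract as before.
a = np.random.uniform(size=(10, 2, 3))
b = np.random.uniform(size=(10, 3, 4))
c = np.matmul(a, b)
assert c.shape == (10, 2, 4)
```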
Suppose there are 4 GPUs available for data parallelism. Where are the variables placed? Are all variables placed on gpu:0, or allocated in some other way?
If all the variables are placed on gpu:0, it seems possible to hit an OOM (Out of Memory) issue.
Waiting for your reply, thanks!
dataset = tf.contrib.data.Dataset.TFRecordDataset(path_to_data)
->
dataset = tf.contrib.data.TFRecordDataset(path_to_data)
In the 'Scopes and when to use them' section:
with tf.variable_scope("scope", reuse=tf.AUTO_REUSE):
    features1 = tf.layers.conv2d(image1, filters=32, kernel_size=3)
    features2 = tf.layers.conv2d(image2, filters=32, kernel_size=3)
The conv2d layers above won't share weights; it seems we have to explicitly specify the name attribute of tf.layers.conv2d to share weights, like:
with tf.variable_scope("scope", reuse=tf.AUTO_REUSE):
    features1 = tf.layers.conv2d(image1, filters=32, kernel_size=3, name='conv2d')
    features2 = tf.layers.conv2d(image2, filters=32, kernel_size=3, name='conv2d')
tf.__version__ = 1.8.0
Thanks