taki0112 / densenet-tensorflow

Simple Tensorflow implementation of Densenet using Cifar10, MNIST

License: MIT License

Python 100.00%

densenet-tensorflow's Issues

Hello, I'm a beginner in DL

Hello, thank you very much for providing this code. I'm a beginner in DL, and I had some trouble adapting it to CIFAR-100. Could you provide a data-loading program for CIFAR-100?
Thanks.
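
A minimal sketch of such a reader, assuming the "CIFAR-100 python version" archive, whose `train` and `test` files are pickled dicts with `data`, `fine_labels` and `coarse_labels` entries. The function name `load_cifar100_split` and its parameters are hypothetical, not part of this repository, and NumPy is assumed to be installed.

```python
import pickle

import numpy as np


def load_cifar100_split(path, label_key=b"fine_labels", num_classes=100):
    """Load one split ('train' or 'test') of the CIFAR-100 python archive.

    Returns images as uint8 NHWC (N, 32, 32, 3) and labels as one-hot float32.
    """
    with open(path, "rb") as f:
        # encoding="bytes" makes the dict keys bytes objects under Python 3.
        batch = pickle.load(f, encoding="bytes")

    data = np.asarray(batch[b"data"], dtype=np.uint8)  # shape (N, 3072)
    # Each row holds 1024 red, then 1024 green, then 1024 blue values,
    # so reshape to NCHW first and then move channels last.
    images = data.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)

    labels = np.asarray(batch[label_key], dtype=np.int64)
    one_hot = np.eye(num_classes, dtype=np.float32)[labels]
    return images, one_hot
```

For the 20 superclasses, the same function could be called with `label_key=b"coarse_labels"` and `num_classes=20`. The network's final layer would also need 100 (or 20) outputs instead of 10.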

The graph couldn't be sorted in topological order.

Hi there!
I really appreciate this TensorFlow implementation of DenseNet. But when I run it on my own dataset, some warnings/errors appear and I don't know why. I suspect they affect the training process and results.
I searched Stack Overflow and found only limited answers, including https://stackoverflow.com/questions/52607063/tensorflow-warning-the-graph-couldnt-be-sorted-in-topological-order. The answer says it is related to the design of the graph.
I'm confused. Could you help me figure this out?

The warnings/errors:
E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.
2019-07-02 15:08:43.119749: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.
2019-07-02 15:08:44.691616: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] remapper failed: Invalid argument: The graph couldn't be sorted in topological order.
2019-07-02 15:08:44.828751: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] arithmetic_optimizer failed: Invalid argument: The graph couldn't be sorted in topological order.
2019-07-02 15:08:44.939032: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.
2019-07-02 15:08:45.096445: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.

Why is a ResourceExhaustedError thrown after 30 epochs of training?

I encountered a ResourceExhaustedError after training this model for 30 epochs. Some details: image_size=224*224, batch_size=16, bn_blocks=2, and my GPU has about 16 GB of memory.

shuffling data

The data is shuffled only once. I think it should be shuffled at the beginning of every epoch.
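
A minimal sketch of per-epoch shuffling, independent of TensorFlow. The helper `epoch_batches` is hypothetical and works on plain Python sequences; it draws a new permutation on every call, so calling it once per epoch reshuffles the data each time while keeping images and labels paired.

```python
import random


def epoch_batches(images, labels, batch_size, rng):
    """Yield (image_batch, label_batch) pairs in a fresh random order."""
    order = list(range(len(images)))
    rng.shuffle(order)  # a new permutation on every call, i.e. every epoch
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield [images[i] for i in idx], [labels[i] for i in idx]


# Usage: one call per epoch, sharing a single random generator.
rng = random.Random(0)
toy_images = list(range(10))
toy_labels = [i * 10 for i in toy_images]
for epoch in range(2):
    for batch_x, batch_y in epoch_batches(toy_images, toy_labels, 4, rng):
        pass  # feed batch_x / batch_y to the training step here
```

With NumPy arrays the same idea is usually written by indexing both arrays with one shared permutation of indices.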

Problems freezing the DenseNet ckpt to pb

After training with this DenseNet, I cannot freeze the ckpt to pb:
KeyError: u'cond/linear_batch_1/cond_1/AssignMovingAvg/linear_batch/moving_mean/linear_batch/linear_batch/moving_mean/Switch/read'

Got the following errors while running Densenet_MNIST.py

Traceback (most recent call last):
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 671, in _call_cpp_shape_fn_impl
input_tensors_as_shapes, status)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\contextlib.py", line 66, in __exit__
next(self.gen)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Negative dimension size caused by subtracting 2 from 1 for 'trans_3/average_pooling2d/AvgPool' (op: 'AvgPool') with input shapes: [?,1,1,12].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "E:/Densenet/Densenet-Tensorflow-master/MNIST/Densenet_MNIST.py", line 176, in <module>
logits = DenseNet(x=batch_images, nb_blocks=nb_block, filters=growth_k, training=training_flag).model
File "E:/Densenet/Densenet-Tensorflow-master/MNIST/Densenet_MNIST.py", line 84, in __init__
self.model = self.Dense_net(x)
File "E:/Densenet/Densenet-Tensorflow-master/MNIST/Densenet_MNIST.py", line 150, in Dense_net
x = self.transition_layer(x, scope='trans_3')
File "E:/Densenet/Densenet-Tensorflow-master/MNIST/Densenet_MNIST.py", line 110, in transition_layer
x = Average_pooling(x, pool_size=[2,2], stride=2)
File "E:/Densenet/Densenet-Tensorflow-master/MNIST/Densenet_MNIST.py", line 65, in Average_pooling
return tf.layers.average_pooling2d(inputs=x, pool_size=pool_size, strides=stride, padding=padding)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\layers\pooling.py", line 361, in average_pooling2d
return layer.apply(inputs)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\layers\base.py", line 492, in apply
return self.__call__(inputs, *args, **kwargs)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\layers\base.py", line 441, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\layers\pooling.py", line 276, in call
data_format=utils.convert_data_format(self.data_format, 4))
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 1741, in avg_pool
name=name)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 48, in _avg_pool
data_format=data_format, name=name)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op
op_def=op_def)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 2508, in create_op
set_shapes_for_outputs(ret)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1873, in set_shapes_for_outputs
shapes = shape_func(op)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1823, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 610, in call_cpp_shape_fn
debug_python_shape_fn, require_shape_fn)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 676, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Negative dimension size caused by subtracting 2 from 1 for 'trans_3/average_pooling2d/AvgPool' (op: 'AvgPool') with input shapes: [?,1,1,12].
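
The error says a 2x2 pooling window was applied to a 1x1 feature map. A rough way to see how a 28x28 MNIST image can shrink that far is to trace the spatial size through the downsampling stages. The stage list below is an assumption (a stride-2 SAME convolution, a 3x3 stride-2 VALID max pool, then three transitions with the 2x2 stride-2 pooling shown in the traceback), and all helper names are hypothetical; check it against your copy of the script.

```python
import math


def conv_same(size, stride):
    """Output size of a strided convolution with SAME padding."""
    return math.ceil(size / stride)


def pool_valid(size, kernel, stride):
    """Output size of a pooling window with VALID padding."""
    if size < kernel:
        raise ValueError("window %d does not fit input %d" % (kernel, size))
    return (size - kernel) // stride + 1


def trace_sizes(size, stages):
    """Apply each (name, fn) stage; return sizes reached and the first failing stage."""
    history = [("input", size)]
    for name, fn in stages:
        try:
            size = fn(size)
        except ValueError:
            return history, name
        history.append((name, size))
    return history, None


# Assumed stage list for a 28x28 input (hypothetical, not read from the repo).
MNIST_STAGES = [
    ("conv0", lambda s: conv_same(s, 2)),
    ("max_pool", lambda s: pool_valid(s, 3, 2)),
    ("trans_1", lambda s: pool_valid(s, 2, 2)),
    ("trans_2", lambda s: pool_valid(s, 2, 2)),
    ("trans_3", lambda s: pool_valid(s, 2, 2)),
]

history, failed = trace_sizes(28, MNIST_STAGES)
print(history, failed)
```

Under these assumptions the size goes 28, 14, 6, 3, 1 and the trace stops at `trans_3`, which is consistent with the `[?,1,1,12]` input shape in the error. Possible workarounds include using fewer blocks, a larger input, or SAME padding in the pooling layers.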

growing filter

I notice that your model doesn't increase the number of filters as in the original paper, where each convolution layer in a dense block has an increasing number of filters. Have you tried adding this and comparing the difference?

Original DenseNet implementation of the dense block:
https://github.com/liuzhuang13/DenseNet/blob/master/models/DenseConnectLayer.lua

A bug in dense_block

For example, in Densenet_Cifar10.py

    def dense_block(self, input_x, nb_layers, layer_name):
        with tf.name_scope(layer_name):
            layers_concat = list()
            layers_concat.append(input_x)

            x = self.bottleneck_layer(input_x, scope=layer_name + '_bottleN_' + str(0))

            layers_concat.append(x)

            for i in range(nb_layers - 1):
                x = Concatenation(layers_concat)
                x = self.bottleneck_layer(x, scope=layer_name + '_bottleN_' + str(i + 1))
                layers_concat.append(x)

            return x

In the paper, the output of a dense block should include all preceding feature maps. So the return should be:

return Concatenation(layers_concat)

Right?
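
To make the difference concrete, here is a small sketch that only counts channels. The function `dense_block_channels` is hypothetical, and it assumes every bottleneck layer emits exactly `growth_k` feature maps, as in the paper.

```python
def dense_block_channels(in_channels, nb_layers, growth_k):
    """Return (channels of the last layer alone, channels of the full concatenation)."""
    concat_channels = in_channels
    for _ in range(nb_layers):
        # Each layer reads everything concatenated so far and adds growth_k maps.
        concat_channels += growth_k
    last_layer_channels = growth_k
    return last_layer_channels, concat_channels


# Example: 24 input channels, 6 layers, growth rate 12.
print(dense_block_channels(24, 6, 12))
```

With `return x` the following transition layer would see only `growth_k` channels, while `return Concatenation(layers_concat)` passes on `in_channels + nb_layers * growth_k` channels.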

Why is the axis of the concat layer 3?

May I ask why the axis of the concat layer is 3? I understand that it is a concatenation layer.
https://github.com/taki0112/Densenet-Tensorflow/blob/master/Cifar10/Densenet_Cifar10.py#L73

Also, during training, do the concatenated tensors get a new set of weights, or do they keep the same weights as the original layers?
Thanks
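
A sketch of the shape arithmetic, assuming the tensors are laid out as NHWC (batch, height, width, channels), in which case axis 3 is the channel axis. The helper `concat_shape` is hypothetical and only mimics how a concatenation combines shapes.

```python
def concat_shape(shapes, axis):
    """Shape produced by concatenating tensors of the given shapes along `axis`."""
    first = shapes[0]
    for shape in shapes[1:]:
        if len(shape) != len(first):
            raise ValueError("all inputs must have the same rank")
        for dim in range(len(first)):
            # Every dimension except the concat axis has to match.
            if dim != axis and shape[dim] != first[dim]:
                raise ValueError("mismatch in dimension %d" % dim)
    out = list(first)
    out[axis] = sum(shape[axis] for shape in shapes)
    return tuple(out)


# Block input with 24 channels plus one layer output with 12 channels.
print(concat_shape([(64, 32, 32, 24), (64, 32, 32, 12)], axis=3))
```

Concatenation itself has no trainable weights. It only places existing feature maps side by side, so each map is still produced by the weights of the layer that computed it, and gradients flow back to those layers.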

Data preprocessing

Hi,

This code is very clean, but I think there is a problem. In color preprocessing, the test data should not be normalized by its own mean and variance. Instead, it should be normalized the same way as the training data, i.e. subtract the mean of the training data and then divide by the std. of the training data. I hope this comment helps make the code more rigorous.
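
A minimal pure-Python sketch of that suggestion. The helpers `channel_stats` and `normalize` are hypothetical names, and images are assumed to be nested lists indexed as [image][row][pixel][channel]; with NumPy the same computation is a per-channel mean and std over the training array.

```python
import math


def channel_stats(images, channels=3):
    """Per-channel mean and std computed over every pixel of `images`."""
    sums = [0.0] * channels
    squares = [0.0] * channels
    count = 0
    for image in images:
        for row in image:
            for pixel in row:
                for c in range(channels):
                    sums[c] += pixel[c]
                    squares[c] += pixel[c] * pixel[c]
                count += 1
    means = [s / count for s in sums]
    stds = [math.sqrt(max(q / count - m * m, 0.0)) for q, m in zip(squares, means)]
    return means, stds


def normalize(images, means, stds, eps=1e-7):
    """Apply already-computed statistics; never recompute them on the test set."""
    n = len(means)
    return [[[[(pixel[c] - means[c]) / (stds[c] + eps) for c in range(n)]
              for pixel in row] for row in image] for image in images]


train = [[[[0, 10, 20], [2, 10, 40]]]]   # one tiny 1x2 RGB "image"
test = [[[[3, 10, 30]]]]
train_means, train_stds = channel_stats(train)
train_norm = normalize(train, train_means, train_stds)
test_norm = normalize(test, train_means, train_stds)   # training statistics reused
```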

batch normalize

Wow excellent work! Thanks for sharing~
Just a quick question: is there any reason why you don't use batch_normalization from tf.layers (https://www.tensorflow.org/api_docs/python/tf/layers/batch_normalization)?
I'd be glad to hear your reply. Thank you!

weights for the models provided

Hi,
Are there pretrained weights for these models trained on Cifar10? It'll be really convenient if you could provide those as well. Thanks a lot for sharing the code btw!! Amazing work!!

How to standardize a single image in the testing phase?

I find that in the training and testing phases, the dataset is standardized as a whole batch, with the mean and variance computed per channel.
However, when the model is deployed, images are fed individually. How should we preprocess a single image?
What mean and variance should we use?
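
One common approach, sketched here as an assumption rather than as what this repository does, is to store the per-channel mean and std computed on the training set and reuse them for every image at inference time. The function names and the JSON file layout are hypothetical.

```python
import json


def save_stats(path, means, stds):
    """Persist the training-set statistics next to the model checkpoint."""
    with open(path, "w") as f:
        json.dump({"mean": list(means), "std": list(stds)}, f)


def load_stats(path):
    """Read back the statistics saved by save_stats."""
    with open(path) as f:
        stats = json.load(f)
    return stats["mean"], stats["std"]


def standardize_image(image, means, stds, eps=1e-7):
    """Standardize one image given as nested lists [row][pixel][channel]."""
    n = len(means)
    return [[[(pixel[c] - means[c]) / (stds[c] + eps) for c in range(n)]
             for pixel in row] for row in image]
```

At deployment, `load_stats` would be called once at start-up and `standardize_image` applied to each incoming image, so a single image never needs its own mean or variance.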

About pretrained model for transfer learning

Hi Taki!
Your work is remarkable.
I am Bi Qi, a graduate student studying remote sensing image processing.
Could you please share your pretrained model so that we can use it for transfer learning?
Thanks!

Does it work on 224*224 images?

Hi,
I really appreciate your great work.
I have tried another version of DenseNet at 'https://github.com/ikhlestov/vision_networks',
but I found that it cannot be trained on 224*224 images with DenseNet-BC, because the GPU runs out of memory. It is not a memory-efficient implementation.
Is your code designed to be memory-efficient?
Thanks

Testing from a Saved checkpoint

Hi, I am a beginner with TensorFlow.

After successfully training on the MNIST dataset, I got 4 files created in the model directory:
dense.ckpt.meta
dense.ckpt.index
checkpoint
dense.ckpt.data-00000-of-00001

Can you please confirm whether this is fine? I can see that in the last command the file name is only "dense.ckpt".

I would also like to test my custom data from one particular folder after training this model. Can you please provide the code for that as well?

Pretrained weights for accuracy verification

Hi, could you share the pretrained checkpoint or pb file so that we can check the accuracy? That would be nice.

Implementation without compression or bottleneck layers

Hello, I wonder whether you have ever reproduced any reported result without compression or bottleneck layers on CIFAR-10. I only got 5.7% (k=12, depth=40), while the reported result is 5.2%.
Thank you.

Model compilation failing with ValueError: Variable dense_1_bottleN_0_batch1/beta already exists, disallowed

======Loading data======
DataSet aready exist!
Loading ./cifar-10-batches-py/data_batch_1 : 10000.
Loading ./cifar-10-batches-py/data_batch_2 : 10000.
Loading ./cifar-10-batches-py/data_batch_3 : 10000.
Loading ./cifar-10-batches-py/data_batch_4 : 10000.
Loading ./cifar-10-batches-py/data_batch_5 : 10000.
Loading ./cifar-10-batches-py/test_batch : 10000.
Train data: (50000, 32, 32, 3) (50000, 10)
Test data : (10000, 32, 32, 3) (10000, 10)
======Load finished======
======Shuffling data======
======Prepare Finished======

ValueError Traceback (most recent call last)
in ()
207 learning_rate = tf.placeholder(tf.float32, name='learning_rate')
208
--> 209 model = DenseNet(x=x, nb_blocks=nb_block, filters=growth_k, training=training_flag)
210
211 # logits = DenseNet(x=x, nb_blocks=nb_block, filters=growth_k, training=training_flag).model

20 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variable_scope.py in _get_single_variable(self, name, shape, dtype, initializer, regularizer, partition_info, reuse, trainable, collections, caching_device, validate_shape, use_resource, constraint, synchronization, aggregation)
862 tb = [x for x in tb if "tensorflow/python" not in x[0]][:5]
863 raise ValueError("%s Originally defined at:\n\n%s" %
--> 864 (err_msg, "".join(traceback.format_list(tb))))
865 found_var = self._vars[name]
866 if not shape.is_compatible_with(found_var.get_shape()):

ValueError: Variable dense_1_bottleN_0_batch1/beta already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

File "/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/framework/python/ops/variables.py", line 283, in variable
aggregation=aggregation)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
return func(*args, **current_args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/framework/python/ops/variables.py", line 355, in model_variable
aggregation=aggregation)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
return func(*args, **current_args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/layers/python/layers/layers.py", line 315, in _fused_batch_norm
trainable=trainable)

A bug in transition layer

The original filter count in transition_layer is equal to growth_k, which is too small, so the results are poor and the network is hard to converge. I changed it as below, referring to another implementation, and the results are now normal and much better.

    def transition_layer(self, x, scope):
        with tf.name_scope(scope):
            x = Batch_Normalization(x, training=self.training, scope=scope+'_batch1')
            x = Relu(x)
            shape = x.get_shape().as_list()
            in_channel = shape[3]
            # x = conv_layer(x, filter=self.filters, kernel=[1,1], layer_name=scope+'_conv1')
            # Halve the channel count; cast to int because the filter count must be an integer.
            x = conv_layer(x, filter=int(in_channel*0.5), kernel=[1,1], layer_name=scope+'_conv1')
            x = Drop_out(x, rate=dropout_rate, training=self.training)
            x = Average_pooling(x, pool_size=[2,2], stride=2)

            return x
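
For reference, the DenseNet paper describes this as compression: a transition layer that receives m feature maps outputs floor(theta * m) of them, with theta = 0.5 in DenseNet-BC. A small sketch of that arithmetic, with a hypothetical helper name:

```python
import math


def transition_out_channels(in_channels, theta=0.5):
    """Number of feature maps after compression, floor(theta * in_channels)."""
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must be in (0, 1]")
    return int(math.floor(theta * in_channels))


print(transition_out_channels(256))       # channels after a block that outputs 256 maps
print(transition_out_channels(37, 0.5))   # odd counts are rounded down
```

Note that `in_channel*0.5` alone is a float in Python 3, so converting it to an integer before passing it as a filter count avoids a type problem in the convolution layer.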

does not converge

I cloned the repository and ran the CIFAR-10 DenseNet.
The accuracy reaches 87% at epoch 150 and does not increase any further.
I don't know why.
Should I change some hyperparameters?

Error while testing for Cifar10 dataset

Here's the error stack trace

Traceback (most recent call last):
  File "Densenet_Cifar10.py", line 270, in <module>
    _, batch_loss, weight_summary = sess.run([train, cost, merged], feed_dict=train_feed_dict)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 778, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 969, in _run
    fetch_handler = _FetchHandler(self._graph, fetches, feed_dict_string)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 408, in __init__
    self._fetch_mapper = _FetchMapper.for_fetch(fetches)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 230, in for_fetch
    return _ListFetchMapper(fetch)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 337, in __init__
    self._mappers = [_FetchMapper.for_fetch(fetch) for fetch in fetches]
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 337, in <listcomp>
    self._mappers = [_FetchMapper.for_fetch(fetch) for fetch in fetches]
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 227, in for_fetch
    (fetch, type(fetch)))
TypeError: Fetch argument None has invalid type <class 'NoneType'>

tf.summary.merge_all() returns None
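
In TensorFlow 1.x, `tf.summary.merge_all()` returns None when no summary ops have been added to the graph, and `sess.run` rejects None in its fetch list. Besides defining at least one summary before merging, one defensive option is to drop the summary fetch when it is missing. The sketch below is plain Python with a hypothetical helper, using strings as stand-ins for the real ops.

```python
def build_fetches(train_op, cost, merged):
    """Return the fetch list for sess.run, leaving out a missing summary op."""
    fetches = [train_op, cost]
    if merged is not None:
        fetches.append(merged)
    return fetches


# Stand-ins for real graph ops; in the script these would be train, cost, merged.
print(build_fetches("train", "cost", None))
print(build_fetches("train", "cost", "merged"))
```

The caller then has to unpack two or three results depending on the length of the list, and skip writing summaries when there are none.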

A question

    for i in range(self.nb_blocks) :
        # 6 -> 12 -> 48
        x = self.dense_block(input_x=x, nb_layers=4, layer_name='dense_'+str(i))
        x = self.transition_layer(x, scope='trans_'+str(i))

What does this code mean?
Do all blocks use nb_layers equal to 4?
Why is this code used in Densenet_MNIST.py?

Do the inputs of all 1*1 layers in a dense block include the output of the transition layer?

In the code, the inputs of all the 1*1 conv layers in a dense block are concatenated with the output of the transition layer.
However, I calculated the parameters of DenseNet-121. If the inputs of all the 1*1 conv layers in a dense block include the output of the transition layer, there are about 13M parameters, but the paper (Densely Connected Convolutional Networks) reports no more than 10M. If the inputs of the 1*1 layers do not include the output of the transition layer, there are about 8.8M parameters. So I think that the inputs of the 1*1 layers in a dense block do not include the output of the transition layer.
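
The count can be checked with a short script. This is a sketch under stated assumptions, not a reading of this repository's code: the ImageNet DenseNet-121 configuration from the paper (growth rate 32, blocks of 6, 12, 24 and 16 layers, 1*1 bottlenecks with 4k filters, compression 0.5, a 7*7 stem with 64 filters, 1000 classes), convolutions without bias, and batch-norm parameters ignored. The function name is hypothetical. Every 1*1 layer here takes the block input (the previous transition's output) plus all earlier layer outputs.

```python
def densenet_conv_fc_params(block_layers=(6, 12, 24, 16), growth_k=32,
                            init_channels=64, bottleneck_mult=4,
                            theta=0.5, num_classes=1000):
    """Count conv and fully connected weights of a DenseNet-BC style network."""
    total = 7 * 7 * 3 * init_channels            # stem convolution
    channels = init_channels
    mid = bottleneck_mult * growth_k             # width of each 1*1 bottleneck
    for block_index, nb_layers in enumerate(block_layers):
        for _ in range(nb_layers):
            total += channels * mid              # 1*1 conv over everything so far
            total += 3 * 3 * mid * growth_k      # 3*3 conv producing growth_k maps
            channels += growth_k                 # new maps join the concatenation
        if block_index != len(block_layers) - 1:
            out_channels = int(theta * channels)
            total += channels * out_channels     # 1*1 conv in the transition layer
            channels = out_channels
    total += channels * num_classes + num_classes  # final classifier with bias
    return total


print(densenet_conv_fc_params())
```

Under these assumptions the function returns 7,895,208, i.e. roughly 7.9M weights with the block input included in every 1*1 layer's input. A different total would point to a different configuration or counting convention.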
