Comments (12)

zhengyang-wang commented on July 23, 2024

I tested with your trained model and here is the result:

-----------build encoder: deeplab pre-trained-----------
after start block: (1, ?, ?, 64)
after block1: (1, ?, ?, 256)
after block2: (1, ?, ?, 512)
after block3: (1, ?, ?, 1024)
after block4: (1, ?, ?, 2048)
-----------build decoder-----------
after aspp block: (1, ?, ?, 19)
Restored model parameters from model/model.ckpt-15000
step 0
step 100
step 200
step 300
step 400
Pixel Accuracy: 0.940
Mean IoU: 0.678

Please check your dataset.

from deeplab-v2--resnet-101--tensorflow.

zhengyang-wang commented on July 23, 2024

The test phase has nothing to do with 'input_height' and 'input_width'; they only set the patch size for training. Your test phase did not work because you did not change 'valid_num_steps', which is 500 for Cityscapes, and 'valid_step', which should be 20000 in your setting.
My experiments showed that the coarse data hurt performance. I'm not sure why. I'd appreciate it if you could share your results here. Thanks.
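
As a rough sketch of the 'valid_num_steps' point: the flag is described as the number of testing/validation samples, so it can be derived from the validation list rather than typed in by hand. The helper name count_samples is hypothetical, and the one-sample-per-line layout of the list file is an assumption, not something confirmed in this thread.

```python
import os
import tempfile


def count_samples(list_path):
    """Return the number of non-empty lines in a data list file."""
    with open(list_path) as f:
        return sum(1 for line in f if line.strip())


# Demo with a throwaway list file standing in for the real validation list.
with tempfile.TemporaryDirectory() as tmp:
    demo_list = os.path.join(tmp, "val_demo.txt")
    with open(demo_list, "w") as f:
        f.write("img_a.png lbl_a.png\nimg_b.png lbl_b.png\n\n")
    valid_num_steps = count_samples(demo_list)
    print("valid_num_steps =", valid_num_steps)
```

Pointing count_samples at the real validation list should give the value to pass as 'valid_num_steps', provided the list really has one sample per line.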

John1231983 commented on July 23, 2024

Thanks for the correction. I changed those settings and also set the input width and height to 713 to reduce memory usage. I trained the model again for 15k iterations and ran inference, but I still get the error below. I used the DeepLab weights for initialization. How should I fix it?

You can download my model at https://drive.google.com/open?id=1PPVvKmmDHuvz1hFq9TSdbMkLkjNWT4x7

-----------build decoder-----------
('after aspp block:', TensorShape([Dimension(1), Dimension(None), Dimension(None), Dimension(19)]))
Restored model parameters from model_cityscape/model.ckpt-15000
Traceback (most recent call last):
  File "main_cityscape.py", line 75, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "main_cityscape.py", line 69, in main
    getattr(model, args.option)()
  File "/home/john/Deeplab-v2--ResNet-101--Tensorflow/model.py", line 90, in test
    preds, _, _ = self.sess.run([self.pred, self.accu_update_op, self.mIou_update_op])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 997, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1132, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [327680] vs. [2097152]
	 [[Node: Equal = Equal[T=DT_UINT8, _device="/job:localhost/replica:0/task:0/gpu:0"](Cast_2, Select/_1095)]]

Caused by op u'Equal', defined at:
  File "main_cityscape.py", line 75, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "main_cityscape.py", line 69, in main
    getattr(model, args.option)()
  File "/home/john/Deeplab-v2--ResNet-101--Tensorflow/model.py", line 76, in test
    self.test_setup()
  File "/home/john/Deeplab-v2--ResNet-101--Tensorflow/model.py", line 271, in test_setup
    self.pred, gt, weights=weights)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/metrics/python/ops/metric_ops.py", line 465, in streaming_accuracy
    updates_collections=updates_collections, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/metrics_impl.py", line 409, in accuracy
    is_correct = math_ops.to_float(math_ops.equal(predictions, labels))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 681, in equal
    result = _op_def_lib.apply_op("Equal", x=x, y=y, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Incompatible shapes: [327680] vs. [2097152]
	 [[Node: Equal = Equal[T=DT_UINT8, _device="/job:localhost/replica:0/task:0/gpu:0"](Cast_2, Select/_1095)]]

zhengyang-wang commented on July 23, 2024

The error message means that your prediction has 327680 (= 1024x320) elements while the label has 2097152 (= 1024x2048). The code is supposed to output a prediction of the same spatial size as the input:

raw_output = net.outputs
raw_output = tf.image.resize_bilinear(raw_output, tf.shape(self.image_batch)[1:3,])
raw_output = tf.argmax(raw_output, axis=3)
pred = tf.expand_dims(raw_output, dim=3)

I cannot identify the exact problem, but you can debug by checking the size of your test inputs. I also suspect that you need to use Python 3.5; I did not test my code on Python 2.7.
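
To make the size arithmetic concrete, here is a small sketch in plain Python. It assumes both tensors are flattened H x W maps with batch size 1 and a height of 1024, as in the factorization given in this comment; the helper name width_for is hypothetical.

```python
def width_for(flat_size, height):
    """Return the width implied by a flattened H x W map, or None if height does not divide it."""
    width, remainder = divmod(flat_size, height)
    return width if remainder == 0 else None


# Sizes taken from "Incompatible shapes: [327680] vs. [2097152]".
pred_size = 327680
label_size = 2097152

print(width_for(pred_size, 1024))   # 320
print(width_for(label_size, 1024))  # 2048
print(label_size / pred_size)       # 6.4
```

The label holds 6.4 times as many pixels as the prediction, so the two cannot be compared element by element, which is what the Equal op inside the accuracy metric attempts.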

John1231983 commented on July 23, 2024

Hi, I reinstalled TensorFlow 1.3 (GPU) with Python 3.4. I saved a snapshot at iteration 100 for testing, and it still fails. I cannot print the output shape using print pred.shape (it returns ?). This is the test log:

ci bus id: 0000:02:00.0)
-----------build encoder: deeplab pre-trained-----------
after start block: (1, ?, ?, 64)
after block1: (1, ?, ?, 256)
after block2: (1, ?, ?, 512)
after block3: (1, ?, ?, 1024)
after block4: (1, ?, ?, 2048)
-----------build decoder-----------
after aspp block: (1, ?, ?, 19)
(?,) (?,)
Restored model parameters from model_cityscape/model.ckpt-100
2017-12-03 16:59:53.562388: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Incompatible shapes: [327680] vs. [2097152]
	 [[Node: Equal = Equal[T=DT_UINT8, _device="/job:localhost/replica:0/task:0/gpu:0"](Cast_2, Select/_1075)]]
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 1327, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 1306, in _run_fn
    status, run_metadata)
  File "/usr/lib/python3.4/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [327680] vs. [2097152]
	 [[Node: Equal = Equal[T=DT_UINT8, _device="/job:localhost/replica:0/task:0/gpu:0"](Cast_2, Select/_1075)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main_cityscape.py", line 75, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "main_cityscape.py", line 69, in main
    getattr(model, args.option)()
  File "/home/john/Deeplab-v2--ResNet-101--Tensorflow/model.py", line 89, in test
    preds, _, _ = self.sess.run([self.pred, self.accu_update_op, self.mIou_update_op])
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [327680] vs. [2097152]
	 [[Node: Equal = Equal[T=DT_UINT8, _device="/job:localhost/replica:0/task:0/gpu:0"](Cast_2, Select/_1075)]]

Caused by op 'Equal', defined at:
  File "main_cityscape.py", line 75, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "main_cityscape.py", line 69, in main
    getattr(model, args.option)()
  File "/home/john/Deeplab-v2--ResNet-101--Tensorflow/model.py", line 75, in test
    self.test_setup()
  File "/home/john/Deeplab-v2--ResNet-101--Tensorflow/model.py", line 271, in test_setup
    self.pred, gt, weights=weights)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/contrib/metrics/python/ops/metric_ops.py", line 466, in streaming_accuracy
    updates_collections=updates_collections, name=name)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/ops/metrics_impl.py", line 409, in accuracy
    is_correct = math_ops.to_float(math_ops.equal(predictions, labels))
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 753, in equal
    result = _op_def_lib.apply_op("Equal", x=x, y=y, name=name)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Incompatible shapes: [327680] vs. [2097152]
	 [[Node: Equal = Equal[T=DT_UINT8, _device="/job:localhost/replica:0/task:0/gpu:0"](Cast_2, Select/_1075)]]

zhengyang-wang commented on July 23, 2024

You have to run sess.run(tf.shape(x)) to check the size of x. You got "?" for the spatial size because the code is designed to take inputs of variable size for testing (which is the case in the PASCAL dataset).

John1231983 commented on July 23, 2024

Thanks. I have uploaded my model at 15k iterations. Could you try running the test with my model? If it fails, I think the problem comes from training; if not, the problem comes from my testing. You can download my model at https://drive.google.com/open?id=1PPVvKmmDHuvz1hFq9TSdbMkLkjNWT4x7

John1231983 commented on July 23, 2024

I finally solved the problem with your help, and I hope this helps anyone who runs into the same issue. As the author suggested, the problem was the image size. A long time ago I replaced one image in the validation set with another image, so the new image had a different size from its ground truth. Yesterday I only checked the ground-truth sizes, not the raw image folder. After checking the raw image folder and restoring the original Cityscapes image, testing worked well. Sorry for my mistake.
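
For anyone who wants to run the same check automatically, here is a minimal sketch that compares the size of each image with its ground truth by reading only the PNG header. It assumes both images and labels are PNG files; the function names, the demo sizes, and the list of (image, label) pairs are hypothetical and not part of the repository.

```python
import os
import struct
import tempfile

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Read (width, height) from a PNG file's IHDR chunk without decoding the image."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file: %s" % path)
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def find_size_mismatches(pairs):
    """Return (image, label, image_size, label_size) for every pair whose sizes differ."""
    mismatches = []
    for image_path, label_path in pairs:
        image_size = png_size(image_path)
        label_size = png_size(label_path)
        if image_size != label_size:
            mismatches.append((image_path, label_path, image_size, label_size))
    return mismatches


def write_png_header(path, width, height):
    """Demo helper: write only the signature and start of IHDR (not a viewable image)."""
    with open(path, "wb") as f:
        f.write(PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR")
        f.write(struct.pack(">II", width, height))


# Demo with header-only stand-in files; the sizes are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "image.png")
    label = os.path.join(tmp, "label.png")
    write_png_header(image, 320, 1024)   # stand-in for a replaced image
    write_png_header(label, 2048, 1024)  # stand-in for a full-size label
    demo_mismatches = find_size_mismatches([(image, label)])
    for item in demo_mismatches:
        print(item)
```

With real data, the pairs would come from the validation list, and any pair reported here is a candidate for the kind of shape error shown earlier in this thread.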

@zhengyang-wang: What performance did you get on Cityscapes? Is it better than 0.678?

zhengyang-wang commented on July 23, 2024

Thank you. My best result is around 70% mean IoU.

John1231983 commented on July 23, 2024

Thanks. I got 71.6% after 30k iterations by changing the learning rate and weight decay, using the fine annotations only. If you are interested, I will share my settings. I think multi-scale inputs may also help improve performance; I hope you can find time to implement that.

zhengyang-wang commented on July 23, 2024

Thanks. I’d appreciate it if you could share your settings.
I’ll implement multi-scale soon.

John1231983 commented on July 23, 2024

Hello. These are my settings:
IMG_MEAN = np.array((103.939, 116.779, 123.68), dtype=np.float32)

	flags.DEFINE_integer('num_steps', 50000, 'maximum number of iterations')
	flags.DEFINE_integer('save_interval', 10000, 'number of iterations for saving and visualization')
	flags.DEFINE_integer('random_seed', 1234, 'random seed')
	flags.DEFINE_float('weight_decay', 0.0001, 'weight decay rate')
	flags.DEFINE_float('learning_rate', 1e-3, 'learning rate')
	flags.DEFINE_float('power', 0.9, 'hyperparameter for poly learning rate')
	flags.DEFINE_float('momentum', 0.9, 'momentum')
	flags.DEFINE_string('encoder_name', 'deeplab', 'name of pre-trained model, res101, res50 or deeplab')
	flags.DEFINE_string('pretrain_file', './reference model/deeplab_resnet_init.ckpt', 'pre-trained model filename corresponding to encoder_name')
	flags.DEFINE_string('data_list', './dataset_cityscapes/train_fine.txt', 'training data list filename')

	# testing / validation
	flags.DEFINE_integer('valid_step', 40000, 'checkpoint number for testing/validation')
	flags.DEFINE_integer('valid_num_steps', 500, '= number of testing/validation samples')
	flags.DEFINE_string('valid_data_list', './dataset_cityscapes/val_fine.txt', 'testing/validation data list filename')

	# data
	flags.DEFINE_string('data_dir', './cityscapes', 'data directory')
	flags.DEFINE_integer('batch_size', 3, 'training batch size')
	flags.DEFINE_integer('input_height', 713, 'input image height')
	flags.DEFINE_integer('input_width', 713, 'input image width')
	flags.DEFINE_integer('num_classes', 19, 'number of classes')
	flags.DEFINE_integer('num_steps', 50000, 'maximum number of iterations')
	flags.DEFINE_integer('save_interval', 10000, 'number of iterations for saving and visualization')
	flags.DEFINE_integer('random_seed', 1234, 'random seed')
	flags.DEFINE_float('weight_decay', 0.0005, 'weight decay rate')
	flags.DEFINE_float('learning_rate', 1e-3, 'learning rate')
	flags.DEFINE_float('power', 0.9, 'hyperparameter for poly learning rate')
	flags.DEFINE_float('momentum', 0.9, 'momentum')
	flags.DEFINE_string('encoder_name', 'deeplab', 'name of pre-trained model, res101, res50 or deeplab')
	flags.DEFINE_string('pretrain_file', './reference model/deeplab_resnet_init.ckpt', 'pre-trained model filename corresponding to encoder_name')
	flags.DEFINE_string('data_list', './dataset_cityscapes/train_fine.txt', 'training data list filename')

	# testing / validation
	flags.DEFINE_integer('valid_step', 40000, 'checkpoint number for testing/validation')
	flags.DEFINE_integer('valid_num_steps', 500, '= number of testing/validation samples')
	flags.DEFINE_string('valid_data_list', './dataset_cityscapes/val_fine.txt', 'testing/validation data list filename')

	# data
	flags.DEFINE_string('data_dir', './cityscapes', 'data directory')
	flags.DEFINE_integer('batch_size', 3, 'training batch size')
	flags.DEFINE_integer('input_height', 713, 'input image height')
	flags.DEFINE_integer('input_width', 713, 'input image width')
	flags.DEFINE_integer('num_classes', 19, 'number of classes')
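
The 'power' flag above is described as the hyperparameter for the poly learning rate. Below is a minimal sketch of the commonly used poly policy, lr = learning_rate * (1 - step / num_steps) ** power, with the default values taken from the settings above. The function name poly_lr is hypothetical, and the repository's own implementation may differ in detail.

```python
def poly_lr(step, base_lr=1e-3, num_steps=50000, power=0.9):
    """Poly decay: base_lr * (1 - step / num_steps) ** power."""
    if not 0 <= step <= num_steps:
        raise ValueError("step must be in [0, num_steps]")
    return base_lr * (1.0 - step / num_steps) ** power


# Learning rate at a few points of a 50000-step run.
for step in (0, 10000, 30000, 50000):
    print(step, poly_lr(step))
```

The rate starts at 'learning_rate' and decays to zero at 'num_steps', so changing 'num_steps' also changes the learning rate seen at any given iteration.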
