What is your environment for testing your model? about deeplab2 HOT 11 CLOSED

google-research commented on June 23, 2024

What is your environment for testing your model?

from deeplab2.

Comments (11)

markweberdev commented on June 23, 2024

Dear @lxtGH,

My setup is CUDA 11.2.2 with TF 2.5.0. I didn't need to make any changes. I get the same layout error, but everything still runs fine.
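For anyone comparing their own setup against this one, the following is a minimal sketch of an environment report. It assumes TensorFlow 2.3 or newer, where `tf.sysconfig.get_build_info()` is available; the function name `describe_environment` is made up for this illustration and is not part of deeplab2.

```python
import platform


def describe_environment():
    """Collect version details worth quoting in a bug report."""
    info = {"python": platform.python_version(), "tensorflow": None}
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this interpreter.
        return info
    info["tensorflow"] = tf.__version__
    # Versions TensorFlow was built against; the keys are absent on CPU-only builds.
    build = tf.sysconfig.get_build_info()
    info["cuda"] = build.get("cuda_version")
    info["cudnn"] = build.get("cudnn_version")
    # GPUs that TensorFlow can actually see at runtime.
    info["gpus"] = [gpu.name for gpu in tf.config.list_physical_devices("GPU")]
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Posting this output next to the full log makes it easier to spot a mismatch between the CUDA version TensorFlow was built against and the one installed on the machine.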

The following example evaluates the Motion-DeepLab checkpoint that we provide (trained on KITTI) on KITTI:

python trainer/train.py --config_file="./configs/kitti/motion_deeplab/resnet50_os32.textproto" --num_gpus=1 --mode=eval --model_dir="/**retracted**/models/deeplab2/kitti/"

I don't have access to an RTX 3090, so I am unable to verify that it runs there too.

2021-10-08 09:12:02.213287: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
I1008 09:12:05.060556 139644910475072 train.py:65] Reading the config file.
I1008 09:12:05.065931 139644910475072 train.py:69] Starting the experiment.
2021-10-08 09:12:05.068325: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-10-08 09:12:05.160309: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:83:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2021-10-08 09:12:05.160363: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-10-08 09:12:05.346183: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-10-08 09:12:05.346345: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-10-08 09:12:05.393284: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-10-08 09:12:05.518453: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-10-08 09:12:05.607084: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-10-08 09:12:05.683005: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-10-08 09:12:05.686791: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-10-08 09:12:05.693328: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-10-08 09:12:05.694196: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-10-08 09:12:05.699542: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:83:00.0 name: NVIDIA TITAN X (Pascal) computeCapability: 6.1
coreClock: 1.531GHz coreCount: 28 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 447.48GiB/s
2021-10-08 09:12:05.706127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-10-08 09:12:05.706191: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-10-08 09:12:06.506011: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-08 09:12:06.506057: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2021-10-08 09:12:06.506064: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
2021-10-08 09:12:06.513654: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11436 MB memory) -> physical GPU (device: 0, name: NVIDIA TITAN X (Pascal), pci bus id: 0000:83:00.0, compute capability: 6.1)
I1008 09:12:06.516476 139644910475072 train_lib.py:105] Using strategy <class 'tensorflow.python.distribute.one_device_strategy.OneDeviceStrategy'> with 1 replicas
I1008 09:12:06.715466 139644910475072 motion_deeplab.py:53] Synchronized Batchnorm is used.
I1008 09:12:06.718576 139644910475072 axial_resnet_instances.py:144] Axial-ResNet final config: {'num_blocks': [3, 4, 6, 3], 'backbone_layer_multiplier': 1.0, 'width_multiplier': 1.0, 'stem_width_multiplier': 1.0, 'output_stride': 32, 'classification_mode': True, 'backbone_type': 'resnet', 'use_axial_beyond_stride': 0, 'backbone_use_transformer_beyond_stride': 0, 'extra_decoder_use_transformer_beyond_stride': 32, 'backbone_decoder_num_stacks': 0, 'backbone_decoder_blocks_per_stage': 1, 'extra_decoder_num_stacks': 0, 'extra_decoder_blocks_per_stage': 1, 'max_num_mask_slots': 128, 'num_mask_slots': 128, 'memory_channels': 256, 'base_transformer_expansion': 1.0, 'global_feed_forward_network_channels': 256, 'high_resolution_output_stride': 4, 'activation': 'relu', 'block_group_config': {'attention_bottleneck_expansion': 2, 'drop_path_keep_prob': 1.0, 'drop_path_beyond_stride': 16, 'drop_path_schedule': 'constant', 'positional_encoding_type': None, 'use_global_beyond_stride': 0, 'use_sac_beyond_stride': -1, 'use_squeeze_and_excite': False, 'conv_use_recompute_grad': False, 'axial_use_recompute_grad': True, 'recompute_within_stride': 0, 'transformer_use_recompute_grad': False, 'axial_layer_config': {'query_shape': (129, 129), 'key_expansion': 1, 'value_expansion': 2, 'memory_flange': (32, 32), 'double_global_attention': False, 'num_heads': 8, 'use_query_rpe_similarity': True, 'use_key_rpe_similarity': True, 'use_content_similarity': True, 'retrieve_value_rpe': True, 'retrieve_value_content': True, 'initialization_std_for_query_key_rpe': 1.0, 'initialization_std_for_value_rpe': 1.0, 'self_attention_activation': 'softmax'}, 'dual_path_transformer_layer_config': {'num_heads': 8, 'bottleneck_expansion': 2, 'key_expansion': 1, 'value_expansion': 2, 'feed_forward_network_channels': 2048, 'use_memory_self_attention': True, 'use_pixel2memory_feedback_attention': True, 'transformer_activation': 'softmax'}}, 'bn_layer': functools.partial(<class 
'tensorflow.python.keras.layers.normalization_v2.SyncBatchNormalization'>, momentum=0.9900000095367432, epsilon=0.0010000000474974513), 'conv_kernel_weight_decay': 0.0}
I1008 09:12:06.923306 139644910475072 motion_deeplab.py:109] Setting pooling size to (13, 40)
I1008 09:12:06.923628 139644910475072 aspp.py:135] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1008 09:12:06.923738 139644910475072 aspp.py:135] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
2021-10-08 09:12:11.757189: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
I1008 09:12:11.760688 139644910475072 controller.py:362] restoring or initializing model...
restoring or initializing model...
I1008 09:12:12.072492 139644910475072 controller.py:368] initialized model.
initialized model.
I1008 09:12:12.073319 139644910475072 controller.py:252] eval | step: 0 | running complete evaluation...
eval | step: 0 | running complete evaluation...
2021-10-08 09:12:12.287843: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-10-08 09:12:12.308308: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 3399950000 Hz
WARNING:tensorflow:From /usr/wiss/webermar/anaconda3/envs/deeplab_pip/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py:5049: calling gather (from tensorflow.python.ops.array_ops) with validate_indices is deprecated and will be removed in a future version.
Instructions for updating:
The validate_indices argument has no effect. Indices are always validated on CPU and never validated on GPU.
W1008 09:12:19.992389 139644910475072 deprecation.py:534] From /usr/wiss/webermar/anaconda3/envs/deeplab_pip/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py:5049: calling gather (from tensorflow.python.ops.array_ops) with validate_indices is deprecated and will be removed in a future version.
Instructions for updating:
The validate_indices argument has no effect. Indices are always validated on CPU and never validated on GPU.
2021-10-08 09:12:25.851721: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:808] layout failed: Invalid argument: Size of values 3 does not match size of permutation 4 @ fanin shape inMotionDeepLab/PostProcessor/StatefulPartitionedCall/while/body/_231/while/SelectV2_1-1-TransposeNHWCToNCHW-LayoutOptimizer
2021-10-08 09:12:27.338178: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-10-08 09:12:28.743844: I tensorflow/stream_executor/cuda/cuda_dnn.cc:359] Loaded cuDNN version 8201
2021-10-08 09:12:30.845198: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-10-08 09:12:31.859844: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
I1008 09:37:38.998472 139644910475072 api.py:446] Creating COCO objects for AP eval...
creating index...
index created!
Loading and preparing results...
DONE (t=12.72s)
creating index...
index created!
I1008 09:37:54.079124 139644910475072 api.py:446] Running COCO evaluation...
Running per image evaluation...
Evaluate annotation type segm
DONE (t=98.34s).
Accumulating evaluation results...
DONE (t=3.74s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.375
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.651
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.356
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.150
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.481
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.676
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.141
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.438
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.439
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.195
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.562
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.764
I1008 09:39:38.485969 139644910475072 controller.py:261] eval | step: 0 | eval time: 1646.4 sec | output:
{'evaluation/ap/AP_Mask': 0.3751787,
'evaluation/iou/IoU': 0.63153327,
'evaluation/pq/FN': 734.8947,
'evaluation/pq/FP': 622.7895,
'evaluation/pq/PQ': 0.4207694,
'evaluation/pq/RQ': 0.52304685,
'evaluation/pq/SQ': 0.77444,
'evaluation/pq/TP': 1501.0526,
'evaluation/step/AQ': 0.5277102021206063,
'evaluation/step/IoU': 0.6308144803396004,
'evaluation/step/STQ': 0.5769638090215155,
'losses/eval_center_loss': 0.06295311,
'losses/eval_motion_loss': 0.0750779,
'losses/eval_regression_loss': 0.016829815,
'losses/eval_semantic_loss': 2.098444,
'losses/eval_total_loss': 2.2533038}
eval | step: 0 | eval time: 1646.4 sec | output:
{'evaluation/ap/AP_Mask': 0.3751787,
'evaluation/iou/IoU': 0.63153327,
'evaluation/pq/FN': 734.8947,
'evaluation/pq/FP': 622.7895,
'evaluation/pq/PQ': 0.4207694,
'evaluation/pq/RQ': 0.52304685,
'evaluation/pq/SQ': 0.77444,
'evaluation/pq/TP': 1501.0526,
'evaluation/step/AQ': 0.5277102021206063,
'evaluation/step/IoU': 0.6308144803396004,
'evaluation/step/STQ': 0.5769638090215155,
'losses/eval_center_loss': 0.06295311,
'losses/eval_motion_loss': 0.0750779,
'losses/eval_regression_loss': 0.016829815,
'losses/eval_semantic_loss': 2.098444,
'losses/eval_total_loss': 2.2533038}
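
As a quick sanity check on the numbers above: STQ is defined as the geometric mean of the association quality (AQ) and the segmentation IoU. The sketch below recomputes it from the logged values. The helper name `stq` is chosen only for this illustration and is not a deeplab2 function.

```python
import math


def stq(aq, iou):
    """Geometric mean of association quality and segmentation IoU."""
    return math.sqrt(aq * iou)


# Values copied from the evaluation output above.
logged = {
    "evaluation/step/AQ": 0.5277102021206063,
    "evaluation/step/IoU": 0.6308144803396004,
    "evaluation/step/STQ": 0.5769638090215155,
}

recomputed = stq(logged["evaluation/step/AQ"], logged["evaluation/step/IoU"])
print(f"recomputed STQ: {recomputed:.6f}")
print(f"logged STQ:     {logged['evaluation/step/STQ']:.6f}")
```

If your own run reports AQ, IoU and STQ values that do not satisfy this relation, the log is probably mixing results from different runs.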

Could you provide your full config as well as the full log you get when evaluating with unchanged code?

lxtGH commented on June 23, 2024

@markweberdev Hi! I still cannot reproduce the results. Could you report the class-wise IoU or PQ for us to use as a reference?

markweberdev commented on June 23, 2024

That's unfortunate. I can evaluate that for you. Could you please specify whether you would like per-class PQ scores from Panoptic-DeepLab or Motion-DeepLab on KITTI-STEP?

lxtGH commented on June 23, 2024

Hi @markweberdev! Thanks for your reply. I would like the results of both Panoptic-DeepLab and Motion-DeepLab. Thanks for that.

markweberdev commented on June 23, 2024

Please find the class-wise scores attached. Please note that the results were obtained with a ResNet50 os32 backbone.
classwise_scores_kitti_step.csv

lxtGH commented on June 23, 2024

@markweberdev Hi Mark! I found that there are 11095 images in the KITTI-STEP test set, but in your paper the number is 10173.
So the numbers do not seem to be consistent.

markweberdev commented on June 23, 2024

@lxtGH Thanks a lot for pointing this out. You are right, it's 11095. I will correct it in the paper!

lxtGH commented on June 23, 2024

@markweberdev Hi Mark! In Tab. 3, what is the window size for the VPQ calculation? According to the results in your CSV file, I believe VPQ = PQ where k = 1.

markweberdev commented on June 23, 2024

Hi,

I am unsure how you arrived at these insights. PQ scores are naturally higher than VPQ scores (by design, VPQ can’t be higher than PQ). With a window size of k=0, all baselines (B1-B3) would have had the same result, which is not what we reported in the paper.

We used the default setting of VPQ, as introduced in their paper. VPQ is averaged over K=4 different window sizes (0, 1, 2, 3 labelled images). As Cityscapes-VPS has only every 5th frame labelled, this corresponds to their (0, 5, 10, 15) setting.
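
The relation between the two ways of counting window sizes can be sketched as follows. This is only an illustration of the averaging described above, not the evaluation code used for the paper. The helper names and the per-window scores are hypothetical, and `label_stride=1` for KITTI-STEP assumes every frame is labelled.

```python
from statistics import mean


def window_offsets(labelled_windows, label_stride):
    """Convert window sizes counted in labelled frames into raw frame offsets."""
    return [k * label_stride for k in labelled_windows]


def average_vpq(vpq_per_window):
    """Average the per-window VPQ scores into the single reported VPQ."""
    return mean(vpq_per_window.values())


windows = (0, 1, 2, 3)

# Cityscapes-VPS: only every 5th frame is labelled.
cityscapes_offsets = window_offsets(windows, label_stride=5)
print(cityscapes_offsets)  # [0, 5, 10, 15]

# Assumed for KITTI-STEP: every frame is labelled.
kitti_offsets = window_offsets(windows, label_stride=1)
print(kitti_offsets)  # [0, 1, 2, 3]

# Hypothetical per-window scores, made up purely to show the averaging step.
example_scores = {0: 0.60, 1: 0.55, 2: 0.50, 3: 0.45}
print(average_vpq(example_scores))
```

With k=0 only, the window contains a single frame, so the score carries no tracking information, which is why all baselines would coincide in that setting.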

Hope that helps.

Best,
Mark

lxtGH commented on June 23, 2024

@markweberdev Thanks for your reply! I found that the changes in VPQ on KITTI-STEP are not as large as on Cityscapes-VPS. Is the reason that STEP has fewer thing classes (2 vs. 8 in Cityscapes)?

aquariusjay commented on June 23, 2024

Hi @lxtGH,

KITTI-STEP builds on top of KITTI-MOTS (which contains two thing classes for tracking) by additionally annotating the semantic segmentation.
VPQ is sensitive to the window size and stride (i.e., k and lambda in their paper), while our proposed metric STQ can directly evaluate on a whole video sequence.
For videos that have a high annotation frame rate, you may need to experiment with different values of window size and stride to see the variation of VPQ.
For the KITTI-STEP dataset and metric discussion, please refer to our paper.

I am closing the issue, but please feel free to reopen it if you have any more questions.

Cheers,
