
prismformore / multi-task-transformer


Code of ICLR2023 paper "TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding" and ECCV2022 paper "Inverted Pyramid Multi-task Transformer for Dense Scene Understanding"

License: MIT License

Python 96.28% Shell 0.22% C++ 1.07% Cuda 2.42%
computer-vision deep-learning depth-estimation eccv2022 human-parsing multi-task-learning nyudv2 pascal scene-understanding segmentation

multi-task-transformer's People

Contributors

prismformore


multi-task-transformer's Issues

script error

Thank you for your nice work. I wanted to try running this project, but got the errors below.
/home/cenchaojun/.conda/envs/invpt/lib/python3.8/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

warnings.warn(
local rank: 0
{'version_name': 'InvPT_pascal_vitLp16', 'out_dir': '../', 'train_db_name': 'PASCALContext', 'val_db_name': 'PASCALContext', 'trBatch': 2, 'valBatch': 6, 'nworkers': 2, 'ignore_index': 255, 'intermediate_supervision': True, 'val_interval': 1000, 'epochs': 999999, 'max_iter': 40000, 'optimizer': 'adam', 'optimizer_kwargs': {'lr': 2e-05, 'weight_decay': 1e-06}, 'scheduler': 'poly', 'model': 'TransformerNet', 'backbone': 'vitL', 'head': 'mlp', 'embed_dim': 512, 'mtt_resolution_downsample_rate': 2, 'PRED_OUT_NUM_CONSTANT': 64, 'task_dictionary': {'include_semseg': True, 'include_human_parts': True, 'include_sal': True, 'include_edge': True, 'include_normals': True, 'edge_w': 0.95}, 'loss_kwargs': {'loss_weights': {'semseg': 1.0, 'human_parts': 2.0, 'sal': 5.0, 'edge': 50.0, 'normals': 10.0}}, 'TASKS': {'NAMES': ['semseg', 'human_parts', 'sal', 'normals', 'edge'], 'NUM_OUTPUT': {'semseg': 21, 'human_parts': 7, 'sal': 2, 'normals': 3, 'edge': 1}, 'FLAGVALS': {'image': 2, 'semseg': 0, 'human_parts': 0, 'sal': 0, 'normals': 2, 'edge': 0}, 'INFER_FLAGVALS': {'semseg': 0, 'human_parts': 0, 'sal': 1, 'normals': 1, 'edge': 1}}, 'edge_w': 0.95, 'eval_edge': False, 'TRAIN': {'SCALE': [512, 512]}, 'TEST': {'SCALE': [512, 512]}, 'root_dir': '../InvPT_pascal_vitLp16', 'output_dir': '../InvPT_pascal_vitLp16', 'save_dir': '../InvPT_pascal_vitLp16/results', 'checkpoint': '../InvPT_pascal_vitLp16/checkpoint.pth.tar', 'run_mode': 'train', 'db_paths': {'PASCALContext': './dataset/PASCALContext', 'NYUD_MT': './dataset/NYUDv2'}, 'PROJECT_ROOT_DIR': ''}
Tensorboard dir: ../InvPT_pascal_vitLp16/tb_dir
Optimizer uses a single parameter group - (Default)
Preparing train dataset for db: PASCALContext

Initializing dataloader for PASCAL train set
Traceback (most recent call last):
File "/home/cenchaojun/phd2code/invpt/main.py", line 169, in
main()
File "/home/cenchaojun/phd2code/invpt/main.py", line 104, in main
train_dataset = get_train_dataset(p, train_transforms)
File "/home/cenchaojun/phd2code/invpt/utils/common_config.py", line 96, in get_train_dataset
database = PASCALContext(p.db_paths['PASCALContext'], download=False, split=['train'], transform=transforms, retname=True,
File "/home/cenchaojun/phd2code/invpt/data/pascal_context.py", line 174, in init
assert os.path.isfile(_human_part)
AssertionError
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 10820) of binary: /home/cenchaojun/.conda/envs/invpt/bin/python3.8
Traceback (most recent call last):
File "/home/cenchaojun/.conda/envs/invpt/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in
main()
File "/home/cenchaojun/.conda/envs/invpt/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/home/cenchaojun/.conda/envs/invpt/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/home/cenchaojun/.conda/envs/invpt/lib/python3.8/site-packages/torch/distributed/run.py", line 710, in run
elastic_launch(
File "/home/cenchaojun/.conda/envs/invpt/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/cenchaojun/.conda/envs/invpt/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 259, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

/home/cenchaojun/phd2code/invpt/main.py FAILED

Failures:
<NO_OTHER_FAILURES>

Root Cause (first observed failure):
[0]:
time : 2022-11-21_16:07:10
host : cieelab
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 10820)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
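As a side note, the AssertionError above is raised while data/pascal_context.py checks that a per-image human-parts annotation file exists, which usually means the dataset under ./dataset/PASCALContext is missing or incomplete. Below is a minimal sanity-check sketch; the folder names are assumptions based on typical PASCAL-Context multi-task layouts, not taken from this repository's code.

import os

# Assumed layout under the db_paths entry shown in the config above;
# adjust the names to whatever data/pascal_context.py actually expects.
db_root = './dataset/PASCALContext'
expected = ['JPEGImages', 'semseg', 'human_parts', 'pascal-context']

for name in expected:
    path = os.path.join(db_root, name)
    print(('OK      ' if os.path.isdir(path) else 'MISSING ') + path)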

Dataset link is down

Hi,

Thanks so much for sharing the code of these great works! I find that the dataset download links are down, e.g.:

wget https://data.vision.ee.ethz.ch/brdavid/atrc/NYUDv2.tar.gz
wget https://data.vision.ee.ethz.ch/brdavid/atrc/PASCALContext.tar.gz

Could you please share your copy of the dataset somewhere else? The dataset on Google Drive https://drive.google.com/file/d/14EAEMXmd3zs2hIMY63UhHPSFPDAkiTzw/view?usp=sharing might differ from the one used in the paper. When using the Google Drive dataset there is a resolution issue: the test data has resolution 480x640, but we need it to be 448x576 during evaluation. However, the code for preprocessing the NYU test data uses "transforms.PadImage(size=p.TEST.SCALE)", which requires the raw test image to be smaller than 448x576.
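One possible workaround (purely an assumption, not the preprocessing used in the paper) is to center-crop the 480x640 test images to 448x576 before they reach PadImage, roughly like this sketch (the file name is hypothetical):

import numpy as np
from PIL import Image

def center_crop_nyud(img, target_h=448, target_w=576):
    # Crop a 480x640 NYUD test image to 448x576 around its center.
    arr = np.array(img)
    top = (arr.shape[0] - target_h) // 2
    left = (arr.shape[1] - target_w) // 2
    return Image.fromarray(arr[top:top + target_h, left:left + target_w])

cropped = center_crop_nyud(Image.open('nyud_test_rgb.png'))  # hypothetical input file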

Thanks!

odsF code

Hello, does the project provide code for calculating the optimal-dataset-scale F-measure (odsF) for edge detection?

Inference

Hi,
I want to perform inference for human parsing using this model. What inputs are required apart from the input image?

I have already added the input image to the JPEGImages directory. While running inference I am getting an assertion error at the line below:
File "/content/drive/MyDrive/POCs/InvPT/data/pascal_context.py", line 165, in init
assert os.path.isfile(_edge)
AssertionError

ViT-B version

Thanks for your great work. I would like to ask how the ViT-B baseline in your paper is configured (vit_base_patch16_384, vit_base_patch32_224, or something else). I tried to reproduce the performance reported in the paper without success; the gap is fairly large, and I suspect ViT-B is being used differently.
Looking forward to your reply, thanks very much.
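For context, one way to enumerate the candidate ViT-B model names shipped with the installed timm version before experimenting is the listing sketch below; it does not answer which variant the paper actually used.

import timm

# Enumerate ViT-Base candidates available in the installed timm version.
for name in timm.list_models('vit_base_patch16*') + timm.list_models('vit_base_patch32*'):
    print(name)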

Error reading pre-trained weights

The command I ran was: python inference.py --image_path=image.jpg --ckp_path=data.pkl --save_dir=out.jpg

Hello, I get this error when trying to evaluate the pre-trained model:
File "inference.py", line 193, in infer
model = initialize_model(p, checkpoint_path)
File "inference.py", line 15, in initialize_model
ckp = torch.load(checkpoint_path, map_location='cpu')
File "C:\Users\ct\anaconda3\lib\site-packages\torch\serialization.py", line 608, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "C:\Users\ct\anaconda3\lib\site-packages\torch\serialization.py", line 777, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified.
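This UnpicklingError usually means the file passed to --ckp_path is not a regular PyTorch checkpoint (here a data.pkl was passed rather than the released .pth.tar). A quick sketch to verify that a checkpoint loads and to inspect its top-level keys; the file name below is the released InvPT checkpoint, everything else is generic:

import torch

ckp_path = 'InvPT_pascal_vitLp16.pth.tar'  # the released checkpoint, not a data.pkl
ckp = torch.load(ckp_path, map_location='cpu')
print(type(ckp))
if isinstance(ckp, dict):
    print(list(ckp.keys()))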

The inference process is stuck.

When I use the pre-trained model on NYUD-v2, the inference process gets stuck. Here is my command:

CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch \
        --nproc_per_node=1  \
        --master_port=$((RANDOM%1000+12000)) \
         main.py \
         --config_exp './configs/nyud/nyud_vitLp16_taskprompter.yml' \
         --run_mode infer  \
         --trained_model  TaskPrompter_nyud_vitLp16.pth.tar

The log is as follows:

Number of dataset images: 654
Use checkpoint TaskPrompter_nyud_vitLp16.pth.tar
Infer at iteration 0
  0%|    

Script for swin baselines

Sorry to bother you again. I am wondering if you could provide code for reproducing the Swin baselines in the paper. That would be very helpful, thanks very much.

Issue running inference.py - proper requirements file needed

  • Downloaded pretrained model for cityscapes-3d

  • installed requirements as mentioned here #23

  • while running inference script

!CUDA_VISIBLE_DEVICES=0 python inference.py --config_path='./configs/cityscapes3d/cs_swinB_taskprompter.yml' --image_path='/content/IMG_20200209_2301.jpg' --ckp_path='/content/drive/MyDrive/YOLOv8/TaskPrompter_CS_swinB_v2.pth.tar' --save_dir='/content/'

the error occurs as follows:

File "/content/Multi-Task-Transformer/TaskPrompter/detection_toolbox/det_losses.py", line 7, in <module>
    import mmcv
ModuleNotFoundError: No module named 'mmcv' 
  • As an expected solution, I tried pip install mmcv, but it takes an indefinite time (waited >20 minutes) at "Building wheels for collected packages: mmcv".
  • Also tried some solutions from the "Installation takes forever" mmcv issue, but none of them worked.

System Specifications

  • Google Colab T4 GPU
  • Python 3.10.12
  • Cuda
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

P.S. - As per the README this is built using Python 3.7, but I had no idea about the other library versions, so I continued without specifying any.

@prismformore

RuntimeError: value cannot be converted to type float without overflow: (1.37209e-09,-4.45819e-10)

Traceback (most recent call last):
File "main.py", line 172, in
main()
File "main.py", line 148, in main
end_signal, iter_count = train_phase(p, args, train_dataloader, test_dataloader, model, criterion, optimizer, scheduler, epoch, tb_writer_train, tb_writer_test, iter_count)
File "/public/home/ws/InvPT/utils/train_utils.py", line 41, in train_phase
5%|████▊ | 9/198 [00:08<03:04, 1.02it/s]
Traceback (most recent call last):
File "main.py", line 172, in
main()
File "main.py", line 148, in main
end_signal, iter_count = train_phase(p, args, train_dataloader, test_dataloader, model, criterion, optimizer, scheduler, epoch, tb_writer_train, tb_writer_test, iter_count)
File "/public/home/ws/vpt/InvPT/utils/train_utils.py", line 41, in train_phase
optimizer.step()
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/optim/lr_scheduler.py", line 65, in wrapper optimizer.step()
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/optim/lr_scheduler.py", line 65, in wrapper return wrapped(*args, **kwargs)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/optim/optimizer.py", line 88, in wrapper return wrapped(*args, **kwargs)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/optim/optimizer.py", line 88, in wrapper return func(*args, **kwargs)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/optim/optimizer.py", line 88, in wrappe[47/1951] return func(*args, **kwargs)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context return func(*args, **kwargs)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/optim/adam.py", line 144, in step
return func(*args, **kwargs)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/optim/adam.py", line 144, in step
eps=group['eps'])
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/optim/functional.py", line 98, in adam
eps=group['eps'])
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/optim/functional.py", line 98, in adam param.addcdiv(exp_avg, denom, value=-step_size)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/optim/functional.py", line 98, in adam[33/1951]
param.addcdiv
(exp_avg, denom, value=-step_size)
RuntimeError: value cannot be converted to type float without overflow: (1.37209e-09,-4.45819e-10)
param.addcdiv
(exp_avg, denom, value=-step_size)
RuntimeError: value cannot be converted to type float without overflow: (1.37209e-09,-4.45819e-10)
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 182046) of binary: /public/home/ws/Anacondas
/anaconda3/envs/invpt/bin/python
Traceback (most recent call last):
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/distributed/launch.py", line 193, in
main() File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in main launch(args)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in ma[19/1951] launch(args)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/distributed/launch.py", line 174, in launch run(args)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/distributed/run.py", line 713, in run
)(*cmd_args)
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 131, in call return launch_agent(self._config, self._entrypoint, list(args))
File "/public/home/ws/Anacondas/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

main.py FAILED

main.py FAILED

Failures:
[1]:
time : 2022-09-21_15:42:37
host : ai_gpu02
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 182053)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
time : 2022-09-21_15:42:37
host : ai_gpu02
rank : 0 (local_rank: 0)
host : ai_gpu02
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 182053)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
time : 2022-09-21_15:42:37
host : ai_gpu02
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 182046)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Excuse me, I hit this error after training for 39000+ iterations; how can I solve it? Is this problem related to the number of GPUs used? I used two cards for training.
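For what it's worth, one common mitigation for overflows in the Adam update like this (a hedged suggestion only, not a fix confirmed by the authors) is to clip gradients right before optimizer.step() in utils/train_utils.py; model and optimizer below refer to the objects already present in that training loop:

import torch

# Hypothetical placement inside the training loop, just before optimizer.step().
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
optimizer.step()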

odsF result of 76.8

I'm very sorry to bother you again; it's still a question about odsF. I used the checkpoint released on your GitHub and the test result is 76.8 (with sessim, maxdist=0.011 and HED.txt), which is 1 lower than the 77.8 you reported. This is a better result, and the evaluation results of the other tasks do not show any issues. I haven't found any problems so far, and this has been bothering me for several days. What could be the problem? I would be very grateful if you could reply.

Human Parts

Hello, thank you very much for your outstanding contribution. I easily found methods for coloring semantic segmentation prediction maps online, but there are few methods for coloring PASCAL-Context human-part segmentation. I want to know how you color the human-part segmentation prediction maps?
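As a generic reference, one way to colorize a human-parts prediction map is to index a color lookup table with the predicted class ids; the 7-entry palette and the part order below are assumptions (matching the 7 human_parts outputs in the config above), not necessarily the colors used in the paper figures:

import numpy as np
from PIL import Image

# Assumed 7-class palette for PASCAL-Context human parts (background + 6 parts).
PALETTE = np.array([
    [0, 0, 0],        # background
    [128, 0, 0],      # head (assumed order)
    [0, 128, 0],      # torso
    [128, 128, 0],    # upper arms
    [0, 0, 128],      # lower arms
    [128, 0, 128],    # upper legs
    [0, 128, 128],    # lower legs
], dtype=np.uint8)

def colorize(pred):
    # pred: HxW array of integer class ids in [0, 6].
    return Image.fromarray(PALETTE[pred])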

load state_dict failed

I think the 'model' param in pascal_vitLp16.yml should be InvPT, and there should be some model-loading code inside the get_model function in utils/common_config.py.

Use checkpoint pretrain/InvPT_pascal_vitLp16.pth.tar
Traceback (most recent call last):
  File "main.py", line 169, in <module>
    main()
  File "main.py", line 121, in main
    model.load_state_dict(checkpoint['model'])
  File "/root/anaconda3/envs/invpt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1483, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DistributedDataParallel:
        Unexpected key(s) in state_dict: "module.multi_task_decoder.intermediate_head.semseg.weight", "module.multi_task_decoder.intermediate_head.semseg.bias", "module.multi_task_decoder.intermediate_head.sal.weight", "module.multi_task_decoder.intermediate_head.sal.bias", "module.multi_task_decoder.intermediate_head.normals.weight", "module.multi_task_decoder.intermediate_head.normals.bias", "module.multi_task_decoder.intermediate_head.edge.weight", "module.multi_task_decoder.intermediate_head.edge.bias", "module.multi_task_decoder.invpt.redu_chan.0.1.weight", "module.multi_task_decoder.invpt.redu_chan.0.1.bias", "module.multi_task_decoder.invpt.redu_chan.0.2.weight", "module.multi_task_decoder.invpt.redu_chan.0.2.bias", "module.multi_task_decoder.invpt.redu_chan.0.3.weight", "module.multi_task_decoder.invpt.redu_chan.0.3.bias", "module.multi_task_decoder.invpt.redu_chan.0.4.weight", "module.multi_task_decoder.invpt.redu_chan.0.4.bias", "module.multi_task_decoder.invpt.redu_chan.1.1.weight", "module.multi_task_decoder.invpt.redu_chan.1.1.bias", "module.multi_task_decoder.invpt.redu_chan.1.2.weight", "module.multi_task_decoder.invpt.redu_chan.1.2.bias", "module.multi_task_decoder.invpt.redu_chan.1.3.weight", "module.multi_task_decoder.invpt.redu_chan.1.3.bias", "module.multi_task_decoder.invpt.redu_chan.1.4.weight", "module.multi_task_decoder.invpt.redu_chan.1.4.bias", "module.multi_task_decoder.invpt.redu_chan.2.1.weight", "module.multi_task_decoder.invpt.redu_chan.2.1.bias", "module.multi_task_decoder.invpt.redu_chan.2.2.weight", "module.multi_task_decoder.invpt.redu_chan.2.2.bias", "module.multi_task_decoder.invpt.redu_chan.2.3.weight", "module.multi_task_decoder.invpt.redu_chan.2.3.bias", "module.multi_task_decoder.invpt.redu_chan.2.4.weight", "module.multi_task_decoder.invpt.redu_chan.2.4.bias", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.1.conv.weight", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.1.bn.weight", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.1.bn.bias", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.1.bn.running_mean", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.1.bn.running_var", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.1.bn.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.2.conv.weight", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.2.bn.weight", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.2.bn.bias", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.2.bn.running_mean", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.2.bn.running_var", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.2.bn.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.3.conv.weight", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.3.bn.weight", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.3.bn.bias", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.3.bn.running_mean", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.3.bn.running_var", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.3.bn.num_batches_tracked", 
"module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.4.conv.weight", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.4.bn.weight", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.4.bn.bias", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.4.bn.running_mean", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.4.bn.running_var", "module.multi_task_decoder.invpt.invpt_stages.0.blocks.0.attn.conv_proj_q.4.bn.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.1.proj.1.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.1.proj.2.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.1.proj.2.bias", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.1.proj.2.running_mean", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.1.proj.2.running_var", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.1.proj.2.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.1.proj.4.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.1.proj.5.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.1.proj.5.bias", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.1.proj.5.running_mean", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.1.proj.5.running_var", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.1.proj.5.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.2.proj.1.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.2.proj.2.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.2.proj.2.bias", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.2.proj.2.running_mean", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.2.proj.2.running_var", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.2.proj.2.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.2.proj.4.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.2.proj.5.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.2.proj.5.bias", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.2.proj.5.running_mean", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.2.proj.5.running_var", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.2.proj.5.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.3.proj.1.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.3.proj.2.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.3.proj.2.bias", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.3.proj.2.running_mean", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.3.proj.2.running_var", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.3.proj.2.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.3.proj.4.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.3.proj.5.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.3.proj.5.bias", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.3.proj.5.running_mean", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.3.proj.5.running_var", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.3.proj.5.num_batches_tracked", 
"module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.4.proj.1.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.4.proj.2.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.4.proj.2.bias", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.4.proj.2.running_mean", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.4.proj.2.running_var", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.4.proj.2.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.4.proj.4.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.4.proj.5.weight", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.4.proj.5.bias", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.4.proj.5.running_mean", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.4.proj.5.running_var", "module.multi_task_decoder.invpt.invpt_stages.1.patch_embed.4.proj.5.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.1.conv.weight", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.1.bn.weight", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.1.bn.bias", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.1.bn.running_mean", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.1.bn.running_var", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.1.bn.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.2.conv.weight", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.2.bn.weight", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.2.bn.bias", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.2.bn.running_mean", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.2.bn.running_var", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.2.bn.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.3.conv.weight", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.3.bn.weight", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.3.bn.bias", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.3.bn.running_mean", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.3.bn.running_var", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.3.bn.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.4.conv.weight", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.4.bn.weight", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.4.bn.bias", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.4.bn.running_mean", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.4.bn.running_var", "module.multi_task_decoder.invpt.invpt_stages.1.blocks.0.attn.conv_proj_q.4.bn.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.1.proj.1.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.1.proj.2.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.1.proj.2.bias", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.1.proj.2.running_mean", 
"module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.1.proj.2.running_var", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.1.proj.2.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.1.proj.4.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.1.proj.5.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.1.proj.5.bias", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.1.proj.5.running_mean", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.1.proj.5.running_var", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.1.proj.5.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.2.proj.1.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.2.proj.2.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.2.proj.2.bias", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.2.proj.2.running_mean", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.2.proj.2.running_var", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.2.proj.2.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.2.proj.4.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.2.proj.5.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.2.proj.5.bias", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.2.proj.5.running_mean", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.2.proj.5.running_var", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.2.proj.5.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.3.proj.1.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.3.proj.2.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.3.proj.2.bias", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.3.proj.2.running_mean", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.3.proj.2.running_var", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.3.proj.2.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.3.proj.4.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.3.proj.5.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.3.proj.5.bias", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.3.proj.5.running_mean", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.3.proj.5.running_var", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.3.proj.5.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.4.proj.1.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.4.proj.2.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.4.proj.2.bias", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.4.proj.2.running_mean", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.4.proj.2.running_var", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.4.proj.2.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.4.proj.4.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.4.proj.5.weight", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.4.proj.5.bias", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.4.proj.5.running_mean", "module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.4.proj.5.running_var", 
"module.multi_task_decoder.invpt.invpt_stages.2.patch_embed.4.proj.5.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.1.conv.weight", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.1.bn.weight", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.1.bn.bias", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.1.bn.running_mean", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.1.bn.running_var", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.1.bn.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.2.conv.weight", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.2.bn.weight", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.2.bn.bias", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.2.bn.running_mean", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.2.bn.running_var", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.2.bn.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.3.conv.weight", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.3.bn.weight", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.3.bn.bias", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.3.bn.running_mean", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.3.bn.running_var", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.3.bn.num_batches_tracked", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.4.conv.weight", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.4.bn.weight", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.4.bn.bias", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.4.bn.running_mean", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.4.bn.running_var", "module.multi_task_decoder.invpt.invpt_stages.2.blocks.0.attn.conv_proj_q.4.bn.num_batches_tracked", "module.multi_task_decoder.invpt.mt_proj.semseg.0.weight", "module.multi_task_decoder.invpt.mt_proj.semseg.0.bias", "module.multi_task_decoder.invpt.mt_proj.semseg.1.weight", "module.multi_task_decoder.invpt.mt_proj.semseg.1.bias", "module.multi_task_decoder.invpt.mt_proj.semseg.1.running_mean", "module.multi_task_decoder.invpt.mt_proj.semseg.1.running_var", "module.multi_task_decoder.invpt.mt_proj.semseg.1.num_batches_tracked", "module.multi_task_decoder.invpt.mt_proj.sal.0.weight", "module.multi_task_decoder.invpt.mt_proj.sal.0.bias", "module.multi_task_decoder.invpt.mt_proj.sal.1.weight", "module.multi_task_decoder.invpt.mt_proj.sal.1.bias", "module.multi_task_decoder.invpt.mt_proj.sal.1.running_mean", "module.multi_task_decoder.invpt.mt_proj.sal.1.running_var", "module.multi_task_decoder.invpt.mt_proj.sal.1.num_batches_tracked", "module.multi_task_decoder.invpt.mt_proj.normals.0.weight", "module.multi_task_decoder.invpt.mt_proj.normals.0.bias", "module.multi_task_decoder.invpt.mt_proj.normals.1.weight", "module.multi_task_decoder.invpt.mt_proj.normals.1.bias", "module.multi_task_decoder.invpt.mt_proj.normals.1.running_mean", "module.multi_task_decoder.invpt.mt_proj.normals.1.running_var", 
"module.multi_task_decoder.invpt.mt_proj.normals.1.num_batches_tracked", "module.multi_task_decoder.invpt.mt_proj.edge.0.weight", "module.multi_task_decoder.invpt.mt_proj.edge.0.bias", "module.multi_task_decoder.invpt.mt_proj.edge.1.weight", "module.multi_task_decoder.invpt.mt_proj.edge.1.bias", "module.multi_task_decoder.invpt.mt_proj.edge.1.running_mean", "module.multi_task_decoder.invpt.mt_proj.edge.1.running_var", "module.multi_task_decoder.invpt.mt_proj.edge.1.num_batches_tracked", "module.multi_task_decoder.invpt.mix_proj.semseg.0.weight", "module.multi_task_decoder.invpt.mix_proj.semseg.0.bias", "module.multi_task_decoder.invpt.mix_proj.sal.0.weight", "module.multi_task_decoder.invpt.mix_proj.sal.0.bias", "module.multi_task_decoder.invpt.mix_proj.normals.0.weight", "module.multi_task_decoder.invpt.mix_proj.normals.0.bias", "module.multi_task_decoder.invpt.mix_proj.edge.0.weight", "module.multi_task_decoder.invpt.mix_proj.edge.0.bias", "module.multi_task_decoder.preliminary_decoder.semseg.0.conv.weight", "module.multi_task_decoder.preliminary_decoder.semseg.0.bn1.weight", "module.multi_task_decoder.preliminary_decoder.semseg.0.bn1.bias", "module.multi_task_decoder.preliminary_decoder.semseg.0.bn1.running_mean", "module.multi_task_decoder.preliminary_decoder.semseg.0.bn1.running_var", "module.multi_task_decoder.preliminary_decoder.semseg.0.bn1.num_batches_tracked", "module.multi_task_decoder.preliminary_decoder.semseg.1.conv.weight", "module.multi_task_decoder.preliminary_decoder.semseg.1.bn1.weight", "module.multi_task_decoder.preliminary_decoder.semseg.1.bn1.bias", "module.multi_task_decoder.preliminary_decoder.semseg.1.bn1.running_mean", "module.multi_task_decoder.preliminary_decoder.semseg.1.bn1.running_var", "module.multi_task_decoder.preliminary_decoder.semseg.1.bn1.num_batches_tracked", "module.multi_task_decoder.preliminary_decoder.sal.0.conv.weight", "module.multi_task_decoder.preliminary_decoder.sal.0.bn1.weight", "module.multi_task_decoder.preliminary_decoder.sal.0.bn1.bias", "module.multi_task_decoder.preliminary_decoder.sal.0.bn1.running_mean", "module.multi_task_decoder.preliminary_decoder.sal.0.bn1.running_var", "module.multi_task_decoder.preliminary_decoder.sal.0.bn1.num_batches_tracked", "module.multi_task_decoder.preliminary_decoder.sal.1.conv.weight", "module.multi_task_decoder.preliminary_decoder.sal.1.bn1.weight", "module.multi_task_decoder.preliminary_decoder.sal.1.bn1.bias", "module.multi_task_decoder.preliminary_decoder.sal.1.bn1.running_mean", "module.multi_task_decoder.preliminary_decoder.sal.1.bn1.running_var", "module.multi_task_decoder.preliminary_decoder.sal.1.bn1.num_batches_tracked", "module.multi_task_decoder.preliminary_decoder.normals.0.conv.weight", "module.multi_task_decoder.preliminary_decoder.normals.0.bn1.weight", "module.multi_task_decoder.preliminary_decoder.normals.0.bn1.bias", "module.multi_task_decoder.preliminary_decoder.normals.0.bn1.running_mean", "module.multi_task_decoder.preliminary_decoder.normals.0.bn1.running_var", "module.multi_task_decoder.preliminary_decoder.normals.0.bn1.num_batches_tracked", "module.multi_task_decoder.preliminary_decoder.normals.1.conv.weight", "module.multi_task_decoder.preliminary_decoder.normals.1.bn1.weight", "module.multi_task_decoder.preliminary_decoder.normals.1.bn1.bias", "module.multi_task_decoder.preliminary_decoder.normals.1.bn1.running_mean", "module.multi_task_decoder.preliminary_decoder.normals.1.bn1.running_var", "module.multi_task_decoder.preliminary_decoder.normals.1.bn1.num_batches_tracked", 
"module.multi_task_decoder.preliminary_decoder.edge.0.conv.weight", "module.multi_task_decoder.preliminary_decoder.edge.0.bn1.weight", "module.multi_task_decoder.preliminary_decoder.edge.0.bn1.bias", "module.multi_task_decoder.preliminary_decoder.edge.0.bn1.running_mean", "module.multi_task_decoder.preliminary_decoder.edge.0.bn1.running_var", "module.multi_task_decoder.preliminary_decoder.edge.0.bn1.num_batches_tracked", "module.multi_task_decoder.preliminary_decoder.edge.1.conv.weight", "module.multi_task_decoder.preliminary_decoder.edge.1.bn1.weight", "module.multi_task_decoder.preliminary_decoder.edge.1.bn1.bias", "module.multi_task_decoder.preliminary_decoder.edge.1.bn1.running_mean", "module.multi_task_decoder.preliminary_decoder.edge.1.bn1.running_var", "module.multi_task_decoder.preliminary_decoder.edge.1.bn1.num_batches_tracked", "module.heads.semseg.linear_pred.weight", "module.heads.semseg.linear_pred.bias", "module.heads.sal.linear_pred.weight", "module.heads.sal.linear_pred.bias", "module.heads.normals.linear_pred.weight", "module.heads.normals.linear_pred.bias", "module.heads.edge.linear_pred.weight", "module.heads.edge.linear_pred.bias". 
        size mismatch for module.multi_task_decoder.invpt.norm_mts.0.weight: copying a param with shape torch.Size([2880]) from checkpoint, the shape in current model is torch.Size([576]).
        size mismatch for module.multi_task_decoder.invpt.norm_mts.0.bias: copying a param with shape torch.Size([2880]) from checkpoint, the shape in current model is torch.Size([576]).
        size mismatch for module.multi_task_decoder.invpt.norm_mts.1.weight: copying a param with shape torch.Size([1440]) from checkpoint, the shape in current model is torch.Size([288]).
        size mismatch for module.multi_task_decoder.invpt.norm_mts.1.bias: copying a param with shape torch.Size([1440]) from checkpoint, the shape in current model is torch.Size([288]).
        size mismatch for module.multi_task_decoder.invpt.norm_mts.2.weight: copying a param with shape torch.Size([720]) from checkpoint, the shape in current model is torch.Size([144]).
        size mismatch for module.multi_task_decoder.invpt.norm_mts.2.bias: copying a param with shape torch.Size([720]) from checkpoint, the shape in current model is torch.Size([144]).
        size mismatch for module.multi_task_decoder.invpt.norm_mt.weight: copying a param with shape torch.Size([720]) from checkpoint, the shape in current model is torch.Size([144]).
        size mismatch for module.multi_task_decoder.invpt.norm_mt.bias: copying a param with shape torch.Size([720]) from checkpoint, the shape in current model is torch.Size([144]).

Questions regarding code for DAVIS demo

Hi,

Thanks for your nice work.

I am very interested in your work, but it seems that your repository does not contain the code for the DAVIS demo. Could you upload the related code?

Best regards

depth files on NYUDv2

Thanks for your work! The values in the NYUDv2 npy files seem to be real depth in meters. When training a single depth-estimation task decoder, it is hard to converge with an L1 loss on the provided depth files. Did you do any preprocessing, such as normalizing the values to [0, 1]?
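If some normalization turns out to be needed (an assumption; the repository may well train directly on metric depth), one simple option is to rescale metric depth into [0, 1] with an assumed maximum depth, as in this sketch:

import numpy as np

MAX_DEPTH = 10.0  # assumed upper bound in meters for NYUDv2 indoor scenes

def normalize_depth(depth_m):
    # depth_m: HxW array of metric depth values from the provided .npy files.
    return np.clip(depth_m / MAX_DEPTH, 0.0, 1.0)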

We assume all the images have the same size!!!

Dear @prismformore

I am running the inference script on a single image downloaded from the internet. I successfully installed all libraries without downloading the Cityscapes dataset, because I do not want to train a model.

CUDA_VISIBLE_DEVICES=0 python3 inference.py --config_path=configs/pascal/pascal_vitLp16_taskprompter.yml --image_path=fff.jpeg --ckp_path=TaskPrompter_pascal_vitLp16.pth.tar --save_dir=output

Input: fff.jpeg (image attached)

Traceback

/home/cvpr/anaconda3/envs/ICCV2023/lib/python3.7/site-packages/torch/nn/functional.py:3635: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode)
/media/cvpr/CM_24/Multi-Task-Transformer/TaskPrompter/utils/visualization_utils.py:122: UserWarning: Warning: We assume all the images have the same size!!!
  warnings.warn('Warning: We assume all the images have the same size!!!')

how to get bodyparts from images?

using the pretrained model and running the code as CUDA_VISIBLE_DEVICES=0 python inference.py --image_path=/home/hsattar/Multi-Task-Transformer/InvPT/test_data/front.jpg --ckp_path=/home/hsattar/Multi-Task-Transformer/InvPT/checkpoint/InvPT_pascal_vitLp16.pth.tar --save_dir=/home/hsattar/Multi-Task-Transformer/InvPT/test_data/result.png

I get the output shown in the attached result images, but no body parts. Do I need to set something somewhere?

t-SNE

Hello, I'm glad to reproduce your work. I want to know how your 3D t-SNE map is implemented. Could you send me the code, if it is convenient for you?
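As a generic reference, a 3D t-SNE visualization with scikit-learn and matplotlib looks like the sketch below; this is not the authors' code, and feats/labels are placeholders for whatever features are being embedded:

import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (needed on older matplotlib)

feats = np.random.randn(500, 64)            # placeholder features, e.g. task tokens
labels = np.random.randint(0, 5, size=500)  # placeholder task/class labels

# Embed into 3 dimensions; Barnes-Hut t-SNE supports up to 3 components.
emb = TSNE(n_components=3, init='pca', perplexity=30).fit_transform(feats)

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(emb[:, 0], emb[:, 1], emb[:, 2], c=labels, s=5, cmap='tab10')
plt.savefig('tsne_3d.png', dpi=200)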

epoch 

Hello, thank you for your contribution. I wonder whether you really need to train for 999999 epochs?

How does the Swin Transformer deal with an input image of size 448 x 576?

Hello! Thanks for your work! In the paper titled "InvPT: Inverted Pyramid Multi-task Transformer for Dense Scene Understanding", you mention in the Appendix (Data Processing section) that on NYUD-v2 you randomly crop the input image to a size of 448 x 576 for the Swin Transformer. With this setting, after the Swin Transformer the image is downsampled by 32x, giving a feature map of size 14 x 18. But how can the official window partition be performed on such a feature map, since window partition requires the spatial size to be divisible by the window size, and the greatest common divisor of (14, 18) is only 2? It seems a little problematic, and we can't use the official pretrained Swin backbone. Could you please provide more details, or release the code for the Swin-based model?
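One common workaround (an assumption about what a Swin variant might do here, not something confirmed by the paper) is to pad the 14 x 18 feature map up to a multiple of the window size before window partition and crop back afterwards, e.g. with a 7 x 7 window:

import torch
import torch.nn.functional as F

window = 7
x = torch.randn(1, 14, 18, 512)  # (B, H, W, C) feature map after 32x downsampling

pad_h = (window - x.shape[1] % window) % window  # 14 is already a multiple of 7 -> 0
pad_w = (window - x.shape[2] % window) % window  # 18 -> pad by 3 -> 21
x = F.pad(x, (0, 0, 0, pad_w, 0, pad_h))         # pad order: C, then W, then H (channel-last)
print(x.shape)  # torch.Size([1, 14, 21, 512]) -> 2 x 3 windows of size 7 x 7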

TypeError: TaskPrompter.__init__() got an unexpected keyword argument 'default_cfg'

Steps done:

  1. Clone repo
  2. Download .pth.tar files
  3. Run below commands
CUDA_VISIBLE_DEVICES=0
!python3 inference.py --config_path=configs/pascal/pascal_vitLp16_taskprompter.yml --image_path=/content/Screenshot7.png --ckp_path=/content/Multi-Task-Transformer/TaskPrompter/InvPT_pascal_vitLp16.pth.tar --save_dir=output

Error

Traceback (most recent call last):
  File "/content/Multi-Task-Transformer/TaskPrompter/inference.py", line 185, in <module>
    infer_one_image(args.image_path)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/content/Multi-Task-Transformer/TaskPrompter/inference.py", line 141, in infer_one_image
    model = initialize_model(p, checkpoint_path)
  File "/content/Multi-Task-Transformer/TaskPrompter/inference.py", line 60, in initialize_model
    model = get_model(p)
  File "/content/Multi-Task-Transformer/TaskPrompter/utils/common_config.py", line 79, in get_model
    backbone, backbone_channels = get_backbone(p)
  File "/content/Multi-Task-Transformer/TaskPrompter/utils/common_config.py", line 22, in get_backbone
    backbone = taskprompter_vit_large_patch16_384(p=p, pretrained=True, drop_path_rate=0.15, img_size=p.TRAIN.SCALE)
  File "/content/Multi-Task-Transformer/TaskPrompter/models/transformers/taskprompter.py", line 676, in taskprompter_vit_large_patch16_384
    model = _create_task_prompter('vit_large_patch16_384', pretrained=pretrained, **model_kwargs)
  File "/content/Multi-Task-Transformer/TaskPrompter/models/transformers/taskprompter.py", line 661, in _create_task_prompter
    model = build_model_with_cfg(
  File "/usr/local/lib/python3.10/dist-packages/timm/models/_builder.py", line 385, in build_model_with_cfg
    model = model_cls(**kwargs)
TypeError: TaskPrompter.__init__() got an unexpected keyword argument 'default_cfg'

Trying another solution from closed issue #10

CUDA_VISIBLE_DEVICES=0 
!python inference.py --image_path=/content/Screenshot7.png --ckp_path=/content/Multi-Task-Transformer/TaskPrompter/InvPT_pascal_vitLp16.pth.tar --save_dir=SAVE_DIR

Error

Traceback (most recent call last):
  File "/content/Multi-Task-Transformer/TaskPrompter/inference.py", line 185, in <module>
    infer_one_image(args.image_path)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/content/Multi-Task-Transformer/TaskPrompter/inference.py", line 121, in infer_one_image
    p = create_config(args.config_path, {'run_mode': 'infer'})
  File "/content/Multi-Task-Transformer/TaskPrompter/utils/config.py", line 94, in create_config
    with open(exp_file, 'r') as stream:
FileNotFoundError: [Errno 2] No such file or directory: './configs/pascal/pascal_vitLp16.yml'

Platform
Google Colab with T4 runtime
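This TypeError typically appears when the installed timm is newer than the version the repository was written against, so build_model_with_cfg forwards kwargs such as default_cfg that the TaskPrompter class does not accept. A quick environment check (no specific version pins are implied here; the README only mentions Python 3.7):

import timm
import torch

# If timm here is much newer than what the repo was developed against,
# consider recreating the environment with older pins before running inference.
print('timm :', timm.__version__)
print('torch:', torch.__version__)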
