Dear authors, I am interested in your work and would like to learn m

A question about reproducing the result (environment setup), about lhoyer/mic

Comments (31)

lhoyer commented on August 18, 2024

Dear Kai,

Thanks a lot for your interest in our work! From the first look at it, nothing seems off about the config or environment. Can you please provide the entire log so that I can check some other details? Did you change anything on your local copy of the repository?

Best,
Lukas

wengkai0 commented on August 18, 2024

Dear Lukas,

Thank you for your reply and help! I ran it on a temporary machine and the log is lost, but I will run it again without changing the repository. I will let you know if the result is still not as expected.

Thanks!

Kai

lhoyer commented on August 18, 2024

Dear Kai,

Have you been successful?

In the meantime, I also freshly cloned the GitHub repository and tested configs/mic/gtaHR2csHR_mic_hrda.py on two different GPUs (Titan RTX and RTX 3090). On both machines, I was able to reproduce the results of the paper (76.0 and 75.8 mIoU).

Best,
Lukas

wengkai0 commented on August 18, 2024

Dear Lukas,

Sorry for the late reply and thanks for testing the result again! Something went wrong with the script before, so I re-ran it.
From what I can see, the performance after 24000 iterations is still not as expected. Could I upload the log file so far for you to check, if you are not busy? I will complete the training and upload the full log later if needed.

Thanks,
Kai

20221221_092942.log

lhoyer commented on August 18, 2024

Dear Kai,

Thanks. I'll have a look at the log. Can you please also provide the output of pip freeze so that I can check the library versions?

Best,
Lukas

wengkai0 commented on August 18, 2024

Dear Lukas,

Thanks very much for your help. I have attached the final log file below and a screenshot of the output of "pip freeze".

Kindly Regards
Kai

20221221_092942.log

lhoyer commented on August 18, 2024

Dear Kai,

I checked the logs and library versions. It seems that you have compiled mmcv-full with CUDA 11.1 while the installed PyTorch uses CUDA 11.0. This might be a reason for the problem. I would recommend compiling mmcv-full with CUDA 11.0.

Best,
Lukas
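
A mismatch like the one described above can be checked from Python. The following is only a minimal sketch: the helper names are hypothetical, and it assumes that `torch.version.cuda` and `mmcv.ops.get_compiling_cuda_version` are available in the installed PyTorch and mmcv-full builds. If either package is missing, the query returns `None` values instead of failing.

```python
import re
from typing import Optional, Tuple


def major_minor(version: Optional[str]) -> Optional[str]:
    """Reduce a CUDA version string such as '11.0.221' to '11.0'."""
    if not version:
        return None
    match = re.match(r"(\d+)\.(\d+)", version)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def cuda_versions_match(torch_cuda: Optional[str], mmcv_cuda: Optional[str]) -> bool:
    """True only if both versions are known and agree on major.minor."""
    a, b = major_minor(torch_cuda), major_minor(mmcv_cuda)
    return a is not None and a == b


def installed_cuda_versions() -> Tuple[Optional[str], Optional[str]]:
    """Query the installed packages; returns (torch_cuda, mmcv_cuda)."""
    try:
        import torch
        # Assumption: this helper ships with mmcv-full (not the lite mmcv package).
        from mmcv.ops import get_compiling_cuda_version
    except ImportError:
        return None, None
    return torch.version.cuda, get_compiling_cuda_version()


if __name__ == "__main__":
    torch_cuda, mmcv_cuda = installed_cuda_versions()
    print("torch CUDA:", torch_cuda, "| mmcv-full CUDA:", mmcv_cuda)
    print("match:", cuda_versions_match(torch_cuda, mmcv_cuda))
```

With the versions mentioned in the reply above, `cuda_versions_match("11.0", "11.1")` evaluates to `False`.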

wengkai0 commented on August 18, 2024

Dear Lukas,

Thanks so much for checking it. I am new to this field, though. Could you please give me some suggestions on how to compile mmcv-full with CUDA 11.0? (I use conda to set up the environment.)

Merry Christmas

Kindly Regards
Kai

lhoyer commented on August 18, 2024

Dear Kai,

To compile mmcv-full with CUDA 11.0, you need to use a system that has this version installed. It seems that you use containers, so maybe there is also a container with CUDA 11.0 available? Otherwise, you can find further information here: https://developer.nvidia.com/cuda-11.0-download-archive

Best,
Lukas

zyuanbing commented on August 18, 2024

Hi @wengkai0 ,

I ran into the same issue, and my runtime environment is almost the same as yours. Have you solved this problem, or do you have any hints?

addict==2.4.0
appdirs==1.4.4
beautifulsoup4==4.11.1
certifi @ file:///croot/certifi_1665076670883/work/certifi
charset-normalizer==2.1.1
cityscapesScripts==2.2.0
click==8.1.3
colorama==0.4.6
coloredlogs==15.0.1
commonmark==0.9.1
cycler==0.10.0
filelock==3.8.2
flit_core @ file:///opt/conda/conda-bld/flit-core_1644941570762/work/source/flit_core
gdown==4.2.0
humanfriendly==9.2
idna==3.4
importlib-metadata==5.1.0
kiwisolver==1.2.0
kornia==0.5.8
Markdown==3.4.1
matplotlib==3.4.2
mkl-fft==1.3.1
mkl-random @ file:///tmp/build/80754af9/mkl_random_1626186064646/work
mkl-service==2.4.0
mmcv-full==1.3.7
model-index==0.1.11
numpy==1.19.2
opencv-python==4.4.0.46
openmim==0.3.3
ordered-set==4.1.0
pandas==1.1.3
Pillow==8.3.1
prettytable==2.1.0
Pygments==2.13.0
pyparsing==2.4.7
pyquaternion==0.9.9
PySocks==1.7.1
python-dateutil==2.8.2
pytz==2020.1
PyYAML==5.4.1
requests==2.28.1
rich==12.6.0
scipy==1.6.3
seaborn==0.11.1
six @ file:///tmp/build/80754af9/six_1644875935023/work
soupsieve==2.3.2.post1
tabulate==0.9.0
timm==0.3.2
torch==1.7.1
torchvision==0.8.2
tqdm==4.48.2
typing==3.7.4.3
typing-extensions==3.7.4.3
urllib3==1.26.13
wcwidth==0.2.5
yapf==0.31.0
zipp==3.11.0

BTW, I have successfully reproduced the results of DAFormer and HRDA in exactly the same environment.

Thanks in advance.

lhoyer commented on August 18, 2024

Hi @zyuanbing,

Could you please share your training log so that I can look for possible reasons?

Best,
Lukas

zyuanbing commented on August 18, 2024

Hi @lhoyer,

My latest training log is here. Thanks in advance.

Best,
Yuanbing

lhoyer commented on August 18, 2024

Hi @zyuanbing,

Thank you for providing the log. I was able to reproduce the results from the paper with an RTX 3090 using the public code. You can find the log here.

I diffed my log against yours and made the following observations:

Your environment uses Python 3.8.15 instead of Python 3.8.5.
Your environment has CUDA 11.1 instead of CUDA 11.0 installed.
Your environment has pytorch==1.7.1 instead of pytorch==1.7.1+cu110.

These subtle differences might affect reproducibility. I would recommend downgrading CUDA and setting up a new environment from scratch following the instructions in the readme, because the installed CUDA is used to compile mmcv-full. It might also be worth trying a different random seed (see config file). Even though we could reproduce our results with the seeds 0, 1, and 2, the random behavior might be different in your environment.

I hope that these hints are helpful.

Best,
Lukas
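
Differences such as `pytorch==1.7.1` versus `pytorch==1.7.1+cu110` can also be spotted by comparing two `pip freeze` outputs directly. The sketch below is an illustration only: the function names are hypothetical, the two inputs are short excerpts rather than full freeze files, and only `name==version[+local]` pins are parsed (lines such as `six @ file:///...` are skipped).

```python
import re
from typing import Dict, List, Optional, Tuple

Pin = Dict[str, Optional[str]]

# Matches pins such as "torch==1.7.1+cu110"; the "+cu110" local tag is optional.
PIN = re.compile(r"^([A-Za-z0-9_.\-]+)==([^+\s]+)(?:\+(\S+))?$")


def parse_freeze(text: str) -> Dict[str, Pin]:
    """Map package name -> {'version': ..., 'local': ...} from `pip freeze` text."""
    pins: Dict[str, Pin] = {}
    for line in text.splitlines():
        match = PIN.match(line.strip())
        if match:
            name, version, local = match.groups()
            pins[name.lower()] = {"version": version, "local": local}
    return pins


def diff_pins(reference: Dict[str, Pin], candidate: Dict[str, Pin],
              packages: List[str]) -> List[Tuple[str, Optional[Pin], Optional[Pin]]]:
    """List (package, reference_pin, candidate_pin) for every package that differs."""
    return [(p, reference.get(p), candidate.get(p)) for p in packages
            if reference.get(p) != candidate.get(p)]


# Excerpts only: the reference pin is the one named in the reply above.
reference = parse_freeze("torch==1.7.1+cu110\nmmcv-full==1.3.7")
candidate = parse_freeze("torch==1.7.1\nmmcv-full==1.3.7")
for package, want, got in diff_pins(reference, candidate, ["torch", "mmcv-full"]):
    print(package, "expected", want, "found", got)
```

Here only `torch` is reported, because the candidate pin lacks the `cu110` local version tag while `mmcv-full` is identical in both excerpts.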

kaigelee commented on August 18, 2024

> Dear Kai,
>
> To compile mmcv-full with CUDA 11.0, you need to use a system that has this version installed. It seems that you use containers, so maybe there is also a container with CUDA 11.0 available? Otherwise, you can find further information here: https://developer.nvidia.com/cuda-11.0-download-archive
>
> Best, Lukas

Dear Lukas,
I did not install NVIDIA CUDA separately, but directly use the CUDA bundled with PyTorch (1.7.1-cu110). Is this OK?
I also cannot reproduce DAFormer+MIC. Here is my log:
https://github.com/kaigelee/MIC/blob/master/20230110_135504.log

lhoyer commented on August 18, 2024

Dear @kaigelee,

I think that a CUDA installation is necessary to install mmcv-full using pip install mmcv-full==1.3.7, as CUDA is needed to build the mmcv CUDA ops. I tried installing mmcv-full without CUDA on the system and it failed with "OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.". I would recommend installing CUDA 11.0 on your system and setting up a fresh Python environment afterwards.

Best,
Lukas

kaigelee commented on August 18, 2024

> Dear @kaigelee,
>
> I think that a CUDA installation is necessary to install mmcv-full using pip install mmcv-full==1.3.7, as CUDA is needed to build the mmcv CUDA ops. I tried installing mmcv-full without CUDA on the system and it failed with "OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.". I would recommend installing CUDA 11.0 on your system and setting up a fresh Python environment afterwards.
>
> Best, Lukas

Thanks for your quick reply. I have installed CUDA 11.0.2.

Besides, I didn't install mmcv-full with pip install, but from the .whl file, due to an installation error. Does that make a difference?

lhoyer commented on August 18, 2024

> Besides, I didn't install mmcv-full with pip install, but from the .whl file, due to an installation error. Does that make a difference?

It might make a difference. Therefore, I would recommend installing mmcv-full using pip install mmcv-full==1.3.7 to rule out this factor as a possible reason for the issues.

kaigelee commented on August 18, 2024

> > Besides, I didn't install mmcv-full with pip install, but from the .whl file, due to an installation error. Does that make a difference?
>
> It might make a difference. Therefore, I would recommend installing mmcv-full using pip install mmcv-full==1.3.7 to rule out this factor as a possible reason for the issues.

Thanks for your reply. But I got a lot of installation errors. Hence, I followed the official recommendation and installed mmcv using mim install mmcv==1.3.7.
I'm re-running the experiment and will upload a log when results are available.
I hope you can check it for me, thank you very much.
Best, Kai

lhoyer commented on August 18, 2024

In order to simplify the environment setup, I have created a new environment with a prebuilt mmcv-full, which does not require a local cuda installation.

conda create -n mic-mmcv-full-prebuilt python=3.8.5 pip=22.3.1
conda activate mic-mmcv-full-prebuilt
pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv-full==1.3.7 -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7/index.html

Using the new environment, MIC(HRDA) on GTA->Cityscapes achieved 76.1 mIoU. Here you can find the log: 20230114_233731.log. Please, let me know if you can reproduce MIC(HRDA) using the simplified environment.

jkee58 commented on August 18, 2024

Dear @lhoyer ,

I'm very impressed with your research and want to study it in detail. But I'm having the same problem reproducing the performance in my environment. I haven't tested your simplified environment yet.

I set the seed to 1 and ran python run_experiments.py --config configs/mic/gtaHR2csHR_mic_hrda.py. The result was below expectations, 74.54. Here is my log: 20230112_174500.log. If you get a chance, please check it.

Environment details:

CUDA 11.0.3
Python 3.8.5
GPU Single RTX3090
mmcv-full 1.3.7
pytorch 1.7.1+cu110

And now I'm training with seed set to 2, but I don't think it's performing better than seed 1. It comes out very low, especially for 'train'.

Could you give me a hint about this problem?

Best regards,
Jeongkee

kaigelee commented on August 18, 2024

> In order to simplify the environment setup, I have created a new environment with a prebuilt mmcv-full, which does not require a local cuda installation.
>
> conda create -n mic-mmcv-full-prebuilt python=3.8.5 pip=22.3.1
> conda activate mic-mmcv-full-prebuilt
> pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
> pip install mmcv-full==1.3.7 -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7/index.html
>
> Using the new environment, MIC(HRDA) on GTA->Cityscapes achieved 76.1 mIoU. Here you can find the log: 20230114_233731.log. Please, let me know if you can reproduce MIC(HRDA) using the simplified environment.

Thanks for your suggestion. I have configured the environment. I'm running the program now, after which I will upload my results.

kaigelee commented on August 18, 2024

> In order to simplify the environment setup, I have created a new environment with a prebuilt mmcv-full, which does not require a local cuda installation.
>
> conda create -n mic-mmcv-full-prebuilt python=3.8.5 pip=22.3.1
> conda activate mic-mmcv-full-prebuilt
> pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
> pip install mmcv-full==1.3.7 -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7/index.html
>
> Using the new environment, MIC(HRDA) on GTA->Cityscapes achieved 76.1 mIoU. Here you can find the log: 20230114_233731.log. Please, let me know if you can reproduce MIC(HRDA) using the simplified environment.

My goodness! This environment results in lower performance for DAFormer+MIC.
20230118_110759.log

lhoyer commented on August 18, 2024

> My goodness! This environment results in lower performance for DAFormer+MIC. 20230118_110759.log

I have only checked the reproducibility of HRDA+MIC in the simplified environment but I haven't checked DAFormer+MIC. I'll do that in the following days. If you have resources available, can you maybe test HRDA+MIC in the simplified environment on your machine?

lhoyer commented on August 18, 2024

> Dear @lhoyer ,
>
> I'm very impressed with your research and want to study it in detail. But I'm having the same problem reproducing the performance in my environment. I haven't tested your simplified environment yet.
>
> I set the seed to 1 and ran python run_experiments.py --config configs/mic/gtaHR2csHR_mic_hrda.py. The result was below expectations, 74.54. Here is my log: 20230112_174500.log. If you get a chance, please check it.
>
> Environment details:
>
> CUDA 11.0.3
> Python 3.8.5
> GPU Single RTX3090
> mmcv-full 1.3.7
> pytorch 1.7.1+cu110
>
> And now I'm training with seed set to 2, but I don't think it's performing better than seed 1. It comes out very low, especially for 'train'.
>
> Could you give me a hint about this problem?
>
> Best regards, Jeongkee

Dear @EEAIC,

Thank you for your interest in our work! Based on the information that you provided I don't see a potential reason for the problem. Can you maybe try the simplified environment with the pre-compiled mmcv-full to see whether this helps?

Best,
Lukas

kaigelee commented on August 18, 2024

> > My goodness! This environment results in lower performance for DAFormer+MIC. 20230118_110759.log
>
> I have only checked the reproducibility of HRDA+MIC in the simplified environment but I haven't checked DAFormer+MIC. I'll do that in the following days. If you have resources available, can you maybe test HRDA+MIC in the simplified environment on your machine?

OK, I'll try. But I still have a question: how did you determine your Python environment during the experiments? It now seems that the impact of the environment on the results is not insignificant.

lhoyer commented on August 18, 2024

For all experiments, I have used the same environment, which I originally set up for DAFormer in 2021. When checking reproducibility before making this project open source, I additionally re-created the environment according to the readme instructions. In both environments, the results for HRDA+MIC were reproducible on different seeds. Therefore, I am actually a bit surprised that there are problems with the reproducibility of HRDA+MIC. Some of the initial logs in this thread showed inconsistencies in the CUDA versions of the different packages. Therefore, I assumed that the issue is the software environment, which is why I provided the simplified environment with the pre-built mmcv-full to avoid issues with CUDA version conflicts.

ferric123 commented on August 18, 2024

Oh, the same here. I can reproduce both DAFormer and HRDA in my environment, but I am not able to reproduce MIC+HRDA (72.07 mIoU). I am not so convinced that this is a problem with the environment setting, since the other two methods can be reproduced in the same environment without any issue. Also, MIC is just a masking operation applied to the target input image during training, so intuitively it should not be affected by the environment. To verify this, I would suggest applying MIC as a plug-and-play module to other open-source UDA semantic segmentation projects to check whether there is a consistent improvement.
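
To put the numbers mentioned so far side by side, the following rough sketch collects the MIC(HRDA) mIoU values reported in this thread and prints each reproduction's gap to the mean of the author's runs. It is not a statistical analysis: the runs come from different machines and environments, the seeds of the author's runs are not stated, and the helper names are hypothetical.

```python
from statistics import mean
from typing import Sequence

# mIoU for MIC(HRDA) on GTA->Cityscapes, as reported in this thread.
author_runs = [76.0, 75.8, 76.1]  # runs reported by lhoyer
reported_runs = {"jkee58 (seed 1)": 74.54, "ferric123": 72.07}


def gap_to_reference(value: float, reference_runs: Sequence[float]) -> float:
    """Difference between one run and the mean of the reference runs."""
    return value - mean(reference_runs)


def spread(runs: Sequence[float]) -> float:
    """Range (max - min) of a list of runs."""
    return max(runs) - min(runs)


print(f"reference mean: {mean(author_runs):.2f}, spread: {spread(author_runs):.2f}")
for who, miou in reported_runs.items():
    gap = gap_to_reference(miou, author_runs)
    print(f"{who}: {miou:.2f} ({gap:+.2f} vs. reference mean)")
```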

jkee58 commented on August 18, 2024

> > Dear @lhoyer ,
> >
> > I'm very impressed with your research and want to study it in detail. But I'm having the same problem reproducing the performance in my environment. I haven't tested your simplified environment yet.
> >
> > I set the seed to 1 and ran python run_experiments.py --config configs/mic/gtaHR2csHR_mic_hrda.py. The result was below expectations, 74.54. Here is my log: 20230112_174500.log. If you get a chance, please check it.
> >
> > Environment details:
> >
> > CUDA 11.0.3
> > Python 3.8.5
> > GPU Single RTX3090
> > mmcv-full 1.3.7
> > pytorch 1.7.1+cu110
> >
> > And now I'm training with seed set to 2, but I don't think it's performing better than seed 1. It comes out very low, especially for 'train'.
> >
> > Could you give me a hint about this problem?
> >
> > Best regards, Jeongkee
>
> Dear @EEAIC,
>
> Thank you for your interest in our work! Based on the information that you provided I don't see a potential reason for the problem. Can you maybe try the simplified environment with the pre-compiled mmcv-full to see whether this helps?
>
> Best, Lukas

Dear @lhoyer,

MIC(HRDA) works well in the simplified environment you suggested. 😄

Thank you for your help.

Best regards,
Jeongkee

zyuanbing commented on August 18, 2024

> In order to simplify the environment setup, I have created a new environment with a prebuilt mmcv-full, which does not require a local cuda installation.
>
> conda create -n mic-mmcv-full-prebuilt python=3.8.5 pip=22.3.1
> conda activate mic-mmcv-full-prebuilt
> pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
> pip install mmcv-full==1.3.7 -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7/index.html
>
> Using the new environment, MIC(HRDA) on GTA->Cityscapes achieved 76.1 mIoU. Here you can find the log: 20230114_233731.log. Please, let me know if you can reproduce MIC(HRDA) using the simplified environment.

Dear @lhoyer,

Thanks for the simplified environment. I am able to reproduce the results in the paper (MIC+HRDA). However, the experiments can still fail in some cases (such as with seed=2): 20230203_191213.log

I find that some classes (e.g. class train) have an extremely low IoU in the evaluation results. I guess different random behaviours might affect the interplay of RCS and MIC.

Thanks for your help.

Best,
Yuanbing

lhoyer commented on August 18, 2024

Dear @EEAIC and @zyuanbing,

I'm happy to hear that you have been able to reproduce the results of MIC+HRDA with the simplified environment. As the original issue seems to be solved, I'll close it for now.

@zyuanbing Your observation seems to be related to issue #9. Let's continue the discussion there.

Best,
Lukas

kimkj38 commented on August 18, 2024

@lhoyer
Hello. I'm trying to reproduce your results (segmentation on GTA->Cityscapes) with an RTX 3090.

As far as I know, CUDA 11.0 is not supported on the RTX 3090 (https://en.wikipedia.org/wiki/CUDA), and the error below occurs when I install mmcv.

Could you explain how to solve this problem?
