alinlab / ifseg

IFSeg: Image-free Semantic Segmentation via Vision-Language Model (CVPR 2023)

Python 87.49% Jupyter Notebook 8.04% Makefile 0.01% Batchfile 0.01% Shell 3.14% C++ 0.39% Cuda 0.63% Cython 0.22% Lua 0.07%

ifseg's People

Contributors

kami93, sm3199

ifseg's Issues

Distributed training error

Hi,
I allocated 64 GB of RAM and 4 A100 GPUs with the following job script:

#SBATCH --time=72:00:00
#SBATCH --mem=64g
#SBATCH --job-name="ifseg"
#SBATCH --partition=gpu
#SBATCH --gres=gpu:a100:4
#SBATCH --cpus-per-task=4
#SBATCH --mail-type=BEGIN,END,ALL

sh run_scripts/IFSeg/coco_unseen.sh

Here is the distributed training error message. Any input? Thanks.

--Ruida

single-machine distributed training is initialized.
/gpfs/gsfs12/users/me/conda/envs/ifseg/lib/python3.8/site-packages/torch/distributed/launch.py:180: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

warnings.warn(
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -11) local_rank: 0 (pid: 2012686) of binary: /gpfs/gsfs12/users/me/conda/envs/ifseg/bin/python3
Traceback (most recent call last):
File "/gpfs/gsfs12/users/me/conda/envs/ifseg/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/gpfs/gsfs12/users/me/conda/envs/ifseg/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/gpfs/gsfs12/users/me/conda/envs/ifseg/lib/python3.8/site-packages/torch/distributed/launch.py", line 195, in <module>
main()
File "/gpfs/gsfs12/users/me/conda/envs/ifseg/lib/python3.8/site-packages/torch/distributed/launch.py", line 191, in main
launch(args)
File "/gpfs/gsfs12/users/me/conda/envs/ifseg/lib/python3.8/site-packages/torch/distributed/launch.py", line 176, in launch
run(args)
File "/gpfs/gsfs12/users/me/conda/envs/ifseg/lib/python3.8/site-packages/torch/distributed/run.py", line 753, in run
elastic_launch(
File "/gpfs/gsfs12/users/me/conda/envs/ifseg/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/gpfs/gsfs12/users/me/conda/envs/ifseg/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

[1]:
time : 2023-12-23_05:09:30
host : localhost
rank : 1 (local_rank: 1)
exitcode : -11 (pid: 2012687)
error_file: <N/A>
traceback : Signal 11 (SIGSEGV) received by PID 2012687
[2]:
time : 2023-12-23_05:09:30
host : localhost
rank : 2 (local_rank: 2)
exitcode : -11 (pid: 2012688)
error_file: <N/A>
traceback : Signal 11 (SIGSEGV) received by PID 2012688
[3]:
time : 2023-12-23_05:09:30
host : localhost
rank : 3 (local_rank: 3)
exitcode : -11 (pid: 2012689)
error_file: <N/A>
traceback : Signal 11 (SIGSEGV) received by PID 2012689

Root Cause (first observed failure):
[0]:
time : 2023-12-23_05:09:30
host : localhost
rank : 0 (local_rank: 0)
exitcode : -11 (pid: 2012686)
error_file: <N/A>
traceback : Signal 11 (SIGSEGV) received by PID 2012686
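
The FutureWarning in the log recommends reading the local rank from os.environ['LOCAL_RANK'], and the exit code -11 means each worker was killed by signal 11 (SIGSEGV). Below is a minimal Python sketch of both points. The helper names get_local_rank, describe_exitcode, and enable_crash_tracebacks are hypothetical and are not part of the IFSeg code. The sketch does not fix the segfault; it may only help narrow down where it happens.

```python
import argparse
import faulthandler
import os
import signal
import sys


def get_local_rank(env=None, argv=None):
    """Return the local rank, preferring LOCAL_RANK from the environment.

    Falls back to a --local_rank command-line argument (the older
    torch.distributed.launch convention) and finally to 0.
    """
    env = os.environ if env is None else env
    argv = sys.argv[1:] if argv is None else argv
    if "LOCAL_RANK" in env:
        return int(env["LOCAL_RANK"])
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--local_rank", "--local-rank", type=int, default=0)
    args, _unknown = parser.parse_known_args(argv)
    return args.local_rank


def describe_exitcode(exitcode):
    """Translate a worker exit code into a readable description.

    A negative exit code -N means the process was terminated by signal N.
    """
    if exitcode < 0:
        return signal.Signals(-exitcode).name
    return f"exit status {exitcode}"


def enable_crash_tracebacks():
    """Ask Python to dump the tracebacks of all threads to stderr on fatal
    signals such as SIGSEGV, so the crashing call is visible in the job log."""
    faulthandler.enable(all_threads=True)
```

Calling enable_crash_tracebacks() at the very top of the training entry point (an assumption about where it would go) is equivalent to setting the environment variable PYTHONFAULTHANDLER=1 before launching the job.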

Custom Dataset

First of all, thank you for conducting such great research.

Could you give me an example of how to train and evaluate on a new dataset?

Thank you for reading!

Unable to reproduce results

Excellent work! Thanks for the release.

I followed the instructions here to perform image-free training on ADE20K val, but got an extremely low mIoU (0.0001). Below is my training log:

logging.log

Is there a bug, or a step in the training procedure that I may have performed incorrectly? Any hints would be appreciated.

Thanks in advance.
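
When an mIoU is this close to zero, one quick check is whether the metric behaves as expected on a toy input. The following is a minimal pure-Python sketch of mean intersection-over-union; the function name mean_iou and the ignore_index default of 255 are assumptions for illustration and are not taken from the IFSeg evaluation code.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class ids of equal length.
    Pixels whose target equals ignore_index are skipped.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # True positive: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # False negative for the target class ...
            union[t] += 1
            # ... and false positive for the predicted class, if it is valid.
            if 0 <= p < num_classes:
                union[p] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Mostly correct predictions give a reasonable score.
good = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)

# The same predictions with every class id shifted by one score far lower,
# which is what a mismatch between predicted and annotated ids looks like.
shifted = mean_iou([1, 1, 2, 2], [0, 1, 1, 1], num_classes=3)
```

The toy comparison only illustrates how sensitive the metric is to the class-id mapping; it does not establish that such a mismatch is the cause here.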

training script question

Hi,

coco_fine.sh:
data=${data_dir}/fineseg_refined_val2017.tsv,${data_dir}/fineseg_refined_val2017.tsv
coco_unseen.sh:
data=${data_dir}/unseen_val2017.tsv,${data_dir}/unseen_val2017.tsv

Could you explain why the same validation TSV file is specified twice in the data variable?

Thanks,

--Ruida
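
One possible reading, stated here as an assumption rather than the authors' answer, is that the data variable follows a "train files, then validation file" convention: the comma-separated entries are split, the last one is used for validation, and the earlier ones for training. Under that convention a single file has to be listed twice to fill both roles. The sketch below illustrates the convention with a hypothetical helper, split_data_arg, which is not a function from the IFSeg repository.

```python
def split_data_arg(data):
    """Split a comma-separated data string into (train_paths, valid_path).

    Assumed convention: the last entry is the validation file and all
    earlier entries are training files.
    """
    paths = [p.strip() for p in data.split(",") if p.strip()]
    if len(paths) < 2:
        raise ValueError("expected at least two comma-separated paths")
    return paths[:-1], paths[-1]


# Mirrors the shape of the value in coco_unseen.sh, with data_dir left symbolic.
train_paths, valid_path = split_data_arg(
    "data_dir/unseen_val2017.tsv,data_dir/unseen_val2017.tsv"
)
```

With this reading, both roles point at the same val2017 TSV, which would be consistent with image-free training not needing a separate set of training images.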

csv generation

Thanks for your excellent work. I have downloaded '2017train images', '2017val images', and '2017Stuff Train/val annotations' from the website and updated the data path in the Python file, but the CSV is not generated correctly. Could you help me?
