mlpc-ucsd / letr
(CVPR 2021 Oral) LETR: Line Segment Detection Using Transformers without Edges
License: Apache License 2.0
Addition of Gradient Accumulation in all stages
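For readers unfamiliar with the request: gradient accumulation sums the gradients of several small micro-batches before taking one optimizer step, so a large effective batch fits in limited GPU memory. The sketch below is not LETR code. It uses a one-parameter least-squares model in plain Python (all names are hypothetical) to show that the accumulated gradient equals the full-batch gradient when each micro-batch loss is scaled by 1/accum_steps.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, accum_steps):
    """Accumulate gradients over equally sized micro-batches.

    Assumes len(xs) is divisible by accum_steps.
    """
    size = len(xs) // accum_steps
    total = 0.0
    for i in range(accum_steps):
        lo, hi = i * size, (i + 1) * size
        # Each micro-batch gradient is scaled by 1/accum_steps before summing.
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / accum_steps
    return total


xs, ys = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
full = grad_mse(0.5, xs, ys)
accum = accumulated_grad(0.5, xs, ys, accum_steps=2)
print(abs(full - accum) < 1e-12)
```

In a PyTorch training loop the same idea would mean calling `backward()` on each scaled micro-batch loss and stepping the optimizer only every `accum_steps` iterations.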
Hi!
The web demo (https://huggingface.co/spaces/z-uo/LETR) raises the following error, so no image can be uploaded:
build error
Build failed with exit code: 1
Hello, I encountered the following problem when running the training code, and I hope you can help me solve it.
`(letr) root@shuusv005:~/sjc/LETR# bash ./script/train/a0_train_stage1_res50.sh res50_stage1
folder not exist
| distributed init (rank 1): env://
Traceback (most recent call last):
File "src/main.py", line 215, in
main(args)
File "src/main.py", line 21, in main
utils.init_distributed_mode(args)
File "/home/shu-usv005/sjc/LETR/src/util/misc.py", line 421, in init_distributed_mode
torch.cuda.set_device(args.gpu)
File "/home/shu-usv005/anaconda3/envs/letr/lib/python3.7/site-packages/torch/cuda/init.py", line 261, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal
Traceback (most recent call last):
File "src/main.py", line 215, in
main(args)
File "src/main.py", line 21, in main
utils.init_distributed_mode(args)
File "/home/shu-usv005/sjc/LETR/src/util/misc.py", line 421, in init_distributed_mode
torch.cuda.set_device(args.gpu)
File "/home/shu-usv005/anaconda3/envs/letr/lib/python3.7/site-packages/torch/cuda/init.py", line 261, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal
Traceback (most recent call last):
File "src/main.py", line 215, in
main(args)
File "src/main.py", line 21, in main
utils.init_distributed_mode(args)
File "/home/shu-usv005/sjc/LETR/src/util/misc.py", line 421, in init_distributed_mode
torch.cuda.set_device(args.gpu)
File "/home/shu-usv005/anaconda3/envs/letr/lib/python3.7/site-packages/torch/cuda/init.py", line 261, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal
| distributed init (rank 0): env://
Traceback (most recent call last):
File "src/main.py", line 215, in
main(args)
File "src/main.py", line 21, in main
utils.init_distributed_mode(args)
File "/home/shu-usv005/sjc/LETR/src/util/misc.py", line 421, in init_distributed_mode
torch.cuda.set_device(args.gpu)
File "/home/shu-usv005/anaconda3/envs/letr/lib/python3.7/site-packages/torch/cuda/init.py", line 261, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal
Traceback (most recent call last):
File "src/main.py", line 215, in
Traceback (most recent call last):
File "src/main.py", line 215, in
main(args)
File "src/main.py", line 21, in main
utils.init_distributed_mode(args)main(args)
File "/home/shu-usv005/sjc/LETR/src/util/misc.py", line 421, in init_distributed_mode
File "src/main.py", line 21, in main
utils.init_distributed_mode(args)
File "/home/shu-usv005/sjc/LETR/src/util/misc.py", line 421, in init_distributed_mode
torch.cuda.set_device(args.gpu)
File "/home/shu-usv005/anaconda3/envs/letr/lib/python3.7/site-packages/torch/cuda/init.py", line 261, in set_device
torch.cuda.set_device(args.gpu)
File "/home/shu-usv005/anaconda3/envs/letr/lib/python3.7/site-packages/torch/cuda/init.py", line 261, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal
torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal
Traceback (most recent call last):
File "/home/shu-usv005/anaconda3/envs/letr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/shu-usv005/anaconda3/envs/letr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/shu-usv005/anaconda3/envs/letr/lib/python3.7/site-packages/torch/distributed/launch.py", line 340, in
main()
File "/home/shu-usv005/anaconda3/envs/letr/lib/python3.7/site-packages/torch/distributed/launch.py", line 326, in main
sigkill_handler(signal.SIGTERM, None) # not coming back
File "/home/shu-usv005/anaconda3/envs/letr/lib/python3.7/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/shu-usv005/anaconda3/envs/letr/bin/python', '-u', 'src/main.py', '--coco_path', 'data/wireframe_processed', '--output_dir', 'exp/res50_stage1', '--backbone', 'resnet50', '--resume', 'https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth', '--batch_size', '1', '--epochs', '500', '--lr_drop', '200', '--num_queries', '1000', '--num_gpus', '1', '--layer1_num', '3']' returned non-zero exit status 1.
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
Killing subprocess 6288
Killing subprocess 6289
Killing subprocess 6290
Killing subprocess 6291
Killing subprocess 6292
Killing subprocess 6293
Killing subprocess 6294
Killing subprocess 6295
`
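One possible reading of the log above: eight worker processes were started (there are eight "Killing subprocess" lines), and `torch.cuda.set_device` raises "invalid device ordinal" when the requested index does not exist on the machine. If fewer than eight GPUs are visible, the process count given to the launcher would have to be lowered to match. A minimal sketch of that check follows. `usable_process_count` and `validate_local_rank` are hypothetical helpers, not part of LETR, and the GPU count would in practice come from `torch.cuda.device_count()`.

```python
def usable_process_count(requested: int, visible_gpus: int) -> int:
    """Clamp the number of worker processes to the number of visible GPUs."""
    if visible_gpus < 1:
        raise RuntimeError("no CUDA device is visible")
    return min(requested, visible_gpus)


def validate_local_rank(local_rank: int, visible_gpus: int) -> None:
    """Fail early with a readable message instead of 'invalid device ordinal'."""
    if not 0 <= local_rank < visible_gpus:
        raise ValueError(
            f"local rank {local_rank} has no GPU: only {visible_gpus} visible"
        )


# Hypothetical machine with 2 GPUs while 8 processes were requested.
print(usable_process_count(8, 2))
```

Where the process count is set in the training script is not shown in the log, so that part has to be checked in the script itself.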
Hi.
I tried to run demo.ipynb on a CPU-only machine, but loading the pre-trained model throws the error "AssertionError: Torch not compiled with CUDA enabled".
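This assertion is raised when code asks for a CUDA device on a CPU-only PyTorch build. A common workaround is to remap the checkpoint to the CPU while loading and to choose the device at run time. The sketch below assumes the notebook loads the weights with `torch.load`; `pick_device` and `load_checkpoint_on_cpu` are hypothetical helper names, not functions from the LETR repository.

```python
def pick_device(cuda_available: bool) -> str:
    """Return 'cuda' only when a CUDA device is actually available."""
    return "cuda" if cuda_available else "cpu"


def load_checkpoint_on_cpu(path):
    """Load a checkpoint so that CUDA-saved tensors are remapped to the CPU."""
    import torch  # imported lazily so pick_device works without PyTorch

    return torch.load(path, map_location="cpu")
```

In the notebook this would be combined with `model.to(pick_device(torch.cuda.is_available()))` in place of any hard-coded `.cuda()` call. Whether demo.ipynb contains such calls is an assumption.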
I am getting the following error while reproducing the results.
CUDA out of memory. Tried to allocate 96.00 MiB (GPU 0; 10.76 GiB total capacity; 8.86 GiB already allocated; 64.94 MiB free; 8.99 GiB reserved in total by PyTorch).
raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
I have tried reducing the batch size and image resolution and clearing the PyTorch cache, but the issue persists. Any suggestions would be a great help.
Getting the following warning during training:
[W reducer.cpp:1050] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters, consider turning this flag off. Note that this warning may be a false positive your model has flow control causing later iterations to have unused parameters. (function operator())
Is the file reducer.cpp part of the back-end network, and how can I access and modify it?
As mentioned in the paper, the model is trained for 900 epochs. It would be a great help if you could give an estimated training time on 4 GPUs.
Hi, thanks for sharing the code!
How can I reproduce the comparison shown in Figure 5 of the paper?
Why does the memory requirement keep on increasing for several iterations?
Is this expected behaviour for transformers? Even with ResNet-50 as the backbone, on a 16 GB GPU I can only run with a batch size of 2.
Epoch: [0] [ 0/8819] eta: 6:34:13 lr: 0.000100 loss: 33.1388 (33.1388) time: 2.6822 data: 1.0370 max mem: 3331
Epoch: [0] [ 10/8819] eta: 1:06:08 lr: 0.000100 loss: 28.7849 (29.3637) time: 0.4505 data: 0.1035 max mem: 3409
Epoch: [0] [ 20/8819] eta: 0:52:00 lr: 0.000100 loss: 25.1816 (26.4967) time: 0.2383 data: 0.0167 max mem: 4454
Epoch: [0] [ 30/8819] eta: 0:46:17 lr: 0.000100 loss: 22.1077 (24.6563) time: 0.2421 data: 0.0178 max mem: 4454
Epoch: [0] [ 40/8819] eta: 0:43:19 lr: 0.000100 loss: 19.9209 (23.0615) time: 0.2347 data: 0.0120 max mem: 4454
Epoch: [0] [ 50/8819] eta: 0:43:07 lr: 0.000100 loss: 16.4599 (21.6173) time: 0.2626 data: 0.0404 max mem: 4454
Epoch: [0] [ 60/8819] eta: 0:42:41 lr: 0.000100 loss: 13.7710 (20.2928) time: 0.2848 data: 0.0647 max mem: 4454
Epoch: [0] [ 70/8819] eta: 0:41:28 lr: 0.000100 loss: 12.7223 (19.3061) time: 0.2573 data: 0.0365 max mem: 5913
Epoch: [0] [ 80/8819] eta: 0:40:41 lr: 0.000100 loss: 12.4632 (18.5024) time: 0.2398 data: 0.0171 max mem: 5913
Epoch: [0] [ 90/8819] eta: 0:40:52 lr: 0.000100 loss: 12.4809 (17.8598) time: 0.2686 data: 0.0488 max mem: 5913
Hi, thank you very much for your contribution.
I am trying to evaluate the model on the York data, but I get the following error:
"Unable to read file 'evaluation/data/york/valid_copy/P1020171_line.mat'. No such file or directory."
I checked the dataset you provided, and the .mat files are missing from the York valid folder.
Could you please update it or send me the full York evaluation data?
Thank you very much!
Do I need to download all the weights referenced in script/train?
Hello, thanks for your code.
When I try to prepare the ShanghaiTech training data using your download code, the download fails.
Is there any other way to download the data?
Thanks.
I launched the notebook (demo_letr.ipynb) with the example that uses the pretrained model res101_stage2_focal. If I run it once, everything is OK, but if I try to run it again I always get a GPU memory error (I have an RTX 3080). My investigation shows that the model does not free memory after inference. What am I doing wrong, or is it a bug?
Addition of focal self-attention layer instead of self-attention.
I would like to classify the line segments into different categories. How can I do it?
What version of PyTorch is required for the environment? When I run the provided demo_letr with version 1.10.0, I get the following error: ImportError: cannot import name '_new_empty_tensor' from 'torchvision.ops'. Is there a compatibility problem with my environment that prevents the import?
Could you please provide a list of all the packages and their versions from your conda environment?
I hit a problem when I tried to run inference a second time on the same image (the first run works fine).
outputs = model(inputs)[0]
I ran demo_letr.ipynb on Colab Pro (25 GB RAM).
I tried using
torch.cuda.empty_cache()
gc.collect() # garbage collection
but still got this error.
Thank you in advance
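A common cause of memory growing across repeated notebook runs is that the forward pass executes with autograd enabled, so the outputs keep the whole computation graph alive on the GPU. A minimal sketch, assuming the model is used for inference only. `infer_without_grad` is a hypothetical helper, not part of the LETR demo.

```python
def infer_without_grad(model, inputs):
    """Run one forward pass without recording an autograd graph."""
    import torch  # imported lazily; PyTorch is required at call time

    model.eval()  # disable dropout and use running batch-norm statistics
    with torch.no_grad():  # no graph is stored, so activations can be freed
        return model(inputs)
```

With the call quoted above this would read `outputs = infer_without_grad(model, inputs)[0]`. Note that `torch.cuda.empty_cache()` only releases cached blocks that no tensor references, so it cannot free memory still held by a live graph or by variables kept in the notebook.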
The data cannot be downloaded?
Hi guys,
I'm trying to run: bash script/train/a0_train_stage1_res50.sh res50_stage1
But unfortunately, I'm getting the following error:
Traceback (most recent call last):
File "/home/cristopher/Workspace/LETR/src/main.py", line 13, in <module>
import datasets
File "/home/cristopher/Workspace/LETR/src/datasets/__init__.py", line 5, in <module>
from .coco import build as build_coco
File "/home/cristopher/Workspace/LETR/src/datasets/coco.py", line 10, in <module>
import datasets.transforms as T
File "/home/cristopher/Workspace/LETR/src/datasets/transforms.py", line 18, in <module>
from util.misc import interpolate
File "/home/cristopher/Workspace/LETR/src/util/misc.py", line 22, in <module>
from torchvision.ops import _new_empty_tensor
ImportError: cannot import name '_new_empty_tensor' from 'torchvision.ops' (/home/cristopher/miniconda3/envs/letr/lib/python3.9/site-packages/torchvision/ops/__init__.py)
Traceback (most recent call last):
File "/home/cristopher/Workspace/LETR/src/main.py", line 13, in <module>
import datasets
File "/home/cristopher/Workspace/LETR/src/datasets/__init__.py", line 5, in <module>
from .coco import build as build_coco
File "/home/cristopher/Workspace/LETR/src/datasets/coco.py", line 10, in <module>
import datasets.transforms as T
File "/home/cristopher/Workspace/LETR/src/datasets/transforms.py", line 18, in <module>
from util.misc import interpolate
File "/home/cristopher/Workspace/LETR/src/util/misc.py", line 22, in <module>
from torchvision.ops import _new_empty_tensor
ImportError: cannot import name '_new_empty_tensor' from 'torchvision.ops' (/home/cristopher/miniconda3/envs/letr/lib/python3.9/site-packages/torchvision/ops/__init__.py)
Traceback (most recent call last):
Traceback (most recent call last):
File "/home/cristopher/Workspace/LETR/src/main.py", line 13, in <module>
File "/home/cristopher/Workspace/LETR/src/main.py", line 13, in <module>
import datasetsimport datasets
File "/home/cristopher/Workspace/LETR/src/datasets/__init__.py", line 5, in <module>
File "/home/cristopher/Workspace/LETR/src/datasets/__init__.py", line 5, in <module>
from .coco import build as build_cocofrom .coco import build as build_coco
File "/home/cristopher/Workspace/LETR/src/datasets/coco.py", line 10, in <module>
File "/home/cristopher/Workspace/LETR/src/datasets/coco.py", line 10, in <module>
import datasets.transforms as Timport datasets.transforms as T
File "/home/cristopher/Workspace/LETR/src/datasets/transforms.py", line 18, in <module>
File "/home/cristopher/Workspace/LETR/src/datasets/transforms.py", line 18, in <module>
from util.misc import interpolatefrom util.misc import interpolate
File "/home/cristopher/Workspace/LETR/src/util/misc.py", line 22, in <module>
File "/home/cristopher/Workspace/LETR/src/util/misc.py", line 22, in <module>
from torchvision.ops import _new_empty_tensorfrom torchvision.ops import _new_empty_tensor
ImportErrorImportError: : cannot import name '_new_empty_tensor' from 'torchvision.ops' (/home/cristopher/miniconda3/envs/letr/lib/python3.9/site-packages/torchvision/ops/__init__.py)cannot import name '_new_empty_tensor' from 'torchvision.ops' (/home/cristopher/miniconda3/envs/letr/lib/python3.9/site-packages/torchvision/ops/__init__.py)
My setup has the following packages:
pytorch 1.9.0 py3.9_cuda10.2_cudnn7.6.5_0 pytorch
torchvision 0.10.0 pypi_0 pypi
I tried to find the actual problem but have had no luck so far.
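The setup above pairs torchvision 0.10.0 with an import that only exists in old torchvision releases. In upstream DETR, from which `util/misc.py` appears to be derived (an assumption), that import is guarded by `float(torchvision.__version__[:3]) < 0.7`. For "0.10.0" the first three characters are "0.1", so such a guard would wrongly treat 0.10 as older than 0.7. Below is a sketch of a comparison that parses the version numerically instead. `version_tuple` and `needs_legacy_ops` are hypothetical names.

```python
import re


def version_tuple(version: str) -> tuple:
    """Return (major, minor) as integers, ignoring suffixes such as '+cu102'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def needs_legacy_ops(version: str) -> bool:
    """True only for torchvision releases older than 0.7."""
    return version_tuple(version) < (0, 7)


print(float("0.10.0"[:3]) < 0.7)   # True: the string-slice check misfires
print(needs_legacy_ops("0.10.0"))  # False: numeric comparison is correct
```

Replacing the guard with a numeric check like this is one direction to try. It has not been verified against this repository.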
Hi,
Thanks for sharing the code and for the nice, solid work! I wonder whether you have some example code for inference on test images without ground truth. The current implementation only covers training and evaluation, and the dataloader wrapper expects both an image and a target. Thank you.
Hello,
In the paper you mention the COCO dataset. How did you use it with polylines? There are no lines in the COCO dataset.
Could you publish an example of your dataset in COCO format?
Thanks
The website of the evaluation data does not exist. Could you please give me a new address for downloading the data? Thank you!
If my training is interrupted, how can I resume it?
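The launch command quoted in the first traceback on this page passes `--resume` and `--output_dir`, so one plausible way to continue is to point `--resume` at the most recent checkpoint written to the output directory. Whether LETR restores the optimizer state and epoch counter from that file is an assumption here, as are the `.pth` extension for local checkpoints and the helper name `latest_checkpoint`.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(output_dir: str) -> Optional[Path]:
    """Return the most recently modified .pth file under output_dir, if any."""
    root = Path(output_dir)
    if not root.is_dir():
        return None
    candidates = list(root.rglob("*.pth"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


ckpt = latest_checkpoint("exp/res50_stage1")  # output dir taken from the log
print(ckpt if ckpt else "no checkpoint found")
```

The printed path could then be substituted for the download URL that follows `--resume` in the training script.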