When I run train.py, I run into an error: File "/ghome/zhenye/ALFNet-master/keras

run train.py occur error about alfnet HOT 24 OPEN

liuwei16 commented on June 18, 2024

An error occurs when I run train.py.

from alfnet.

Comments (24)

VideoObjectSearch commented on June 18, 2024 11

@yongqiangzhang1 @zhangxydlut @MADONOKOUKI @pnnnnnnn
Please try these compiled files:
utils.zip

VideoObjectSearch commented on June 18, 2024 3

@yongqiangzhang1
You can have a try.
nms.zip

yongqiangzhang1 commented on June 18, 2024 1

nms works, thank you very much.

zhangxydlut commented on June 18, 2024

I have the same problem.

MADONOKOUKI commented on June 18, 2024

@zhenyezi @zhenyezi
num of training samples: 2112
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
Traceback (most recent call last):
  File "train.py", line 35, in <module>
    from keras_alfnet.model.model_1step import Model_1step
  File "/var/docker/share/madono/summer/ALFNet/keras_alfnet/model/model_1step.py", line 1, in <module>
    from .base_model import Base_model
  File "/var/docker/share/madono/summer/ALFNet/keras_alfnet/model/base_model.py", line 2, in <module>
    from keras_alfnet import data_generators
  File "/var/docker/share/madono/summer/ALFNet/keras_alfnet/data_generators.py", line 7, in <module>
    from .utils.cython_bbox import bbox_overlaps
ImportError: No module named cython_bbox
I also have similar problems...

pnnnnnnn commented on June 18, 2024

Run git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git, then cd py-faster-rcnn/lib and run make.
Then copy the utils directory from py-faster-rcnn into the utils directory of ALFNet.
then uncomment all "from .utils.bbox import box_op" and change "box_op" to "bbox_overlaps"
it works for me...

yongqiangzhang1 commented on June 18, 2024

@pnnnnnnn , Is the trained results right?

pnnnnnnn commented on June 18, 2024

@pnnnnnnn , Is the trained results right?

still training, for now i've trained for 70 epochs and the total loss dropped from 0.66 to 0.19

yongqiangzhang1 commented on June 18, 2024

@pnnnnnnn , what is the meaning of "uncomment all "from .utils.bbox import box_op" and change "box_op" to "bbox_overlaps""? comment or uncomment?

pnnnnnnn commented on June 18, 2024

@pnnnnnnn , what is the meaning of "uncomment all "from .utils.bbox import box_op" and change "box_op" to "bbox_overlaps""? comment or uncomment?

oh, sorry, it's "comment"
comment all "from .utils.bbox import box_op"
change the remaining "box_op" to "bbox_overlaps"
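
For anyone who cannot build the Cython extension at all, here is a minimal pure-Python sketch of what bbox_overlaps computes. It assumes the py-faster-rcnn convention of [x1, y1, x2, y2] boxes with inclusive pixel coordinates (hence the +1). The name bbox_overlaps_py is hypothetical, it returns nested lists rather than a NumPy array, and it is much slower than the compiled version, so treat it as a sanity check rather than a drop-in replacement.

```python
def bbox_overlaps_py(boxes, query_boxes):
    """Return an N x K nested list of IoU values.

    boxes:       N boxes, each [x1, y1, x2, y2]
    query_boxes: K boxes, each [x1, y1, x2, y2]
    Coordinates are treated as inclusive pixels (assumption), so widths
    and heights are computed with a +1.
    """
    overlaps = [[0.0] * len(query_boxes) for _ in range(len(boxes))]
    for k, q in enumerate(query_boxes):
        q_area = (q[2] - q[0] + 1) * (q[3] - q[1] + 1)
        for n, b in enumerate(boxes):
            # Width of the intersection; non-positive means no overlap.
            iw = min(b[2], q[2]) - max(b[0], q[0]) + 1
            if iw <= 0:
                continue
            # Height of the intersection.
            ih = min(b[3], q[3]) - max(b[1], q[1]) + 1
            if ih <= 0:
                continue
            b_area = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
            inter = iw * ih
            overlaps[n][k] = float(inter) / (b_area + q_area - inter)
    return overlaps


if __name__ == "__main__":
    anchors = [[0, 0, 9, 9]]
    gts = [[0, 0, 9, 9], [5, 5, 14, 14], [20, 20, 29, 29]]
    print(bbox_overlaps_py(anchors, gts))
```

If the rest of the code expects an array, the result could be wrapped with numpy.asarray at the call site.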

yongqiangzhang1 commented on June 18, 2024

@pnnnnnnn do you check the box_op and bbox_overlaps have the same function?

pnnnnnnn commented on June 18, 2024

@pnnnnnnn do you check the box_op and bbox_overlaps have the same function?

there's no box_op function

yongqiangzhang1 commented on June 18, 2024

"No module named cython_bbox" and "No module named bbox" are solved by your compiled utils.zip files. But there is a new error from nms.gpu_nms import gpu_nms; ImportError: No module named gpu_nms, can you compile the nms and upload the compiled nms document. Thanks.

Chen94yue commented on June 18, 2024

@pnnnnnnn , Is the trained results right?

still training, for now i've trained for 70 epochs and the total loss dropped from 0.66 to 0.19

Did you get the same MR as the paper?

pnnnnnnn commented on June 18, 2024

@pnnnnnnn , Is the trained results right?

still training, for now i've trained for 70 epochs and the total loss dropped from 0.66 to 0.19

Did you get the same MR as the paper?

not yet(?), i've trained for 200 epochs(2k iterations per epoch, batchsize 4, gpu 1050ti) and got 16.53 on the best model, and now i'm decreasing the lr from 1e-4 to 1e-5 for 100 epochs

youtang1993 commented on June 18, 2024

@pnnnnnnn , Is the trained results right?

still training, for now i've trained for 70 epochs and the total loss dropped from 0.66 to 0.19

Did you get the same MR as the paper?

not yet(?), i've trained for 200 epochs(2k iterations per epoch, batchsize 4, gpu 1050ti) and got 16.53 on the best model, and now i'm decreasing the lr from 1e-4 to 1e-5 for 100 epochs

Hi, still the question, did you get the same MR as the paper? The best score I have got is 16.33. A BIG GAP.

pnnnnnnn commented on June 18, 2024

@pnnnnnnn , Is the trained results right?

still training, for now i've trained for 70 epochs and the total loss dropped from 0.66 to 0.19

Did you get the same MR as the paper?

not yet(?), i've trained for 200 epochs(2k iterations per epoch, batchsize 4, gpu 1050ti) and got 16.53 on the best model, and now i'm decreasing the lr from 1e-4 to 1e-5 for 100 epochs

Hi, still the question, did you get the same MR as the paper? The best score I have got is 16.33. A BIG GAP.

The best I've got is 13.18. Maybe it's because of my small batch size (only 4) that I can't reach 12.01.

ou525 commented on June 18, 2024

hi, when i run the test.py, also have the same problem. i use python3.5 @VideoObjectSearch
Traceback (most recent call last):
  File "test.py", line 32, in <module>
    from keras_alfnet.model.model_1step import Model_1step
  File "/home/ou/workplace/ALFNet/keras_alfnet/model/model_1step.py", line 1, in <module>
    from .base_model import Base_model
  File "/home/ou/workplace/ALFNet/keras_alfnet/model/base_model.py", line 2, in <module>
    from keras_alfnet import data_generators
  File "/home/ou/workplace/ALFNet/keras_alfnet/data_generators.py", line 7, in <module>
    from .utils.cython_bbox import bbox_overlaps
ImportError: /home/ou/workplace/ALFNet/keras_alfnet/utils/cython_bbox.so: undefined symbol: _Py_ZeroStruct
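
A note on this error: _Py_ZeroStruct is a symbol from the Python 2 C API, so an undefined-symbol error on it usually means the .so was built against Python 2 and is being imported from Python 3. The likely fix is to rebuild the extension with the interpreter you actually run. Below is a small diagnostic sketch using only the standard library (Python 3); the function name extension_abi_report is hypothetical. It shows which interpreter is running and whether a compiled file's name carries a version-specific tag for it. A plain "name.so" carries no tag, so Python 3 will still try to load it even if it was built for Python 2.

```python
import sys
from importlib.machinery import EXTENSION_SUFFIXES


def extension_abi_report(so_filename):
    """Describe how a compiled extension's filename relates to this interpreter."""
    # Suffixes that encode an interpreter/ABI tag, i.e. everything except the
    # bare, untagged suffixes.
    tagged = [s for s in EXTENSION_SUFFIXES if s not in (".so", ".pyd")]
    return {
        "python": "%d.%d" % sys.version_info[:2],
        "accepted_suffixes": list(EXTENSION_SUFFIXES),
        "carries_version_tag": any(so_filename.endswith(s) for s in tagged),
    }


if __name__ == "__main__":
    print(extension_abi_report("cython_bbox.so"))
```

If carries_version_tag is False, the filename alone says nothing about which Python the file was built for, which is consistent with a Python 2 build being picked up by Python 3.5 here.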

m1nt07 commented on June 18, 2024

@yongqiangzhang1
You can have a try.
nms.zip

hi, @VideoObjectSearch , when i use the nms.zip, i have the problem:
ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory

i use CUDA9.0, how can i compile to make it work?
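
The error message suggests the prebuilt gpu_nms in nms.zip is linked against libcudart.so.8.0, so on a CUDA 9.0 machine the loader cannot find it; rebuilding nms against the installed toolkit is the likely fix. Before rebuilding, a quick check like the sketch below can confirm which runtime libraries the dynamic loader can actually open. It uses only ctypes from the standard library; the helper name loadable_libraries and the list of library names are assumptions for illustration.

```python
import ctypes


def loadable_libraries(names):
    """Return the subset of shared-library names the dynamic loader can open."""
    found = []
    for name in names:
        try:
            ctypes.CDLL(name)
        except OSError:
            # Not on the loader search path (or not installed at all).
            continue
        found.append(name)
    return found


if __name__ == "__main__":
    # Assumed candidate names; adjust to the CUDA versions you care about.
    print(loadable_libraries(["libcudart.so.8.0", "libcudart.so.9.0"]))
```

If only libcudart.so.9.0 shows up, a binary built for 8.0 cannot load as-is.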

xiaoshang123 commented on June 18, 2024

@yongqiangzhang1
You can have a try.
nms.zip
Hi, when I use nms.zip, it solves the problem "ImportError: No module named gpu_nms", but a new problem appears:
Traceback (most recent call last):
  File "train.py", line 40, in <module>
    from keras_alfnet.model.model_2step import Model_2step
  File "/home/by/ma/ALFNet-master/keras_alfnet/model/model_2step.py", line 7, in <module>
    from keras_alfnet import bbox_process
  File "/home/by/ma/ALFNet-master/keras_alfnet/bbox_process.py", line 7, in <module>
    from nms_wrapper import nms
  File "/home/by/ma/ALFNet-master/keras_alfnet/nms_wrapper.py", line 9, in <module>
    from nms.cpu_nms import cpu_nms
ImportError: /home/by/ma/ALFNet-master/keras_alfnet/nms/cpu_nms.so: undefined symbol: PyFPE_jbuf
How can I solve it? Thank you.
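
An undefined-symbol error like this usually means cpu_nms.so was compiled against a different Python build than the one importing it, so rebuilding nms locally is the likely real fix. As a stopgap for checking that the rest of the pipeline runs, here is a minimal pure-Python sketch of greedy NMS. The names nms_py and _iou are hypothetical, detections are assumed to be [x1, y1, x2, y2, score] with inclusive pixel coordinates, and it will be much slower than the compiled version.

```python
def _iou(a, b):
    """IoU of two [x1, y1, x2, y2, ...] boxes with inclusive pixel coordinates."""
    iw = min(a[2], b[2]) - max(a[0], b[0]) + 1
    ih = min(a[3], b[3]) - max(a[1], b[1]) + 1
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return float(inter) / (area_a + area_b - inter)


def nms_py(dets, thresh):
    """Greedy non-maximum suppression.

    dets:   sequence of [x1, y1, x2, y2, score]
    thresh: boxes whose IoU with a kept box exceeds this are dropped
    Returns the indices of kept detections, highest score first.
    """
    order = sorted(range(len(dets)), key=lambda i: dets[i][4], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Keep only candidates that do not overlap the chosen box too much.
        order = [j for j in order if _iou(dets[best], dets[j]) <= thresh]
    return keep


if __name__ == "__main__":
    dets = [[0, 0, 9, 9, 0.9], [1, 1, 10, 10, 0.8], [20, 20, 29, 29, 0.7]]
    print(nms_py(dets, 0.5))
```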

weizheliu commented on June 18, 2024

@yongqiangzhang1
You can have a try.
nms.zip

hi, @VideoObjectSearch , when i use the nms.zip, i have the problem:
ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory

i use CUDA9.0, how can i compile to make it work?

I meet the same problem, do you find the cuda 9 version of nms?

whitenightwu commented on June 18, 2024

@yongqiangzhang1
You can have a try.
nms.zip

hi, @VideoObjectSearch , when i use the nms.zip, i have the problem:
ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory
i use CUDA9.0, how can i compile to make it work?

I meet the same problem, do you find the cuda 9 version of nms?

you can try nms.zip

xiefeiwhu commented on June 18, 2024

hi, when i run the test.py, also have the same problem. i use python3.5 @VideoObjectSearch
Traceback (most recent call last):
  File "test.py", line 32, in <module>
    from keras_alfnet.model.model_1step import Model_1step
  File "/home/ou/workplace/ALFNet/keras_alfnet/model/model_1step.py", line 1, in <module>
    from .base_model import Base_model
  File "/home/ou/workplace/ALFNet/keras_alfnet/model/base_model.py", line 2, in <module>
    from keras_alfnet import data_generators
  File "/home/ou/workplace/ALFNet/keras_alfnet/data_generators.py", line 7, in <module>
    from .utils.cython_bbox import bbox_overlaps
ImportError: /home/ou/workplace/ALFNet/keras_alfnet/utils/cython_bbox.so: undefined symbol: _Py_ZeroStruct

Hi, I ran into the same problem. Have you solved it?

nankeermeng commented on June 18, 2024

@yongqiangzhang1 @pnnnnnnn
Following the code, I can train for 150 epochs, but when I run test.py with the trained weights resnet_e3_l1.15433712553.hdf5, I get no test results: val_det.txt is empty. Why?
