Running train.py raises an error (alfnet issue, OPEN)

liuwei16 commented on June 18, 2024
Running train.py raises an error.

from alfnet.

Comments (24)

VideoObjectSearch commented on June 18, 2024 11

@yongqiangzhang1 @zhangxydlut @MADONOKOUKI @pnnnnnnn
Please try these compiled files:
utils.zip

VideoObjectSearch commented on June 18, 2024 3

@yongqiangzhang1
You can have a try.
nms.zip

yongqiangzhang1 commented on June 18, 2024 1

nms works, thank you very much.

zhangxydlut commented on June 18, 2024

I have the same problem.

MADONOKOUKI commented on June 18, 2024

@zhenyezi
num of training samples: 2112
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
Traceback (most recent call last):
  File "train.py", line 35, in <module>
    from keras_alfnet.model.model_1step import Model_1step
  File "/var/docker/share/madono/summer/ALFNet/keras_alfnet/model/model_1step.py", line 1, in <module>
    from .base_model import Base_model
  File "/var/docker/share/madono/summer/ALFNet/keras_alfnet/model/base_model.py", line 2, in <module>
    from keras_alfnet import data_generators
  File "/var/docker/share/madono/summer/ALFNet/keras_alfnet/data_generators.py", line 7, in <module>
    from .utils.cython_bbox import bbox_overlaps
ImportError: No module named cython_bbox
I also have similar problems...
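
"No module named cython_bbox" generally means the compiled extension file is not present in the package directory, because the repository ships the source but not the built .so. A quick way to see which extensions are missing is to list the compiled files per module name. This is only a sketch: the helper name find_compiled_extensions is made up for illustration, and the directory keras_alfnet/utils is inferred from the traceback, so adjust it to your checkout.

```python
import glob
import os


def find_compiled_extensions(package_dir, module_names):
    """Map each module name to the compiled files found for it in package_dir."""
    found = {}
    for name in module_names:
        # Built extensions look like cython_bbox.so or
        # cython_bbox.cpython-35m-x86_64-linux-gnu.so (.pyd on Windows).
        patterns = [name + ".so", name + ".*.so", name + ".pyd", name + ".*.pyd"]
        matches = []
        for pattern in patterns:
            matches.extend(glob.glob(os.path.join(package_dir, pattern)))
        found[name] = sorted(matches)
    return found


if __name__ == "__main__":
    # Directory inferred from the traceback; this is an assumption.
    result = find_compiled_extensions("keras_alfnet/utils", ["cython_bbox", "bbox"])
    for name, files in result.items():
        print(name, "->", files if files else "missing, needs to be compiled")
```

An empty list for a module means it still has to be built (or copied in from a build made with the same Python version).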

pnnnnnnn commented on June 18, 2024

Run git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git, then cd py-faster-rcnn/lib and run make.
Then copy the utils directory from py-faster-rcnn into the utils directory of ALFNet.
Then uncomment all "from .utils.bbox import box_op" lines and change "box_op" to "bbox_overlaps".
It works for me...

yongqiangzhang1 commented on June 18, 2024

@pnnnnnnn, are the trained results correct?

pnnnnnnn commented on June 18, 2024

> @pnnnnnnn, are the trained results correct?

Still training. So far I've trained for 70 epochs and the total loss has dropped from 0.66 to 0.19.

yongqiangzhang1 commented on June 18, 2024

@pnnnnnnn, what do you mean by "uncomment all 'from .utils.bbox import box_op' and change 'box_op' to 'bbox_overlaps'"? Should it be comment or uncomment?

pnnnnnnn commented on June 18, 2024

> @pnnnnnnn, what do you mean by "uncomment all 'from .utils.bbox import box_op' and change 'box_op' to 'bbox_overlaps'"? Should it be comment or uncomment?

Oh, sorry, I meant "comment":
comment out all "from .utils.bbox import box_op" lines
change the remaining "box_op" to "bbox_overlaps"
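
If building the Cython extension is not an option, a pure NumPy stand-in for bbox_overlaps is one way to get past the import. The sketch below assumes ALFNet expects the same semantics as the py-faster-rcnn function it is swapped with: boxes given as [x1, y1, x2, y2] with inclusive pixel coordinates (hence the +1) and an (N, K) IoU matrix as the result. That assumption has not been checked against the ALFNet code, and the name bbox_overlaps_np is made up here.

```python
import numpy as np


def bbox_overlaps_np(boxes, query_boxes):
    """IoU between every box in boxes (N, 4) and every box in query_boxes (K, 4).

    Boxes are [x1, y1, x2, y2] with inclusive pixel coordinates.
    Returns an (N, K) float array.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    query_boxes = np.asarray(query_boxes, dtype=np.float64)

    box_areas = (boxes[:, 2] - boxes[:, 0] + 1) * (boxes[:, 3] - boxes[:, 1] + 1)
    query_areas = ((query_boxes[:, 2] - query_boxes[:, 0] + 1)
                   * (query_boxes[:, 3] - query_boxes[:, 1] + 1))

    # Broadcast (N, 1) against (1, K) to get pairwise intersection extents.
    iw = (np.minimum(boxes[:, None, 2], query_boxes[None, :, 2])
          - np.maximum(boxes[:, None, 0], query_boxes[None, :, 0]) + 1)
    ih = (np.minimum(boxes[:, None, 3], query_boxes[None, :, 3])
          - np.maximum(boxes[:, None, 1], query_boxes[None, :, 1]) + 1)
    inter = np.clip(iw, 0, None) * np.clip(ih, 0, None)

    union = box_areas[:, None] + query_areas[None, :] - inter
    return inter / union


try:
    # Prefer the compiled version when it imports cleanly.
    from keras_alfnet.utils.cython_bbox import bbox_overlaps
except ImportError:
    bbox_overlaps = bbox_overlaps_np
```

The same function can also serve as a reference to compare against the compiled bbox_overlaps on a few boxes. It will be slower than the Cython version on large inputs.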

yongqiangzhang1 commented on June 18, 2024

@pnnnnnnn, did you check that box_op and bbox_overlaps do the same thing?

pnnnnnnn commented on June 18, 2024

> @pnnnnnnn, did you check that box_op and bbox_overlaps do the same thing?

There's no box_op function.

yongqiangzhang1 commented on June 18, 2024

The "No module named cython_bbox" and "No module named bbox" errors are solved by your compiled utils.zip files. But there is a new error at "from nms.gpu_nms import gpu_nms": ImportError: No module named gpu_nms. Could you compile the nms module and upload the compiled files? Thanks.

Chen94yue commented on June 18, 2024

> @pnnnnnnn, are the trained results correct?

> Still training. So far I've trained for 70 epochs and the total loss has dropped from 0.66 to 0.19.

Did you get the same MR as the paper?

pnnnnnnn commented on June 18, 2024

> @pnnnnnnn, are the trained results correct?

> Still training. So far I've trained for 70 epochs and the total loss has dropped from 0.66 to 0.19.

> Did you get the same MR as the paper?

Not yet(?). I've trained for 200 epochs (2k iterations per epoch, batch size 4, GPU 1050 Ti) and got 16.53 with the best model. Now I'm decreasing the LR from 1e-4 to 1e-5 for 100 epochs.
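
To put those numbers in one place, here is a small sketch of the schedule as described: 1e-4 for the first 200 epochs, then 1e-5. The function name lr_for_epoch and the way the switch is expressed are illustrative only; the thread does not say how the learning rate was actually changed in train.py.

```python
def lr_for_epoch(epoch, switch_epoch=200, base_lr=1e-4, low_lr=1e-5):
    """Step schedule: base_lr before switch_epoch, low_lr from then on."""
    return base_lr if epoch < switch_epoch else low_lr


# Images processed in the first phase: epochs * iterations per epoch * batch size.
images_seen = 200 * 2000 * 4
print(images_seen)  # 1600000
print(lr_for_epoch(0), lr_for_epoch(199), lr_for_epoch(200))
```

With batch size 4, the first 200 epochs amount to 1.6 million image presentations, which is worth keeping in mind when comparing against runs that use a larger batch.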

youtang1993 commented on June 18, 2024

> @pnnnnnnn, are the trained results correct?

> Still training. So far I've trained for 70 epochs and the total loss has dropped from 0.66 to 0.19.

> Did you get the same MR as the paper?

> Not yet(?). I've trained for 200 epochs (2k iterations per epoch, batch size 4, GPU 1050 Ti) and got 16.53 with the best model. Now I'm decreasing the LR from 1e-4 to 1e-5 for 100 epochs.

Hi, still the same question: did you get the same MR as the paper? The best score I have got is 16.33. A BIG GAP.

pnnnnnnn commented on June 18, 2024

> @pnnnnnnn, are the trained results correct?

> Still training. So far I've trained for 70 epochs and the total loss has dropped from 0.66 to 0.19.

> Did you get the same MR as the paper?

> Not yet(?). I've trained for 200 epochs (2k iterations per epoch, batch size 4, GPU 1050 Ti) and got 16.53 with the best model. Now I'm decreasing the LR from 1e-4 to 1e-5 for 100 epochs.

> Hi, still the same question: did you get the same MR as the paper? The best score I have got is 16.33. A BIG GAP.

The best I've got is 13.18. Maybe it's because of my small batch size (only 4) that I can't reach 12.01.

ou525 commented on June 18, 2024

Hi, when I run test.py I also have the same problem. I use Python 3.5. @VideoObjectSearch
Traceback (most recent call last):
  File "test.py", line 32, in <module>
    from keras_alfnet.model.model_1step import Model_1step
  File "/home/ou/workplace/ALFNet/keras_alfnet/model/model_1step.py", line 1, in <module>
    from .base_model import Base_model
  File "/home/ou/workplace/ALFNet/keras_alfnet/model/base_model.py", line 2, in <module>
    from keras_alfnet import data_generators
  File "/home/ou/workplace/ALFNet/keras_alfnet/data_generators.py", line 7, in <module>
    from .utils.cython_bbox import bbox_overlaps
ImportError: /home/ou/workplace/ALFNet/keras_alfnet/utils/cython_bbox.so: undefined symbol: _Py_ZeroStruct
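
_Py_ZeroStruct is the symbol behind Py_False in the Python 2 C API; Python 3 names it _Py_FalseStruct. So this error usually means the .so was built against Python 2 and is being loaded by Python 3.5, and the likely fix is to rebuild the extension with the interpreter you actually run. A rough way to check a prebuilt file before importing it is to look for the Python-2-only symbol name in its bytes. This is a heuristic sketch, and the helper name python2_symbols_in is invented for illustration.

```python
PY2_ONLY_SYMBOLS = (b"_Py_ZeroStruct",)


def python2_symbols_in(path, symbols=PY2_ONLY_SYMBOLS):
    """Return the Python-2-only symbol names that appear in the file's bytes."""
    with open(path, "rb") as f:
        data = f.read()
    # Symbol names are stored as plain strings in the shared object,
    # so a byte search is enough for a first check.
    return [s.decode("ascii") for s in symbols if s in data]
```

A non-empty result for keras_alfnet/utils/cython_bbox.so would suggest the file came from a Python 2 build.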

m1nt07 commented on June 18, 2024

> @yongqiangzhang1
> You can have a try.
> nms.zip

Hi @VideoObjectSearch, when I use the nms.zip, I get this error:
ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory

I use CUDA 9.0. How can I compile it to make it work?
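
The error message says the prebuilt gpu_nms was linked against the CUDA 8.0 runtime (libcudart.so.8.0). With only CUDA 9.0 installed, the loader will typically find libcudart.so.9.0 instead, so the extension has to be rebuilt against the installed toolkit, or a CUDA 8.0 runtime has to be on the library path. A small sketch to see which runtime the loader can actually open; the helper name can_load is made up, and the two library names are the usual sonames for those CUDA versions.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open library_name."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    for name in ("libcudart.so.8.0", "libcudart.so.9.0"):
        print(name, "loadable" if can_load(name) else "not found")
```

If only the 9.0 runtime is loadable, the nms.zip built for CUDA 8.0 cannot work as is.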

xiaoshang123 commented on June 18, 2024

> @yongqiangzhang1
> You can have a try.
> nms.zip

Hi, when I use the nms.zip, it solves the "ImportError: No module named gpu_nms" problem, but a new problem comes up:
Traceback (most recent call last):
  File "train.py", line 40, in <module>
    from keras_alfnet.model.model_2step import Model_2step
  File "/home/by/ma/ALFNet-master/keras_alfnet/model/model_2step.py", line 7, in <module>
    from keras_alfnet import bbox_process
  File "/home/by/ma/ALFNet-master/keras_alfnet/bbox_process.py", line 7, in <module>
    from nms_wrapper import nms
  File "/home/by/ma/ALFNet-master/keras_alfnet/nms_wrapper.py", line 9, in <module>
    from nms.cpu_nms import cpu_nms
ImportError: /home/by/ma/ALFNet-master/keras_alfnet/nms/cpu_nms.so: undefined symbol: PyFPE_jbuf
How can I solve it? Thank you.
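
An undefined symbol in a prebuilt .so generally points to a binary built against a different Python build than the one loading it, so rebuilding the nms extension locally is the usual way out. When that is not possible, a pure NumPy greedy NMS can stand in for cpu_nms. The sketch below assumes the py-faster-rcnn convention: dets is an (N, 5) array of [x1, y1, x2, y2, score] with inclusive pixel coordinates, and the return value is the list of kept row indices. Whether ALFNet's nms_wrapper expects exactly this has not been verified here.

```python
import numpy as np


def py_cpu_nms(dets, thresh):
    """Greedy non-maximum suppression.

    dets: (N, 5) array of [x1, y1, x2, y2, score].
    Returns the indices of the kept rows, highest score first.
    """
    dets = np.asarray(dets, dtype=np.float64).reshape(-1, 5)
    x1, y1, x2, y2, scores = (dets[:, i] for i in range(5))
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]  # highest score first

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the current box with every remaining box.
        xx1 = np.maximum(x1[i], x1[rest])
        yy1 = np.maximum(y1[i], y1[rest])
        xx2 = np.minimum(x2[i], x2[rest])
        yy2 = np.minimum(y2[i], y2[rest])
        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter)
        # Keep only boxes that do not overlap the current one too much.
        order = rest[iou <= thresh]
    return keep
```

This runs on the CPU only and will be slower than the compiled versions, but it has no build step.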

weizheliu commented on June 18, 2024

> @yongqiangzhang1
> You can have a try.
> nms.zip

> Hi @VideoObjectSearch, when I use the nms.zip, I get this error:
> ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory

> I use CUDA 9.0. How can I compile it to make it work?

I have the same problem. Did you find a CUDA 9 version of nms?

whitenightwu commented on June 18, 2024

> @yongqiangzhang1
> You can have a try.
> nms.zip

> Hi @VideoObjectSearch, when I use the nms.zip, I get this error:
> ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory
> I use CUDA 9.0. How can I compile it to make it work?

> I have the same problem. Did you find a CUDA 9 version of nms?

You can try nms.zip

xiefeiwhu commented on June 18, 2024

> Hi, when I run test.py I also have the same problem. I use Python 3.5. @VideoObjectSearch
> Traceback (most recent call last):
>   File "test.py", line 32, in <module>
>     from keras_alfnet.model.model_1step import Model_1step
>   File "/home/ou/workplace/ALFNet/keras_alfnet/model/model_1step.py", line 1, in <module>
>     from .base_model import Base_model
>   File "/home/ou/workplace/ALFNet/keras_alfnet/model/base_model.py", line 2, in <module>
>     from keras_alfnet import data_generators
>   File "/home/ou/workplace/ALFNet/keras_alfnet/data_generators.py", line 7, in <module>
>     from .utils.cython_bbox import bbox_overlaps
> ImportError: /home/ou/workplace/ALFNet/keras_alfnet/utils/cython_bbox.so: undefined symbol: _Py_ZeroStruct

Hi, I have the same problem. Have you solved it?

nankeermeng commented on June 18, 2024

@yongqiangzhang1 @pnnnnnnn
Following the code, I can train for 150 epochs, but when I run test.py using the trained weights resnet_e3_l1.15433712553.hdf5, I get no test results: val_det.txt is empty. Why?
