Hi - I can successfully run darkflow in demo mode with the camera. Great! I a

Guidelines on training for PascalVOC data? about darkflow HOT 15 CLOSED

thtrieu commented on July 17, 2024

Guidelines on training for PascalVOC data?

from darkflow.

Comments (15)

thtrieu commented on July 17, 2024

It will handle the parsing itself as stated in README. Hope there's no bug.

rtrahms commented on July 17, 2024

Thanks. I think I found the issue. I had a labels.txt file with all 20 classes in it, and needed to use the yolo-full.cfg file. Once I entered the line below, things moved along into training nicely (including the parsing at the beginning):

./flow --model cfg/v1/yolo-full.cfg --dataset /home/rob/Data_PascalVOC/VOCdevkit/VOC2012/JPEGImages/ --annotation /home/rob/Data_PascalVOC/VOCdevkit/VOC2012/Annotations/ --backup yolo_backup --train

Since my version of TF is built with CUDA support, I assume I don't need to call --gpu 1.0 explicitly to use the GPU for training, correct?

thtrieu commented on July 17, 2024

You do have to pass --gpu explicitly.
P.S. Lucky you to have a GPU at hand.

rtrahms commented on July 17, 2024

Okay. I only have 8 GB of GPU memory (NVIDIA GTX 1080) and will need to find a way to trim memory use. When using Caffe/DIGITS, I was able to reduce batch sizes; I'm not sure whether I can do that here.

thtrieu commented on July 17, 2024

Decreasing batch size by a factor of 2.0 will increase the standard deviation of the gradient estimate only by a factor of sqrt(2.0), so it is actually beneficial to trim batch size if you have limited resources, especially when it comes to resource-intensive tools like TF.
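
The sqrt(2.0) scaling can be checked numerically. This is a minimal sketch, assuming the per-example gradients behave like independent samples with unit variance (a simplification, not how darkflow computes anything); `std_of_batch_mean` is a hypothetical helper. It shows that halving the batch makes the batch mean about sqrt(2.0) times noisier, not 2.0 times.

```python
import random
import statistics


def std_of_batch_mean(batch_size, trials=20000, seed=0):
    """Empirical std of the mean of `batch_size` i.i.d. unit-variance samples."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        batch = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
        means.append(statistics.fmean(batch))
    return statistics.pstdev(means)


# Noise of a batch of 16 relative to a batch of 32.
ratio = std_of_batch_mean(16, seed=0) / std_of_batch_mean(32, seed=1)
print(ratio)  # close to sqrt(2.0), about 1.41
```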

Good luck with the training.

rtrahms commented on July 17, 2024

Good to know! Reducing batch size on yolo-full.cfg (even down to 2) did not solve my problem, but running with yolo-4c.cfg and --gpu 1.0 was successful: GPU training is off and running. I'll have to investigate more ways to lean this up for my current GPU. Thanks.

rtrahms commented on July 17, 2024

Hi -
I tried to stop and restart the training, using a saved checkpoint. I followed the format you specified below:

./flow --train --model cfg/yolo-2c.cfg --load ckpt/yolo-3c-1500

In my case I have a collection of checkpoint files for yolo-4c, the latest of which is 1162. The following command:

./flow --model cfg/v1/yolo-4c.cfg --dataset /home/rob/Data_PascalVOC/VOCdevkit/VOC2012/JPEGImages/ --annotation /home/rob/Data_PascalVOC/VOCdevkit/VOC2012/Annotations/ --train --load ckpt/yolo-4c-1162 --gpu 1.0

gives an error:
Traceback (most recent call last):
  File "./flow", line 42, in <module>
    tfnet = TFNet(FLAGS)
  File "/home/rob/darkflow/net/build.py", line 34, in __init__
    darknet = Darknet(FLAGS)
  File "/home/rob/darkflow/dark/darknet.py", line 12, in __init__
    self.get_weight_src(FLAGS)
  File "/home/rob/darkflow/dark/darknet.py", line 46, in get_weight_src
    '{} not found'.format(FLAGS.load)
AssertionError: ckpt/yolo-4c-1162 not found

Am I calling this correctly?

thtrieu commented on July 17, 2024

Looking at the assertion error, clearly ckpt/yolo-4c-1162 does not exist. Make sure it is there.

rtrahms commented on July 17, 2024

There is no file with just that prefix, but each checkpoint seems to consist of four file types: '00000-of-00001', index, meta and profile. This is the same for every checkpoint, not just 1162, as you can see below. I tried --load with each of the files below, with no success. Should I expect a prefix-only file in addition to the four files I see?

rob@skynet1:/darkflow$ cd ckpt
rob@skynet1:/darkflow/ckpt$ ls yolo-4c-1162*
yolo-4c-1162.data-00000-of-00001 yolo-4c-1162.index yolo-4c-1162.meta yolo-4c-1162.profile

rtrahms commented on July 17, 2024

I figured it out! I was using the filename, not the iteration number! All is well now, training resumed after specifying "--load 1162". Thanks.
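
For anyone hitting the same error: the value for --load is the trailing iteration number in the checkpoint file names, not a file name. As a rough sketch, assuming checkpoints are named as in the listing above (`<model>-<step>.index` and so on), the available step numbers can be pulled out of a directory listing like this. `checkpoint_steps` is a hypothetical helper, not part of darkflow.

```python
import re


def checkpoint_steps(filenames, model_name):
    """Return sorted step numbers found in names like '<model_name>-<step>.index'."""
    pattern = re.compile(re.escape(model_name) + r"-(\d+)\.index$")
    steps = set()
    for name in filenames:
        match = pattern.match(name)
        if match:
            steps.add(int(match.group(1)))
    return sorted(steps)


# File names copied from the `ls` output above.
files = [
    "yolo-4c-1162.data-00000-of-00001",
    "yolo-4c-1162.index",
    "yolo-4c-1162.meta",
    "yolo-4c-1162.profile",
]
steps = checkpoint_steps(files, "yolo-4c")
print(steps)  # [1162]
```

The number printed is what was passed as "--load 1162" in the command that worked.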

thtrieu commented on July 17, 2024

Great. The files you listed are new to me; that must be TF 0.12. Nevertheless, good luck with getting back on track.

rtrahms commented on July 17, 2024

Reviewing the yolo code, I noticed that there are thresholds defined for four classes in test.py:
_thresh = dict({
    'person': .2,
    'pottedplant': .1,
    'chair': .12,
    'tvmonitor': .13
})

Do I need to adjust this for the number of classes I have in labels.txt and the .cfg file? Just curious.

thtrieu commented on July 17, 2024

Actually, that was an experiment of mine that I forgot to remove. Please delete that dictionary, or modify it so that each class has its own threshold, overriding the threshold specified in `.cfg`. It's up to you, and it does not affect the training process.
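
A per-class override of this kind could look roughly like the sketch below. It reuses the dictionary quoted above; `threshold_for`, `keep_detection`, and the default value 0.3 are hypothetical names and numbers for illustration, not darkflow's actual code or the value in any `.cfg`.

```python
# Per-class overrides, as in the dictionary quoted above.
_thresh = {
    "person": 0.2,
    "pottedplant": 0.1,
    "chair": 0.12,
    "tvmonitor": 0.13,
}


def threshold_for(label, default):
    """Return the class-specific threshold, or `default` if the class has none."""
    return _thresh.get(label, default)


def keep_detection(label, confidence, default):
    """Keep a detection only if its confidence exceeds the applicable threshold."""
    return confidence > threshold_for(label, default)


# 0.3 stands in for the global threshold; it is an assumed example value.
print(keep_detection("chair", 0.15, 0.3))  # True: 0.15 > 0.12 (override)
print(keep_detection("dog", 0.15, 0.3))    # False: 0.15 <= 0.3 (default)
```

Classes missing from the dictionary fall back to the default, so it does not need one entry per class in labels.txt.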

On Fri, Dec 30, 2016 at 12:46 PM, Rob Trahms wrote:

> Reviewing the yolo code, I noticed that there are thresholds defined for four classes in test.py:
> _thresh = dict({ 'person': .2, 'pottedplant': .1, 'chair': .12, 'tvmonitor': .13 })
> Do I need to adjust this for the number of classes I have in labels.txt and the .cfg file? Just curious.

rtrahms commented on July 17, 2024

Will do - thanks!

hemavakade commented on July 17, 2024

@rtrahms How did you test after you trained the model? Can you please help me with the command? I have a saved checkpoint in the ckpt folder. Should I use this?
