experiencor / keras-yolo3
Training and Detecting Objects with YOLO3
License: MIT License
ValueError:Dimension 0 in both shapes must be equal, but are 1 and 255. Shapes are [1,1,1024,18] and [255,1024,1,1]. for 'Assign_360' (op: 'Assign') with input shapes: [1,1,1024,18], [255,1024,1,1].
How can I solve this? Thank you.
Python 3.6, Win10, GTX 1060
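The two shapes in this error look like a class-count mismatch. Each YOLOv3 detection head predicts 3 anchors per grid cell, and each anchor carries 4 box coordinates, 1 objectness score and one score per class. A rough arithmetic check (the helper name head_filters is made up for illustration):

```python
def head_filters(nb_class, anchors_per_scale=3):
    """Number of output channels of one YOLOv3 detection head."""
    # 4 box coordinates + 1 objectness score + one score per class, per anchor
    return anchors_per_scale * (4 + 1 + nb_class)

print(head_filters(80))  # 255, the 80-class COCO layout
print(head_filters(1))   # 18, a single-class model
```

So the weight file and the model were probably built for different numbers of labels, although the error alone does not say which side is the intended one.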
I've trained a model to detect a single class (similar to the examples).
When using the predict script on an image in which no object of that class is detected, the subsequent bounding-box handling runs into a division by zero in bbox/bbox_iou() because union becomes zero.
I'm not sure how to fix the underlying issue, but I was able to work around it by naively replacing the return statement with:
return float(intersect) / union if union != 0 else 0.
This gets me further, but I then also had to add this to bbox/draw_boxes():
if abs(box.xmin) > 1000 or abs(box.xmax) > 1000 or abs(box.ymin) > 1000 or abs(box.ymax) > 1000:
    continue
because the [xy]min/max values are essentially invalid (extremely small or large).
I guess the boxes should have been discarded somewhere earlier so that they never reach those code paths, but I'm not familiar enough with the code to spot where.
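For reference, here is a self-contained sketch of an IoU function with the zero-union guard described above. The Box tuple and the names interval_overlap and safe_iou are illustrative assumptions, not the repository's bbox.py:

```python
from collections import namedtuple

# Hypothetical stand-in for the repository's box class.
Box = namedtuple("Box", ["xmin", "ymin", "xmax", "ymax"])

def interval_overlap(a_min, a_max, b_min, b_max):
    """Length of the overlap between [a_min, a_max] and [b_min, b_max]."""
    return max(0.0, min(a_max, b_max) - max(a_min, b_min))

def safe_iou(box1, box2):
    """Intersection over union that returns 0.0 instead of dividing by zero."""
    inter_w = interval_overlap(box1.xmin, box1.xmax, box2.xmin, box2.xmax)
    inter_h = interval_overlap(box1.ymin, box1.ymax, box2.ymin, box2.ymax)
    intersect = inter_w * inter_h

    area1 = (box1.xmax - box1.xmin) * (box1.ymax - box1.ymin)
    area2 = (box2.xmax - box2.xmin) * (box2.ymax - box2.ymin)
    union = area1 + area2 - intersect

    # Degenerate (zero-area) boxes give union == 0; treat them as no overlap.
    if union <= 0:
        return 0.0
    return float(intersect) / union
```

This only hides the symptom. It does not explain why degenerate boxes reach the IoU computation in the first place.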
Would be great to be able to reproduce your env and results
I run the command
python yolo3_one_file_to_detect_them_all.py -w yolo3.weights -i dog.jpg
and the following error appears:
OSError: Unable to open file (unable to open file: name = 'model.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
I don't see model.h5 in the repository. Can you add it?
One more thing: lines 400 to 403 of the code are
# load the weights trained on COCO into the model
#weight_reader = WeightReader(weights_path)
yolov3.load_weights("model.h5")
yolov3.save("backend.h5")
That means the input weights are not used for prediction; the file "model.h5" is used instead.
And line 403 calls yolov3.save("backend.h5"). Why must the model be saved to a file?
Thank you very much.
With the grid scales == [ 1, 1, 1] I notice that the losses obviously scale based on the grid size. I have a few questions related to this.
Do you want the losses to be roughly equal per output head, and if so, is it fine to adjust the grid_scale accordingly (i.e. [1, .5, .25])?
What do you consider a good loss for a model now? I used to know I had a good YOLOv2 model when the loss was < 0.1. Now I am getting losses around 20. It's possible that there are other issues at play here and that I have made a mistake somewhere. I am not using your code as is, but instead incorporating the new loss into my own code.
1. Was your backend.h5 pre-trained on ImageNet?
2. Can you give me details of the training process for backend.h5?
Thank you so much!
`
2018-04-15 16:37:36.355877: W tensorflow/core/common_runtime/bfc_allocator.cc:279] ****************************************************************************************************
2018-04-15 16:37:36.362129: W tensorflow/core/framework/op_kernel.cc:1202] OP_REQUIRES failed at conv_ops.cc:677 : Resource exhausted: OOM when allocating tensor with shape[8,1024,13,13] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1361, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1340, in _run_fn
target_list, status, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/errors_impl.py", line 516, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[8,13,13,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: replica_0/model_1/conv_73/convolution = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](replica_0/model_1/leaky_72/LeakyRelu/Maximum, conv_73/kernel/read/_2395)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[Node: training/Adam/gradients/replica_0/model_1/bnorm_47/FusedBatchNorm_grad/FusedBatchNormGrad/_5027 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_39158...chNormGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 251, in <module>
_main_(args)
File "train.py", line 210, in _main_
max_queue_size = 8
File "/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 2224, in fit_generator
class_weight=class_weight)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1883, in train_on_batch
outputs = self.train_function(ins)
File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2478, in call
**self.session_kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 905, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1137, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1355, in _do_run
options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1374, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[8,13,13,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: replica_0/model_1/conv_73/convolution = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](replica_0/model_1/leaky_72/LeakyRelu/Maximum, conv_73/kernel/read/_2395)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[Node: training/Adam/gradients/replica_0/model_1/bnorm_47/FusedBatchNorm_grad/FusedBatchNormGrad/_5027 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_39158...chNormGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
Caused by op 'replica_0/model_1/conv_73/convolution', defined at:
File "train.py", line 251, in <module>
_main_(args)
File "train.py", line 190, in _main_
saved_weights_name = config['train']['saved_weights_name']
File "train.py", line 113, in create_model
train_model = multi_gpu_model(template_model, gpus=multi_gpu)
File "/content/drive/drive/keras-yolo3/utils/multi_gpu_model.py", line 48, in multi_gpu_model
outputs = model(inputs)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py", line 619, in call
output = self.call(inputs, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py", line 2085, in call
output_tensors, _, _ = self.run_internal_graph(inputs, masks)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py", line 2236, in run_internal_graph
output_tensors = _to_list(layer.call(computed_tensor, **kwargs))
File "/usr/local/lib/python3.6/dist-packages/keras/layers/convolutional.py", line 168, in call
dilation_rate=self.dilation_rate)
File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 3335, in conv2d
data_format=tf_data_format)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 781, in convolution
return op(input, filter)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 869, in call
return self.conv_op(inp, filter)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 521, in call
return self.call(inp, filter)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 205, in call
name=self.name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 631, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3271, in create_op
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1650, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[8,13,13,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: replica_0/model_1/conv_73/convolution = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](replica_0/model_1/leaky_72/LeakyRelu/Maximum, conv_73/kernel/read/_2395)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[Node: training/Adam/gradients/replica_0/model_1/bnorm_47/FusedBatchNorm_grad/FusedBatchNormGrad/_5027 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_39158...chNormGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
`
Could you explain how you got the pretrained weights for the backend (backend.h5)?
count [44][5]
loss: [44][57.3060036]
loss: [44][173.999207]
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/keras/utils/data_utils.py", line 578, in get
inputs = self.queue.get(block=True).get()
File "/usr/lib/python3.6/multiprocessing/pool.py", line 644, in get
raise self._value
File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/local/lib/python3.6/dist-packages/keras/utils/data_utils.py", line 401, in get_index
return _SHARED_SEQUENCES[uid][i]
File "/content/drive/keras-yolo3/generator.py", line 73, in __getitem__
img, all_objs = self._aug_image(train_instance, net_h, net_w)
File "/content/drive/keras-yolo3/generator.py", line 160, in _aug_image
image = cv2.imread(image_name)[:,:,::-1] # RGB image
TypeError: 'NoneType' object is not subscriptable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "train.py", line 251, in <module>
_main_(args)
File "train.py", line 210, in _main_
max_queue_size = 4
File "/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 2192, in fit_generator
generator_output = next(output_generator)
File "/usr/local/lib/python3.6/dist-packages/keras/utils/data_utils.py", line 584, in get
six.raise_from(StopIteration(e), e)
File "<string>", line 3, in raise_from
StopIteration: 'NoneType' object is not subscriptable
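cv2.imread returns None instead of raising when it cannot read a file, which is why the failure only shows up at the [:,:,::-1] subscript. A small pre-check over the image paths can point to the offending file before training starts. This is a sketch: the helper name find_unreadable_images is made up, and it only checks that each file exists and is non-empty, not that it decodes as an image:

```python
import os

def find_unreadable_images(image_paths):
    """Return the paths that are missing or point to an empty file."""
    bad = []
    for path in image_paths:
        # A missing or zero-byte file is the most common reason for imread -> None.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            bad.append(path)
    return bad
```

Running this on the file names collected from the annotations (for example, the image folder joined with each annotation's filename) should list the entries whose path or extension does not match what is on disk.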
Thank you very much for your contribution!
I downloaded your code from GitHub and tested it on my computer.
When I ran "python yolo3_one_file_to_detect_them_all.py -w yolo3.weights -i dog.jpg", the following problem occurred:
Traceback (most recent call last):
File "train.py", line 101, in <module>
_main_(args)
File "train.py", line 70, in _main_
anchors = config['model']['anchors'])
x = LeakyReLU(alpha=0.1)(x)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/base_layer.py", line 454, in call
output = self.call(inputs, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/keras/layers/advanced_activations.py", line 46, in call
return K.relu(inputs, alpha=self.alpha)
File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 2933, in relu
x = tf.nn.leaky_relu(x, alpha)
AttributeError: 'module' object has no attribute 'leaky_relu'
I don't know why. I also ran yolo2 on my computer and could train on my own dataset, but after this error occurred, yolo2 can't run either and shows the same problem. Do you have any idea?
Using train.py with only the dataset path changed, I get an OOM after it prints:
Epoch 1/103
('resizing: ', 320, 320)
I am using TF GPU 1.4.1, Keras GPU 2.1.5, Python 2.7.13 and a single GTX 1080. yolo3_one_file_to_detect_them_all works.
I trained on my dataset for 50 epochs, but training stopped early when it was almost complete, and the model couldn't be evaluated when I ran evaluate.py. Like this:
Can anybody tell me what the problem is and how to solve it?
@experiencor
Hi there! Thanks a lot; I successfully trained on my own data using your code.
However, I found some bugs in predict.py:
Line 109: image_paths = [inp_file for inp_file in image_paths if (inp_file[-4:] == '.jpg' or inp_file == '.png')]
The second 'inp_file' should be 'inp_file[-4:]'.
Also, because the filter doesn't include '.JPEG' files, it confused me for a while when I tested on my data, which are all .JPEG images (no outputs were produced), so I changed this line to:
image_paths = [inp_file for inp_file in image_paths if (inp_file[-4:] in ['.jpg', '.png', 'JPEG'])]
I still got a warning saying: 'warnings.warn('No training configuration found in save file: '
I don't know why, but it does not seem to matter.
I want to train on COCO data.
Darknet uses subdivisions because of batch norm. Is that implemented here?
Thank you.
Hello,
I found a small mistake in the predict file, line 109:
image_paths = [inp_file for inp_file in image_paths if (inp_file[-4:] == '.jpg' or inp_file == '.png')]
For PNG, inp_file should be inp_file[-4:]. As written, people who have PNG images will never get any output.
The correct line should be:
image_paths = [inp_file for inp_file in image_paths if (inp_file[-4:] == '.jpg' or inp_file[-4:] == '.png')]
Best of luck.
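Slicing the last four characters is fragile: it misses upper-case extensions and five-character ones such as .jpeg. One possible alternative is to compare the lower-cased extension against a set. This is a sketch, and the names IMAGE_EXTENSIONS and filter_image_paths are made up for illustration:

```python
import os

# Extensions to accept, compared case-insensitively.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def filter_image_paths(paths):
    """Keep only the paths whose extension is a known image extension."""
    return [p for p in paths
            if os.path.splitext(p)[1].lower() in IMAGE_EXTENSIONS]

print(filter_image_paths(["a.jpg", "b.PNG", "c.JPEG", "notes.txt"]))
# ['a.jpg', 'b.PNG', 'c.JPEG']
```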
1. Today I found that my computer had shut down and my training was interrupted. If I restart the training, can the project reload the weight file I trained before, or will it just start training from scratch?
I hope you can help me. Thank you.
please modify
Training was interrupted with the error 'Segmentation fault (core dumped)'. What is this problem?
For objectness scores it uses logistic regression. What is the advantage of this? What does YOLOv2 use for objectness scores? Why is this trained as multi-label classification? What was used in v2, multi-class classification? What is the advantage of this?
Could you make this a package so I can install it with pip? It would be useful on Kaggle (which doesn't allow loose files but does allow pip installs from GitHub).
Hi, I just downloaded the whole code. I want to run it on a GPU without occupying the whole GPU. How do I limit how much of the GPU it uses?
Traceback (most recent call last):
File "train.py", line 284, in <module>
_main_(args)
File "train.py", line 231, in _main_
scales = config['train']['scales'],
File "train.py", line 144, in create_model
template_model.load_weights(saved_weights_name)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2647, in load_weights
with h5py.File(filepath, mode='r') as f:
File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/files.py", line 269, in __init__
fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/files.py", line 99, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 78, in h5py.h5f.open
IOError: Unable to open file (file signature not found)
I got this error when I tried to train the raccoon model.
I put both the raccoon.h5 and backend.h5 files in the folder.
Thank you!
I'm currently using TensorFlow 1.3 and Keras 2.1.5, and I'm experiencing issues related to LeakyRelu. Could you please confirm which versions are required for this code to run?
Thanks in advance.
In get_yolo_boxes(...) in utils.py, image_h and image_w, which are needed by correct_yolo_boxes, can differ from image to image.
This causes an error when evaluating in batches to speed things up (evaluating images one by one works fine).
Hi,
Thank you for the sharing.
I am wondering whether I can use a model trained with Darknet.
And was the available pre-trained model trained on the raccoon dataset?
Thank you
First, I want to thank you for answering my stupid questions. I have another one:
I used the weights that I trained last night to predict the raccoon, but on the test image there is only a bounding box without a label name. Is this right? And how can I get the label name?
This is my config.json; I did add the label name.
So my question is: how long does it take to train a model from scratch? I was using SSD512 before this, and it gave me good results after 14 hours.
When I try to do prediction I am getting this error:
Traceback (most recent call last):
File "predict.py", line 151, in <module>
_main_(args)
File "predict.py", line 34, in _main_
infer_model = load_model(config['train']['saved_weights_name'])
File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\models.py", line 243, in load_model
model = model_from_config(model_config, custom_objects=custom_objects)
File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\models.py", line 317, in model_from_config
return layer_module.deserialize(config, custom_objects=custom_objects)
File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\layers\__init__.py", line 55, in deserialize
printable_module_name='layer')
File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\utils\generic_utils.py", line 144, in deserialize_keras_object
list(custom_objects.items())))
File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\engine\topology.py", line 2514, in from_config
process_layer(layer_data)
File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\engine\topology.py", line 2500, in process_layer
custom_objects=custom_objects)
File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\layers\__init__.py", line 55, in deserialize
printable_module_name='layer')
File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\utils\generic_utils.py", line 144, in deserialize_keras_object
list(custom_objects.items())))
File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\engine\topology.py", line 2514, in from_config
process_layer(layer_data)
File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\engine\topology.py", line 2500, in process_layer
custom_objects=custom_objects)
File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\layers\__init__.py", line 55, in deserialize
printable_module_name='layer')
File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\utils\generic_utils.py", line 138, in deserialize_keras_object
': ' + class_name)
ValueError: Unknown layer: YoloLayer
I tried adding YoloLayer to custom objects as well but then I get:
TypeError: __init__() missing 5 required positional arguments: 'anchors', 'max_grid', 'batch_size', 'warmup_batches', and 'ignore_thresh'
Any clues?
Thanks for the great work.
There are two things I can't understand.
How can I add 3 new labels in config.json? The example you show has only raccoons, but I want to have 3 labels in one picture, so I guess the following:
"labels": ["a","b","c"]
Does it work?
The pkl file is newly added and it isn't mentioned in the README. Can you tell me how to make pkl files?
I'm using Google Colab to run the code, but training and saving the weights takes too much time.
Is there any way I can run this code on Google Colab?
Hey man,
This is the error I am getting right now.
Traceback (most recent call last):
File "gen_anchors.py", line 132, in <module>
_main_(args)
File "gen_anchors.py", line 111, in _main_
centroids = run_kmeans(annotation_dims, num_anchors)
File "gen_anchors.py", line 57, in run_kmeans
indices = [random.randrange(ann_dims.shape[0]) for i in range(anchor_num)]
File "gen_anchors.py", line 57, in <listcomp>
indices = [random.randrange(ann_dims.shape[0]) for i in range(anchor_num)]
File "/home/abraham/anaconda3/envs/yolo/lib/python3.5/random.py", line 195, in randrange
raise ValueError("empty range for randrange()")
ValueError: empty range for randrange()
It's due to the pkl file for the dataset. How do I create the pkl file for it?
What does the pkl file contain?
*Btw it's a nice repo ;), you made YOLOv3 on such short notice
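random.randrange(0) raises exactly this ValueError, so the traceback suggests that ann_dims was empty, that is, no annotated boxes were loaded before k-means started. A guard along these lines would turn the cryptic message into a clearer one. This is a sketch: pick_initial_centroids is a hypothetical helper, not the code in gen_anchors.py, and it works on a plain list of (width, height) pairs:

```python
import random

def pick_initial_centroids(ann_dims, anchor_num, seed=None):
    """Pick anchor_num random annotation sizes as starting centroids."""
    if len(ann_dims) == 0:
        # randrange(0) would raise "empty range for randrange()" here.
        raise ValueError("no annotated boxes were loaded; check the "
                         "annotation and image folders and any cached file")
    rng = random.Random(seed)
    return [ann_dims[rng.randrange(len(ann_dims))] for _ in range(anchor_num)]
```

If the annotations really are empty, the likely things to check are the paths in the config and whether a stale cache file is being reused. Both are guesses based on the traceback alone.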
Thank you for posting your code. Recently I trained on my own dataset, and the results were great when predicting on images. But when I tested a video recorded with my cellphone, I got nothing! The video format was MP4. Is the video resolution too high? Are there any other requirements for videos besides the format?
I trained a yolo3 model with backend.h5 for hours and got a new_weights.h5 (739MB). When I tried to continue training by loading new_weights.h5, I got the error below. Could someone kindly explain it to me?
Loading pretrained weights.
Traceback (most recent call last):
File "train.py", line 252, in <module>
_main_(args)
File "train.py", line 190, in _main_
saved_weights_name = config['train']['saved_weights_name']
File "train.py", line 108, in create_model
template_model.load_weights(saved_weights_name)
File "/home/hemp/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 2656, in load_weights
f, self.layers, reshape=reshape)
File "/home/hemp/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 3354, in load_weights_from_hdf5_group
str(len(filtered_layers)) + ' layers.')
ValueError: You are trying to load a weight file containing 1 layers into a model with 147 layers.
Hi, when I run the command python yolo3_one_file_to_detect_them_all.py -w yolo3.weights -i dog.jpg, I get the following error message. Can you help me?
Traceback (most recent call last):
File "yolo3_one_file_to_detect_them_all.py", line 431, in <module>
_main_(args)
File "yolo3_one_file_to_detect_them_all.py", line 407, in _main_
new_image = preprocess_input(image, net_h, net_w)
File "yolo3_one_file_to_detect_them_all.py", line 270, in preprocess_input
resized = cv2.resize(image[:,:,::-1]/255., (new_w, new_h))
TypeError: integer argument expected, got float
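cv2.resize tends to raise this TypeError when the target size contains floats, which would happen if new_w or new_h were computed with `/` under Python 3. Below is a minimal sketch of an aspect-ratio-preserving size computation that stays in integers. The helper name letterbox_size and the exact logic are assumptions about what preprocess_input does, not the repository's code:

```python
def letterbox_size(img_h, img_w, net_h, net_w):
    """Largest (new_h, new_w) that fits in the network input, as ints."""
    if float(net_w) / img_w < float(net_h) / img_h:
        # Width is the limiting side: fill it and scale the height.
        new_w = net_w
        new_h = (img_h * net_w) // img_w
    else:
        # Height is the limiting side: fill it and scale the width.
        new_h = net_h
        new_w = (img_w * net_h) // img_h
    return new_h, new_w

print(letterbox_size(576, 768, 416, 416))  # (312, 416)
```

The resulting values can then be passed to cv2.resize as (new_w, new_h). Wrapping the existing values in int(...) at the call site should have the same effect.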
Is it possible to train yolo3 on my own dataset with only one GTX 1080 (8 GB)? I suspect a single 8 GB GPU is not enough, right?
First of all, thank you for your post. I'm working on a project and it helped me a lot, but I also ran into a problem.
While training, bbox.py needs to import get_color from utils.colors, but there is no such script in the utils folder.
I converted yolov3.weights to yolov3.h5 and then tested on the image (dog.jpg). The results look like this:
The bbox coordinates are somewhat wrong; the 'y' coordinate seems larger than in the Darknet test.
I compared the code with the Darknet code and it all looks correct. Has anyone met the same problem? How can I solve it?
Can I use labelImg to make my own dataset? Is the XML document produced by labelImg the same as the XML you provided?
Hi Experiencor,
I was waiting for YOLOv3. Thanks for posting.
As you know, TF Estimators are on the way. If possible, please add multi-GPU training. It takes too much time to train on a single GPU.
I'm getting ZeroDivisionError: float division by zero in the function bbox_iou while running python yolo3_one_file_to_detect_them_all.py -w yolo3.weights -i dog.jpg. I'm not sure if I'm missing anything.
loading weights of convolution #1
loading weights of convolution #2
loading weights of convolution #3
loading weights of convolution #4
Traceback (most recent call last):
File "yolo3_one_file_to_detect_them_all.py", line 440, in <module>
_main_(args)
File "yolo3_one_file_to_detect_them_all.py", line 411, in _main_
weight_reader.load_weights(yolov3)
File "yolo3_one_file_to_detect_them_all.py", line 67, in load_weights
size = np.prod(norm_layer.get_weights()[0].shape)
AttributeError: 'NoneType' object has no attribute 'get_weights'
Using Python 3.6.1, Keras 2.0.3, TF 1.4.1.
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]="2"
I have no clue how to debug this; thanks for any hint.
Thank you for creating this repo! I have a few questions on training and config files:
config_kangaroo.json and config_raccoon.json have different parameters, and config_raccoon.json seems to match the code better. Which set of parameters should be used? Does "scales" mean training on different scales of input images?
Which parameters would reproduce the result in the README?
For example, only "scales" was used in train.py:
train_model, infer_model = create_model(
    nb_class = len(labels),
    anchors = config['model']['anchors'],
    max_box_per_image = max_box_per_image,
    max_grid = [config['model']['max_input_size'], config['model']['max_input_size']],
    batch_size = config['train']['batch_size'],
    warmup_batches = warmup_batches,
    ignore_thresh = config['train']['ignore_thresh'],
    multi_gpu = multi_gpu,
    saved_weights_name = config['train']['saved_weights_name'],
    lr = config['train']['learning_rate'],
    scales = config['train']['scales'],
)
config_raccoon.json
"scales": [1,5,10],
config_kangaroo.json
"grid_scales": [1,1,1],
"obj_scale": 5,
"noobj_scale": 1,
"xywh_scale": 1,
"class_scale": 1,
After run training, I got:
Epoch 00035: loss did not improve from 10.03454
Epoch 00035: reducing learning rate to 1.00000001169e-08.
- 36s - loss: 10.3701 - yolo_layer_1_loss: 1.2003 - yolo_layer_2_loss: 3.5482 -
yolo_layer_3_loss: 5.6216
Epoch 00035: early stopping
Premature end of JPEG file
kangaroo: 0.7736
mAP: 0.7736
Hi, I successfully trained this Keras yolo3 network on my own custom dataset and it is giving me very good results. However, the network takes up to 2 seconds just to generate predictions for each image.
(This is specifically the time for boxes = get_yolo_boxes.... to run, not any of the extra stuff.)
Is it possible to speed up predictions in any way, or even to use these weights with the plain C version of Darknet?
Thanks a lot.
There is some code that I can't figure out by myself. Please help!
In generator.py, should true_box_index be reset to zero for every train_instance?
Line 68 in 25af1ba
In yolo.py, pred_box_conf - 0 and true_box_wh + tf.zeros_like(true_box_wh) * (1-object_mask) make no difference. I'm really confused about warmup training: what is it trying to do?
Please give more explanation.
Line 74 in 25af1ba
Line 153 in 25af1ba