aiwintermuteai / axelerate

Keras-based framework for AI on the Edge

License: MIT License

Python 83.62% Jupyter Notebook 16.14% Shell 0.13% HTML 0.11%

axelerate's Issues

Issue: OSError: run error; 2013

I get an error from kpu.run_yolo2(task, img) with my detector model.
I have already added "kpu.set_outputs(task, 7,7,5,20)" to match the reshape layer, reshape_1 (Reshape) (None, 7, 7, 5, 20).

config:
"architecture": "MobileNet7_5",
"backend": "imagenet"
Do you know of a solution?

TensorFlow 2.5 error!

Epoch 1/200
Exception in thread Thread-1:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\envs\axelerate\lib\threading.py", line 926, in _bootstrap_inner
self.run()
File "C:\ProgramData\Anaconda3\envs\axelerate\lib\threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "C:\ProgramData\Anaconda3\envs\axelerate\lib\site-packages\tensorflow\python\keras\utils\data_utils.py", line 726, in _run
with closing(self.executor_fn(_SHARED_SEQUENCES)) as executor:
File "C:\ProgramData\Anaconda3\envs\axelerate\lib\site-packages\tensorflow\python\keras\utils\data_utils.py", line 705, in pool_fn
initargs=(seqs, None, get_worker_id_queue()))
File "C:\ProgramData\Anaconda3\envs\axelerate\lib\multiprocessing\context.py", line 119, in Pool
context=self.get_context())
File "C:\ProgramData\Anaconda3\envs\axelerate\lib\multiprocessing\pool.py", line 176, in __init__
self._repopulate_pool()
File "C:\ProgramData\Anaconda3\envs\axelerate\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
w.start()
File "C:\ProgramData\Anaconda3\envs\axelerate\lib\multiprocessing\process.py", line 112, in start
self._popen = self._Popen(self)
File "C:\ProgramData\Anaconda3\envs\axelerate\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\ProgramData\Anaconda3\envs\axelerate\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
reduction.dump(process_obj, to_child)
File "C:\ProgramData\Anaconda3\envs\axelerate\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle weakref objects

Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\ProgramData\Anaconda3\envs\axelerate\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\ProgramData\Anaconda3\envs\axelerate\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

What does this mean? The system is Windows 10.
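
The first traceback ends inside multiprocessing's ForkingPickler: on Windows, worker processes are started with the spawn method, so whatever is handed to them has to be pickled, and weak references cannot be. A minimal sketch of that failure in isolation, using only the standard library (the Dataset class is a made-up stand-in, not an aXeleRate or Keras type):

```python
import pickle
import weakref


class Dataset:
    """Hypothetical stand-in for an object that is only referenced weakly."""


def try_pickle(obj):
    """Return None if obj pickles cleanly, else the exception that was raised."""
    try:
        pickle.dumps(obj)
        return None
    except Exception as exc:  # report whatever the pickler raises
        return exc


target = Dataset()
plain_error = try_pickle({"batch_size": 8})
weakref_error = try_pickle(weakref.ref(target))

print("plain dict:", plain_error)
print("weak reference:", type(weakref_error).__name__, weakref_error)
```

If this is what happens here, keeping data loading in the main process (for example use_multiprocessing=False in Keras fit) is a commonly suggested workaround; whether aXeleRate exposes that setting is not confirmed in this thread.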

Custom input_size does not work in kmodel

Support for a custom input_size does not work.

Only input_size 224 is supported; with input_size 240 it returns an error.

Cannot convert tf to onnx in object detection

Describe the bug
In a simple object detection training run, I need to convert the trained model to ONNX, but the conversion fails with AttributeError: module 'tensorflow.keras.backend' has no attribute 'get_session'.

To Reproduce
To reproduce, I ran a simple person detector training with the converter set to 'onnx'.

Expected behavior
I expect the conversion to succeed so I can use the model on my Jetson Nano.

Screenshots

Environment (please complete the following information):

Using Google Colab right now

Additional context
I saw that there aren't any examples of object detection, but I assume that it would work as well.

This is my config dict:

{
	"model":{
		"type":                 "Detector",
		"architecture":         "MobileNet1_0", # MobileNet7_5
		"input_size":           224,
		"anchors":              [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
		"labels":               ["bobina"],
		"coord_scale" : 		1.0,
		"class_scale" : 		1.0,
		"object_scale" : 		5.0,
		"no_object_scale" : 	3.0 # 1.0e
	},
	"weights" : {
		"full":   				"mobilenet_1_0_224_tf_no_top.h5",
		"backend":   		  "" # mobilenet_1_0_224_tf_no_top.h5
	},
	"train" : {
		"actual_epoch":         15,
		"train_image_folder":   "bobinas_vert/imgs",
		"train_annot_folder":   "bobinas_vert/anns",
		"train_times":          10,
		"valid_image_folder":   "bobinas_vert/imgs_validation",
		"valid_annot_folder":   "bobinas_vert/anns_validation",
		"valid_times":          5,
		"valid_metric":         "mAP",
		"batch_size":           8,
		"learning_rate":        1e-4,
		"saved_folder":   		"TESTE_ZERO_MEU",
		"first_trainable_layer": "", #conv_pw_13_bn
		"augumentation":		True,
		"is_only_detect" : 		False
	},
	"converter" : {
		"type":   				["onnx"]
	}
}

Problems with the config - How to train more epochs

I seem to be having issues with the config, as said in a previous issue. Is there a way to train more epochs?

Support for yolov4/5

Do you plan to support yolov4 (darknet) or yolov5 (pytorch)?

Problem with TFLite on Raspberry Pi? / MaixPy OCR

Hi @AIWintermuteAI, I'm working on an object detection model and aXeleRate is fantastic. I still don't completely understand how it works, because training is very fast and the results are very good compared with the usual training workflow (CSV files, TFRecords, etc.).

Now the issues: I've built a vehicle plate recognition model, and the model for MaixPy works correctly, but when I try to replicate this on the Raspberry Pi with the tflite format it just doesn't run. The Raspberry Pi console shows: "IndexError: list index out of range".

I've been using "https://github.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi" for the implementation, and the training is done with aXeleRate. The same error is reported in the EdjeElectronics issues, and the answer there is: "Hi all, the "list index out of range" error is occurring because you are using an "image classification" model rather than an "object detection" model." So I don't understand what is really happening. I worked in Google Colab (Object Detection) and the config file is:

The script I'm using to run the model on the Raspberry Pi is here: https://github.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/blob/master/TFLite_detection_webcam.py

So, is the conversion failing, or am I running the model incorrectly?

The other thing is about MaixPy: the kmodel works perfectly and the model is really good at detecting plates, but for some strange reason the camera image always comes out rotated. The code I'm using is this:

I modified the example script because in line 18, "img = sensor.snapshot().rotation_corr(z_rotation=90.0)", the rotation part always raised an attribute error.

Finally, I'm planning to do OCR on the plate. On the Raspberry Pi I'll work with OpenCV/pytesseract, but on MaixPy I don't know how to do it. Can I install those libraries, or can you give me an idea of how to do it?

Thanks a lot for any help, and you have done amazing work with aXeleRate.

Colab Notebook not able to convert to kmodel

With the Colab detector notebook, I'm no longer able to convert models. I can train my models and convert them to tflite, but they do not convert to kmodel.

Here is the Colab notebook: https://colab.research.google.com/drive/1eRs-dJdV9ij6RVN4X6rTesuoxrNPi5Ng?usp=sharing

I used this notebook to train a custom detector on my dataset and it worked well. Now I have created a new PASCAL VOC dataset with the VoTT labeling tool, and after training the model is not converted to kmodel, no matter how many times I try. The folder structure is exactly the same as the old dataset and the training itself is fine.

I tried again with the old dataset and it works. I really do not know why: the images are the same size and format, and the same programs were used.

k210 segnet

Hi,
Can you please provide the example scripts for SegNet on the K210?
One more question: do you think it is possible to run pose estimation / PoseNet on the K210?
Many thanks.

AttributeError: 'Image' object has no attribute 'rotation_corr'

I tried running your script, example_scripts/k210/detector/person_detector_v4.py,
but it raised the error below:

[MAIXPY]: find ov2640
Traceback (most recent call last):
File "", line 19, in
AttributeError: 'Image' object has no attribute 'rotation_corr'

MicroPython v0.5.0-22-g7ac6b09 on 2020-03-04; Sipeed_M1 with kendryte-k210

Error importing axelerate python3 library (Illegal instruction (core dumped))

Describe the bug
I'm trying to train a MobileNet on a freshly installed Ubuntu 18.04. After running pip install axelerate (or pip install git+https://github.com/AIWintermuteAI/aXeleRate) I can't import axelerate in python3.

To Reproduce
I don't think this will be easy to reproduce because of my environment. The host is a Windows 10 Server running Oracle VirtualBox. In it I installed a fresh Ubuntu 18.04 and upgraded the system with python3, build-essential and some other libraries.
I installed axelerate through pip, and then the error appears.

Expected behavior
I expect the import to work normally.

Screenshots
The error message is in Portuguese, but it is an Illegal instruction error and Python crashes.

Environment (following the order):

Windows Server 10
Oracle VM VirtualBox
Ubuntu 18.04
5.4.0-66-generic headers

Additional context
I'll also try to create another VM, locally or in the cloud, to test it. I was originally running my tests on a regular Windows 10 machine with all dependencies installed, but training isn't working there either, so I switched environments to test and train my network. On the Windows machine, the script isn't finding the training and validation images; I'll create another issue if I can't sort that out myself.

Thanks in advance

Issue installing

Hi
Just tried a fresh install on a new VM (Ubuntu Desktop 16.04.6 64-bit), followed instructions but when I tried running tests_training.py I get the following error:
ModuleNotFoundError: No module named 'tensorflow.keras.layers.experimental.preprocessing'
...
ImportError: Keras requires Tensorflow 2.2 or higher.

Obviously a dependency issue...just not sure which has the wrong version... TF or Keras??

Any suggestions?
Thanks
Tim

The loss does not converge when training the detector on VOC 2012

Describe the bug
Hi, I tried to reproduce your MobileNet_yolov2 on the VOC 2012 dataset. During training the mAP increases, but the value of the loss function is unstable and has not converged by the end.

To Reproduce

Clone the repository to the local machine
Modify the image/annotation folders in configs/pascal_20_detector.json
Run train.py -c configs/pascal_20_detector.json

Expected behavior
The loss value returned from (loss_xy + loss_wh + loss_conf + loss_class) should be close to zero at the end of training.

Screenshots
The value of total loss from all steps :

Environment (please complete the following information):

Local machine -> Ubuntu 18.04 & RTX 3080 GPU & CUDA 11.2
Python -> python 3.6 & tf-nightly-gpu==2.6.0.dev20210420

Thanks for your help.

Run in Sipeed Maix Bit

I'm training a detector for 2 classes and training went well, but when I run the model on a Sipeed Maix Bit through MaixPy, I always get an error, either the 2006 memory error or other errors with versions v3 / v4, with or without a memory card. What is the correct script to run detection with aXeleRate, and which firmware version should I use?

There is no segmentation code. Also, is your backbone MobileNetV2? The pretrained model cannot be loaded.

As in the title.

Please use templates! Issues and feature requests without templates will be ignored.

Describe the bug
Users are not using the template, which wastes time: contributors have to ask for additional information before they can find the source of the problem.

To Reproduce
Steps to reproduce the behavior:

Go to 'Issues'
Click on 'closed'
Scroll down to 'everywhere'
See all issues without templates - people are creating issues and asking "what is wrong, it doesn't work" without explaining what is the bug and what is expected behavior

Expected behavior
A clear and concise description of what you expected to happen when you run the code, what is your dataset, what is the error output in the terminal. Ideally, create a colab notebook to reproduce the error and share it here.

Screenshots
If applicable, add screenshots to help explain your problem.

Environment (please complete the following information):

Local or Google Colab/Jupyter Notebook
If Local: conda or native install/ virtual env?
If Google Colab: tensorflow version?

Additional context
Add any other context about the problem here.

Can't load model trained with aXeleRate in Maix Go

Hello, I've tried training multiple models with your software, but whenever I try to load them with Micropython in my Maix Go, the program just crashes when trying to load the model, and it doesn't give any information as to why it crashed. I tried loading the pretrained 20 class model and it works, so I know there's no problem with the board or the SD card from which I'm loading the model. I also thought about the possibility of the model size being too big, but the pretrained model is 1MB in size, and mine was 900kB. I've tried training with MobileNet7_5 and Tiny Yolo (although the latter yields a bigger file size of 2.3MB), and I haven't been able to load either of them. I'm wondering if I'm doing something wrong or configuring something wrong, but I'm stuck and I can't find much info on training a model for this device apart from your repo.

Thanks in advance.

Can aXeleRate support this format?

I mean there are two classes of objects, motorbike and bike, but I only annotate the label bike, so I train on a single class (positive samples).

I don't want to annotate motorbike labels in the VOC XML files used for training (negative samples), but motorbikes and bikes look very similar!

Usually the bikes and motorbikes are not in the same image in the dataset.

So I just want to add the motorbike VOC XML files to the training set without any annotated label (no object element exists). Can that improve recall or AP?

Example VOC XML for a motorbike image (a negative sample):
<annotation>
	<folder>VOC2007</folder>
	<filename>fyb004595.jpg</filename>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>600</width>
		<height>775</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
</annotation>

The VOC API already supports this XML format.
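
To illustrate what such a negative sample looks like to a loader, here is a small sketch that parses a VOC annotation with the standard library and returns an empty box list when no object element is present. The read_boxes helper is hypothetical and is not aXeleRate's own parser; whether aXeleRate trains on images with zero boxes is not established in this thread.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the negative sample quoted above (no <object> elements).
NEGATIVE_SAMPLE = """
<annotation>
    <folder>VOC2007</folder>
    <filename>fyb004595.jpg</filename>
    <size><width>600</width><height>775</height><depth>3</depth></size>
    <segmented>0</segmented>
</annotation>
"""


def read_boxes(xml_text):
    """Return (filename, boxes); each box is (label, xmin, ymin, xmax, ymax)."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = [int(float(bndbox.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return root.findtext("filename"), boxes


filename, boxes = read_boxes(NEGATIVE_SAMPLE)
print(filename, boxes)  # the box list is empty for a negative sample
```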

Training is not using GPU

It seems the CUDA_VISIBLE_DEVICES environment variable has no effect on training speed.

When there is more than one class name, is the mAP wrong?

With just one class the mAP calculation is correct. What is happening?

https://ftp.bmp.ovh/imgs/2021/07/9a67e2599fcda7c9.png

With multiple classes the mAP is very low.

[MAIXPY]kpu: set_outputs arg value error: w,c,ch size not match output size

I followed the steps on your GitHub and trained a 5-class model with an mAP of 0.76. Then I flashed the maixpy.bin firmware that you posted in another issue, and after that I flashed the model to the K210. When I run the script racoon_decetor.py I get the error [MAIXPY]kpu: set_outputs arg value error: w,c,ch size not match output size. The failing line is:

a = kpu.set_outputs(task, 0, 7,7,30) #the actual shape needs to match the last layer shape of your model(before Reshape)

Is the problem that my model has 5 classes? What do the numbers 0,7,7,30 mean?
Thanks!
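
As a rough guide to those numbers, in the standard YOLOv2 layout each grid cell predicts, for every anchor, four box values, one objectness score and one score per class. The sketch below computes the resulting shape. It assumes a backbone that downsamples by 32 and reads the arguments of kpu.set_outputs as (task, output index, width, height, channels), which is an interpretation of the call quoted above rather than something this thread confirms; yolo2_output_shape is a hypothetical helper.

```python
def yolo2_output_shape(input_size, anchors, num_classes, stride=32):
    """Return (grid_w, grid_h, channels) for a YOLOv2-style output layer.

    Assumes the backbone downsamples by `stride` and that `anchors` is a
    flat list of width, height pairs.
    """
    width, height = input_size
    if width % stride or height % stride:
        raise ValueError("input size should be a multiple of the stride")
    num_anchors = len(anchors) // 2
    # Per anchor: x, y, w, h, objectness, plus one score per class.
    channels = num_anchors * (5 + num_classes)
    return width // stride, height // stride, channels


anchors = [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434,
           7.88282, 3.52778, 9.77052, 9.16828]

print(yolo2_output_shape((224, 224), anchors, num_classes=1))  # (7, 7, 30)
print(yolo2_output_shape((224, 224), anchors, num_classes=5))  # (7, 7, 50)
```

Under these assumptions, 7,7,30 matches a one-class model with five anchors, while a five-class model would need 50 channels in the last argument.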

Config Params (which architecture and backend?)

I want to create a detector-type model. I have a dataset with 1000 images and have created the annotation files for all of them.

Within the dataset there are around 12 different labels. I intend to run this on a K210 board.

Which architecture should I use (I noticed in your example configs you use both MobileNet7_5 and TinyYolo)?
Also, do I need to use a 'backend'? I noticed that when you use MobileNet7_5 you specify a backend of imagenet, but when you use TinyYolo the backend parameter is empty.

Also, I have been trying to understand anchors. I read the post you recommended (darknet..) but it doesn't make a lot of sense to me: surely it should use the boxes specified in the training annotation files? Also, in that article (and other articles I've read online) anchors are only mentioned as sets of x/y coordinates that define rectangles, yet in your config files they are just a series of single numbers. How do these relate to rectangles?

Many thanks
Tim
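
On the last question: in the YOLOv2 convention, a flat anchor list is read as consecutive (width, height) pairs measured in grid cells, so ten numbers describe five box shapes. They are shape priors, usually derived from the annotated boxes, rather than positions. A minimal sketch, assuming aXeleRate follows that convention and a downsampling factor of 32 (anchors_to_boxes is a hypothetical helper):

```python
def anchors_to_boxes(anchors, stride=32):
    """Pair a flat anchor list into (width, height) boxes in grid cells and pixels."""
    if len(anchors) % 2:
        raise ValueError("expected an even number of values (width, height pairs)")
    pairs = zip(anchors[0::2], anchors[1::2])
    return [
        {"grid": (w, h), "pixels": (round(w * stride, 1), round(h * stride, 1))}
        for w, h in pairs
    ]


anchors = [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434,
           7.88282, 3.52778, 9.77052, 9.16828]

for box in anchors_to_boxes(anchors):
    print(box)
```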

[unstable branch] yolo k210 converter failed while nncase compile and YOLOv3 convert incorrectly

Describe the bug
When converting YOLOv3 with k210 target:

A dynamic input shape (batch) causes the upsampling layer to be converted into several layers in the tflite model, and the first layer produced from UpSampling2D, "shape", is not supported by nncase;
YOLOv3 with 2 outputs is converted to only 1 output in tflite

Expected behavior

Fix the input layer by setting batch=1
Treat YOLOv2 and YOLOv3 separately during tflite conversion

Environment (please complete the following information):

Local conda environment, tensorflow-gpu==2.4.1

Object Detection Doesn't Find Any Objects Out of the Box

Thanks for all your work and tutorials. It's a great help getting started, but one road block I keep hitting is that your Object Detection example doesn't detect any objects in the test images when trained out of the box.

After looking through the available output, I noticed the message "Fail to load pre-trained weights-starting training from scratch" below the pull from this repo: https://github.com/fchollet/deep-learning-models/

Is it possible that it fails to get the pretrained network and is just trying to train from only two images?

Happy to help put in some of the leg work to help get this one solved.

Object detection Google Colab

Hi, I'm working in Google Colab (object detection) with a dataset of car plates. I have two questions. First, how can I work out the anchor values in the config cell for this dataset? Second, when I run the training and validation cells I can't see the plots, and I don't know if that is a bug.

I work on Google Colab.

Thanks for the help

After installation of aXeleRate, tests_training_inference.py is freezing

Describe the bug
I followed your very helpful and clear procedure explained here: https://www.instructables.com/Object-Detection-With-Sipeed-MaiX-BoardsKendryte-K/

After installing conda, activating an environment and installing aXeleRate on my Mac, I ran some tests to check that everything works. I ran these different tests:
python ./tests_training_inference.py
python ./tests_training_inference.py -t classifier -a 'Tini Yolo'
python ./tests_training_inference.py -t classifier -a 'Full Yolo'
The result is always the same: Epoch 1/5 is OK, but at Epoch 2/5 it freezes at step 1/5.

To Reproduce
Steps to reproduce the behavior:
just follow https://www.instructables.com/Object-Detection-With-Sipeed-MaiX-BoardsKendryte-K/

Expected behavior
I expect to be able to run Epochs 1 to 5 without any errors or freezes.

Screenshots
Epoch 1/5
5/5 [==============================] - 8s 1s/step - loss: 1.6890 - accuracy: 0.2889 - val_loss: 1.6095 - val_accuracy: 0.2000

Epoch 00001: val_accuracy improved from -inf to 0.20000, saving model to projects/classifier/2021-01-08_15-34-56/Classifier_best_val_accuracy.h5
Epoch 00000: Learning rate is 2.6666666666666667e-05.

Epoch 2/5
1/5 [=====>........................] - ETA: 6s - loss: 1.4810 - accuracy: 0.5000

Environment (please complete the following information):
Local environment: macOS Catalina, Miniconda 3 installed and one dedicated environment activated
conda create -n yolo python=3.7
conda activate yolo
pip install git+https://github.com/AIWintermuteAI/aXeleRate
inside aXeleRate folder
python ./tests_training_inference.py

Additional context
I have found that many people have similar problems, but in different contexts:
https://www.google.com/search?client=firefox-b-e&q=keras+freeze+during+training

Weights Divergence

Describe the bug
When converting to kmodel, the converter reports a fallback to float conv2d due to weights divergence.
How should I fix this?

To Reproduce

Expected behavior

Screenshots

Environment (please complete the following information):
Here is the config.json
{
    "model" : {
        "type": "Detector",
        "architecture": "MobileNet7_5",
        "input_size": [224,224],
        "anchors": [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
        "labels": ["person"],
        "coord_scale" : 1.0,
        "class_scale" : 1.0,
        "object_scale" : 5.0,
        "no_object_scale" : 1.0
    },
    "weights" : {
        "full": "",
        "backend": "imagenet"
    },
    "train" : {
        "actual_epoch": 100,
        "train_image_folder": "/media/storage/pool/1/",
        "train_annot_folder": "/media/storage/pool/1/labelimg/detector/annotation",
        "train_times": 2,
        "valid_times": 2,
        "valid_metric": "mAP",
        "valid_image_folder": "",
        "valid_annot_folder": "",
        "batch_size": 4,
        "learning_rate": 1e-4,
        "saved_folder": "detector",
        "first_trainable_layer": "",
        "augumentation": true,
        "is_only_detect" : false
    },
    "converter" : {
        "type": ["k210"]
    }
}

Weights issue

Hi
Not sure if this is the same as the previous issue.....

When running the Colab Detector sample code against my own images I get the following error:

Project folder detector already exists. Creating a folder for new training session.
Tflite Converter ready
K210 Converter ready
['shark', 'dolphin', 'surfer', 'swimmer', 'bird', 'boogieboarder', 'boat', 'jetski', 'whale']

KeyError Traceback (most recent call last)
in ()
----> 1 model_path = setup_training(config_dict=config)

1 frames
/content/aXeleRate/axelerate/train.py in train_from_config(config, project_folder)
114
115 # 2. Load the pretrained weights (if any)
--> 116 yolo.load_weights(config['weights']['full'], by_name=True)
117
118 # 3. actual training

KeyError: 'weights'

I get the same error whether using MobileNet7_5, Tiny Yolo or Full Yolo.

Any suggestions?
Thanks
Tim
Config as follows:
config = {
    "model": {
        "type": "Detector",
        "architecture": "MobileNet7_5",
        "input_size": 224,
        "anchors": [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
        "labels": ["shark","dolphin","surfer","swimmer","bird","boogieboarder","boat","jetski","whale"],
        "coord_scale" : 1.0,
        "class_scale" : 1.0,
        "object_scale" : 5.0,
        "no_object_scale" : 1.0
    },
    "pretrained" : {
        "full": ""
    },
    "train" : {
        "actual_epoch": 5,
        "train_image_folder": "testtraining/train_imgs",
        "train_annot_folder": "testtraining/train_anns",
        "train_times": 4,
        "valid_image_folder": "testtraining/validation_imgs",
        "valid_annot_folder": "testtraining/validation_anns",
        "valid_times": 4,
        "batch_size": 4,
        "learning_rate": 1e-4,
        "saved_folder": "detector",
        "first_trainable_layer": "",
        "augumentation": True,
        "is_only_detect" : False
    },
    "converter" : {
        "type": ["k210","tflite"]
    }
}
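
The traceback looks up config['weights']['full'], while the config in this issue has a "pretrained" section and no "weights" section, which would explain the KeyError. A small sketch of a pre-flight check follows; the list of required sections is inferred from the configs quoted in these issues, not taken from aXeleRate's code, and missing_sections is a hypothetical helper.

```python
# Section names inferred from the configs quoted in these issues (an assumption).
REQUIRED_SECTIONS = ("model", "weights", "train", "converter")


def missing_sections(config):
    """Return the required top-level sections that are absent from config."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


config = {
    "model": {"type": "Detector", "architecture": "MobileNet7_5"},
    "pretrained": {"full": ""},  # section name used in the config of this issue
    "train": {"actual_epoch": 5},
    "converter": {"type": ["k210", "tflite"]},
}

print(missing_sections(config))  # ['weights']

# Renaming the section gives the lookup in the traceback something to find.
config["weights"] = config.pop("pretrained")
# Other configs in these issues also set a "backend" entry; whether it is
# required is an assumption here.
config["weights"].setdefault("backend", "imagenet")
print(missing_sections(config))  # []
```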

How to calculate anchors with k-means?

Will it support MobileNet7_5?
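
A common way to obtain YOLOv2-style anchors is to run k-means on the widths and heights of the annotated boxes, using 1 - IoU as the distance. The sketch below is a plain-Python illustration of that idea; it is not a script shipped with aXeleRate, the box sizes are made up, and the grid-cell units assume a downsampling factor of 32.

```python
import random


def iou_wh(box, centroid):
    """IoU of two (w, h) boxes that share the same top-left corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (w, h) pairs with distance 1 - IoU; return k centroids sorted by area."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance.
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        updated = []
        for members, old in zip(clusters, centroids):
            if not members:  # keep the centroid of an empty cluster unchanged
                updated.append(old)
                continue
            updated.append((
                sum(w for w, _ in members) / len(members),
                sum(h for _, h in members) / len(members),
            ))
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Made-up box sizes in grid cells (pixel size divided by 32).
boxes = [(0.5, 0.7), (0.6, 0.6), (0.55, 0.75), (2.0, 2.1), (1.8, 2.0),
         (1.9, 2.2), (3.3, 5.4), (3.4, 5.6), (3.2, 5.5), (7.9, 3.5)]
anchors = kmeans_anchors(boxes, k=3)
flat = [round(value, 3) for pair in anchors for value in pair]
print(flat)  # flat list of w, h values, in the style of the "anchors" config entry
```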

Does training with negative samples(pictures without objects of interest) increase accuracy?

That is, can empty negative samples be used for training: images that contain no target, only background, whose XML files also contain no target element?

Does the aXeleRate network support training on these empty negative samples?

If so, can that improve AP or PR?

Example did not work

Describe the bug

Hello,

I have installed aXeleRate (on OS X) to learn how to create a model:

conda create -n ml python=3.7
conda activate ml
pip install git+https://github.com/AIWintermuteAI/aXeleRate
git clone https://github.com/AIWintermuteAI/aXeleRate

The installation went fine.

I start the script tests_training_and_inference.py:

python tests_training_and_inference.py

and the script crashes at the segmentation part:

Project folder projects/segment already exists. Creating a folder for new training session.
Segmentation
Failed to load pre-trained weights for the whole model. It might be because you didn't specify any or the weight file cannot be found
Current training session folder is projects/segment/2020-09-23_14-06-08

Model: "model_4"

Layer (type)                  Output Shape           Param #
input_4 (InputLayer)          (None, 320, 240, 3)    0
conv1_pad (ZeroPadding2D)     (None, 322, 242, 3)    0
conv1 (Conv2D)                (None, 160, 120, 24)   648
conv1_bn (BatchNormalization  (None, 160, 120, 24)   96
conv1_relu (ReLU)             (None, 160, 120, 24)   0
conv_dw_1 (DepthwiseConv2D)   (None, 160, 120, 24)   216
conv_dw_1_bn (BatchNormaliza  (None, 160, 120, 24)   96
conv_dw_1_relu (ReLU)         (None, 160, 120, 24)   0
conv_pw_1 (Conv2D)            (None, 160, 120, 48)   1152
conv_pw_1_bn (BatchNormaliza  (None, 160, 120, 48)   192
conv_pw_1_relu (ReLU)         (None, 160, 120, 48)   0
conv_pad_2 (ZeroPadding2D)    (None, 162, 122, 48)   0
conv_dw_2 (DepthwiseConv2D)   (None, 80, 60, 48)     432
conv_dw_2_bn (BatchNormaliza  (None, 80, 60, 48)     192
conv_dw_2_relu (ReLU)         (None, 80, 60, 48)     0
conv_pw_2 (Conv2D)            (None, 80, 60, 96)     4608
conv_pw_2_bn (BatchNormaliza  (None, 80, 60, 96)     384
conv_pw_2_relu (ReLU)         (None, 80, 60, 96)     0
conv_dw_3 (DepthwiseConv2D)   (None, 80, 60, 96)     864
conv_dw_3_bn (BatchNormaliza  (None, 80, 60, 96)     384
conv_dw_3_relu (ReLU)         (None, 80, 60, 96)     0
conv_pw_3 (Conv2D)            (None, 80, 60, 96)     9216
conv_pw_3_bn (BatchNormaliza  (None, 80, 60, 96)     384
conv_pw_3_relu (ReLU)         (None, 80, 60, 96)     0
conv_pad_4 (ZeroPadding2D)    (None, 82, 62, 96)     0
conv_dw_4 (DepthwiseConv2D)   (None, 40, 30, 96)     864
conv_dw_4_bn (BatchNormaliza  (None, 40, 30, 96)     384
conv_dw_4_relu (ReLU)         (None, 40, 30, 96)     0
conv_pw_4 (Conv2D)            (None, 40, 30, 192)    18432
conv_pw_4_bn (BatchNormaliza  (None, 40, 30, 192)    768
conv_pw_4_relu (ReLU)         (None, 40, 30, 192)    0
conv_dw_5 (DepthwiseConv2D)   (None, 40, 30, 192)    1728
conv_dw_5_bn (BatchNormaliza  (None, 40, 30, 192)    768
conv_dw_5_relu (ReLU)         (None, 40, 30, 192)    0
conv_pw_5 (Conv2D)            (None, 40, 30, 192)    36864
conv_pw_5_bn (BatchNormaliza  (None, 40, 30, 192)    768
conv_pw_5_relu (ReLU)         (None, 40, 30, 192)    0
conv_pad_6 (ZeroPadding2D)    (None, 42, 32, 192)    0
conv_dw_6 (DepthwiseConv2D)   (None, 20, 15, 192)    1728
conv_dw_6_bn (BatchNormaliza  (None, 20, 15, 192)    768
conv_dw_6_relu (ReLU)         (None, 20, 15, 192)    0
conv_pw_6 (Conv2D)            (None, 20, 15, 384)    73728
conv_pw_6_bn (BatchNormaliza  (None, 20, 15, 384)    1536
conv_pw_6_relu (ReLU)         (None, 20, 15, 384)    0
conv_dw_7 (DepthwiseConv2D)   (None, 20, 15, 384)    3456
conv_dw_7_bn (BatchNormaliza  (None, 20, 15, 384)    1536
conv_dw_7_relu (ReLU)         (None, 20, 15, 384)    0
conv_pw_7 (Conv2D)            (None, 20, 15, 384)    147456
conv_pw_7_bn (BatchNormaliza  (None, 20, 15, 384)    1536
conv_pw_7_relu (ReLU)         (None, 20, 15, 384)    0
conv_dw_8 (DepthwiseConv2D)   (None, 20, 15, 384)    3456
conv_dw_8_bn (BatchNormaliza  (None, 20, 15, 384)    1536
conv_dw_8_relu (ReLU)         (None, 20, 15, 384)    0
conv_pw_8 (Conv2D)            (None, 20, 15, 384)    147456
conv_pw_8_bn (BatchNormaliza  (None, 20, 15, 384)    1536
conv_pw_8_relu (ReLU)         (None, 20, 15, 384)    0
conv_dw_9 (DepthwiseConv2D)   (None, 20, 15, 384)    3456
conv_dw_9_bn (BatchNormaliza  (None, 20, 15, 384)    1536
conv_dw_9_relu (ReLU)         (None, 20, 15, 384)    0
conv_pw_9 (Conv2D)            (None, 20, 15, 384)    147456
conv_pw_9_bn (BatchNormaliza  (None, 20, 15, 384)    1536
conv_pw_9_relu (ReLU)         (None, 20, 15, 384)    0
conv_dw_10 (DepthwiseConv2D)  (None, 20, 15, 384)    3456
conv_dw_10_bn (BatchNormaliz  (None, 20, 15, 384)    1536
conv_dw_10_relu (ReLU)        (None, 20, 15, 384)    0
conv_pw_10 (Conv2D)           (None, 20, 15, 384)    147456
conv_pw_10_bn (BatchNormaliz  (None, 20, 15, 384)    1536
conv_pw_10_relu (ReLU)        (None, 20, 15, 384)    0
conv_dw_11 (DepthwiseConv2D)  (None, 20, 15, 384)    3456
conv_dw_11_bn (BatchNormaliz  (None, 20, 15, 384)    1536
conv_dw_11_relu (ReLU)        (None, 20, 15, 384)    0
conv_pw_11 (Conv2D)           (None, 20, 15, 384)    147456
conv_pw_11_bn (BatchNormaliz  (None, 20, 15, 384)    1536
conv_pw_11_relu (ReLU)        (None, 20, 15, 384)    0
zero_padding2d_1 (ZeroPaddin  (None, 22, 17, 384)    0
conv2d_1 (Conv2D)             (None, 20, 15, 256)    884992
batch_normalization_1 (Batch  (None, 20, 15, 256)    1024
up_sampling2d_1 (UpSampling2  (None, 40, 30, 256)    0
zero_padding2d_2 (ZeroPaddin  (None, 42, 32, 256)    0
conv2d_2 (Conv2D)             (None, 40, 30, 128)    295040
batch_normalization_2 (Batch  (None, 40, 30, 128)    512
up_sampling2d_2 (UpSampling2  (None, 80, 60, 128)    0
zero_padding2d_3 (ZeroPaddin  (None, 82, 62, 128)    0
conv2d_3 (Conv2D)             (None, 80, 60, 64)     73792
batch_normalization_3 (Batch  (None, 80, 60, 64)     256
up_sampling2d_3 (UpSampling2  (None, 160, 120, 64)   0
zero_padding2d_4 (ZeroPaddin  (None, 162, 122, 64)   0
conv2d_4 (Conv2D)             (None, 160, 120, 32)   18464
batch_normalization_4 (Batch  (None, 160, 120, 32)   128
conv2d_5 (Conv2D)             (None, 160, 120, 20)   5780
activation_1 (Activation)     (None, 160, 120, 20)   0

Total params: 2,207,108
Trainable params: 2,195,108
Non-trainable params: 12,000

Epoch 1/5
4/4 [==============================] - 5s 1s/step - loss: 3.4831 - val_loss: 2.9801

Epoch 00001: val_loss improved from inf to 2.98010, saving model to projects/segment/2020-09-23_14-06-08/Segnet_best_val_loss.h5
/Users/michael/aXeleRate/axelerate/networks/common_utils/fit.py:165: UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.
plt.show(block=False)
/Users/michael/aXeleRate/axelerate/networks/common_utils/fit.py:166: UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.
plt.pause(1)
Epoch 2/5
4/4 [==============================] - 2s 533ms/step - loss: 3.2988 - val_loss: 2.9247

Epoch 00002: val_loss improved from 2.98010 to 2.92473, saving model to projects/segment/2020-09-23_14-06-08/Segnet_best_val_loss.h5
Epoch 3/5
4/4 [==============================] - 2s 531ms/step - loss: 3.0916 - val_loss: 2.7481

Epoch 00003: val_loss improved from 2.92473 to 2.74812, saving model to projects/segment/2020-09-23_14-06-08/Segnet_best_val_loss.h5
Epoch 4/5
4/4 [==============================] - 2s 539ms/step - loss: 2.9650 - val_loss: 2.5661

Epoch 00004: val_loss improved from 2.74812 to 2.56610, saving model to projects/segment/2020-09-23_14-06-08/Segnet_best_val_loss.h5
Epoch 5/5
4/4 [==============================] - 2s 542ms/step - loss: 2.9098 - val_loss: 2.5182

Epoch 00005: val_loss improved from 2.56610 to 2.51816, saving model to projects/segment/2020-09-23_14-06-08/Segnet_best_val_loss.h5
39-seconds to train
Folder projects/segment/2020-09-23_14-06-08/Inference_results is created.
Segmentation
Loading pre-trained weights for the whole model: projects/segment/2020-09-23_14-06-08/Segnet_best_val_loss.h5
Found the following classes in the segmentation image: [ 0 1 6 7 8 9 12 16 17]
Traceback (most recent call last):
File "tests_training_and_inference.py", line 160, in <module>
setup_inference(item, model_path)
File "/Users/michael/aXeleRate/axelerate/infer.py", line 100, in setup_inference
predict(model=segnet._network, inp=input_arr, image = orig_image, out_fname=out_fname)
File "/Users/michael/aXeleRate/axelerate/networks/segnet/predict.py", line 136, in predict
seg_img = visualize_segmentation(pr, inp_img=image, n_classes=n_classes, overlay_img=True, colors=colors)
File "/Users/michael/aXeleRate/axelerate/networks/segnet/predict.py", line 102, in visualize_segmentation
seg_img = get_colored_segmentation_image(seg_arr, n_classes , colors=colors)
File "/Users/michael/aXeleRate/axelerate/networks/segnet/predict.py", line 52, in get_colored_segmentation_image
seg_img[:, :, 0] += ((seg_arr[:, :] == c)*(colors[c][0])).astype('uint8')
ValueError: operands could not be broadcast together with shapes (120,160) (160,120) (120,160)
operands could not be broadcast together with shapes (120,160) (160,120) (120,160)
['SegNet MobileNet7_5 operands could not be broadcast together with shapes (120,160) (160,120) (120,160) ']

Do you know why?

Thanks a lot in advance!
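
The shapes in the error message, (120,160) against (160,120), suggest that the predicted class map and the output image have height and width swapped. The snippet below is a minimal NumPy reproduction, not the aXeleRate code itself: the array names are borrowed from the traceback, the sizes come from the error message, and the class index and color value are arbitrary.

```python
import numpy as np

seg_img = np.zeros((120, 160), dtype="uint8")  # output canvas: height 120, width 160
seg_arr = np.zeros((160, 120), dtype="int64")  # class map with the two axes swapped
color = 255                                    # arbitrary color value for illustration

try:
    # Same pattern as the failing line: mask one class and add its color in place
    seg_img += ((seg_arr == 1) * color).astype("uint8")
    outcome = "ok"
except ValueError as err:
    outcome = str(err)

# Swapping the axes of the class map makes the shapes agree again
shapes_match = seg_arr.T.shape == seg_img.shape
print(outcome)
print(shapes_match)
```

If this is the cause, resize calls are a good place to look: `cv2.resize` takes its target size as (width, height), while NumPy shapes are (height, width).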

[unstable branch] validation frequency adding to config.json

Is your feature request related to a problem? Please describe.
Running validation after every epoch slows training down severely when the validation dataset is large.

Describe the solution you'd like
The validation frequency should be configurable in config.json.
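
A minimal sketch of what the request could look like. The key name `valid_freq` and the helper `validation_epochs` are hypothetical and not part of aXeleRate. The rule used here (validate when the 1-based epoch number is a multiple of the frequency) mirrors how the `validation_freq` argument of tf.keras `Model.fit` behaves for integer values.

```python
def validation_epochs(total_epochs, valid_freq):
    """Return the 1-based epochs on which validation would run."""
    if valid_freq < 1:
        raise ValueError("valid_freq must be >= 1")
    return [e for e in range(1, total_epochs + 1) if e % valid_freq == 0]

# "valid_freq" is a hypothetical key; "actual_epoch" is taken from existing configs
config = {"train": {"actual_epoch": 30, "valid_freq": 5}}
train_cfg = config["train"]

# Default to 1 (validate every epoch) when the key is absent
epochs = validation_epochs(train_cfg["actual_epoch"], train_cfg.get("valid_freq", 1))
print(epochs)
```

Defaulting to 1 keeps existing config files working unchanged.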

low fps

Hi, I noticed that the models are converted using nncase 0.2, which produces a kmodel V4.

Considering that V4 models are still not fully supported, wouldn't it be better to design the network so that it can be converted with nncase 0.1.5, which produces a kmodel V3?

In particular, nncase 0.1.5 doesn't support the TensorFlow reshape op. Is it possible to remove it?

I'm writing this because I noticed that the original 20classes_yolo offered as a demo runs at 19.5 fps on a Maix Go, while a tiny yolo v2 net trained with your tool runs at about 13 fps.

Anyway, you did a great job. Thank you.

Perform "Full Yolo" training, fail to convert tflite.

Describe the bug:
Training with "Full Yolo" completes, but the conversion to tflite fails.

To Reproduce:
Use the following json settings to run; the run ends with an error. The same dataset with
"Tiny Yolo" or "MobileNet7_5" trains normally and exports the kmodel file.


{
    "model" : {
        "type":                 "Detector",
        "architecture":         "Full Yolo",
        "input_size":           [224,224],
        "anchors":              [1.30,1.73, 2.50,2.80, 2.91,4.62, 4.35,5.16, 6.00,6.16],
        "labels":               ["cat_face","dog_face"],
        "coord_scale" : 	1.0,
        "class_scale" : 	1.0,
        "object_scale" : 	5.0,
        "no_object_scale" : 	1.0
    },
    "weights" : {
        "full":   		"",
        "backend":              "imagenet"
    },
    "train" : {
        "actual_epoch":         30,
        "train_image_folder":   "dc_dataset/images",
        "train_annot_folder":   "dc_dataset/annotations",
        "train_times":          12,
        "valid_image_folder":   "dc_dataset/val_images",
        "valid_annot_folder":   "dc_dataset/val_annotations",
        "valid_times":          4,
        "valid_metric":         "mAP",
        "batch_size":           32,
        "learning_rate":        1e-4,
        "saved_folder":   	"dc_fyolo",
        "first_trainable_layer": "",
        "augumentation":	true,
        "is_only_detect" : 	false
    },
    "converter" : {
        "type":   		["k210"]
    }
}

Expected behavior / Error output:

Using TensorFlow backend.
Project folder projects/dc_fyolo already exists. Creating a folder for new training session.
K210 Converter ready
['cat_face', 'dog_face']
Imagenet for YOLO backend are not available yet, defaulting to random weights
Failed to load pre-trained weights for the whole model. It might be because you didn't specify any or the weight file cannot be found
Current training session folder is projects/dc_fyolo/2020-10-07_14-52-01


Model: "model_2"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 224, 224, 3)  0                                            
__________________________________________________________________________________________________
conv_1 (Conv2D)                 (None, 224, 224, 32) 864         input_1[0][0]                    
__________________________________________________________________________________________________
norm_1 (BatchNormalization)     (None, 224, 224, 32) 128         conv_1[0][0]                     
__________________________________________________________________________________________________
leaky_re_lu_1 (LeakyReLU)       (None, 224, 224, 32) 0           norm_1[0][0]                     
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 112, 112, 32) 0           leaky_re_lu_1[0][0]              
__________________________________________________________________________________________________
conv_2 (Conv2D)                 (None, 112, 112, 64) 18432       max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
norm_2 (BatchNormalization)     (None, 112, 112, 64) 256         conv_2[0][0]                     
__________________________________________________________________________________________________
leaky_re_lu_2 (LeakyReLU)       (None, 112, 112, 64) 0           norm_2[0][0]                     
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 56, 56, 64)   0           leaky_re_lu_2[0][0]              
__________________________________________________________________________________________________
conv_3 (Conv2D)                 (None, 56, 56, 128)  73728       max_pooling2d_2[0][0]            
__________________________________________________________________________________________________
norm_3 (BatchNormalization)     (None, 56, 56, 128)  512         conv_3[0][0]                     
__________________________________________________________________________________________________
leaky_re_lu_3 (LeakyReLU)       (None, 56, 56, 128)  0           norm_3[0][0]                     
__________________________________________________________________________________________________
conv_4 (Conv2D)                 (None, 56, 56, 64)   8192        leaky_re_lu_3[0][0]              
__________________________________________________________________________________________________
norm_4 (BatchNormalization)     (None, 56, 56, 64)   256         conv_4[0][0]                     
__________________________________________________________________________________________________
leaky_re_lu_4 (LeakyReLU)       (None, 56, 56, 64)   0           norm_4[0][0]                     
__________________________________________________________________________________________________
conv_5 (Conv2D)                 (None, 56, 56, 128)  73728       leaky_re_lu_4[0][0]              
__________________________________________________________________________________________________
norm_5 (BatchNormalization)     (None, 56, 56, 128)  512         conv_5[0][0]                     
__________________________________________________________________________________________________
leaky_re_lu_5 (LeakyReLU)       (None, 56, 56, 128)  0           norm_5[0][0]                     
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D)  (None, 28, 28, 128)  0           leaky_re_lu_5[0][0]              
__________________________________________________________________________________________________
conv_6 (Conv2D)                 (None, 28, 28, 256)  294912      max_pooling2d_3[0][0]            
__________________________________________________________________________________________________
norm_6 (BatchNormalization)     (None, 28, 28, 256)  1024        conv_6[0][0]                     
__________________________________________________________________________________________________
leaky_re_lu_6 (LeakyReLU)       (None, 28, 28, 256)  0           norm_6[0][0]                     
__________________________________________________________________________________________________
conv_7 (Conv2D)                 (None, 28, 28, 128)  32768       leaky_re_lu_6[0][0]              
__________________________________________________________________________________________________
norm_7 (BatchNormalization)     (None, 28, 28, 128)  512         conv_7[0][0]                     
__________________________________________________________________________________________________
leaky_re_lu_7 (LeakyReLU)       (None, 28, 28, 128)  0           norm_7[0][0]                     
__________________________________________________________________________________________________
conv_8 (Conv2D)                 (None, 28, 28, 256)  294912      leaky_re_lu_7[0][0]              
__________________________________________________________________________________________________
norm_8 (BatchNormalization)     (None, 28, 28, 256)  1024        conv_8[0][0]                     
__________________________________________________________________________________________________
leaky_re_lu_8 (LeakyReLU)       (None, 28, 28, 256)  0           norm_8[0][0]                     
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D)  (None, 14, 14, 256)  0           leaky_re_lu_8[0][0]              
__________________________________________________________________________________________________
conv_9 (Conv2D)                 (None, 14, 14, 512)  1179648     max_pooling2d_4[0][0]            
__________________________________________________________________________________________________
norm_9 (BatchNormalization)     (None, 14, 14, 512)  2048        conv_9[0][0]                     
__________________________________________________________________________________________________
leaky_re_lu_9 (LeakyReLU)       (None, 14, 14, 512)  0           norm_9[0][0]                     
__________________________________________________________________________________________________
conv_10 (Conv2D)                (None, 14, 14, 256)  131072      leaky_re_lu_9[0][0]              
__________________________________________________________________________________________________
norm_10 (BatchNormalization)    (None, 14, 14, 256)  1024        conv_10[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu_10 (LeakyReLU)      (None, 14, 14, 256)  0           norm_10[0][0]                    
__________________________________________________________________________________________________
conv_11 (Conv2D)                (None, 14, 14, 512)  1179648     leaky_re_lu_10[0][0]             
__________________________________________________________________________________________________
norm_11 (BatchNormalization)    (None, 14, 14, 512)  2048        conv_11[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu_11 (LeakyReLU)      (None, 14, 14, 512)  0           norm_11[0][0]                    
__________________________________________________________________________________________________
conv_12 (Conv2D)                (None, 14, 14, 256)  131072      leaky_re_lu_11[0][0]             
__________________________________________________________________________________________________
norm_12 (BatchNormalization)    (None, 14, 14, 256)  1024        conv_12[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu_12 (LeakyReLU)      (None, 14, 14, 256)  0           norm_12[0][0]                    
__________________________________________________________________________________________________
conv_13 (Conv2D)                (None, 14, 14, 512)  1179648     leaky_re_lu_12[0][0]             
__________________________________________________________________________________________________
norm_13 (BatchNormalization)    (None, 14, 14, 512)  2048        conv_13[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu_13 (LeakyReLU)      (None, 14, 14, 512)  0           norm_13[0][0]                    
__________________________________________________________________________________________________
max_pooling2d_5 (MaxPooling2D)  (None, 7, 7, 512)    0           leaky_re_lu_13[0][0]             
__________________________________________________________________________________________________
conv_14 (Conv2D)                (None, 7, 7, 1024)   4718592     max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
norm_14 (BatchNormalization)    (None, 7, 7, 1024)   4096        conv_14[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu_14 (LeakyReLU)      (None, 7, 7, 1024)   0           norm_14[0][0]                    
__________________________________________________________________________________________________
conv_15 (Conv2D)                (None, 7, 7, 512)    524288      leaky_re_lu_14[0][0]             
__________________________________________________________________________________________________
norm_15 (BatchNormalization)    (None, 7, 7, 512)    2048        conv_15[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu_15 (LeakyReLU)      (None, 7, 7, 512)    0           norm_15[0][0]                    
__________________________________________________________________________________________________
conv_16 (Conv2D)                (None, 7, 7, 1024)   4718592     leaky_re_lu_15[0][0]             
__________________________________________________________________________________________________
norm_16 (BatchNormalization)    (None, 7, 7, 1024)   4096        conv_16[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu_16 (LeakyReLU)      (None, 7, 7, 1024)   0           norm_16[0][0]                    
__________________________________________________________________________________________________
conv_17 (Conv2D)                (None, 7, 7, 512)    524288      leaky_re_lu_16[0][0]             
__________________________________________________________________________________________________
norm_17 (BatchNormalization)    (None, 7, 7, 512)    2048        conv_17[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu_17 (LeakyReLU)      (None, 7, 7, 512)    0           norm_17[0][0]                    
__________________________________________________________________________________________________
conv_18 (Conv2D)                (None, 7, 7, 1024)   4718592     leaky_re_lu_17[0][0]             
__________________________________________________________________________________________________
norm_18 (BatchNormalization)    (None, 7, 7, 1024)   4096        conv_18[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu_18 (LeakyReLU)      (None, 7, 7, 1024)   0           norm_18[0][0]                    
__________________________________________________________________________________________________
conv_19 (Conv2D)                (None, 7, 7, 1024)   9437184     leaky_re_lu_18[0][0]             
__________________________________________________________________________________________________
norm_19 (BatchNormalization)    (None, 7, 7, 1024)   4096        conv_19[0][0]                    
__________________________________________________________________________________________________
conv_21 (Conv2D)                (None, 14, 14, 64)   32768       leaky_re_lu_13[0][0]             
__________________________________________________________________________________________________
leaky_re_lu_19 (LeakyReLU)      (None, 7, 7, 1024)   0           norm_19[0][0]                    
__________________________________________________________________________________________________
norm_21 (BatchNormalization)    (None, 14, 14, 64)   256         conv_21[0][0]                    
__________________________________________________________________________________________________
conv_20 (Conv2D)                (None, 7, 7, 1024)   9437184     leaky_re_lu_19[0][0]             
__________________________________________________________________________________________________
leaky_re_lu_21 (LeakyReLU)      (None, 14, 14, 64)   0           norm_21[0][0]                    
__________________________________________________________________________________________________
norm_20 (BatchNormalization)    (None, 7, 7, 1024)   4096        conv_20[0][0]                    
__________________________________________________________________________________________________
lambda_1 (Lambda)               (None, 7, 7, 256)    0           leaky_re_lu_21[0][0]             
__________________________________________________________________________________________________
leaky_re_lu_20 (LeakyReLU)      (None, 7, 7, 1024)   0           norm_20[0][0]                    
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 7, 7, 1280)   0           lambda_1[0][0]                   
                                                                 leaky_re_lu_20[0][0]             
__________________________________________________________________________________________________
conv_22 (Conv2D)                (None, 7, 7, 1024)   11796480    concatenate_1[0][0]              
__________________________________________________________________________________________________
norm_22 (BatchNormalization)    (None, 7, 7, 1024)   4096        conv_22[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu_22 (LeakyReLU)      (None, 7, 7, 1024)   0           norm_22[0][0]                    
__________________________________________________________________________________________________
detection_layer_35 (Conv2D)     (None, 7, 7, 35)     35875       leaky_re_lu_22[0][0]             
__________________________________________________________________________________________________
reshape_1 (Reshape)             (None, 7, 7, 5, 7)   0           detection_layer_35[0][0]         
==================================================================================================
Total params: 50,583,811
Trainable params: 50,563,139
Non-trainable params: 20,672
__________________________________________________________________________________________________
Epoch 1/1
/home/user/miniconda3/lib/python3.7/site-packages/imgaug/imgaug.py:184: DeprecationWarning: Function `ContrastNormalization()` is deprecated. Use `imgaug.contrast.LinearContrast` instead.
  warn_deprecated(msg, stacklevel=3)
1534/1534 [==============================] - 27720s 18s/step - loss: 0.5426 - val_loss: 0.6020


cat_face 0.2088
dog_face 0.1289
mAP: 0.1689
Saving model on first epoch irrespective of mAP
/home/user/miniconda3/lib/python3.7/site-packages/axelerate/networks/yolo/backend/utils/map_evaluation.py:261: UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.
  plt.show(block=False)
/home/user/miniconda3/lib/python3.7/site-packages/axelerate/networks/yolo/backend/utils/map_evaluation.py:262: UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.
  plt.pause(1)
471-mins to train
Traceback (most recent call last):
  File "axelerate/train.py", line 184, in <module>
    setup_training(config_file=args.config)
  File "axelerate/train.py", line 169, in setup_training
    return(train_from_config(config, dirname))
  File "axelerate/train.py", line 149, in train_from_config
    converter.convert_model(model_path)    
  File "/home/user/miniconda3/lib/python3.7/site-packages/axelerate/networks/common_utils/convert.py", line 220, in convert_model
    model = keras.models.load_model(model_path, compile=False)
  File "/home/user/miniconda3/lib/python3.7/site-packages/keras/engine/saving.py", line 492, in load_wrapper
    return load_function(*args, **kwargs)
  File "/home/user/miniconda3/lib/python3.7/site-packages/keras/engine/saving.py", line 584, in load_model
    model = _deserialize_model(h5dict, custom_objects, compile)
  File "/home/user/miniconda3/lib/python3.7/site-packages/keras/engine/saving.py", line 274, in _deserialize_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "/home/user/miniconda3/lib/python3.7/site-packages/keras/engine/saving.py", line 627, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/home/user/miniconda3/lib/python3.7/site-packages/keras/layers/__init__.py", line 168, in deserialize
    printable_module_name='layer')
  File "/home/user/miniconda3/lib/python3.7/site-packages/keras/utils/generic_utils.py", line 147, in deserialize_keras_object
    list(custom_objects.items())))
  File "/home/user/miniconda3/lib/python3.7/site-packages/keras/engine/network.py", line 1075, in from_config
    process_node(layer, node_data)
  File "/home/user/miniconda3/lib/python3.7/site-packages/keras/engine/network.py", line 1025, in process_node
    layer(unpack_singleton(input_tensors), **kwargs)
  File "/home/user/miniconda3/lib/python3.7/site-packages/keras/engine/base_layer.py", line 489, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/user/miniconda3/lib/python3.7/site-packages/keras/layers/core.py", line 716, in call
    return self.function(inputs, **arguments)
  File "/home/user/miniconda3/lib/python3.7/site-packages/axelerate/networks/common_utils/feature.py", line 78, in space_to_depth_x2
    return tf.space_to_depth(x, block_size=2)
NameError: name 'tf' is not defined

Environment (please complete the following information):
linux 18.04
tensorflow 1.15
aXeleRate 0.60 or 0.59
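
The traceback ends inside `space_to_depth_x2`, which is called while Keras rebuilds the Lambda layer from the saved model file. A likely explanation (an assumption, not confirmed by the maintainers) is that the function is restored from its serialized code with a different set of globals, so the module-level name `tf` from feature.py is no longer visible. The sketch below reproduces that mechanism in plain Python, with `math` standing in for `tf`; `rebuild`, `uses_global` and `uses_local_import` are hypothetical names used only for illustration.

```python
import builtins
import math
import types

def uses_global(x):
    # Depends on the module-level name `math`
    return math.sqrt(x)

def uses_local_import(x):
    # Imports inside the body, so it does not depend on module globals
    import math
    return math.sqrt(x)

def rebuild(func):
    """Recreate a function from its code object with a fresh globals dict."""
    fresh_globals = {"__builtins__": builtins}
    return types.FunctionType(func.__code__, fresh_globals, func.__name__)

try:
    rebuild(uses_global)(4.0)
    global_result = "ok"
except NameError as err:
    global_result = str(err)

local_result = rebuild(uses_local_import)(4.0)
print(global_result)
print(local_result)
```

If this reading is right, the analogous workaround would be to place `import tensorflow as tf` inside the body of `space_to_depth_x2`. This is untested against aXeleRate 0.59 or 0.60.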

Training person detector with pascal_20_detection dataset error

Describe the bug
I'm trying to train a MobileNet with approx. 10k training images from the PASCAL-VOC dataset, and I parsed all images that do not have the 'person' label. Unfortunately, when I try to train, errors show up about the integrity of the dataset, saying that an annotation file could not be opened. It happens with many annotation files. Could it be related to the number of files, or to the files themselves?

Screenshots
This is my code, nothing special:

%cd /content
# !ls images_v0/imgs_validation
import json
from axelerate import setup_training, setup_evaluation, setup_inference
import tensorflow.keras.backend as K
import traceback
import time

detector_base = {
	"model":{
		"type":                 "Detector",
		"architecture":         "MobileNet1_0", # MobileNet7_5
		"input_size":           224,
		"anchors":              [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
		"labels":               ["person"],
		"coord_scale" : 		1.0,
		"class_scale" : 		1.0,
		"object_scale" : 		5.0,
		"no_object_scale" : 	5.0 # 1.0e
	},
	"weights" : {
		"full":   				"",
		"backend":   		  "mobilenet_1_0_224_tf_no_top.h5" # 
	},
	"train" : {
		"actual_epoch":         5,
		"train_image_folder":   "geral/imgs",
		"train_annot_folder":   "geral/anns",
		"train_times":          3,
		"valid_image_folder":   "geral/imgs_validation",
		"valid_annot_folder":   "geral/anns_validation",
		"valid_times":          2,
		"valid_metric":         "mAP",
		"batch_size":           8,
		"learning_rate":        1e-4,
		"saved_folder":   		"TESTE_ZERO_MEU",
		"first_trainable_layer": "", #conv_pw_13_bn
		"augumentation":		True,
		"is_only_detect" : 		False
	},
	"converter" : {
		"type":   				["k210"]
	}
}

try:
	print(json.dumps(detector_base, indent=4, sort_keys=False))
	K.clear_session()
	model_path = setup_training(config_dict=detector_base)
	K.clear_session()
	setup_evaluation(detector_base, model_path, threshold=0.5)
	print('finalizado treino final')
except Exception as e:
	traceback.print_exc()
	time.sleep(2)

This is the debug window

This image has an annotation file, but cannot be open. Check the integrity of your dataset. geral/imgs/2008_003320.jpg
  6/553 [..............................] - ETA: 15:10 - loss: 4.4850Traceback (most recent call last):
  File "<ipython-input-4-58aa7d85c5e4>", line 49, in <module>
    model_path = setup_training(config_dict=detector_base)
  File "/content/aXeleRate/axelerate/train.py", line 165, in setup_training
    return(train_from_config(config, dirname))
  File "/content/aXeleRate/axelerate/train.py", line 142, in train_from_config
    config['train']['valid_metric'])
  File "/content/aXeleRate/axelerate/networks/yolo/frontend.py", line 148, in train
    metrics="mAP")
  File "/content/aXeleRate/axelerate/networks/common_utils/fit.py", line 129, in train
    use_multiprocessing = True)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py", line 1861, in fit_generator
    initial_epoch=initial_epoch)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py", line 1100, in fit
    tmp_logs = self.train_function(iterator)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 855, in _call
    return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 2943, in __call__
    filtered_flat_args, captured_inputs=graph_function.captured_inputs)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 1919, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 560, in call
    ctx=ctx)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.UnknownError:  error: OpenCV(4.1.2) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 567, in get_index
    return _SHARED_SEQUENCES[uid][i]
  File "/content/aXeleRate/axelerate/networks/yolo/backend/batch_gen.py", line 102, in __getitem__
    img, boxes, labels = self._img_aug.imread(fname, boxes, labels)
  File "/content/aXeleRate/axelerate/networks/common_utils/augment.py", line 39, in imread
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
cv2.error: OpenCV(4.1.2) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

"""


The above exception was the direct cause of the following exception:


Traceback (most recent call last):

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/script_ops.py", line 249, in __call__
    ret = func(*args)

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/autograph/impl/api.py", line 620, in wrapper
    return func(*args, **kwargs)

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 891, in generator_py_func
    values = next(generator_state.get_iterator(iterator_id))

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/data_adapter.py", line 807, in wrapped_generator
    for data in generator_fn():

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 788, in get
    six.reraise(*sys.exc_info())

  File "/usr/local/lib/python3.7/dist-packages/six.py", line 703, in reraise
    raise value

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 779, in get
    inputs = self.queue.get(block=True, timeout=5).get()

  File "/usr/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value

  File "/usr/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 567, in get_index
    return _SHARED_SEQUENCES[uid][i]

  File "/content/aXeleRate/axelerate/networks/yolo/backend/batch_gen.py", line 102, in __getitem__
    img, boxes, labels = self._img_aug.imread(fname, boxes, labels)

  File "/content/aXeleRate/axelerate/networks/common_utils/augment.py", line 39, in imread
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

cv2.error: OpenCV(4.1.2) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'



	 [[{{node PyFunc}}]]
	 [[IteratorGetNext]] [Op:__inference_train_function_7154]

Function call stack:
train_function

Environment (please complete the following information):

Using Google Colab

Additional context
Should I validate something more about the dataset images?
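
The `!_src.empty()` assertion in `cvtColor` means the image passed in was empty, which is what happens when `cv2.imread` returns `None` for a file it cannot read. One way to narrow this down before training is to scan the dataset for annotations whose image is missing or empty. The sketch below uses only the standard library; `find_broken_pairs` is a hypothetical helper, and it assumes Pascal VOC style XML files that name their image in a `<filename>` element.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def find_broken_pairs(img_dir, ann_dir):
    """Return (annotation file, reason) tuples for pairs that look unusable."""
    problems = []
    for ann_name in sorted(os.listdir(ann_dir)):
        if not ann_name.endswith(".xml"):
            continue
        try:
            root = ET.parse(os.path.join(ann_dir, ann_name)).getroot()
        except ET.ParseError:
            problems.append((ann_name, "annotation is not valid XML"))
            continue
        # Assumption: the image name is in <filename>; otherwise use the XML stem
        img_name = root.findtext("filename") or os.path.splitext(ann_name)[0] + ".jpg"
        img_path = os.path.join(img_dir, img_name)
        if not os.path.isfile(img_path):
            problems.append((ann_name, "image missing: " + img_name))
        elif os.path.getsize(img_path) == 0:
            problems.append((ann_name, "image is empty: " + img_name))
    return problems

# Demo on a throwaway dataset: one good pair, one missing image, one empty image
with tempfile.TemporaryDirectory() as tmp:
    img_dir = os.path.join(tmp, "imgs")
    ann_dir = os.path.join(tmp, "anns")
    os.makedirs(img_dir)
    os.makedirs(ann_dir)
    for stem in ("good", "missing", "empty"):
        with open(os.path.join(ann_dir, stem + ".xml"), "w") as f:
            f.write("<annotation><filename>%s.jpg</filename></annotation>" % stem)
    with open(os.path.join(img_dir, "good.jpg"), "wb") as f:
        f.write(b"\xff\xd8\xff")  # placeholder bytes, not a real JPEG
    open(os.path.join(img_dir, "empty.jpg"), "wb").close()
    problems = find_broken_pairs(img_dir, ann_dir)

print(problems)
```

This only checks that each image exists and is non-empty. A file that exists but is corrupted would still need a decode check, for example testing whether `cv2.imread` returns `None`.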

Fatal: Invalid dataset, file size should be 602112B, but got 1806336B

Two months ago I trained a classifier model and it worked well, but today I retrained the same model using the same dataset and got an error:
Fatal: Invalid dataset, file size should be 602112B, but got 1806336B
I don't know why converting to a k210 model gives this error. 1806336B is three times 602112B.
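
As a quick check of the numbers in the message: the expected size matches one 224x224 RGB tensor stored as 4-byte values, and the reported size is exactly three times that. The 224x224x3 input shape and the 4-byte element size are assumptions made for this arithmetic; the issue does not state them.

```python
def tensor_bytes(height, width, channels, bytes_per_value):
    """Size in bytes of a dense tensor with the given shape and element size."""
    return height * width * channels * bytes_per_value

expected = tensor_bytes(224, 224, 3, 4)  # assumed: 224x224 RGB, 4 bytes per value
reported = 1806336                       # size quoted in the error message
ratio = reported / expected

print(expected, ratio)
```

So each file handed to the converter appears to hold three times as much data as the converter expects for one input.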

Problems training a model for a person and pet detector

I'm trying to train a model for a person and pet detector (cat or dog, for example). My question is which config to use when training a detector with only those classes so that it works on a Maix Go.

Get anchors for my dataset

I have 2 questions.

1. I labeled my data as .xml files in the annotations folder. How can I generate anchors for YOLOv2 from my own data instead of using the default anchors?
2. I labeled about 2500 cat images like this and trained on them to detect cat faces. But the training loss reached 0.09 after only 4 of 100 epochs, and when training finished the model detected incorrectly. How can I fix it?

Thank you so much.
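
For the first question, YOLOv2 anchors are commonly obtained by running k-means over the labeled box sizes, using 1 - IoU as the distance. The sketch below is a generic pure-Python version of that idea and not aXeleRate's own tooling. `kmeans_anchors` and `iou_wh` are hypothetical names, and the boxes are assumed to already be (width, height) pairs in whatever units the framework expects for its `anchors` setting.

```python
import random

def iou_wh(box, centroid):
    """IoU of two boxes given as (w, h), both placed at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (w, h) pairs into k anchors, sorted by area."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest (1 - IoU) distance
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        updated = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                updated.append((w, h))
            else:
                updated.append(centroids[i])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids, key=lambda c: c[0] * c[1])

# Made-up box sizes forming two obvious groups
boxes = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
anchors = kmeans_anchors(boxes, k=2)
flat = [round(v, 2) for anchor in anchors for v in anchor]
print(flat)
```

The flattened list has the same layout as the `anchors` entry in the configs shown on this page, with width and height alternating.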

Threads Error

I tried to run aXeleRate locally and ran into an issue with threads.
This error appears.

import sys
sys.path.append('aXeleRate')
from axelerate import setup_training, setup_inference

from axelerate.networks.common_utils.augment import visualize_classification_dataset

visualize_classification_dataset('/datasets', num_imgs=10, img_size=224, augment=True)

config = {
    "model" : {
        "type":                 "Classifier",
        "architecture":         "MobileNet7_5",
        "input_size":           224,
        "fully-connected":      [100,50],
        "labels":               [],
        "dropout" : 		0.2
    },
     "weights" : {
            "full":   				"",
            "backend":   		    "imagenet",
            "save_bottleneck":      False
        
    },
    "train" : {
        "actual_epoch":         10,
        "train_image_folder":   "datasets",
        "train_times":          4,
        "valid_image_folder":   "datasets",
        "valid_times":          4,
        "valid_metric":         "val_accuracy",
        "batch_size":           32,
        "learning_rate":        1e-4,
        "saved_folder":   		"classifier",
        "first_trainable_layer": "dense_1",
        "augumentation":				True
    },
    "converter" : {
        "type":   				[]
    }
}

from keras import backend as K 
K.clear_session()
model_path = setup_training(config_dict=config)

from axelerate.networks.common_utils.convert import Converter
converter = Converter('k210', 'MobileNet7_5', 'datasets')
converter.convert_model(model_path)

Environment:
anaconda
Python 3.6
TF 2.3.1

Training with our dataset

Hey, thank you for the work.

I've been using your framework for a while, and I was wondering how a dataset should be put together to be actually good for training YOLO with mbnet0.5 or 0.75 as the backend.
I am training a person detector.
I parsed Pascal VOC to remove all labels other than 'person', but training on the whole dataset didn't give good results.
I have also used the INRIA dataset, which, as far as I can see, is the one you partially provide in the Colab notebook, and got better results.

My question is this:
Is there a proportion to respect between the number of images containing the objects we want to detect and the number of images not containing them?
Thanks!
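Before tuning that proportion, it helps to measure it. The sketch below only counts how many annotation files contain at least one wanted label and how many contain none; it does not say which ratio is best. It assumes Pascal VOC style .xml files with object/name elements, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def count_images_by_label(annot_dir, wanted=("person",)):
    """Return (images_with_a_wanted_label, images_without_any) for a folder of VOC XML files."""
    wanted = set(wanted)
    positives = negatives = 0
    for xml_path in Path(annot_dir).glob("*.xml"):
        root = ET.parse(xml_path).getroot()
        names = {obj.findtext("name") for obj in root.iter("object")}
        if names & wanted:
            positives += 1
        else:
            negatives += 1  # background-only image, or only other classes
    return positives, negatives
```

Running it on both the filtered Pascal VOC annotations and the INRIA annotations would show whether the two datasets differ much in their share of images without a person.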

Maixpy Load error 2: ERR KMODEL VERSION: only supported kmodel v3/v4 now

Hi, this is the error I get when I try to run my detectors (both the custom detectors and the ones in the Yolo example). The only models that I was able to make work are the Smodels generated by the Maixhub. I tried many versions with the same results, here you can see a screen while I was running the script boot.py with the firmware v0.5.0_31

classifier mode , k210_dataset_gen fail ...

I'm very sorry; I'm **, and my English is really poor, so I wrote this report in Chinese.

When I use the classifier mode, an error is reported after training completes, at the point where the model is converted to a K210 kmodel:

`
which is a non-GUI backend, so cannot show the figure.
plt.pause(1)
3-mins to train
[]
/home/jlinux/miniconda3/lib/python3.7/site-packages/axelerate/networks/common_utils/tmp
projects/classifier/2020-07-10_11-34-44/Classifier_best_val_accuracy.kmodel

Import graph...
Optimize Pass 1...
Optimize Pass 2...
Quantize...
4.1. Add quantization checkpoints...
4.2. Get activation ranges...
Plan buffers...
Fatal: Invalid dataset, should contain one file at least
255
`
Tracing backwards, I found that the image_files_list produced by k210_dataset_gen in convert.py is empty.
No data was copied to /home/jlinux/miniconda3/lib/python3.7/site-packages/axelerate/networks/common_utils/tmp, which caused the final conversion to fail.
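One thing worth checking is how the images are laid out in the folder handed to the converter. The sketch below is a diagnostic only and does not reproduce convert.py. It assumes (not confirmed from the source) that a classifier dataset keeps its images in per-class subfolders, in which case a non-recursive search of the top-level folder would find nothing. The function name and the extension list are hypothetical.

```python
from pathlib import Path

# Extensions treated as images in this sketch (an assumption).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(folder, recursive=False):
    """List image files directly in folder, or in all subfolders when recursive=True."""
    pattern = "**/*" if recursive else "*"
    return sorted(
        p for p in Path(folder).glob(pattern)
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

If `list_images(folder)` is empty while `list_images(folder, recursive=True)` is not, the images sit only in subfolders, which would be consistent with an empty calibration list for the classifier and a working one for the detector.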

The "detector" mode, however, works fine:
`
6-mins to train
Converting to tflite without Reshape layer for K210 Yolo
['raccoon_dataset/valid_images/raccoon-53.jpg', 'raccoon_dataset/valid_images/raccoon-183.jpg', 'raccoon_dataset/valid_images/raccoon-193.jpg', 'raccoon_dataset/valid_images/raccoon-43.jpg', 'raccoon_dataset/valid_images/raccoon-133.jpg', 'raccoon_dataset/valid_images/raccoon-103.jpg', 'raccoon_dataset/valid_images/raccoon-73.jpg', 'raccoon_dataset/valid_images/raccoon-153.jpg', 'raccoon_dataset/valid_images/raccoon-13.jpg', 'raccoon_dataset/valid_images/raccoon-83.jpg', 'raccoon_dataset/valid_images/raccoon-143.jpg', 'raccoon_dataset/valid_images/raccoon-3.jpg', 'raccoon_dataset/valid_images/raccoon-163.jpg', 'raccoon_dataset/valid_images/raccoon-173.jpg', 'raccoon_dataset/valid_images/raccoon-33.jpg', 'raccoon_dataset/valid_images/raccoon-63.jpg', 'raccoon_dataset/valid_images/raccoon-93.jpg', 'raccoon_dataset/valid_images/raccoon-113.jpg', 'raccoon_dataset/valid_images/raccoon-123.jpg', 'raccoon_dataset/valid_images/raccoon-23.jpg']
/home/jlinux/miniconda3/lib/python3.7/site-packages/axelerate/networks/common_utils/tmp
projects/raccoon_detector/2020-07-10_12-38-55/YOLO_best_mAP.kmodel

Import graph...
Optimize Pass 1...
Optimize Pass 2...
Quantize...
4.1. Add quantization checkpoints...
4.2. Get activation ranges...
Plan buffers...
Run calibration...
[==================================================] 100% 12.047s
4.5. Quantize graph...
Lowering...
Generate code...
`

Environment:
linux 18.04
python 3.7

Tiny yolo

Is your feature request related to a problem? Please describe.
Can I use Tiny YOLO for training a model? Which is better, Tiny YOLO or MobileNet?

If I set the architecture to "tiny yolo", which parameters do I need to change?
Thanks.

'ProgbarLogger' object has no attribute 'log_values'

Hello,
I am facing the specified error with the following config :

OS : Ubuntu 18.04
Tensorflow: 1.15.04
Keras : 2.3.1

JSON:

{
    "model" : {
        "type":                 "Classifier",
        "architecture":         "NASNetMobile",
        "input_size":           224,
        "fully-connected":      [],
        "labels":               [],
        "dropout" : 		    0.2
    },
     "weights" : {
            "full":   				"",
            "backend":   		    "imagenet",
            "save_bottleneck":      false
        
    },
    "train" : {
        "actual_epoch":         100,
        "train_image_folder":   "/mnt/0d2701de-7997-4d53-99fb-0c9de0a116c9/K210_tools/data/fire_dataset/cat_imaages",
        "train_times":          1,
        "valid_image_folder":   "/mnt/0d2701de-7997-4d53-99fb-0c9de0a116c9/K210_tools/data/fire_dataset/non_cat_images",
        "valid_times":          1,
        "valid_metric":         "val_accuracy",
        "batch_size":           16,
        "learning_rate":        1e-3,
        "saved_folder":   		"cat",
        "first_trainable_layer": "",
        "augumentation":		 true
    },
    "converter" : {
        "type":   				["k210"]
    }
}

When I run
python train.py -c ../configs/cat_classifier.json
I get the following error:
AttributeError: 'ProgbarLogger' object has no attribute 'log_values'
I have tried setting first_trainable_layer to " dense ", but then I get a different error:
Exception: First trainable layer specified in config file is not in the model
I have also tried the dogs_classifier config with the provided data from sample_datasets, and the result is the same.

Thank you for your time!
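For the "First trainable layer specified in config file is not in the model" exception, it can help to compare the configured name against the model's real layer names. The sketch below is a standalone helper, not part of aXeleRate. It assumes the layer names were collected from a Keras model with `[layer.name for layer in model.layers]`, and it assumes that an empty string means "no restriction". The function name is hypothetical.

```python
import difflib


def check_first_trainable_layer(requested, layer_names):
    """Return (valid_name, suggestions).

    valid_name is the stripped name if it is empty or matches a layer exactly,
    otherwise None together with up to three similarly named layers.
    """
    name = requested.strip()  # " dense " and "dense" are treated the same
    if name == "" or name in layer_names:
        return name, []
    suggestions = difflib.get_close_matches(name, layer_names, n=3, cutoff=0.4)
    return None, suggestions
```

With layer names such as "dense_1", a request for " dense " would come back as invalid with "dense_1" among the suggestions, which points at the exact name to put in the config.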

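Is training with empty background images (no objects) and their labels supported?

Would this reduce the false-detection rate and make the model more stable?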

Error converting shape

Hello, I tried the tutorial from https://www.hackster.io/dmitrywat/object-detection-with-sipeed-maix-boards-kendryte-k210-421d55?fbclid=IwAR2mDf91UkdMzAsimB1YjI7XtpLcOB35n-1mGF5yC0eWo8bgNtUExzP8Ti8 but I am facing the following error:

TypeError: Error converting shape to a TensorShape: int() argument must be a string, a bytes-like object or a number, not 'list'.

My OS is : Ubuntu 18.04.5 LTS
TF : tensorflow==1.14.0
Keras : 2.1.1

conda create -n yolo python=3.7
conda activate yolo
pip install git+https://github.com/AIWintermuteAI/aXeleRate
Downloaded : https://github.com/penny4860/Yolo-digit-detector
Dataset : https://drive.google.com/file/d/1ncCsJ-8kIXQXpw8DF0T_ZfJBpemv2FI9/view

I had to change the architecture from "MobileNet7_5" to "MobileNet" because I got this error:

Exception: Architecture not supported! Only support Full Yolo, Tiny Yolo, MobileNet, SqueezeNet, VGG16, ResNet50, and Inception3 at the moment!

JSON config file :

{
    "model" : {
        "type":                 "Detector",
        "architecture":         "MobileNet",
        "input_size":           [224,224],
        "anchors":              [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
        "labels":               ["raccoon"],
        "coord_scale" : 		1.0,
        "class_scale" : 		1.0,
        "object_scale" : 		5.0,
        "no_object_scale" : 	1.0
    },
    "weights" : {
        "full":   				"",
        "backend":              "imagenet"
    },
    "train" : {
        "actual_epoch":         50,
        "train_image_folder":   "/mnt/0d2701de-7997-4d53-99fb-0c9de0a116c9/K210_tools/data/imgs",
        "train_annot_folder":   "/mnt/0d2701de-7997-4d53-99fb-0c9de0a116c9/K210_tools/data/anns",
        "train_times":          2,
        "valid_image_folder":   "/mnt/0d2701de-7997-4d53-99fb-0c9de0a116c9/K210_tools/data/imgs_validation",
        "valid_annot_folder":   "/mnt/0d2701de-7997-4d53-99fb-0c9de0a116c9/K210_tools/data/anns_validation",
        "valid_times":          2,
        "valid_metric":         "mAP",
        "batch_size":           4,
        "learning_rate":        1e-4,
        "saved_folder":   		"raccoon_detector",
        "first_trainable_layer": "",
        "augumentation":				true,
        "is_only_detect" : 		false
    },
    "converter" : {
        "type":   				["k210"]
    }
}

Thank you for your time!
Best regards

json.decoder.JSONDecodeError: Expecting ',' delimiter: line 35 column 2 (char 1325)

What happened?

Note that this exception is used from _json

def __init__(self, msg, doc, pos):
    lineno = doc.count('\n', 0, pos) + 1
    colno = pos - doc.rfind('\n', 0, pos)
    errmsg = "%s: line %d column %d (char %d)" % (msg,lineno,colno,pos)
    ValueError.__init__(self, errmsg)
    self.msg = msg
    self.doc = doc
    self.pos = pos
    self.lineno = lineno
    self.colno = colno
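The lineno and colno attributes set in that constructor can be used to find the offending spot in a config file. The sketch below uses only the standard json module; the function name and the sample text are made up for illustration.

```python
import json


def locate_json_error(text):
    """Return None if text parses, else a dict describing where parsing stopped."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        lines = text.splitlines()
        return {
            "message": err.msg,
            "line": err.lineno,
            "column": err.colno,
            # The source line the parser was on when it gave up.
            "source": lines[err.lineno - 1] if err.lineno <= len(lines) else "",
        }
    return None


# A config fragment with a missing comma after the first entry.
broken = '{\n  "batch_size": 4\n  "learning_rate": 1e-4\n}'
print(locate_json_error(broken))
```

Note that "Expecting ',' delimiter" is reported where the parser gave up, which is usually the line after the missing comma. For an error at line 35 column 2, the end of line 34 is the first place to look.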

No output or errors from yolo model on K210 board despite being able to run on the PC

Describe the bug
There is no output on the Maix Bit board when attempting inference with a YOLO detector model trained with the aXeleRate scripts. No output or errors appear either on the screen or when printing via the serial terminal.

To Reproduce
Steps to reproduce the behavior:

Train 52 classes with the yolo detector script
Flash kmodel file onto board
Use k210_detector sample script as base and change labels to classes used
Run inference

Expected behavior
The board should output results to the screen or terminal, however there is no output or error whatsoever. The model works on my pc when I call the setup_inference() function.

Screenshots
If applicable, add screenshots to help explain your problem.

Environment (please complete the following information):

Local Jupyter Notebook in Conda with Python 3.8 running tensorflow 2.4 on Ubuntu 20.04

Additional context
I'm trying to train a playing card detection model, so I am teaching the model to recognise the corner of each card, which contains both the suit and the number. A standard deck of playing cards has 52 distinct cards, so I need 52 classes for this model. The model is trained on a synthetic dataset of 100k images with an average of 2.5 cards per image, for an average of about 4.5k samples per class.

For additional context, I trained a proof of concept model using only 4 cards namely ["As", "3h", "8c", "Jd"] and it worked without any issues. But once I expanded training to include all 52 cards, the model fails to work on my board.
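To see how the output layer grows between the 4-class and the 52-class model, the YOLOv2 output size can be worked out directly: each grid cell predicts, per anchor, 4 box coordinates, 1 objectness score and one score per class. The sketch below assumes a 224x224 input, 5 anchors and a stride of 32, none of which are stated in this report. Whether the larger output is what stops the model on the board is not established here.

```python
def yolo_v2_output_shape(input_size, num_anchors, num_classes, stride=32):
    """Return (grid_h, grid_w, channels) of a YOLOv2 output layer for a square input."""
    grid = input_size // stride
    channels = num_anchors * (5 + num_classes)  # 4 box coords + 1 objectness + classes
    return grid, grid, channels


def output_elements(shape):
    """Total number of values in an output tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


for classes in (4, 52):
    shape = yolo_v2_output_shape(224, num_anchors=5, num_classes=classes)
    print(classes, "classes ->", shape, output_elements(shape), "values")
# 4 classes -> (7, 7, 45) 2205 values
# 52 classes -> (7, 7, 285) 13965 values
```

Under these assumptions the 52-class output is more than six times larger than the 4-class one, so any reshape or output-size setting used on the device for the 4-class model would no longer match.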

[unstable branch] yolo training quantize or not can be switched by config.json

Is your feature request related to a problem? Please describe.
In axelerate/training.py, custom training is commented out and quantize training is the default choice.

Describe the solution you'd like
Add a boolean "quantize" param in config.json and switch training mode for yolo in training.py.
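A minimal sketch of the requested switch, assuming the new "quantize" key lives in the "train" section of the config and defaults to true so that current behaviour is kept. The function name and the returned mode names are hypothetical; the actual training functions in training.py are not shown here.

```python
def select_training_mode(config):
    """Pick the training mode from a config dict with an optional train.quantize flag."""
    quantize = config.get("train", {}).get("quantize", True)  # default keeps current behaviour
    if not isinstance(quantize, bool):
        raise ValueError('"quantize" must be true or false')
    return "quantize" if quantize else "custom"
```

training.py could then branch on the returned value instead of having one of the two code paths commented out.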

Support for Yolov3

Is your feature request related to a problem? Please describe.
Feature request to support the YoloV3 architecture and training.

Describe the solution you'd like

The ability to train a YOLOv3 model with custom data.
An example of how to deploy it to the K210 to recognize objects in images/video.
Performance (speed/accuracy) results reported and compared with yolov2 or others.

Describe alternatives you've considered
YoloV2 is currently supported. Yolov4/v5 were discounted in #38 (though it was indicated that Yolov3 should be supported).

Additional context

https://github.com/zhen8838/K210_Yolo_framework may be a good place to start, though I haven't had the model converge well with that.


`Project folder detector already exists. Creating a folder for new training session.
Tflite Converter ready
K210 Converter ready
['shark', 'dolphin', 'surfer', 'swimmer', 'bird', 'boogieboarder', 'boat', 'jetski', 'whale']