ImageClassifyTool
All steps for building the training and test environment have been tested on Ubuntu 18.04. An Anaconda Python environment is recommended.
First, install Python 3.6 or 3.7 (TensorFlow 1.15 does not provide pip packages for Python 3.8 or later).
sudo apt install python3.6
sudo apt install -y python3-pip
Install Keras and the GPU build of TensorFlow 1.15.
pip3 install keras
pip3 install tensorflow-gpu==1.15
Install OpenCV for Python.
sudo apt install -y python3-opencv
Now enter the ImageClassifyTool directory.
cd ImageClassifyTool
The folder structure and the purpose of each file are as follows:
dataclassify: image dataset directory for training the classification model
datadetect: image dataset directory for training the YOLO detection model
model_data: directory of trained models used for testing and training
model_training: working directory used while training a model
testclassify: test directory for image classification
testdetect: test directory for object detection
trainclassify.py: training script for the classification model
traindetect.py: training script for the YOLO model
image_classify.py: benchmark script for classification
image_detect.py: benchmark script for detection
1. Train Image classification
Data collection
First, we need image data for training. There are several ways to scrape an image dataset. This project provides a Python script that downloads images from a list of URLs. Enter the “dataclassify” directory and run the following commands.
python3 imagedownloder.py --urls=urls_1.txt --dir=class01
python3 imagedownloder.py --urls=urls_2.txt --dir=class02
Each command downloads all images in the given URL list into the given directory (class01 or class02). You can also add images downloaded from Google image search or frames grabbed from video. At least 10,000 images are recommended for training.
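The download step can be sketched roughly as follows. This is a minimal illustration and not the actual imagedownloder.py: the function names, the one-URL-per-line list format with "#" comments, and the output file naming are all assumptions.

```python
import os
import urllib.request
from urllib.parse import urlparse


def read_url_list(path):
    """Return the non-empty, non-comment lines of a URL list file."""
    with open(path, encoding="utf-8") as handle:
        lines = [line.strip() for line in handle]
    return [line for line in lines if line and not line.startswith("#")]


def target_name(url, index):
    """Build a unique local file name from the URL path and its position."""
    base = os.path.basename(urlparse(url).path) or "image"
    return f"{index:06d}_{base}"


def download_all(urls, out_dir, timeout=10):
    """Download every URL into out_dir, skipping the ones that fail."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        dest = os.path.join(out_dir, target_name(url, index))
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                data = response.read()
        except (OSError, ValueError) as error:
            # URLError and HTTPError are subclasses of OSError.
            print(f"skipped {url}: {error}")
            continue
        with open(dest, "wb") as out:
            out.write(data)
        saved.append(dest)
    return saved
```

With these hypothetical helpers, download_all(read_url_list("urls_1.txt"), "class01") would mirror the first command above. Prefixing each file name with its index keeps two URLs that end in the same name from overwriting each other.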
Data Preprocessing
Once the images are ready, split them into two sets: a training set and a validation set. For training, the images must also be converted to 224x224 pixels. Run the following commands.
python3 imagesplitetrain.py --infolder class01 --classname 0
python3 imagesplitetrain.py --infolder class02 --classname 1
After running these commands, check that the “train” and “validation” directories have been created.
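The split itself can be illustrated with a short sketch. It is not the actual imagesplitetrain.py: the 80/20 ratio, the fixed random seed, the output layout (train/<class> and validation/<class>) and the function names are assumptions, and the resize to 224x224 (for example with cv2.resize) is left out so that the sketch needs only the standard library.

```python
import os
import random
import shutil

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def split_files(names, val_fraction=0.2, seed=0):
    """Shuffle names reproducibly and return (train, validation) lists."""
    ordered = sorted(names)
    random.Random(seed).shuffle(ordered)
    n_val = int(len(ordered) * val_fraction)
    return ordered[n_val:], ordered[:n_val]


def copy_split(in_dir, out_root, class_name, val_fraction=0.2, seed=0):
    """Copy images from in_dir into out_root/{train,validation}/class_name."""
    names = [n for n in os.listdir(in_dir)
             if n.lower().endswith(IMAGE_EXTENSIONS)]
    train, val = split_files(names, val_fraction, seed)
    for subset, members in (("train", train), ("validation", val)):
        dest = os.path.join(out_root, subset, str(class_name))
        os.makedirs(dest, exist_ok=True)
        for name in members:
            shutil.copy2(os.path.join(in_dir, name), os.path.join(dest, name))
    return len(train), len(val)
```

Sorting before shuffling with a fixed seed makes the split reproducible, so the same image does not move between the training and validation sets from one run to the next.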
Training the classifier
Now we can start training. Return to the project root directory and run the following command.
python3 trainclassify.py --classes=2 --size=224 --batch=64 --epochs=100 --weights=False --tclasses=0
Parameter explanation
--classes, The number of classes in the dataset.
--size, The image size (in pixels) of the training samples.
--batch, The number of training samples per batch.
--epochs, The number of training epochs.
--weights, The weights file to fine-tune from, or False to train from scratch.
--tclasses, The number of classes of the pre-trained model (0 if no pre-trained model is used).
Here the --weights and --tclasses parameters are False and 0 because no pre-trained model is used. When training completes, the trained model is saved as “trained_classifymodel.h5” in the “model_training/logclassify” directory. To test it, copy this file to the “model_data” directory.
cp model_training/logclassify/trained_classifymodel.h5 model_data/trained_classifymodel.h5
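For reference, the command-line parameters described above could be parsed along the following lines. This is a hedged sketch, not the code of trainclassify.py: the defaults, the helper names, and the rule that --tclasses must be positive when --weights is given are assumptions.

```python
import argparse


def weights_arg(value):
    """Map the literal string 'False' to None; otherwise keep the file path."""
    return None if value.lower() in ("false", "none", "") else value


def build_parser():
    """Create a parser for the six documented training parameters."""
    parser = argparse.ArgumentParser(description="Train the image classifier")
    parser.add_argument("--classes", type=int, required=True,
                        help="number of classes in the dataset")
    parser.add_argument("--size", type=int, default=224,
                        help="input image size in pixels")
    parser.add_argument("--batch", type=int, default=64,
                        help="training samples per batch")
    parser.add_argument("--epochs", type=int, default=100,
                        help="number of training epochs")
    parser.add_argument("--weights", type=weights_arg, default=None,
                        help="weights file to fine-tune from, or False")
    parser.add_argument("--tclasses", type=int, default=0,
                        help="number of classes of the pre-trained model")
    return parser


def parse(argv):
    """Parse argv and reject inconsistent fine-tuning options."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.weights is not None and args.tclasses <= 0:
        parser.error("--tclasses must be positive when --weights is given")
    return args
```

Converting the string "False" to None up front avoids a common pitfall: a non-empty string such as "False" is truthy in Python, so a plain "if args.weights:" check would otherwise try to load a file named False.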
Fine-Tuning
To fine-tune a trained model, either to improve accuracy or to add another class, run the following command. Before fine-tuning, check the accuracy on the training data and the number of classes, and make sure the input image size matches that of the original model. You can download a pre-trained model to classify adult, soccer and other here.
python3 trainclassify.py --classes=3 --size=224 --batch=64 --epochs=100 --weights=trained_model.h5 --tclasses=2
Test and Result
Place the images you want to test in the "testclassify" directory and run the following script.
python3 image_classify.py
Calculate ROC and Threshold for the Classifier
This step requires the scikit-learn package. Install it, then run the script.
pip install scikit-learn
python3 image_calcroc.py
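To make the ROC and threshold calculation concrete, here is a small pure-Python sketch. It is not the code of image_calcroc.py: the function names are hypothetical, and choosing the threshold by Youden's J statistic (the largest gap between true positive rate and false positive rate) is an assumption about what "threshold" means here. scikit-learn's sklearn.metrics.roc_curve computes the same kind of curve from labels and scores.

```python
def roc_points(labels, scores):
    """Return (threshold, fpr, tpr) triples, one per distinct score.

    A sample is predicted positive when its score is >= threshold.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points, tp, fp = [], 0, 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample sharing this score.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((score, fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule, starting at (0, 0)."""
    area, prev_fpr, prev_tpr = 0.0, 0.0, 0.0
    for _, fpr, tpr in points:
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_fpr, prev_tpr = fpr, tpr
    return area


def best_threshold(points):
    """Pick the threshold that maximizes Youden's J statistic (tpr - fpr)."""
    return max(points, key=lambda p: p[2] - p[1])[0]
```

For labels [0, 0, 1, 1] and scores [0.1, 0.4, 0.35, 0.8], auc(roc_points(labels, scores)) gives 0.75. Other threshold criteria are possible, for example fixing a maximum acceptable false positive rate.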
2. YOLO Training
Data collection
As with classification, we could scrape an image dataset. This project instead uses the public Pascal VOC dataset. The Pascal VOC challenge dataset is widely used for building and evaluating algorithms for image classification, object detection, and segmentation.
Enter the “datadetect” directory and download the data with the following commands.
cd datadetect
wget https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar
wget https://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar
wget https://pjreddie.com/media/files/VOCtest_06-Nov-2007.tar
tar xf VOCtrainval_11-May-2012.tar
tar xf VOCtrainval_06-Nov-2007.tar
tar xf VOCtest_06-Nov-2007.tar
There will now be a VOCdevkit/ subdirectory with all the VOC training data in it. The following command builds the training image list.
python3 maketrainlist.py
This script keeps only five object classes (person, car, bicycle, dog, cat) and writes the training list to the “detecttrain.txt” file. If you use images scraped from the internet instead, they must first be labeled so that the detector can learn to detect the objects in them, such as cats and dogs. Images are usually labeled with an annotation tool such as "labelImg"; labeling is not covered here.
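The class filtering can be sketched as follows for a single Pascal VOC annotation file. This is an illustration, not the actual maketrainlist.py: the function names, the class-id order, and the "path x1,y1,x2,y2,class" line format are assumptions, and the real format of detecttrain.txt may differ.

```python
import xml.etree.ElementTree as ET

# Assumed class order; the index in this list becomes the class id.
WANTED = ["person", "car", "bicycle", "dog", "cat"]


def boxes_from_annotation(xml_text, wanted=WANTED):
    """Return [(xmin, ymin, xmax, ymax, class_id), ...] for wanted classes."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        if name not in wanted:
            continue
        box = obj.find("bndbox")
        # Some annotations store coordinates as floats, so convert via float.
        coords = [int(float(box.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((*coords, wanted.index(name)))
    return boxes


def train_line(image_path, boxes):
    """Format one line as 'path x1,y1,x2,y2,cls ...' (assumed list format)."""
    parts = [",".join(str(v) for v in box) for box in boxes]
    return " ".join([image_path] + parts)
```

Images whose annotation contains none of the five classes yield an empty box list. Whether such images are skipped or kept as background examples is a design choice that this sketch leaves to the caller.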
Training
Now we can start training. Return to the project root directory and run the following script.
python3 traindetect.py
When training completes, the trained model is saved as “trained_weights_final.h5” in the “model_training/logdetect” directory.
To test it, copy this file to the “model_data” directory.
cp model_training/logdetect/trained_weights_final.h5 model_data/yolo_trained_weights_final.h5
Test
Place the images you want to test in the "testdetect" directory and run the following script.
python3 image_detect.py
3. Troubleshooting
cannot import cv2
pip install opencv-python
cannot import pandas
pip install pandas
cannot import sklearn
pip install scikit-learn
cannot import matplotlib
pip install matplotlib
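The import errors above can be checked in one pass before running any script. The sketch below is a hypothetical helper, not part of the project; it uses only importlib from the standard library and the package names from the list above.

```python
import importlib.util

# Import name -> pip package name, taken from the troubleshooting list above.
REQUIRED = {
    "cv2": "opencv-python",
    "pandas": "pandas",
    "sklearn": "scikit-learn",
    "matplotlib": "matplotlib",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of modules that cannot be found."""
    return [pip_name for module, pip_name in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("pip install " + " ".join(missing))
    else:
        print("all required packages are importable")
```

Note that the import name and the pip package name differ for OpenCV (cv2 versus opencv-python) and scikit-learn (sklearn versus scikit-learn).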