
keras_frcnn's Introduction

Keras Faster-RCNN

[UPDATE]

This work has been published on StrangeAI, an AI algorithm hub. You can find it here (you may find more interesting work on that website; it is a very good resource for learning AI, and the StrangeAI authors maintain all of its AI applications).

You can also subscribe to their official WeChat account.

This is a very useful implementation of Faster R-CNN based on TensorFlow and Keras. The model is clearly structured and saved as a single .h5 file, so it works out of the box and is easy to train on other datasets. If you have any questions, feel free to ask me via WeChat: jintianiloveu

Update

This code only supports Keras 2.0.3; newer versions cause errors. If you can fix it, feel free to send me a PR.

Requirements

This code supports both Python 2.7 and Python 3.5. The following packages should be installed:

  • tensorflow
  • keras
  • scipy
  • cv2

Out-of-the-box model for prediction

I have trained a model on KITTI. I will add a Dropbox link here later. Let's look at a prediction result:

Train New Dataset

Training on a new dataset is also simple and straightforward. Convert your detection label file, whatever its format, into this format:

/path/training/image_2/000000.png,712.40,143.00,810.73,307.92,Pedestrian
/path/training/image_2/000001.png,599.41,156.40,629.75,189.25,Truck

That is, /path/to/img.png,x1,y1,x2,y2,class_name. With this simple file, no separate class map file is needed; the training program collects the class names and counts automatically.
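
The label format above can be read with a few lines of Python. This is a minimal sketch, not the repository's own parser: the function name `parse_simple_labels` and the returned structures are assumptions made for illustration. It shows how class names can be collected from the file itself, so that no class map file is required.

```python
import csv
from collections import Counter


def parse_simple_labels(lines):
    """Parse 'path,x1,y1,x2,y2,class_name' rows (hypothetical helper)."""
    images = {}               # image path -> list of boxes
    class_counts = Counter()  # class name -> number of boxes
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        path, x1, y1, x2, y2, class_name = row
        box = {"class": class_name,
               "x1": float(x1), "y1": float(y1),
               "x2": float(x2), "y2": float(y2)}
        images.setdefault(path, []).append(box)
        class_counts[class_name] += 1
    # Assign integer ids in order of first appearance.
    class_mapping = {name: idx for idx, name in enumerate(class_counts)}
    return images, class_counts, class_mapping


sample = [
    "/path/training/image_2/000000.png,712.40,143.00,810.73,307.92,Pedestrian",
    "/path/training/image_2/000001.png,599.41,156.40,629.75,189.25,Truck",
]
images, class_counts, class_mapping = parse_simple_labels(sample)
```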

For Predict

If you want to see how good your trained model is, simply run:

python test_frcnn_kitti.py

You can also use -p to specify a single image to predict, or pass a path to a folder containing many images; the program detects which one it is automatically.
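
A sketch of how a single -p argument could accept either one image or a folder of images. The helper name `collect_images`, the long option `--path`, and the extension list are assumptions for illustration; the actual logic in test_frcnn_kitti.py may differ.

```python
import argparse
import os

# Assumed set of image extensions; adjust to your data.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".bmp")


def collect_images(path):
    """Return a list of image files for a single file or a directory."""
    if os.path.isdir(path):
        return sorted(
            os.path.join(path, name)
            for name in os.listdir(path)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return [path]


def parse_args(argv=None):
    """Parse the command line; '--path' is a hypothetical long option name."""
    parser = argparse.ArgumentParser(description="Run detection on images.")
    parser.add_argument("-p", "--path", required=True,
                        help="a single image or a directory of images")
    return parser.parse_args(argv)


# Example: args = parse_args(["-p", "images/"]); files = collect_images(args.path)
```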

That's all, hope you enjoy!

keras_frcnn's People

Contributors

lucasjinreal

keras_frcnn's Issues

cfg.model_path setting problem

Shall I use the pre-trained model weights here (e.g. "resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5")
OR
shall I use the architecture of residual network (e.g. resnet50.py)?

Thanks,
R

GPU memory usage + speed

Hi, I'm fairly new to this field. I'm studying computer engineering and I'm trying to train your network on my own data. The problem is that when I launch training I get:
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      3113      C   python                                        74MiB |
+-----------------------------------------------------------------------------+
This is on a GeForce GTX 980 with 4036MiB, and one epoch takes about 4 hours.
I don't know whether this is due to my data, the network, or some option I have not discovered yet. I would really appreciate any help.
Thanks in advance.

Please help me (./model/kitti_frcnn_last.hdf5)

Unable to open file (unable to open file: name = './model/kitti_frcnn_last.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0).

How to solve this problem?
Where to get this file?
Please help me figure it out.

Get feature-map of a given region in the image from the ground truth

Hello!

Do you have any idea how I can get the last-layer feature map of a given region (from the ground truth) after passing the entire image through the RPN?

In particular:

# get the feature maps and output from the RPN
[Y1, Y2, F] = model_rpn.predict(X)

I am assuming that F is the feature map. I want to get the feature map of a small region in the image (a pedestrian from the ground truth), because I want to use it to train an SVM classifier.

Cheers!
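
One possible approach, sketched under assumptions that are not confirmed in this thread: F is a channels-last array of shape (1, H, W, C), the backbone downsamples by a factor of 16, and the ground-truth box is given in the coordinates of the resized image that was fed to the network. The helper `crop_feature_map` is hypothetical.

```python
import numpy as np


def crop_feature_map(feature_map, box, stride=16):
    """Return the slice of a (1, H, W, C) feature map covered by a pixel-space box."""
    x1, y1, x2, y2 = box
    height, width = feature_map.shape[1:3]
    # Map pixel coordinates to feature-map cells, rounding outward,
    # and clamp so the crop is never empty or out of bounds.
    fx1 = min(max(int(np.floor(x1 / stride)), 0), width - 1)
    fy1 = min(max(int(np.floor(y1 / stride)), 0), height - 1)
    fx2 = min(max(int(np.ceil(x2 / stride)), fx1 + 1), width)
    fy2 = min(max(int(np.ceil(y2 / stride)), fy1 + 1), height)
    return feature_map[0, fy1:fy2, fx1:fx2, :]


# Dummy array standing in for F from model_rpn.predict(X).
F = np.zeros((1, 38, 50, 1024), dtype=np.float32)
region = crop_feature_map(F, (160, 32, 320, 96))
# Averaging over the spatial axes gives a fixed-length vector for an SVM.
descriptor = region.mean(axis=(0, 1))
```

Because regions differ in size, some pooling step (here a simple spatial mean) is needed before the features can be fed to a classifier that expects fixed-length input.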

Where to find the KITTI dataset you talk about?

Hello, I can't find where to get the KITTI dataset you work with (what is the exact URL?). There are a lot of datasets on the website.
And were the annotation txt files (kitti_simple_labels) created by you by hand, or do they already come with KITTI?
Thanks a lot for your help and your very good work!!!
Dave.

rpn_loss question

There is a change in rpn_cls_loss, y_true[:anchor_nums] -> y_true[anchor_nums:], and I would like to know why this change was made.
Thanks a lot!

Training on own dataset but can't find hdf5

@jinfagang
Could you explain what kitti_frcnn_last.hdf5 is? My understanding is that it is a file that stores the weights (or the network) so that you do not have to train from scratch. So what should I do about it if I want to train on my own data?
TBH this is not a really useful README!!

It only works for the first object class

I trained this model on VOC2012 and got the hdf5 file. But when I run the test file, only the first object class is detected. If I test it on images containing other kinds of objects, it doesn't work at all.

Time per epoch on CPU

Please tell me how long one epoch takes on the CPU. Training has been stuck at the first epoch and does not progress.

No objects detected in test images?

After training on all images (in my experiment there is only one class besides background), the loss gets very low. But when I test on other images, no object is detected. I am so confused.

Keras 2.1.x incompatibility

Upgrading Keras to version 2.1.x gives this error:

Starting training
Epoch 1/3000
Exception: Error when checking target: expected rpn_out_class to have shape (None, None, None, 9) but got array with shape (1, 56, 38, 18)
Exception: Error when checking target: expected rpn_out_class to have shape (None, None, None, 9) but got array with shape (1, 50, 38, 18)
Exception: Error when checking target: expected rpn_out_class to have shape (None, None, None, 9) but got array with shape (1, 43, 38, 18)
Exception: Error when checking target: expected rpn_out_class to have shape (None, None, None, 9) but got array with shape (1, 38, 38, 18)
...

I cannot train the model and always get errors like this. Any info will be appreciated!

Average number of overlapping bounding boxes from RPN = 0.0 for 1000 previous iterations
RPN is not producing bounding boxes that overlap the ground truth boxes. Check RPN settings or keep training.
Average number of overlapping bounding boxes from RPN = 0.0 for 1000 previous iterations
RPN is not producing bounding boxes that overlap the ground truth boxes. Check RPN settings or keep training.

About the model weight loading in training stage

In the training script config.model_path is used for loading weights.

https://github.com/jinfagang/keras_frcnn/blob/5a8610c2ef45b2fb38fc2fbebf4c726e680b8b3e/train_frcnn_kitti.py#L91-L94

However, finally the config.model_path is used for export.

https://github.com/jinfagang/keras_frcnn/blob/5a8610c2ef45b2fb38fc2fbebf4c726e680b8b3e/train_frcnn_kitti.py#L240-L243

Can you explain these lines?

I guess that you copied the code from https://github.com/yhenon/keras-frcnn, where the original code is

try:
	print('loading weights from {}'.format(C.base_net_weights))
	model_rpn.load_weights(C.base_net_weights, by_name=True)
	model_classifier.load_weights(C.base_net_weights, by_name=True)
except:
	print('Could not load pretrained model weights.')

Can you explain this change? @jinfagang

Something went wrong when I ran 'python test_frcnn_kitti.py'

My environment is Anaconda with Python 2.7, and I have installed TensorFlow, but when I run the command, something goes wrong:
Traceback (most recent call last):
  File "test_frcnn_kitti.py", line 200, in <module>
    predict(args)
  File "test_frcnn_kitti.py", line 144, in predict
    cfg = pickle.load(f_in)
  File "/root/anaconda2/lib/python2.7/pickle.py", line 1384, in load
    return Unpickler(file).load()
  File "/root/anaconda2/lib/python2.7/pickle.py", line 864, in load
    dispatch[key](self)
  File "/root/anaconda2/lib/python2.7/pickle.py", line 892, in load_proto
    raise ValueError, "unsupported pickle protocol: %d" % proto
ValueError: unsupported pickle protocol: 3

Can you provide a config saved with pickle protocol 2?
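
A possible workaround, sketched with the standard pickle module: load the config under Python 3 and write it back with protocol 2, which Python 2.7 can read. The function name and the file names in the usage comment are hypothetical, and the sketch assumes the class of the pickled object is importable in the Python 3 environment that runs it.

```python
import pickle


def resave_with_protocol(src_path, dst_path, protocol=2):
    """Load a pickle under Python 3 and write it back with an older protocol."""
    with open(src_path, "rb") as f_in:
        obj = pickle.load(f_in)
    with open(dst_path, "wb") as f_out:
        pickle.dump(obj, f_out, protocol=protocol)


# Hypothetical usage (run with Python 3):
# resave_with_protocol("config.pickle", "config_py2.pickle")
```

Alternatively, running the test script with the same Python version that produced the config avoids the conversion entirely.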

background detection in roi_helpers.py

Hi,

in roi_helpers.py, you use the following code to neglect some region proposals and detect background:

neglect region proposals:

if best_iou < C.classifier_min_overlap:
    continue

detect background:

if C.classifier_min_overlap <= best_iou < C.classifier_max_overlap:
    cls_name = 'bg'

but I think these two should be interchanged since for a region proposal, the less IoU with GT boxes, the higher probability it would be background.

Please let me know if I am wrong. Thank you!
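
To make the three cases in the quoted snippet explicit, here is a small sketch of the labelling rule as written. The function name and the threshold values 0.1 and 0.5 are assumptions chosen for illustration, not values confirmed for this repository.

```python
def label_proposal(best_iou, min_overlap=0.1, max_overlap=0.5):
    """Classify a region proposal by its best IoU with any ground-truth box.

    Returns None when the proposal is skipped, 'bg' for a background sample,
    and 'object' for a positive sample.
    """
    if best_iou < min_overlap:
        return None       # skipped: not used for classifier training
    if best_iou < max_overlap:
        return "bg"       # partial overlap: used as a background example
    return "object"       # high overlap: takes the class of the matched box


labels = [label_proposal(iou) for iou in (0.05, 0.3, 0.7)]
```

Read this way, the middle band selects background samples that partly overlap an object, which are harder negatives than boxes with almost no overlap. Whether that is the intended design here is for the author to confirm.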

Mask R-CNN

Great implementation, I love it! Probably the simplest I've found so far! I'm only wondering whether you're planning to add the Mask branch anytime soon?

How to use measure_map.py?

I have a new val_kitti_simple_label.txt for computing the mAP, but when I add its path to measure_map.py, I get this error:
TypeError: non_max_suppression_fast() got multiple values for argument 'overlap_thresh'
Can anyone tell me the correct steps?
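
This kind of TypeError is raised by Python itself when an argument is supplied both positionally and by keyword. The stub below is hypothetical (its signature is an assumption, not the real non_max_suppression_fast) and only reproduces the mechanism.

```python
def nms_stub(boxes, probs, overlap_thresh=0.9, max_boxes=300):
    """Hypothetical stand-in with an assumed signature."""
    return overlap_thresh, max_boxes


try:
    # The third positional argument already binds overlap_thresh,
    # so passing it again by keyword is an error.
    nms_stub([], [], True, overlap_thresh=0.7)
except TypeError as exc:
    message = str(exc)

# Passing each argument exactly once works.
result = nms_stub([], [], overlap_thresh=0.7)
```

If this is the cause, comparing the call in measure_map.py with the function definition in roi_helpers.py should show which extra positional argument is being passed.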

Not an issue. Asking for suggestions. How to generate bounding boxes for our dataset ?

Is there a quick way to generate bounding boxes for our own dataset ? Is there any alternative to doing it manually ?
I followed a simple approach: a Python script that splits the images into fixed-size blocks and takes user input to classify each block.

import os

import cv2

BLOCK_H, BLOCK_W = 80, 105  # block height and width in pixels
KEY_TO_LABEL = {ord("f"): "Face", ord("g"): "Gesture"}

# annotation file
with open("annotations.txt", "w") as out:
    # split each image into fixed-size blocks and label them by key press
    for filename in os.listdir("."):
        frame_small = cv2.imread(filename)
        if frame_small is None:
            continue  # not a readable image (e.g. annotations.txt itself)
        gray = cv2.cvtColor(frame_small, cv2.COLOR_BGR2GRAY)
        height, width = gray.shape
        for h in range(0, height - BLOCK_H + 1, BLOCK_H):
            for w in range(0, width - BLOCK_W + 1, BLOCK_W):
                roi_gray = gray[h:h + BLOCK_H, w:w + BLOCK_W]
                title = "{}split{},{}--{},{}".format(
                    filename, h, w, h + BLOCK_H, w + BLOCK_W)
                cv2.imshow(title, roi_gray)
                label = KEY_TO_LABEL.get(cv2.waitKey(0), "DontCare")
                print("Classified " + label)
                if label != "DontCare":
                    # path,x1,y1,x2,y2,class_name
                    out.write("images_local/{},{},{},{},{},{}\n".format(
                        filename, w, h, w + BLOCK_W, h + BLOCK_H, label))
                cv2.destroyAllWindows()

Is there a better and faster approach ?

where does roi_input come from?

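In train_frcnn_kitti.py, roi_input = Input(shape=(None, 4)) is defined and later used in roi_pooling_conv inside nn.classifier, which extracts x, y, w, h from roi_input and then resizes the image region of that RoI to the pooling size. But roi_input is never initialized, so where do x, y, w, h come from? Any help from the experts would be greatly appreciated.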
@jinfagang

cv.imread() returns None

In simple_parser.py, the variable "filename" is read from the first line of kitti_simple_label.txt, and cv2.imread() returns None for it. The filename is "/media/jintian/.....", yet there is no /media/ folder on my machine. It's weird. Has anyone else run into the same problem?
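
cv2.imread() returns None when it cannot read the file at the given path, so a label file that still contains another machine's absolute paths would produce this symptom. A sketch of one way to check and fix the paths; the helper names and the example prefixes are hypothetical.

```python
import os


def rewrite_prefix(lines, old_prefix, new_prefix):
    """Replace a leading path prefix in 'path,x1,y1,x2,y2,class' lines."""
    fixed = []
    for line in lines:
        if line.startswith(old_prefix):
            line = new_prefix + line[len(old_prefix):]
        fixed.append(line)
    return fixed


def missing_images(lines):
    """Return the image paths in the label lines that do not exist on disk."""
    paths = (line.split(",", 1)[0] for line in lines if line.strip())
    return [path for path in paths if not os.path.isfile(path)]


# '/old/root' and '/new/root' are placeholder prefixes.
labels = ["/old/root/image_2/000000.png,712.40,143.00,810.73,307.92,Pedestrian"]
fixed = rewrite_prefix(labels, "/old/root", "/new/root")
```

Running `missing_images` on the rewritten lines before training gives a quick list of any paths that still do not resolve.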

rpn_out_class size is wrong ?

Judging from the paper, I think the rpn_out_class size should be 18 instead of 9, because background is also included.

x_class = Conv2D(num_anchors, (1, 1), activation='sigmoid', kernel_initializer='uniform', name='rpn_out_class')(x)

should be changed to
x_class = Conv2D(num_anchors * 2, (1, 1), activation='sigmoid', kernel_initializer='uniform', name='rpn_out_class')(x)

Is it? I'm also not very sure.
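
One possible reading, assuming this fork keeps the target layout of the upstream yhenon/keras-frcnn code (an assumption, not confirmed in this thread): the network predicts one sigmoid objectness score per anchor (num_anchors channels), while the training target packs two blocks of num_anchors channels each, a validity mask followed by the 0/1 labels. Under that assumption the loss slices the target, as in this NumPy sketch.

```python
import numpy as np


def rpn_cls_loss(y_true, y_pred, num_anchors, eps=1e-4):
    """Masked binary cross-entropy over anchors (illustrative sketch).

    y_true: (..., 2 * num_anchors) = [validity mask | object labels]
    y_pred: (..., num_anchors) sigmoid scores
    """
    valid = y_true[..., :num_anchors]    # 1 where the anchor is used for training
    labels = y_true[..., num_anchors:]   # 1 for object, 0 for background
    p = np.clip(y_pred, 1e-7, 1 - 1e-7)
    bce = -(labels * np.log(p) + (1 - labels) * np.log(1 - p))
    return float(np.sum(valid * bce) / (eps + np.sum(valid)))


num_anchors = 9
y_true = np.ones((1, 2, 2, 2 * num_anchors), dtype=np.float32)
y_pred = np.full((1, 2, 2, num_anchors), 0.5, dtype=np.float32)
loss = rpn_cls_loss(y_true, y_pred, num_anchors)
```

With this layout, a 9-channel output compared against an 18-channel target is intentional, and a two-way softmax with 18 output channels would be a different formulation of the same binary decision.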

frcnn bad results, which parameters to choose?

Hello,

I am trying to train Faster R-CNN on the PASCAL VOC dataset, but the network does not seem to learn (or maybe it is too slow?). I trained for 10 epochs and the loss was still above 3. Did anyone train it successfully? With which parameters?

EDIT: the training loss can go below 1.0, but the validation loss is still high, around 2.5.

get expected rpn_out_class to have shape ....

I used my own training data and got these errors:

Num classes (including bg) = 17
Num train samples 2125
Num val samples 394
loading weights from ./model/resnet50_weights_tf_dim_ordering_tf_kernels.h5
Unable to open file (Unable to open file: name = './model/kitti_frcnn_last.hdf5', errno = 2, error message = 'no such file or directory', flags = 0, o_flags = 0)
Could not load pretrained model weights. Weights can be found in the keras application folder https://github.com/fchollet/keras/tree/master/keras/applications
Starting training
Epoch 1/3000
## Exception: Error when checking target: expected rpn_out_class to have shape (None, None, None, 9) but got array with shape (1, 38, 42, 18)

CNTK as backend

Hi
Thanks for sharing the code.
Since CNTK can also now be used as Keras backend, can that be used instead of TensorFlow?

Regards
Wajahat

Question get_img_output_length

I am trying to understand the code a bit, and I am wondering what get_img_output_length in vgg.py and resnet.py means. Why does it return the width and height divided by 16?
I thought it was about the stride, but it does not seem to be.
Moreover, why does this function differ between resnet.py and vgg.py?
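
A sketch of the two computations, assuming they match the upstream yhenon/keras-frcnn definitions (an assumption; check vgg.py and resnet.py in this repository). Both correspond to a total downsampling factor of 16, but the ResNet version also accounts for zero padding and kernel sizes, so the rounding can differ by one.

```python
def vgg_output_length(input_length):
    """Four 2x2 poolings in the shared layers: an overall stride of 16."""
    return input_length // 16


def resnet_output_length(input_length):
    """Follow the padded, strided layers one by one (assumed layer list)."""
    input_length += 6  # zero padding before the first convolution
    stride = 2
    for filter_size in (7, 3, 1, 1):  # four stride-2 stages
        input_length = (input_length - filter_size + stride) // stride
    return input_length


vgg_len = vgg_output_length(600)
resnet_len = resnet_output_length(600)
```

For an input side of 600 pixels the first gives 37 and the second gives 38, which is why the anchor grid size depends on the backbone even though both have an effective stride of 16.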

Issue with BGR and RGB

https://github.com/jinfagang/keras_frcnn/blob/8b5d8fa2f1a11b83ebe91ef8486dd2ee5d620ede/keras_frcnn/data_generators.py#L316

From what I understand, to subtract the mean pixel value from ImageNet, this should be correct:

# For TensorFlow
# x[:, :, :, (B, G, R)]
# Subtract ImageNet mean pixel
x[:, :, :, 0] -= 103.939  # B
x[:, :, :, 1] -= 116.779  # G
x[:, :, :, 2] -= 123.68   # R

I noticed you read the image using cv2, whose default channel order is BGR. But then you convert it to RGB and subtract the ImageNet mean pixel values.

What you did is like this:
# x[:, :, :, (R, G, B)]
# Subtract ImageNet mean pixel
x[:, :, :, 0] -= 103.939  # R
x[:, :, :, 1] -= 116.779  # G
x[:, :, :, 2] -= 123.68   # B

Am I correct? Please check. Thanks
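
The point above can be illustrated with a small NumPy sketch that keeps the mean values tied to the channel order. The function name and the `channel_order` parameter are assumptions for illustration; the constants are the per-channel ImageNet means quoted in this issue, listed in B, G, R order.

```python
import numpy as np

# ImageNet mean pixel values in B, G, R order, as quoted above.
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def subtract_mean(img, channel_order="BGR"):
    """Subtract the per-channel mean from an (H, W, 3) image."""
    img = img.astype(np.float32)
    if channel_order == "BGR":
        return img - IMAGENET_MEAN_BGR
    if channel_order == "RGB":
        return img - IMAGENET_MEAN_BGR[::-1]  # reverse means to R, G, B
    raise ValueError("channel_order must be 'BGR' or 'RGB'")


bgr = np.full((2, 2, 3), 200, dtype=np.uint8)  # dummy image as read by cv2
rgb = bgr[:, :, ::-1]                          # same image in RGB order
out_bgr = subtract_mean(bgr, "BGR")
out_rgb = subtract_mean(rgb, "RGB")
```

Either order works as long as the means follow the channels; the mismatch described in this issue arises when the image is converted to RGB but the means stay in BGR order.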
