kleinyuan / tf-3d-object-detection

Detect objects in 3D with Point Cloud and Image.

License: MIT License

Python 100.00%
3d-object-detection tensorflow frustum-pointnet kitti

tf-3d-object-detection's Introduction

Summary

[3d result image]

(Below is a result on a sample from the KITTI 3D Object Detection Dataset)

[semi-end-to-end pipeline diagram]

Run demo

1. Requirements

  • macOS or Ubuntu

  • TensorFlow

  • Mayavi (visualization only)

  • OpenCV

  • Anaconda (optional, but preferred)

2. Clone this repo

git clone https://github.com/KleinYuan/tf-3d-object-detection.git

Then install the dependencies:

# From the directory where you cloned the repo:
cd tf-3d-object-detection
pip install -r requirements.txt

If pip fails to install a dependency such as OpenCV, run conda install opencv if you use Anaconda; otherwise, build it from source.
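Before running the demo, you can quickly check that the key dependencies resolve. Below is a hypothetical sanity-check snippet (not part of this repo); `cv2` is OpenCV's import name, and `mayavi` is only needed for visualization.

```python
# Hypothetical check: list which of the demo's dependencies are missing.
import importlib.util

def missing_deps(names):
    """Return the subset of module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

print(missing_deps(["tensorflow", "cv2", "mayavi"]))
```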

3. Pick a 2D Object Detection Model

This project supports five different 2D detection models:

| Model name | Speed | COCO mAP | Outputs |
| --- | --- | --- | --- |
| ssd_mobilenet_v1_coco | fast | 21 | Boxes |
| ssd_inception_v2_coco | fast | 24 | Boxes |
| rfcn_resnet101_coco | medium | 30 | Boxes |
| faster_rcnn_resnet101_coco | medium | 32 | Boxes |
| faster_rcnn_inception_resnet_v2_atrous_coco | slow | 37 | Boxes |

Pick whichever model suits you, find it in the _DETECTOR_2D_OPTIONS list in configs/configs.py, and set _DETECTOR_2D_MODEL_NAME to that value.

By default, ssd_mobilenet_v1_coco_11_06_2017 is used because it is fast.

4. Download Test Data

The KITTI license is long, and I agreed to its terms when downloading, so rather than risk redistributing KITTI data here, I will tell you how to get it yourself:

# Step 1: Go to http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d
# Step 2: Download "left color images of object data set (12 GB)"
# Step 3: Download "Velodyne point clouds, if you want to use laser information (29 GB)"
# Step 4: Download "camera calibration matrices of object data set (16 MB)"
# Step 5: Unzip the three archives; you will find ~7000 training samples, each pairing a Velodyne scan, an image, and a calibration file
# Step 6: Pick one sample, copy it into the example_data folder, rename the image to 1.png, and rename the Velodyne file to 1.bin
# Step 7: Open the calibration file, find the corresponding entries, and use them to replace CALIB_PARAM in configs/configs.py (by default it comes from training/000000.txt)
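The calibration lookup in the last step can be scripted. Below is a minimal sketch (the function name is hypothetical), assuming the standard KITTI calibration format where each line looks like `P2: v0 v1 v2 ...`; the resulting dict can be transcribed into CALIB_PARAM in configs/configs.py.

```python
def parse_kitti_calib(path):
    """Map each KITTI calibration entry name to its list of float values."""
    calib = {}
    with open(path) as f:
        for line in f:
            key, sep, values = line.strip().partition(":")
            if sep and values.strip():
                calib[key.strip()] = [float(v) for v in values.split()]
    return calib
```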

5. Download Pretrained Model

As you may have noticed, this project combines two deep neural networks, so you need to download two pre-trained models.

| 2D Object Detector Model | 3D Object Detector Model |
| --- | --- |
| Download Link | Download v1 (v2 is not supported yet; originally from here) |

Then, unzip them and put them under the pretrained folder. Also, rename the checkpoint.txt file to checkpoint, even though it is not actually used and the model cannot be frozen.

The folder will look like this:

--tf-3d-object-detection
  |-- pretrained
      |--log_v1
          |-- checkpoint (originally named checkpoint.txt)
          |-- log_train.txt
          |-- model.ckpt.data-00000-of-00001
          |-- model.ckpt.index
          |-- model.ckpt.meta
      |-- ssd_mobilenet_v1_coco_11_06_2017 (or other names if you decide to use different ones)
          |-- frozen_inference_graph.pb
          |-- graph.pbtxt
          |-- model.ckpt-0.data-00000-of-00001
          |-- model.ckpt-0.index
          |-- model.ckpt-0.meta
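You could verify this layout with a small script before running the demo. The sketch below is hypothetical; the directory and file names are assumptions taken from the default layout shown above.

```python
# Hypothetical check that the pretrained models are laid out as expected.
import os

EXPECTED = {
    "pretrained/log_v1": ["checkpoint", "model.ckpt.meta"],
    "pretrained/ssd_mobilenet_v1_coco_11_06_2017": ["frozen_inference_graph.pb"],
}

def missing_files(root="."):
    """Return the expected files that are absent under root."""
    missing = []
    for folder, names in EXPECTED.items():
        for name in names:
            path = os.path.join(root, folder, name)
            if not os.path.exists(path):
                missing.append(path)
    return missing
```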

You may notice from this layout that the 3D object detection model is not actually freezable.

(Hopefully the original TensorFlow ops for v1 will be disclosed so that tf.py_func can be removed and the model frozen.)

6. Run Demo

# If you use PyCharm, just click the green run button.
# Otherwise, navigate to the root folder of this repo and run:
python apps/demo.py

# If Python complains that it cannot find some modules, run:
export PYTHONPATH='.'
python apps/demo.py

# If the issue persists, your Python environment is likely misconfigured.
# Please don't open an issue for that; try Anaconda, ask someone who knows
# Python environments, or search Stack Overflow.

Then you should see three windows pop up in order; press any key to continue when the terminal prompts you.
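As an alternative to exporting PYTHONPATH, the demo script could add the repo root to sys.path itself, so it runs from any working directory. This is a hypothetical helper, not part of the repo.

```python
# Hypothetical prologue for apps/demo.py: make repo-root imports work
# without requiring `export PYTHONPATH='.'`.
import os
import sys

def repo_root(script_path):
    """Two directory levels above the script: apps/demo.py -> the repo root."""
    return os.path.dirname(os.path.dirname(os.path.abspath(script_path)))

def ensure_on_path(script_path):
    """Prepend the repo root to sys.path if it is not already there."""
    root = repo_root(script_path)
    if root not in sys.path:
        sys.path.insert(0, root)
```

Calling `ensure_on_path(__file__)` at the top of apps/demo.py would make the PYTHONPATH export unnecessary.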


tf-3d-object-detection's People

Contributors: kleinyuan

tf-3d-object-detection's Issues

Do these results look normal?

You said yellow is the 3D box location, but I cannot find it anywhere:

image

There is no car; I think the model predicted something wrong.

Besides, there are some errors in the code. One of them: the input placeholder is (1, 1024, 4), but the input from a single image is (9, 1024, 4). Why?

Remind myself

This repo has been around for almost one year, and I am creating this issue to remind myself that the following action items should be done soon:

  • The code is not really production quality yet. I should improve the code quality and properly abstract the interfaces.

  • This repo is based on Frustum-PointNet, and there are more recent works in this area showing promising performance, such as Deep Continuous Fusion for Multi-Sensor 3D Object Detection and Multi-Task Multi-Sensor Fusion for 3D Object Detection, for which implementations are hard to find. I should probably extend the scope of this project to include both training and inference for various approaches.

  • Some issues mention that the results do not align in 3D; I should go back and double-check whether something is wrong with the projection.

Evaluation

Hello,
Thanks for the repo :)

Is there code for evaluating the 3D detections?

2D detector

Hi,

Thanks for sharing your project with us. You provided several pre-trained 2D detector models in your repo. I am curious: did you train them yourself? If not, could you please give me a link to the repo where I can find the source code to train these models?

Thanks a lot :)

Liangcheng
