philip-huang / pixor

PyTorch Implementation of PIXOR

License: MIT License

Topics: deep-learning, self-driving-car, lidar

pixor's Introduction

PIXOR: Real-time 3D Object Detection from Point Clouds

This is a custom implementation of the paper from Uber ATG, written in PyTorch 1.0. It represents the driving scene from lidar data in the Bird's Eye View (BEV) and uses a single-stage object detector to predict the poses of road objects with respect to the car.
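
For intuition, here is a minimal sketch of how a lidar point cloud can be discretized into a BEV occupancy grid. This is an illustration only: the repository's actual preprocessing is the C++ code under srcs/preprocess, and the ranges and resolution below are assumed values, not the project's configuration.

import numpy as np

def lidar_to_bev(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0),
                 z_range=(-2.5, 1.0), resolution=0.1):
    # Discretize an (N, 3) point cloud into a binary occupancy grid.
    # Illustrative ranges/resolution; not the repo's configuration.
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    keep = ((x >= x_range[0]) & (x < x_range[1]) &
            (y >= y_range[0]) & (y < y_range[1]) &
            (z >= z_range[0]) & (z < z_range[1]))
    x, y, z = x[keep], y[keep], z[keep]

    # Convert metric coordinates to integer cell indices.
    xi = ((x - x_range[0]) / resolution).astype(np.int32)
    yi = ((y - y_range[0]) / resolution).astype(np.int32)
    zi = ((z - z_range[0]) / resolution).astype(np.int32)

    shape = (int((x_range[1] - x_range[0]) / resolution),
             int((y_range[1] - y_range[0]) / resolution),
             int((z_range[1] - z_range[0]) / resolution))
    grid = np.zeros(shape, dtype=np.float32)
    grid[xi, yi, zi] = 1.0  # mark occupied cells
    return grid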

Paper: PIXOR: Real-time 3D Object Detection from Point Clouds (Yang, Luo, and Urtasun, CVPR 2018)

Highlights

  • PyTorch 1.0 Reproduced and trained from scratch using the KITTI dataset
  • Fast Custom LiDAR preprocessing using C++
  • Multi-GPU Training and the PyTorch multiprocessing package to speed up non-maximum suppression during evaluation
  • Tensorboard Visualize training progress using Tensorboard
  • KITTI and ROSBAG Demo Scripts that support running inference directly on raw KITTI data or custom rosbags

Install

Dependencies:

  • Python 3.5 or 3.6
  • PyTorch (follow the official installation guide)
  • TensorFlow (for Tensorboard logging; see their website)
  • NumPy, Matplotlib, OpenCV 3
  • pykitti (for running on the KITTI raw dataset)
  • gcc
pip install shapely numpy matplotlib
git clone https://github.com/philip-huang/PIXOR
cd PIXOR/srcs/preprocess
make

(Optional) If you want to run this project on a custom rosbag containing Velodyne HDL-64 scans, the system must be Linux with ROS Kinetic installed. You also need to install the Velodyne driver into the velodyne_ws folder.

Set up the Velodyne workspace by running ./velodyne_setup.bash, pressing Ctrl-C as necessary.

Demo

A helper class is provided in run_kitti.py to simplify writing inference pipelines with pre-trained models. Here is how we would do it. Run this from the srcs folder (assuming the KITTI raw data has already been downloaded and extracted somewhere):

import os
import pykitti  # assumed to be pulled in by run_kitti; imported explicitly for clarity
from run_kitti import *

def make_kitti_video():
    # Point these at your local copy of the KITTI raw dataset.
    basedir = '/mnt/ssd2/od/KITTI/raw'
    date = '2011_09_26'
    drive = '0035'
    dataset = pykitti.raw(basedir, date, drive)

    # Save the rendered detection video next to the drive's data.
    videoname = "detection_{}_{}.avi".format(date, drive)
    save_path = os.path.join(basedir, date, "{}_drive_{}_sync".format(date, drive), videoname)
    run(dataset, save_path)

make_kitti_video()

Training and Evaluation

Our Training Result (as of Dec 2018): [results figure]

All configuration (hyperparameters, GPU selection, etc.) should be put in a config.json file and saved to the directory srcs/experiments/$exp_name$.
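
As a rough illustration, a script like the following could create such a file. The key names here are assumptions for illustration (except nms_iou_threshold, which srcs/postprocess.py reads); check the code in srcs/ for the exact schema.

import json
import os

exp_name = "default"
config = {
    "batch_size": 4,            # assumed key: training batch size
    "epochs": 40,               # assumed key: number of training epochs
    "learning_rate": 1e-3,      # assumed key: optimizer learning rate
    "nms_iou_threshold": 0.1,   # read by srcs/postprocess.py during NMS
}

exp_dir = os.path.join("srcs", "experiments", exp_name)
os.makedirs(exp_dir, exist_ok=True)
with open(os.path.join(exp_dir, "config.json"), "w") as f:
    json.dump(config, f, indent=2)

To train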

python srcs/main.py train (--name=$exp_name$)

To evaluate an experiment

python srcs/main.py val (--name=$exp_name$)

To display a sample result

python srcs/main.py test --name=$exp_name$

To view tensorboard

tensorboard --logdir=srcs/logs/$exp_name$

TODO

  • Improve training accuracy on KITTI dataset
  • Data augmentation
  • Generalization gap on custom driving sequences
  • Data Collection
  • Improve model (possible idea: use map as a prior)

Credits

Project Contributors

  • Philip Huang
  • Allan Liu

Paper citation:



@inproceedings{yang2018pixor,
  title={PIXOR: Real-Time 3D Object Detection From Point Clouds},
  author={Yang, Bin and Luo, Wenjie and Urtasun, Raquel},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2018}
}

We would like to thank aUToronto for generously sponsoring GPUs for this project.


pixor's Issues

Fake repository. Doesn't train properly if we start from scratch

After training for around 40 epochs with batch_size=1, no vehicle bounding boxes are detected on the KITTI validation/test set. Since training produced no results, I conclude that this repository doesn't work properly.

Also, with multi-GPU and batch_size=1, the model doesn't train and crashes. Poor training and no results after running the code.

problem with loss.py

In loss.py, shouldn't line 85
loss = loss(pred, label)
be
loss = loss.forward(pred, label)?
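
For reference: in PyTorch, calling a module instance goes through nn.Module.__call__, which dispatches to forward and also runs any registered hooks, so loss(pred, label) is already the idiomatic form. A minimal sketch illustrating the equivalence:

import torch
import torch.nn as nn

criterion = nn.MSELoss()  # any nn.Module behaves the same way
pred = torch.randn(4, requires_grad=True)
label = torch.randn(4)

# Calling the module dispatches to forward() via __call__,
# so both lines compute the same value.
loss_a = criterion(pred, label)
loss_b = criterion.forward(pred, label)
assert torch.equal(loss_a, loss_b)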

Problem with training on KITTI

Hi,

I started training your PIXOR implementation with batch_size=1. I trained up to 40 epochs, as mentioned in the code, and loaded checkpoint 34 from the trained model. It seems that after training num_pred=0, i.e. the number of predicted boxes is zero, even though the loss curves behave well. Kindly provide a solution if you have solved this.

How did you train a new model for another class, such as Pedestrian or Cyclist?

1. The prediction scores are too low. Do I need to modify alpha and beta?
2. How should I set the object_list and filter the training dataset to balance positive and negative data?
3. I hit the same problem as others in compute_iou(): float division by zero. How can I avoid it?

If you have any ideas, please share. Thank you.

All prediction boxes and scores are zero / empty list

Hi,

I trained PIXOR with batch_size=1 for around 35 epochs. The loss was decreasing exponentially.

With that trained model, when I evaluate using python main.py --mode=test --name=default or python main.py --mode=val --name=default, I get no prediction boxes. I trained with batch_size=1 on a 1080Ti, since the code crashed for batch_size >= 4.

Kindly help me with getting correct prediction boxes.

Prepare for the dataset

Hi, do you know which dataset is used exactly? There are many parts to the KITTI Bird's Eye View dataset, and I don't know which one I should download. Thank you.

Bird's Eye View Evaluation 2017:
Download left color images of object data set (12 GB)
Download right color images, if you want to use stereo information (12 GB)
Download the 3 temporally preceding frames (left color) (36 GB)
Download the 3 temporally preceding frames (right color) (36 GB)
Download Velodyne point clouds, if you want to use laser information (29 GB)
Download camera calibration matrices of object data set (16 MB)
Download training labels of object data set (5 MB)

ZeroDivisionError: float division by zero in pool.starmap(filter_pred, [(config, pred) for pred in predictions])

Hi
I trained your PIXOR network for 40 epochs, with decreasing loss curves.

However, when I execute python main.py --mode=val --name=default, the error below occurs. Kindly help.

Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.5/multiprocessing/pool.py", line 47, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/home/Tracking/PIXOR-master/srcs/postprocess.py", line 129, in filter_pred
    selected_ids = non_max_suppression(corners, scores, config['nms_iou_threshold'])
  File "/home/Tracking/PIXOR-master/srcs/postprocess.py", line 96, in non_max_suppression
    iou = compute_iou(polygons[i], polygons[ixs[1:]])
  File "/home/Tracking/PIXOR-master/srcs/postprocess.py", line 51, in compute_iou
    iou = [box.intersection(b).area / box.union(b).area for b in boxes]
  File "/home/Tracking/PIXOR-master/srcs/postprocess.py", line 51, in <listcomp>
    iou = [box.intersection(b).area / box.union(b).area for b in boxes]
ZeroDivisionError: float division by zero
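
The traceback shows that compute_iou divides by box.union(b).area, which can be zero for degenerate boxes. As a rough sketch (an assumption, not the repository's code), one way to guard against this:

from shapely.geometry import Polygon

def safe_iou(box, boxes):
    # IoU of one shapely Polygon against each Polygon in a list,
    # returning 0.0 where the union area is zero instead of raising.
    ious = []
    for b in boxes:
        union = box.union(b).area
        ious.append(box.intersection(b).area / union if union > 0 else 0.0)
    return ious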

A question about coordinate transformation

The 3D locations of the ground-truth boxes are in camera coordinates. In this implementation, the coordinate transformation of the 3D locations contains only a rotation. Is the translation ignored? Also, in the rotation transformation I think "z = -z" is necessary, similar to the "y = -y" in datagen.py.

Pretrained_models

Thank you for your great work!

Could you please share a pretrained model for testing?

A question about corners' calculation

In datagen.py:

def get_corners(self, bbox):
        w, h, l, y, z, x, yaw = bbox[8:15]
        y = -y

Here, bbox[8] is width, bbox[9] is height, and bbox[10] is length. But according to KITTI's instructions (readme.txt in devkit_object.zip), bbox[8] is height, bbox[9] is width, and bbox[10] is length.
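
For reference, the KITTI devkit documents the label fields in the order below; here is a minimal parsing sketch based on that readme (the helper name is ours, not the repository's):

def parse_kitti_label(line):
    # One line of a KITTI object label file, split on whitespace.
    # Per the devkit readme, fields 8-10 are the box dimensions in the
    # order height, width, length (meters), followed by the location
    # x, y, z in camera coordinates and the yaw angle rotation_y.
    f = line.split()
    return {
        'type': f[0],
        'truncated': float(f[1]),
        'occluded': int(f[2]),
        'alpha': float(f[3]),
        'bbox_2d': [float(v) for v in f[4:8]],  # left, top, right, bottom
        'h': float(f[8]), 'w': float(f[9]), 'l': float(f[10]),
        'x': float(f[11]), 'y': float(f[12]), 'z': float(f[13]),
        'rotation_y': float(f[14]),
    }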

Open-source license

Hi Philip,

This repository looks great :)

Would you mind adding an open-source license? Otherwise it could be problematic for other people to use it.

Best,
Martin

AttributeError: module 'tensorboard.summary._tf.summary' has no attribute 'FileWriter'

Hi, I tried to run this program on my computer, and when I executed the following command:

python main.py train --name=default

I got this error:

AttributeError: module 'tensorboard.summary._tf.summary' has no attribute 'FileWriter'

Here are the details:

Using device cpu
There are 3712 images in txt file
Found 3712 Velodyne scans...
done.
There are 3769 images in txt file
Found 3769 Velodyne scans...
done.
------------------------------------------------------------------
Traceback (most recent call last):
  File "main.py", line 395, in <module>
    train(args.name, device)
  File "main.py", line 191, in train
    train_logger = get_logger(config, 'train')
  File "/home/andre/masterarbeit/PIXOR/srcs/utils.py", line 57, in get_logger
    return logger.Logger(folder)
  File "/home/andre/masterarbeit/PIXOR/srcs/logger.py", line 15, in __init__
    self.writer = tf.summary.FileWriter(log_dir)
AttributeError: module 'tensorboard.summary._tf.summary' has no attribute 'FileWriter'

My TensorFlow version is 2.1. Can you tell me how to solve this problem? Thanks.
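
For reference: tf.summary.FileWriter is a TensorFlow 1.x API that was removed from the top-level namespace in TensorFlow 2.x. One common workaround (an assumption about the fix, not the maintainer's answer) is the compatibility shim:

import tensorflow as tf

# TensorFlow 2.x keeps the 1.x summary writer under the compat namespace.
writer = tf.compat.v1.summary.FileWriter('srcs/logs/default')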

Preprocessing costs too much time

Running average times:

Average Preprocessing Time:  0.096s 
        Forward Time:        0.011s 
        Postprocessing Time: 0.073s

Preprocessing takes too long, even more than the forward pass.

Focal loss

Why does this implementation not use focal loss for the per-column class loss? It seems to use plain cross-entropy, despite the prevalence of negative samples.
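
For reference, a minimal sketch of binary focal loss (Lin et al., 2017), which down-weights easy negatives through the (1 - p_t)^gamma factor. This is an illustration only, not the repository's loss code:

import torch

def binary_focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    # logits and targets have the same shape; targets are 0 or 1.
    p = torch.sigmoid(logits)
    pt = torch.where(targets == 1, p, 1 - p)  # probability of the true class
    alpha_t = torch.where(targets == 1,
                          torch.full_like(p, alpha),
                          torch.full_like(p, 1 - alpha))
    # Easy examples (pt near 1) are down-weighted by (1 - pt) ** gamma.
    loss = -alpha_t * (1 - pt) ** gamma * torch.log(pt.clamp(min=1e-8))
    return loss.mean()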

Hi, do you meet the problem "num_pred=0"?

Hi, I tried your method and I often hit "num_pred=0" in the function compute_ap, which always interrupts the training process. Have you met the same problem?

CPU and GPU usage

I am using one GPU to train the model, but it still uses around 10-16 GB of CPU memory. Could you tell me which operations use so much memory on the CPU?
