

Butterfly Detector

Butterfly Detector for Aerial Images

[Overview figure]

Current state-of-the-art object detectors achieve high performance when applied to images captured by standard front-facing cameras. When applied to high-resolution aerial images captured from a drone or UAV standpoint, they fail to generalize to the wide range of object scales. To address this limitation, we propose an object detection method called Butterfly Detector that is tailored to detect objects in aerial images. We extend the concept of fields and introduce butterfly fields, a type of composite field that describes the spatial information of output features as well as the scale of the detected object. To overcome occlusion and viewing-angle variations that can hinder localization, we employ a voting mechanism between related butterfly vectors pointing to the object center. We evaluate our Butterfly Detector on two publicly available UAV datasets (UAVDT and VisDrone2019) and show that it outperforms previous state-of-the-art methods while remaining real-time.
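As a rough illustration of this voting mechanism, consider the toy sketch below (this is not the paper's implementation; all names here are hypothetical). Each output cell predicts a vector pointing to its object's center, and confidence-weighted votes are accumulated into a heatmap whose peaks mark likely centers:

import numpy as np

def accumulate_center_votes(offsets, confidences):
    # offsets: (H, W, 2) array of predicted (dx, dy) vectors from each
    # output cell to its object's center
    # confidences: (H, W) array of per-cell confidence scores
    H, W = confidences.shape
    votes = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            cx = int(round(x + offsets[y, x, 0]))
            cy = int(round(y + offsets[y, x, 1]))
            if 0 <= cx < W and 0 <= cy < H:
                votes[cy, cx] += confidences[y, x]  # confidence-weighted vote
    return votes  # peaks indicate likely object centers

Because many cells vote for the same center, a few occluded or mislocalized cells do not dominate the estimate.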

Demo

[Example images with overlaid bounding boxes]

Setup

Python 3 is required; Python 2 is not supported. To install the package, do not clone this repository, and make sure there is no folder named butterflydetector in your current directory:

pip3 install butterflydetector

For development of the butterflydetector source code itself, you need to clone this repository and then:

pip3 install numpy cython
pip3 install --editable '.[train,test]'

The last command installs the Python package in the current directory (signified by the dot) with the optional dependencies needed for training and testing.

Data structure

data
├── UAV-benchmark-M
│   ├── test
│   └── train
└── VisDrone2019
    ├── VisDrone2019-DET-train
    │   ├── annotations
    │   └── images
    ├── VisDrone2019-DET-val
    └── VisDrone2019-DET-test-dev
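
To sanity-check this layout before training, a minimal sketch (the paths are taken from the tree above):

from pathlib import Path

# Folders expected by the data layout documented above.
expected = [
    "data/UAV-benchmark-M/test",
    "data/UAV-benchmark-M/train",
    "data/VisDrone2019/VisDrone2019-DET-train/annotations",
    "data/VisDrone2019/VisDrone2019-DET-train/images",
    "data/VisDrone2019/VisDrone2019-DET-val",
    "data/VisDrone2019/VisDrone2019-DET-test-dev",
]
missing = [p for p in expected if not Path(p).is_dir()]
if missing:
    print("missing folders:")
    for p in missing:
        print("  " + p)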

Interfaces

  • python3 -m butterflydetector.predict --help
  • python3 -m butterflydetector.train --help
  • python3 -m butterflydetector.eval --help
  • python3 -m butterflydetector.logs --help
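
For example, to run prediction on a folder of images (the checkpoint directory and glob pattern below are placeholders; the flags are the same ones used in the Video section further down):

python3 -m butterflydetector.predict \
  --checkpoint <directory-to-checkpoint> \
  --glob "images/*.jpg"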

Tools to work with models:

  • python3 -m butterflydetector.migrate --help

Benchmark

Comparison of AP (%), False Positives (FP), and Recall (%) with state-of-the-art methods on the UAVDT dataset: [UAVDT results figure]

Comparison of AP (average precision), AR (average recall), True Positives (TP), and False Positives (FP) with state-of-the-art methods on the VisDrone dataset: [VisDrone results figure]

Visualization

To visualize logs:

python3 -m butterflydetector.logs \
  outputs/<model1-basename>.pkl.log \
  outputs/<model2-basename>.pkl.log \
  outputs/<model3-basename>.pkl.log

Train

See datasets for setup instructions.

The exact training command that was used for a model is in the first line of the training log file.
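For example, to recover that command from a log file (the basename is a placeholder):

head -n 1 outputs/<model-basename>.pkl.log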

Train an HRNetW32-det model on the VisDrone dataset:

time CUDA_VISIBLE_DEVICES=0,1 python3 -m butterflydetector.train \
  --lr=1e-3 \
  --momentum=0.95 \
  --epochs=150 \
  --lr-decay 120 140 \
  --batch-size=5 \
  --basenet=hrnetw32det \
  --headnets butterfly10 \
  --square-edge=512 \
  --lambdas 1 1 1 1 \
  --dataset visdrone \
  --butterfly-side-length -2

You can refine an existing model with the --checkpoint option.
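
A sketch of such a refinement run (the checkpoint path is a placeholder; combine it with the training flags from the command above as needed):

time CUDA_VISIBLE_DEVICES=0,1 python3 -m butterflydetector.train \
  --checkpoint outputs/<model-basename>.pkl \
  --lr=1e-3 \
  --dataset visdrone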

Evaluation

The command below runs your model on VisDrone and saves the predictions in the output directory. The predictions are saved in the correct format to be read by the official MATLAB evaluator of VisDrone2019. To evaluate on UAVDT, simply replace 'visdrone' with 'uavdt'.

python3 -m butterflydetector.eval \
  --checkpoint <directory-to-checkpoint> \
  --dataset visdrone \
  --output <directory-to-store-predictions> \
  --seed-threshold 0.1
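
The same command for UAVDT:

python3 -m butterflydetector.eval \
  --checkpoint <directory-to-checkpoint> \
  --dataset uavdt \
  --output <directory-to-store-predictions> \
  --seed-threshold 0.1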

Video

Processing a video frame by frame from video.avi to video.pose.mp4 using ffmpeg:

export VIDEO=video.avi  # change to your video file

mkdir ${VIDEO}.images
ffmpeg -i ${VIDEO} -qscale:v 2 -vf scale=641:-1 -f image2 ${VIDEO}.images/%05d.jpg
python3 -m butterflydetector.predict --checkpoint resnet152 --glob "${VIDEO}.images/*.jpg"
ffmpeg -framerate 24 -pattern_type glob -i ${VIDEO}.images/'*.jpg.skeleton.png' -vf scale=640:-2 -c:v libx264 -pix_fmt yuv420p ${VIDEO}.pose.mp4

In this process, ffmpeg scales the video frames to a width of 641px, which can be adjusted.
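
For example, to extract frames at roughly double the width instead:

ffmpeg -i ${VIDEO} -qscale:v 2 -vf scale=1281:-1 -f image2 ${VIDEO}.images/%05d.jpg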

EPFL Roundabout Dataset

EPFL Roundabout is a dataset that contains more than 2 hours of drone footage collected at 4 different roundabout locations: Morges, EPFL, Ecublens, and Echandens. It contains both detection and tracking labels for 6 categories: car, truck, bus, van, cyclist, and pedestrian. The dataset can be used for detection, tracking, and trajectory prediction; the latter is particularly challenging at roundabouts. The dataset can be downloaded here

Citation

@misc{adaimi2020perceiving,
      title={Perceiving Traffic from Aerial Images},
      author={George Adaimi and Sven Kreiss and Alexandre Alahi},
      year={2020},
      eprint={2009.07611},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

butterflydetector's Issues

RuntimeError: The size of tensor a (256) must match the size of tensor b (255) at non-singleton dimension 3

Hello! Thanks for your great work.
When I run

time CUDA_VISIBLE_DEVICES=0,4 python3 -m butterflydetector.train --lr=1e-3 --momentum=0.95 --epochs=150 --lr-decay 120 140 --batch-size=16 --basenet=hrnetw32det --head-quad=1 --headnets butterfly10 --square-edge=512 --lambdas 1 1 1 1 --dataset uavdt

I get the following error:

(cluster) hxz@ubuntu16:/home/data/hxz/butterflydetector$ time CUDA_VISIBLE_DEVICES=0,4 python3 -m butterflydetector.train --lr=1e-3 --momentum=0.95 --epochs=150 --lr-decay 120 140 --batch-size=16 --basenet=hrnetw32det --head-quad=1 --headnets butterfly10 --square-edge=512 --lambdas 1 1 1 1 --dataset uavdt
INFO:butterflydetector.logs:{'type': 'process', 'argv': ['/home/data/hxz/butterflydetector/butterflydetector/train.py', '--lr=1e-3', '--momentum=0.95', '--epochs=150', '--lr-decay', '120', '140', '--batch-size=16', '--basenet=hrnetw32det', '--head-quad=1', '--headnets', 'butterfly10', '--square-edge=512', '--lambdas', '1', '1', '1', '1', '--dataset', 'uavdt'], 'args': {'debug': False, 'checkpoint': None, 'basenet': 'hrnetw32det', 'headnets': ['butterfly10'], 'pretrained': True, 'cross_talk': 0.0, 'head_dropout': 0.0, 'head_quad': 1, 'lambdas': [1.0, 1.0, 1.0, 1.0], 'r_smooth': 0.0, 'regression_loss': 'laplace', 'background_weight': 1.0, 'margin_loss': False, 'auto_tune_mtl': False, 'butterfly_side_length': 1, 'momentum': 0.95, 'beta2': 0.999, 'adam_eps': 1e-06, 'nesterov': True, 'weight_decay': 0.0, 'adam': False, 'amsgrad': False, 'lr': 0.001, 'lr_decay': [120, 140], 'lr_burn_in_epochs': 2, 'lr_burn_in_factor': 0.001, 'lr_gamma': 0.1, 'dataset': 'uavdt', 'train_annotations': None, 'train_image_dir': None, 'val_annotations': None, 'val_image_dir': None, 'pre_n_images': 8000, 'n_images': None, 'duplicate_data': None, 'pre_duplicate_data': None, 'loader_workers': 2, 'batch_size': 16, 'output': 'outputs/hrnetw32det-butterfly10-edge512-211210-152433.pkl', 'stride_apply': 1, 'epochs': 150, 'freeze_base': 0, 'pre_lr': 0.0001, 'rescale_images': 1.0, 'orientation_invariant': False, 'update_batchnorm_runningstatistics': False, 'square_edge': 512, 'ema': 0.001, 'disable_cuda': False, 'augmentation': True, 'debug_fields_indices': [], 'profile': None, 'device': device(type='cuda'), 'pin_memory': True}, 'version': '0.0.1', 'hostname': 'ubuntu16'}
INFO:butterflydetector.network.hrnet:=> init weights from normal distribution
INFO:butterflydetector.network.hrnet:=> loading pretrained model pretrained/imagenet/hrnet_w32-36af842e.pth
INFO:butterflydetector.network.basenetworks:stride = 4
INFO:butterflydetector.network.basenetworks:output features = 512
INFO:butterflydetector.network.heads:selected head CompositeField for butterfly10
Using multiple GPUs: 2
INFO:butterflydetector.network.losses:multihead loss: ['butterfly10.c', 'butterfly10.vec1', 'butterfly10.scales1', 'butterfly10.scales2'], [1.0, 1.0, 1.0, 1.0]
/home/data/hxz/butterflydetector/butterflydetector/data_manager/uavdt.py:66: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
self.targets = np.asarray(self.targets)
/home/data/hxz/butterflydetector/butterflydetector/data_manager/uavdt.py:67: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
self.targets_ignore = np.asarray(self.targets_ignore)
Images: 40409
Images: 16580
Images: 8000
INFO:butterflydetector.optimize:SGD optimizer
INFO:butterflydetector.network.trainer:{'type': 'config', 'field_names': ['butterfly10.c', 'butterfly10.vec1', 'butterfly10.scales1', 'butterfly10.scales2']}
/home/hxz/anaconda3/envs/cluster/lib/python3.7/site-packages/torch/nn/functional.py:3635: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode)
Traceback (most recent call last):
  File "/home/hxz/anaconda3/envs/cluster/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/hxz/anaconda3/envs/cluster/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/data/hxz/butterflydetector/butterflydetector/train.py", line 200, in <module>
    main()
  File "/home/data/hxz/butterflydetector/butterflydetector/train.py", line 196, in main
    trainer.loop(train_loader, val_loader, args.epochs, start_epoch=start_epoch)
  File "/home/data/hxz/butterflydetector/butterflydetector/network/trainer.py", line 99, in loop
    self.train(train_scenes, epoch)
  File "/home/data/hxz/butterflydetector/butterflydetector/network/trainer.py", line 173, in train
    loss, head_losses = self.train_batch(data, target, meta, apply_gradients)
  File "/home/data/hxz/butterflydetector/butterflydetector/network/trainer.py", line 116, in train_batch
    loss, head_losses = self.loss(outputs, targets)
  File "/home/hxz/anaconda3/envs/cluster/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/data/hxz/butterflydetector/butterflydetector/network/losses.py", line 176, in forward
    for l, f, t in zip(self.losses, head_fields, head_targets)
  File "/home/data/hxz/butterflydetector/butterflydetector/network/losses.py", line 177, in <listcomp>
    for ll in l(f, t)]
  File "/home/hxz/anaconda3/envs/cluster/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/data/hxz/butterflydetector/butterflydetector/network/losses.py", line 457, in forward
    ) / 100.0 / batch_size
RuntimeError: The size of tensor a (256) must match the size of tensor b (255) at non-singleton dimension 3

How can I fix it?

Pretrained Weights

Hi,

Could you provide pretrained weights for the UAVDT and VisDrone datasets?

Thanks.

Annotation format

Can you describe the annotation format?
For example: 0 0 0 2071 1316 102 73 0

I guess the first 0 is the image ID, but I don't understand the other fields. Can I remove them and use the dataset for object detection?

calculating mAP of trained model

How does one calculate the mAP of a trained model? The intended way to do so appears to be benchmark.py, which then calls eval.py. But it appears to me that, when running this, the variable stats from line 107 in benchmark.py will always be empty, because eval.py never writes any file of the form *.stats.json, which is what the generation of stats looks for.

The line that might be intended to write the stats to disk in the needed form is line 308 of eval.py, as it uses an InstanceScoreRecorder object. But this line is commented out, and even if it weren't, its output is not of the form *.stats.json; it merely writes to ./instance_score_data.json. Am I missing something?

Thank you a lot in advance!
