damo-streamnet's Introduction

DAMO-StreamNet: Optimizing Streaming Perception in Autonomous Driving

DAMO-StreamNet is a streaming perception framework for real-time video object detection in autonomous driving. It builds on YOLO and LongShortNet to improve detection accuracy under strict latency constraints.

Key Features

Robust Neck Design: Incorporates deformable convolution to enhance receptive fields and feature alignment.
Dual-Branch Structure: Fuses semantic and temporal features for accurate motion prediction.
Asymmetric Distillation: Distills future knowledge from teacher to student network during training for performance gains.
Real-time Forecasting: Continuously updates support frames for seamless streaming.

For more details, please see our full IJCAI 2023 paper.

Usage

DAMO-StreamNet supports real-time detection of 8 classes relevant to autonomous driving:

Person, Bicycle, Car, Motorcycle, Bus, Truck, Traffic Light, Stop Sign

See ModelScope Documentation for code examples to run inference using our pretrained models.

Model Zoo

Model	Input Size	Velocity	sAP 0.5:0.95	sAP50	sAP75	COCO Weights	Checkpoint
DAMO-StreamNet-S	600x960	1x	31.8	52.3	31.0	link	link
DAMO-StreamNet-M	600x960	1x	35.5	57.0	36.2	link	link
DAMO-StreamNet-L	600x960	1x	37.8	59.1	38.6	link	link
DAMO-StreamNet-L	1200x1920	1x	43.3	66.1	44.6	link	link

Teacher models available here.

Installation

Follow the installation guidelines of StreamYOLO and LongShortNet.

Quick Start

Dataset Preparation

Follow the Argoverse-HD setup instructions.

Model Preparation

Organize the downloaded models as follows:

./models
├── checkpoints
│   ├── streamnet_l_1200x1920.pth
│   ├── streamnet_l.pth
│   ├── streamnet_m.pth
│   └── streamnet_s.pth
├── coco_pretrained_models
│   ├── yolox_l_drfpn.pth
│   ├── yolox_m_drfpn.pth
│   └── yolox_s_drfpn.pth  
└── teacher_models
    └── l_s50_still_dfp_flip_ep8_4_gpus_bs_8
        └── best_ckpt.pth
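
Before training or evaluation, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the function name `find_missing_models` is hypothetical, and the file list is copied from the tree above.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_FILES = [
    "checkpoints/streamnet_l_1200x1920.pth",
    "checkpoints/streamnet_l.pth",
    "checkpoints/streamnet_m.pth",
    "checkpoints/streamnet_s.pth",
    "coco_pretrained_models/yolox_l_drfpn.pth",
    "coco_pretrained_models/yolox_m_drfpn.pth",
    "coco_pretrained_models/yolox_s_drfpn.pth",
    "teacher_models/l_s50_still_dfp_flip_ep8_4_gpus_bs_8/best_ckpt.pth",
]


def find_missing_models(models_root="./models"):
    """Return the expected files that are absent under models_root."""
    root = Path(models_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = find_missing_models()
    if missing:
        print("Missing model files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected model files are present.")
```

Only the files needed for a given experiment have to exist; the full list is checked here for simplicity.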

Training

bash run_train.sh

Evaluation

bash run_eval.sh

Training Details

8 Epochs on Argoverse-HD
SGD Optimizer with Linear LR Schedule
Random Flip Augmentation
Multi-Scale Training
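
As a rough illustration of a linear learning-rate schedule, the sketch below decays the rate from a base value to zero over a fixed number of steps. This is an assumption about what "linear" means here: the function name `linear_lr`, the base rate, and the step count are hypothetical, and the schedule in the repository may differ (for example, by including a warmup phase).

```python
def linear_lr(base_lr, step, total_steps):
    """Linearly decay the learning rate from base_lr (step 0) to 0 (total_steps)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp the step so out-of-range values do not produce negative rates.
    progress = min(max(step, 0), total_steps) / total_steps
    return base_lr * (1.0 - progress)


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the shape of the schedule.
    for step in (0, 25, 50, 75, 100):
        print(step, linear_lr(base_lr=0.01, step=step, total_steps=100))
```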

References

Please cite our paper:

@article{DAMO_StreamNet,
  title={DAMO-StreamNet: Optimizing Streaming Perception in Autonomous Driving},
  author={Jun-Yan He and Zhi-Qi Cheng and Chenyang Li and Wangmeng Xiang and Binghui Chen and Bin Luo and Yifeng Geng and Xuansong Xie},
  journal={IJCAI},  
  year={2023}
}

DAMO-StreamNet builds on YOLO, LongShortNet and StreamYOLO.

License

For academic research only. Please contact authors for commercial licensing.

damo-streamnet's Issues

Difference between the trainers

streamnet_l_1200x1920.py uses longshort_trainer, but the other exps use longshort_dil_trainer. What is the difference between them?

ValueError: Attempting to unscale FP16 gradients.

Hi, I ran into the following problem when running run_train.sh without a pretrained model. How can I solve it?

RuntimeError: expected scalar type Half but found Float

The error is raised on this line:
https://github.com/zhiqic/DAMO-StreamNet/blob/adf65eda4308f570dd4d3613c1bcb9f3a4afa7f8/exps/model/damo_yolo/base_models/core/ops.py#L194

Debugging showed that the weights of self.conv1 are float32, so an FP16 input causes a dtype mismatch.

So, on the line after https://github.com/zhiqic/DAMO-StreamNet/blob/adf65eda4308f570dd4d3613c1bcb9f3a4afa7f8/exps/train_utils/longshort_trainer.py#L142, I added model = model.half() to convert the model to half precision.

But then self.scaler.step(self.optimizer) fails with ValueError: Attempting to unscale FP16 gradients.
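
A note on the ValueError: PyTorch's GradScaler raises "Attempting to unscale FP16 gradients." when the parameters, and therefore their gradients, are already FP16, which is what model.half() produces. The usual mixed-precision pattern keeps the weights in float32 and lets autocast pick the precision per operation. The sketch below shows that pattern on a hypothetical toy model, not on this repository's trainer; whether it also resolves the original "expected scalar type Half but found Float" error in ops.py is not confirmed here.

```python
import torch
import torch.nn as nn


def amp_train_step(model, optimizer, scaler, inputs, targets, use_amp):
    """Run one training step with automatic mixed precision (AMP)."""
    optimizer.zero_grad()
    # autocast runs eligible ops in FP16 while the weights stay float32.
    with torch.cuda.amp.autocast(enabled=use_amp):
        loss = nn.functional.mse_loss(model(inputs), targets)
    # The scaler expects float32 parameters; do not call model.half().
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()


# Toy setup for illustration only; AMP is disabled when no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"
model = nn.Conv2d(3, 8, kernel_size=3, padding=1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
inputs = torch.randn(2, 3, 16, 16, device=device)
targets = torch.randn(2, 8, 16, 16, device=device)
loss_value = amp_train_step(model, optimizer, scaler, inputs, targets, use_amp)
```

The torch.cuda.amp namespace is used because the environment name in the traceback below suggests PyTorch 1.7.1.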

I also got an error at the very beginning, when using the pretrained model:

File "/home/qtt/Test/DAMO-StreamNet/exps/train_utils/longshort_trainer.py", line 325, in resume_train
ckpt = torch.load(ckpt_file, map_location=self.device)["model"]
│ │ │ │ └ 'cuda:0'
│ │ │ └ <exps.train_utils.longshort_trainer.Trainer object at 0x7f2e84e47150>
│ │ └ '/home/qtt/Test/DAMO-StreamNet/models/coco_pretrained_models/yolox_l_drfpn.pth'
│ └ <function load at 0x7f2e86976200>
└ <module 'torch' from '/home/qtt/Software/anaconda3/envs/torch171_py37_cu110/lib/python3.7/site-packages/torch/__init__.py'>

File "/home/qtt/Software/anaconda3/envs/torch171_py37_cu110/lib/python3.7/site-packages/torch/serialization.py", line 594, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
│ │ │ │ └ {'encoding': 'utf-8'}
│ │ │ └ <module 'pickle' from '/home/qtt/Software/anaconda3/envs/torch171_py37_cu110/lib/python3.7/pickle.py'>
│ │ └ 'cuda:0'
│ └ <torch._C.PyTorchFileReader object at 0x7f2e84b658b0>
└ <function _load at 0x7f2e86976560>
File "/home/qtt/Software/anaconda3/envs/torch171_py37_cu110/lib/python3.7/site-packages/torch/serialization.py", line 853, in _load
result = unpickler.load()
│ └ <method 'load' of '_pickle.Unpickler' objects>
└ <_pickle.Unpickler object at 0x7f2ddeee84d0>
File "/home/qtt/Software/anaconda3/envs/torch171_py37_cu110/lib/python3.7/site-packages/torch/serialization.py", line 845, in persistent_load
load_tensor(data_type, size, key, _maybe_decode_ascii(location))
│ │ │ │ │ └ 'cuda:0'
│ │ │ │ └ <function _maybe_decode_ascii at 0x7f2e86976440>
│ │ │ └ '94471090091744'
│ │ └ 256
│ └ <class 'torch.FloatStorage'>
└ <function _load.<locals>.load_tensor at 0x7f2dd8b05cb0>
File "/home/qtt/Software/anaconda3/envs/torch171_py37_cu110/lib/python3.7/site-packages/torch/serialization.py", line 833, in load_tensor
storage = zip_file.get_storage_from_record(name, size, dtype).storage()
│ │ │ │ └ torch.float32
│ │ │ └ 256
│ │ └ 'data/94471090091744'
│ └ <instancemethod get_storage_from_record at 0x7f2e874cb6d0>
└ <torch._C.PyTorchFileReader object at 0x7f2e84b658b0>

RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading file data/94471090091744: invalid header or archive is corrupted
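
An "invalid header or archive is corrupted" error from torch.load often points to an incomplete or damaged download, although that is an assumption about this particular case. Checkpoints saved by PyTorch 1.6 and later are zip archives, so a quick integrity check is possible with the standard library alone. The helper name `checkpoint_archive_ok` below is hypothetical; a checkpoint saved in the older non-zip format would also be reported as not OK.

```python
import zipfile


def checkpoint_archive_ok(path):
    """Return True if path is a readable zip archive with no corrupt members."""
    if not zipfile.is_zipfile(path):
        return False
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the name of the first bad member, or None.
            return archive.testzip() is None
    except zipfile.BadZipFile:
        return False


if __name__ == "__main__":
    # Path taken from the directory layout described in the README.
    path = "./models/coco_pretrained_models/yolox_l_drfpn.pth"
    try:
        print(path, "OK" if checkpoint_archive_ok(path) else "corrupt or not a zip archive")
    except FileNotFoundError:
        print(path, "not found")
```

If the check fails, downloading the file again and comparing its size with the published one is a reasonable first step.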