
⚠️ Code is still in the review process and will be uploaded soon

GUDA-CLN

Motivation

The AVT dataset has no ground-truth annotations, and manually labeling such a vast amount of data is costly and time-consuming. Pre-trained supervised lane detection models perform suboptimally on it because they were trained on other datasets: the difference in data distribution between the AVT dataset and the training dataset introduces a domain shift. This model leverages self-supervised monocular depth estimation as an auxiliary task to overcome the domain gap and uses the current SOTA lane detection model as the baseline lane detector.

Architecture

(Figure: architecture of GUDA-CLN)

This work proposes a novel geometric unsupervised domain adaptation approach for lane detection (GUDA-Lane) to overcome the domain shift between a source and a target dataset. The model is composed of three networks:

  • A lane detection network that takes an image as input and outputs a set of lanes

  • A depth estimation network that takes an image as input and outputs an estimated depth map

  • A pose estimation network that takes a source image $I_s$ and a target image $I_t$ as input and outputs the relative transformation between them

To perform the domain adaptation, a feature sharing module (FSM) between the two image encoders of the lane detection and depth estimation network is introduced.
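
The exact design of the FSM is part of the code under review. As a rough illustration only, the sketch below shows one plausible way features could be exchanged between the two encoders. This is a minimal PyTorch sketch under stated assumptions: the class name FeatureSharingModule, the gated residual fusion, and the feature shapes are hypothetical, not the actual implementation.

import torch
import torch.nn as nn

class FeatureSharingModule(nn.Module):
    # Hypothetical sketch: fuses same-resolution feature maps from the
    # lane detection and depth estimation encoders via gated 1x1 convs.
    def __init__(self, channels):
        super().__init__()
        self.depth_to_lane = nn.Conv2d(channels, channels, kernel_size=1)
        self.lane_to_depth = nn.Conv2d(channels, channels, kernel_size=1)
        self.gate = nn.Parameter(torch.zeros(2))  # start with no sharing

    def forward(self, f_lane, f_depth):
        a, b = torch.sigmoid(self.gate)
        # Each branch keeps its own features and receives a gated
        # projection of the other branch's features.
        f_lane_out = f_lane + a * self.depth_to_lane(f_depth)
        f_depth_out = f_depth + b * self.lane_to_depth(f_lane)
        return f_lane_out, f_depth_out

# Example: fuse encoder features of matching shape (B, C, H/16, W/16)
fsm = FeatureSharingModule(channels=256)
f_lane, f_depth = fsm(torch.randn(2, 256, 18, 50), torch.randn(2, 256, 18, 50))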

How to run

Setup

Create and start conda environment

conda env create -f environment.yml
conda activate guda-cln

DLA backbone

When this project was started, LaneAF was used as the lane detection baseline. However, as it is quite computationally expensive, I opted for CondLaneNet instead. LaneAF itself still works in this project; however, domain adaptation cannot be used with it, as it would first need to be implemented. Since LaneAF uses the DLA backbone instead of a ResNet, the backbone is also included in this repo. If you want to use it, run the following first:

cd networks
git clone git@github.com:chenandy/DCNv2.git
cd DCNv2
./make.sh

Data Prep

This model uses TuSimple as the source domain, i.e. utilizing its labels, and BDD100K as the target domain, i.e. only utilizing the monocular video sequences.

To prepare the datasets, copy ./data to anywhere you want to save your data. Specific download routines have already been created, so to download and prepare the datasets, run the following:

# For TuSimple
cd <path-to-data>/data/tusimple
./setup_tusimple.sh
cd <path-to-guda-cln>
python datasets/utils.py --task extract-lines-txt --dataset tusimple --data_path <path-to-data>/data/tusimple

# For BDD100K
cd <path-to-data>/data/bdd100k
./setup_bdd100k.sh
cd <path-to-guda-cln>
python datasets/utils.py --task extract-lines-txt --dataset bdd100k --data_path <path-to-data>/data/bdd100k

To run inference on any other dataset, the data must be stored as individual image files.
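
For illustration, the snippet below sketches how such a folder of extracted frames might be loaded for inference. It is a minimal sketch, not repo code: the folder path, file pattern, and the input resolution (288x800) are assumptions to be adapted to your dataset and to ./options.py.

from pathlib import Path
from PIL import Image
import torch
import torchvision.transforms as T

# Hypothetical folder of frames extracted as individual images
frames = sorted(Path("<path-to-data>/my_dataset/images").glob("*.jpg"))
to_tensor = T.Compose([T.Resize((288, 800)), T.ToTensor()])  # resolution is an assumption

batch = torch.stack([to_tensor(Image.open(f).convert("RGB")) for f in frames[:8]])
print(batch.shape)  # torch.Size([8, 3, 288, 800])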

Training

A sample script for training the model is provided in ./scripts/sample_train.sh. Each argument is explained there and also in ./options.py.

The final configuration parameters of the model are set as the default values in the argument parser. As BDD100K is a very complex dataset, the depth estimation network was pretrained on BDD100K: first on the split bdd100k_highway_single for 10 epochs, then, starting from those weights, on bdd100k_highway_4 and bdd100k_highway_34 for 2 epochs each. The resulting weights are then used to continue training GUDA-CLN.

The model only runs in single-GPU mode. Depending on the GPU and hard drive used, training takes from approx. 15 h (Nvidia A100) to approx. 36 h (Nvidia GTX 1080).

If you want to train the model on a different target dataset, it is beneficial to pretrain the depth estimation network on the more difficult dataset for a couple of epochs and use the resulting weights for the domain-adapted training.
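
For context, self-supervised depth pretraining of this kind typically minimizes a photometric reprojection error between the target image and the source image warped with the predicted depth and pose. The sketch below shows the standard Monodepth2-style photometric loss (weighted SSIM + L1 with alpha = 0.85, the value commonly used in Monodepth2); it is a generic illustration of that objective, not code taken from this repo.

import torch
import torch.nn.functional as F

def ssim(x, y):
    # Simplified per-pixel SSIM distance using 3x3 average pooling,
    # as in Monodepth2; returns values in [0, 1].
    C1, C2 = 0.01 ** 2, 0.03 ** 2
    mu_x = F.avg_pool2d(x, 3, 1, 1)
    mu_y = F.avg_pool2d(y, 3, 1, 1)
    sigma_x = F.avg_pool2d(x * x, 3, 1, 1) - mu_x ** 2
    sigma_y = F.avg_pool2d(y * y, 3, 1, 1) - mu_y ** 2
    sigma_xy = F.avg_pool2d(x * y, 3, 1, 1) - mu_x * mu_y
    num = (2 * mu_x * mu_y + C1) * (2 * sigma_xy + C2)
    den = (mu_x ** 2 + mu_y ** 2 + C1) * (sigma_x + sigma_y + C2)
    return torch.clamp((1 - num / den) / 2, 0, 1)

def photometric_loss(pred, target, alpha=0.85):
    # Weighted SSIM + L1 reprojection error between the warped source
    # image `pred` and the target image `target`.
    l1 = (pred - target).abs().mean(1, keepdim=True)
    return alpha * ssim(pred, target).mean(1, keepdim=True) + (1 - alpha) * l1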

Evaluating the model

Depending on the dataset used, there are two common ways of evaluating model performance: the TuSimple metric for TuSimple and the CULane metric for CULane. Both evaluation methods are implemented in this code; however, only the CULane method can be used for all datasets, whereas the TuSimple method only works for TuSimple.

To evaluate the model, the evaluation script for the CULane evaluation method needs to be built. To do that, run the following:

cd evaluation/CULane
./build.sh

For evaluating a model, take a look at ./scripts/sample_eval.sh.
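
For reference, the CULane method rasterizes each lane as a 30-pixel-wide curve and counts a predicted lane as a true positive when its IoU with a ground-truth lane exceeds 0.5; precision, recall, and F1 then follow from the TP/FP/FN counts. The sketch below illustrates the IoU computation for a single lane pair. It is a simplified OpenCV-based illustration, not the repo's evaluation code: the image size is CULane's, and the greedy one-to-one matching of the full metric is omitted.

import numpy as np
import cv2

def rasterize(lane, shape, width):
    # Draw a lane (list of (x, y) points) as a width-pixel-wide polyline
    mask = np.zeros(shape, dtype=np.uint8)
    pts = np.array(lane, dtype=np.int32).reshape(-1, 1, 2)
    cv2.polylines(mask, [pts], isClosed=False, color=1, thickness=width)
    return mask

def lane_iou(lane_a, lane_b, shape=(590, 1640), width=30):
    mask_a = rasterize(lane_a, shape, width)
    mask_b = rasterize(lane_b, shape, width)
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / max(union, 1)

# A prediction counts as a true positive when IoU > 0.5
pred = [(800, 250), (700, 400), (600, 589)]
gt   = [(810, 250), (705, 400), (595, 589)]
print(lane_iou(pred, gt) > 0.5)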

Inference

Please follow the Jupyter notebook ./sample.ipynb for training, evaluation, and inference examples.

Results

GUDA-CLN was compared against the baseline CondLaneNet implementation.

Evaluation on TuSimple

The first table shows the evaluation results on the source domain, indicating how well the lane detection task itself performs. Both GUDA-CLN and the baseline model perform similarly well; however, performance on TuSimple is already saturated.

| Method | Source | Target | FN ↓ | FP ↓ | Acc ↑ |
| --- | --- | --- | --- | --- | --- |
| CondLaneNet | TuSimple | - | 6.3% | 4.1% | 94.3% |
| GUDA-CLN | TuSimple | BDD100K | 5.4% | 3.9% | 94.7% |

Evaluation on BDD100K

This table indicates a slight performance increase for GUDA-CLN over CondLaneNet on the target dataset, which suggests the efficacy of the domain adaptation.

| Method | Source | Target | Prec ↑ | Rec ↑ | Acc ↑ |
| --- | --- | --- | --- | --- | --- |
| CondLaneNet | TuSimple | - | 48.1% | 30.8% | 37.5% |
| GUDA-CLN | TuSimple | BDD100K | 56.6% | 29.8% | 39.0% |

Evaluation on BDD100K (only ego lanes)

For the lane variability analysis, only the ego-lanes, i.e. the nearest lanes left and right of the centerline of the image, are needed to compute the lateral position of the vehicle within a lane. Thus, as a second evaluation, only the ego-lanes are taken into consideration, which further increases the performance gap between GUDA-CLN and CondLaneNet.

| Method | Source | Target | Prec ↑ | Rec ↑ | Acc ↑ |
| --- | --- | --- | --- | --- | --- |
| CondLaneNet | TuSimple | - | 41.8% | 39.5% | 40.6% |
| GUDA-CLN | TuSimple | BDD100K | 50.2% | 39.4% | 44.2% |

Evaluation on AVT

The first model, SimCycleGAN + ERFNet, was initially used as the lane detection model; it is clearly apparent that GUDA-CLN outperforms it by a large margin.

| Method | Source | Target | Prec ↑ | Rec ↑ | Acc ↑ |
| --- | --- | --- | --- | --- | --- |
| SimCycleGAN + ERFNet | TuSimple | - | 33.4% | 33.6% | 33.5% |
| CondLaneNet | TuSimple | BDD100K | 63.0% | 60.3% | 61.6% |
| GUDA-CLN | TuSimple | BDD100K | 74.2% | 62.6% | 67.9% |

Pretrained models

The pretrained lane detection models are linked below. The Monodepth2 weights pretrained on BDD100K can be used as initialization if you want to retrain GUDA-CLN with TuSimple as the source domain and BDD100K as the target domain.

| Training Modality | Link |
| --- | --- |
| Monodepth2 pretrained on BDD100K | Download |
| GUDA-CLN from TuSimple to BDD100K | Download |
| CondLaneNet trained on BDD100K | Download |

