OMG

Title: OMG: Observe Multiple Granularities for Natural Language-Based Vehicle Retrieval

The paper has been accepted to a CVPR 2022 Workshop.

Abstract

Retrieving tracked vehicles by natural language descriptions plays a critical role in smart city construction. The task is to find the best match for a given text query among a set of tracked vehicles in surveillance videos. Existing works generally solve it with a dual-stream framework consisting of a text encoder, a visual encoder and a cross-modal loss function. Although some progress has been made, these methods fail to fully exploit the information at various levels of granularity. To tackle this issue, we propose a novel framework for the natural language-based vehicle retrieval task, OMG, which Observes Multiple Granularities with respect to visual representation, textual representation and objective functions. For the visual representation, target features, context features and motion features are encoded separately. For the textual representation, one global embedding, three local embeddings and a color-type prompt embedding are extracted to represent various granularities of semantic features. Finally, the overall framework is optimized by a cross-modal multi-granularity contrastive loss function. Experiments demonstrate the effectiveness of our method. Our OMG significantly outperforms all previous methods and ranks 9th on Track 2 of the 6th AI City Challenge. The code is available at https://github.com/dyhBUPT/OMG.
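To make the idea of a cross-modal multi-granularity contrastive loss more concrete, here is a minimal sketch in plain Python. It is not the implementation from this repository: it assumes a standard symmetric InfoNCE loss as a stand-in for each text/visual pairing and simply averages over pairings. The function names, the temperature value and the equal weighting are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(queries, keys, temperature=0.07):
    """Mean cross-entropy of matching queries[i] to keys[i] among all keys."""
    queries = [l2_normalize(q) for q in queries]
    keys = [l2_normalize(k) for k in keys]
    total = 0.0
    for i, q in enumerate(queries):
        # Cosine similarity to every key, scaled by the temperature.
        logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
        # Numerically stable log-sum-exp.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(queries)


def symmetric_contrastive_loss(text_emb, visual_emb, temperature=0.07):
    """Average of the text-to-visual and visual-to-text InfoNCE terms."""
    t2v = info_nce(text_emb, visual_emb, temperature)
    v2t = info_nce(visual_emb, text_emb, temperature)
    return 0.5 * (t2v + v2t)


def multi_granularity_loss(pairs, temperature=0.07):
    """Average the symmetric loss over several (text, visual) pairings.

    `pairs` maps a hypothetical granularity name to a tuple
    (text_embeddings, visual_embeddings), each a list of vectors
    where index i in both lists belongs to the same vehicle track.
    """
    losses = [
        symmetric_contrastive_loss(text, visual, temperature)
        for text, visual in pairs.values()
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    aligned = {"global": ([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])}
    swapped = {"global": ([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])}
    # Matching pairs should give a much smaller loss than swapped pairs.
    print(multi_granularity_loss(aligned) < multi_granularity_loss(swapped))
```

In this sketch each granularity contributes equally; how the actual method pairs and weights the target, context, motion, global, local and prompt embeddings is defined in the paper and the code.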

Framework

Figure: OMG framework (image not included)

Prompt

Experiments

Figure: experimental results (image not included)

Run

Data Preparation

Baidu Disk: link (extraction code: "city")

Requirements

  • CLIP
  • requirements.txt

Train

python train.py --config configs/Swin-B+CLIP-B_OMG2a_NLAug_IDLoss.yaml --valnum 4

Test

python test.py
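Conceptually, retrieval at test time amounts to ranking candidate vehicle tracks by their similarity to a text query embedding. The sketch below illustrates that step under stated assumptions: it uses cosine similarity and scores the rankings with mean reciprocal rank (MRR), a common retrieval metric. It is not a description of what test.py does, and all function names and the toy embeddings are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_tracks(query_emb, track_embs):
    """Return track ids sorted from most to least similar to the query."""
    scores = {tid: cosine(query_emb, emb) for tid, emb in track_embs.items()}
    return sorted(scores, key=scores.get, reverse=True)


def mean_reciprocal_rank(rankings, ground_truth):
    """Average of 1 / rank of the correct track over all queries."""
    total = 0.0
    for query_id, ranked in rankings.items():
        rank = ranked.index(ground_truth[query_id]) + 1  # ranks start at 1
        total += 1.0 / rank
    return total / len(rankings)


if __name__ == "__main__":
    # Toy 2-D embeddings, for illustration only.
    tracks = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    queries = {"q1": [1.0, 0.1], "q2": [0.2, 1.0]}
    truth = {"q1": "a", "q2": "c"}
    rankings = {qid: rank_tracks(emb, tracks) for qid, emb in queries.items()}
    print(rankings)
    print(mean_reciprocal_rank(rankings, truth))
```

In the toy data, the correct track for q1 is ranked first and the correct track for q2 is ranked second, so the MRR is (1 + 1/2) / 2 = 0.75.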

Note

  • We also design the OSG framework for the ensemble. Please refer to the code for details.

Citation

@InProceedings{Du_2022_CVPR,
    author    = {Du, Yunhao and Zhang, Binyu and Ruan, Xiangning and Su, Fei and Zhao, Zhicheng and Chen, Hong},
    title     = {OMG: Observe Multiple Granularities for Natural Language-Based Vehicle Retrieval},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2022},
    pages     = {3124-3133}
}

Acknowledgement

A large part of the code is borrowed from CLT. Thanks for their excellent work!

omg's Issues

Checkpoints

Hello, could you please tell me where I can download the checkpoints?

Thank you!

Baidu Link is not working

Dear team,
This is really nice work. I want to reproduce the results, but the Baidu link for data preparation is not working.
Could you please provide it again?

Question about data

Hello, may I ask where I should put the data from Baidu and the data from AI City?
