dplnet's Introduction

DPLNet (Efficient Multimodal Semantic Segmentation via Dual-Prompt Learning)

Welcome to the official code repository for Efficient Multimodal Semantic Segmentation via Dual-Prompt Learning. We're excited to share our work with you; please bear with us as we prepare the code and demo. Stay tuned for the reveal!

Motivation

Previous multimodal methods often need to fully fine-tune the entire network. This is costly to train because of the massive parameter updates in feature extraction and fusion, and it increases the deployment burden of multimodal semantic segmentation. In this paper, we propose a novel, simple yet effective dual-prompt learning paradigm, dubbed DPLNet, for training-efficient multimodal semantic segmentation.

Framework

Overview of the proposed DPLNet architecture, which adapts a frozen pre-trained model using two specially designed prompt learning modules: MPG for multimodal prompt generation and MFA for multimodal feature adaptation. With only a few learnable parameters, it achieves multimodal semantic segmentation in a training-efficient way.
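As a rough illustration of the parameter accounting this design implies, the sketch below separates frozen from learnable parameters and reports the trainable share. It is not the authors' code: `ParamGroup`, `summarize`, and every size shown are hypothetical placeholders chosen for readability, not figures from the paper.

```python
from dataclasses import dataclass


@dataclass
class ParamGroup:
    """A named block of parameters (hypothetical helper, not part of DPLNet)."""
    name: str
    num_params: int
    trainable: bool


def summarize(groups):
    """Return total, trainable, and frozen parameter counts plus the trainable ratio."""
    total = sum(g.num_params for g in groups)
    trainable = sum(g.num_params for g in groups if g.trainable)
    ratio = trainable / total if total else 0.0
    return {
        "total": total,
        "trainable": trainable,
        "frozen": total - trainable,
        "trainable_ratio": ratio,
    }


# Placeholder sizes for illustration only; these are NOT DPLNet's real numbers.
groups = [
    ParamGroup("pre-trained backbone (frozen)", 1_000_000, trainable=False),
    ParamGroup("MPG prompt modules", 20_000, trainable=True),
    ParamGroup("MFA adaptation modules", 30_000, trainable=True),
]

summary = summarize(groups)
print(summary)
```

Under this accounting, the frozen backbone still contributes to the total size at deployment, while only the trainable groups are updated during training.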

Visualization

RGBD Semantic Segmentation Results

NYU-V2

[Figure: results on NYU-V2]

SUN-RGBD

[Figure: results on SUN-RGBD]

RGBT Semantic Segmentation Results

MFNet

[Figure: results on MFNet]

PST900

[Figure: results on PST900]

RGB-D SOD Results

[Figure: RGB-D SOD results]

RGB-T SOD Results

[Figure: RGB-T SOD results]

RGB-T Video Semantic Segmentation Results

[Figure: RGB-T video semantic segmentation results]

dplnet's People

Contributors

shaohuadong2021

dplnet's Issues

public code available

Hi,

I found your work to be very interesting and inspiring. I would love to see the code to better understand and appreciate the details of your project. Could you please share when you plan to release it?

Thank you!

RGB backbone pre-training

Thank you very much for your work. May I ask which dataset was used to pre-train the RGB backbone in your model?

Regarding questions about the parameters

Hello! Thank you for your excellent work.

I have some questions about the parameters reported in your paper. It seems that the parameter count does not include the frozen pre-trained backbone (MiT-B5). Could you further clarify the total number of parameters during network deployment (including both trainable and non-trainable parameters)? Additionally, could you provide more details on the network's FLOPs?

Ask about the preprocessing of depth images

When training on the NYUv2 dataset, did you use HHA-encoded images as inputs, or did you just use colorized depth images? Furthermore, did you crop the white border of the depth images?

source code

First of all, thank you for your work. I am very interested in your model. May I ask when you plan to release the source code?
