Hi. Thank you for your impressive work. I've read your work and want

About the model explanation

from catr.

Comments (9)

saahiluppal commented on August 11, 2024

I read it in the ablation studies of a paper, but I'm not sure which one.
I'll share the name of the paper as soon as I come across it again.

ohwi commented on August 11, 2024

I saw a small difference in the backbone: the paper uses a ViT, while this work uses a CNN.

saahiluppal commented on August 11, 2024

Hey, thanks for the feedback.

This work is inspired by Facebook AI's DETR (Detection Transformer), which aims to do object detection with transformers.

The paper you linked is very recent work on a similar topic, but its authors have not provided an implementation.

ohwi commented on August 11, 2024

Thank you for your reply.

I think I understand the structure of your work. Thank you!!

parthskansara commented on August 11, 2024

Hi @saahiluppal, I am trying to understand where the object detection part is occurring in the code, and what exact algorithm you're using.

saahiluppal commented on August 11, 2024

Hey,
The model is not doing object detection at any phase.

The image is fed to a ResNet backbone, which gives us the feature embedding along with the corresponding mask for the image.
These features and the mask are then fed to the transformer,
and the rest is handled by attention.

That is the versatility of the attention mechanism.
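
To make the data flow above concrete, here is a minimal, dependency-free sketch. It is not the catr code: the function names, the toy 2x2 feature map, and the convention that `True` in the mask marks a padded position are all assumptions chosen for illustration. It only shows the general idea of flattening a feature map plus mask into a sequence and letting masked attention ignore the padded positions.

```python
import math


def flatten_feature_map(feature_map, padding_mask):
    """Flatten an H x W grid of feature vectors and its mask into two sequences."""
    tokens = [vec for row in feature_map for vec in row]
    mask = [flag for row in padding_mask for flag in row]
    return tokens, mask


def masked_attention(query, tokens, mask):
    """Scaled dot-product attention of one query over a token sequence.

    mask[i] == True means token i is padding and must receive zero weight
    (an assumed convention for this sketch).
    """
    if all(mask):
        raise ValueError("at least one token must be unmasked")
    scale = math.sqrt(len(query))
    scores = [
        -math.inf if is_pad else sum(q * k for q, k in zip(query, tok)) / scale
        for tok, is_pad in zip(tokens, mask)
    ]
    # Numerically stable softmax; exp(-inf) evaluates to 0.0 for padded tokens.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the token vectors.
    dim = len(tokens[0])
    context = [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]
    return weights, context


# Toy stand-in for a backbone output: a 2 x 2 grid of 2-dimensional features.
feature_map = [[[1.0, 0.0], [0.0, 1.0]],
               [[1.0, 1.0], [0.0, 0.0]]]
# The bottom-right cell is treated as padding.
padding_mask = [[False, False],
                [False, True]]
# Hypothetical decoder query vector.
query = [1.0, 0.0]

tokens, mask = flatten_feature_map(feature_map, padding_mask)
weights, context = masked_attention(query, tokens, mask)
print(weights)
print(context)
```

In a real model the query would come from the caption decoder and the features from a trained backbone, with learned projections and multiple heads; none of that is modeled here.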

saahiluppal commented on August 11, 2024

PS: Recent research shows that doing "Object Detection" prior to "Image Captioning" doesn't bring any additional improvement; instead, it just increases complexity.

ohwi commented on August 11, 2024

PS: Recent research shows that doing "Object Detection" prior to "Image Captioning" doesn't bring any additional improvement; instead, it just increases complexity.

Hi. Could you let me know which paper you are referring to? Thank you.

Tough-Stone commented on August 11, 2024

Have you found which paper the structure of this code is based on? Thanks.
