Comments (8)
You can find the answers in Sec. 4.2 and Table 8.
from meta-detr.
@Yuxuan-W The implementation is the same as in the paper. If you read the code carefully, you will find that CAM is not used at every encoder layer.
When computing category codes, src is the support feature. Why do src, category_code, and tsp need to go through QSAttn? This is not mentioned in the paper and cannot be seen in Fig. 4.
In deformable_transformer.py:
def forward_supp_branch(self, src, pos, reference_points, spatial_shapes, level_start_index, padding_mask, tsp, support_boxes):
    # self attention
    src2 = self.self_attn(self.with_pos_embed(src, pos), reference_points, src,
                          spatial_shapes, level_start_index, padding_mask)
    src = src + self.dropout1(src2)
    src = self.norm1(src)

    support_img_h, support_img_w = spatial_shapes[0, 0], spatial_shapes[0, 1]
    supp_roi = torchvision.ops.roi_align(
        src.transpose(1, 2).reshape(src.shape[0], -1, support_img_h, support_img_w),
        support_boxes,
        output_size=(7, 7),
        spatial_scale=1 / 32.0,
        aligned=True)
    supp_roi = supp_roi.mean(3).mean(2)  # [episode_size, C]
    category_code = supp_roi.sigmoid()  # [episode_size, C]

    if self.QSAttn:
        # siamese attention
        src, tsp = self.siamese_attn(src,
                                     inverse_sigmoid(category_code).unsqueeze(0).expand(src.shape[0], -1, -1),
                                     category_code.unsqueeze(0).expand(src.shape[0], -1, -1),
                                     tsp)
        # ffn
        src = self.forward_ffn(src + tsp)
    else:
        src = self.forward_ffn(src)

    return src, category_code
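For readers following the snippet above, here is a minimal pure-Python sketch of the pooling and sigmoid step that produces category_code, and of what inverse_sigmoid(category_code) gives back. It is not the repository's code: the helper names and the toy RoI values are hypothetical, and inverse_sigmoid is assumed to be a clamped logit, since its definition is not shown in this thread.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def inverse_sigmoid(p, eps=1e-5):
    # Assumption: a clamped logit. The real helper is not shown in the thread.
    p = min(max(p, 0.0), 1.0)
    return math.log(max(p, eps) / max(1.0 - p, eps))


def category_code_from_roi(roi):
    """roi is a C x H x W nested list; returns (pooled, code), each of length C."""
    pooled = []
    for channel in roi:
        values = [v for row in channel for v in row]
        # Average over all spatial positions, like supp_roi.mean(3).mean(2)
        pooled.append(sum(values) / len(values))
    # Squash each channel into (0, 1), like supp_roi.sigmoid()
    code = [sigmoid(v) for v in pooled]
    return pooled, code


# Toy RoI feature with 2 channels and a 2 x 2 spatial grid (hypothetical values)
roi = [
    [[0.5, 1.5], [2.5, -0.5]],
    [[-2.0, 0.0], [1.0, -3.0]],
]
pooled, code = category_code_from_roi(roi)
recovered = [inverse_sigmoid(c) for c in code]
print(pooled, code, recovered)
```

Under that assumption, the first extra argument passed to siamese_attn is approximately the pooled feature before the sigmoid, and the second is the category code itself.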
Sorry for my careless reading. I have removed the comment to avoid confusion.
@Yuxuan-W No problem. Any discussions are welcome.
@nhw649 Please allow me some time to address your concern, as I have a full-time job now and haven't read my own paper in a long time.
Okay, please. Thank you.
In the DeformableTransformerEncoder, you will find that self.QSAttn == True holds only for the first layer. However, if you look at the category code computed at the first layer, namely category_code[0], you will find that it is not involved in siamese_attn, because it is computed before siamese_attn is called. In other words, only category_code[1:] are affected by siamese_attn, but they are never used.
So I guess this is just for implementation convenience. There is no problem, and the code is consistent with the paper.
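To make the ordering argument concrete, here is a toy sketch, not the repository's encoder. The function name, the flag, and the arithmetic stand-ins for attention and the FFN are all hypothetical; only the order of operations follows forward_supp_branch as quoted in this thread.

```python
def run_toy_encoder(src, num_layers=3, siamese_in_first_layer=True):
    """Collect one scalar 'category code' per layer from a toy feature list."""
    category_codes = []
    for layer_idx in range(num_layers):
        # Stand-in for self-attention + norm
        src = [0.5 * v + 1.0 for v in src]
        # The category code is read from src BEFORE the siamese step
        category_codes.append(sum(src) / len(src))
        if layer_idx == 0 and siamese_in_first_layer:
            # Stand-in for siamese_attn: it changes src, not the code already taken
            src = [2.0 * v for v in src]
        # Stand-in for the FFN
        src = [v + 0.1 for v in src]
    return category_codes


with_attn = run_toy_encoder([1.0, 2.0, 3.0], siamese_in_first_layer=True)
without_attn = run_toy_encoder([1.0, 2.0, 3.0], siamese_in_first_layer=False)

print("layer 0 code unchanged:", with_attn[0] == without_attn[0])
print("later codes changed:", with_attn[1:] != without_attn[1:])
```

In this toy setup the first layer's code is identical with or without the siamese step, while the codes from later layers differ, which matches the reasoning above.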
@nhw649 I hope @Yuxuan-W's comments have addressed your concerns. Thank you, @Yuxuan-W!