Thank you for your great work. I was going through main_pretrain.py and noticed that there are three models: model_teacher, model, and model_ema. From your paper I understood that two models take part in the pre-training process: one model (model?) and an EMA-based model (model_ema?). So my question is: what role does model_teacher play?
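To make the question concrete, here is the generic EMA update I have in mind (my own sketch, not code from this repository); I am assuming model_ema is maintained roughly like this, and what I cannot place is how model_teacher fits alongside it:

```python
import torch

# Generic EMA update: the EMA copy tracks a momentum-smoothed version of the
# student's weights and receives no gradients itself.
@torch.no_grad()
def ema_update(student: torch.nn.Module, ema_model: torch.nn.Module, momentum: float = 0.999):
    for p_s, p_e in zip(student.parameters(), ema_model.parameters()):
        p_e.mul_(momentum).add_(p_s, alpha=1.0 - momentum)
```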
Hi, nice to read such an interesting paper. The pretraining code is well-detailed, but this repository lacks information regarding COCO detection downstream finetuning. I am curious about how to conduct COCO detection finetuning after MAE pretraining. Could you provide more details about the detection architecture implementation and experiment config?
I have some questions about the ImageNet-1K pretraining configuration for the HPM project and would appreciate more details about this experiment, to better understand the project's performance and resource requirements:
1. How many GPUs were used during pretraining?
2. Which GPU models or specifications were used for pretraining?
3. What was the total pretraining time?
4. What was the GPU memory consumption per card during pretraining?
Hello, it is very nice to read this paper. I am conducting research on 'learning where to mask', just like you. Could you please explain how the heatmaps in your paper are generated, i.e., how the weights are converted to an RGB heatmap? I have tried several approaches, but none of them look as visually appealing as yours. Thank you very much.
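For reference, this is the generic overlay recipe I have been trying (I am not claiming it is your pipeline): normalize the per-patch scores, upsample them to the image resolution, run them through a colormap, and alpha-blend with the RGB image.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from PIL import Image

def overlay_heatmap(image_path, patch_scores, grid=14, alpha=0.5):
    """Blend a (grid*grid,) score map over a 224x224 RGB image."""
    img = np.asarray(Image.open(image_path).convert("RGB").resize((224, 224))) / 255.0
    scores = np.asarray(patch_scores, dtype=np.float32).reshape(grid, grid)
    scores = (scores - scores.min()) / (scores.max() - scores.min() + 1e-8)
    heat = Image.fromarray(np.uint8(scores * 255)).resize((224, 224), Image.BILINEAR)
    heat_rgb = cm.jet(np.asarray(heat) / 255.0)[..., :3]   # colormap -> RGB in [0, 1]
    blended = (1 - alpha) * img + alpha * heat_rgb
    plt.imshow(blended)
    plt.axis("off")
    plt.show()
```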
Very interesting paper. I have a question regarding the implementation.
In your paper, the teacher model predicts the loss for each patch based on the fully visible image, while the student model learns to predict the losses of masked patches from the unmasked patches only. Is there a mismatch between training and inference for the loss predictor?
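To make the question concrete, here is a toy version of the "mask the hardest patches" step as I understand it (my own illustration, not your implementation); what I am unsure about is where pred_patch_loss comes from, i.e., whether the predictor producing it was supervised on visible patches or on the full image:

```python
import torch

def select_hard_patches(pred_patch_loss: torch.Tensor, mask_ratio: float = 0.75) -> torch.Tensor:
    """Mask the patches with the highest predicted reconstruction loss."""
    B, N = pred_patch_loss.shape
    num_mask = int(N * mask_ratio)
    ids = pred_patch_loss.argsort(dim=1, descending=True)   # hardest patches first
    mask = torch.zeros(B, N)
    mask.scatter_(1, ids[:, :num_mask], 1.0)                 # 1 = masked
    return mask.bool()

mask = select_hard_patches(torch.rand(2, 196))               # 14x14 patch grid
print(mask.sum(dim=1))                                        # tensor([147, 147])
```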
Hello, it was a pleasure to read this outstanding paper. I have some questions about the settings for pretraining the model for 800 epochs: did you keep the same settings as for the 200-epoch pretraining in Tables S1 and S2 (e.g., pretrain weight decay 0.05, pretrain layer-wise lr decay 0.8, 100 finetuning epochs)?
assert len(optimizer_state["found_inf_per_device"]) > 0, "No inf checks were recorded for this optimizer."
Hello, I am applying your method to 3D medical images, and the scaler raises the assertion above. After removing the scaler, the loss does not change at all and the parameters are not updated. What could be the cause?
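For reference, my understanding (which may be wrong) is that this assertion fires when scaler.step(optimizer) runs without any gradients having been produced by a preceding scaler.scale(loss).backward(), e.g., when the loss is detached from the computation graph; that would also explain why the parameters stay frozen once the scaler is removed. The loop I am comparing against, as a minimal self-contained sketch:

```python
import torch

model = torch.nn.Linear(8, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()

for _ in range(2):
    x = torch.randn(4, 8, device="cuda")
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = model(x).mean()        # loss must stay connected to the parameters
    scaler.scale(loss).backward()     # this records the "inf checks" the assert expects
    scaler.step(optimizer)
    scaler.update()
```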
I am interested in your project and I am keen on experimenting with it for my research. I was wondering if you have any plans to provide pre-trained models that users can directly use without going through the training process themselves. Having access to pre-trained models can help in significantly reducing the time and resources required to get started with the project.
Thank you for considering this request. Looking forward to your response.
Hi, thanks for your great work. I have a few experimental questions.
For the ViT-Base HPM pre-training config, the paper only provides a 200-epoch config, and I'm not sure whether pretrain_base.sh in this repository is meant for 800 epochs. Could you provide configs for 800 and 1600 epochs?
Why is the optimizer for the linear_probing experiments different from the original LARS optimizer used in MAE?
Have you tested different batch sizes during pre-training? Does the batch size significantly impact model performance?
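For context, my batch-size question assumes that MAE's linear lr scaling rule carries over to HPM (an assumption on my part), i.e., that the learning rate is rescaled with the total batch size:

```python
# MAE-style linear scaling rule: lr = base_lr * total_batch_size / 256
base_lr = 1.5e-4        # MAE's per-256-samples base learning rate for pretraining
batch_size = 4096       # total batch size across all GPUs
lr = base_lr * batch_size / 256
print(lr)               # 0.0024
```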
Looking forward to hearing back from you soon. Thank you!