
Comments (11)

yl-1993 commented on July 19, 2024

@luzai Basically, there are 3 ways to implement bs > 1.
(1) Accumulate loss before back-propagation. This is the simplest way but not very efficient.
(2) Customize collate_fn to merge them into a big graph (as you suggested) or pad graphs to the max size in the batch.
(3) Use distributed training, i.e., each process handles a graph.
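Option (2) can be sketched as a custom `collate_fn` that places the per-sample graphs on the diagonal of one big adjacency matrix, so a single forward pass covers the whole batch. This is a minimal sketch assuming each sample is a dense `(adj, feat)` pair of numpy arrays; the function name and layout are illustrative, not the repo's actual API.

```python
import numpy as np

def collate_graphs(batch):
    """Merge a list of (adj, feat) pairs into one block-diagonal graph.

    Each `adj` is an (n_i, n_i) adjacency matrix and each `feat` is an
    (n_i, d) feature matrix; the merged graph places the small graphs
    on the diagonal, so there are no edges between different samples.
    """
    sizes = [adj.shape[0] for adj, _ in batch]
    total = sum(sizes)
    d = batch[0][1].shape[1]

    big_adj = np.zeros((total, total), dtype=np.float32)
    big_feat = np.zeros((total, d), dtype=np.float32)

    offset = 0
    for (adj, feat), n in zip(batch, sizes):
        big_adj[offset:offset + n, offset:offset + n] = adj
        big_feat[offset:offset + n] = feat
        offset += n
    # `sizes` lets you split the per-graph outputs again after the GCN.
    return big_adj, big_feat, sizes
```

Because the merged adjacency is block-diagonal, message passing never leaks information between graphs in the batch, which is why this is equivalent to processing them one by one.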

from learn-to-cluster.

yl-1993 commented on July 19, 2024

@luzai I guess the question is why GCN-D can predict IoU since we don't have $\hat{P}$ as input.
(1) I agree with you that a larger receptive field may potentially lead to better results. Actually, the generated proposals contain different receptive fields, and some proposals are larger than the ground truth.
(2) We have to keep the receptive field in a reasonable range. Note that this is similar to object detection with very dense objects in an image: if we use a very large receptive field, it may make the main object unclear.
(3) IoU is an indicator to evaluate the quality of a cluster proposal. Our goal is to rank all proposals with a meaningful predicted score. There may exist better indicators.
(4) An alternative solution is to generate proposals directly on the entire affinity graph, like faster-rcnn in object detection. In this way, the receptive field will not be a problem.

I think it is a very good question and worth further investigation.
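The two indicators discussed here are simple set overlaps between a proposal $P$ and a ground-truth cluster $\hat{P}$, and can be computed directly. A minimal sketch (vertex sets as plain Python collections; names are illustrative):

```python
def iou_iop(proposal, ground_truth):
    """IoU and IoP of a cluster proposal against a ground-truth cluster.

    Both arguments are collections of vertex ids.
    IoU = |P ∩ P̂| / |P ∪ P̂| measures overall overlap,
    IoP = |P ∩ P̂| / |P|     measures the purity of the proposal.
    """
    p, gt = set(proposal), set(ground_truth)
    inter = len(p & gt)
    iou = inter / len(p | gt)
    iop = inter / len(p)
    return iou, iop
```

For example, a proposal `{1, 2, 3, 4}` against ground truth `{3, 4, 5}` gets IoU = 2/5 and IoP = 2/4, which is why IoU also penalizes missing vertices while IoP only penalizes impure ones.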

yl-1993 commented on July 19, 2024

@zhaomengao We have not implemented batch_size_per_gpu > 1 yet.

  1. If you want to implement batch_size_per_gpu > 1, you can customize collate_fn since graphs have varied size inside a batch.
  2. Make sure you generate multi-view proposals during training. Generally, more proposals help train the model better. The model structure is the same as the pretrained model. The training code will be released in late May.

zhaomengao commented on July 19, 2024

Thanks a lot for your answer.
The training proposals are generated from part0_train.bin and the given faiss_k_80.npz, i.e., with is_rebuild=False, is that right? The th is set to [0.6, 0.65, 0.7, 0.75].
Besides, can I refer to the parameter settings in cfg_0.7_0.75.yaml?

yl-1993 commented on July 19, 2024

@zhaomengao Yes, you can refer to the settings in cfg_0.7_0.75.yaml; I have updated the config in the dev branch. One difference is the batch size: we use bs=32 for training. Besides, you can obtain more training proposals by using different values of k or by iteratively building the super vertices.

luzai commented on July 19, 2024

Thank you for your great work!

May I ask some details about implementing bs=32? I guess we need to merge graphs of varied size into one graph and perform sparse matrix multiplication on the large graph. Is this correct?
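The sparse multiplication mentioned above can be sketched with a CSR adjacency, which is cheap because the merged batch graph is block-diagonal and therefore very sparse. A minimal sketch (the function name and edge-list format are illustrative, not the repo's actual API):

```python
import numpy as np
from scipy.sparse import csr_matrix

def gcn_propagate(edges, weights, num_nodes, x):
    """One propagation step A @ X with a sparse adjacency matrix.

    `edges` is a list of (i, j) index pairs, `weights` the matching
    edge weights, and `x` an (num_nodes, d) feature matrix.  Storing A
    in CSR form avoids ever materializing a dense (N, N) matrix for
    the merged batch graph.
    """
    rows, cols = zip(*edges)
    adj = csr_matrix((weights, (rows, cols)), shape=(num_nodes, num_nodes))
    return adj @ x  # (num_nodes, d) aggregated neighbor features
```

In a full GCN layer this aggregation would be followed by a learned linear transform and a nonlinearity; only the neighborhood aggregation step is shown here.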

Meanwhile, may I ask a question about the paper? As shown in the figure from the paper, the output of the GCN is 0.82 when it is fed a graph of 5 nodes (with adjacency and features). Do you think the GCN needs to be fed a larger graph (maybe we can call it a larger receptive field), so that it knows the IoU is 0.82, since IoU concerns all ground-truth vertices of one class?

yl-1993 commented on July 19, 2024

@luzai I am not sure I fully understand your second question.
(1) There are different proposals in this figure, and their receptive fields are different.
(2) This teaser is only a demonstration of our algorithm, not the actual result. In practice, the number of vertices in a proposal for ms1m usually lies in the range of 30 to 300.

Keep asking if I don't fully answer your question.

luzai commented on July 19, 2024

Thank you very much for your detailed response!
For the second question, I am sorry for not explaining clearly. May I describe the question in more details?

The input to GCN-D is the proposal $P$. It contains the information of $P \cap \hat{P}$, which is a subset of $P$ (but not all of $P \cup \hat{P}$). Thus, GCN-D is capable of predicting the purity IoP. As for IoU, I guess GCN-D may find some cluster patterns to predict it, but it might predict it more precisely given more input vertices.

Please let me know if I do not fully understand the idea of the paper. Thank you!

luzai commented on July 19, 2024

Great thanks for your detailed response!

I understand now: (1) Some of the proposals are super-sets of the ground-truth cluster, since they are generated at multiple scales. (2) GCN-D evaluates each proposal by the IoU indicator, so a larger proposal may receive a higher score. (There may be other intrinsic patterns for a high-recall proposal, beyond just being large.) (3) De-Overlap ranks the proposals and predicts clusters, so larger proposals may be preferred in this step. Thus, the whole pipeline can generate high-quality clusters.
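Step (3) of this summary can be sketched as a greedy pass over the scored proposals: visit them in descending predicted IoU, and let each proposal keep only the vertices not already claimed by a higher-ranked one. This is a minimal sketch of the idea, not the repo's actual de-overlap implementation.

```python
def de_overlap(proposals, scores):
    """Greedy de-overlap: turn scored proposals into disjoint clusters.

    `proposals` is a list of vertex-id collections and `scores` their
    predicted quality.  Proposals are visited in descending score;
    vertices already assigned to a higher-ranked proposal are removed,
    and whatever remains is emitted as a cluster.
    """
    assigned = set()
    clusters = []
    for prop, _ in sorted(zip(proposals, scores), key=lambda t: -t[1]):
        remainder = set(prop) - assigned
        if remainder:
            clusters.append(remainder)
            assigned |= remainder
    return clusters
```

Because higher-scored proposals claim their vertices first, the output clusters are disjoint and ordered by predicted quality, which matches the "rank then de-overlap" description above.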

Your comparison with object detection is quite inspiring. Looking forward to more solid work!

yl-1993 commented on July 19, 2024

@luzai Yes, you are right. The key lies in the multi-scale training and testing. This is a good point, and we are going to do more analysis on the learned patterns.

yl-1993 commented on July 19, 2024

@zhaomengao @luzai batch_size > 1 is supported now and the training code is released.
