
Comments (11)

yl-1993 commented on July 19, 2024

@luzai Basically, there are 3 ways to implement bs > 1.
(1) Accumulate loss before back-propagation. This is the simplest way but not very efficient.
(2) Customize collate_fn to merge them into a big graph (as you suggested) or pad graphs to the max size in the batch.
(3) Use distributed training, i.e., each process handles a graph.
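Option (2) can be sketched as a custom `collate_fn` that places the per-sample graphs on the diagonal of one big adjacency matrix, so a single forward pass covers the whole batch. This is a minimal sketch assuming each sample is a dense `(adj, feat)` pair of numpy arrays; the function name and layout are illustrative, not the repo's actual API.

```python
import numpy as np

def collate_graphs(batch):
    """Merge a list of (adj, feat) pairs into one block-diagonal graph.

    Each `adj` is an (n_i, n_i) adjacency matrix and each `feat` is an
    (n_i, d) feature matrix; the merged graph places the small graphs
    on the diagonal, so there are no edges between different samples.
    """
    sizes = [adj.shape[0] for adj, _ in batch]
    total = sum(sizes)
    d = batch[0][1].shape[1]

    big_adj = np.zeros((total, total), dtype=np.float32)
    big_feat = np.zeros((total, d), dtype=np.float32)

    offset = 0
    for (adj, feat), n in zip(batch, sizes):
        big_adj[offset:offset + n, offset:offset + n] = adj
        big_feat[offset:offset + n] = feat
        offset += n
    # `sizes` lets you split the per-graph outputs again after the GCN.
    return big_adj, big_feat, sizes
```

Because the merged adjacency is block-diagonal, message passing never leaks information between graphs in the batch, which is why this is equivalent to processing them one by one.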

from learn-to-cluster.

yl-1993 commented on July 19, 2024

@luzai I guess the question is why GCN-D can predict IoU since we don't have $\hat{P}$ as input.
(1) I agree with you that a larger receptive field may potentially lead to better results. Actually, the generated proposals contain different receptive fields, and some proposals are larger than the ground truth.
(2) We have to keep the receptive field in a reasonable range. Note that this is similar to object detection with very dense objects in an image: if we use a very large receptive field, it may make the main object unclear.
(3) IoU is an indicator to evaluate the quality of a cluster proposal. Our goal is to rank all proposals with a meaningful predicted score. There may exist better indicators.
(4) An alternative solution is to generate proposals directly on the entire affinity graph, like faster-rcnn in object detection. In this way, the receptive field will not be a problem.

I think it is a very good question and worth further investigation.
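The two indicators discussed here are simple set overlaps between a proposal $P$ and a ground-truth cluster $\hat{P}$, and can be computed directly. A minimal sketch (vertex sets as plain Python collections; names are illustrative):

```python
def iou_iop(proposal, ground_truth):
    """IoU and IoP of a cluster proposal against a ground-truth cluster.

    Both arguments are collections of vertex ids.
    IoU = |P ∩ P̂| / |P ∪ P̂| measures overall overlap,
    IoP = |P ∩ P̂| / |P|     measures the purity of the proposal.
    """
    p, gt = set(proposal), set(ground_truth)
    inter = len(p & gt)
    iou = inter / len(p | gt)
    iop = inter / len(p)
    return iou, iop
```

For example, a proposal `{1, 2, 3, 4}` against ground truth `{3, 4, 5}` gets IoU = 2/5 and IoP = 2/4, which is why IoU also penalizes missing vertices while IoP only penalizes impure ones.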

yl-1993 commented on July 19, 2024

@zhaomengao We have not implemented batch_size_per_gpu > 1 yet.

  1. If you want to implement batch_size_per_gpu > 1, you can customize collate_fn since graphs have varied size inside a batch.
  2. Make sure you generate multi-view proposals during training. Generally, more proposals help train the model better. The model structure is the same as the pretrained model. The training code will be released in late May.

zhaomengao commented on July 19, 2024

Thanks a lot for your answer.
The training proposals are generated from part0_train.bin and the given faiss_k_80.npz, i.e., with is_rebuild=False, is that right? The th is set to [0.6, 0.65, 0.7, 0.75].
Besides, can I refer to the parameter settings in cfg_0.7_0.75.yaml?

yl-1993 commented on July 19, 2024

@zhaomengao Yes, you can refer to the settings in cfg_0.7_0.75.yaml; I have updated the config in the dev branch. One difference is the batch size: we use bs=32 for training. Besides, you can obtain more training proposals by using different values of k or by iteratively building the super vertices.

luzai commented on July 19, 2024

Thank you for your great work!

May I ask some details about implementing bs=32? I guess we need to merge graphs of varied size into one graph and perform sparse matrix multiplication on the large graph. Is this correct?
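The sparse multiplication mentioned above can be sketched with a CSR adjacency, which is cheap because the merged batch graph is block-diagonal and therefore very sparse. A minimal sketch (the function name and edge-list format are illustrative, not the repo's actual API):

```python
import numpy as np
from scipy.sparse import csr_matrix

def gcn_propagate(edges, weights, num_nodes, x):
    """One propagation step A @ X with a sparse adjacency matrix.

    `edges` is a list of (i, j) index pairs, `weights` the matching
    edge weights, and `x` an (num_nodes, d) feature matrix.  Storing A
    in CSR form avoids ever materializing a dense (N, N) matrix for
    the merged batch graph.
    """
    rows, cols = zip(*edges)
    adj = csr_matrix((weights, (rows, cols)), shape=(num_nodes, num_nodes))
    return adj @ x  # (num_nodes, d) aggregated neighbor features
```

In a full GCN layer this aggregation would be followed by a learned linear transform and a nonlinearity; only the neighborhood aggregation step is shown here.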

Meanwhile, may I ask a question about the paper? As shown in the figure from the paper, the output of the GCN is 0.82 when it is fed a graph of 5 nodes (with adjacency and features). Do you think the GCN needs to be fed a larger graph (maybe we can call it a larger receptive field), so that it knows the IoU is 0.82, since IoU concerns all ground-truth vertices of one class?

yl-1993 commented on July 19, 2024

@luzai I am not sure I fully understand your second question.
(1) There are different proposals in this figure, and their receptive fields are different.
(2) This teaser is only a demonstration of our algorithm, not the actual result. In practice, the number of vertices in a proposal for ms1m usually lies in the range of 30 to 300.

Keep asking if I don't fully answer your question.

luzai commented on July 19, 2024

Thank you very much for your detailed response!
For the second question, I am sorry for not explaining clearly. May I describe the question in more details?

The input to GCN-D is the proposal $P$. It contains the information of $P \cap \hat{P}$, which is a subset of $P$ (but not all of $P \cup \hat{P}$). Thus, GCN-D is capable of predicting the purity IoP. As for IoU, I guess GCN-D may find some cluster patterns to predict it, but it might predict it more precisely given more input vertices.

Please let me know if I do not fully understand the idea of the paper. Thank you!

luzai commented on July 19, 2024

Great thanks for your detailed response!

I understand now: (1) Some of the proposals are super-sets of the ground-truth cluster, since they are generated at multiple scales. (2) GCN-D evaluates each proposal by the IoU indicator, so a larger proposal may receive a higher score. (There may be other intrinsic patterns for a high-recall proposal, beyond just being large.) (3) De-Overlap ranks the proposals and predicts clusters, so larger proposals may be preferred in this step. Thus, the whole pipeline can generate high-quality clusters.
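Step (3) of this summary can be sketched as a greedy pass over the scored proposals: visit them in descending predicted IoU, and let each proposal keep only the vertices not already claimed by a higher-ranked one. This is a minimal sketch of the idea, not the repo's actual de-overlap implementation.

```python
def de_overlap(proposals, scores):
    """Greedy de-overlap: turn scored proposals into disjoint clusters.

    `proposals` is a list of vertex-id collections and `scores` their
    predicted quality.  Proposals are visited in descending score;
    vertices already assigned to a higher-ranked proposal are removed,
    and whatever remains is emitted as a cluster.
    """
    assigned = set()
    clusters = []
    for prop, _ in sorted(zip(proposals, scores), key=lambda t: -t[1]):
        remainder = set(prop) - assigned
        if remainder:
            clusters.append(remainder)
            assigned |= remainder
    return clusters
```

Because higher-scored proposals claim their vertices first, the output clusters are disjoint and ordered by predicted quality, which matches the "rank then de-overlap" description above.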

Your comparison with object detection is quite inspiring. Looking forward to more solid work!

yl-1993 commented on July 19, 2024

@luzai Yes, you are right. The key lies in the multi-scale training and testing. This is a good point, and we are going to do more analysis on the learned patterns.

yl-1993 commented on July 19, 2024

@zhaomengao @luzai batch_size > 1 is supported now and the training code is released.
