Comments (5)

woodfrog commented on June 7, 2024

In the current implementation, the number of edges after edge filtering is set to 3 * N rather than O(N^2), where N is the number of corners (check here and the corresponding descriptions in the paper's Sec. 4.2). We don't use O(N^2) because 1) most edge candidates are easy negatives and can be eliminated with independent processing (i.e., the edge filtering part), so feeding all edges into the transformer decoder is wasteful; 2) keeping all the edge candidates makes the computational cost of the transformer decoder unaffordable.
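
(For illustration only: a minimal PyTorch sketch of this top-(3 * N) selection, with hypothetical names `edge_feats` and `filter_logits` for the candidate features and the filtering head's outputs -- this is not the repo's exact code.)

```python
import torch

def keep_top_candidates(edge_feats, filter_logits, num_corners, k_per_corner=3):
    """Keep only the 3*N most promising edge candidates (ranked by the
    filtering head's confidence) so the transformer decoder never sees
    the full O(N^2) candidate set."""
    k = min(k_per_corner * num_corners, filter_logits.numel())
    scores = torch.sigmoid(filter_logits)       # per-candidate confidence
    top_idx = torch.topk(scores, k).indices     # indices of the kept edges
    return edge_feats[top_idx], top_idx
```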

In your case, I guess the GPU memory is used up even before the edge filtering part, since you have too many corners. A potential solution would be to 1) run the edge filtering on all O(N^2) edge candidates in an iterative manner and eliminate all easy negatives, then 2) run the edge transformer decoder on the remaining candidates. But running a transformer decoder with over 2000 input nodes is still computationally expensive, so you would still need a lot of GPU resources.
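
(A minimal sketch of such chunked filtering, assuming a hypothetical scoring head `filter_net` and pre-computed candidate features; the actual interfaces in the repo differ.)

```python
import torch

@torch.no_grad()
def filter_in_chunks(candidate_feats, filter_net, chunk_size=4096, thresh=0.5):
    """Score all O(N^2) edge candidates in fixed-size chunks so that only
    `chunk_size` candidates are processed at a time, and return the indices
    of candidates that survive the easy-negative threshold."""
    kept = []
    for start in range(0, candidate_feats.shape[0], chunk_size):
        chunk = candidate_feats[start:start + chunk_size]
        probs = torch.sigmoid(filter_net(chunk)).squeeze(-1)
        kept.append((probs >= thresh).nonzero(as_tuple=True)[0] + start)
    return torch.cat(kept)
```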

Another workaround would be splitting each big scene into multiple sub-regions and running the algorithm on each part separately. Since your scenes are so large, the relations between far-apart areas are probably weak, so this division might not hurt the performance significantly.
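
(A minimal sketch of such a split into axis-aligned, overlapping crops; merging the per-crop predictions, e.g. by de-duplicating corners in the overlap regions, is left out.)

```python
def split_into_tiles(height, width, tile=256, overlap=32):
    """Enumerate (top, left, bottom, right) crops that cover a large scene
    with overlapping tiles; each crop is processed independently."""
    step = tile - overlap
    boxes = []
    for top in range(0, height, step):
        for left in range(0, width, step):
            bottom, right = min(top + tile, height), min(left + tile, width)
            boxes.append((max(bottom - tile, 0), max(right - tile, 0), bottom, right))
    return sorted(set(boxes))  # drop duplicate crops generated at the borders
```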

Hope this helps :)

zssjh commented on June 7, 2024

Thank you, that's very helpful. I'll try it!

zssjh commented on June 7, 2024

Hello @woodfrog, there is still a problem with this dataset. Our dataset has about 120 images; I augment it 10 times to get about 1000 training inputs, but training still overfits at around epoch 50, so the network is not learning anything now. For such a small dataset, which parts of the network could I remove or simplify with relatively little impact on accuracy? Or do you have other suggestions? Thank you very much!

woodfrog commented on June 7, 2024

Hi @zssjh, according to your previous description, your dataset seems to contain quite large-scale scenes, so I don't think 120 such scenes would lead to very serious overfitting. Could you elaborate on what you mean by "the network is learning nothing now"? If you run a test on the training images, are the results perfect? If so, then data augmentation should be the right way to go -- what is your current augmentation strategy for producing the 10 augmented copies?
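
(For reference, a minimal sketch of the kind of geometric augmentation that keeps an image and its corner annotations consistent; the specific transforms here are an assumption, not necessarily the repo's pipeline.)

```python
import numpy as np

def augment(image, corners, k=None, flip=None):
    """Apply a random 90-degree rotation and an optional horizontal flip to a
    square image of shape (H, W, C) and its corner coordinates of shape (M, 2)
    in (x, y) order, keeping the labels consistent with the augmented image."""
    h, w = image.shape[:2]
    assert h == w, "sketch assumes square inputs so rotations keep the shape"
    k = np.random.randint(4) if k is None else k
    flip = np.random.rand() < 0.5 if flip is None else flip
    image = np.rot90(image, k)
    x, y = corners[:, 0].astype(float), corners[:, 1].astype(float)
    for _ in range(k):               # rotate coordinates 90 degrees CCW per step
        x, y = y, w - 1 - x
    if flip:                         # horizontal flip of both image and x coords
        image = image[:, ::-1]
        x = w - 1 - x
    return np.ascontiguousarray(image), np.stack([x, y], axis=1)
```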

zssjh commented on June 7, 2024

Hi @woodfrog,
Thank you for your reply! I trained for about 300 epochs. At around epoch 20 the validation loss began to rise and kept rising until the end, including the corner loss, the stage-1 edge loss, and the image decoder loss. Only the geometry loss did not rise, but it stayed flat from around epoch 150, so I judged this as overfitting. Following your suggestion, I tested the best checkpoint (from epoch 144) on the test set, and I found that the network does seem to have learned some rules: about 40% of the edges and corners are detected correctly, but it seems that overfitting prevents the network from learning further.
