
Loss & accuracy about u-2-net (OPEN, 8 comments)

xuebinqin commented on May 15, 2024

Comments (8)

xuebinqin commented on May 15, 2024

xuebinqin commented on May 15, 2024

szerintedmi commented on May 15, 2024

@NathanUA, thank you for taking the time to answer. Very useful tips, much appreciated.

> You can think about increasing the filter numbers of the network.

Do you mean increasing "M" (the channels in the internal layers of the RSUs), or increasing the number of layers ("L")?
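For anyone else reading along, a minimal sketch of where those two knobs live, assuming the RSU constructors in model/u2net.py of the repo follow the paper's RSU-L(C_in, M, C_out) notation as in_ch, mid_ch, out_ch:

```python
# Hedged sketch: "L" (depth) is baked into the class name (RSU7 has depth 7),
# "M" is the mid_ch argument; widening means raising mid_ch (and usually out_ch).
from model.u2net import RSU7  # assumes the repo's package layout

narrow = RSU7(in_ch=3, mid_ch=32, out_ch=64)   # a stage-1-like configuration
wide = RSU7(in_ch=3, mid_ch=64, out_ch=128)    # doubling M adds filters/capacity
```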

> In our current version of the model, we use 6 side outputs to reduce overfitting. You can also try to disable some or all of the side-output supervision to increase the model capacity.

Just to double-check my understanding: you suggest using only the BCE of the last side output as the loss function for training? (In your train code it's called loss2 or tar.)
Are you using the fused BCE loss (summed over all side outputs) instead of only the last output's loss purely to avoid overfitting? In that case it makes total sense to me to use only the d0 loss for training as long as we don't overfit.
It might be trivial, but I can't get my head around why it would increase model capacity. By the way, what do you mean by "model capacity"? :-)
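For concreteness, here's a minimal sketch of the two options, modeled on the muti_bce_loss_fusion function in u2net_train.py (the exact names there may differ slightly):

```python
import torch.nn as nn

bce_loss = nn.BCELoss(reduction='mean')

# Deep supervision as in u2net_train.py: BCE on the fused output d0
# plus BCE on each of the six side outputs d1..d6.
def muti_bce_loss_fusion(d0, d1, d2, d3, d4, d5, d6, labels_v):
    losses = [bce_loss(d, labels_v) for d in (d0, d1, d2, d3, d4, d5, d6)]
    return losses[0], sum(losses)  # (d0-only loss, total supervised loss)

# Training with full side-output supervision:
#   loss2, loss = muti_bce_loss_fusion(d0, d1, d2, d3, d4, d5, d6, labels_v)
#   loss.backward()
# "Disabling" side supervision, as suggested above, would instead be:
#   loss2.backward()
```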

> BTW, I am not sure which exact accuracy measure you are using, but I suggest using IoU or F-measure to evaluate the segmentation performance.

We used MAE, but good call; we will try IoU and F-measure.
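Both metrics are cheap to add. A minimal sketch on binary masks; the beta-squared = 0.3 weighting is the convention in the salient-object-detection literature (including the U²-Net paper), and the 0.5 threshold is our assumption:

```python
import numpy as np

def iou(pred, gt, thresh=0.5):
    """IoU of a predicted probability map against a binary ground-truth mask."""
    p, g = pred > thresh, gt > thresh
    inter = np.logical_and(p, g).sum()
    union = np.logical_or(p, g).sum()
    return inter / union if union else 1.0  # both empty counts as a perfect match

def f_measure(pred, gt, thresh=0.5, beta2=0.3):
    """F-beta with beta^2 = 0.3, as is common for salient object detection."""
    p, g = pred > thresh, gt > thresh
    tp = np.logical_and(p, g).sum()
    precision = tp / max(p.sum(), 1)
    recall = tp / max(g.sum(), 1)
    denom = beta2 * precision + recall
    return (1 + beta2) * precision * recall / denom if denom else 0.0
```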

> (I suggest training your model from scratch rather than starting from our pre-trained weights.)

So far our results improve much more quickly when we start from your pre-trained weights (as you can see from the graphs above). Do you think this trend would change if we trained from scratch for longer?

> In addition, your validation set is too small compared with your training set. I think that's why your validation losses are even smaller than your training losses.

Indeed; we've already increased it to 2,000 validation samples and will experiment to see whether we need more.

> We didn't test LR that much. We use the Adam optimizer with default settings.

I just tried lowering the LR because the loss seemed to be oscillating, but the training may not have run long enough to see the trend. Happy to share our findings if you are interested.
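For reference, the Adam setup with PyTorch defaults (this matches u2net_train.py as far as I can tell), plus one hypothetical way to damp oscillation without hand-picking a flat lower LR:

```python
import torch.optim as optim
from model import U2NET  # as imported in u2net_train.py

net = U2NET(3, 1)
optimizer = optim.Adam(net.parameters(), lr=0.001, betas=(0.9, 0.999),
                       eps=1e-08, weight_decay=0)

# Hypothetical alternative: halve the LR whenever the validation loss
# stops improving for a few validation passes.
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5, patience=5)
# ... then after each validation pass:
# scheduler.step(val_loss)
```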


szerintedmi commented on May 15, 2024

btw, here are the results from 100 iterations (~240k samples) and 2000 validation samples per iteration.
[image: training and validation loss curves]


szerintedmi commented on May 15, 2024

Thanks @NathanUA!

We ran some trainings optimizing the d0 loss (the fused-output loss without the side-output losses). It didn't bring any significant improvement; qualitatively the results look the same as with your original loss.
UPDATE: I just ran a small test on 300 samples to compare loss vs. loss2, and the correlation seems pretty strong, so it's no surprise that changing the loss function didn't make a noticeable difference.
[image: loss vs. loss2 correlation plot]
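The check itself is tiny. A sketch, assuming the two losses were logged per batch to text files (the file names are made up):

```python
import numpy as np

# Hypothetical log files: one loss value per line, same batches in the same order.
loss_total = np.loadtxt("loss_total.log")  # summed side-output loss ("loss")
loss_d0 = np.loadtxt("loss_d0.log")        # fused-output loss ("loss2")

r = np.corrcoef(loss_total, loss_d0)[0, 1]  # Pearson correlation coefficient
print(f"Pearson r = {r:.3f}")  # r near 1.0 => the two losses track each other
```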

You mentioned you trained for 120 hours with 10k images augmented to 20k. We trained with 30k images augmented to 240k, and it finishes in 9-10 hours on a single-GPU Colab instance. Why is it so much faster for us? Did you feed the same images multiple times during training? If so, why?


ohheysherry66 commented on May 15, 2024

> I'm trying to retrain your model for our specific use case. I'm training with images augmented from a 30k set. I also added accuracy calculations and validation.
>
> The loss and accuracy seem to stall no matter how I change the learning rate.
> What would you recommend? Should I just train longer? Should I try to "freeze" or lower the LR on part of the layers (which layers? all encoders?). Or is that as far as it can get?
> Have you experimented with different LR schedules (cyclic etc.)?
>
> I ran these with 120k training images (50 epochs, 200 iterations each, batch size 12). Validation size: 600 images after each epoch.
>
> Training from scratch
>
> [image: loss/accuracy curves]
>
> Training on your pre-trained model (173.6 MB), LR=0.001 (same as yours)
>
> [image: loss/accuracy curves]
>
> LR reduced to 0.0001 (on the pre-trained model)
>
> [image: loss/accuracy curves]

Hi, could you please tell me where to add the validation process? It seems to change a lot. Thank you.


xuebinqin commented on May 15, 2024

You can first define a testing dataloader just after the training dataloader, and then feed the testing data inside the `if ite_num % save_frq == 0:` block with a for loop, just like `for i, data in enumerate(salobj_dataloader):`. Before the testing loop you need to switch the net to evaluation mode with `net.eval()`, and switch it back to training mode with `net.train()` after the validation.
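A hedged sketch of the above, following the structure of u2net_train.py (salobj_dataloader, save_frq, and muti_bce_loss_fusion come from that script; the val_* names and file paths are illustrative):

```python
import glob
import torch
from torch.utils.data import DataLoader
from torchvision import transforms
from data_loader import SalObjDataset, RescaleT, ToTensorLab  # repo modules

# 1) A validation dataloader, defined just after the training dataloader.
#    The glob patterns are placeholders for your own validation split.
val_img_list = sorted(glob.glob("val_data/images/*.jpg"))
val_lbl_list = sorted(glob.glob("val_data/masks/*.png"))
val_dataset = SalObjDataset(
    img_name_list=val_img_list, lbl_name_list=val_lbl_list,
    transform=transforms.Compose([RescaleT(320), ToTensorLab(flag=0)]))
val_dataloader = DataLoader(val_dataset, batch_size=12, shuffle=False)

# 2) Inside the training loop of u2net_train.py:
# if ite_num % save_frq == 0:
#     net.eval()                    # evaluation mode for the validation pass
#     val_loss = 0.0
#     with torch.no_grad():         # no gradients needed for validation
#         for i, data in enumerate(val_dataloader):
#             inputs = data['image'].type(torch.FloatTensor)
#             labels = data['label'].type(torch.FloatTensor)
#             d0, d1, d2, d3, d4, d5, d6 = net(inputs)
#             _, loss = muti_bce_loss_fusion(d0, d1, d2, d3, d4, d5, d6, labels)
#             val_loss += loss.item()
#     print("validation loss: %f" % (val_loss / len(val_dataloader)))
#     net.train()                   # back to training mode
```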

ohheysherry66 commented on May 15, 2024

Thank you, big help!

