namiyousef / multi-task-learning

Repository for multi task learning

Python 62.17% Jupyter Notebook 37.83%
mtl pytorch-implementation pytorch torch torchvision

multi-task-learning's Introduction

Multi-Task Learning

How to run main code:

  • Clone the repository.
  • Create a new environment from requirements.txt.
  • Run main.py.

How to use this as a library:

  • You can import functions from the different files; the code is designed to behave like a module.
  • In particular, the function get_prebuilt_model is useful if you want to load the in-built (default) models (see the usage sketch after this list).
  • You can add new defaults based on configurations that you build.
  • It is also possible to build models in a bespoke way.
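
As a rough illustration, loading a default model might look like the sketch below. Only the name get_prebuilt_model appears in this README; the import path, signature, and configuration name are assumptions.

```python
# Hypothetical usage sketch: only `get_prebuilt_model` is named in this README;
# the import path, argument, and config name below are assumptions.
from models import get_prebuilt_model

# Load one of the in-built (default) model configurations by name.
model = get_prebuilt_model("resnet34_seg_bb_class")

# The returned object is expected to behave like a regular torch.nn.Module.
```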

How to run legacy code:

  • The legacy code behaves slightly differently to the main code. It is included for reference.
  • In particular, main_colab.py and colab_continue_train.py are useful if you are running on Colab and have trouble with runtimes restarting. You can use them to save models and then re-train from the previously saved state.
  • main.py from legacy should not be used.

multi-task-learning's People

Contributors

connorwatts, namiyousef, valerief412, yakovsushenok


Forkers

yakovsushenok

multi-task-learning's Issues

Add regular model saving and ability to continue training from last saved state

Colab runtimes kill long-running code, so there needs to be a feature to save/load models at regular intervals. Commit 486b668 adds a manual fix for this; however, the issue will remain open as a feature to integrate into the submission code. A minimal checkpointing sketch is included below.

The final implementation would also need to preserve the random state used for the data, and make sure it is not reset when importing from other files. More investigation is needed to get this right.
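
For reference, a minimal checkpointing sketch in plain PyTorch (not the repository's actual code; `model`, `optimizer`, and the path are placeholders):

```python
import torch

def save_checkpoint(model, optimizer, epoch, path="checkpoint.pt"):
    # Save model/optimizer weights plus the RNG state so that data shuffling
    # can resume deterministically after a Colab runtime restart.
    torch.save({
        "epoch": epoch,
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
        "rng_state": torch.get_rng_state(),
    }, path)

def load_checkpoint(model, optimizer, path="checkpoint.pt"):
    checkpoint = torch.load(path, map_location="cpu")
    model.load_state_dict(checkpoint["model_state_dict"])
    optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
    torch.set_rng_state(checkpoint["rng_state"])
    return checkpoint["epoch"] + 1  # epoch to resume training from
```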

Type mismatch bug when running on GPU

There's a bug due to a type mismatch: torch.FloatTensor vs. torch.cuda.FloatTensor. There is a notebook file that you should be able to run on Colab to reproduce it.

For more details, see here.
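
This kind of error usually means the model and the data are not on the same device. A generic sketch of the usual fix (placeholder names, not the notebook's code):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = model.to(device)              # `model` is a placeholder
for inputs, targets in dataloader:    # `dataloader` is a placeholder
    inputs = inputs.to(device)        # move data to the same device as the model
    targets = targets.to(device)
    outputs = model(inputs)
```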

Dataset conflict. Decide on which to use.

There are two datasets in the repository provided by Yipeng. One of them was only authored around the 1st of Jan 2022.
See the repository for the new dataset (https://weisslab.cs.ucl.ac.uk/WEISSTeaching/datasets/-/tree/oxpet/). For my tests I've used the new one; I'm adding this issue to note that we NEED to standardise this for testing.

Chat log with Shaheer (GTA for the IDL course) included below:

[Yesterday 19:37] Nami, Yousef
Hi Shaheer, I hope that you are well and in good health. I'm Yousef, a student on the IDL course! I recently opened the data repository for CW2 again, and I found that there's a folder called "data_new". How does this differ from the original set? Are we expected to use the new one for the CW?

[Yesterday 20:10] Saeed, Shaheer Ullah
Hi! Hope you're well too. You can use either the new set OR the old set for the CW; this is completely your choice and the choice of dataset is not a marking point (so you won't be marked down for picking one over the other). The new one has both cats and dogs in the test set, whereas the old one only has dogs. So if you pick the old one you will be investigating impact of MTL on dog segmentation but if you pick the new one, you will be investigating the impact of MTL on cat AND dog segmentation. Again which one you choose is up to you and you will not be marked down for your choice. Please see the moodle post by Yipeng for further context: https://moodle.ucl.ac.uk/mod/forum/discuss.php?d=692925
UCL Moodle: Log in to the site

Add default model configurations

Add default model configurations to allow plug-and-play use of different models.
Configs to add:

  • resnet configurations with the current heads we're using (a possible shape is sketched below)
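
As a starting point, a default configuration entry could take roughly the following shape. Everything here is illustrative: the keys, the registry name, and the segmentation/bounding-box head names are assumptions rather than the repository's actual format.

```python
# Illustrative only: hypothetical shape of a default configuration registry.
DEFAULT_CONFIGS = {
    "resnet34_all_heads": {
        "encoder": {"name": "resnet34", "pretrained": False},
        "decoders": {
            "class": {"head": "ClassificationHead"},  # head currently in use
            "segmen": {"head": "SegmentationHead"},   # name assumed
            "bb": {"head": "BBHead"},                 # name assumed
        },
        "loss_weights": {"segmen": 0.5, "bb": 0.25, "class": 0.25},
    },
}
```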

Useful Links

Useful links for tools that we can use

Reporting links (latex, drawing, etc.)

GitHub links

Papers

Please don't include papers here anymore. Add them under links for the report. See #5

Other Knowledge / Trivia

Colab

L1 loss used for bounding box?

In criterion.py under get_loss(), the L1Loss is used for task=="BB". Is there a reason for this? Why wasn't IoU used?
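
For comparison, an IoU-style bounding-box loss could look roughly like the sketch below. This assumes boxes in (x1, y1, x2, y2) format and a torchvision version that provides generalized_box_iou; it is not the repository's implementation.

```python
import torch
from torchvision.ops import generalized_box_iou

def giou_loss(pred_boxes, target_boxes):
    # Both tensors have shape (N, 4) in (x1, y1, x2, y2) format; the diagonal of
    # the pairwise GIoU matrix gives the matched prediction/target pairs.
    giou = torch.diag(generalized_box_iou(pred_boxes, target_boxes))
    return (1.0 - giou).mean()
```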

Results

Number of epochs used:
Batch_size used:
Model architecture:

  • Encoder: Resnet34 pre-trained=False
  • Decoders:
    -- Class:
    -- Segmen:
    -- BB:

Losses

Train

| Configuration | BB (L1Loss) | Class (BCELoss) | Segmen (DiceLoss) | Total |
| --- | --- | --- | --- | --- |
| Baseline | | | | |
| Baseline + Class | | | | |
| Baseline + BB | | | | |
| Baseline + BB + Class | | | | |
| Seg + BB (0.75, 0.25) | 22.136 | - | 0.125 | |
| Seg + BB (0.25, 0.75) | 20.152 | - | 0.122 | |
| Seg + BB + Class (0.5, 0.25, 0.25) | 0.3001 | 1.0348 | 0.1527 | |
| Seg + BB + Class (0.25, 0.5, 0.25) | 0.2744 | 1.1820 | 0.1582 | |
| Seg + BB + Class (0.25, 0.25, 0.5) | 0.3564 | 0.77764 | 0.1979 | |
| Seg + BB (uniform) | | | | |
| Seg + BB (const. bern.) | | | | |

Test

| Configuration | BB (L1Loss) | Class (BCELoss) | Segmen (DiceLoss) | Total |
| --- | --- | --- | --- | --- |
| Baseline | - | - | 0.137 | |
| Baseline + Class | - | 1.278 | 0.182 | |
| Baseline + BB | 21.254 | - | 0.126 | |
| Baseline + BB + Class | 0.2892 | 1.4871 | 0.1698 | |

Metrics

Train

| Configuration | Class (Accuracy) |
| --- | --- |
| Baseline + Class | |
| Baseline + BB + Class | |

Test

| Configuration | Class (Accuracy) |
| --- | --- |
| Baseline + Class | |
| Baseline + BB + Class | |

Introduction and Minutes

Hi guys,

When committing, please reference the correct issue by adding #i to the commit message, where i is the relevant issue number. For example, to refer to the issue pertaining to Useful Links, I would write #1.

Please make sure you 'watch' the repository to receive emails / updates when changes are made. You can do this by clicking the eye icon, at the top of the repo.
[Screenshot: the 'watch' (eye) icon at the top of the repository, 2021-12-17]

Also feel free to edit / update / add issues as you see fit.

Project directory and structure:

  • FILL THIS, MAKE SURE REQS MEET THAT OF YIPENG
  • ADD NOTES ABOUT BRANCHES

Report

Documentation for the report

Please feel free to add comments about sections of the report. If adding a draft version of the report to GitHub, please make sure that you reference it using the issue number, #5.

The report can be found here.

If you want to find resources to help with writing the report (for example, how to draw neural network diagrams), see #1.

Benchmark classification performance

According to @ConnorWatts, the MTL classification task does not perform very well and drags the overall model down. This observation was based on the MTL model with only the classification head.

My experiments

NOTE: accuracy values are missing at the moment; only the loss was measured. Also note that for all of these experiments the new dataset is used, not the old one (see #6 for more details). This is something we need to standardise across the board.

1) Using ClassificationHead

  • Used only 3 epochs due to computational constraints. Got an average BinaryCrossEntropy loss of ~0.67.

2) Using ClassificationHeadUnet

  • Instead of applying the classification directly after the encoding stage, the data is decoded using the UNet structure, with skips from the encoding stage included. The classification is then applied at the final stage.
  • Used 3 epochs. The decoding stage is identical to that of the image segmentation (but of course this is a separate head, so decoding weights aren't shared!). After the decoding, a global average pool is added, followed by a dense layer. A rough sketch of this head is included below.
  • The BinaryCrossEntropy loss is 0.64... so the improvement is marginal.
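
A rough sketch of such a head (the decoder module and layer sizes are placeholders; this is not the repository's ClassificationHeadUnet implementation):

```python
import torch.nn as nn

class ClassificationHeadUnetSketch(nn.Module):
    # Decode with a UNet-style decoder (skip connections handled inside `decoder`),
    # then apply global average pooling and a dense layer for the class prediction.
    def __init__(self, decoder, decoder_out_channels, num_classes=1):
        super().__init__()
        self.decoder = decoder
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(decoder_out_channels, num_classes)

    def forward(self, encoder_features):
        x = self.decoder(encoder_features)   # UNet-style decoding with skips
        x = self.pool(x).flatten(1)          # global average pool
        return self.fc(x)                    # logits for a BCE-with-logits loss
```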

3) Benchmark classification accuracy using an SVM model

  • With 500 data points, the classification accuracy is 0.678 (note this is classification accuracy on the TEST set).
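
For reference, an SVM baseline of that kind can be sketched as follows; the feature preparation, kernel, and split are assumptions, not the exact setup behind the 0.678 figure.

```python
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# `images` and `labels` are placeholder arrays (flattened images, binary labels).
X = images.reshape(len(images), -1)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, train_size=500, random_state=0
)

clf = SVC(kernel="rbf").fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```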

Run tests on the coursework environment

  • Make sure that the code is packaged in the correct form
  • Make sure that all dependencies are included and references are correct
  • Make sure it runs on the specified environment

Re-design lambdas to make sense w.r.t. the OEQ

Currently, the way the weights are specified is arbitrary. We can choose any lambda values we like, without any constraints on them.
For example, one configuration has lambdas = [1, 1/100, 1/10000].

This can make sense, e.g. if the loss functions we choose are of O(1), O(100) and O(10000) respectively. However, in my opinion this choice is extremely arbitrary (note it isn't invalid per se). We need to choose losses that are all of the same order, and then apply lambdas such that sum(lambdas) = 1, thereby constraining the problem; a sketch is included below. Without this, I'm not sure the OEQ makes sense.
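
Concretely, the constrained weighting would look roughly like this (placeholder per-task loss names):

```python
import torch

# Losses assumed to be on the same scale; lambdas normalised so they sum to 1.
lambdas = torch.tensor([0.5, 0.25, 0.25])
lambdas = lambdas / lambdas.sum()

total_loss = (
    lambdas[0] * seg_loss     # placeholder per-task losses
    + lambdas[1] * bb_loss
    + lambdas[2] * class_loss
)
```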

After a quick search during our meeting today I found these:

Both seem to go for arbitrary losses. I would prefer a solution where the losses are of the same scale, allowing us to apply the constraint above. It would make more sense from an OEQ point of view.

What are your thoughts?
