ResNet structure (pytorch-meta-dataset, OPEN)

mboudiaf commented on September 27, 2024
ResNet structure

from pytorch-meta-dataset.

Comments (5)

chmxu commented on September 27, 2024

Also, when I use the default script for episodic training, RAM usage increases dramatically during training: it reaches about 100 GB after about 300 iterations. I don't know if this is reasonable.

mboudiaf commented on September 27, 2024

Hi,

Thanks for raising this issue. Let me investigate both problems and get back to you ASAP.

Update:

  1. Could you try again and let me know if the RAM problem is solved?
  2. As for the ResNet structure, there is indeed a discrepancy in the literature between ResNet-18 (implemented in my code) and the custom ResNet-12 used in several few-shot works. I will add the latter architecture soon.

chmxu commented on September 27, 2024

Hi, thank you for your reply!
Starting from your new version, I modified the training script to skip the model's forward and backward passes, so that it only iterates over the dataloader and prints the memory usage, as follows:

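    import psutil  # third-party: pip install psutil

    # tqdm_bar and args come from the training script; the model's
    # forward and backward passes are skipped on purpose.
    for i, data in enumerate(tqdm_bar):
        if i >= args.num_updates:
            break

        # System-wide RAM usage, in percent
        print("PERCENTAGE RAM USED", psutil.virtual_memory().percent)
        continue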

In my trial, the percentage of used memory keeps increasing. I think there may be a memory leak when reading the tfrecord files, but I cannot pin it down.
My PyTorch version is 1.9.0, with CUDA 11.1. Maybe you can try my code to see if you can reproduce the problem.
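
One way to tell a slow leak from normal warm-up is to record a reading every few iterations and fit a line through the readings. The sketch below is only an illustration: `sample_memory` and `growth_per_iteration` are hypothetical helper names, not part of pytorch-meta-dataset. The memory reader is passed in as a function, so a per-process reading such as `psutil.Process().memory_info().rss` could be used instead of the system-wide percentage.

```python
from typing import Callable, Iterable, List, Tuple


def sample_memory(loader: Iterable, read_memory: Callable[[], float],
                  num_updates: int, every: int = 100) -> List[Tuple[int, float]]:
    """Iterate the loader without training and record (iteration, reading) pairs."""
    samples = []
    for i, _ in enumerate(loader):
        if i >= num_updates:
            break
        if i % every == 0:
            samples.append((i, read_memory()))
    return samples


def growth_per_iteration(samples: List[Tuple[int, float]]) -> float:
    """Least-squares slope of reading vs. iteration; near zero suggests a plateau."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    den = sum((x - mean_x) ** 2 for x, _ in samples)
    return num / den


if __name__ == "__main__":
    # Stand-in loader and reader (both hypothetical): the reading grows by
    # 0.5 units each time it is sampled.
    counter = iter(range(1000))
    fake_reader = lambda: 10.0 + 0.5 * next(counter)
    samples = sample_memory(range(1000), fake_reader, num_updates=500, every=100)
    print("growth per iteration:", growth_per_iteration(samples))
```

In the loop above, `loader` would be `tqdm_bar` and `num_updates` would be `args.num_updates`. A slope that stays clearly positive over a long run would be consistent with a leak, while a slope that falls toward zero would suggest the buffers have simply filled.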

mboudiaf commented on September 27, 2024

I tried my new code before pushing and saw no memory leak. With my PyTorch loader, RAM usage capped at 16.5 GB. Could you please check whether you see any leak when running my original code with:

    bash scripts/train.sh protonet resnet18 ilsvrc_2012

Thanks.

chmxu commented on September 27, 2024

I re-cloned the repo and trained ProtoNet with your original code. After 1400 iterations, 23 GB of RAM is in use. When I train the model with 4 GPUs (by modifying the GPU configuration in base.yaml), about 80 GB of RAM is in use at 1100 iterations. Usage keeps increasing slowly in both cases.

I assume the RAM usage grows with the number of GPUs (since DDP is used) and with the size of an episode. So when episodes are large, which is exactly the case in Meta-Dataset, where the largest support set can contain 500 images, and when I want to use multiple GPUs, the code may use an extremely large amount of RAM. Is there any solution to this problem?
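
To make that scaling assumption concrete, here is a rough back-of-envelope sketch. Only the 500-image support set comes from this thread; the image side, worker count, and prefetch depth are hypothetical values chosen for illustration, and the function names are made up. It assumes one process per GPU, each with its own dataloader workers, and counts only decoded float32 image tensors held in prefetch buffers.

```python
def episode_bytes(num_images: int, side: int, channels: int = 3,
                  bytes_per_value: int = 4) -> int:
    """Size of one episode's images as a dense float32 tensor, in bytes."""
    return num_images * channels * side * side * bytes_per_value


def buffered_bytes(per_episode: int, num_gpus: int, workers_per_process: int,
                   prefetch_per_worker: int) -> int:
    """Bytes held in prefetch buffers, assuming one process per GPU."""
    return per_episode * num_gpus * workers_per_process * prefetch_per_worker


if __name__ == "__main__":
    # 500 images is the largest support set mentioned above; everything
    # else here is an assumed, illustrative value.
    per_episode = episode_bytes(num_images=500, side=126)
    total = buffered_bytes(per_episode, num_gpus=4,
                           workers_per_process=4, prefetch_per_worker=2)
    print(f"per episode: {per_episode / 1e9:.3f} GB")
    print(f"buffered:    {total / 1e9:.3f} GB")
```

An estimate like this only bounds the steady-state buffers. Usage that keeps climbing well past such a bound would suggest that something other than episode size and GPU count is involved.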
