Question for Finetuning about stablesr [CLOSED]

iceclear commented on August 27, 2024
Question for Finetuning

from stablesr.

Comments (17)

xyIsHere commented on August 27, 2024

Thank you so much. I will keep running the experiments and let you know if there is any update.

IceClear commented on August 27, 2024

Hi. Without the training settings and figures for comparison, it is hard to tell what the problem is. Maybe the test data distribution is very different from the validation one.

xyIsHere commented on August 27, 2024

Thanks for your very quick response. I got results similar to those in issue #26.

The following zip file contains the training config. I just followed your suggestion to reset the model path.
2023-06-25T16-19-48-project.zip

And here are the results (right: result; left: input). The bottom one is a sample from the validation set, which was also used for training. The top one does differ somewhat from the input, but the result is still very far from what the model you provided produces.
image
image

Thanks!

IceClear commented on August 27, 2024

Hi.
First, the input is different. My result was generated from an image resized to 128x128, while it seems that you directly used the original 720x720 image.
Second, my results were generated with cfw weight = 0.5.
Third, my model was trained on 8 V100 GPUs, so the training batch size (48x4) is probably much larger than yours.
From my experience, you can train longer for better results.

xyIsHere commented on August 27, 2024

Thanks! By the way, how long did the fine-tuning stage take? Maybe just around 24 hours?

IceClear commented on August 27, 2024

For several days. The longer the better. One week should be enough.

xyIsHere commented on August 27, 2024

> Hi.
> First, the input is different. My result was generated from an image resized to 128x128, while it seems that you directly used the original 720x720 image.
> Second, my results were generated with cfw weight = 0.5.
> Third, my model was trained on 8 V100 GPUs, so the training batch size (48x4) is probably much larger than yours.
> From my experience, you can train longer for better results.

Thanks a lot for your help. Actually, I have not trained the CFW yet. For now, I just want to make sure my fine-tuning results are reasonable. I think neither the resolution nor the CFW weight should be the reason.
image
As shown in the figure above, I used the same image as input and tested with two different fine-tuned models: stablesr_000117.ckpt, which you provided, and my own model, trained on 4 A100 cards with a batch size of 12 and accumulate_grad_batches of 4. I fine-tuned my model for about 24 hours and got epoch_000131.ckpt.
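
For comparing the two setups, the effective batch size per optimizer step can be worked out as below. This is only a sketch under assumptions: it reads the batch size of 12 as per GPU, and reads the author's "48x4" as 48 samples per step times 4 accumulation steps. Neither reading is confirmed in this thread, and the function name is hypothetical.

```python
def effective_batch_size(per_gpu_batch: int, num_gpus: int, accumulate_grad_batches: int) -> int:
    """Samples contributing to one optimizer step under data-parallel training."""
    return per_gpu_batch * num_gpus * accumulate_grad_batches


# Assumption: batch size 12 is per GPU, on 4 A100 cards, with 4 accumulation steps.
mine = effective_batch_size(per_gpu_batch=12, num_gpus=4, accumulate_grad_batches=4)

# Assumption: "48x4" means 48 samples per step in total, accumulated over 4 steps.
authors = 48 * 4

print(mine, authors)  # → 192 192
```

If both readings hold, the two effective batch sizes match, so batch size alone would not explain the gap. If 12 is instead the total across all cards, the figure drops to 48.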

The command that I used is:

    python scripts/sr_val_ddpm_text_T_vqganfin_old.py --config configs/stableSRNew/v2-finetune-test.yaml --ckpt "./pretrained_models/stablesr_000117.ckpt" --vqgan_ckpt "./pretrained_models/vqgan_cfw_00011.ckpt" --init-img ./inputs/test_example --outdir out_landscape/ --ddpm_steps 200 --dec_w 0.0 --suffix 'stablesr117'

I set dec_w to 0.0, so the result comes from the fine-tuned model alone, without CFW. Do you have any other suggestions for debugging? Or do you think I just need to train for more days?

xyIsHere commented on August 27, 2024

> For several days. The longer the better. One week should be enough.

So the provided model (stablesr_000117.ckpt) was obtained by fine-tuning for around a week? I thought a 117-epoch model would not take that long to train.

IceClear commented on August 27, 2024

An A100 is more than 2x faster than a V100, and I do not remember the exact training time of the 512 model.
It is hard to say whether there is a problem.
You may check the performance of different epochs on real images.
The performance may vary between epochs.
From my experience, training longer does improve the performance.
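
Checking several epochs can be scripted. The sketch below only builds one test command per checkpoint, reusing the script and flags from the command quoted earlier in this thread. The checkpoint file names, the `out_epochs` directory, and the function name are hypothetical.

```python
import shlex
from pathlib import Path


def build_eval_commands(ckpt_paths, outdir_root="out_epochs"):
    """Return one argument list per checkpoint, ready for subprocess.run."""
    commands = []
    for ckpt in ckpt_paths:
        name = Path(ckpt).stem  # e.g. "epoch_000131"
        commands.append([
            "python", "scripts/sr_val_ddpm_text_T_vqganfin_old.py",
            "--config", "configs/stableSRNew/v2-finetune-test.yaml",
            "--ckpt", str(ckpt),
            "--vqgan_ckpt", "./pretrained_models/vqgan_cfw_00011.ckpt",
            "--init-img", "./inputs/test_example",
            "--outdir", f"{outdir_root}/{name}",  # separate folder per epoch
            "--ddpm_steps", "200",
            "--dec_w", "0.0",  # fine-tuned model only, no CFW
            "--suffix", name,
        ])
    return commands


# Hypothetical checkpoint paths; print the commands instead of running them.
for cmd in build_eval_commands(["logs/epoch_000100.ckpt", "logs/epoch_000131.ckpt"]):
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to actually run the test, which keeps the outputs of different epochs in separate folders for side-by-side comparison.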

xyIsHere commented on August 27, 2024

Dear author,
Could you also show me how to use the thop package to print the params and FLOPs of StableSR? I tried to do this with the test script (vqganfin_old.py) but have not succeeded yet. Thanks!

xyIsHere commented on August 27, 2024

> Dear author,
> Could you also show me how to use the thop package to print the params and FLOPs of StableSR? I tried to do this with the test script (vqganfin_old.py) but have not succeeded yet. Thanks!

I finally solved this problem.
image
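
The screenshot with the solution is not preserved in this copy. For readers with the same question, here is a minimal sketch of how `thop.profile` is commonly called. It assumes `torch` and `thop` are installed and uses a toy network. The helper names are hypothetical, and the inputs that the StableSR model's forward pass expects are not shown in this thread, so they would need to be taken from the test script. Note that thop reports multiply-accumulate operations (MACs) rather than FLOPs.

```python
def human_readable(count: float) -> str:
    """Format a raw count such as 1.5e9 as '1.50G'."""
    for unit, scale in (("G", 1e9), ("M", 1e6), ("K", 1e3)):
        if count >= scale:
            return f"{count / scale:.2f}{unit}"
    return f"{count:.0f}"


def profile_model(model, example_inputs):
    """Return (MACs, parameter count) for one forward pass of `model`.

    `example_inputs` must be a tuple holding the positional arguments
    of `model.forward`.
    """
    # Imported lazily so this helper can be loaded without thop installed.
    from thop import profile

    macs, params = profile(model, inputs=example_inputs, verbose=False)
    return macs, params


def demo():
    """Profile a toy CNN that stands in for the real network."""
    import torch
    from torch import nn

    toy = nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Conv2d(8, 3, kernel_size=3, padding=1),
    )
    macs, params = profile_model(toy, (torch.randn(1, 3, 64, 64),))
    print("MACs:", human_readable(macs), "params:", human_readable(params))
```

Calling `demo()` runs the toy example. For the real model, the network would be wrapped in an `nn.Module` whose `forward` takes exactly the tensors in `example_inputs`.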

xyIsHere commented on August 27, 2024

Dear author,
The training log is attached here. I'm wondering whether it differs from yours in any way.
train_reproduce_4card_bs12.log

BobbyZ04 commented on August 27, 2024

Hi, may I ask how you generated the latents for the second-stage training? Are they supposed to be 4D? I got an error saying the dimension is incorrect, like this:
image

The latent shape I checked is:
image

This is where I generated them:
image

Thank you
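
On the 4D question: Stable Diffusion style autoencoders typically produce latents with 4 channels at 1/8 of the image resolution, and a batched tensor is laid out as (batch, channels, height, width). The sketch below is a shape check under those assumptions, not the repository's code. The function names are hypothetical, and the defaults of 4 channels and a downsampling factor of 8 should be verified against the config in use.

```python
def expected_latent_shape(batch, height, width, channels=4, downsample=8):
    """Shape of a batched latent for images of size height x width."""
    if height % downsample or width % downsample:
        raise ValueError("image size must be divisible by the downsampling factor")
    return (batch, channels, height // downsample, width // downsample)


def describe_latent(shape):
    """Give a rough diagnosis of a latent's rank from its shape tuple."""
    if len(shape) == 4:
        return "4D (batch, channels, height, width)"
    if len(shape) == 3:
        return "3D: probably missing the batch dimension"
    return f"unexpected rank {len(shape)}"


print(expected_latent_shape(1, 512, 512))  # → (1, 4, 64, 64)
print(describe_latent((4, 64, 64)))        # → 3D: probably missing the batch dimension
```

A saved latent with an extra leading dimension (rank 5) or a missing one (rank 3) would both trip a layer that expects 4D input, so comparing the stored shape against the expected one is a quick first check.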

BobbyZ04 commented on August 27, 2024

Thanks a lot!

xyIsHere commented on August 27, 2024

> Hi, may I ask how you generated the latents for the second-stage training? Are they supposed to be 4D? I got an error saying the dimension is incorrect, like this: image
>
> The latent shape I checked is: image
>
> This is where I generated them: image
>
> Thank you

So far I have only run the fine-tuning experiments and haven't trained the CFW, since I found that my fine-tuning results are not good enough to train the CFW. How about your fine-tuning results? For training the CFW, I saw that there is an issue, #28, that might help you.

BobbyZ04 commented on August 27, 2024

> > Hi, may I ask how you generated the latents for the second-stage training? Are they supposed to be 4D? I got an error saying the dimension is incorrect, like this: image
> > The latent shape I checked is: image
> > This is where I generated them: image
> > Thank you
>
> So far I have only run the fine-tuning experiments and haven't trained the CFW, since I found that my fine-tuning results are not good enough to train the CFW. How about your fine-tuning results? For training the CFW, I saw that there is an issue, #28, that might help you.

I think they make sense, but yeah, they are different from the author's results. Hmm, maybe you can try fixing the seeds for inference to check whether the fine-tuning is successful? https://huggingface.co/docs/diffusers/using-diffusers/reproducibility
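
A minimal sketch of fixing seeds before inference, assuming a PyTorch-based pipeline. The `seed_everything` helper here is a hypothetical local function, not the repository's or any library's. NumPy and PyTorch are seeded only if they are installed.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed Python's RNG, plus NumPy and PyTorch when they are available."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # safe to call even without a GPU
    except ImportError:
        pass


# Re-seeding with the same value reproduces the same random draw.
seed_everything(42)
first = random.random()
seed_everything(42)
second = random.random()
print(first == second)  # → True
```

Calling this once before sampling with each checkpoint keeps the initial noise the same, so differences in the outputs come from the weights. Seeding alone may still not give bit-identical results across different GPUs or library versions.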

xyIsHere commented on August 27, 2024

> > > Hi, may I ask how you generated the latents for the second-stage training? Are they supposed to be 4D? I got an error saying the dimension is incorrect, like this: image
> > > The latent shape I checked is: image
> > > This is where I generated them: image
> > > Thank you
> >
> > So far I have only run the fine-tuning experiments and haven't trained the CFW, since I found that my fine-tuning results are not good enough to train the CFW. How about your fine-tuning results? For training the CFW, I saw that there is an issue, #28, that might help you.
>
> I think they make sense, but yeah, they are different from the author's results. Hmm, maybe you can try fixing the seeds for inference to check whether the fine-tuning is successful? https://huggingface.co/docs/diffusers/using-diffusers/reproducibility

I'm wondering if it is possible to share one example that you generated using only the fine-tuned model? Thanks a lot!
