Dear author, I tried to reproduce your work and I'm currently want t

Question for Finetuning about stablesr HOT 17 CLOSED

iceclear commented on August 27, 2024

Question for Finetuning

from stablesr.

Comments (17)

xyIsHere commented on August 27, 2024

Thank you so much. I will keep running the experiments and let you know if there is any update.

IceClear commented on August 27, 2024

Hi. Without the training settings and the figures for comparison, it is hard to pinpoint the problem. Maybe the test data distribution is very different from the validation one.

xyIsHere commented on August 27, 2024

Thanks for your very quick response. I got results similar to those in issue #26.

The following zip file contains the training config. I just followed your suggestion to reset the model path.
2023-06-25T16-19-48-project.zip

And here are the results (left: input; right: result). The bottom one is a sample from the validation set that was also used for training. The top one does differ somewhat from the input, but the result is very far from what the model you provided produces.

Thanks!

IceClear commented on August 27, 2024

Hi.
First, the input is different. My result was generated from a resized 128x128 image, while it seems that you directly used the original 720x720 image.
Second, my results were generated with a cfw weight of 0.5.
Third, my model was trained on 8 V100 GPUs, so the training batch size, i.e., 48x4, should be much larger than yours, I guess.
From my experience, you can get better results by training longer.

xyIsHere commented on August 27, 2024

Thanks! By the way, how long did the fine-tuning stage take? Maybe just around 24 hours?

IceClear commented on August 27, 2024

For several days. The longer the better. One week should be enough.

xyIsHere commented on August 27, 2024

Thanks a lot for your help. Actually, I have not trained the CFW yet. For now, I just want to make sure my fine-tuning results are reasonable. I don't think either the resolution or the cfw weight is the cause.

As shown in the figure above, I used the same image as input and tested two fine-tuned models: the stablesr_000117.ckpt that you provided, and a model I trained on 4 A100 cards with a batch size of 12 and accumulated_grad_batches of 4. I fine-tuned that model for about 24 hours and got epoch_000131.ckpt.
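To make the batch-size comparison concrete, here is a minimal sketch of the usual effective-batch-size arithmetic for data-parallel training with gradient accumulation. The helper name `effective_batch_size` is hypothetical, and the thread does not say whether the batch size of 12 is per GPU or in total, nor how "48x4" splits across the 8 V100s, so both readings below are assumptions.

```python
def effective_batch_size(num_gpus: int, per_gpu_batch: int, grad_accum: int = 1) -> int:
    """Number of samples that contribute to one optimizer step."""
    return num_gpus * per_gpu_batch * grad_accum


# Reading 1 (assumption): batch size 12 is per GPU, on 4 A100s, with 4 accumulation steps.
per_gpu_reading = effective_batch_size(num_gpus=4, per_gpu_batch=12, grad_accum=4)

# Reading 2 (assumption): batch size 12 is the total across all GPUs.
total_reading = effective_batch_size(num_gpus=1, per_gpu_batch=12, grad_accum=4)

print(per_gpu_reading)  # 192
print(total_reading)    # 48
```

As a plain product, 48x4 is also 192, so under the first reading the two setups would take optimizer steps over the same number of samples. Under the second reading, the reproduction would use a quarter of that.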

The command that I used is "python scripts/sr_val_ddpm_text_T_vqganfin_old.py --config configs/stableSRNew/v2-finetune-test.yaml --ckpt "./pretrained_models/stablesr_000117.ckpt" --vqgan_ckpt "./pretrained_models/vqgan_cfw_00011.ckpt" --init-img ./inputs/test_example --outdir out_landscape/ --ddpm_steps 200 --dec_w 0.0 --suffix 'stablesr117'".

I set dec_w to 0.0, so the result comes from the fine-tuned model alone, without CFW. Do you have any other suggestions for debugging? Or do you think I just need to train for more days?

xyIsHere commented on August 27, 2024

So the provided model (stablesr_000117.ckpt) was obtained by fine-tuning for around a week? I thought a 117-epoch model would not take that much time to train.

IceClear commented on August 27, 2024

An A100 is more than 2x faster than a V100, and I do not remember the exact training time of the 512 model.
It is hard to say whether there is a problem.
You may want to check the performance of different epochs on real images.
The performance may vary across epochs.
From my experience, training longer does improve the performance.
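One way to compare epochs is to run the same inference command over every saved checkpoint. The sketch below only builds the command lines, reusing the script, config, and flags from the command quoted earlier in this thread. The helper names, the `./logs/checkpoints` directory, and the `./out_epochs` output directory are hypothetical, and the `epoch_*.ckpt` pattern is an assumption based on the `epoch_000131.ckpt` file name mentioned above.

```python
import shlex
from pathlib import Path


def build_eval_command(ckpt: Path, outdir: Path) -> list[str]:
    """Build the inference command from this thread for one checkpoint (dec_w 0.0, no CFW)."""
    return [
        "python", "scripts/sr_val_ddpm_text_T_vqganfin_old.py",
        "--config", "configs/stableSRNew/v2-finetune-test.yaml",
        "--ckpt", str(ckpt),
        "--vqgan_ckpt", "./pretrained_models/vqgan_cfw_00011.ckpt",
        "--init-img", "./inputs/test_example",
        "--outdir", str(outdir / ckpt.stem),  # one output folder per checkpoint
        "--ddpm_steps", "200",
        "--dec_w", "0.0",
        "--suffix", ckpt.stem,
    ]


def commands_for_checkpoints(ckpt_dir: Path, outdir: Path) -> list[list[str]]:
    """One command per checkpoint, in sorted (epoch) order."""
    return [build_eval_command(p, outdir) for p in sorted(ckpt_dir.glob("epoch_*.ckpt"))]


# Hypothetical locations; adjust to wherever the training run wrote its checkpoints.
for cmd in commands_for_checkpoints(Path("./logs/checkpoints"), Path("./out_epochs")):
    print(shlex.join(cmd))
```

Printing the commands instead of executing them makes it easy to inspect them first, or to paste them into a job script.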

xyIsHere commented on August 27, 2024

Dear author,
Could you also show me how to use the thop package to print the params and FLOPs of StableSR? I tried to do this with the test script (vqganfin_old.py) but have not succeeded yet. Thanks!

xyIsHere commented on August 27, 2024

I finally solved this problem.

xyIsHere commented on August 27, 2024

Dear author,
The training log is attached here. I'm wondering whether it differs from yours in any way?
train_reproduce_4card_bs12.log

BobbyZ04 commented on August 27, 2024

Hi, may I ask how you generated the latents for the second-stage training? Are they supposed to be 4D? I ask because I got an error saying the dimension is incorrect, like this:

The latent shape I checked is:

This is where I generated them:

Thank you
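The error message and the latent shape are not preserved in this thread, so the following is only a generic diagnostic sketch. It assumes the saved latents are NumPy arrays and that the loader expects a batched (N, C, H, W) layout, neither of which is confirmed here. The helper name `ensure_4d` and the example shape are hypothetical.

```python
import numpy as np


def ensure_4d(latent: np.ndarray) -> np.ndarray:
    """Return the latent as (N, C, H, W), adding a batch axis to a 3D (C, H, W) array."""
    if latent.ndim == 4:
        return latent
    if latent.ndim == 3:
        return latent[np.newaxis, ...]
    raise ValueError(f"expected a 3D or 4D latent, got shape {latent.shape}")


# Hypothetical latent saved without a batch axis.
latent = np.zeros((4, 64, 64), dtype=np.float32)
print(latent.shape)             # (4, 64, 64)
print(ensure_4d(latent).shape)  # (1, 4, 64, 64)
```

Printing the shape right after generation and again right before the failing call usually shows where an axis was dropped or added.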

BobbyZ04 commented on August 27, 2024

Thanks a lot!

xyIsHere commented on August 27, 2024

So far I have only run the fine-tuning experiments and haven't trained the CFW, since my fine-tuning results are not good enough to train the CFW on. How are your fine-tuning results? For training the CFW, I saw issue #28, which might help you.

BobbyZ04 commented on August 27, 2024

I think they make sense, but yeah, they are different from the author's results. Hmm, maybe try fixing the seeds for inference to check whether the fine-tuning was successful? https://huggingface.co/docs/diffusers/using-diffusers/reproducibility
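A minimal sketch of what fixing the seeds can look like is given below. The helper name `seed_everything` is chosen here for illustration. Whether the StableSR test script already seeds its sampling is not stated in this thread, so treat this as a generic pattern to call before inference, not as a description of that script.

```python
import random

import numpy as np


def seed_everything(seed: int) -> None:
    """Seed Python, NumPy, and (if installed) PyTorch random number generators."""
    random.seed(seed)
    np.random.seed(seed)
    try:
        import torch
    except ImportError:
        return  # PyTorch not installed; nothing more to seed
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)


# Two runs with the same seed draw the same random numbers.
seed_everything(42)
first = (random.random(), float(np.random.rand()))
seed_everything(42)
second = (random.random(), float(np.random.rand()))
print(first == second)  # True
```

With the seed fixed, running both checkpoints on the same input means the remaining differences come from the weights, not from the sampling noise. Some GPU kernels are nondeterministic, so small differences between runs can remain.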

xyIsHere commented on August 27, 2024

I'm wondering if it is possible to share one example that you generated using only the fine-tuned model? Thanks a lot!
