OOM about stablesr · 4 comments · CLOSED

iceclear commented on August 27, 2024
OOM

Comments (4)

zcdliuwei commented on August 27, 2024

I will try decreasing the sampling steps. That will undoubtedly accelerate inference,
but it may not be the preferred solution if the final SR quality has to be preserved.
As for distributed inference, I currently have only one A100 server.

This is the only SR method I have seen that supports any input size and any upscale factor, suits both in-the-wild and AIGC images, and has almost the best results.

Thanks again for your amazing work. I will continue to pay attention to this issue.

IceClear commented on August 27, 2024

Hi, thanks for your interest.
Large-resolution results require a huge amount of GPU memory because sr_val_ddpm_text_T_vqganfin_oldcanvas.py decodes the whole latent code at once for the final output. This avoids border artifacts, but 32 GB of memory can only handle roughly 2K resolution at most.
For your case, you just need to switch to sr_val_ddpm_text_T_vqganfin_oldcanvas_tile.py, which uses a chop operation to generate the image part by part to save memory, though this may introduce border artifacts. We actually used that script to generate the SR result of this image as well.
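A minimal sketch of the chop/tiling idea, for intuition only: the names (decode_fn, tile, overlap, scale) are illustrative and not the actual StableSR API, and the real tile script uses its own aggregation scheme. The point is that peak memory scales with the tile size rather than the full image.

```python
# Illustrative sketch: decode a large latent tile by tile instead of all at once.
# decode_fn is a hypothetical stand-in for the decoder call; latent, decode_fn
# outputs, and the accumulators are assumed to live on the same device.
import torch

def chop_decode(latent, decode_fn, tile=64, overlap=16, scale=8):
    """latent: (1, C, H, W) latent code; returns a (1, 3, H*scale, W*scale) image."""
    _, _, H, W = latent.shape
    out = torch.zeros(1, 3, H * scale, W * scale)
    weight = torch.zeros(1, 1, H * scale, W * scale)
    stride = tile - overlap
    for top in range(0, H, stride):
        for left in range(0, W, stride):
            bottom, right = min(top + tile, H), min(left + tile, W)
            top0, left0 = max(bottom - tile, 0), max(right - tile, 0)
            patch = latent[:, :, top0:bottom, left0:right]
            img = decode_fn(patch)        # peak memory now depends on the tile size only
            ys = slice(top0 * scale, bottom * scale)
            xs = slice(left0 * scale, right * scale)
            out[:, :, ys, xs] += img      # overlapping regions are averaged below,
            weight[:, :, ys, xs] += 1.0   # which softens (but may not remove) seams
    return out / weight
```

Averaging the overlaps is the simplest possible blend; the actual script may weight tiles differently, which is one reason mild border artifacts can still show up.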

zcdliuwei commented on August 27, 2024

Yes.
When I switched to sr_val_ddpm_text_T_vqganfin_oldcanvas_tile.py, I got the expected output. It was perfect, and almost no boundary artifacts can be seen!

Although I don't have many test cases, I believe this script can yield reasonable super-resolution results in the vast majority of cases, with almost no boundary artifacts.

The only problem now is that the inference time is too long; the 4K example above took more than an hour in total. Is there any room to optimize the inference time, or did I use your script incorrectly?
Looking forward to your reply.

IceClear commented on August 27, 2024

The inference time can be very long for large resolutions, and sometimes we observe boundary artifacts; it depends on the content.
Currently, we have not added any inference acceleration.
I tried DDIM earlier, but it produced weird results, so I gave up. Other acceleration techniques may still work.
In your case, I think you can decrease the number of sampling steps. It is 200 by default, but reducing it to 50 or even 20 can sometimes still give reasonably good results, though some details tend to be a bit blurrier than with the default setting. This is expected for diffusion models.
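For intuition on why fewer steps helps (and why it costs detail): inference time is roughly proportional to the number of denoising steps, since each step is one network pass over every tile. A generic sketch of how a shorter schedule is usually built by subsampling the training timesteps, not the repository's actual scheduler code:

```python
# Generic illustration (not StableSR's actual scheduler): a shorter sampling
# schedule is typically an evenly spaced subset of the training timesteps.
import numpy as np

def make_timesteps(num_train_steps=1000, num_sampling_steps=200):
    # timesteps run from most noisy (high t) down to clean (t = 0)
    return np.linspace(num_train_steps - 1, 0, num_sampling_steps).round().astype(int)

# 200 -> 20 steps means ~10x fewer denoising passes per tile,
# at the cost of slightly blurrier fine detail.
print(len(make_timesteps(num_sampling_steps=200)), len(make_timesteps(num_sampling_steps=20)))
```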

BTW, another thing you can try, if you are interested, is adding multi-GPU support. Although the batch size is 1, since we divide the image into multiple tiles, it is still possible to process them separately. Just make sure they run under the same seed.
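A rough sketch of that idea, with the assumptions made explicit: process_tile is a hypothetical stand-in for the per-tile diffusion-plus-decode step, tiles is a list of latent tiles on the CPU, and one worker thread drives each GPU. This is not code from the repository.

```python
# Hypothetical sketch: round-robin independent tiles across GPUs, one thread per device.
import torch
from concurrent.futures import ThreadPoolExecutor

def restore_tiles_multi_gpu(tiles, process_tile, num_gpus, seed=42):
    torch.manual_seed(seed)        # seeds the CPU and every CUDA device identically
    outputs = [None] * len(tiles)

    def run_shard(rank):
        device = torch.device(f"cuda:{rank}")
        for idx in range(rank, len(tiles), num_gpus):   # this GPU's share of tiles
            outputs[idx] = process_tile(tiles[idx].to(device)).cpu()

    with ThreadPoolExecutor(max_workers=num_gpus) as pool:
        list(pool.map(run_shard, range(num_gpus)))
    return outputs                 # tile order is preserved for re-assembly
```

Threads suffice here because each GPU runs its own work inside CUDA kernels; a process-per-GPU version would also do, as long as every worker uses the same seed.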
