
Comments (7)

illtellyoulater commented on May 29, 2024

I fixed it!

Apparently the problem was that the CUDA toolkit version I was installing with the conda package (11.3) was not recent enough to match my video drivers.

So I uninstalled the conda packages for torch, torchvision and cudatoolkit, and installed torch and torchvision via pip, taking care to pick builds that bundle a more recent CUDA toolkit version (11.5):

pip3 install torch==1.11.0+cu115 -f https://download.pytorch.org/whl/torch_stable.html
pip3 install torchvision==0.12.0+cu115 -f https://download.pytorch.org/whl/torch_stable.html
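
For anyone wanting to double-check which CUDA toolkit a wheel bundles, here is a minimal sketch. It assumes the version string keeps the `+cuNNN` suffix seen in the commands above; `cuda_version_from_tag` is a hypothetical helper written for illustration, not part of PyTorch. With torch installed, `torch.version.cuda` and `torch.cuda.is_available()` report this directly.

```python
import re


def cuda_version_from_tag(version):
    """Return the CUDA version encoded in a wheel version string such as
    '1.11.0+cu115', or None if there is no '+cuNNN' suffix.

    Hypothetical helper: assumes the last digit of the tag is the minor version.
    """
    match = re.search(r"\+cu(\d{2,})$", version)
    if match is None:
        return None
    digits = match.group(1)
    return f"{digits[:-1]}.{digits[-1]}"


# The two wheels installed above should agree on the bundled toolkit.
print(cuda_version_from_tag("1.11.0+cu115"))
print(cuda_version_from_tag("0.12.0+cu115"))
```

If the two values differ, or the result is None (for example a `+cpu` build), the torch and torchvision installs are probably mismatched.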

Now I can happily dream! ;)
Thanks again for your help and to all those who have worked on this!

Btw, the other project that was also generating black images (glide-text2im) did not benefit from this fix, so that was just a coincidence! It has to be caused by something else...

from big-sleep.

MrPalais commented on May 29, 2024

@illtellyoulater I had to reinstall numpy 1.22.3, but your trick works, thanks a lot :)


htoyryla commented on May 29, 2024

"the generated images are completely black (although the inference process seems to run nicely and without errors)"

Most often this happens because, when processing images with Python, pixel values can be represented either as 0..1 floats or as 0..255 integers. If the generated image is in 0..1 but the library used to store it to a file expects 0..255, the result is a black image.

Some libraries are clever enough to adapt to the correct range based on the type (integer or float), but often they are not. It could even be that the Windows implementation of a library behaves differently.
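
The range mismatch described above can be reproduced with a minimal sketch in plain Python. Both function names are made up for illustration and do not come from any image library; real libraries work on arrays, but the arithmetic is the same.

```python
def to_uint8_naive(pixels):
    """Cast floats straight to ints, as a writer expecting 0..255 input might."""
    return [int(p) for p in pixels]


def to_uint8_scaled(pixels):
    """Scale 0..1 floats to 0..255 first, clamping anything out of range."""
    return [max(0, min(255, round(p * 255))) for p in pixels]


row = [0.0, 0.25, 0.5, 0.99]   # one row of a 0..1 float image
print(to_uint8_naive(row))     # all zeros: black on a 0..255 scale
print(to_uint8_scaled(row))    # values spread over the 0..255 range
```

Checking the minimum and maximum of the image right before saving is usually enough to tell which range it is in.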

Just my 2c worth.


illtellyoulater commented on May 29, 2024

@htoyryla thanks so much for the input!
Based on your reasoning, the first thing I did was inspect the image variable that is later saved to an image file... and apparently it contains nothing but nan values, like so:

print(image)

tensor([[[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         ...,
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan]],

        [[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         ...,
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan]],

        [[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         ...,
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan]]])
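
Since the printed tensor elides most entries, a count is more reliable than eyeballing the output. Below is a minimal sketch in plain Python; `nan_fraction` is a hypothetical helper that works on nested lists (for a torch tensor, `image.tolist()` produces such a list, and `torch.isnan(image).float().mean()` gives the same fraction directly).

```python
import math


def nan_fraction(values):
    """Return the fraction of NaN entries in an arbitrarily nested list of floats."""
    total = 0
    nans = 0
    stack = [values]
    while stack:
        item = stack.pop()
        if isinstance(item, (list, tuple)):
            stack.extend(item)          # descend into nested rows/channels
        else:
            total += 1
            nans += math.isnan(item)    # True counts as 1
    return nans / total if total else 0.0


nan = float("nan")
print(nan_fraction([[nan, nan], [nan, nan]]))   # fully broken image
print(nan_fraction([[0.5, nan], [1.0, 0.0]]))   # partially broken image
```

A fraction of 1.0 means every pixel is nan, which points at the model output rather than at the file-saving step.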

image is created like so:
image = self.model.model()[best].cpu()
(in big_sleep.py, line 464)

So I wonder if this line could be causing any problem?
I admit I have close to zero knowledge about Torch or other ML libraries... I am actually a total newbie at ML in general, so all I can do is basically try to draw some attention to some potentially problematic code, like I did, but that's all...

I really hope this could be of some help to you or to someone else who knows more than me and could actually try to understand what's going on...

Another super simple speculation I could make: the image file is saved using the torchvision.utils.save_image(...) function, and I am skeptical that such an important and popular library would suffer from the kind of int/float inconsistency you described in your reply... which would narrow the potentially problematic code down to just big_sleep.py.

But if there's a problem in that file, then how could I be the only one experiencing it given there are many other Windows users?
It's so frustrating... :\


htoyryla commented on May 29, 2024

If image is full of nans then it is definitely not the 1 vs 255 problem, and torchvision for sure works correctly given a torch tensor. Something runs amok in the model itself. I don't have time to look deeper; it has been a year since I last used this project.

But what I'd look at in this case... The process is iterative, so I'd look at the image as it evolves. An iterative process like this can run amok at some point, for instance if the learning rate is too high, and then you may get nans.
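
To illustrate how an iterative process can run amok, here is a toy sketch: plain gradient descent on f(x) = x**2, not the big-sleep optimizer. The function name and the learning rates are made up for the example. It reports the first step at which the value stops being finite, which is the same kind of check one could apply to the image at each iteration.

```python
import math


def first_nonfinite_step(lr, steps, x0=1.0):
    """Run gradient descent on f(x) = x**2 and return the first step at which
    x is no longer finite (inf or nan), or None if it stays finite."""
    x = x0
    for step in range(1, steps + 1):
        grad = 2.0 * x              # derivative of x**2
        x = x - lr * grad           # standard gradient descent update
        if not math.isfinite(x):
            return step
    return None


print(first_nonfinite_step(lr=0.1, steps=2000))   # small step size: stays finite
print(first_nonfinite_step(lr=1.5, steps=2000))   # too large: eventually overflows
```

With the larger learning rate each update overshoots and doubles the magnitude of x, so the value eventually overflows; once an inf appears, later arithmetic such as inf minus inf produces nan.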

But my perspective is different, being an AI artist working with my own code, so I run into these situations all the time and have to solve them myself.


illtellyoulater commented on May 29, 2024

Hey, hold on, look at this!
Another ML project is joining the "black images" party...

In fact I've just found out that in my case glide-text2im is also generating black images!!! 😮

Now this is starting to get a little weird, isn't it?


illtellyoulater commented on May 29, 2024

@MrPalais glad I could help :)

