GPU Support (stemroller) - 16 comments - closed

stemrollerapp commented on June 10, 2024
GPU Support

Comments (16)

Arecsu commented on June 10, 2024

UPDATE: success!!! Super fast. Really fast. My CPU takes like 4-7 minutes to process a song. I have an 8-core AMD Ryzen 7 3700X @ 4.2 GHz.

And with the GPU (RTX 2070 Super) it takes like 20 seconds 🎉

demucs-cxfreeze

  • Used pyenv and virtualenv to manage a clean installation, like in your repo
  • Python 3.10.6
  • pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116 demucs SoundFile cx-Freeze
  • Rest of the steps as described in your repo

No changes to the app. Just replaced the demucs-cxfreeze 📁 folder. It's very heavy on file size though, adding +3 GB uncompressed.
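
In case anyone wants to reproduce this, the steps above boil down to roughly the following (just a sketch; the environment name is made up, and pyenv virtualenv assumes the pyenv-virtualenv plugin):

pyenv install 3.10.6
pyenv virtualenv 3.10.6 demucs-cxfreeze-cuda   # env name is just an example
pyenv activate demucs-cxfreeze-cuda
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116 demucs SoundFile cx-Freeze
# ...then the rest of the demucs-cxfreeze build steps from the repo,
# and replace StemRoller's bundled demucs-cxfreeze folder with the new build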

If a StemRoller release is going to be made compatible with NVIDIA GPUs, we should keep this in mind (from the demucs repo):

If you want to use GPU acceleration, you will need at least 3GB of RAM on your GPU for demucs. However, about 7GB of RAM will be required if you use the default arguments. Add --segment SEGMENT to change size of each split. If you only have 3GB memory, set SEGMENT to 8 (though quality may be worse if this argument is too small). Creating an environment variable PYTORCH_NO_CUDA_MEMORY_CACHING=1 can help users with even smaller RAM such as 2GB (I separated a track that is 4 minutes but only 1.5GB is used), but this would make the separation slower.

If you do not have enough memory on your GPU, simply add -d cpu to the command line to use the CPU. With Demucs, processing time should be roughly equal to 1.5 times the duration of the track.
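
For reference, those options map onto the demucs command line roughly like this (a sketch based only on the flags quoted above; song.mp3 is a placeholder):

demucs -d cuda --segment 8 song.mp3                        # smaller segments for a ~3GB GPU
PYTORCH_NO_CUDA_MEMORY_CACHING=1 demucs -d cuda song.mp3   # for ~2GB GPUs; slower separation
demucs -d cpu song.mp3                                     # fall back to CPU if GPU memory isn't enough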

iffyloop commented on June 10, 2024

Thank you so much for testing this! I merged your PR into demucs-cxfreeze and will try to upload a new build with CUDA support by the end of the weekend. Once that's done, I'll make a corresponding new version of StemRoller with the updated demucs-cxfreeze bundle.

iffyloop commented on June 10, 2024

Cool. Yeah it's a little more complex than just changing StemRoller's code - actually the demucs-cxfreeze package needs to be refrozen with CUDA PyTorch instead of the CPU one. Shouldn't be too difficult but just takes some time. I'll send you a new build when I get a chance! Thanks for looking into it.

iffyloop commented on June 10, 2024

Thanks @aleph23 and @rdeavila for the suggestions. If you think users would be comfortable with manually installing it, I can definitely add a section to the README about how to configure it. I'll see if I can release just the GPU-specific components as sort of a "patch" that could just be unzipped into the main app directory.

iffyloop commented on June 10, 2024

If any of you have a PyTorch CUDA-compatible GPU and want to test out the latest update to the develop branch, please let me know if it works for you (and if not, what errors you see as output) and how long it takes to split with GPU enabled. I don't have a recent enough GPU on my device to test it, so it'd be helpful if I could get confirmation from someone else. Maybe @Arecsu, since you seemed to have success doing it yourself earlier?

iffyloop commented on June 10, 2024

Glad to hear it worked! Thanks so much for testing. Yes, this is using the latest Demucs with the htdemucs_ft model, which should be the best. You can check the console output (when running in dev mode) to make sure it is using the demucs-cxfreeze from this repo instead of a Demucs installed on your system (but that should be forced by the way the env vars are set for the child process). And the goal of compressing to a 7z SFX was so that it can be uploaded to GitHub Releases, so the plan is for this to be the next release!

Arecsu commented on June 10, 2024

Yes. It is using htdemucs_ft indeed. Looking awesome already. Thank you so much!

iffyloop commented on June 10, 2024

New version is out now! Download from https://github.com/stemrollerapp/stemroller/releases

iffyloop commented on June 10, 2024

That's a great question. I could theoretically implement this feature, but I don't have a GPU that's recent enough to use with PyTorch (Demucs' ML backend), so I couldn't test it myself. If you'd like, I can try to do an experimental build with GPU support and you can let me know if it works for you or not! However, it might take me a few weeks since I'm currently quite busy with other projects and don't have as much time to focus on StemRoller.

Would you like me to make a GPU-compatible build sometime and send you a message when it's ready to test?

iffyloop commented on June 10, 2024

Unfortunately, demucs-cxfreeze with CUDA is 2.2 GB compressed to ZIP, and GitHub Releases only allows files up to 2 GB, so I'm not sure exactly how I'd be able to distribute this with StemRoller. I'll look into splitting the archive into chunks or finding another place to store it, but it may be quite a while before GPU support is available, as this poses a slight logistical problem.
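
One idea (just a sketch, assuming 7-Zip; the archive name is illustrative) would be to split the archive into volumes that each stay under the limit:

7z a -v1900m demucs-cxfreeze-cuda.7z demucs-cxfreeze-cuda/   # ~1.9GB volumes, each under GitHub's 2GB cap

but then the app would need to download every volume before extracting, which is part of the logistical problem.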

aleph23 commented on June 10, 2024

The HD hog is Torch. Maybe write a script for anyone wanting NVIDIA GPU support (and it IS worth it, it's way faster) to pull the version of Torch with CUDA support themselves. The CLI install is
python.exe -m pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
I know jack about Electron and how it deals with Python dependencies, but if you can offload Torch you'll save ~2 GB compressed (4,502,012,008 bytes on disk).

rdeavila commented on June 10, 2024

Maybe you could write a how-to in the README.md for anyone interested in running StemRoller on NVIDIA GPUs (like me).

Arecsu commented on June 10, 2024

I can test it, sure. So far, I've cloned the develop branch and built it. Small bug here in ResultCard.svelte:

{#if status === 'processing'}
  <Button Icon={LoadingSpinnerIcon} text="Processing" disabled={true} />
{#if status === 'downloading'}
  <Button Icon={LoadingSpinnerIcon} text="Downloading" disabled={true} />
{:else if status === 'queued'}
  <Button Icon={CollectionIcon} text="Queued" disabled={true} />

Pull request → #31

There are two #if. The second one should be :else if :p

I see in your latest commits a GitHub Actions script that should build a CUDA version. At least for me locally, I don't get any automatic PyTorch download during the build process. How can I build one myself? Or is there a build available to download and test?

Edit: figured out I have to run npm run download-third-party-apps and then build it :)
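
For anyone else building locally, the whole sequence is roughly this (a sketch; the final build script name is my guess, check package.json for the real one):

git clone -b develop https://github.com/stemrollerapp/stemroller.git
cd stemroller
npm install
npm run download-third-party-apps   # fetches the bundled demucs-cxfreeze (CUDA) and other third-party tools
npm run build                       # assumed script name; check package.json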

Arecsu commented on June 10, 2024

Everything works great so far! Is it using the newest version of Demucs? The whole package compressed using 7z level 9 LZMA2 is ~1.6 GB. I think it can be uploaded to GitHub Releases.
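
(For reference, that corresponds to roughly this 7-Zip invocation; the archive name is just an example:)

7z a -t7z -mx=9 -m0=lzma2 demucs-cxfreeze-cuda.7z demucs-cxfreeze-cuda/   # 7z format, level 9, LZMA2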
