Comments (16)
UPDATE: success!!! Super fast. My CPU takes around 4-7 minutes to process a song (it's an 8-core AMD 3700X @ 4.2GHz), and with the GPU (RTX 2070 Super) it takes about 20 seconds.
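For reference, those timings work out to roughly a 12-21x speedup on GPU (a quick back-of-the-envelope check):

```python
# Back-of-the-envelope speedup from the timings reported above:
# 4-7 minutes per song on CPU vs. about 20 seconds on GPU.
cpu_seconds = (4 * 60, 7 * 60)   # 240 s to 420 s
gpu_seconds = 20

speedup = tuple(s / gpu_seconds for s in cpu_seconds)
print(speedup)  # (12.0, 21.0), i.e. roughly 12x to 21x faster
```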
demucs-cxfreeze
- Used pyenv and virtualenv to manage a clean installation, like in your repo
- Python 3.10.6
- pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116 demucs SoundFile cx-Freeze
- Rest of the steps as in your repo

No changes to the app; I just replaced the demucs-cxfreeze folder. It's very heavy on file size though, adding ~3GB uncompressed.
If a StemRoller release is going to be made compatible with Nvidia GPUs, we should be aware of this (from the demucs repo):

If you want to use GPU acceleration, you will need at least 3GB of RAM on your GPU for Demucs. However, about 7GB of RAM will be required if you use the default arguments. Add --segment SEGMENT to change the size of each split. If you only have 3GB of memory, set SEGMENT to 8 (though quality may be worse if this argument is too small). Setting the environment variable PYTORCH_NO_CUDA_MEMORY_CACHING=1 can help users with even less RAM, such as 2GB (I separated a 4-minute track using only 1.5GB), but this makes separation slower.
If you do not have enough memory on your GPU, simply add -d cpu to the command line to use the CPU. With Demucs, processing time should be roughly equal to 1.5 times the duration of the track.
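As a sketch of that guidance (hypothetical helper, not part of StemRoller or Demucs; the thresholds come from the quote above), picking CLI flags from available GPU memory could look like:

```python
# Hypothetical helper: choose extra Demucs CLI flags from the amount of GPU
# memory available, following the thresholds quoted from the demucs README.
def demucs_flags(gpu_mem_gb):
    """Return extra demucs CLI flags for the given GPU memory in GB (None = no GPU)."""
    if gpu_mem_gb is None or gpu_mem_gb < 3:
        # Below 3 GB, the safe route is CPU separation (around 2 GB you could
        # instead try --segment 8 plus PYTORCH_NO_CUDA_MEMORY_CACHING=1).
        return ["-d", "cpu"]
    if gpu_mem_gb < 7:
        # 3-7 GB: shrink each split so it fits in GPU memory.
        return ["--segment", "8"]
    # 7 GB or more: the default arguments are fine.
    return []

print(demucs_flags(8))   # []
print(demucs_flags(4))   # ['--segment', '8']
print(demucs_flags(2))   # ['-d', 'cpu']
```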
from stemroller.
Thank you so much for testing this! I merged your PR into demucs-cxfreeze and will try to upload a new build with CUDA support by the end of the weekend. Once that's done, I'll make a corresponding new version of StemRoller with the updated demucs-cxfreeze bundle.
Cool. Yeah, it's a little more complex than just changing StemRoller's code - actually, the demucs-cxfreeze package needs to be refrozen with CUDA PyTorch instead of the CPU one. It shouldn't be too difficult, but it just takes some time. I'll send you a new build when I get a chance! Thanks for looking into it.
Thanks @aleph23 and @rdeavila for the suggestions. If you think users would be comfortable with manually installing it, I can definitely add a section to the README about how to configure it. I'll see if I can release just the GPU-specific components as sort of a "patch" that could just be unzipped into the main app directory.
If any of you have a PyTorch-CUDA-compatible GPU and want to test out the latest update to the develop branch, please let me know if it works for you (and if not, what errors you see as output) and how long it takes to split with GPU enabled. I don't have a recent enough GPU on my device to test it, so it'd be helpful to get confirmation from someone else. Maybe @Arecsu, since you seemed to have success doing it yourself earlier?
Glad to hear it worked! Thanks so much for testing. Yes, this is using the latest Demucs with the htdemucs_ft model, which should be the best one. You can check the console output (when running in dev mode) to make sure it is using the demucs-cxfreeze from this repo instead of a Demucs installed on your system (though that should be forced by the way the env vars are set for the child process). And the point of compressing to a 7z SFX was that it can now be uploaded to GitHub Releases, so the plan is for this to be the next release!
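An illustrative sketch of that child-process env trick (written in Python for brevity, though the app itself is Electron; the helper name and paths are made up): prepend the bundled demucs-cxfreeze folder to the child's PATH so it wins over any system-wide Demucs.

```python
import os

# Hypothetical helper: build the environment for the separation child process
# so the bundled demucs-cxfreeze is found before any system-wide install.
def child_env(bundle_dir, base_env):
    env = dict(base_env)  # copy, so the parent's environment is not mutated
    env["PATH"] = bundle_dir + os.pathsep + env.get("PATH", "")
    return env

env = child_env("/opt/stemroller/demucs-cxfreeze", {"PATH": "/usr/bin"})
print(env["PATH"].split(os.pathsep)[0])  # /opt/stemroller/demucs-cxfreeze
```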
Yes. It is using htdemucs_ft indeed. Looking awesome already. Thank you so much!
New version is out now! Download from https://github.com/stemrollerapp/stemroller/releases
That's a great question. I could theoretically implement this feature, but I don't have a GPU that's recent enough to use with PyTorch (Demucs' ML backend), so I couldn't test it myself. If you'd like, I can try to do an experimental build with GPU support and you can let me know if it works for you or not! However, it might take me a few weeks since I'm currently quite busy with other projects and don't have as much time to focus on StemRoller.
Would you like me to make a GPU-compatible build sometime and send you a message when it's ready to test?
Unfortunately, demucs-cxfreeze with CUDA is 2.2GB compressed to ZIP, and GitHub Releases only allows files up to 2GB, so I'm not sure exactly how I'd be able to distribute this with StemRoller. I'll look into splitting the archive into chunks or finding another place to store it, but it may be quite a while before GPU support is available, as this poses a slight logistical problem.
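A minimal sketch of the chunk-splitting idea (hypothetical helper; the limit is GitHub Releases' per-file cap): cut the archive into numbered fixed-size parts that can be concatenated back together after download.

```python
import os

# Hypothetical helper: split a large archive into numbered parts, each under
# a size limit (e.g. GitHub Releases' 2 GB per-file cap). The parts can be
# rejoined by simple byte concatenation, e.g. `cat archive.7z.part*` on Unix.
def split_file(path, chunk_size):
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # end of file reached
            part_path = f"{path}.part{index:03d}"
            with open(part_path, "wb") as dst:
                dst.write(chunk)
            parts.append(part_path)
            index += 1
    return parts
```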
The disk-space hog is Torch. Maybe write a script so anyone wanting Nvidia GPU support (and it IS worth it, it's way faster) can pull the CUDA-enabled version of Torch themselves. The CLI install is:
python.exe -m pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
I know jack about Electron and how it deals with Python dependencies, but if you can offload Torch you'll save ~2GB compressed (4,502,012,008 bytes on disk).
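One way such a script could verify the right wheel landed (a sketch; the function name is made up): CUDA builds installed from the cu116 index carry a local version tag like 1.13.1+cu116 in torch.__version__, while CPU-only wheels carry +cpu or no tag at all.

```python
# Hypothetical check: distinguish CUDA PyTorch wheels from CPU ones by the
# local version tag (torch.__version__ looks like "1.13.1+cu116" for CUDA
# builds installed from the cu116 wheel index).
def is_cuda_wheel(version: str) -> bool:
    return "+cu" in version

print(is_cuda_wheel("1.13.1+cu116"))  # True
print(is_cuda_wheel("1.13.1+cpu"))    # False
print(is_cuda_wheel("1.13.1"))        # False
```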
Maybe you could add a how-to to the README.md for anyone interested in running StemRoller on an Nvidia GPU (like me).
I can test it, sure. So far, I've cloned the develop branch and built it. There's a small bug here in ResultCard.svelte:
{#if status === 'processing'}
<Button Icon={LoadingSpinnerIcon} text="Processing" disabled={true} />
{#if status === 'downloading'}
<Button Icon={LoadingSpinnerIcon} text="Downloading" disabled={true} />
{:else if status === 'queued'}
<Button Icon={CollectionIcon} text="Queued" disabled={true} />
Pull request: #31
There are two #ifs. The second one should be :else if :p
I see in your latest commits a GitHub Actions script that should build a CUDA version. At least for me locally, I don't get any automatic PyTorch download during the build process. How can I manage to build one? Or is there a build available to download and test?
Edit: figured out I have to run npm run download-third-party-apps and then build it :)
Everything works great so far! Is it using the newest version of Demucs? The whole package compressed with 7z level 9 LZMA2 is ~1.6GB, so I think it can be uploaded to GitHub Releases.