A 19-hour file, around 1GB in size, gets killed with an OOM error. I'm running with about 13GB of RAM free.

Out of Memory Errors with ~13GB of ram free. about stable-ts HOT 9 CLOSED

kanjieater commented on August 20, 2024

Out of Memory Errors with ~13GB of ram free.

from stable-ts.

Comments (9)

jianfch commented on August 20, 2024

You can try to lower the --refine_ts_num (default: 100). Or just disable refinement with --refine_ts_num 0.

jianfch commented on August 20, 2024

If you still see a spike even with --suppress_silence false, then the spike is likely from whisper.log_mel_spectrogram, which is part of Whisper's default audio loading. Passing a 19hr-long array into whisper.log_mel_spectrogram causes a 23GB spike on my end. I suggest splitting that audio track into shorter tracks.

import whisper
# Computes the log-mel spectrogram for the whole file at once; this is where the RAM spike occurs
mel = whisper.log_mel_spectrogram('audio.mp3')
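
To see why a 19-hour track can produce a spike of that size, here is a rough back-of-the-envelope estimate. It is only a sketch: the constants are Whisper's published audio settings (16 kHz sample rate, 400-point FFT, hop of 160, 80 mel bins for large-v2), but the list of intermediate buffers and the assumption that they are all alive at once are mine, and `estimate_mel_memory_gib` is a hypothetical helper, not part of whisper or stable-ts.

```python
# Rough estimate only; it does not measure whisper itself.
SAMPLE_RATE = 16_000  # Hz, audio is resampled to this rate on load
N_FFT = 400           # STFT window size
HOP_LENGTH = 160      # samples between STFT frames
N_MELS = 80           # mel bins used by large-v2

GIB = 1024 ** 3


def estimate_mel_memory_gib(hours: float) -> dict:
    """Estimate the size of the main float32/complex64 buffers, in GiB."""
    samples = int(hours * 3600 * SAMPLE_RATE)
    frames = samples // HOP_LENGTH
    freq_bins = N_FFT // 2 + 1  # one-sided spectrum

    sizes = {
        "waveform_float32": samples * 4 / GIB,
        "stft_complex64": frames * freq_bins * 8 / GIB,
        "magnitudes_float32": frames * freq_bins * 4 / GIB,
        "log_mel_float32": frames * N_MELS * 4 / GIB,
    }
    # Assumption: worst case, every buffer is held in memory at the same time.
    sizes["total"] = sum(sizes.values())
    return sizes


if __name__ == "__main__":
    for hours in (6, 19):
        total = estimate_mel_memory_gib(hours)["total"]
        print(f"{hours} h of audio: roughly {total:.1f} GiB")
```

Under these assumptions the 19-hour case lands a little over 20 GiB, the same order as the 23GB spike mentioned above, while a 6-hour file stays well under 13GB. The compressed file size (1GB) is not what matters; the decoded duration is.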

kanjieater commented on August 20, 2024

> You can try to lower the --refine_ts_num (default: 100). Or just disable refinement with --refine_ts_num 0.

Thanks - I'll give it a try. Could you explain more about how that parameter affects the model so I can tune it accurately? If I disable it with 0, what will be the impact?

jianfch commented on August 20, 2024

So it seems refine_ts_num doesn't have a significant effect on memory usage. But there does appear to be a surge in memory usage when loading the model with the default whisper function. This surge elevates the baseline memory usage and should be fixed in 0b42339. I also added --sync_empty, which can reduce memory usage during inference.

kanjieater commented on August 20, 2024

Thank you for the quick response. I tried your suggestion and the latest version. Unfortunately, there was no change; the memory still filled up quickly:

8737 Killed stable-ts "$FOLDER/audio.mp3" --language Japanese --output_dir "$FOLDER/" --model large-v2 -o "$FOLDER/captions.ass" --sync_empty

The memory starts lower for a time, then it crashes around that peak. It's not an immediate crash, but it happens within 3 minutes.

jianfch commented on August 20, 2024

My apologies, I misread the issue. I assumed we were talking about GPU memory; the previous solution only helps with GPU memory.
It is expected that stable-ts uses more CPU memory than official Whisper and other implementations because it stores significantly more data in RAM to stabilize the timestamps. The spike and crash you're seeing might be due to stable-ts trying to generate a timestamp mask for the entire audio track at once. If so, the spike happens before inference (with --verbose, you can tell because no text is output to the console before the crash). In that case, --suppress_silence False should drastically lower the RAM usage.

kanjieater commented on August 20, 2024

I didn't see any output when running with the --verbose flag.
19625 Killed stable-ts "$FOLDER/audio.mp3" --language Japanese --output_dir "$FOLDER/" --model large-v2 -o "$FOLDER/captions.ass" --sync_empty --verbose

I will try removing the --sync_empty flag (I accidentally left it in) and running again to see if verbose shows anything. I'll try running with --suppress_silence False as well.

Update:
Verbose didn't output anything unfortunately
20378 Killed stable-ts "$FOLDER/audio.mp3" --language Japanese --output_dir "$FOLDER/" --model large-v2 -o "$FOLDER/captions.ass" --verbose

I also ran it with --suppress_silence false and got the same result:
22053 Killed stable-ts "$FOLDER/audio.mp3" --language Japanese --output_dir "$FOLDER/" --model large-v2 -o "$FOLDER/captions.ass" --suppress_silence false --overwrite

Memory usage and CPU usage spike at the same time when the Out of Memory error occurs.

Just to be clear, my specs are:
i9-13900ks
4070TI
32GB DDR5 ram

All of this is stable and working well. It runs inside of WSL2 on Win11 (which has access to CPU, GPU and RAM - works fine for whisper and whisperx as far as resources). I've allocated additional memory as well:

Would you like me to send you the 1GB file somewhere so you could see if you can reproduce as well? I can run it successfully for smaller files.

kanjieater commented on August 20, 2024

I just started a run on a 6-hour wav file that is 700MB. The progress bar appeared very quickly. For my 19hr 1GB file, the progress bar never showed and the run always crashed.

Update: The 6-hour wav completed without issue.

kanjieater commented on August 20, 2024

You are correct. The input file is too large when Whisper starts, so I either need more RAM or for Whisper to fix it upstream. Thank you for your help with this.
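
For anyone hitting the same limit, the splitting workaround suggested earlier in the thread can be sketched as below. This is a minimal sketch, not something stable-ts provides: `build_split_command` and `chunk_offsets` are hypothetical helpers, the file names are placeholders, and it assumes ffmpeg is installed and that its segment muxer (-f segment, -segment_time) is used with stream copy. With stream copy the cuts fall on packet boundaries, so the real chunk lengths can differ slightly from the nominal ones.

```python
import shlex
from typing import List


def build_split_command(src: str, chunk_seconds: int, out_pattern: str) -> List[str]:
    """Build an ffmpeg command that cuts `src` into chunks without re-encoding."""
    return [
        "ffmpeg", "-i", src,
        "-f", "segment",                      # ffmpeg's segment muxer
        "-segment_time", str(chunk_seconds),  # nominal chunk length in seconds
        "-c", "copy",                         # copy the stream, no re-encode
        out_pattern,                          # e.g. part_%03d.mp3
    ]


def chunk_offsets(total_seconds: int, chunk_seconds: int) -> List[int]:
    """Nominal start time of each chunk, for shifting subtitle timestamps later."""
    return list(range(0, total_seconds, chunk_seconds))


if __name__ == "__main__":
    # Placeholder values: a 19-hour track cut into 6-hour chunks,
    # since a 6-hour file completed without issue in this thread.
    cmd = build_split_command("audio.mp3", 6 * 3600, "part_%03d.mp3")
    print(shlex.join(cmd))
    print(chunk_offsets(19 * 3600, 6 * 3600))
```

Each chunk can then be passed to stable-ts separately, and the captions from chunk N need their timestamps shifted by the Nth offset before merging.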
