Comments (1)
Never mind, it was only crashing when I used virtual console mode. I switched to an xfce4 session and it doesn't crash anymore. I installed the stable version of TransformerEngine.
Edit: I reinstalled MS-AMP and still got this error message, then reinstalled the stable version of TransformerEngine and still get the same error.
Traceback (most recent call last):
  File "/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/train_finetune.py", line 713, in <module>
    main()
  File "/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/fp8/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/fp8/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/fp8/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/fp8/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/train_finetune.py", line 460, in main
    g_loss.backward()
  File "/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/fp8/lib/python3.11/site-packages/torch/_tensor.py", line 520, in backward
    torch.autograd.backward(
  File "/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/fp8/lib/python3.11/site-packages/torch/autograd/__init__.py", line 288, in backward
    _engine_run_backward(
  File "/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/fp8/lib/python3.11/site-packages/torch/autograd/graph.py", line 767, in _engine_run_backward
    return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: GET was unable to find an engine to execute this computation
Traceback (most recent call last):
  File "/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/fp8/bin/accelerate", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/fp8/lib/python3.11/site-packages/accelerate/commands/accelerate_cli.py", line 46, in main
    args.func(args)
  File "/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/fp8/lib/python3.11/site-packages/accelerate/commands/launch.py", line 1082, in launch_command
    simple_launcher(args)
  File "/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/fp8/lib/python3.11/site-packages/accelerate/commands/launch.py", line 688, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/run/media/user/e1745494-af46-4749-9e1a-89d2b2289699/StyleTTS2/fp8/bin/python3.11', 'train_finetune.py', '--config_path', './Configs/config_ft-Ellie-Up-FP8.yml']' returned non-zero exit status 1.
Edit: I got this error message. This was with the stable version of TransformerEngine.
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
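For anyone hitting the same illegal memory access: the hint about CUDA_LAUNCH_BLOCKING=1 in the message above can be applied when relaunching the script. Below is a minimal sketch, not a fix. The helper names `build_debug_env` and `run_with_blocking` are hypothetical, and the command is simply the one shown in the CalledProcessError above, so adjust it to however you actually launch training.

```python
import os
import subprocess
import sys


def build_debug_env(base_env=None):
    """Copy an environment and force synchronous CUDA kernel launches."""
    env = dict(os.environ if base_env is None else base_env)
    # With this set, CUDA errors are reported at the call that caused them,
    # so the Python stack trace points at the operation that actually failed.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_with_blocking(cmd):
    """Hypothetical helper: run a command inside the debug environment."""
    return subprocess.run(cmd, env=build_debug_env(), check=False)


# Command taken from the CalledProcessError above; nothing runs on import.
TRAIN_CMD = [
    sys.executable,
    "train_finetune.py",
    "--config_path",
    "./Configs/config_ft-Ellie-Up-FP8.yml",
]
```

Calling `run_with_blocking(TRAIN_CMD)` from the StyleTTS2 directory should behave like prefixing the shell command with `CUDA_LAUNCH_BLOCKING=1`. Training will be noticeably slower this way, so it is only useful for getting an accurate stack trace.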