Comments (4)
Hi, thanks for reporting this,
Can you retry without EP and report back whether this improves the speed? In addition, I would encourage trying different TP/PP configurations to find the fastest one.
Thank you.
from nemo.
Thank you for your response. We have already tried running without EP, but it was slower than with EP. Below are the average times recorded without EP:
#nodes=4, DP=1, GBPT = 2 sec
#nodes=8, DP=2, GBPT = 12 sec
#nodes=16, DP=4, GBPT = 34 sec
We have also experimented with different TP and PP combinations, such as 8x4, 4x8 and 8x8. All of them were slower than the configuration reported in the issue.
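To make the trend in the numbers above easier to read, here is a minimal sketch that expresses each reported GBPT as a multiple of the 4-node figure. The helper name `slowdown_vs_baseline` is only illustrative, and the values are copied from the three lines above; nothing else about the runs is assumed.

```python
# Average times reported above for runs without EP: (nodes, DP, GBPT in seconds).
runs = [(4, 1, 2.0), (8, 2, 12.0), (16, 4, 34.0)]


def slowdown_vs_baseline(runs):
    """Return (nodes, factor) pairs, where factor = GBPT / GBPT of the first run."""
    _, _, base_time = runs[0]
    return [(nodes, time / base_time) for nodes, _, time in runs]


for nodes, factor in slowdown_vs_baseline(runs):
    print(f"{nodes} nodes: {factor:.1f}x the 4-node GBPT")
```

With these figures the 8-node run takes 6x and the 16-node run 17x the 4-node time, so the time grows much faster than the node count.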
from nemo.
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
from nemo.
@sunilitggu can you try with top-of-tree NeMo (git clone) and set your optimizer to mcore_distributed_optim (via model.optim.name='mcore_distributed_optim')?
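For readers unfamiliar with dotted config overrides, the sketch below shows what setting `model.optim.name` amounts to on a nested config. It uses a plain Python dict and a hypothetical helper, `apply_override`, purely for illustration; it is not NeMo's or Hydra's actual override mechanism, and the starting config values are placeholders.

```python
def apply_override(cfg: dict, dotted_key: str, value) -> dict:
    """Set cfg[a][b][c] = value for a dotted key 'a.b.c', creating levels as needed."""
    node = cfg
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value
    return cfg


# Hypothetical starting config; the optimizer name and lr are placeholders.
cfg = {"model": {"optim": {"name": "some_other_optim", "lr": 1e-4}}}

# Only the optimizer name changes; sibling keys such as lr are left untouched.
apply_override(cfg, "model.optim.name", "mcore_distributed_optim")
print(cfg["model"]["optim"])
```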
from nemo.