Comments (2)
+1
It would be better if init could ask for the data it needs instead of handling file I/O and looking for params in args. Some systems may use very specialized methods to load YAML, JSON, protobuf, etc. into memory for initialization and configuration.
from deepspeed.
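A rough sketch of what the comment above asks for, not DeepSpeed's actual API: the function name `initialize_from_config` and its validation rule are hypothetical, made up for illustration. The initializer takes settings that are already in memory, so the caller is free to load them however their system requires.

```python
import json
from typing import Any, Dict, Mapping


def initialize_from_config(config: Mapping[str, Any]) -> Dict[str, Any]:
    """Hypothetical initializer: receives loaded settings and does no file I/O."""
    if "train_batch_size" not in config:
        raise ValueError("config must define 'train_batch_size'")
    # Shallow copy so later changes to the caller's top-level keys do not leak in.
    return dict(config)


# The caller decides how the data reaches memory. JSON is used here;
# YAML, protobuf, or a custom loader would feed the same function.
raw = '{"train_batch_size": 8, "fp16": {"enabled": true}}'
settings = initialize_from_config(json.loads(raw))
```

With this shape, the init code never needs to know where the settings came from or to search an args object for a file path.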
In the process of implementing this enhancement, we will consider removing the --deepspeed flag, since it is not used internally in DeepSpeed code. It should be up to users whether they implement DeepSpeed and non-DeepSpeed versions of their code controlled by a flag.
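As an illustration of leaving that choice to the user, a script could define its own switch. This is only a sketch: the flag name `--use-deepspeed` and the helper `select_backend` are hypothetical and are not part of DeepSpeed.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="hypothetical training script")
    # User-defined switch; the name is the script author's choice.
    parser.add_argument("--use-deepspeed", action="store_true",
                        help="run the DeepSpeed code path")
    return parser


def select_backend(argv: List[str]) -> str:
    """Return which code path the script should take for the given arguments."""
    args = build_parser().parse_args(argv)
    return "deepspeed" if args.use_deepspeed else "plain"
```

Calling `select_backend(["--use-deepspeed"])` picks the DeepSpeed path, and calling it with an empty list falls back to the plain one.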