Comments (3)
Hello, we currently support multi-GPU training on a single node through PyTorch's Distributed Data Parallel (DDP) technology. Please refer to the discussion on this issue: #196. As for multi-node training, we plan to consider incorporating this functionality in future version updates.
LightZero currently integrates the experiment monitoring and performance analysis support provided by DI-engine. For details, please consult this document (https://github.com/opendilab/DI-engine-docs/blob/main/source/04_best_practice/training_generated_folders_zh.rst, Chinese version). For your convenience, we have provided the following English summary and will later integrate it fully into our codebase documentation. Thank you for your suggestion. Best regards.
from lightzero.
Experimental monitoring and logging system in LightZero
LightZero generates log and checkpoint folders during the training process. The generated file tree is as follows:
cartpole_muzero
├── ckpt
│   ├── ckpt_best.pth.tar
│   ├── iteration_0.pth.tar
│   └── iteration_10000.pth.tar
├── log
│   ├── buffer
│   │   └── buffer_logger.txt
│   ├── collector
│   │   └── collector_logger.txt
│   ├── evaluator
│   │   └── evaluator_logger.txt
│   ├── learner
│   │   └── learner_logger.txt
│   └── serial
│       └── events.out.tfevents.1626453528.CN0014009700M.local
├── formatted_total_config.py
└── total_config.py
log/collector
In the collector folder, there is a file named collector_logger.txt, which contains information about the interaction between the collector and the environment.
It records statistics generated while the collector interacts with the environment, such as:
- episode_count: the number of episodes collected
- envstep_count: the number of envsteps collected
- train_sample_count: the number of training samples collected
- avg_envstep_per_episode: the average number of envsteps per episode
- avg_sample_per_episode: the average number of training samples per episode
- avg_envstep_per_sec: the average number of envsteps per second
- avg_train_sample_per_sec: the average number of training samples per second
- avg_episode_per_sec: the average number of episodes per second
- collect_time: collection time
- reward_mean: the average reward
- reward_std: the standard deviation of the reward
- each_reward: the reward for each episode of the collector's interaction with the environment.
- reward_max: the maximum reward
- reward_min: the minimum reward
- total_envstep_count: the total envstep count
- total_train_sample_count: the total number of training samples
- total_episode_count: the total number of episodes
- total_duration: the total duration
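The reward fields in this list are summary statistics over each_reward. As a minimal sketch of how they relate (this is not LightZero's or DI-engine's actual code; the helper name `summarize_rewards` is hypothetical, and using the population standard deviation for reward_std is an assumption):

```python
import statistics


def summarize_rewards(each_reward):
    """Summarize per-episode rewards into mean/std/max/min.

    Hypothetical helper for illustration only. Using the population
    standard deviation (pstdev) for reward_std is an assumption.
    """
    if not each_reward:
        raise ValueError("each_reward must contain at least one episode")
    return {
        "reward_mean": statistics.fmean(each_reward),
        "reward_std": statistics.pstdev(each_reward),
        "reward_max": max(each_reward),
        "reward_min": min(each_reward),
    }


if __name__ == "__main__":
    # Example per-episode rewards (made-up values, not real training output).
    print(summarize_rewards([200.0, 180.0, 190.0]))
```

The same relationship applies to the reward fields in the evaluator log.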
log/evaluator
In the evaluator folder, there is a file named evaluator_logger.txt, which contains information about the evaluator's interaction with the environment.
- [INFO]: [EVALUATOR]env x completes an episode, final reward: xxx, current episode: xxx
- train_iter: the number of training iterations
- ckpt_name: the model path, such as iteration_0.pth.tar
- episode_count: episode count
- envstep_count: envstep count
- evaluate_time: the time spent by the evaluator
- avg_envstep_per_episode: the average envstep per episode
- avg_envstep_per_sec: the average envstep per second
- avg_time_per_episode: the average time spent per episode
- reward_mean: the average reward
- reward_std: the standard deviation of the reward
- each_reward: the reward for each episode of the evaluator's interaction with the environment.
- reward_max: the maximum reward
- reward_min: the minimum reward
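If you want to post-process evaluator_logger.txt yourself, the per-episode line shown at the top of this list can be matched with a regular expression. The sketch below assumes that the `x` and `xxx` placeholders are plain numbers in the real log; the function name `parse_episode_lines` and the returned field names are hypothetical:

```python
import re

# Pattern for the per-episode line quoted in the list above.
# Assumption: env id and episode are integers, the reward is a decimal number.
EPISODE_LINE = re.compile(
    r"\[EVALUATOR\]\s*env (?P<env_id>\d+) completes an episode, "
    r"final reward: (?P<reward>[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?), "
    r"current episode: (?P<episode>\d+)"
)


def parse_episode_lines(lines):
    """Return one dict per matching log line; non-matching lines are skipped."""
    results = []
    for line in lines:
        match = EPISODE_LINE.search(line)
        if match:
            results.append({
                "env_id": int(match["env_id"]),
                "reward": float(match["reward"]),
                "episode": int(match["episode"]),
            })
    return results
```

To use it on a real run, pass the lines of the log file, for example `parse_episode_lines(open(path, encoding="utf-8"))`, and adjust the pattern if your log lines differ.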
log/learner
In the learner folder, there is a file named learner_logger.txt, which contains information about the learner.
The following information is generated during MuZero training:
Policy neural network architecture:
[04-08 13:12:59] INFO [RANK0]: DI-engine DRL Policy base_learner.py:338
MuZeroModelMLP(
(representation_network): RepresentationNetworkMLP(
(fc_representation): Sequential(
(0): Linear(in_features=4, out_features=128, bias=True)
(1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=128, out_features=128, bias=True)
)
Learner information:
Grid table:
| Name | cur_lr_avg | total_loss_avg |
|-------|------------|----------------|
| Value | 0.001000 | 0.098996 |
log/serial
Information from the buffer, collector, evaluator, and learner is saved into a single events.out.tfevents file for use with TensorBoard.
LightZero writes the TensorBoard output of all four components into one event file in the serial folder, rather than one file per component in separate folders. The reason is that when running a large number of experiments, say n, it is hard to keep track of 4*n individual TensorBoard files. With this layout there is only one TensorBoard file per experiment.
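Besides pointing TensorBoard at the serial folder, the event file can also be read from Python. The following is a sketch under stated assumptions: it relies on the `tensorboard` package being installed, the helper names `find_event_files` and `load_scalars` are hypothetical, and the scalar tag names depend on your experiment and are not listed in this summary.

```python
from pathlib import Path


def find_event_files(exp_dir):
    """Return TensorBoard event files under <exp_dir>/log/serial, sorted by name."""
    serial_dir = Path(exp_dir) / "log" / "serial"
    return sorted(serial_dir.glob("events.out.tfevents.*"))


def load_scalars(event_file):
    """Read the scalar series stored in one event file as {tag: [(step, value), ...]}.

    Requires the `tensorboard` package. It is imported lazily so that
    find_event_files can be used without it.
    """
    from tensorboard.backend.event_processing.event_accumulator import (
        EventAccumulator,
    )

    accumulator = EventAccumulator(str(event_file))
    accumulator.Reload()  # parse the file from disk
    return {
        tag: [(event.step, event.value) for event in accumulator.Scalars(tag)]
        for tag in accumulator.Tags()["scalars"]
    }
```

For example, `find_event_files("cartpole_muzero")` should return the single events.out.tfevents file shown in the tree above, and `load_scalars` on that file lets you inspect which tags were actually recorded.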
ckpt
In the ckpt folder, there are model parameter checkpoints:
- ckpt_best.pth.tar: the model that achieved the highest evaluation score.
- iteration_N.pth.tar: the model saved at training iteration N, for example iteration_0.pth.tar and iteration_10000.pth.tar.
You can load the model using torch.load('ckpt_best.pth.tar').
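As a slightly fuller sketch of working with the ckpt folder: the helpers below list the periodic checkpoints in iteration order and load one onto the CPU. The names `list_iteration_checkpoints` and `load_checkpoint` are hypothetical, and the structure of the loaded object is not described in this summary, so inspect it (for example its keys, if it is a dict) before relying on it.

```python
import re
from pathlib import Path

# Matches the periodic checkpoint names shown above, e.g. iteration_10000.pth.tar.
_ITERATION_NAME = re.compile(r"^iteration_(\d+)\.pth\.tar$")


def list_iteration_checkpoints(ckpt_dir):
    """Return (iteration, path) pairs for iteration_<N>.pth.tar files, ascending by N."""
    pairs = []
    for path in Path(ckpt_dir).iterdir():
        match = _ITERATION_NAME.match(path.name)
        if match:
            pairs.append((int(match.group(1)), path))
    return sorted(pairs)


def load_checkpoint(path):
    """Load a checkpoint onto the CPU. Requires PyTorch (imported lazily)."""
    import torch

    return torch.load(str(path), map_location="cpu")


if __name__ == "__main__":
    ckpt_dir = Path("cartpole_muzero") / "ckpt"
    if ckpt_dir.is_dir():
        for iteration, path in list_iteration_checkpoints(ckpt_dir):
            print(iteration, path.name)
```

Passing `map_location="cpu"` lets a checkpoint that was saved during GPU training be opened on a machine without a GPU.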
Thanks for the information.