Comments (2)
args.gamma = 0.99 works better than 0.95.
args.agent = AgentModSAC() works better than AgentSAC() (AgentSAC cannot solve 'BipedalWalkerHardcore-v3').
OK, I will add 'BipedalWalkerHardcore-v3' to demo.py.
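One plausible reason a larger gamma helps here, offered as a rule of thumb and not something stated in this thread: the discount factor sets roughly how many future steps still influence the value estimate, about 1 / (1 - gamma). The helper name `effective_horizon` below is made up for illustration and is not part of ElegantRL.

```python
def effective_horizon(gamma: float) -> float:
    """Rule-of-thumb horizon: rewards further than ~1/(1-gamma) steps away are heavily discounted."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must be in [0, 1)")
    return 1.0 / (1.0 - gamma)


# Compare the discount factors that appear in this thread.
for gamma in (0.95, 0.98, 0.99):
    print(f"gamma={gamma}: horizon ~ {effective_horizon(gamma):.0f} steps")
```

Under this approximation, gamma = 0.95 looks about 20 steps ahead while gamma = 0.99 looks about 100 steps ahead, which may matter on a long-episode locomotion task.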
We have fully upgraded ElegantRL, and it now supports multi-GPU training (1~8 GPUs).
The problem you mentioned has now been resolved. I'm sorry we could not reply sooner; we have been busy developing the 80-GPU (cloud platform) version of ElegantRL.
How can I reproduce the success shown in this video (https://www.bilibili.com/video/BV1wi4y187tC) using eRL in Colab?
I tried:
args = Arguments(if_on_policy=False)
args.agent = AgentSAC()
args.env = PreprocessEnv(gym.make('BipedalWalkerHardcore-v3'))
args.reward_scale = 2 ** -1
args.gamma = 0.95
args.rollout_num = 2
args.if_remove = False
train_and_evaluate_mp(args)
After a million steps (3 hours), the agent scored only 37 points (MaxR).
I have added a demo for BipedalWalkerHardcore-v3 in elegantrl/demo.py (line 53). I have not had time to fine-tune the hyperparameters for this env.
if_train_bipedal_walker_hard_core = 0
if if_train_bipedal_walker_hard_core:
    "TotalStep: 10e5, TargetReward:   0, UsedTime: 10ks ModSAC"
    "TotalStep: 25e5, TargetReward: 150, UsedTime: 20ks ModSAC"
    "TotalStep: 35e5, TargetReward: 295, UsedTime: 40ks ModSAC"
    "TotalStep: 40e5, TargetReward: 300, UsedTime: 50ks ModSAC"
    args.env = build_env(env='BipedalWalkerHardcore-v3')
    args.target_step = args.env.max_step
    args.gamma = 0.98
    args.net_dim = 2 ** 8
    args.batch_size = args.net_dim * 2
    args.learning_rate = 2 ** -15
    args.repeat_times = 1.5
    args.max_memo = 2 ** 22
    args.break_step = 2 ** 24
    args.eval_gap = 2 ** 8
    args.eval_times1 = 2 ** 2
    args.eval_times2 = 2 ** 5
    args.target_step = args.env.max_step * 1

    # train_and_evaluate(args)  # single process
    args.worker_num = 4
    args.visible_gpu = sys.argv[-1]
    train_and_evaluate_mp(args)  # multiple process
    # args.worker_num = 4
    # args.visible_gpu = '0,1'
    # train_and_evaluate_mp(args)  # multiple GPU
Here are two results of ElegantRL ModSAC on BipedalWalkerHardcore-v3.
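The demo writes most hyperparameters as powers of two and reports training time in "ks". As a reading aid, the sketch below expands those expressions into plain numbers. It only restates values from the demo; the dictionary name `settings` and the helper `ks_to_hours` are made up for illustration, and reading "ks" as kiloseconds is an assumption.

```python
# Expand the power-of-two hyperparameters from the demo into plain numbers.
net_dim = 2 ** 8
settings = {
    "net_dim": net_dim,
    "batch_size": net_dim * 2,
    "learning_rate": 2 ** -15,
    "max_memo": 2 ** 22,
    "break_step": 2 ** 24,
    "eval_gap": 2 ** 8,
    "eval_times1": 2 ** 2,
    "eval_times2": 2 ** 5,
}

for name, value in settings.items():
    # Integers are printed with thousands separators, floats in scientific notation.
    text = f"{value:,}" if isinstance(value, int) else f"{value:.3e}"
    print(f"{name:>13} = {text}")


def ks_to_hours(kiloseconds: float) -> float:
    """Convert kiloseconds to hours (assumes 'ks' in the demo notes means kiloseconds)."""
    return kiloseconds * 1000 / 3600


for used_time in (10, 20, 40, 50):
    print(f"UsedTime {used_time}ks ~ {ks_to_hours(used_time):.1f} hours")
```

Read this way, the training budget `break_step` is about 16.8 million steps, well beyond the one million steps tried in the question, and the demo notes suggest the target reward of 300 took roughly 4 million steps.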