
Comments (6)

zxzzz0 commented on August 20, 2024

> I have tried to fix some of those.

@ErlebnisW What are these exactly? Had there been bugs in the code, the tests wouldn't have passed.


ErlebnisW commented on August 20, 2024

> I have tried to fix some of those.
>
> @ErlebnisW What are these exactly? Had there been bugs in the code, the tests wouldn't have passed.

Operating system: Ubuntu 18.04.6 LTS, Python 3.8.

  1. After installation, run DI-engine/dizoo/gfootball/envs/tests/test_env_gfootball_academy.py; the terminal shows:

```
/home/usr/anaconda3/envs/di-engine/lib/python3.8/site-packages/dizoo/gfootball/envs/__init__.py:6: UserWarning: not found gfootball env, please install it
```
When I checked this path (i.e. anaconda3/envs/di-engine/lib/python3.8/site-packages/dizoo/gfootball/envs/), I found that the folders "action", "obs", and "reward" don't exist. This matters because gfootball_env.py has the following lines:
```python
from .action.gfootball_action_runner import GfootballRawActionRunner
from .obs.gfootball_obs_runner import GfootballObsRunner
from .reward.gfootball_reward_runner import GfootballRewardRunner
```

So I copied the above folders into that path to solve this.
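For anyone hitting the same problem, a minimal sketch of that copy step (both paths are assumptions; adjust them to your own source checkout and conda env):

```python
import shutil
from pathlib import Path

# Both paths are assumptions: adjust to your own checkout and conda env.
src = Path.home() / "DI-engine/dizoo/gfootball/envs"
dst = Path.home() / "anaconda3/envs/di-engine/lib/python3.8/site-packages/dizoo/gfootball/envs"

# Copy the subpackages missing from the pip-installed package
# (dirs_exist_ok requires Python 3.8+).
for sub in ("action", "obs", "reward"):
    shutil.copytree(src / sub, dst / sub, dirs_exist_ok=True)
```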

  2. Run DI-engine/dizoo/gfootball/entry/parallel/gfootball_ppo_parallel_config.py; the terminal shows:

```
ImportError: cannot import name 'BaseEnvInfo' from 'ding.envs'
```

I solved this by deleting BaseEnvInfo from the import and removing the lines that used it.
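Roughly, the change was along these lines (a sketch; the actual import line in the file may list other names):

```python
# Before (fails on ding versions where BaseEnvInfo has been removed):
# from ding.envs import BaseEnv, BaseEnvInfo

# After: drop BaseEnvInfo here and delete the lines that referenced it.
from ding.envs import BaseEnv
```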

  3. There is also another bug related to "exp_name", which I forget how to reproduce; I fixed it by specifying exp_name in one file.
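A minimal illustration of that kind of fix (the config object and the name itself are hypothetical):

```python
from easydict import EasyDict

main_config = EasyDict(dict(env=dict(), policy=dict()))
# Illustrative fix: give the experiment an explicit name so downstream code
# that reads cfg.exp_name does not fail. The name is an arbitrary placeholder.
main_config.exp_name = 'gfootball_ppo_parallel_seed0'
```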

The following bugs I couldn't fix:

  1. run "Di-engine/dizoo/gfootball/entry/parallel/gfootball_ppo_parallel_config.py", terminal shows:

File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/envs/env_manager/base_env_manager.py", line 109, in init
self._reward_space = self._env_ref.reward_space
AttributeError: 'GfootballEnv' object has no attribute 'reward_space'

Can't fix this.
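For what it's worth, the traceback shows base_env_manager.py reading a reward_space attribute off the env in __init__. One untested workaround sketch, with the import path inferred from the messages above and the Box bounds being assumptions:

```python
import gym
import numpy as np
from dizoo.gfootball.envs.gfootball_env import GfootballEnv

# Monkey-patch sketch: provide the reward_space attribute that
# base_env_manager.py expects. The scalar Box bounds are assumptions.
GfootballEnv.reward_space = property(
    lambda self: gym.spaces.Box(low=-np.inf, high=np.inf, shape=(1,), dtype=np.float32)
)
```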

  1. run "Di-engine/dizoo/gfootball/entry/parallel/gfootball_il_parallel_config.py", terminal shows:

```
Traceback (most recent call last):
  File "/home/vcis5/Userlist/Wangmingzhi/Di-engine/dizoo/gfootball/entry/parallel/gfootball_il_parallel_config.py", line 123, in <module>
    main_config = parallel_transform(main_config)
  File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/config/utils.py", line 205, in parallel_transform
    cfg.system = set_system_cfg(cfg)
  File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/config/utils.py", line 158, in set_system_cfg
    learner_num = cfg.main.policy.learn.learner.learner_num
AttributeError: 'EasyDict' object has no attribute 'main'
```

Can't fix this.
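From set_system_cfg in the traceback, the config apparently needs a main sub-config carrying the learner count. A sketch of the shape it appears to require, with placeholder values:

```python
from easydict import EasyDict

# Shape inferred from ding/config/utils.py line 158 in the traceback;
# the values below are illustrative placeholders.
cfg = EasyDict(dict(
    main=dict(
        policy=dict(
            learn=dict(
                learner=dict(learner_num=1),
            ),
        ),
    ),
))
print(cfg.main.policy.learn.learner.learner_num)  # -> 1
```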

  6. Run the ppo_lstm demo from the docs (https://di-engine-docs.readthedocs.io/zh_CN/latest/env_tutorial/gfootball_zh.html); the terminal shows:

File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/torch_utils/checkpoint_helper.py", line 327, in wrapper
return func(*args, **kwargs)
File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/worker/learner/base_learner.py", line 260, in start
data = self._next_data()
File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/worker/learner/base_learner.py", line 164, in wrapper
with self._wrapper_timer:
File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/utils/time_helper.py", line 84, in enter
self._timer.start_time()
File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/utils/time_helper_cuda.py", line 43, in start_time
torch.cuda.synchronize()
File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/torch/cuda/init.py", line 491, in synchronize
_lazy_init()
File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/torch/cuda/init.py", line 204, in _lazy_init
raise RuntimeError(
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

Can't fix this.
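This last error is a general PyTorch constraint rather than a DI-engine-specific bug: CUDA cannot be re-initialized in a process created with fork. A common workaround, untested here against DI-engine's parallel entry, is to force the spawn start method before any workers are created:

```python
import torch.multiprocessing as mp

if __name__ == '__main__':
    # Must run before any worker process is created or CUDA is touched;
    # force=True overrides a start method that was already set elsewhere.
    mp.set_start_method('spawn', force=True)
    # ... launch the parallel pipeline as usual ...
```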


puyuan1996 commented on August 20, 2024

Hello, thanks for your questions and suggestions.

Now, in this PR:

  1. We have fixed some bugs in the gfootball env, and we have a naive DQN RL demo that battles against the built-in AI, but its performance is still poor. The initial results are:

[image: naive DQN training results]

We speculate that this may be because the pure DQN algorithm lacks both the ability to model long-sequence dependencies and an efficient exploration mechanism. You could try adapting advanced algorithms like NGU to the gfootball environment.

  2. We have an imitation learning demo that learns from a rule_based_model dataset, but its performance is not yet good:

[image: imitation learning results]

Here are the statistics of our training dataset (100 episodes): the mean, max, and min return are -0.12, 4, and -3, respectively, which suggests that we should improve the quality of the dataset. We then measured accuracy on the training dataset (100 episodes) and a validation dataset (50 episodes): 0.9452 and 0.8009, respectively. We also found that the accuracy for some actions in the training dataset is lower than 0.5, which reflects a class-imbalance problem.
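For readers who want to reproduce this check, a minimal sketch of the per-action accuracy computation (array names are illustrative):

```python
import numpy as np

def per_action_accuracy(preds: np.ndarray, labels: np.ndarray) -> dict:
    """Classification accuracy broken down by ground-truth action id."""
    return {
        int(a): float((preds[labels == a] == a).mean())
        for a in np.unique(labels)
    }

# Actions whose accuracy falls below 0.5 point to under-represented
# classes in the demonstration dataset.
```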

To obtain good imitation learning performance, we are taking steps to address the above problems. Thank you for your patience.

Thanks a lot.


ErlebnisW commented on August 20, 2024

Thanks


puyuan1996 commented on August 20, 2024

Hello,

We have reduced the difficulty of the built-in AI in the environment by changing the env_id from 11_vs_11_stochastic (medium opponent bot) to 11_vs_11_easy_stochastic (easy opponent bot). The performance of the imitation learning demo that learns from the rule_based_model dataset is now as follows (you can reproduce the result by running dizoo/gfootball/entry/gfootball_il_rule_lt0_main.py in the PR):
[image: imitation learning results on 11_vs_11_easy_stochastic]
As you can see, the performance is better than in #335 (comment).
We suspect this is because the level of rule_based_model is inherently low: in 11_vs_11_stochastic, the average return of the collected dataset is less than 0. Although we only select trajectories with return > 0 for imitation learning, these data cover only a small part of the state space, resulting in overfitting and poor performance.
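The return > 0 selection mentioned above is simple to express; a sketch, under the assumption that each episode is a list of transition dicts with a 'reward' key:

```python
# Keep only demonstration episodes whose total return is positive.
# Episode format is an assumption: a list of transition dicts with a 'reward' key.
def filter_positive_return(episodes):
    return [ep for ep in episodes if sum(step['reward'] for step in ep) > 0]
```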

Regarding the reinforcement learning demo, we are currently trying the R2D2 algorithm (the development version is dizoo/gfootball/entry/gfootball_r2d2_main.py in the PR), and we will let you know as soon as we have results. Thank you for your patience and attention.

Thanks a lot.


ErlebnisW commented on August 20, 2024

Thanks a lot! I'll have a try.

