
Comments (6)

zxzzz0 commented on August 20, 2024

> I have tried to fix some of those.

@ErlebnisW What are these exactly? Had there been bugs in the code, the tests wouldn't have passed.


ErlebnisW commented on August 20, 2024

> I have tried to fix some of those.
>
> @ErlebnisW What are these exactly? Had there been bugs in the code, the tests wouldn't have passed.

Operating system: Ubuntu 18.04.6 LTS, Python 3.8.

  1. After installation, run DI-engine/dizoo/gfootball/envs/tests/test_env_gfootball_academy.py; the terminal shows:

```
/home/usr/anaconda3/envs/di-engine/lib/python3.8/site-packages/dizoo/gfootball/envs/__init__.py:6: UserWarning: not found gfootball env, please install it
```
When I checked this path (i.e. anaconda3/envs/di-engine/lib/python3.8/site-packages/dizoo/gfootball/envs/), I found that the folders "action", "obs", and "reward" don't exist. This matters because gfootball_env.py has the following lines:
```python
from .action.gfootball_action_runner import GfootballRawActionRunner
from .obs.gfootball_obs_runner import GfootballObsRunner
from .reward.gfootball_reward_runner import GfootballRewardRunner
```

So I copied the above folders into that path to solve this.
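For anyone hitting the same problem, a minimal sketch of that copy step (both paths are assumptions; adjust them to your own source checkout and conda env):

```python
import shutil
from pathlib import Path

# Both paths are assumptions: adjust to your own checkout and conda env.
src = Path.home() / "DI-engine/dizoo/gfootball/envs"
dst = Path.home() / "anaconda3/envs/di-engine/lib/python3.8/site-packages/dizoo/gfootball/envs"

# Copy the subpackages missing from the pip-installed package
# (dirs_exist_ok requires Python 3.8+).
for sub in ("action", "obs", "reward"):
    shutil.copytree(src / sub, dst / sub, dirs_exist_ok=True)
```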

  2. Run DI-engine/dizoo/gfootball/entry/parallel/gfootball_ppo_parallel_config.py; the terminal shows:

```
ImportError: cannot import name 'BaseEnvInfo' from 'ding.envs'
```

I solved this by deleting BaseEnvInfo from the import and removing the lines that used it.
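Roughly, the change was along these lines (a sketch; the actual import line in the file may list other names):

```python
# Before (fails on ding versions where BaseEnvInfo has been removed):
# from ding.envs import BaseEnv, BaseEnvInfo

# After: drop BaseEnvInfo here and delete the lines that referenced it.
from ding.envs import BaseEnv
```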

  3. There is also another bug related to "exp_name", which I forget how to reproduce; I fixed it by specifying exp_name in one file.
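A minimal illustration of that kind of fix (the config object and the name itself are hypothetical):

```python
from easydict import EasyDict

main_config = EasyDict(dict(env=dict(), policy=dict()))
# Illustrative fix: give the experiment an explicit name so downstream code
# that reads cfg.exp_name does not fail. The name is an arbitrary placeholder.
main_config.exp_name = 'gfootball_ppo_parallel_seed0'
```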

The following bugs I couldn't fix:

  1. run "Di-engine/dizoo/gfootball/entry/parallel/gfootball_ppo_parallel_config.py", terminal shows:

File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/envs/env_manager/base_env_manager.py", line 109, in init
self._reward_space = self._env_ref.reward_space
AttributeError: 'GfootballEnv' object has no attribute 'reward_space'

Can't fix this.
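For what it's worth, the traceback shows base_env_manager.py reading a reward_space attribute off the env in __init__. One untested workaround sketch, with the import path inferred from the messages above and the Box bounds being assumptions:

```python
import gym
import numpy as np
from dizoo.gfootball.envs.gfootball_env import GfootballEnv

# Monkey-patch sketch: provide the reward_space attribute that
# base_env_manager.py expects. The scalar Box bounds are assumptions.
GfootballEnv.reward_space = property(
    lambda self: gym.spaces.Box(low=-np.inf, high=np.inf, shape=(1,), dtype=np.float32)
)
```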

  1. run "Di-engine/dizoo/gfootball/entry/parallel/gfootball_il_parallel_config.py", terminal shows:

```
Traceback (most recent call last):
  File "/home/vcis5/Userlist/Wangmingzhi/Di-engine/dizoo/gfootball/entry/parallel/gfootball_il_parallel_config.py", line 123, in <module>
    main_config = parallel_transform(main_config)
  File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/config/utils.py", line 205, in parallel_transform
    cfg.system = set_system_cfg(cfg)
  File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/config/utils.py", line 158, in set_system_cfg
    learner_num = cfg.main.policy.learn.learner.learner_num
AttributeError: 'EasyDict' object has no attribute 'main'
```

Can't fix this.
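From set_system_cfg in the traceback, the config apparently needs a main sub-config carrying the learner count. A sketch of the shape it appears to require, with placeholder values:

```python
from easydict import EasyDict

# Shape inferred from ding/config/utils.py line 158 in the traceback;
# the values below are illustrative placeholders.
cfg = EasyDict(dict(
    main=dict(
        policy=dict(
            learn=dict(
                learner=dict(learner_num=1),
            ),
        ),
    ),
))
print(cfg.main.policy.learn.learner.learner_num)  # -> 1
```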

  6. Run the ppo_lstm demo from the docs (https://di-engine-docs.readthedocs.io/zh_CN/latest/env_tutorial/gfootball_zh.html); the terminal shows:

File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/torch_utils/checkpoint_helper.py", line 327, in wrapper
return func(*args, **kwargs)
File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/worker/learner/base_learner.py", line 260, in start
data = self._next_data()
File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/worker/learner/base_learner.py", line 164, in wrapper
with self._wrapper_timer:
File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/utils/time_helper.py", line 84, in enter
self._timer.start_time()
File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/ding/utils/time_helper_cuda.py", line 43, in start_time
torch.cuda.synchronize()
File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/torch/cuda/init.py", line 491, in synchronize
_lazy_init()
File "/home/vcis5/anaconda3/envs/di/lib/python3.8/site-packages/torch/cuda/init.py", line 204, in _lazy_init
raise RuntimeError(
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

Can't fix this.
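This last error is a general PyTorch constraint rather than a DI-engine-specific bug: CUDA cannot be re-initialized in a process created with fork. A common workaround, untested here against DI-engine's parallel entry, is to force the spawn start method before any workers are created:

```python
import torch.multiprocessing as mp

if __name__ == '__main__':
    # Must run before any worker process is created or CUDA is touched;
    # force=True overrides a start method that was already set elsewhere.
    mp.set_start_method('spawn', force=True)
    # ... launch the parallel pipeline as usual ...
```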


puyuan1996 commented on August 20, 2024

Hello, thanks for your questions and suggestions.

Now, in this PR:

  1. We have fixed some bugs in the gfootball env, and we have a naive DQN RL demo that battles against the built-in AI, but its performance is still poor. The initial results are:

[image: naive DQN training results]

We speculate that this may be because the pure DQN algorithm lacks both the ability to model long-sequence dependencies and an efficient exploration mechanism. You could try adapting advanced algorithms like NGU to the gfootball environment.

  2. We have an imitation learning demo that learns from a rule_based_model dataset, but its performance is not yet good:

[image: imitation learning results]

Here are the statistics of our training dataset (100 episodes): the mean, max, and min return are -0.12, 4, and -3, respectively, which suggests that we should improve the quality of the dataset. We then measured accuracy on the training dataset (100 episodes) and a validation dataset (50 episodes): 0.9452 and 0.8009, respectively. We also found that the accuracy for some actions in the training dataset is lower than 0.5, which reflects a class-imbalance problem.
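For readers who want to reproduce this check, a minimal sketch of the per-action accuracy computation (array names are illustrative):

```python
import numpy as np

def per_action_accuracy(preds: np.ndarray, labels: np.ndarray) -> dict:
    """Classification accuracy broken down by ground-truth action id."""
    return {
        int(a): float((preds[labels == a] == a).mean())
        for a in np.unique(labels)
    }

# Actions whose accuracy falls below 0.5 point to under-represented
# classes in the demonstration dataset.
```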

To obtain good imitation learning performance, we are taking steps to address the above problems. Thank you for your patience.

Thanks a lot.


ErlebnisW commented on August 20, 2024

Thanks


puyuan1996 commented on August 20, 2024

Hello,

We have reduced the difficulty of the built-in AI in the environment by changing the env_id from 11_vs_11_stochastic (medium opponent bot) to 11_vs_11_easy_stochastic (easy opponent bot). The performance of the imitation learning demo that learns from the rule_based_model dataset is now as follows (you can reproduce the result by running dizoo/gfootball/entry/gfootball_il_rule_lt0_main.py in the PR):
[image: imitation learning results on 11_vs_11_easy_stochastic]
As you can see, the performance is better than in #335 (comment).
We suspect this is because the level of rule_based_model is inherently low: in 11_vs_11_stochastic, the average return of the collected dataset is less than 0. Although we only select trajectories with return > 0 for imitation learning, these data cover only a small part of the state space, resulting in overfitting and poor performance.
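The return > 0 selection mentioned above is simple to express; a sketch, under the assumption that each episode is a list of transition dicts with a 'reward' key:

```python
# Keep only demonstration episodes whose total return is positive.
# Episode format is an assumption: a list of transition dicts with a 'reward' key.
def filter_positive_return(episodes):
    return [ep for ep in episodes if sum(step['reward'] for step in ep) > 0]
```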

Regarding the reinforcement learning demo, we are currently trying the R2D2 algorithm (the development version is dizoo/gfootball/entry/gfootball_r2d2_main.py in the PR), and we will let you know as soon as we have results. Thank you for your patience and attention.

Thanks a lot.


ErlebnisW commented on August 20, 2024

Thanks a lot! I'll have a try.

