marl-book / codebase

Official code repo for the MARL book (www.marl-book.com)

Python 100.00%

codebase's Issues

torch requirements typo and advice request for requirements on apple M2

I am running into installation issues on my M2 (macOS 13.2.1 (22D68)) using Python 3.7.16, since 3.7 seemed to be required for the specified munch version (and is what setup.py declares). However, I am hitting what may be M2-specific issues:

A torch issue, which I suspect is M2-related given that torch 1.8.1 supposedly supports Python >= 3.6.2:

ERROR: No matching distribution found for torch==1.8.*

and for cpprb and pandas this error:

            clang: error: the clang compiler does not support 'faltivec', please use -maltivec and include altivec.h explicitly
            clang: numpy/core/src/common/ucsnarrow.c
            clang: build/src.macosx-13.2-arm64-3.7/numpy/core/src/npymath/ieee754.c
            clang: numpy/core/src/common/cblasfuncs.c
            clang: error: the clang compiler does not support 'faltivec', please use -maltivec and include altivec.h explicitly
            clang: error: the clang compiler does not support 'faltivec', please use -maltivec and include altivec.h explicitly
            clang: error: the clang compiler does not support 'faltivec', please use -maltivec and include altivec.h explicitly
            clang: error: the clang compiler does not support 'faltivec', please use -maltivec and include altivec.h explicitly
            clang: error: the clang compiler does not support 'faltivec', please use -maltivec and include altivec.h explicitly
            clang: error: the clang compiler does not support 'faltivec', please use -maltivec and include altivec.h explicitly
            clang: error: the clang compiler does not support 'faltivec', please use -maltivec and include altivec.h explicitly
            Running from numpy source directory.
            /private/var/folders/zv/vlfkt3n94_sgkxv73kn7_gx80000gn/T/pip-build-env-wuds65_9/overlay/lib/python3.7/site-packages/setuptools/_distutils/dist.py:265: UserWarning: Unknown distribution option: 'define_macros'
              warnings.warn(msg)
            error: Command "clang -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include -DNPY_INTERNAL_BUILD=1 -DHAVE_NPY_CONFIG_H=1 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE=1 -D_LARGEFILE64_SOURCE=1 -DNO_ATLAS_INFO=3 -DHAVE_CBLAS -Ibuild/src.macosx-13.2-arm64-3.7/numpy/core/src/umath -Ibuild/src.macosx-13.2-arm64-3.7/numpy/core/src/npymath -Ibuild/src.macosx-13.2-arm64-3.7/numpy/core/src/common -Inumpy/core/include -Ibuild/src.macosx-13.2-arm64-3.7/numpy/core/include/numpy -Inumpy/core/src/common -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -I/Users/ellemcfarlane/Documents/codigo/fast-marl/venv1/include -I/Users/ellemcfarlane/.pyenv/versions/3.7.16/include/python3.7m -Ibuild/src.macosx-13.2-arm64-3.7/numpy/core/src/common -Ibuild/src.macosx-13.2-arm64-3.7/numpy/core/src/npymath -Ibuild/src.macosx-13.2-arm64-3.7/numpy/core/src/common -Ibuild/src.macosx-13.2-arm64-3.7/numpy/core/src/npymath -c numpy/core/src/multiarray/alloc.c -o build/temp.macosx-13.2-arm64-cpython-37/numpy/core/src/multiarray/alloc.o -MMD -MF build/temp.macosx-13.2-arm64-cpython-37/numpy/core/src/multiarray/alloc.o.d -faltivec -I/System/Library/Frameworks/vecLib.framework/Headers" failed with exit status 1
            [end of output]
      
        note: This error originates from a subprocess, and is likely not a problem with pip.
        ERROR: Failed building wheel for numpy
      Failed to build numpy
      ERROR: Could not build wheels for numpy, which is required to install pyproject.toml-based projects
      [end of output]

I have spent a few hours going down rabbit holes trying to make it work (like keiohta/tf2rl#75), so I thought I would pause and ask whether someone has already gotten this working on an M1/M2.

Also, I think the requirement pytorch==1.8.* should be changed to torch==1.8.*, since the package is published on PyPI as torch.

lbforaging 2.0.0 does not seem to have Foraging-8x8-2p-3f-v2

I was able to run run.py successfully after installing lbforaging 1.1.1.

dqn parameter sharing example

Running the parameter sharing example as documented

python run.py +algorithm=dqn env.name="lbforaging:Foraging-8x8-4p-3f-v2" env.time_limit=25 algorithm.model.critic.parameter_sharing=True

raises an error. It looks like the dqn.yaml config does not include actor or critic sections.

I successfully ran the example with the command

python run.py +algorithm=dqn env.name="lbforaging:Foraging-8x8-4p-3f-v2" env.time_limit=25 algorithm.model.parameter_sharing=True
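The failure is consistent with an override that targets a key the config does not define. The sketch below reproduces that behaviour on a plain nested dict. It is not Hydra's implementation, and the layout of `DQN_CONFIG` is a hypothetical stand-in for dqn.yaml, whose real contents are not shown in this issue.

```python
import copy

# Hypothetical stand-in for dqn.yaml: a model section with no actor/critic split.
DQN_CONFIG = {"algorithm": {"model": {"parameter_sharing": False}}}


def apply_override(config, dotted_key, value):
    """Return a copy of config with config[a][b][c] = value for key "a.b.c".

    Raises KeyError when any part of the path is missing, which mimics an
    override of a key that the config does not define.
    """
    updated = copy.deepcopy(config)
    node = updated
    *parents, leaf = dotted_key.split(".")
    for part in parents:
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"{dotted_key}: no key '{part}' in config")
        node = node[part]
    if not isinstance(node, dict) or leaf not in node:
        raise KeyError(f"{dotted_key}: no key '{leaf}' in config")
    node[leaf] = value
    return updated


if __name__ == "__main__":
    # Works: the key exists directly under algorithm.model.
    print(apply_override(DQN_CONFIG, "algorithm.model.parameter_sharing", True))
    # Fails: there is no "critic" section in this (assumed) layout.
    try:
        apply_override(DQN_CONFIG, "algorithm.model.critic.parameter_sharing", True)
    except KeyError as err:
        print("override rejected:", err)
```

Under this assumed layout, only the shorter path `algorithm.model.parameter_sharing` resolves, which matches the command that worked.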

DQN return normalization

The DQN return normalization does not de-normalize the next Q-values.
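A minimal sketch of the step in question, assuming the Q-network predicts values in normalized space and that `mean` and `std` are running return statistics. All names here are hypothetical and not taken from the repository: the next-state value is de-normalized before the Bellman backup, and the resulting target is normalized again.

```python
def normalize(x, mean, std):
    """Map a raw return into normalized space."""
    return (x - mean) / std


def denormalize(x, mean, std):
    """Map a normalized value back to the raw return scale."""
    return x * std + mean


def td_target(reward, done, q_next_normalized, gamma, mean, std):
    """One-step TD target, expressed in normalized space.

    The backup r + gamma * Q(s', a') only makes sense on the raw scale,
    so the network output is de-normalized first.
    """
    q_next = denormalize(q_next_normalized, mean, std)
    raw_target = reward + gamma * (1.0 - done) * q_next
    return normalize(raw_target, mean, std)


if __name__ == "__main__":
    # Raw next value is 1.0 * 2.0 + 10.0 = 12.0; raw target is 1.0 + 0.5 * 12.0 = 7.0.
    print(td_target(reward=1.0, done=0.0, q_next_normalized=1.0,
                    gamma=0.5, mean=10.0, std=2.0))
```

Skipping the `denormalize` call would mix a raw reward with a normalized value, so the target would be on neither scale.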

Feature request: update from gym to gymnasium

Super excited about this textbook and the lab’s work in general.

Would it be possible for this repo and the textbook to be upgraded from gym to gymnasium? Gymnasium is the maintained version of openai gym and is compatible with current RL training libraries (rllib and tianshou have already migrated, and stable-baselines3 is in the process of migrating).

For information about upgrading and compatibility, see migration guide and gym compatibility. The main difference is the API has switched to returning truncated and terminated, rather than done, in order to give more information and mitigate edge case issues. It’s usually as simple as changing the step function to return the additional value, and replacing “import gym” with “import gymnasium as gym”.

For some more context, gym v21 can no longer be installed without complicated workarounds. The next most widely used version is gym v26, which has the same API as gymnasium. I would be happy to help with updating the code or in any other way.
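The API difference described above can be sketched without installing either library. `ToyEnv` is a hypothetical stand-in for an environment whose `step` returns the newer five values, and `to_done_style` is an illustrative helper (not part of gym or gymnasium) that collapses them back into the older four-value form.

```python
def to_done_style(step_result):
    """Collapse (obs, reward, terminated, truncated, info) into the older
    (obs, reward, done, info) form, where done = terminated or truncated."""
    obs, reward, terminated, truncated, info = step_result
    return obs, reward, terminated or truncated, info


class ToyEnv:
    """Hypothetical environment using the five-value step API."""

    def __init__(self, time_limit=3):
        self.time_limit = time_limit
        self.t = 0

    def step(self, action):
        self.t += 1
        terminated = action == "goal"  # the episode ended inside the task itself
        truncated = self.t >= self.time_limit and not terminated  # time limit hit
        return self.t, 0.0, terminated, truncated, {}


if __name__ == "__main__":
    env = ToyEnv(time_limit=2)
    print(to_done_style(env.step("noop")))  # episode still running
    print(to_done_style(env.step("noop")))  # ended by the time limit
```

Keeping `terminated` and `truncated` separate matters for value bootstrapping: a time-limit cutoff is not a true terminal state, which is the edge case the newer API is meant to expose.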

Suggest to loosen the dependency on stable-baselines3

Hi, your project fast-marl requires "stable-baselines3==1.0" as a dependency. After analyzing the source code, we found that the following versions of stable-baselines3 would also work without affecting your project: 0.11.0, 0.11.1, 1.0rc2, 1.0rc1, 1.0rc0, and 1.1.0a3. We therefore suggest loosening the dependency from "stable-baselines3==1.0" to "stable-baselines3>=0.11.0,<=1.1.0a3" to avoid possible conflicts when importing more packages or for downstream projects that use fast-marl.

May I open a pull request to loosen the dependency on stable-baselines3?

By the way, could you please tell us whether this kind of dependency analysis would make it easier to maintain dependencies during your development?

We also give our detailed analysis as follows for your reference:

Your project fast-marl directly uses 4 APIs from package stable-baselines3.

stable_baselines3.common.vec_env.base_vec_env.VecEnv.step, stable_baselines3.common.vec_env.dummy_vec_env.DummyVecEnv.__init__, stable_baselines3.common.vec_env.subproc_vec_env.SubprocVecEnv.__init__, stable_baselines3.common.vec_env.subproc_vec_env.SubprocVecEnv.reset

Starting from the 4 APIs above, 11 further functions are called indirectly, including 6 of stable-baselines3's internal APIs and 5 external APIs. The call graph is listed below (omitting some repeated function occurrences).

[/semitable/fast-marl]
+--stable_baselines3.common.vec_env.base_vec_env.VecEnv.step
|      +--stable_baselines3.common.vec_env.base_vec_env.VecEnv.step_async
|      +--stable_baselines3.common.vec_env.base_vec_env.VecEnv.step_wait
+--stable_baselines3.common.vec_env.dummy_vec_env.DummyVecEnv.__init__
|      +--stable_baselines3.common.vec_env.base_vec_env.VecEnv.__init__
|      +--stable_baselines3.common.vec_env.util.obs_space_info
|      +--collections.OrderedDict
|      +--numpy.zeros
+--stable_baselines3.common.vec_env.subproc_vec_env.SubprocVecEnv.__init__
|      +--multiprocessing.get_all_start_methods
|      +--multiprocessing.get_context
|      +--stable_baselines3.common.vec_env.base_vec_env.CloudpickleWrapper.__init__
|      +--stable_baselines3.common.vec_env.base_vec_env.VecEnv.__init__
+--stable_baselines3.common.vec_env.subproc_vec_env.SubprocVecEnv.reset
|      +--stable_baselines3.common.vec_env.subproc_vec_env._flatten_obs
|      |      +--collections.OrderedDict
|      |      +--numpy.stack

We scanned stable-baselines3's versions and observed that between 1.0 and each of the versions [0.11.0, 0.11.1, 1.0rc2, 1.0rc1, 1.0rc0, 1.1.0a3], the changed functions (diffs listed below) have no intersection with any function or API mentioned above (whether called directly or indirectly by this project).

diff: 1.0(original) 0.11.0
['stable-baselines3.her.her.HER', 'stable-baselines3.common.distributions.TanhBijector', 'stable-baselines3.common.save_util.load_from_zip_file', 'stable-baselines3.common.save_util.json_to_data', 'stable-baselines3.common.base_class.BaseAlgorithm', 'stable-baselines3.common.on_policy_algorithm.OnPolicyAlgorithm', 'stable-baselines3.dqn.dqn.DQN', 'stable-baselines3.common.distributions.TanhBijector.atanh', 'stable-baselines3.common.off_policy_algorithm.OffPolicyAlgorithm._convert_train_freq', 'stable-baselines3.common.preprocessing.maybe_transpose', 'stable-baselines3.common.base_class.BaseAlgorithm.load', 'stable-baselines3.common.off_policy_algorithm.OffPolicyAlgorithm', 'stable-baselines3.common.utils.set_random_seed', 'stable-baselines3.common.on_policy_algorithm.OnPolicyAlgorithm._setup_model', 'stable-baselines3.common.off_policy_algorithm.OffPolicyAlgorithm._setup_model', 'stable-baselines3.common.policies.BasePolicy', 'stable-baselines3.her.her.HER.load', 'stable-baselines3.common.vec_env.obs_dict_wrapper.ObsDictWrapper']

diff: 1.0(original) 0.11.1
['stable-baselines3.her.her.HER', 'stable-baselines3.common.distributions.TanhBijector', 'stable-baselines3.common.save_util.load_from_zip_file', 'stable-baselines3.common.save_util.json_to_data', 'stable-baselines3.common.base_class.BaseAlgorithm', 'stable-baselines3.common.on_policy_algorithm.OnPolicyAlgorithm', 'stable-baselines3.dqn.dqn.DQN', 'stable-baselines3.common.distributions.TanhBijector.atanh', 'stable-baselines3.common.off_policy_algorithm.OffPolicyAlgorithm', 'stable-baselines3.common.preprocessing.maybe_transpose', 'stable-baselines3.common.base_class.BaseAlgorithm.load', 'stable-baselines3.common.utils.set_random_seed', 'stable-baselines3.common.policies.BasePolicy', 'stable-baselines3.common.on_policy_algorithm.OnPolicyAlgorithm._setup_model', 'stable-baselines3.common.off_policy_algorithm.OffPolicyAlgorithm._setup_model', 'stable-baselines3.her.her.HER.load', 'stable-baselines3.common.vec_env.obs_dict_wrapper.ObsDictWrapper']

diff: 1.0(original) 1.0rc2
[](no clear difference between the source codes of two versions)

diff: 1.0(original) 1.0rc1
['stable-baselines3.her.her.HER', 'stable-baselines3.her.her.HER.load']

diff: 1.0(original) 1.0rc0
['stable-baselines3.her.her.HER', 'stable-baselines3.common.distributions.TanhBijector', 'stable-baselines3.common.save_util.load_from_zip_file', 'stable-baselines3.common.save_util.json_to_data', 'stable-baselines3.common.base_class.BaseAlgorithm', 'stable-baselines3.common.on_policy_algorithm.OnPolicyAlgorithm', 'stable-baselines3.dqn.dqn.DQN', 'stable-baselines3.common.distributions.TanhBijector.atanh', 'stable-baselines3.common.off_policy_algorithm.OffPolicyAlgorithm', 'stable-baselines3.common.preprocessing.maybe_transpose', 'stable-baselines3.common.base_class.BaseAlgorithm.load', 'stable-baselines3.common.utils.set_random_seed', 'stable-baselines3.common.policies.BasePolicy', 'stable-baselines3.common.on_policy_algorithm.OnPolicyAlgorithm._setup_model', 'stable-baselines3.common.off_policy_algorithm.OffPolicyAlgorithm._setup_model', 'stable-baselines3.her.her.HER.load', 'stable-baselines3.common.vec_env.obs_dict_wrapper.ObsDictWrapper']

diff: 1.0(original) 1.1.0a3
['stable-baselines3.common.monitor.ResultsWriter.close', 'stable-baselines3.common.vec_env.vec_transpose.VecTransposeImage', 'stable-baselines3.common.vec_env.vec_monitor.VecMonitor.step_wait', 'stable-baselines3.common.monitor.ResultsWriter.__init__', 'stable-baselines3.common.vec_env.vec_monitor.VecMonitor', 'stable-baselines3.common.monitor.Monitor.close', 'stable-baselines3.sac.sac.SAC', 'stable-baselines3.common.vec_env.vec_monitor.VecMonitor.close', 'stable-baselines3.common.monitor.ResultsWriter.write_row', 'stable-baselines3.common.monitor.ResultsWriter', 'stable-baselines3.common.vec_env.vec_extract_dict_obs.VecExtractDictObs.step_wait', 'stable-baselines3.common.vec_env.vec_extract_dict_obs.VecExtractDictObs.__init__', 'stable-baselines3.common.base_class.BaseAlgorithm', 'stable-baselines3.common.vec_env.vec_normalize.VecNormalize.step_wait', 'stable-baselines3.common.vec_env.vec_monitor.VecMonitor.__init__', 'stable-baselines3.common.off_policy_algorithm.OffPolicyAlgorithm.__init__', 'stable-baselines3.common.monitor.Monitor.reset', 'stable-baselines3.common.vec_env.vec_extract_dict_obs.VecExtractDictObs', 'stable-baselines3.dqn.dqn.DQN.train', 'stable-baselines3.sac.sac.SAC.__init__', 'stable-baselines3.common.vec_env.vec_transpose.VecTransposeImage.step_wait', 'stable-baselines3.common.vec_env.vec_normalize.VecNormalize', 'stable-baselines3.common.torch_layers.MlpExtractor', 'stable-baselines3.common.monitor.Monitor.step', 'stable-baselines3.common.vec_env.vec_extract_dict_obs.VecExtractDictObs.reset', 'stable-baselines3.common.vec_env.vec_monitor.VecMonitor.reset', 'stable-baselines3.common.monitor.Monitor.get_episode_rewards', 'stable-baselines3.common.off_policy_algorithm.OffPolicyAlgorithm', 'stable-baselines3.ddpg.ddpg.DDPG.__init__', 'stable-baselines3.td3.td3.TD3', 'stable-baselines3.common.monitor.Monitor', 'stable-baselines3.ddpg.ddpg.DDPG', 'stable-baselines3.td3.td3.TD3.__init__', 'stable-baselines3.td3.td3.TD3.train', 
'stable-baselines3.dqn.dqn.DQN']

As for other packages, the APIs of collections, numpy and multiprocessing are called by stable-baselines3 in the call graph and the dependencies on these packages also stay the same in our suggested versions, thus avoiding any outside conflict.

Therefore, we believe that it is quite safe to loosen your dependency on stable-baselines3 from "stable-baselines3==1.0" to "stable-baselines3>=0.11.0,<=1.1.0a3". This will improve the applicability of fast-marl and reduce the chance of further dependency conflicts with other projects.
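The compatibility argument above amounts to a set-disjointness check between the functions the project reaches and the functions that changed between two releases. The sketch below is an illustration, not the tool used for the analysis. It uses names copied from the call graph and from the 1.0 vs 1.0rc1 diff, and normalizes the `stable-baselines3.` prefix used in the diffs to the `stable_baselines3.` module prefix used in the call graph.

```python
# Functions reached by the project, taken from the call graph above
# (the 4 direct APIs plus the stable-baselines3 functions they call).
USED_APIS = {
    "stable_baselines3.common.vec_env.base_vec_env.VecEnv.step",
    "stable_baselines3.common.vec_env.base_vec_env.VecEnv.step_async",
    "stable_baselines3.common.vec_env.base_vec_env.VecEnv.step_wait",
    "stable_baselines3.common.vec_env.base_vec_env.VecEnv.__init__",
    "stable_baselines3.common.vec_env.base_vec_env.CloudpickleWrapper.__init__",
    "stable_baselines3.common.vec_env.dummy_vec_env.DummyVecEnv.__init__",
    "stable_baselines3.common.vec_env.subproc_vec_env.SubprocVecEnv.__init__",
    "stable_baselines3.common.vec_env.subproc_vec_env.SubprocVecEnv.reset",
    "stable_baselines3.common.vec_env.subproc_vec_env._flatten_obs",
    "stable_baselines3.common.vec_env.util.obs_space_info",
}

# Changed functions between 1.0 and 1.0rc1, as listed in the diff above.
CHANGED_1_0_VS_1_0RC1 = [
    "stable-baselines3.her.her.HER",
    "stable-baselines3.her.her.HER.load",
]


def to_module_name(name):
    """Rewrite the package-style prefix used in the diffs to the module prefix."""
    return name.replace("stable-baselines3.", "stable_baselines3.", 1)


def is_unaffected(used, changed):
    """True when none of the changed functions is reached by the project."""
    return used.isdisjoint(to_module_name(name) for name in changed)


if __name__ == "__main__":
    print(is_unaffected(USED_APIS, CHANGED_1_0_VS_1_0RC1))
```

Disjointness only shows that no reached function was edited between the two releases. It does not by itself rule out behavioural changes through shared state or base classes, so running the project's tests against each candidate version remains a useful complement.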

What does the obs of each agent contain?

Hi, what does the obs of each agent contain? It seems there is no introduction in the docs.

a possible bug in the video recording

https://github.com/semitable/fast-marl/blob/0711d925e816fa49f252702f17b8b766590321e7/fastmarl/ac/train.py#L76

I think the second component of the if statement ends up computing integer % bool.
To record the video at different frequencies, we may need to use step % frequency == 0 instead.
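A minimal sketch of the suggested condition, assuming the recording setting is an integer frequency (the names `should_record`, `step` and `frequency` are hypothetical, not taken from train.py). It also shows why taking the modulo against a boolean does not give a configurable frequency: `True` behaves like 1, so the remainder is always 0.

```python
def should_record(step, frequency):
    """Record on every `frequency`-th step; a frequency of 0 disables recording."""
    return frequency > 0 and step % frequency == 0


if __name__ == "__main__":
    # With an integer frequency only every 5th step qualifies.
    print([s for s in range(12) if should_record(s, 5)])
    # With a boolean on the right-hand side, every step has remainder 0,
    # because True is treated as the integer 1.
    print([s for s in range(4) if s % True == 0])
```

Note that `step % False` would raise `ZeroDivisionError`, so the explicit `frequency > 0` guard also covers the disabled case.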
