This project is forked from agi-brain/xuance.

Home Page: https://github.com/wenzhangliu/XuanPolicy.git

License: MIT License

XuanPolicy: A Comprehensive and Unified Deep Reinforcement Learning Library


XuanPolicy is an open-source ensemble of Deep Reinforcement Learning (DRL) algorithm implementations.

We call it Xuan-Ce (玄策) in Chinese: "Xuan (玄)" means incredible and magical, and "Ce (策)" means policy.

DRL algorithms are sensitive to hyper-parameter tuning, vary in performance with different implementation tricks, and can suffer from unstable training. As a result, they sometimes seem elusive, or "Xuan". This project provides a thorough, high-quality, and easy-to-understand implementation of DRL algorithms, and we hope it offers a hint of the magic behind reinforcement learning.

We aim for compatibility with multiple deep learning toolboxes (PyTorch, TensorFlow, and MindSpore), and hope the library truly becomes a zoo full of DRL algorithms.

Currently Supported Agents

DRL

  • Vanilla Policy Gradient - PG [Paper]
  • Phasic Policy Gradient - PPG [Paper] [Code]
  • Advantage Actor Critic - A2C [Paper] [Code]
  • Soft actor-critic based on maximum entropy - SAC [Paper] [Code]
  • Soft actor-critic for discrete actions - SAC-Discrete [Paper] [Code]
  • Proximal Policy Optimization with clipped objective - PPO-Clip [Paper] [Code]
  • Proximal Policy Optimization with KL divergence - PPO-KL [Paper] [Code]
  • Deep Q Network - DQN [Paper]
  • DQN with Double Q-learning - Double DQN [Paper]
  • DQN with Dueling network - Dueling DQN [Paper]
  • DQN with Prioritized Experience Replay - PER [Paper]
  • DQN with Parameter Space Noise for Exploration - NoisyNet [Paper]
  • DQN with Convolutional Neural Network - C-DQN [Paper]
  • DQN with Long Short-term Memory - L-DQN [Paper]
  • DQN with CNN and Long Short-term Memory - CL-DQN [Paper]
  • DQN with Quantile Regression - QRDQN [Paper]
  • Distributional Reinforcement Learning - C51 [Paper]
  • Deep Deterministic Policy Gradient - DDPG [Paper] [Code]
  • Twin Delayed Deep Deterministic Policy Gradient - TD3 [Paper] [Code]
  • Parameterised deep Q network - P-DQN [Paper]
  • Multi-pass parameterised deep Q network - MP-DQN [Paper] [Code]
  • Split parameterised deep Q network - SP-DQN [Paper]
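As a flavor of the differences among the DQN variants above, here is a minimal NumPy sketch (not the library's implementation; function names and array shapes are illustrative assumptions) of how vanilla DQN and Double DQN compute their bootstrap targets:

```python
import numpy as np

def dqn_target(reward, next_q_target, done, gamma=0.99):
    # Vanilla DQN: the target network both selects and evaluates the next
    # action, which is known to overestimate Q-values.
    return reward + gamma * (1.0 - done) * next_q_target.max(axis=1)

def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    # Double DQN: the online network selects the action and the target
    # network evaluates it, reducing the overestimation bias.
    a = next_q_online.argmax(axis=1)
    return reward + gamma * (1.0 - done) * next_q_target[np.arange(len(a)), a]
```

Both functions take a batch of transitions; `done` masks out the bootstrap term for terminal states.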

MARL

  • Independent Q-learning - IQL [Paper] [Code]
  • Value Decomposition Networks - VDN [Paper] [Code]
  • Q-mixing networks - QMIX [Paper] [Code]
  • Weighted Q-mixing networks - WQMIX [Paper] [Code]
  • Q-transformation - QTRAN [Paper] [Code]
  • Deep Coordination Graphs - DCG [Paper] [Code]
  • Independent Deep Deterministic Policy Gradient - IDDPG [Paper]
  • Multi-agent Deep Deterministic Policy Gradient - MADDPG [Paper] [Code]
  • Counterfactual Multi-agent Policy Gradient - COMA [Paper] [Code]
  • Multi-agent Proximal Policy Optimization - MAPPO [Paper] [Code]
  • Mean-Field Q-learning - MFQ [Paper] [Code]
  • Mean-Field Actor-Critic - MFAC [Paper] [Code]
  • Independent Soft Actor-Critic - ISAC
  • Multi-agent Soft Actor-Critic - MASAC [Paper]
  • Multi-agent Twin Delayed Deep Deterministic Policy Gradient - MATD3 [Paper]
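To illustrate the value-factorization family above (VDN, QMIX), here is a minimal NumPy sketch, under assumed shapes, of how per-agent Q-values can be mixed into a joint value. These helpers are illustrative only, not the library's actual mixing networks:

```python
import numpy as np

def vdn_mix(agent_qs):
    # VDN: the joint action-value is a plain sum of per-agent utilities.
    return agent_qs.sum(axis=-1)

def qmix_mix(agent_qs, w, b):
    # QMIX: state-conditioned mixing with non-negative weights (|w|),
    # which keeps Q_tot monotonic in every agent's Q-value.
    return agent_qs @ np.abs(w) + b
```

In QMIX the weights `w` and bias `b` would come from hypernetworks conditioned on the global state; taking the absolute value of `w` is what enforces monotonicity.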

Supported Environments

Toy Environments (Classic Control, Box2D, etc.)


CartPole

Pendulum

LunarLander

...
MuJoCo Environments

Ant

HalfCheetah

Hopper

Humanoid

...
Atari Environments

Breakout

Boxing

Alien

Adventure

Air Raid

...
MPE Environments

Simple Push

Simple Reference

Simple Spread

...
MAgent Environments

Battle

Tiger Deer

Battle Field

...

Installation

The library runs on Linux, Windows, macOS, EulerOS, and other platforms.

Before installing XuanPolicy, install Anaconda to prepare a Python environment.

Then open a terminal and install XuanPolicy with the following steps.

Step 1: Create a new conda environment (python>=3.7 is suggested):

```shell
conda create -n xpolicy python=3.7
```

Step 2: Activate the conda environment:

```shell
conda activate xpolicy
```

Step 3: Install the library:

```shell
pip install xuanpolicy
```

This command does not install the deep learning toolboxes. To install XuanPolicy together with a deep learning backend, use one of:

  • `pip install xuanpolicy[torch]` for PyTorch
  • `pip install xuanpolicy[tensorflow]` for TensorFlow2
  • `pip install xuanpolicy[mindspore]` for MindSpore
  • `pip install xuanpolicy[all]` for all of the above

Note: Some extra packages should be installed manually for further usage.

Basic Usage

Quick Start

Train a Model

```python
import xuanpolicy as xp

runner = xp.get_runner(agent_name='dqn', env_name='toy/CartPole-v0', is_test=False)
runner.run()
```

Test the Model

```python
import xuanpolicy as xp

runner_test = xp.get_runner(agent_name='dqn', env_name='toy/CartPole-v0', is_test=True)
runner_test.run()
```

Logger

You can use TensorBoard to visualize what happens during training. After training, the log files are generated automatically in the ".results/" directory, and you should be able to see the training data after running the command below.

```shell
$ tensorboard --logdir ./logs/dqn/torch/CartPole-v0
```

If everything is going well, you should see a display similar to the TensorBoard screenshot below.

(TensorBoard screenshot)

Selected Results

Toy Environments

Mujoco Environments

MPE Environments

If you use XuanPolicy in your research, please cite:

```
@article{XuanPolicy2023,
    author = {Wenzhang Liu and Wenzhe Cai and Kun Jiang and Yuanda Wang and Guangran Cheng and Jiawei Wang and Jingyu Cao and Lele Xu and Chaoxu Mu and Changyin Sun},
    title = {XuanPolicy: A Comprehensive and Unified Deep Reinforcement Learning Library},
    year = {2023}
}
```

Contributors

  • wenzhangliu