- Actor Critic
- Actor Critic Experience Replay
- Advantage Actor Critic
- Asynchronous Advantage Actor Critic
- Deep Deterministic Policy Gradient
- Generalized Advantage Estimation
- N-steps Advantage Actor Critic
- Proximal Policy Optimization (PPO)
- PPO Generalized Advantage Estimation
- PPO LSTM
- Q Learning
- REINFORCE
- REINFORCE Moving Average Baseline
- Soft Actor Critic
leaderj1001 / bag-of-rl
Bag of Reinforcement Learning Algorithm
License: MIT License