djbyrne / core_rl Goto Github PK



Repo of core reinforcement learning algorithms and explanations using PyTorch Lightning

Python 100.00%

reinforcement-learning reinforcement-learning-algorithms reinforcement-learning-agent pytorch pytorch-lightning

core_rl's Introduction

core_rl's People

Contributors

Stargazers

Watchers

Forkers

00krishna-research raymondliz zeguanxiao cemberk cashbeario

core_rl's Issues

Clean Modular Architecture Refactor

Running Test without loading the checkpoint fails

DQN Variants Implementation

DQN
Double DQN
N-Step DQN
Dueling DQN
Noisy DQN
PER DQN
Distributional DQN

Implementation of on-policy algorithms

Hey, great repository. How do you intend to implement the on-policy algorithms? Specifically, how would you implement the train_dataloader() method in the lightning module?
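One possible answer, sketched in plain Python rather than taken from this repository: collect experience with the current policy inside a generator and yield one batch per finished episode. In a lightning module, a generator like this could be wrapped in a `torch.utils.data.IterableDataset` and handed to a `DataLoader` from `train_dataloader()`. `ToyEnv`, `rollout_batches`, and the corridor task are hypothetical names and assumptions made up for illustration.

```python
from typing import Callable, Iterator, List, Tuple

# (state, action, discounted return) for one step of an episode
Transition = Tuple[int, int, float]


class ToyEnv:
    """Hypothetical 1-D corridor: start at 0, goal at position 3, 10-step limit."""

    def reset(self) -> int:
        self.pos, self.steps = 0, 0
        return self.pos

    def step(self, action: int) -> Tuple[int, float, bool]:
        # action 1 moves right, anything else moves left (clamped at 0)
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        self.steps += 1
        done = self.pos == 3 or self.steps >= 10
        reward = 1.0 if self.pos == 3 else 0.0
        return self.pos, reward, done


def rollout_batches(
    env: ToyEnv,
    policy: Callable[[int], int],
    gamma: float = 0.99,
    episodes: int = 3,
) -> Iterator[List[Transition]]:
    """Yield one batch per episode, collected with the *current* policy."""
    for _ in range(episodes):
        state, done = env.reset(), False
        states, actions, rewards = [], [], []
        while not done:
            action = policy(state)
            next_state, reward, done = env.step(action)
            states.append(state)
            actions.append(action)
            rewards.append(reward)
            state = next_state
        # Discounted returns, accumulated from the last step backwards.
        running, returns = 0.0, []
        for r in reversed(rewards):
            running = r + gamma * running
            returns.append(running)
        returns.reverse()
        yield list(zip(states, actions, returns))


if __name__ == "__main__":
    for batch in rollout_batches(ToyEnv(), policy=lambda s: 1, episodes=2):
        print(batch)
```

Because each batch is produced on demand, nothing is stored between updates, which matches the on-policy requirement that training data come from the policy being optimised.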

Usefulness of Pytorch-lightning for RL

Hey @djbyrne, great repository! I followed the discussion on the pytorch-lightning repo about using it for RL and found you that way.

I was wondering: in your opinion, how useful is pytorch-lightning for RL? I agree that it makes the code more structured, i.e. people who are used to pytorch-lightning will probably be able to find their way around quickly. One can also argue that things such as multi-GPU training are made easier.

But to me, pytorch-lightning's main selling point is that you don't need to implement the whole training loop, batching included. However, looking at your ExperienceSource and EpisodeExperienceSource classes, we basically still have to implement the "training" loop ourselves, where an agent observes a state, takes an action, and moves to the next state. In supervised learning, splitting your dataset into batches and iterating over them by hand is very error prone, so having the framework do it is a real gain. But here, for RL, it feels like we are interacting with the environment and then explicitly storing the result in a dataset just so we can sample from it again. For certain algorithms (those that are on-policy and don't have a replay buffer) this might not be necessary. Sure, lightning will do the batching for you, but for episodic tasks you might only use a batch size of 1 anyway.

I really do like pytorch-lightning and I really do like your code; I am just wondering whether it is worth the effort. In other words, what are the benefits? Somehow it feels a little bit like writing code to fit into the lightning framework.

I am just trying to figure out for myself if I should use it for RL or not. And I thought that you might have an opinion that could convince me, because I don't see it yet.

Cheers and thanks a lot :)
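For readers unfamiliar with the pattern the question describes (interact with the environment, store the transition, then sample it back as a training batch), here is a minimal plain-Python sketch. It is not the repository's ExperienceSource; `ReplayBuffer`, `experience_batches`, and `step_env` are hypothetical names, and the environment step is stubbed out as a callable.

```python
import random
from collections import deque
from typing import Callable, Deque, Iterator, List, Tuple

# (state, action, reward, next_state, done)
Experience = Tuple[int, int, float, int, bool]


class ReplayBuffer:
    """Fixed-size store; the oldest experience is dropped once capacity is hit."""

    def __init__(self, capacity: int) -> None:
        self.buffer: Deque[Experience] = deque(maxlen=capacity)

    def __len__(self) -> int:
        return len(self.buffer)

    def append(self, experience: Experience) -> None:
        self.buffer.append(experience)

    def sample(self, batch_size: int, rng: random.Random) -> List[Experience]:
        # Sample without replacement; shrink the batch while the buffer is small.
        k = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), k)


def experience_batches(
    buffer: ReplayBuffer,
    step_env: Callable[[], Experience],
    batch_size: int,
    num_batches: int,
    seed: int = 0,
) -> Iterator[List[Experience]]:
    """Take one environment step, store it, then yield a sampled batch."""
    rng = random.Random(seed)
    for _ in range(num_batches):
        buffer.append(step_env())
        yield buffer.sample(batch_size, rng)


if __name__ == "__main__":
    counter = iter(range(100))

    def fake_step() -> Experience:
        # Stand-in for a real agent/environment interaction.
        s = next(counter)
        return (s, 0, 0.0, s + 1, False)

    for batch in experience_batches(ReplayBuffer(4), fake_step, 2, 6):
        print(batch)
```

The storing and re-sampling that looks redundant for on-policy methods is what lets off-policy methods such as DQN reuse old transitions and decorrelate consecutive samples.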

core_rl's Introduction

Core RL

Algorithms

Off Policy

On Policy

Training Features

Installation

Quick Start
