32 projects in the framework of Deep Reinforcement Learning algorithms: Q-learning, DQN, PPO, DDPG, TD3, SAC, A2C and others. Each project is provided with a detailed training log.

Python 1.56% Jupyter Notebook 98.44%

deep-rl-algorithms github-udacity dqn-ppo-ddpg dqn td3 cartpole bipedalwalker deep-reinforcement-learning sac carracing

deep-reinforcement-learning-algorithms's People

roshray donghaoqiao muhamuttaqien northtiger jatinmayekar alp-anka zhangbei123 vishalkrishnablaze jorditorresbcn kyunghoon-jung lgh0504 dkinneybu libin19861023 zhuoranli23 stepneverstop wodole jackyvan alextooter jingmouren staminatang traveldriffter allensmile mingkin zhanyon qinjielin-nu skyknights gabriel1521 lindazha0 xiayuyang niharikavadlamudi whiskyching girime262 zhangjing2020 samuelsuntree huanghaoyu1997 ccoo sparksfly8 stevenjokess eugeneyuz darrenzhang01 tsaijk chenshengduo donkas fdmartins kelvinlz jailukanna danbodanbo nicolizamacorrea ming-er yangzi-0207 arghyachatterjee mlgenometech 123world alexdavydov357 hmcbsj neverstoplearn mayur62987 nebulabiu ciniks117 wolflegend99 rk753 pythonairoad sahanabalappa darkraipro 1zzc hahacome cmuslima wesley-yang alirezashamsoshoara hongyonghan yang970901 mxd6 scm-codes yash-tandon tianyu-z rohit-chandra minalp22 nmar33 upqsdy46853 dogballlee karina0711 zhenyuanlin mste-sysu pa-wan zhiyizeng rubenwol rl-code-lib tjevgerres jck-1096 devineng czqstrong aleksficek baldwin054212 touranisatyajit zyc9894 dandan0102 zijiwang liao-wk python-repository-hub aishan224

deep-reinforcement-learning-algorithms's Issues

How to remove the environment logging in the console?

I get many log lines like this whenever a new episode begins: Track generation: 1220..1529 -> 309-tiles track

I noticed that this logging does not appear in the console output on your .ipynb page. Can you tell me how to remove it? Thank you very much.
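One possible way to silence this kind of message is sketched below, assuming the line is emitted with an ordinary Python print() during reset(). The NoisyEnv class and the quiet_reset helper are hypothetical stand-ins, not code from the repository. Some Gym releases also appear to accept a verbosity argument on the CarRacing constructor, but that depends on the installed version and is not assumed here.

```python
import contextlib
import io


@contextlib.contextmanager
def quiet_stdout():
    """Temporarily discard anything written to sys.stdout via print()."""
    with contextlib.redirect_stdout(io.StringIO()):
        yield


class NoisyEnv:
    """Hypothetical stand-in for an environment that prints on every reset."""

    def reset(self):
        print("Track generation: 1220..1529 -> 309-tiles track")
        return 0  # placeholder observation


def quiet_reset(env):
    """Call env.reset() while suppressing Python-level console output."""
    with quiet_stdout():
        return env.reset()


if __name__ == "__main__":
    observation = quiet_reset(NoisyEnv())  # nothing is printed here
    print("first observation:", observation)
```

This only hides output written through Python's sys.stdout; messages written directly by native code would need a different approach.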

Reward shaping not removed in evaluation in CarRacing-From-Pixels-PPO

Hi,

The figure and log in the README show scores above 1000, which CarRacing's reward design makes practically impossible.
It turns out that the reward shaping in Wrapper.step() is not disabled during evaluation, which leads to incorrect results.
After commenting out the relevant lines, I got an average score of 820 over 100 episodes.
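A minimal sketch of one way to keep shaping out of evaluation, rather than commenting lines out by hand, is a flag on the wrapper. Everything here is hypothetical: ShapedWrapper, ConstantRewardEnv, the bonus value, and the evaluate helper are illustrations, not the repository's Wrapper class or its actual shaping terms.

```python
class ConstantRewardEnv:
    """Hypothetical toy environment: every step yields a raw reward of 1.0."""

    def step(self, action):
        return 0, 1.0, False, {}


class ShapedWrapper:
    """Adds a shaping bonus to the reward unless shape_rewards is False."""

    def __init__(self, env, bonus=0.1, shape_rewards=True):
        self.env = env
        self.bonus = bonus  # assumed shaping term, for illustration only
        self.shape_rewards = shape_rewards

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        info = dict(info, raw_reward=reward)  # always expose the unshaped reward
        if self.shape_rewards:
            reward += self.bonus
        return obs, reward, done, info


def evaluate(wrapper, n_steps):
    """Sum unshaped rewards over n_steps, restoring the flag afterwards."""
    previous = wrapper.shape_rewards
    wrapper.shape_rewards = False
    try:
        total = 0.0
        for _ in range(n_steps):
            _, reward, done, _ = wrapper.step(0)
            total += reward
            if done:
                break
        return total
    finally:
        wrapper.shape_rewards = previous
```

With this layout, training code sees the shaped reward while reported scores come from the environment's own reward signal.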

Why did you remove the death penalty for solving CarRacing with PPO from raw pixels?

Hi, I've been looking through your code as a reference to figure out how to solve CarRacing-v0.

Mine works up to a point, then suffers a catastrophic performance crash.
The only difference I can find between my version and yours is that in mine the agent gets a big negative reward when the unwrapped environment is done (fails).
You removed this in your wrapper, and I don't understand why.

What's the significance of offsetting the reward there?
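To make the question concrete, the sketch below shows what offsetting a terminal penalty does to the return the agent learns from. It assumes a failure penalty of 100, which is an illustrative value; both function names are hypothetical and this is not the repository's wrapper code, nor a statement of the author's reasoning.

```python
def offset_terminal_penalty(reward, died, penalty=100.0):
    """Cancel an assumed failure penalty by adding it back on the failing step."""
    return reward + penalty if died else reward


def discounted_return(rewards, gamma=0.99):
    """Discounted sum of rewards, accumulated from the last step backwards."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    raw = [1.0, 1.0, 1.0, -100.0]  # episode ending in a failure
    died_flags = [False, False, False, True]
    adjusted = [offset_terminal_penalty(r, d) for r, d in zip(raw, died_flags)]
    print("return with penalty:   ", discounted_return(raw))
    print("return with offsetting:", discounted_return(adjusted))
```

A single large negative reward dominates the return of every step leading up to the failure, so keeping or cancelling it changes the scale of the targets the value function has to fit.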

rafael1s / deep-reinforcement-learning-algorithms
