
This project forked from xiaoxiaoguo/tensor4rl


This repository contains an implementation of the paper Hybrid Reinforcement Learning with Expert State Sequences.

License: MIT License

Python 100.00%

tensor4rl's Introduction

Hybrid reinforcement learning with expert state sequences

About this repository

This repository contains an implementation of the paper Hybrid Reinforcement Learning with Expert State Sequences. The implementation is built directly on top of a PyTorch implementation of Advantage Actor-Critic (A2C).

Dependencies

To get started with the framework, install the following dependencies:

Training and model configuration

A2C baseline (A2C agent in the paper):

python main.py --policy-coef 1 --entropy-coef 0.5 --value-loss-coef 0.2 --dual-act-coef 0 --dual-state-coef 0 --dual-sup-coef 0 --dual-emb-coef 0 --log-dir baseline_a2c 
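
One way to read the coefficient flags is as weights in a single summed training loss. The sketch below assumes that form; it is not taken from the repository. The LossCoefs class, the total_loss function, the term names, and the numeric loss values are all hypothetical. With every dual-* coefficient at 0, as in the baseline command, only the A2C terms contribute.

```python
from dataclasses import dataclass


@dataclass
class LossCoefs:
    """Hypothetical container mirroring the coefficient flags of main.py."""
    policy: float = 1.0       # --policy-coef
    entropy: float = 0.5      # --entropy-coef
    value: float = 0.2        # --value-loss-coef
    dual_act: float = 0.0     # --dual-act-coef
    dual_state: float = 0.0   # --dual-state-coef
    dual_sup: float = 0.0     # --dual-sup-coef
    dual_emb: float = 0.0     # --dual-emb-coef


def total_loss(coefs, terms):
    """Weighted sum of loss terms (assumed form, not the repository's code).

    The entropy term is subtracted, following the usual A2C convention of
    treating entropy as a bonus rather than a penalty.
    """
    return (
        coefs.policy * terms["policy"]
        + coefs.value * terms["value"]
        - coefs.entropy * terms["entropy"]
        + coefs.dual_act * terms["dual_act"]
        + coefs.dual_state * terms["dual_state"]
        + coefs.dual_sup * terms["dual_sup"]
        + coefs.dual_emb * terms["dual_emb"]
    )


# Illustrative scalar values only; real terms would be computed per batch.
example_terms = {
    "policy": 0.8, "value": 1.5, "entropy": 0.6,
    "dual_act": 2.0, "dual_state": 1.0, "dual_sup": 3.0, "dual_emb": 4.0,
}

baseline = LossCoefs()  # defaults match the baseline command's coefficients
baseline_total = total_loss(baseline, example_terms)
```

Under this assumption, setting --policy-coef and --value-loss-coef to 0 would remove the policy and value terms while keeping the entropy term and the action inference terms.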

Hybrid agent combining A2C and behavior cloning from observation with the proposed action inference:

python main.py --policy-coef 1 --entropy-coef 0.5 --value-loss-coef 0.2 --dual-act-coef 2 --dual-sup-coef 2 --dual-emb-coef 0.1 --dual-rank 2 --dual-emb-dim 128 --dual-type dual --log-dir hybrid_dual 

Hybrid agent combining A2C and behavior cloning from observation with the MLP-based action inference (Hybrid-MLP agent in the paper):

python main.py --policy-coef 1 --entropy-coef 0.5 --value-loss-coef 0.2 --dual-act-coef 2 --dual-sup-coef 2 --dual-emb-coef 0.1 --dual-rank 2 --dual-emb-dim 128 --dual-type mlp --log-dir hybrid_mlp 

(For the MLP-type action inference model, --dual-rank sets the number of MLP layers.)

Behavior cloning from observation with the proposed action inference (BC-Dual agent in the paper):

python main.py --policy-coef 0 --entropy-coef 0.5 --value-loss-coef 0 --dual-act-coef 2 --dual-sup-coef 2 --dual-emb-coef 0.1 --dual-rank 2 --dual-emb-dim 128 --dual-type dual --log-dir dual_only 

Behavior cloning from observation with the MLP-based action inference model (BC-MLP agent in the paper):

python main.py --policy-coef 0 --entropy-coef 0.5 --value-loss-coef 0 --dual-act-coef 2 --dual-sup-coef 2 --dual-emb-coef 0.1 --dual-rank 2 --dual-emb-dim 128 --dual-type mlp --log-dir mlp_only 
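
The five commands above differ only in a few flags, so they can be generated from one table. The following is a minimal sketch of such a helper; the CONFIGS table and the build_command function are hypothetical and not part of the repository. The flag names and values are copied from the commands above.

```python
# Flags shared by the runs that train with the A2C objective.
RL_ON = {"policy-coef": 1, "entropy-coef": 0.5, "value-loss-coef": 0.2}
# Flags shared by the behavior-cloning-only runs.
RL_OFF = {"policy-coef": 0, "entropy-coef": 0.5, "value-loss-coef": 0}
# Flags shared by every run that uses an action inference model.
DUAL_ON = {
    "dual-act-coef": 2, "dual-sup-coef": 2, "dual-emb-coef": 0.1,
    "dual-rank": 2, "dual-emb-dim": 128,
}
# The baseline zeroes every dual coefficient.
DUAL_OFF = {
    "dual-act-coef": 0, "dual-state-coef": 0,
    "dual-sup-coef": 0, "dual-emb-coef": 0,
}

# Keys are the --log-dir values used in the commands above.
CONFIGS = {
    "baseline_a2c": {**RL_ON, **DUAL_OFF},
    "hybrid_dual": {**RL_ON, **DUAL_ON, "dual-type": "dual"},
    "hybrid_mlp": {**RL_ON, **DUAL_ON, "dual-type": "mlp"},
    "dual_only": {**RL_OFF, **DUAL_ON, "dual-type": "dual"},
    "mlp_only": {**RL_OFF, **DUAL_ON, "dual-type": "mlp"},
}


def build_command(log_dir):
    """Return the command line for one named configuration as a string."""
    args = ["python", "main.py"]
    for flag, value in CONFIGS[log_dir].items():
        args += [f"--{flag}", str(value)]
    args += ["--log-dir", log_dir]
    return " ".join(args)


if __name__ == "__main__":
    for name in CONFIGS:
        print(build_command(name))
```

Running the script prints one command per configuration, which can then be pasted into a shell or passed to a job scheduler.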

The noise in the expert demonstrations can be controlled with two arguments: --demo-eps (the ratio of non-optimal actions) and --demo-eta (the ratio of missing states).
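
To make the two noise ratios concrete, the sketch below shows one plausible interpretation: each expert action is swapped for a different action with probability eps, and each demonstrated state is dropped with probability eta. This is an assumption about what the flags mean, not the repository's code. The function names noisy_expert_action and drop_states are hypothetical.

```python
import random


def noisy_expert_action(optimal_action, num_actions, eps, rng):
    """With probability eps, return a non-optimal action chosen uniformly.

    Assumes discrete actions 0..num_actions-1 and num_actions >= 2.
    """
    if rng.random() < eps:
        others = [a for a in range(num_actions) if a != optimal_action]
        return rng.choice(others)
    return optimal_action


def drop_states(states, eta, rng):
    """Drop each state independently with probability eta (assumed model)."""
    return [s for s in states if rng.random() >= eta]


rng = random.Random(0)  # fixed seed so the illustration is repeatable

# eps = 0 keeps the expert optimal; eps = 1 always picks another action.
clean = [noisy_expert_action(2, 4, 0.0, rng) for _ in range(5)]
noisy = [noisy_expert_action(2, 4, 1.0, rng) for _ in range(5)]

# eta = 0 keeps every state; eta = 1 removes all of them.
full = drop_states(["s0", "s1", "s2"], 0.0, rng)
empty = drop_states(["s0", "s1", "s2"], 1.0, rng)
```

Intermediate values, such as eps = 0.1 or eta = 0.5, would yield demonstrations that are mostly optimal but occasionally wrong or incomplete.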

License

MIT License

tensor4rl's People

Contributors

xiaoxiaoguo
