awjuliani / meta-rl

Implementation of Meta-RL A3C algorithm

License: MIT License

Python 15.06% Jupyter Notebook 84.94%
reinforcement-learning tensorflow

meta-rl's Issues

Why was the n-step function removed?

Hello, I have a question. If I use n-step returns to update the episode buffer, which will work better?

I think off-policy will work better than on-policy. Sorry, I don't understand meta-learning yet; I just want to ask your opinion.
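
For readers unfamiliar with the term, an n-step return sums up to n discounted rewards and then bootstraps from the critic's value estimate. The sketch below is a generic illustration in plain Python, not the function that was removed from this repository; the name `n_step_returns` and its parameters are assumptions made for the example.

```python
def n_step_returns(rewards, values, bootstrap_value, gamma=0.99, n=5):
    """Return the n-step bootstrapped target for every step of a rollout.

    rewards[t]      : reward received after the action at step t
    values[t]       : critic estimate V(s_t)
    bootstrap_value : V of the state after the last step (0.0 if terminal)
    """
    T = len(rewards)
    # Index the value estimates so that extended[T] is the bootstrap value.
    extended = list(values) + [bootstrap_value]
    targets = []
    for t in range(T):
        end = min(t + n, T)  # truncate the window at the end of the rollout
        g = 0.0
        for k in range(t, end):
            g += (gamma ** (k - t)) * rewards[k]
        # Bootstrap from the value of the state reached after the window.
        g += (gamma ** (end - t)) * extended[end]
        targets.append(g)
    return targets


targets = n_step_returns([1.0, 1.0, 1.0], [0.0, 0.0, 0.0],
                         bootstrap_value=10.0, gamma=0.5, n=2)
```

Targets computed this way assume the rewards were collected by the current policy, so they are an on-policy quantity; reusing them off-policy would normally need some form of correction.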

Slicing bandit image in helper.py

Hi, thanks for sharing this very interesting project!

I think I might have found a typo or bug:

Traceback (most recent call last):
  File "/home/ajay/anaconda3/envs/rllab3/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/ajay/anaconda3/envs/rllab3/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "<ipython-input-9-be9dc3580d09>", line 33, in <lambda>
    worker_work = lambda: worker.work(gamma,sess,coord,saver,train)
  File "<ipython-input-4-804d524ea4f0>", line 94, in work
    episode_frames.append(set_image_bandit(episode_reward,self.env.bandit,a,t))
  File "/home/ajay/PythonProjects/Meta-RL/helper.py", line 65, in set_image_bandit
    bandit_image[115:115+values[0]*2.5,20:75,:] = [0,255.0,0]
TypeError: slice indices must be integers or None or have an __index__ method
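
The traceback points at a float slice bound: `115+values[0]*2.5` is not an integer, and Python 3 (like recent NumPy releases) refuses non-integer slice indices. A minimal sketch of one possible fix is to truncate the bound with `int()`. The helper name `bar_slice` and the plain list standing in for an image column are assumptions made so the example runs without NumPy.

```python
def bar_slice(start, value, scale=2.5):
    """Build a slice whose length is value * scale, truncated to an integer."""
    return slice(start, start + int(value * scale))


column = [0] * 200          # stand-in for one pixel column of bandit_image
rows = bar_slice(115, 7)    # 7 * 2.5 = 17.5, truncated to 17 rows
column[rows] = [255] * (rows.stop - rows.start)

# Under the same assumption, the line in helper.py would become:
# bandit_image[115:115 + int(values[0] * 2.5), 20:75, :] = [0, 255.0, 0]
```

Whether truncating or rounding gives the nicer picture is a matter of taste; either way the bound must be an integer.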

On another note, I'd like to apply this code to different problems. The problem I'd like to examine is one where the agent attempts to build small logic circuits at each time step. Mathematically, I'd like the agent to be able to output a simple transition probability matrix. That is, rather than a distribution over an action set, the agent's policy network outputs a small square matrix, say 8x8, with a single 1 in each row.

I guess I could make a_dist a vector of length 64 and then apply 8 softmaxes followed by one_hots or argmaxes, but that seems rather messy. I'm also not sure how this affects the "responsible output" in the loss calculation. Have you ever seen something like that before?

Cheers, Ajay
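
One way to read the 8x8 idea is as eight independent categorical choices, one per row. Under that assumption the joint probability of the sampled matrix is the product of the per-row probabilities, so the log of the "responsible output" becomes a sum of eight log-probabilities. The sketch below shows this in plain Python rather than TensorFlow; `row_softmax` and `sample_transition_matrix` are hypothetical names, not part of the repository.

```python
import math
import random


def row_softmax(logits):
    """Softmax over one row, shifted by the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_transition_matrix(flat_logits, size=8, rng=random):
    """Split size*size logits into rows and sample one column per row.

    Returns the sampled column indices, the one-hot matrix, and the
    summed log-probability of the sampled entries.
    """
    assert len(flat_logits) == size * size
    chosen, matrix, log_prob = [], [], 0.0
    for r in range(size):
        probs = row_softmax(flat_logits[r * size:(r + 1) * size])
        col = rng.choices(range(size), weights=probs)[0]
        chosen.append(col)
        matrix.append([1 if c == col else 0 for c in range(size)])
        # Independent rows: the log-probabilities add up.
        log_prob += math.log(probs[col])
    return chosen, matrix, log_prob


chosen, matrix, log_prob = sample_transition_matrix([0.0] * 64)
```

In a policy-gradient loss, `log_prob` would take the place of the log of the single-action responsible output and be weighted by the advantage in the same way.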

A3C-Meta-Grid did not work for CartPole

Hi awjuliani,

I have enjoyed reading your excellent examples. I tried to modify your A3C-Meta-Grid code to solve the CartPole problem, but it did not work. Even when I removed the RNN layer, it still did not work. It scores about 60 points at most. Do you know why?

Thanks!

Init AC Network

Hi, thank you for your code.

I have an issue initializing the AC network for the meta bandit:
hidden = tf.concat(1, [self.prev_rewards, self.prev_actions_onehot, self.timestep])

It gave me an error:
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

Thank you very much for your reply.
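
This message typically appears when code written for a pre-1.0 TensorFlow release, where `tf.concat` took the axis first, runs on TensorFlow 1.0 or later, where the signature is `tf.concat(values, axis)`. Assuming that is the cause here, swapping the two arguments should be enough. The sketch below uses zero tensors with made-up shapes as stand-ins for the network inputs, and `concat_inputs` is a hypothetical wrapper used only for illustration.

```python
try:
    import tensorflow as tf
except ImportError:  # lets the snippet be read without TensorFlow installed
    tf = None


def concat_inputs(tensors, axis=1):
    """Concatenate tensors along `axis` using the TF >= 1.0 argument order."""
    # Pre-1.0 code wrote tf.concat(1, tensors); newer releases expect
    # the list of tensors first and the axis second.
    return tf.concat(tensors, axis)


if tf is not None:
    # Hypothetical stand-ins for prev_rewards, prev_actions_onehot, timestep.
    prev_rewards = tf.zeros([4, 1])
    prev_actions_onehot = tf.zeros([4, 2])
    timestep = tf.zeros([4, 1])
    hidden = concat_inputs([prev_rewards, prev_actions_onehot, timestep])
```

Applied to the line from the issue, that would read `tf.concat([self.prev_rewards, self.prev_actions_onehot, self.timestep], 1)`.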

'NoneType' object has no attribute 'model_checkpoint_path'

Hello and happy 2020! I am trying to run A3C-Meta-Bandit and run into the error below. All the cells are running fine, except for the final cell of the python notebook script which produces this error. Is there some folder that is missing? Thank you!

[Screenshot of the error]
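
`tf.train.get_checkpoint_state` returns `None` when the directory it is given contains no checkpoint, and reading `.model_checkpoint_path` from `None` produces exactly this error. A likely cause, assuming the final cell restores a saved model, is that loading was requested before any model had been saved. The sketch below checks for the `checkpoint` index file that TensorFlow's saver writes before attempting a restore; `has_checkpoint`, `should_restore`, and the `load_model` flag are assumptions made for the example.

```python
import os


def has_checkpoint(model_path):
    """True if model_path holds the 'checkpoint' index file TensorFlow writes."""
    return os.path.isfile(os.path.join(model_path, "checkpoint"))


def should_restore(load_model, model_path):
    """Only attempt a restore when one was requested and a checkpoint exists."""
    if load_model and not has_checkpoint(model_path):
        print("No checkpoint in %r; starting from fresh weights." % model_path)
        return False
    return load_model
```

In the notebook, the restore call would then be guarded by `should_restore(load_model, model_path)`, falling back to the usual variable initialization when it returns `False`. Training once with loading disabled should also create the missing folder contents.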

TensorFlow 2

Hi, thank you for the code! Do you know of any similar implementation of meta-RL using TensorFlow 2?
