
pythonlessons / reinforcement_learning

Stars: 351 · Watchers: 7 · Forks: 149 · Size: 89.09 MB

Reinforcement learning tutorials

Home Page: https://pylessons.com/

License: MIT License

Python 100.00%
dqn ddqn dueling-dqn d3qn reinforcement-learning a2c policy-gradient a3c ppo ppo-agent


reinforcement_learning's Issues

module 'tensorflow' has no attribute 'ConfigProto'

I applied the TF 1.x backport by adding %tensorflow_version 1.x, but I still get the error below, together with this message:

TensorFlow is already loaded. Please restart the runtime to change versions.

Was this code tested with TensorFlow 1.x or with TF 2?


AttributeError Traceback (most recent call last)

in ()
21
22 # configure Keras and TensorFlow sessions and graph
---> 23 config = tf.ConfigProto()
24 config.gpu_options.allow_growth = True
25 sess = tf.Session(config=config)

AttributeError: module 'tensorflow' has no attribute 'ConfigProto'
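
For reference, the error above is what TF 2.x produces, because ConfigProto and Session were moved from the top-level namespace to tf.compat.v1. The Colab message also suggests the %tensorflow_version magic only takes effect if it runs before TensorFlow is imported, so a runtime restart is needed first. A minimal sketch of a version-tolerant session setup follows; the helper names resolve_v1_api and make_session are hypothetical and are not part of the repository, and this is not confirmed as the author's fix.

```python
def resolve_v1_api(tf_module):
    """Return the namespace that holds ConfigProto/Session for this TF build.

    Hypothetical helper, not part of the repository.
    """
    # TF 1.x exposes ConfigProto and Session at the top level.
    if hasattr(tf_module, "ConfigProto"):
        return tf_module
    # TF 2.x keeps the same classes under tf.compat.v1.
    compat = getattr(tf_module, "compat", None)
    v1 = getattr(compat, "v1", None)
    if v1 is not None and hasattr(v1, "ConfigProto"):
        return v1
    raise AttributeError("ConfigProto not found on the given module")


def make_session(tf_module):
    """Build a session with GPU memory growth enabled, as in the snippet above."""
    api = resolve_v1_api(tf_module)
    config = api.ConfigProto()
    config.gpu_options.allow_growth = True
    return api.Session(config=config)
```

Usage would be `import tensorflow as tf` followed by `sess = make_session(tf)`. Whether the rest of the tutorial code then runs unchanged under TF 2.x is not something this sketch settles.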

Add licence

Hi, like many people, I'm (probably) going to use your implementation as a starting point for a project of mine, so it would be great if you added a licence to it.
Many thanks for this awesome repo 👍 :)

library version

Could you add an environment.txt so that your code is easier to run? I got the errors below.

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/home/minh/anaconda3/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/home/minh/anaconda3/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "main.py", line 302, in train_threading
    action, prediction = agent.act(state)
  File "main.py", line 107, in act
    prediction = self.Actor.predict(state)[0]
  File "/home/minh/anaconda3/lib/python3.7/site-packages/keras/engine/training.py", line 1462, in predict
    callbacks=callbacks)
  File "/home/minh/anaconda3/lib/python3.7/site-packages/keras/engine/training_arrays.py", line 324, in predict_loop
    batch_outs = f(ins_batch)
  File "/home/minh/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/backend.py", line 3292, in __call__
    run_metadata=self.run_metadata)
  File "/home/minh/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1458, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable dense_1/bias from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/dense_1/bias)
[[{{node dense_1/BiasAdd/ReadVariableOp}}]]
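
When asking for (or providing) an environment file, it helps to state the exact installed versions. A minimal sketch that collects them is shown here; it assumes Python 3.8+ for importlib.metadata (the traceback above comes from Python 3.7, where the importlib_metadata backport would be needed instead). The function name installed_versions and the package list in the usage note are only illustrative guesses, not the repository's confirmed dependency list.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in the current environment.
            versions[name] = None
    return versions
```

For example, `installed_versions(["tensorflow", "keras", "numpy"])` returns a dict that can be pasted into an issue, which makes version mismatches between Keras and TensorFlow easier to spot.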

A problem in PPOAgent

Hi, Rokas:

First of all, thanks for your great tutorial on reinforcement learning. I went through the whole series and learned a lot.

In the PPOAgent, I think there may be something wrong with this line. When discounted_r is vstacked (giving shape (n, 1)) and the predicted values (shape (n,)) are subtracted from it, broadcasting turns the advantages into shape (n, n). So I think we should not vstack discounted_r, but instead vstack the advantages: advantages = np.vstack(discounted_r - values). The advantages then have shape (n, 1), which is the expected result.

Thanks.
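
The shape behaviour described in this issue can be reproduced with plain NumPy. This is a minimal sketch with made-up numbers: the names discounted_r and values follow the issue text, while n and the array contents are only illustrative.

```python
import numpy as np

n = 4
discounted_r = np.arange(n, dtype=np.float32)  # shape (n,)
values = np.ones(n, dtype=np.float32)          # shape (n,), critic predictions

# Stacking first gives (n, 1); subtracting an (n,) array then broadcasts to (n, n).
broadcast = np.vstack(discounted_r) - values

# Subtracting first keeps shape (n,); stacking afterwards gives (n, 1).
advantages = np.vstack(discounted_r - values)

print(broadcast.shape, advantages.shape)
```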

Dueling question

Hello,
Thanks for a great project. It's very useful. I have a question about the model code for the Dueling architecture, for example in:
Pong-v0_DQN_CNN_TF2.py

Here is the relevant line of code:
action_advantage = Lambda(lambda a: a[:, :] - K.mean(a[:, :], keepdims=True), output_shape=(action_space,))(action_advantage)

Let's say our batch looks like this:
a = tf.constant([[1.0, 2.0], [-2.0, 3.0], [3.0, -4.0]])
print('a=', a)
a= tf.Tensor(
[[ 1.  2.]
 [-2.  3.]
 [ 3. -4.]], shape=(3, 2), dtype=float32)

The result of the "K.mean" function will be a tensor with shape (1, 1):
print('Kmean=', K.mean(a[:, :], keepdims=True))
Kmean= tf.Tensor([[0.5]], shape=(1, 1), dtype=float32)

Shouldn't there be a tensor with shape (3, 1)?
print('Kmean=', K.mean(a[:, :], axis=1, keepdims=True))
Kmean= tf.Tensor(
[[ 1.5]
 [ 0.5]
 [-0.5]], shape=(3, 1), dtype=float32)

If we assume that our batch contains 3 elements, then the mean should be calculated separately for each element in the batch. Or am I missing something?
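
The difference between the two reductions can be checked with NumPy standing in for K.mean. This is a minimal sketch, not the repository's code: the state values v are hypothetical numbers chosen for illustration. In the dueling formulation, Q(s, a) = V(s) + (A(s, a) - mean over a of A(s, a)), so the mean is taken over the action axis for each sample.

```python
import numpy as np

# Same batch as in the question: 3 samples, 2 actions.
a = np.array([[1.0, 2.0], [-2.0, 3.0], [3.0, -4.0]], dtype=np.float32)

# No axis: a single mean over the whole batch, shape (1, 1).
global_mean = np.mean(a, keepdims=True)

# axis=1: one mean per sample over its actions, shape (3, 1).
row_mean = np.mean(a, axis=1, keepdims=True)

# Centering with the per-sample mean makes every row sum to zero.
centered = a - row_mean

# Hypothetical state values V(s), shape (3, 1), for illustration only.
v = np.array([[10.0], [20.0], [30.0]], dtype=np.float32)
q = v + centered  # the mean of each row of q equals V(s) for that sample
```

With the global mean, one sample's advantages shift the Q-values of every other sample in the batch, so the result depends on batch composition.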
