pythonlessons / reinforcement_learning
Reinforcement learning tutorials
Home Page: https://pylessons.com/
License: MIT License
I applied the TF backport change by adding %tensorflow_version 1.x, but I am still getting this error:
TensorFlow is already loaded. Please restart the runtime to change versions.
So was this tested with TensorFlow 1.x or TensorFlow 2?
AttributeError Traceback (most recent call last)
in ()
21
22 # configure Keras and TensorFlow sessions and graph
---> 23 config = tf.ConfigProto()
24 config.gpu_options.allow_growth = True
25 sess = tf.Session(config=config)
AttributeError: module 'tensorflow' has no attribute 'ConfigProto'
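The traceback above is what TensorFlow 2.x produces, because the session API (`ConfigProto`, `Session`) is no longer exposed at the top level of the `tensorflow` module there. A minimal sketch of a version-tolerant session setup follows, assuming a TensorFlow release that ships the `tf.compat.v1` module; the `v1` alias is only an illustrative name, and this is not necessarily how the tutorial code itself handles it.

```python
import tensorflow as tf

# Assumption: tf.compat.v1 is available. It re-exports the TF 1.x session
# API, so the same lines work whether or not tf.ConfigProto exists.
v1 = tf.compat.v1

# Configure the session so GPU memory is allocated on demand.
config = v1.ConfigProto()
config.gpu_options.allow_growth = True
sess = v1.Session(config=config)
```

In Colab, the "already loaded" message means `%tensorflow_version 1.x` ran after TensorFlow had been imported, so the runtime has to be restarted and the magic executed before the first import.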
Hi, like many people I am (probably) using your implementation as a starting point for a project of mine, so it would be great if you added a license to it.
Many thanks for this awesome repo :)
I've noticed that a 'break' statement is missing at the end of the 'done' condition. Without it, the inner loop seems to run indefinitely and the replay experience keeps growing. I found this while re-implementing this tutorial in PyTorch.
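To illustrate the point, here is a minimal sketch of an episode loop that leaves the inner loop once `done` is set. `ToyEnv`, `run`, and `max_steps` are hypothetical names made up for this example, and the loop is not the repository's actual training code.

```python
import random
from collections import deque


class ToyEnv:
    """Hypothetical Gym-style environment whose episodes end after 5 steps."""

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        done = self.t >= 5
        return self.t, 1.0, done, {}


def run(env, episodes, max_steps=1000):
    memory = deque(maxlen=10_000)  # replay experience
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            action = random.choice([0, 1])
            next_state, reward, done, _ = env.step(action)
            memory.append((state, action, reward, next_state, done))
            state = next_state
            if done:
                # Without this break the loop keeps stepping a finished
                # episode and keeps appending transitions to memory.
                break
    return memory


memory = run(ToyEnv(), episodes=3)
print(len(memory))  # 15 transitions: 3 episodes of 5 steps each
```

With the `break` removed, the same call would store 3000 transitions, since every episode would run for the full `max_steps`.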
Could you provide an up-to-date environment.txt to make your code easier to run? I got the error below.
Exception in thread Thread-1:
Traceback (most recent call last):
File "/home/minh/anaconda3/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/home/minh/anaconda3/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "main.py", line 302, in train_threading
action, prediction = agent.act(state)
File "main.py", line 107, in act
prediction = self.Actor.predict(state)[0]
File "/home/minh/anaconda3/lib/python3.7/site-packages/keras/engine/training.py", line 1462, in predict
callbacks=callbacks)
File "/home/minh/anaconda3/lib/python3.7/site-packages/keras/engine/training_arrays.py", line 324, in predict_loop
batch_outs = f(ins_batch)
File "/home/minh/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/backend.py", line 3292, in call
run_metadata=self.run_metadata)
File "/home/minh/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1458, in call
run_metadata_ptr)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable dense_1/bias from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/dense_1/bias)
[[{{node dense_1/BiasAdd/ReadVariableOp}}]]
Hi, Rokas:
First of all, thanks for your great tutorial on reinforcement learning. I went through the whole series and learned a lot.
In the PPOAgent I think there may be something wrong with this line. When I vstack discounted_r (shape (n, 1)) and subtract the predicted values (shape (n,)) from it, the advantages end up with shape (n, n). So I think we should not vstack discounted_r, but instead vstack the advantages, as in advantages = np.vstack(discounted_r - values), so that the advantages have shape (n, 1), which is the expected result.
Thanks.
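The shape problem described above comes from NumPy broadcasting and can be reproduced in isolation. This is a small sketch with made-up numbers, assuming `discounted_r` and `values` both start out as 1-D arrays of length n; `wrong` and `right` are illustrative names.

```python
import numpy as np

discounted_r = np.array([1.0, 2.0, 3.0, 4.0])  # shape (n,)
values = np.array([0.5, 1.5, 2.5, 3.5])        # shape (n,)

# Stacking first gives shape (n, 1); subtracting a (n,) array from it
# broadcasts both operands to (n, n).
wrong = np.vstack(discounted_r) - values

# Subtracting first keeps shape (n,); stacking afterwards gives (n, 1).
right = np.vstack(discounted_r - values)

print(wrong.shape)  # (4, 4)
print(right.shape)  # (4, 1)
```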
Hello,
Thanks for a great project. It's very useful. I have a question on the model code related to the Dueling algorithm. For example:
Pong-v0_DQN_CNN_TF2.py
Here is an example of the code:
action_advantage = Lambda(lambda a: a[:, :] - K.mean(a[:, :], keepdims=True), output_shape=(action_space,))(action_advantage)
Let's say our batch looks like this:
a = tf.constant([[1.0, 2.0], [-2.0, 3.0], [3.0, -4.0]])
print('a=', a)
a= tf.Tensor(
[[ 1. 2.]
[-2. 3.]
[ 3. -4.]], shape=(3, 2), dtype=float32)
The result of the "K.mean" function will be a tensor with shape (1, 1):
print('Kmean=', K.mean(a[:, :], keepdims=True))
Kmean= tf.Tensor([[0.5]], shape=(1, 1), dtype=float32)
Shouldn't there be a tensor with shape (3, 1)?
print('Kmean=', K.mean(a[:, :], axis=1, keepdims=True))
Kmean= tf.Tensor(
[[ 1.5]
[ 0.5]
[-0.5]], shape=(3, 1), dtype=float32)
If we assume that our batch contains 3 elements, then the mean value should be calculated for each element in the batch separately. Or am I missing something?
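One way to see why the axis matters: with a single mean over the whole batch, the centered advantages of a sample depend on which other samples share its batch. The NumPy sketch below mirrors the two `K.mean` calls from the question; `center_global` and `center_per_sample` are hypothetical helper names, not functions from the repository.

```python
import numpy as np


def center_global(a):
    # Mirrors K.mean(a, keepdims=True): one mean over the whole batch.
    return a - a.mean(keepdims=True)


def center_per_sample(a):
    # Mirrors K.mean(a, axis=1, keepdims=True): one mean per row.
    return a - a.mean(axis=1, keepdims=True)


batch = np.array([[1.0, 2.0], [-2.0, 3.0], [3.0, -4.0]])
single = batch[:1]  # the first sample on its own

# Global centering: the first row changes with the batch composition.
print(center_global(batch)[0], center_global(single)[0])

# Per-sample centering: the first row is the same either way.
print(center_per_sample(batch)[0], center_per_sample(single)[0])
```

With per-sample centering, every row of the result sums to zero regardless of what else is in the batch.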