ikostrikov / pytorch-ddpg-naf
Implementation of algorithms for continuous control (DDPG and NAF).
License: MIT License
I tried running the code with HalfCheetah and commented out the wrappers.Monitor(...) line and any line that rendered the result. I get the error:
Traceback (most recent call last):
  File "main.py", line 98, in <module>
    agent.update_parameters(batch)
  File "/home/sbhupatiraju/pytorch-ddpg-naf/naf.py", line 126, in update_parameters
    loss.backward()
  File "/home/sbhupatiraju/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 156, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/home/sbhupatiraju/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 98, in backward
    variables, grad_variables, retain_graph)
RuntimeError: element 0 of variables tuple is volatile
Any idea on what might be going on?
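For context: in PyTorch releases before 0.4, this error was raised when `backward()` was called on a result that depended on a `volatile` Variable. The `volatile` flag was removed in 0.4 in favour of `torch.no_grad()`. Whether this is the exact cause in `naf.py` is an assumption; the sketch below only shows the usual pattern on a current PyTorch, where the bootstrap target is computed without a graph and the prediction keeps one. The two `nn.Linear` networks and the random tensors are hypothetical stand-ins, not the repository's models.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

q_net = nn.Linear(3, 1)       # hypothetical stand-in for the online network
target_net = nn.Linear(3, 1)  # hypothetical stand-in for the target network

state = torch.randn(8, 3)
next_state = torch.randn(8, 3)
reward = torch.randn(8, 1)
gamma = 0.99

# The target is treated as a constant: no graph is recorded for it.
with torch.no_grad():
    target = reward + gamma * target_net(next_state)

# The prediction is computed normally, so the loss can be backpropagated.
prediction = q_net(state)
loss = nn.functional.mse_loss(prediction, target)
loss.backward()
```

Only `q_net` receives gradients here; `target_net` is left untouched by the backward pass.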
Hi ikostrikov, I got this error when running your code:
Traceback (most recent call last):
  File "main.py", line 89, in <module>
    agent.update_parameters(batch)
  File "/home/andrewliao11/Work/pytorch-naf/naf.py", line 121, in update_parameters
    param.grad.data.clamp(-1, 1)
AttributeError: 'NoneType' object has no attribute 'data'
The original code is:
for param in self.model.parameters():
    param.grad.data.clamp(-1, 1)
Maybe we should change it to:
torch.nn.utils.clip_grad_norm(self.model.parameters(), 1)
I'm just a newbie to PyTorch, not sure if it's right, thx!
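Two things are worth separating here. A parameter's `.grad` stays `None` when the parameter took no part in the backward pass, which is what triggers the `AttributeError`. Also, `clamp` without a trailing underscore returns a new tensor and leaves the gradient unchanged; the in-place form is `clamp_`. The following is a minimal sketch on a current PyTorch, where the norm-clipping utility is named `clip_grad_norm_`. The small model and the deliberately unused layer are hypothetical, chosen only to reproduce a `None` gradient.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
unused = nn.Linear(4, 1)  # hypothetical layer that never enters the loss
params = list(model.parameters()) + list(unused.parameters())

loss = (model(torch.randn(16, 4)) * 100.0).pow(2).mean()
loss.backward()

# Option 1: element-wise clamp, in place, skipping parameters without a gradient.
for p in params:
    if p.grad is not None:
        p.grad.clamp_(-1, 1)

max_abs_grad = max(p.grad.abs().max().item() for p in params if p.grad is not None)

# Option 2: rescale all gradients so their combined L2 norm is at most 1.
# clip_grad_norm_ ignores parameters whose .grad is None.
torch.nn.utils.clip_grad_norm_(params, max_norm=1.0)

norm_after = torch.sqrt(
    sum(p.grad.pow(2).sum() for p in params if p.grad is not None)
).item()
```

The two options are not equivalent: clamping bounds each gradient element separately, while norm clipping preserves the gradient direction and only shrinks its length.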
pytorch-ddpg-naf/normalized_actions.py
Line 16 in 7870655
Should this return action, not actions?
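If the flagged line sits in a function whose argument is called `action`, returning `actions` would raise a `NameError` as soon as that function is called. As a rough illustration, assuming the file rescales actions between [-1, 1] and the environment's bounds, the pair of mappings could look like this. The function names and the scalar bounds are hypothetical and not taken from the repository.

```python
def to_env_range(action, low, high):
    """Map an action from [-1, 1] to [low, high]."""
    return low + (action + 1.0) / 2.0 * (high - low)


def to_normalized_range(action, low, high):
    """Map an action from [low, high] back to [-1, 1]."""
    normalized = (action - low) / (high - low) * 2.0 - 1.0
    return normalized  # return the variable that was actually computed


env_action = to_env_range(0.5, low=-2.0, high=2.0)
round_trip = to_normalized_range(env_action, low=-2.0, high=2.0)
```

Checking that the two functions invert each other, as `round_trip` does, is a cheap way to catch this kind of typo.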
Line 121 in d2e587a
The NAF algorithm does not work on Pendulum or any of the PyBullet environments. @ikostrikov Do you have any guesses why that might be the case? Which environments did you test this code on? If you used hyperparameters other than the defaults, could you list the changes needed to get NAF working?
Hi, I was wondering if there is any particular reason why this repo doesn't use parallel environments like those in the a2c-ppo-acktr repo.
Hi @ikostrikov,
I appreciate your implementation, and I wonder whether you have benchmarked it.
If so, could you share some rough results? Many thanks!
Could you help with saving the learnt model after each update?
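One common way to do this in PyTorch is to write the model's `state_dict` to disk with `torch.save` and restore it with `load_state_dict`. The sketch below assumes the agent exposes an `nn.Module`; the helper names, the `nn.Linear` stand-in, and the file name are hypothetical.

```python
import os
import tempfile

import torch
import torch.nn as nn


def save_checkpoint(model, path):
    """Write only the parameters and buffers, not the whole Python object."""
    torch.save(model.state_dict(), path)


def load_checkpoint(model, path):
    """Load saved parameters into a model with the same architecture."""
    model.load_state_dict(torch.load(path))
    return model


policy = nn.Linear(3, 1)  # hypothetical stand-in for the agent's network

with tempfile.TemporaryDirectory() as tmp_dir:
    ckpt_path = os.path.join(tmp_dir, "policy_step_0.pt")
    save_checkpoint(policy, ckpt_path)
    restored = load_checkpoint(nn.Linear(3, 1), ckpt_path)

weights_match = torch.equal(policy.weight, restored.weight)
```

Saving after every single update is likely to be slow; calling such a helper every few episodes, with the step count in the file name, is usually enough.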
Hi,
I am doing some work on RL and am very interested in these two algorithms. I have tried to train your models on both CPU and GPU, but both runs ended with an "out of memory" error. Memory usage kept increasing throughout training.
It seems that the data and/or the model from earlier steps are not released. My code is very similar to the example:
action = agent.select_action(state, ounoise, param_noise)
next_state, reward, done, info = env.step(action.cpu().numpy()[0])
total_numsteps += 1
episode_reward += reward
action = torch.Tensor(action.cpu())
mask = torch.Tensor([not done])
next_state = torch.Tensor(next_state.cpu())
reward = torch.Tensor([reward])
# pdb.set_trace()
memory.push(state, action, mask, next_state, reward)
state = next_state
if len(memory) > args.batch_size:
    for _ in range(args.updates_per_step):
        transitions = memory.sample(args.batch_size)
        batch = Transition(*zip(*transitions))
        value_loss, policy_loss = agent.update_parameters(batch)
        writer.add_scalar('loss/value', value_loss, updates)
        writer.add_scalar('loss/policy', policy_loss, updates)
        updates += 1
Would you please help to solve the problem? Thanks in advance
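Two common causes of steadily growing memory in a loop like this are a replay buffer that is allowed to grow very large, and loss tensors that are kept around together with their autograd graph (converting them to Python floats with `.item()` before logging avoids the latter). Which one applies here is not clear from the report. As a sketch of the first point, a buffer built on `collections.deque` with `maxlen` discards the oldest transition once it is full. The class name and the small capacity are hypothetical; the `Transition` fields follow the order used in the `memory.push(...)` call above.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple(
    "Transition", ("state", "action", "mask", "next_state", "reward")
)


class BoundedReplayMemory:
    """Replay buffer that never holds more than `capacity` transitions."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest item on overflow.
        self.buffer = deque(maxlen=capacity)

    def push(self, *args):
        self.buffer.append(Transition(*args))

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)


memory = BoundedReplayMemory(capacity=3)
for step in range(5):
    # Plain numbers stand in for state, action, mask, next_state, reward.
    memory.push(step, 0.0, 1.0, step + 1, 0.0)

batch = Transition(*zip(*memory.sample(2)))
```

With a capacity of 3 and five pushes, only the three most recent transitions remain, so memory use stops growing once the buffer is full.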