Comments (8)
I tried this with A2C (same code, just with a2c) and got the following error:
```
Traceback (most recent call last):
  File "independent_atari.py", line 7, in <module>
    experiment.train(frames=2e6)
  File "/home/ben/class_projs/autonomous-learning-library/all/experiments/single_env_experiment.py", line 46, in train
    self._run_training_episode()
  File "/home/ben/class_projs/autonomous-learning-library/all/experiments/single_env_experiment.py", line 73, in _run_training_episode
    action = self._agent.act(state)
  File "/home/ben/class_projs/autonomous-learning-library/all/bodies/_body.py", line 24, in act
    return self.process_action(self.agent.act(self.process_state(state)))
  File "/home/ben/class_projs/autonomous-learning-library/all/bodies/_body.py", line 24, in act
    return self.process_action(self.agent.act(self.process_state(state)))
  File "/home/ben/class_projs/autonomous-learning-library/all/bodies/_body.py", line 24, in act
    return self.process_action(self.agent.act(self.process_state(state)))
  [Previous line repeated 1 more time]
  File "/home/ben/class_projs/autonomous-learning-library/all/agents/a2c.py", line 61, in act
    self._train(states)
  File "/home/ben/class_projs/autonomous-learning-library/all/agents/a2c.py", line 69, in _train
    states, actions, advantages = self._buffer.advantages(next_states)
  File "/home/ben/class_projs/autonomous-learning-library/all/memory/advantage.py", line 38, in advantages
    rewards, lengths = self._compute_returns()
  File "/home/ben/class_projs/autonomous-learning-library/all/memory/advantage.py", line 52, in _compute_returns
    device=self._rewards[0].device
AttributeError: 'float' object has no attribute 'device'
```
from autonomous-learning-library.
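The traceback suggests the advantage buffer is holding rewards as plain Python floats, while `_compute_returns` expects tensor-like objects that carry a `.device` attribute. A minimal, library-free reproduction of that failure mode (the class and method names here are illustrative stand-ins, not ALL's actual internals):

```python
class AdvantageBuffer:
    """Illustrative stand-in for an advantage buffer (hypothetical names)."""

    def __init__(self):
        self._rewards = []

    def store(self, reward):
        # Storing the raw float is the bug: a float has no .device attribute.
        # Converting to a tensor on store (e.g. torch.tensor(reward)) would fix it.
        self._rewards.append(reward)

    def compute_returns(self):
        # Mirrors the failing line: device=self._rewards[0].device
        return self._rewards[0].device


buffer = AdvantageBuffer()
buffer.store(1.0)
try:
    buffer.compute_returns()
except AttributeError as e:
    print(e)  # 'float' object has no attribute 'device'
```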
The PPO implementation is a `ParallelAgent`/`ParallelPreset`, so it is not compatible with `SingleEnvExperiment`. Try using a `ParallelEnvExperiment` and setting `ppo.hyperparameters(n_envs=1)`.
I don't think this is a bug, but it would probably be useful for the experiment types to enforce the agent type and raise a helpful error message instead of failing with an opaque runtime error, so I'm classifying this as "style."
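One way to surface the mismatch early is an `isinstance` check in the experiment constructor. This is only a sketch of the idea, not ALL's actual code; the base-class and attribute names below are assumed for illustration:

```python
class ParallelPreset:
    """Stand-in for a parallel preset base class (hypothetical name)."""


class SingleEnvExperiment:
    """Sketch: reject incompatible presets up front with a clear message."""

    def __init__(self, preset, env):
        if isinstance(preset, ParallelPreset):
            # Fail immediately with an actionable message instead of letting
            # the mismatch surface later as an unrelated AttributeError.
            raise TypeError(
                f"{type(preset).__name__} is a ParallelPreset and cannot run "
                "in SingleEnvExperiment; use ParallelEnvExperiment instead."
            )
        self._preset = preset
        self._env = env
```

With this check, passing a PPO-style parallel preset raises a `TypeError` that names the correct experiment class, rather than the `'float' object has no attribute 'device'` error above.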
Merged #241 to `develop` for now. It should allow `n_envs=1` to work.
Ah, I see. Yes, it is very hard to get that from the error message.
Back when we were trying to use ALL for our primary work with PettingZoo, the #1 issue we had with the library, and the reason we ultimately turned away from it, was that the error messages were just too difficult to understand. Every little mistake we made took an hour to track down.
Not sure what can be done about that, but explicit type checking would be a good start. For the policies/approximations, shape checking would also be super helpful. I got a ton of weird error messages when I was building custom neural networks and used incorrect shapes for the input and output layers.
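The shape-checking suggestion above could look something like a thin wrapper that validates batch input/output widths before anything reaches the loss computation. This is a framework-agnostic sketch using list widths; all names are hypothetical, and a real version would inspect `tensor.shape` instead:

```python
class ShapeCheckedNetwork:
    """Wraps a callable model and turns shape mistakes into descriptive
    errors (illustrative sketch, not ALL's API)."""

    def __init__(self, model, in_features, out_features):
        self._model = model
        self._in = in_features
        self._out = out_features

    def __call__(self, batch):
        for i, row in enumerate(batch):
            if len(row) != self._in:
                raise ValueError(
                    f"input row {i} has width {len(row)}, expected {self._in}; "
                    "check the first layer of your custom network"
                )
        output = self._model(batch)
        for i, row in enumerate(output):
            if len(row) != self._out:
                raise ValueError(
                    f"output row {i} has width {len(row)}, expected {self._out}; "
                    "check the final layer of your custom network"
                )
        return output


# A toy "network" that sums each 4-wide row into a single feature.
net = ShapeCheckedNetwork(lambda b: [[sum(row)] for row in b],
                          in_features=4, out_features=1)
print(net([[1, 2, 3, 4]]))  # [[10]]
```

Feeding a wrongly sized batch then produces an error that points at the offending layer instead of a cryptic stack trace deep inside the training loop.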
For more context on this particular issue: the problem came up when someone wanted to use PPO to train one agent and DQN to train another. This is a very unusual use case that is probably not a good idea, but it brought up the fact that PPO isn't really supported at all for multiagent.
I made a small Preset wrapper and an Agent wrapper to handle this issue.
https://gist.github.com/weepingwillowben/400b42d54b6e57034da1e5293166aa80
Not sure if this should be officially supported or not.
I think this is fine for single agent now. #288 will handle the multiagent case.