Comments (5)
Hi,
Could you please let me know whether this is a problem in my code or in the package? @araffin
Thank you
from stable-baselines3-contrib.
@araffin Thank you for your response.
Could you please explain what you mean by "more information"? Should I post all the environment code?
@araffin The detailed error is:
An error occurred during training: Function 'MseLossBackward0' returned nan values in its 1th output.
C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\torch\autograd\__init__.py:200: UserWarning: Error detected in MseLossBackward0. Traceback of forward call that caused the error:
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel_launcher.py", line 17, in <module>
app.launch_new_instance()
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\traitlets\config\application.py", line 1046, in launch_instance
app.start()
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\kernelapp.py", line 736, in start
self.io_loop.start()
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\tornado\platform\asyncio.py", line 195, in start
self.asyncio_loop.run_forever()
File "C:\Program Files\Python311\Lib\asyncio\base_events.py", line 607, in run_forever
self._run_once()
File "C:\Program Files\Python311\Lib\asyncio\base_events.py", line 1922, in _run_once
handle._run()
File "C:\Program Files\Python311\Lib\asyncio\events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\kernelbase.py", line 516, in dispatch_queue
await self.process_one()
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\kernelbase.py", line 505, in process_one
await dispatch(*args)
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\kernelbase.py", line 412, in dispatch_shell
await result
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\kernelbase.py", line 740, in execute_request
reply_content = await reply_content
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\ipkernel.py", line 422, in do_execute
res = shell.run_cell(
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\zmqshell.py", line 546, in run_cell
return super().run_cell(*args, **kwargs)
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\IPython\core\interactiveshell.py", line 3024, in run_cell
result = self._run_cell(
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\IPython\core\interactiveshell.py", line 3079, in _run_cell
result = runner(coro)
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\IPython\core\async_helpers.py", line 129, in _pseudo_sync_runner
coro.send(None)
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\IPython\core\interactiveshell.py", line 3284, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\IPython\core\interactiveshell.py", line 3466, in run_ast_nodes
if await self.run_code(code, result, async_=asy):
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\IPython\core\interactiveshell.py", line 3526, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "C:\Users\12368\AppData\Local\Temp\ipykernel_23684\999724894.py", line 2, in <module>
model.learn(1000000, callback=checkpoint_callback)
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\sb3_contrib\ppo_mask\ppo_mask.py", line 547, in learn
self.train()
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\sb3_contrib\ppo_mask\ppo_mask.py", line 447, in train
value_loss = F.mse_loss(rollout_data.returns, values_pred)
File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\torch\nn\functional.py", line 3295, in mse_loss
return torch._C._nn.mse_loss(expanded_input, expanded_target, _Reduction.get_enum(reduction))
(Triggered internally at ..\torch\csrc\autograd\python_anomaly_mode.cpp:119.)
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
@araffin More info if that helps:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[72], line 2
1 try:
----> 2 model.learn(1000000)
3 except (AssertionError, ValueError) as e:
4 print("An error occurred during training:", e)
File ~\AppData\Roaming\Python\Python311\site-packages\sb3_contrib\ppo_mask\ppo_mask.py:547, in MaskablePPO.learn(self, total_timesteps, callback, log_interval, tb_log_name, reset_num_timesteps, use_masking, progress_bar)
544 self.logger.record("time/total_timesteps", self.num_timesteps, exclude="tensorboard")
545 self.logger.dump(step=self.num_timesteps)
--> 547 self.train()
549 callback.on_training_end()
551 return self
File ~\AppData\Roaming\Python\Python311\site-packages\sb3_contrib\ppo_mask\ppo_mask.py:478, in MaskablePPO.train(self)
476 # Optimization step
477 self.policy.optimizer.zero_grad()
--> 478 loss.backward()
479 # Clip grad norm
480 th.nn.utils.clip_grad_norm_(self.policy.parameters(), self.max_grad_norm)
File ~\AppData\Roaming\Python\Python311\site-packages\torch\_tensor.py:487, in Tensor.backward(self, gradient, retain_graph, create_graph, inputs)
477 if has_torch_function_unary(self):
478 return handle_torch_function(
479 Tensor.backward,
480 (self,),
(...)
485 inputs=inputs,
486 )
--> 487 torch.autograd.backward(
488 self, gradient, retain_graph, create_graph, inputs=inputs
489 )
File ~\AppData\Roaming\Python\Python311\site-packages\torch\autograd\__init__.py:200, in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
195 retain_graph = create_graph
197 # The reason we repeat same the comment below is that
198 # some Python versions print out the first line of a multi-line function
199 # calls in the traceback and some print out the last line
--> 200 Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
201 tensors, grad_tensors_, retain_graph, create_graph, inputs,
202 allow_unreachable=True, accumulate_grad=True)
RuntimeError: Function 'MseLossBackward0' returned nan values in its 1th output.
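For context, the forward call named in the traceback is `F.mse_loss(rollout_data.returns, values_pred)`, so a NaN in either the returns or the predicted values is enough to make the loss and its gradient NaN. Below is a minimal sketch of that propagation in plain Python rather than PyTorch. `mse_and_grad` is a hypothetical helper written for illustration, not part of torch or sb3_contrib, and the numbers are made up.

```python
import math


def mse_and_grad(returns, values_pred):
    """Mean squared error and its gradient w.r.t. values_pred (illustration only)."""
    n = len(returns)
    diffs = [v - r for r, v in zip(returns, values_pred)]
    loss = sum(d * d for d in diffs) / n
    # d(loss)/d(values_pred[i]) = 2 * (values_pred[i] - returns[i]) / n
    grad = [2.0 * d / n for d in diffs]
    return loss, grad


# All inputs finite: loss and gradient are finite.
loss_ok, grad_ok = mse_and_grad([1.0, 2.0], [1.5, 2.0])

# One NaN return (made-up value): the loss and that element's gradient become NaN.
loss_bad, grad_bad = mse_and_grad([1.0, float("nan")], [1.5, 2.0])

print(loss_ok, grad_ok)
print(math.isnan(loss_bad), math.isnan(grad_bad[1]))
```

This only shows where the NaN surfaces, not where it originates. In practice the non-finite value usually enters earlier, for example through rewards or observations.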
Might be a duplicate of #81 or #195
Probably caused by a combination of your env and hyperparameters.
Please note that we do not offer tech support, see #81 (comment)
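One way to check the env side of this is to step the environment on its own and stop at the first non-finite reward or observation. The sketch below is stdlib-only and makes several assumptions: `ToyEnv` is a hypothetical stand-in for the custom environment (not posted in this thread), it follows a Gymnasium-style `reset`/`step` signature, and its observations are flat lists of floats. `first_non_finite_step` is also a hypothetical helper. Stable-Baselines3 itself ships a `VecCheckNan` wrapper in `stable_baselines3.common.vec_env` that serves a similar purpose for vectorized envs.

```python
import math


class ToyEnv:
    """Hypothetical stand-in for a custom env; emits a NaN reward at step 3."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0, 1.0], {}

    def step(self, action):
        self.t += 1
        obs = [float(self.t), 1.0]
        # Simulated bug: the reward is NaN on the third step.
        reward = float("nan") if self.t == 3 else 1.0
        terminated = self.t >= 5
        return obs, reward, terminated, False, {}


def first_non_finite_step(env, actions):
    """Return (step, kind) for the first NaN/inf reward or observation, else None."""
    obs, _ = env.reset()
    if not all(math.isfinite(x) for x in obs):
        return 0, "observation"
    for step, action in enumerate(actions, start=1):
        obs, reward, terminated, truncated, _ = env.step(action)
        if not math.isfinite(reward):
            return step, "reward"
        if not all(math.isfinite(x) for x in obs):
            return step, "observation"
        if terminated or truncated:
            break
    return None


result = first_non_finite_step(ToyEnv(), [0] * 10)
print(result)
```

If a check like this finds nothing over many episodes, the hyperparameters (for example the learning rate) are the next thing to look at.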