[Bug]: producing NAN values during training in MaskablePPO (stable-baselines3-contrib, open issue)

vahidqo commented on June 3, 2024

[Bug]: producing NAN values during training in MaskablePPO

from stable-baselines3-contrib.

Comments (5)

vahidqo commented on June 3, 2024

Hi,

Could you please let me know whether this is a problem in my code or in the package? @araffin

Thank you

vahidqo commented on June 3, 2024

@araffin Thank you for your response.
Could you please explain what you mean by "more information"? Should I post all the environment code?

vahidqo commented on June 3, 2024

The detailed error is: @araffin

An error occurred during training: Function 'MseLossBackward0' returned nan values in its 1th output.
C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\torch\autograd\__init__.py:200: UserWarning: Error detected in MseLossBackward0. Traceback of forward call that caused the error:
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel_launcher.py", line 17, in <module>
    app.launch_new_instance()
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\traitlets\config\application.py", line 1046, in launch_instance
    app.start()
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\kernelapp.py", line 736, in start
    self.io_loop.start()
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\tornado\platform\asyncio.py", line 195, in start
    self.asyncio_loop.run_forever()
  File "C:\Program Files\Python311\Lib\asyncio\base_events.py", line 607, in run_forever
    self._run_once()
  File "C:\Program Files\Python311\Lib\asyncio\base_events.py", line 1922, in _run_once
    handle._run()
  File "C:\Program Files\Python311\Lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\kernelbase.py", line 516, in dispatch_queue
    await self.process_one()
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\kernelbase.py", line 505, in process_one
    await dispatch(*args)
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\kernelbase.py", line 412, in dispatch_shell
    await result
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\kernelbase.py", line 740, in execute_request
    reply_content = await reply_content
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\ipkernel.py", line 422, in do_execute
    res = shell.run_cell(
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\ipykernel\zmqshell.py", line 546, in run_cell
    return super().run_cell(*args, **kwargs)
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\IPython\core\interactiveshell.py", line 3024, in run_cell
    result = self._run_cell(
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\IPython\core\interactiveshell.py", line 3079, in _run_cell
    result = runner(coro)
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\IPython\core\async_helpers.py", line 129, in _pseudo_sync_runner
    coro.send(None)
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\IPython\core\interactiveshell.py", line 3284, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\IPython\core\interactiveshell.py", line 3466, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\IPython\core\interactiveshell.py", line 3526, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "C:\Users\12368\AppData\Local\Temp\ipykernel_23684\999724894.py", line 2, in <module>
    model.learn(1000000, callback=checkpoint_callback)
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\sb3_contrib\ppo_mask\ppo_mask.py", line 547, in learn
    self.train()
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\sb3_contrib\ppo_mask\ppo_mask.py", line 447, in train
    value_loss = F.mse_loss(rollout_data.returns, values_pred)
  File "C:\Users\12368\AppData\Roaming\Python\Python311\site-packages\torch\nn\functional.py", line 3295, in mse_loss
    return torch._C._nn.mse_loss(expanded_input, expanded_target, _Reduction.get_enum(reduction))
 (Triggered internally at ..\torch\csrc\autograd\python_anomaly_mode.cpp:119.)
  Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
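
The forward call flagged in this traceback is `F.mse_loss(rollout_data.returns, values_pred)`, which suggests (without proving) that the returns or the value predictions were already non-finite before the backward pass. The sketch below is a simplified illustration, not the library's code: it uses a plain discounted return rather than the GAE computation MaskablePPO performs, and the reward sequence is hypothetical. It shows how a single NaN reward contaminates the return of every earlier step in the rollout.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Plain discounted return: G_t = r_t + gamma * G_{t+1}."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the following step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def first_non_finite(values):
    """Index of the first NaN/inf entry, or None if every entry is finite."""
    for i, v in enumerate(values):
        if not math.isfinite(v):
            return i
    return None


# Hypothetical reward sequence with a single bad value at step 3.
rewards = [1.0, 0.5, -0.2, float("nan"), 2.0, 1.0]
returns = discounted_returns(rewards)

print(first_non_finite(rewards))         # 3
print([math.isnan(g) for g in returns])  # [True, True, True, True, False, False]
```

Steps 0 to 3 all end up with a NaN return even though only step 3 had a bad reward, so the place where the loss fails can be far from the place where the bad value entered.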

vahidqo commented on June 3, 2024

@araffin More info if that helps:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[72], line 2
      1 try:
----> 2     model.learn(1000000)
      3 except (AssertionError, ValueError) as e:
      4     print("An error occurred during training:", e)

File ~\AppData\Roaming\Python\Python311\site-packages\sb3_contrib\ppo_mask\ppo_mask.py:547, in MaskablePPO.learn(self, total_timesteps, callback, log_interval, tb_log_name, reset_num_timesteps, use_masking, progress_bar)
    544         self.logger.record("time/total_timesteps", self.num_timesteps, exclude="tensorboard")
    545         self.logger.dump(step=self.num_timesteps)
--> 547     self.train()
    549 callback.on_training_end()
    551 return self

File ~\AppData\Roaming\Python\Python311\site-packages\sb3_contrib\ppo_mask\ppo_mask.py:478, in MaskablePPO.train(self)
    476 # Optimization step
    477 self.policy.optimizer.zero_grad()
--> 478 loss.backward()
    479 # Clip grad norm
    480 th.nn.utils.clip_grad_norm_(self.policy.parameters(), self.max_grad_norm)

File ~\AppData\Roaming\Python\Python311\site-packages\torch\_tensor.py:487, in Tensor.backward(self, gradient, retain_graph, create_graph, inputs)
    477 if has_torch_function_unary(self):
    478     return handle_torch_function(
    479         Tensor.backward,
    480         (self,),
   (...)
    485         inputs=inputs,
    486     )
--> 487 torch.autograd.backward(
    488     self, gradient, retain_graph, create_graph, inputs=inputs
    489 )

File ~\AppData\Roaming\Python\Python311\site-packages\torch\autograd\__init__.py:200, in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    195     retain_graph = create_graph
    197 # The reason we repeat same the comment below is that
    198 # some Python versions print out the first line of a multi-line function
    199 # calls in the traceback and some print out the last line
--> 200 Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
    201     tensors, grad_tensors_, retain_graph, create_graph, inputs,
    202     allow_unreachable=True, accumulate_grad=True)

RuntimeError: Function 'MseLossBackward0' returned nan values in its 1th output.

araffin commented on June 3, 2024

Might be a duplicate of #81 or #195
Probably caused by a combination of your env and hyperparameters.

Please note that we do not offer tech support, see #81 (comment)
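
One way to test the environment side of that suggestion is to check every observation and reward for NaN or inf before it reaches the algorithm. The helper below is a minimal, library-free sketch: `iter_scalars` and `check_step` are hypothetical names, and it assumes observations are numbers or nested lists, tuples, or dicts of numbers. Stable-Baselines3 also provides a `VecCheckNan` wrapper in `stable_baselines3.common.vec_env` for this purpose, which may be the more convenient option when the environment is already vectorized.

```python
import math
from collections.abc import Iterable, Mapping


def iter_scalars(value):
    """Yield every number found in a scalar or a nested list/tuple/dict."""
    if isinstance(value, (int, float)):
        yield float(value)
    elif isinstance(value, Mapping):
        for item in value.values():
            yield from iter_scalars(item)
    elif isinstance(value, Iterable) and not isinstance(value, (str, bytes)):
        for item in value:
            yield from iter_scalars(item)
    else:
        # Fallback for scalar-like objects that support float().
        yield float(value)


def check_step(obs, reward, step_index):
    """Raise ValueError naming the first non-finite observation or reward."""
    for name, value in (("observation", obs), ("reward", reward)):
        for x in iter_scalars(value):
            if not math.isfinite(x):
                raise ValueError(f"non-finite {name} ({x}) at step {step_index}")


# Hypothetical step outputs: the first is clean, the second contains inf.
check_step([0.1, [0.2, 0.3]], 1.0, step_index=0)  # passes silently
try:
    check_step([0.1, float("inf")], 1.0, step_index=1)
except ValueError as err:
    print(err)  # non-finite observation (inf) at step 1
```

Calling such a check inside the environment's `step` method, or in a thin wrapper around it, reports the first bad value at the step where it appears instead of later in the loss computation.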
