Segmentation fault (pyrddlgym issue, closed)

Zhennan-Wu commented on June 16, 2024
Segmentation fault

from pyrddlgym.

Comments (9)

ataitler commented on June 16, 2024

Hi,
Can you please post here the exact code you are running (Wildfire?) and the stack trace if possible?
We have not tested with WSL, so the problem might be related to it; we have tested on native Linux, Windows, and Apple silicon without errors.
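For anyone trying to collect the requested trace, one option is Python's standard `faulthandler` module, which prints the Python-level traceback of each thread when the process receives a fatal signal such as SIGSEGV. The following is a minimal sketch; the helper name `enable_crash_traces` is hypothetical and not part of pyRDDLGym. If the crash happens inside a native driver thread, the output may only show what the Python threads were doing at the time, so it complements a gdb backtrace rather than replacing it.

```python
import faulthandler
import sys
from typing import IO, Optional


def enable_crash_traces(stream: Optional[IO] = None) -> bool:
    """Ask faulthandler to dump Python tracebacks on fatal signals.

    Returns True if faulthandler is active afterwards.
    (Hypothetical helper, not part of pyRDDLGym.)
    """
    target = stream if stream is not None else sys.stderr
    try:
        faulthandler.enable(file=target, all_threads=True)
    except (AttributeError, OSError, RuntimeError, ValueError):
        # The stream has no usable file descriptor (for example, stderr
        # was replaced by an in-memory buffer), so nothing was enabled here.
        pass
    return faulthandler.is_enabled()


if __name__ == "__main__":
    enable_crash_traces()
    # ... import and run the failing script after this point ...
```

The same effect is available without editing the script by running `python -X faulthandler demo.py`.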

mike-gimelfarb commented on June 16, 2024

Hi,
I have also observed this issue recently with WSL. I believe the error can be traced to the visualizer internals, e.g. matplotlib or Pillow, so it is very likely that the error is on their end. We will run more tests and let you know if we find a solution. In the meantime, a simple workaround may be to run the GymExample without the visualization, which should (hopefully) not raise this error. If you still receive the error, can you please share the code and the trace with us, as requested above?
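The workaround can be sketched as an episode loop that only calls the renderer when asked to. The method names used here (`reset`, `step`, `render`, `horizon`, `sample_action`) mirror the demo script posted later in this thread; the function `run_episode` and its `render` flag are hypothetical and not part of pyRDDLGym.

```python
from typing import Any


def run_episode(env: Any, agent: Any, render: bool = False) -> float:
    """Roll out one episode and return the total reward.

    env.render() is called only when `render` is True, so the
    visualizer code path is never touched in the default case.
    (Hypothetical helper; `env` and `agent` are duck-typed.)
    """
    total_reward = 0.0
    env.reset()
    for _ in range(env.horizon):
        if render:
            env.render()
        action = agent.sample_action()
        # Assumes the 4-tuple step() result used in the demo script.
        _, reward, done, _ = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward
```

Calling `run_episode(myEnv, agent)` keeps rendering off; passing `render=True` restores the original behavior on platforms where the visualizer works. The caller is still responsible for `myEnv.close()`.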

Zhennan-Wu commented on June 16, 2024

Hi,
Yes, running it without rendering gets rid of the error. Thank you for the help.

mike-gimelfarb commented on June 16, 2024

Thanks for your report. We will keep this issue open until we can find a better solution for WSL.

Zhennan-Wu commented on June 16, 2024

I did have the stack trace available, so I am posting it here for reference.

(gdb) run /home/leo/ipc2023/demo.py
Starting program: /home/leo/miniconda3/envs/pyrddlgym/bin/python /home/leo/ipc2023/demo.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff486e700 (LWP 1591)]
[New Thread 0x7ffff206d700 (LWP 1592)]
[New Thread 0x7fffef86c700 (LWP 1593)]
[New Thread 0x7fffed06b700 (LWP 1594)]
[New Thread 0x7fffea86a700 (LWP 1595)]
[New Thread 0x7fffe8069700 (LWP 1596)]
[New Thread 0x7fffe5868700 (LWP 1597)]
[New Thread 0x7fffe3067700 (LWP 1598)]
[New Thread 0x7fffe2866700 (LWP 1599)]
[New Thread 0x7fffde065700 (LWP 1600)]
[New Thread 0x7fffdd864700 (LWP 1601)]
[Thread 0x7fffe2866700 (LWP 1599) exited]
[Thread 0x7fffdd864700 (LWP 1601) exited]
[Thread 0x7fffe3067700 (LWP 1598) exited]
[Thread 0x7fffe8069700 (LWP 1596) exited]
[Thread 0x7fffed06b700 (LWP 1594) exited]
[Thread 0x7fffea86a700 (LWP 1595) exited]
[Thread 0x7fffef86c700 (LWP 1593) exited]
[Thread 0x7ffff486e700 (LWP 1591) exited]
[Thread 0x7fffde065700 (LWP 1600) exited]
[Thread 0x7fffe5868700 (LWP 1597) exited]
[Thread 0x7ffff206d700 (LWP 1592) exited]
[Detaching after fork from child process 1602]
warning: Loadable section ".note.gnu.property" outside of ELF segments
[New Thread 0x7fffdd864700 (LWP 1604)]
[New Thread 0x7fffde065700 (LWP 1605)]
[New Thread 0x7fffe2866700 (LWP 1606)]
[New Thread 0x7fffe3067700 (LWP 1607)]
[New Thread 0x7fffd42e5700 (LWP 1608)]
[New Thread 0x7fffd3ae4700 (LWP 1609)]
[New Thread 0x7fffd32e3700 (LWP 1610)]
[New Thread 0x7fffd2ae2700 (LWP 1611)]
[New Thread 0x7fffd22e1700 (LWP 1612)]
[New Thread 0x7fffd1ae0700 (LWP 1613)]
[New Thread 0x7fffd12df700 (LWP 1614)]
[Thread 0x7fffd42e5700 (LWP 1608) exited]
[Thread 0x7fffd32e3700 (LWP 1610) exited]
[Thread 0x7fffd2ae2700 (LWP 1611) exited]
[Thread 0x7fffd1ae0700 (LWP 1613) exited]
[Thread 0x7fffd12df700 (LWP 1614) exited]
[Thread 0x7fffd22e1700 (LWP 1612) exited]
[Thread 0x7fffd3ae4700 (LWP 1609) exited]
[Thread 0x7fffe3067700 (LWP 1607) exited]
[Thread 0x7fffe2866700 (LWP 1606) exited]
[Thread 0x7fffde065700 (LWP 1605) exited]
[Thread 0x7fffdd864700 (LWP 1604) exited]
[Detaching after fork from child process 1615]
[New Thread 0x7fffd12df700 (LWP 1616)]
[Detaching after vfork from child process 1617]
[New Thread 0x7fffd1ae0700 (LWP 1619)]
[New Thread 0x7fffd22e1700 (LWP 1620)]
[New Thread 0x7fffd2ae2700 (LWP 1621)]
[New Thread 0x7fffd42e5700 (LWP 1622)]
[New Thread 0x7fffd3ae4700 (LWP 1623)]
[New Thread 0x7fffc2d32700 (LWP 1624)]
[New Thread 0x7fffc2531700 (LWP 1625)]
[New Thread 0x7fffc1d30700 (LWP 1626)]
[New Thread 0x7fffc152f700 (LWP 1627)]
[New Thread 0x7fffc0d2e700 (LWP 1628)]
[New Thread 0x7fffbbfff700 (LWP 1629)]
[New Thread 0x7fffbb7fe700 (LWP 1630)]
[New Thread 0x7fffbaffd700 (LWP 1631)]
[New Thread 0x7fffba7fc700 (LWP 1632)]
[New Thread 0x7fffb9ffb700 (LWP 1633)]
[New Thread 0x7fffb97fa700 (LWP 1634)]
episode ended with reward -24310.0
[Thread 0x7fffd12df700 (LWP 1616) exited]
NoneType: None
[Thread 0x7fffbb7fe700 (LWP 1630) exited]
[Thread 0x7fffba7fc700 (LWP 1632) exited]
[Thread 0x7fffbaffd700 (LWP 1631) exited]
[Thread 0x7fffb97fa700 (LWP 1634) exited]
[Thread 0x7fffb9ffb700 (LWP 1633) exited]
[Thread 0x7fffbbfff700 (LWP 1629) exited]
[Thread 0x7fffc0d2e700 (LWP 1628) exited]
[Thread 0x7fffc152f700 (LWP 1627) exited]
[Thread 0x7fffc1d30700 (LWP 1626) exited]
[Thread 0x7fffc2531700 (LWP 1625) exited]
[Thread 0x7fffc2d32700 (LWP 1624) exited]
--Type <RET> for more, q to quit, c to continue without paging--c

Thread 25 "python" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd1ae0700 (LWP 1619)]
0x00007fffc5ac9efa in ?? () from /usr/lib/wsl/drivers/nvamig.inf_amd64_d36b3e14914fc88f/libnvwgf2umx.so
(gdb) backtrace
#0 0x00007fffc5ac9efa in ?? () from /usr/lib/wsl/drivers/nvamig.inf_amd64_d36b3e14914fc88f/libnvwgf2umx.so
#1 0x00007fffc5ac8c5e in ?? () from /usr/lib/wsl/drivers/nvamig.inf_amd64_d36b3e14914fc88f/libnvwgf2umx.so
#2 0x00007fffc5ac8bd6 in ?? () from /usr/lib/wsl/drivers/nvamig.inf_amd64_d36b3e14914fc88f/libnvwgf2umx.so
#3 0x00007ffff7fa3609 in start_thread (arg=) at pthread_create.c:477
#4 0x00007ffff7d6e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
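A quick way to see which shared library a backtrace like this points at is to count the `from <path>` suffix of each frame. The sketch below is a hypothetical helper (not a gdb or pyRDDLGym feature) and assumes gdb's usual `#N ... from /path/to/lib.so` frame layout, with the sample frames copied from the trace above.

```python
import os
import re
from collections import Counter

# Matches gdb frames that end in "from <shared object path>".
FRAME_RE = re.compile(r"^#\d+\s+.*?\bfrom\s+(\S+)\s*$")


def libraries_in_backtrace(backtrace: str) -> Counter:
    """Count how many frames belong to each shared library (by file name)."""
    counts: Counter = Counter()
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            counts[os.path.basename(match.group(1))] += 1
    return counts


SAMPLE = """\
#0 0x00007fffc5ac9efa in ?? () from /usr/lib/wsl/drivers/nvamig.inf_amd64_d36b3e14914fc88f/libnvwgf2umx.so
#1 0x00007fffc5ac8c5e in ?? () from /usr/lib/wsl/drivers/nvamig.inf_amd64_d36b3e14914fc88f/libnvwgf2umx.so
#2 0x00007fffc5ac8bd6 in ?? () from /usr/lib/wsl/drivers/nvamig.inf_amd64_d36b3e14914fc88f/libnvwgf2umx.so
#3 0x00007ffff7fa3609 in start_thread (arg=) at pthread_create.c:477
#4 0x00007ffff7d6e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
"""

if __name__ == "__main__":
    for name, frames in libraries_in_backtrace(SAMPLE).most_common():
        print(f"{name}: {frames} frame(s)")
```

Here every frame that names a library resolves to `libnvwgf2umx.so` under `/usr/lib/wsl/drivers`, and no Python frames appear in the crashing thread.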

The demo I ran is Wildfire; I am also posting the code here.

from pyRDDLGym import RDDLEnv
from pyRDDLGym import ExampleManager
from pyRDDLGym.Policies.Agents import RandomAgent

ENV = 'Wildfire'

# get the environment info
EnvInfo = ExampleManager.GetEnvInfo(ENV)

# set up the environment class, choose instance 0 because every example has at least one example instance
myEnv = RDDLEnv.RDDLEnv(domain=EnvInfo.get_domain(), instance=EnvInfo.get_instance(0))

# set up the environment visualizer
myEnv.set_visualizer(EnvInfo.get_visualizer())

# set up an example agent
agent = RandomAgent(action_space=myEnv.action_space, num_actions=myEnv.numConcurrentActions)

total_reward = 0
state = myEnv.reset()

for step in range(myEnv.horizon):
    # myEnv.render()
    action = agent.sample_action()
    next_state, reward, done, info = myEnv.step(action)
    total_reward += reward
    state = next_state
    if done:
        break

print("episode ended with reward {}".format(total_reward))
myEnv.close()

mike-gimelfarb commented on June 16, 2024

Thanks for the stack trace. Are you by any chance using a dedicated GPU (e.g. NVIDIA) and/or any graphical extension for WSL?

This is almost surely a graphics/driver error involving matplotlib/PIL and WSL/WSL2.
We will look into this and get back to you if we find a solution.

Zhennan-Wu commented on June 16, 2024

I do have an NVIDIA GPU installed, but I am not sure about WSL. I printed the device info; the results are below:

lspci
3ed2:00:00.0 3D controller: Microsoft Corporation Basic Render Driver
64a1:00:00.0 SCSI storage controller: Red Hat, Inc. Virtio filesystem (rev 01)
686a:00:00.0 SCSI storage controller: Red Hat, Inc. Virtio filesystem (rev 01)
9f09:00:00.0 SCSI storage controller: Red Hat, Inc. Virtio filesystem (rev 01)
a96b:00:00.0 System peripheral: Red Hat, Inc. Virtio file system (rev 01)
cae7:00:00.0 SCSI storage controller: Red Hat, Inc. Virtio console (rev 01)
e7c8:00:00.0 3D controller: Microsoft Corporation Basic Render Driver
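The listing shows only Microsoft's Basic Render Driver and no NVIDIA device, which is consistent with the GPU being reached through the WSL driver path seen in the backtrace. A script can check whether it is running under WSL before deciding to render. The sketch below relies on the common convention that WSL kernels include "microsoft" in their release string; this is an assumption about how WSL kernels are named, not something pyRDDLGym provides, and `is_wsl` is a hypothetical helper.

```python
import platform
from typing import Optional


def is_wsl(release: Optional[str] = None) -> bool:
    """Guess whether the interpreter is running under WSL.

    Checks the kernel release string (platform.uname().release by
    default) for the substring "microsoft", case-insensitively.
    (Hypothetical helper based on a naming convention.)
    """
    if release is None:
        release = platform.uname().release
    return "microsoft" in release.lower()


if __name__ == "__main__":
    # Example policy: skip the visualizer on WSL, keep it elsewhere.
    render = not is_wsl()
    print("rendering enabled:", render)
```

A caller could use the result to leave rendering off on WSL while keeping it on for native Linux, Windows, and macOS.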

mike-gimelfarb commented on June 16, 2024

Hi,
Thanks for providing this info.

We've investigated this and were able to reproduce the problem in WSL2 using a Hyper-V virtual machine.
It is likely related to a well-known problem between pygame and WSL2. You can try the solutions posted in the following issue:

PyGame issue 3260

You can try updating your drivers as suggested there, or (ideally) use a different VM (e.g. VMware) if you need visualization on Linux.
Since we cannot provide a definitive solution yet, we will leave this issue open for now.

Zhennan-Wu commented on June 16, 2024

Thank you!
