
real-stanford / flingbot


[CoRL 2021 Best System Paper] This repository contains code for training and evaluating FlingBot in both simulation and real-world settings on a dual-UR5 robot arm setup for Ubuntu 18.04

Home Page: https://flingbot.cs.columbia.edu/

Dockerfile 0.07% C++ 66.17% C 14.82% Objective-C 0.24% Python 18.49% Shell 0.04% CMake 0.17%
robotics cloth-simulation computer-vision

flingbot's People

Contributors

huy-ha


flingbot's Issues

How can I run the environment outside of Docker?

After running prepare.sh and compile.sh inside Docker, the .so file is created, but the environment cannot be run outside of Docker. Are additional steps needed?

README error: 'tasks_path' is an unrecognized argument

Hi @huy-ha!

When I use the command:

python run_sim.py --tasks_path flingbot-rect-train.hdf5 --num_processes 16 --log flingbot-train-from-scratch --action_primitives fling

to train FlingBot, an error occurs:
Dynamic Cloth Manipulation: error: unrecognized arguments: --tasks_path flingbot-rect-train.hdf5

I also tried renaming the tasks_path argument, for example to tasks and datasets_path, but none of these changes helped.

I would appreciate your help, huy-ha.
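The message format ("error: unrecognized arguments") is what Python's argparse prints when a flag is not defined on the parser. The sketch below reproduces that behavior with a stand-in parser; the flag names are assumptions copied from commands quoted elsewhere on this page (which pass the dataset as --tasks), not from the actual run_sim.py source.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the parser in run_sim.py. The flags below
    # mirror commands quoted in other issues, not the real source code.
    parser = argparse.ArgumentParser("Dynamic Cloth Manipulation")
    parser.add_argument("--tasks", type=str)
    parser.add_argument("--num_processes", type=int, default=1)
    parser.add_argument("--log", type=str)
    parser.add_argument("--action_primitives", nargs="+", default=["fling"])
    return parser


def unrecognized_flags(parser: argparse.ArgumentParser, argv: List[str]) -> List[str]:
    # parse_known_args returns leftover tokens instead of exiting with an error.
    _, unknown = parser.parse_known_args(argv)
    return [token for token in unknown if token.startswith("--")]


if __name__ == "__main__":
    argv = ["--tasks_path", "flingbot-rect-train.hdf5", "--num_processes", "16"]
    print(unrecognized_flags(build_parser(), argv))
```

Running python run_sim.py --help against the checked-out code should list the flags that version actually accepts, which is the reliable way to find the current name of the dataset argument.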

ImportError: /home/hc/dextairity/PyFlex/bindings/build/pyflex.cpython-36m-x86_64-linux-gnu.so: undefined symbol: cudaSetupArgument

Hi guys,
When I run python test_sim.py, I get the following error:

Traceback (most recent call last):
  File "test_sim.py", line 2, in <module>
    from sim_env import SimEnv
  File "/home/hc/dextairity/sim_env.py", line 7, in <module>
    import pyflex
ImportError: /home/hc/dextairity/PyFlex/bindings/build/pyflex.cpython-36m-x86_64-linux-gnu.so: undefined symbol: cudaSetupArgument

Can anyone help me??
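An "undefined symbol" error at import time means the dynamic loader could not find that symbol in any library the .so links against. One plausible cause, stated here as an assumption rather than a confirmed diagnosis, is that pyflex was compiled against one CUDA toolkit and is being loaded against a different CUDA runtime that does not export cudaSetupArgument. A minimal sketch for checking whether a given shared library exports a symbol:

```python
import ctypes
import ctypes.util
from typing import Optional


def exports_symbol(library: Optional[str], symbol: str) -> bool:
    """Return True if the shared library exports `symbol`.

    Passing None inspects the libraries already loaded into this process.
    """
    lib = ctypes.CDLL(library)
    # ctypes resolves symbols lazily; a missing one raises AttributeError,
    # which hasattr turns into False.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    cudart = ctypes.util.find_library("cudart")
    if cudart is None:
        print("libcudart was not found on the loader path")
    else:
        print(cudart, exports_symbol(cudart, "cudaSetupArgument"))
```

If the runtime found on the host does not export the symbol while the one inside the Docker image does, rebuilding pyflex in the environment where it will be run would be the next thing to try.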

conda env and openexr install error

When I tried to create the conda env inside the flingbot Docker instance, I got this error:

    Running setup.py install for openexr: started
    Running setup.py install for openexr: finished with status 'error'

Pip subprocess error:
  ERROR: Command errored out with exit status 1:
   command: /root/miniconda3/envs/flingbot/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-zlilc69r/openexr_9a1d5c6112364fe5bb20b4c95e63c15e/setup.py'"'"'; __file__='"'"'/tmp/pip-install-zlilc69r/openexr_9a1d5c6112364fe5bb20b4c95e63c15e/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-aizbv_dq
       cwd: /tmp/pip-install-zlilc69r/openexr_9a1d5c6112364fe5bb20b4c95e63c15e/
  Complete output (16 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.6
  copying Imath.py -> build/lib.linux-x86_64-3.6
  running build_ext
  building 'OpenEXR' extension
  creating build/temp.linux-x86_64-3.6
  gcc -pthread -B /root/miniconda3/envs/flingbot/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/usr/include/OpenEXR -I/usr/local/include/OpenEXR -I/opt/local/include/OpenEXR -I/root/miniconda3/envs/flingbot/include/python3.6m -c OpenEXR.cpp -o build/temp.linux-x86_64-3.6/OpenEXR.o -g -DVERSION="1.3.2"
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  OpenEXR.cpp:36:10: fatal error: ImathBox.h: No such file or directory
   #include <ImathBox.h>
            ^~~~~~~~~~~~
  compilation terminated.
  error: command 'gcc' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for openexr
    ERROR: Command errored out with exit status 1:
     command: /root/miniconda3/envs/flingbot/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-zlilc69r/openexr_9a1d5c6112364fe5bb20b4c95e63c15e/setup.py'"'"'; __file__='"'"'/tmp/pip-install-zlilc69r/openexr_9a1d5c6112364fe5bb20b4c95e63c15e/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-a91hs22t/install-record.txt --single-version-externally-managed --compile --install-headers /root/miniconda3/envs/flingbot/include/python3.6m/openexr
         cwd: /tmp/pip-install-zlilc69r/openexr_9a1d5c6112364fe5bb20b4c95e63c15e/
    Complete output (16 lines):
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.6
    copying Imath.py -> build/lib.linux-x86_64-3.6
    running build_ext
    building 'OpenEXR' extension
    creating build/temp.linux-x86_64-3.6
    gcc -pthread -B /root/miniconda3/envs/flingbot/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/usr/include/OpenEXR -I/usr/local/include/OpenEXR -I/opt/local/include/OpenEXR -I/root/miniconda3/envs/flingbot/include/python3.6m -c OpenEXR.cpp -o build/temp.linux-x86_64-3.6/OpenEXR.o -g -DVERSION="1.3.2"
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    OpenEXR.cpp:36:10: fatal error: ImathBox.h: No such file or directory
     #include <ImathBox.h>
              ^~~~~~~~~~~~
    compilation terminated.
    error: command 'gcc' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /root/miniconda3/envs/flingbot/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-zlilc69r/openexr_9a1d5c6112364fe5bb20b4c95e63c15e/setup.py'"'"'; __file__='"'"'/tmp/pip-install-zlilc69r/openexr_9a1d5c6112364fe5bb20b4c95e63c15e/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-a91hs22t/install-record.txt --single-version-externally-managed --compile --install-headers /root/miniconda3/envs/flingbot/include/python3.6m/openexr Check the logs for full command output.

failed

CondaEnvException: Pip failed

Installing the "libopenexr-dev" package inside the Docker instance and then creating the conda environment fixes this issue.
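The build fails because the compiler cannot find ImathBox.h in any of the include directories passed with -I in the gcc command above. A small sketch that checks those same directories before retrying the conda environment creation (the helper name find_header is illustrative):

```python
from pathlib import Path
from typing import Iterable, List

# Include directories taken from the -I flags in the gcc command in the log.
OPENEXR_INCLUDE_DIRS = (
    "/usr/include/OpenEXR",
    "/usr/local/include/OpenEXR",
    "/opt/local/include/OpenEXR",
)


def find_header(name: str, include_dirs: Iterable[str] = OPENEXR_INCLUDE_DIRS) -> List[Path]:
    """Return every existing path to `name` under the given include directories."""
    candidates = [Path(directory) / name for directory in include_dirs]
    return [path for path in candidates if path.is_file()]


if __name__ == "__main__":
    hits = find_header("ImathBox.h")
    if hits:
        print("found:", ", ".join(str(path) for path in hits))
    else:
        print("ImathBox.h not found; the OpenEXR development headers look missing")
```

If the header is still missing after installing the development package, the package may have placed it in a directory that is not on the list above.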

Could not find a package configuration file provided by "pybind11"

Hello,

I followed the instructions in the README exactly.

I entered the Docker container using this command:

sudo docker exec -t -i 0e912bc162b8 /bin/bash

When I tried to compile the .so inside Docker, I got the following error:

CMake Error at CMakeLists.txt:5 (find_package):
  By not providing "Findpybind11.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "pybind11",
  but CMake did not find one.

  Could not find a package configuration file provided by "pybind11" with any
  of the following names:

    pybind11Config.cmake
    pybind11-config.cmake

  Add the installation prefix of "pybind11" to CMAKE_PREFIX_PATH or set
  "pybind11_DIR" to a directory containing one of the above files.  If
  "pybind11" provides a separate development package or SDK, be sure it has
  been installed.


-- Configuring incomplete, errors occurred!
See also "/workspace/flingbot/PyFlex/bindings/build/CMakeFiles/CMakeOutput.log".
See also "/workspace/flingbot/PyFlex/bindings/build/CMakeFiles/CMakeError.log".
make: *** No targets specified and no makefile found.  Stop.
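The CMake message asks for pybind11_DIR to point at a directory containing pybind11Config.cmake. When pybind11 is installed as a Python package (version 2.6 or later), it can report that directory itself. A minimal sketch, assuming pybind11 is installed in the active Python environment:

```python
from typing import Optional


def pybind11_cmake_dir() -> Optional[str]:
    """Return the directory holding pybind11's CMake config files, or None."""
    try:
        import pybind11

        # get_cmake_dir() exists in pybind11 2.6+; it raises ImportError when
        # the CMake files were not installed alongside the Python package.
        return pybind11.get_cmake_dir()
    except (ImportError, AttributeError):
        return None


if __name__ == "__main__":
    cmake_dir = pybind11_cmake_dir()
    if cmake_dir is None:
        print("pybind11 (2.6+) is not importable from this Python environment")
    else:
        print(f"pass -Dpybind11_DIR={cmake_dir} to cmake")
```

If this prints None inside the container, the conda environment is probably not activated in that shell, or pybind11 is not installed in it.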

Interestingly, I was able to compile successfully outside of Docker (in the conda environment). But when I ran the sample command, I got the following error:

Traceback (most recent call last):
  File "run_sim.py", line 1, in <module>
    from utils import (
  File "/home/workspace/flingbot/utils.py", line 2, in <module>
    from environment import SimEnv, TaskLoader
  File "/home/workspace/flingbot/environment/__init__.py", line 1, in <module>
    from .simEnv import SimEnv
  File "/home/workspace/flingbot/environment/simEnv.py", line 3, in <module>
    from .utils import (
  File "/home/workspace/flingbot/environment/utils.py", line 7, in <module>
    import pyflex
ImportError: /home/workspace/flingbot/PyFlex/bindings/build/pyflex.cpython-36m-x86_64-linux-gnu.so: undefined symbol: cudaSetupArgument

RayActorError: The actor died unexpectedly before finishing this task.

ID: fffffffffffffffffd5f641d1e7ed592e00f065c01000000 Worker ID: 20556ec92d7abbb0b2bf7fb0d363e865ab7587196b608cb3605c3f2f Node ID: 75d7f57c65e988cab741a0c5412548fb08f8fcb7e118eb82fc38fc4d Worker IP address: 172.17.0.2 Worker port: 37185 Worker PID: 1224 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
Traceback (most recent call last):
  File "run_sim.py", line 46, in <module>
    envs, task_loader = setup_envs(dataset=dataset_path, **vars(args))
  File "/workspace/flingbot/utils.py", line 158, in setup_envs
    ray.get([e.setup_ray.remote(e) for e in envs])
  File "/home/li/anaconda3/envs/flingbot/lib/python3.6/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/li/anaconda3/envs/flingbot/lib/python3.6/site-packages/ray/_private/worker.py", line 2277, in get
    raise value
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
class_name: SimEnv
actor_id: fd5f641d1e7ed592e00f065c01000000
pid: 1224
namespace: 79413bf6-9d63-48f7-baa3-20f27c337fe9
ip: 172.17.0.2
The actor is dead because its worker process has died. Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
The actor never ran - it was cancelled before it started running.

ray.get([e.reset.remote() for e in envs]) does not work

I set up the environment according to the instructions without any errors.
But when I try to train the model, the ray.get call never returns.

Input:
python run_sim.py --tasks flingbot-rect-train.hdf5 --num_processes 2 --log flingbot-train-from-scratch --action_primitives fling

The code is as follows (I added a print statement for debugging):

    #run_sim.py
    print(envs)
    observations = ray.get([e.reset.remote() for e in envs])

The terminal output:

2022-07-09 01:18:03,451 INFO services.py:1476 -- View the Ray dashboard at http://127.0.0.1:8265
SEEDING WITH 0
[Policy] Action primitives:
        fling
Replay Buffer path: flingbot-train-from-scratch/replay_buffer.hdf5
[Actor(SimEnv, d24c6cc9f6d506fd4331284101000000), Actor(SimEnv, 4b8d98d4a8025b3e5e0e3ccf01000000)]

The code does not run any further. It is stuck waiting indefinitely without printing anything else.

OutOfMemoryError: Task was killed due to the node running low on memory.

Traceback (most recent call last):
  File "run_sim.py", line 152, in <module>
    remaining_observations=remaining_observations)
  File "/home/zcs/work/train-my-fling/flingbot/utils.py", line 416, in step_env
    for obs, env_id in ray.get(step_retval):
  File "/home/zcs/miniconda3/envs/flingbot/lib/python3.6/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/zcs/miniconda3/envs/flingbot/lib/python3.6/site-packages/ray/_private/worker.py", line 2523, in get
    raise value
ray.exceptions.OutOfMemoryError: Task was killed due to the node running low on memory.
Memory on the node (IP: 192.168.0.107, ID: fc2befb2867ce88e73a8a45572c43a640751ae1f2b5e15bd8315f293) where the task (actor ID: f9cc340f5aef7b479d86345001000000, name=SimEnv.__init__, pid=4331, memory used=2.22GB) was running was 59.49GB / 62.58GB (0.950744), which exceeds the memory usage threshold of 0.95. Ray killed this worker (ID: d98ac96cdd66ea8c0a2604609381c3256c8285b87822896c767f7714) because it was the most recently scheduled task; to see more information about memory usage on this node, use ray logs raylet.out -ip 192.168.0.107. To see the logs of the worker, use ray logs worker-d98ac96cdd66ea8c0a2604609381c3256c8285b87822896c767f7714*out -ip 192.168.0.107. Top 10 memory users:
PID     MEM(GB)  COMMAND
7904    2.92     /home/zcs/work/software/pycharm-2023.2.5/jbr/bin/java -classpath /home/zcs/work/software/pycharm-202...
4312    2.22     ray::SimEnv
4331    2.22     ray::SimEnv
4253    2.17     ray::SimEnv
4288    2.15     ray::SimEnv
4252    2.15     ray::SimEnv
4268    2.14     ray::SimEnv.step
4302    2.13     ray::SimEnv.step
4279    2.13     ray::SimEnv.step
4296    2.12     ray::SimEnv
Refer to the documentation on how to address the out of memory issue: https://docs.ray.io/en/latest/ray-core/scheduling/ray-oom-prevention.html. Consider provisioning more memory on this node or reducing task parallelism by requesting more CPUs per task. Set max_restarts and max_task_retries to enable retry when the task crashes due to OOM. To adjust the kill threshold, set the environment variable RAY_memory_usage_threshold when starting Ray. To disable worker killing, set the environment variable RAY_memory_monitor_refresh_ms to zero.
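The log shows each SimEnv process using roughly 2.1 to 2.2 GB on a 62.58 GB node, with Ray killing workers once usage passes 95%. A rough way to pick --num_processes is to divide the memory left under that threshold by the per-environment footprint. The sketch below does that arithmetic; the 20 GB baseline for everything else on the machine is a made-up example value, not a figure from the log.

```python
import math


def max_sim_envs(total_gb: float, other_gb: float, per_env_gb: float,
                 threshold: float = 0.95) -> int:
    """Estimate how many simulation workers fit under Ray's memory threshold.

    total_gb:   physical memory on the node
    other_gb:   memory used by everything except the simulation workers
    per_env_gb: memory used by one simulation worker
    threshold:  fraction of total memory at which Ray starts killing workers
    """
    if per_env_gb <= 0:
        raise ValueError("per_env_gb must be positive")
    budget_gb = threshold * total_gb - other_gb
    return max(0, math.floor(budget_gb / per_env_gb))


if __name__ == "__main__":
    # 62.58 GB and 2.22 GB come from the log; 20.0 GB is a hypothetical baseline.
    print(max_sim_envs(total_gb=62.58, other_gb=20.0, per_env_gb=2.22))
```

This is only an estimate: per-worker memory can grow during training, so leaving some headroom below the computed number is prudent.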

When I run python run_sim.py, the worker dies or is killed by an unexpected system error

When I run python run_sim.py --eval --tasks flingbot-normal-rect-eval.hdf5 --load flingbot.pth --num_processes 1 --gui
I get the following error:

ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
2021-11-22 15:10:23,194 WARNING worker.py:1228 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffff341cd030556402df7c59625701000000 Worker ID: 4f72e151e496fac468e1c730556e291e00ec1cfb29882f51097186fd Node ID: d4a9eb590967aeb63fe838e2eca52cf666565bf009207c0ec4a730e6 Worker IP address: 192.168.1.106 Worker port: 41747 Worker PID: 18687

I don't know why this happens. Could you please help me?

AttributeError: module 'typing' has no attribute '_SpecialForm'

When I run
python run_sim.py --eval --tasks flingbot-normal-rect-eval.hdf5 --load flingbot.pth --num_processes 1 --gui
I get the following error:

Traceback (most recent call last):
  File "run_sim.py", line 1, in <module>
    from utils import (
  File "/media/randy/299D817A2D97AD94/fty/flingbot/utils.py", line 2, in <module>
    from environment import SimEnv, TaskLoader
  File "/media/randy/299D817A2D97AD94/fty/flingbot/environment/__init__.py", line 1, in <module>
    from .simEnv import SimEnv
  File "/media/randy/299D817A2D97AD94/fty/flingbot/environment/simEnv.py", line 3, in <module>
    from .utils import (
  File "/media/randy/299D817A2D97AD94/fty/flingbot/environment/utils.py", line 1, in <module>
    from torch import cat, tensor
  File "/home/randy/anaconda3/envs/flingbot/lib/python3.6/site-packages/torch/__init__.py", line 643, in <module>
    from .functional import *  # noqa: F403
  File "/home/randy/anaconda3/envs/flingbot/lib/python3.6/site-packages/torch/functional.py", line 6, in <module>
    import torch.nn.functional as F
  File "/home/randy/anaconda3/envs/flingbot/lib/python3.6/site-packages/torch/nn/__init__.py", line 1, in <module>
    from .modules import *  # noqa: F403
  File "/home/randy/anaconda3/envs/flingbot/lib/python3.6/site-packages/torch/nn/modules/__init__.py", line 2, in <module>
    from .linear import Identity, Linear, Bilinear, LazyLinear
  File "/home/randy/anaconda3/envs/flingbot/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 6, in <module>
    from .. import functional as F
  File "/home/randy/anaconda3/envs/flingbot/lib/python3.6/site-packages/torch/nn/functional.py", line 11, in <module>
    from .._jit_internal import boolean_dispatch, _overload, BroadcastingList1, BroadcastingList2, BroadcastingList3
  File "/home/randy/anaconda3/envs/flingbot/lib/python3.6/site-packages/torch/_jit_internal.py", line 34, in <module>
    from typing_extensions import Final
  File "/home/randy/anaconda3/envs/flingbot/lib/python3.6/site-packages/typing_extensions.py", line 159, in <module>
    class _FinalForm(typing._SpecialForm, _root=True):
AttributeError: module 'typing' has no attribute '_SpecialForm'

How can I solve this problem?
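The traceback shows typing_extensions referring to typing._SpecialForm, which the typing module of this Python 3.6 environment does not provide. A likely explanation, offered as an assumption, is that the installed typing_extensions release targets a newer Python than 3.6, in which case pinning an older typing_extensions in the conda environment may help. A small diagnostic sketch to confirm what the interpreter actually sees:

```python
import sys
import typing


def typing_report() -> dict:
    """Collect the facts relevant to the '_SpecialForm' AttributeError."""
    return {
        # Interpreter version actually running the code.
        "python": tuple(sys.version_info[:3]),
        # Where the imported typing module lives; a path under site-packages
        # would suggest a backport shadowing the standard library module.
        "typing_file": getattr(typing, "__file__", None),
        # Whether the attribute that typing_extensions needs is present.
        "has_special_form": hasattr(typing, "_SpecialForm"),
    }


if __name__ == "__main__":
    for key, value in typing_report().items():
        print(f"{key}: {value}")
```

Run it with the same interpreter that runs run_sim.py (the one under envs/flingbot) so the report reflects that environment.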
