
CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning

Home Page: https://sites.google.com/view/causal-world/home

License: MIT License

Python 100.00%
reinforcement-learning transfer-learning robotics causality sim2real generalization

causalworld's Introduction

CausalWorld


CausalWorld is an open-source simulation framework and benchmark for causal structure and transfer learning in a robotic manipulation environment (powered by PyBullet), where tasks range from rather simple to extremely hard. Tasks consist of constructing 3D shapes from a given set of blocks, inspired by how children learn to build complex structures. Release v1.2 supports many interesting goal shape families and exposes many causal variables in the environment on which do_interventions can be performed.

Check out the project's website for the baseline results and the paper.

Go here for the documentation.

Go here for the tutorials.

Go here for the discord channel for discussions.

This package can be pip-installed.


Announcements

October 12th 2020

We release v1.2. Given that it's the first release of the framework, we are expecting some issues here and there, so please report any issues you encounter.

Install as a pip package from latest release

  1. Install causal_world
pip install causal_world
  Optional steps:
  2. Make the docs.
(causal_world) cd docs
(causal_world) make html
  3. Run the tests.
(causal_world) python -m unittest discover tests/causal_world/
  4. Install other packages for stable baselines (optional)
(causal_world) pip install tensorflow==1.14.0
(causal_world) pip install stable-baselines==2.10.0
(causal_world) conda install mpi4py
  5. Install other packages for rlkit (optional)
(causal_world) cd ..
(causal_world) git clone https://github.com/vitchyr/rlkit.git
(causal_world) cd rlkit
(causal_world) pip install -e .
(causal_world) pip install torch==1.2.0
(causal_world) pip install gtimer
  6. Install other packages for viskit (optional)
(causal_world) cd ..
(causal_world) git clone https://github.com/vitchyr/viskit.git
(causal_world) cd viskit
(causal_world) pip install -e .
(causal_world) python viskit/frontend.py path/to/dir/exp*
  7. Install other packages for rlpyt (optional)
(causal_world) cd ..
(causal_world) git clone https://github.com/astooke/rlpyt.git
(causal_world) cd rlpyt
(causal_world) pip install -e .
(causal_world) pip install pyprind

Install as a pip package in a conda env from source

  1. Clone this repo and then create its conda environment to install all dependencies.
git clone https://github.com/rr-learning/CausalWorld
cd CausalWorld
conda env create -f environment.yml OR conda env update --prefix ./env --file environment.yml --prune
  2. Install the causal_world package inside the (causal_world) conda env.
conda activate causal_world
(causal_world) pip install -e .
  Optional steps:
  3. Make the docs.
(causal_world) cd docs
(causal_world) make html
  4. Run the tests.
(causal_world) python -m unittest discover tests/causal_world/
  5. Install other packages for stable baselines (optional)
(causal_world) pip install tensorflow==1.14.0
(causal_world) pip install stable-baselines==2.10.0
(causal_world) conda install mpi4py
  6. Install other packages for rlkit (optional)
(causal_world) cd ..
(causal_world) git clone https://github.com/vitchyr/rlkit.git
(causal_world) cd rlkit
(causal_world) pip install -e .
(causal_world) pip install torch==1.2.0
(causal_world) pip install gtimer
  7. Install other packages for viskit (optional)
(causal_world) cd ..
(causal_world) git clone https://github.com/vitchyr/viskit.git
(causal_world) cd viskit
(causal_world) pip install -e .
(causal_world) python viskit/frontend.py path/to/dir/exp*
  8. Install other packages for rlpyt (optional)
(causal_world) cd ..
(causal_world) git clone https://github.com/astooke/rlpyt.git
(causal_world) cd rlpyt
(causal_world) pip install -e .
(causal_world) pip install pyprind
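
To quickly check that the installation works, you can run a short headless rollout. This is a minimal sketch based on the Dive In example below; it only assumes the causal_world package installed above:

from causal_world.envs import CausalWorld
from causal_world.task_generators import generate_task

# create a simple pushing task without the GUI and take a few random steps
task = generate_task(task_generator_id='pushing')
env = CausalWorld(task=task, enable_visualization=False)
env.reset()
for _ in range(10):
    obs, reward, done, info = env.step(env.action_space.sample())
env.close()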

Why would you use CausalWorld for your research?

The main advantages of this benchmark can be summarized as follows:

  • The environment is a simulation of an open-source robotic platform (the TriFinger robot), hence offering the possibility of sim-to-real transfer.
  • It provides a combinatorial family of tasks with a common causal structure and underlying factors (including, e.g., robot and object masses, colors, sizes).
  • The user (or the agent) may intervene on all causal variables, which allows for fine-grained control over how similar different tasks (or task distributions) are.
  • Training and evaluation distributions of a desired difficulty level can easily be defined, targeting a specific form of generalization (e.g., only changes in appearance or object mass).
  • A modular design allows creating any learning curriculum through a series of interventions at different points in time.
  • Curricula can be defined by interpolating between an initial and a target task.
  • Generalization can be explicitly evaluated and quantified across the various axes.
  • The modular design offers great flexibility for designing interesting new task distributions.
  • It allows investigating how actions affect the properties of the different objects themselves.
  • Many tutorials are provided for the various features offered.

Dive In!!

from causal_world.envs import CausalWorld
from causal_world.task_generators import generate_task
task = generate_task(task_generator_id='general')
env = CausalWorld(task=task, enable_visualization=True)
for _ in range(10):
  env.reset()
  for _ in range(100):
      obs, reward, done, info = env.step(env.action_space.sample())
env.close()

Main Features

Features Causal World
Do Interventions ✔️
Counterfactual Environments ✔️
Imitation Learning ✔️
Custom environments ✔️
Support HER Style Algorithms ✔️
Sim2Real ✔️
Meta Learning ✔️
Multi-task Learning ✔️
Discrete Action Space ✔️
Structured Observations ✔️
Visual Observations ✔️
Goal Images ✔️
Disentangling Generalization ✔️
Benchmarking algorithms ✔️
Modular interface ✔️
Ipython / Notebook friendly ✔️
Documentation ✔️
Tutorials ✔️

Comparison to other benchmarks

Benchmark Do Interventions Interface Procedurally Generated Environments Online Distribution of Tasks Setup Custom Curricula Disentangle Generalization Ability Real World Similarity Open Source Robot Low Level Motor Control Long Term Planning Unified Success Metric
RLBench ✔️ ✔️
MetaWorld ✔️ ✔️
IKEA ✔️ ✔️ ✔️ ✔️
Mujoban ✔️ ✔️ ✔️ ✔️ ✔️
BabyAI ✔️ ✔️ ✔️
Coinrun ✔️ ✔️
AtariArcade ✔️/❌ ✔️
CausalWorld ✔️ ✔️ ✔️ ✔️ ✔️ ✔️ ✔️ ✔️ ✔️ ✔️

Do Interventions

To provide a convenient way of evaluating the robustness of RL algorithms, we expose many variables in the environment and allow do-interventions on them at any point in time.

from causal_world.envs import CausalWorld
from causal_world.task_generators import generate_task
import numpy as np

task = generate_task(task_generator_id='general')
env = CausalWorld(task=task, enable_visualization=True)
for _ in range(10):
  env.reset()
  success_signal, obs = env.do_intervention(
          {'stage_color': np.random.uniform(0, 1, [
              3,
          ])})
  print("Intervention success signal", success_signal)
  for _ in range(100):
      obs, reward, done, info = env.step(env.action_space.sample())
env.close()

Curriculum Through Interventions

To provide a convenient way of specifying learning curricula, we introduce intervention actors. At each time step, such an actor takes all the exposed variables of the environment as inputs and may intervene on them. To encourage modularity, one may combine multiple actors in a learning curriculum. Each actor is defined by the episode number at which to start intervening, the episode number at which to stop intervening, the time step within the episode at which it should intervene, and the episode periodicity of interventions.

from causal_world.task_generators import generate_task
from causal_world.envs import CausalWorld
from causal_world.intervention_actors import GoalInterventionActorPolicy
from causal_world.wrappers.curriculum_wrappers import CurriculumWrapper

task = generate_task(task_generator_id='reaching')
env = CausalWorld(task, skip_frame=10, enable_visualization=True)
env = CurriculumWrapper(env,
                        intervention_actors=[GoalInterventionActorPolicy()],
                        actives=[(0, 1000000000, 1, 0)])

for reset_idx in range(30):
    obs = env.reset()
    for time in range(100):
        desired_action = env.action_space.sample()
        obs, reward, done, info = env.step(action=desired_action)
env.close()

Train Your Agents

from causal_world.task_generators import generate_task
from causal_world.envs import CausalWorld
from stable_baselines import PPO2
import tensorflow as tf
from stable_baselines.common.policies import MlpPolicy
from stable_baselines.common.vec_env import SubprocVecEnv


def _make_env(rank):
    def _init():
        task = generate_task(task_generator_id='pushing')
        env = CausalWorld(task=task)
        return env
    return _init

policy_kwargs = dict(act_fun=tf.nn.tanh, net_arch=[256, 128])
env = SubprocVecEnv([_make_env(rank=i) for i in range(10)])
model = PPO2(MlpPolicy,
             env,
             policy_kwargs=policy_kwargs,
             verbose=1)
model.learn(total_timesteps=1000000)
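
Once training finishes, you will typically save the policy so it can be loaded again, e.g. by the evaluation pipeline in the next section. The path below is only an example chosen to match that snippet (stable-baselines stores the model as a .zip archive):

# save the trained policy to disk
model.save("./model_pushing_curr0")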

Disentangling Generalization

from causal_world.evaluation import EvaluationPipeline
from causal_world.benchmark import PUSHING_BENCHMARK
import causal_world.evaluation.visualization.visualiser as vis
from stable_baselines import PPO2
task_params = dict()
task_params['task_generator_id'] = 'pushing'
world_params = dict()
world_params['skip_frame'] = 3
evaluation_protocols = PUSHING_BENCHMARK['evaluation_protocols']
evaluator_1 = EvaluationPipeline(evaluation_protocols=evaluation_protocols,
                                 task_params=task_params,
                                 world_params=world_params,
                                 visualize_evaluation=False)
evaluator_2 = EvaluationPipeline(evaluation_protocols=evaluation_protocols,
                                 task_params=task_params,
                                 world_params=world_params,
                                 visualize_evaluation=False)
stable_baselines_policy_path_1 = "./model_pushing_curr0.zip"
stable_baselines_policy_path_2 = "./model_pushing_curr1.zip"
model_1 = PPO2.load(stable_baselines_policy_path_1)
model_2 = PPO2.load(stable_baselines_policy_path_2)

def policy_fn_1(obs):
    return model_1.predict(obs, deterministic=True)[0]

def policy_fn_2(obs):
    return model_2.predict(obs, deterministic=True)[0]
scores_model_1 = evaluator_1.evaluate_policy(policy_fn_1, fraction=0.005)
scores_model_2 = evaluator_2.evaluate_policy(policy_fn_2, fraction=0.005)
experiments = dict()
experiments['PPO(0)'] = scores_model_1
experiments['PPO(1)'] = scores_model_2
vis.generate_visual_analysis('./', experiments=experiments)

Meta-Learning

CausalWorld naturally supports meta-learning and multi-task learning by allowing do-interventions on many of the shared high-level variables. Further, by splitting the set of parameters into a set A, intended for training and in-distribution evaluation, and a set B, intended for out-of-distribution evaluation, we make it easy for users to define meaningful distributions of environments for training and evaluation.

We also support online task distributions, where the task distribution is not known a priori, which is closer to what the robot will face in real life.

from causal_world.envs import CausalWorld
from causal_world.task_generators import generate_task


task = generate_task(task_generator_id='stacked_blocks')
env = CausalWorld(task=task, enable_visualization=True)
env.reset()
for _ in range(10):
    for i in range(200):
        obs, reward, done, info = env.step(env.action_space.sample())
    goal_intervention_dict = env.sample_new_goal()
    print("new goal chosen: ", goal_intervention_dict)
    success_signal, obs = env.do_intervention(goal_intervention_dict)
    print("Goal Intervention success signal", success_signal)
env.close()

Imitation-Learning

Using the different tools provided in the framework, it is straightforward to perform imitation learning (at least in the form of behavior cloning) by logging the data you want while using your favourite controller.

from causal_world.envs.causalworld import CausalWorld
from causal_world.task_generators import generate_task

from causal_world.loggers.data_recorder import DataRecorder
from causal_world.loggers.data_loader import DataLoader

data_recorder = DataRecorder(output_directory='pushing_episodes',
                             rec_dumb_frequency=11)
task = generate_task(task_generator_id='pushing')
env = CausalWorld(task=task,
                  enable_visualization=True,
                  data_recorder=data_recorder)

for _ in range(23):
    env.reset()
    for _ in range(50):
        env.step(env.action_space.sample())
env.close()

data = DataLoader(episode_directory='pushing_episodes')
episode = data.get_episode(14)
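
From the loaded episode you can then assemble state/action pairs for behavior cloning. A minimal sketch, assuming the returned episode object exposes observations and robot_actions lists (check the DataLoader/episode API for the exact attribute names):

import numpy as np

states = np.asarray(episode.observations)     # assumed attribute name
actions = np.asarray(episode.robot_actions)   # assumed attribute name
# ...fit your favourite regressor or policy network on (states, actions)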

Sim2Real

Once you are ready to transfer your policy to the real world and have 3D-printed your own robot in the lab, you can do so easily.

from causal_world.envs import CausalWorld
from causal_world.task_generators import generate_task
from causal_world.sim2real_tools import TransferReal


task = generate_task(task_generator_id='reaching')
env = CausalWorld(task=task, enable_visualization=True)
env = TransferReal(env)
for _ in range(20):
    for _ in range(200):
        obs, reward, done, info = env.step(env.action_space.sample())

Baselines

We establish baselines for some of the available tasks under different learning algorithms, thus verifying the feasibility of these tasks.

The tasks are: Pushing, Picking, Pick And Place, Stacking2

Reproducing Baselines

Go here to reproduce the baselines

Authors

CausalWorld is work done by Ossama Ahmed (ETH Zürich), Frederik Träuble (MPI Tübingen), Anirudh Goyal (MILA), Alexander Neitz (MPI Tübingen), Manuel Wüthrich (MPI Tübingen), Yoshua Bengio (MILA), Bernhard Schölkopf (MPI Tübingen) and Stefan Bauer (MPI Tübingen).

Citing CausalWorld

@inproceedings{
ahmed2021causalworld,
title={CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning},
author={Ossama Ahmed and Frederik Tr{\"a}uble and Anirudh Goyal and Alexander Neitz and Manuel W{\"u}thrich and Yoshua Bengio and Bernhard Sch{\"o}lkopf and Stefan Bauer},
booktitle={International Conference on Learning Representations},
year={2021}
}

Acknowledgments

The authors would like to thank Felix Widmaier (MPI Tübingen), Vaibhav Agrawal (MPI Tübingen) and Shruti Joshi (MPI Tübingen) for the useful discussions and for the development of the TriFinger robot’s simulator, which served as a starting point for the work presented in this paper.

Contributing

A guide on how you can contribute to causal_world is coming soon.

List of publications and submissions using CausalWorld:


causalworld's Issues

computation of PD gains and save-state

'_latest_full_state' is used for computing the PD feedback but is updated at different places in the code, which might cause trouble.

For a more consistent data flow, it might be good to update the state only once, and higher up the calling stack.

I suggest updating it once at the beginning of do_simulation (as a copy of the current state). See also the other issue about the end-effector position.

Also, _latest_full_state should be stored by 'save_state' and restored by 'restore_state' of 'task'. Otherwise, the PD controller is in a weird state after a restoration.

trifinger_pro

Hey,

I would like to use trifinger_pro (trifinger), but some STL files are missing (SIM_*.stl). Is it possible to add them again?
Also, I get the following error if I try to run the pro version (with the STL files taken from the challenge code):

Traceback (most recent call last):
  File "~/.local/share/virtualenvs/mbrl-COlWRWWx/src/causalworld/python/src/causal_world/task_generators/base_task.py", line 677, in init_task
    self.save_state()
  File "~/.local/share/virtualenvs/mbrl-COlWRWWx/src/causalworld/python/src/causal_world/task_generators/base_task.py", line 157, in save_state
    self._robot.get_full_env_state()
  File "~/.local/share/virtualenvs/mbrl-COlWRWWx/src/causalworld/python/src/causal_world/envs/robot/trifinger.py", line 118, in get_full_env_state
    return self.get_current_scm_values()
  File "~/.local/share/virtualenvs/mbrl-COlWRWWx/src/causalworld/python/src/causal_world/envs/robot/trifinger.py", line 539, in get_current_scm_values
    [WorldConstants.VISUAL_SHAPE_IDS[robot_finger_link]][7][:3]
IndexError: tuple index out of range

Best,
Sebastian

correctness and speed of TriFinger._safety_torque_check

Issue:

  • _safety_torque_check is slow and possibly incorrect

Proposed solution:

  • implement a fix that speeds up the function (fix 1 or fix 2)
  • optionally also change the logic of the function (fix 2), which would break backward compatibility

I profiled the causalworld env.step() function.
It seems roughly 66% of the time is spent in the robot's apply_action, most of it in self.step_simulation():

def apply_action(self, action):
    """
    Applied the passed action to the robot.

    :param action: (nd.array) the action to be applied. Should adhere to
                               the action_mode.

    :return: None.
    """
    self._control_index += 1
    clipped_action = self._robot_actions.clip_action(action)
    action_to_apply = clipped_action
    if self._normalize_actions:
        action_to_apply = self._robot_actions.denormalize_action(
            clipped_action)
    if self._action_mode == "joint_positions":
        self._last_applied_joint_positions = action_to_apply
        for _ in range(self._skip_frame):
            desired_torques = \
                self.compute_pd_control_torques(action_to_apply)
            self.send_torque_commands(
                desired_torque_commands=desired_torques)
            self.step_simulation()
    elif self._action_mode == "joint_torques":
        for _ in range(self._skip_frame):
            self.send_torque_commands(
                desired_torque_commands=action_to_apply)
            self.step_simulation()
    elif self._action_mode == "end_effector_positions":
        #TODO: just a hack since IK is not stable
        if np.isclose(self._latest_full_state['end_effector_positions'],
                      action_to_apply).all():
            joint_positions = self._last_applied_joint_positions
        else:
            joint_positions = self.get_joint_positions_from_tip_positions\
                (action_to_apply, list(self._latest_full_state['positions']))
        self._last_applied_joint_positions = joint_positions
        for _ in range(self._skip_frame):
            desired_torques = \
                self.compute_pd_control_torques(joint_positions)
            self.send_torque_commands(
                desired_torque_commands=desired_torques)
            self.step_simulation()
    else:
        raise Exception("The action mode {} is not supported".
                        format(self._action_mode))
    self._last_action = action
    self._last_clipped_action = clipped_action
    return

Of that, 6% of the step time (40 µs per call) is spent on this safety check implemented in numpy. I did not understand the current implementation of the torque check from a logical point of view, and I also believe it's rather slow:

def _safety_torque_check(self, desired_torques):
    """
    :param desired_torques: (nd.array) the desired torque commands to be applied
                                       to the robot.

    :return: (nd.array) the modified torque commands after applying a safety check.
    """
    applied_torques = np.clip(
        np.asarray(desired_torques),
        -self._max_motor_torque,
        +self._max_motor_torque,
    )
    applied_torques -= self._safety_kd * self._latest_full_state[
        'velocities']
    applied_torques = list(
        np.clip(
            np.asarray(applied_torques),
            -self._max_motor_torque,
            +self._max_motor_torque,
        ))
    return applied_torques

First of all, the following function performs exactly the same computation roughly 5-10 times faster (5 µs per call).
The reason is that self._max_motor_torque is broadcast and np.clip is slow; step() simulation time would speed up by ~4%.
Fix 1

    def _safety_torque_check(self, desired_torques):
        """

        :param desired_torques: (nd.array) the desired torque commands to be applied
                                            to the robot.

        :return: (nd.array) the modified torque commands after applying a safety check.
        """       
        applied_torques = np.maximum(
            np.minimum(
                np.asarray(desired_torques),
                +self._max_motor_torque,
            ),
            -self._max_motor_torque,
        )
        
        applied_torques -=  self._safety_kd * self._latest_full_state[
            'velocities']

        applied_torques = np.maximum(
            np.minimum(
                applied_torques,
                +self._max_motor_torque,
            ),
            -self._max_motor_torque,
        )
        
        return applied_torques.tolist()

In my opinion, this function in general does not implement a reasonable logic.
Currently, the safety factor _safety_kd only takes effect where there are positive torques and positive velocities.
The resulting torque limits are asymmetric, and I don't understand how this makes sense on the robot. Maybe the upper (respectively, lower) bound should be tightened when moving with positive (respectively, negative) velocity.
Fix 2

    def _safety_torque_check(self, desired_torques):
        applied_torques = np.asarray(desired_torques)
        
        # alternative 
        # torque_bound = self._max_motor_torque - np.abs(self._safety_kd * self._latest_full_state['velocities'])
        
        torque_bound_upper = self._max_motor_torque - self._safety_kd * np.maximum(self._latest_full_state['velocities'],0.)
        torque_bound_lower = - self._max_motor_torque - self._safety_kd * np.minimum(self._latest_full_state['velocities'],0.)
        
        # clip upper torque
        applied_torques = np.minimum(
            applied_torques, 
            torque_bound_upper, 
        )
        # clip lower torque
        applied_torques = np.maximum(
            applied_torques, 
            torque_bound_lower, 
        )
        
        return applied_torques.tolist()

How to record higher quality videos?

I'm trying out the Model Predictive Control script provided in the docs. I notice that when setting enable_visualization=True when instantiating a CausalWorld environment, the real-time visualization of the CEM is very high resolution, which is great. However, the recording of the CEM obtained via gym.wrappers.monitoring.video_recorder.VideoRecorder has much poorer quality.

Is there a way to record the videos such that their resolution is as good as the real-time visualization, or at least better than what we currently obtain from the VideoRecorder?

I suspect that one would have to modify the render() function of CausalWorld to achieve this, but I don't know what exactly needs to be done.

is_not_fixed wrong call

base_task.py, l. 502:
if self._stage.get_rigid_objects()[rigid_object].is_not_fixed:

'is_not_fixed' in the if statement is a function but doesn't get called. I guess is_not_fixed is supposed to be a property method.
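
A minimal sketch of the suggested correction, assuming is_not_fixed stays a regular method (alternatively it could be turned into a property):

# base_task.py, l. 502: call the method instead of truth-testing the bound method object
if self._stage.get_rigid_objects()[rigid_object].is_not_fixed():
    ...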

set_full_state of stage broken

We noticed that set_full_env_state of the stage recreates the objects in the scene, which creates a different physical state, so running the simulation from a reloaded state yields different results than the original run.

In case interventions should be applied on the fly, it might be good to only change properties and not recreate objects all the time.

For our use case we only want to store and set states without changing the structure of the envs (no new objects, etc.). For that reason we tried to use the stage.set_full_state/get_full_state pair of functions instead of the full_env_state functions. These didn't work out of the box, but we fixed some errors; see the diffs below.

With these changes we already get smaller deviations between the original environment and a reset copy, but we still get some deviations for the object and we don't know where they come from.

(In general, the list version of the state should be removed for clarity/bug reasons.)

Index: python/src/causal_world/envs/scene/silhouette.py
===================================================================
--- python/src/causal_world/envs/scene/silhouette.py	(date 1597787231831)
+++ python/src/causal_world/envs/scene/silhouette.py	(date 1597787231831)
@@ -211,11 +211,14 @@
                 self._pybullet_client_ids[0])
             position = np.array(position)
             position[-1] -= WorldConstants.FLOOR_HEIGHT
+            cylindrical_position = cart2cyl(np.array(position))
             for name in self._state_variable_names:
                 if name == 'type':
                     state.append(self._type_id)
                 elif name == 'cartesian_position':
                     state.extend(position)
+                elif name == 'cylindrical_position':
+                    state.extend(cylindrical_position)
                 elif name == 'orientation':
                     state.extend(orientation)
                 elif name == 'size':
@@ -445,7 +448,7 @@
         position[-1] += WorldConstants.FLOOR_HEIGHT
         shape_id = pybullet.createVisualShape(
             shapeType=pybullet.GEOM_BOX,
-            halfExtents=self._size / 2,
+            halfExtents=np.asarray(self._size) / 2,
             rgbaColor=np.append(self._color, self._alpha),
             physicsClientId=pybullet_client_id)
         block_id = pybullet.createMultiBody(baseVisualShapeIndex=shape_id,
Index: python/src/causal_world/envs/scene/objects.py
===================================================================
--- python/src/causal_world/envs/scene/objects.py	(date 1597787843999)
+++ python/src/causal_world/envs/scene/objects.py	(date 1597787843999)
@@ -384,6 +384,7 @@
                 self._pybullet_client_ids[0])
             position = np.array(position)
             position[-1] -= WorldConstants.FLOOR_HEIGHT
+            cylindrical_position = cart2cyl(np.array(position))
             if self.is_not_fixed():
                 linear_velocity, angular_velocity = pybullet.\
                     getBaseVelocity(
@@ -395,6 +396,8 @@
                     state.append(self._type_id)
                 elif name == 'cartesian_position':
                     state.extend(position)
+                elif name == 'cylindrical_position':
+                    state.extend(cylindrical_position)
                 elif name == 'orientation':
                     state.extend(orientation)
                 elif name == 'linear_velocity':
@@ -457,15 +460,12 @@
             position = interventions_dict['cartesian_position']
         if 'orientation' in interventions_dict:
             orientation = interventions_dict['orientation']
-        if 'mass' in interventions_dict:
-            self._mass = interventions_dict['mass']
-        if 'friction' in interventions_dict:
-            self._lateral_friction = interventions_dict['friction']
         if 'size' in interventions_dict:
-            self._size = interventions_dict['size']
-            self._set_volume()
-            self.reinit_object()
-        elif 'cartesian_position' in interventions_dict or 'orientation' in \
+            if not np.isclose(self._size, interventions_dict['size']):
+                self._size = interventions_dict['size']
+                self._set_volume()
+                self.reinit_object()
+        if 'cartesian_position' in interventions_dict or 'orientation' in \
                 interventions_dict:
             for i in range(0, len(self._pybullet_client_ids)):
                 position[-1] += WorldConstants.FLOOR_HEIGHT
@@ -474,14 +474,16 @@
                     position,
                     orientation,
                     physicsClientId=self._pybullet_client_ids[i])
-        elif 'mass' in interventions_dict:
-            for i in range(0, len(self._pybullet_client_ids)):
-                pybullet.changeDynamics(
-                    self._block_ids[i],
-                    -1,
-                    mass=self._mass,
-                    physicsClientId=self._pybullet_client_ids[i])
-        elif 'friction' in interventions_dict:
+        if 'mass' in interventions_dict:
+            if not np.isclose(self._mass, interventions_dict['mass']):
+                self._mass = interventions_dict['mass']
+                for i in range(0, len(self._pybullet_client_ids)):
+                    pybullet.changeDynamics(
+                        self._block_ids[i],
+                        -1,
+                        mass=self._mass,
+                        physicsClientId=self._pybullet_client_ids[i])
+        if 'friction' in interventions_dict:
             self._set_lateral_friction(self._lateral_friction)
 
         if 'color' in interventions_dict:

to use OpenGL3 with X11

When I test CausalWorld/tutorials/requesting_task/tutorial_one.py, I run into the following issue:
pybullet build time: May 20 2022 19:43:01
startThreads creating 1 threads.
starting thread 0
started thread 0
argc=2
argv[0] = --unused
argv[1] = --start_demo_name=Physics Server
ExampleBrowserThreadFunc started
X11 functions dynamically loaded using dlopen/dlsym OK!
X11 functions dynamically loaded using dlopen/dlsym OK!
Creating context
Failed to create GL 3.3 context ... using old-style GLX context
Failed to create an OpenGL context
I have failed to solve it; it seems like X11 cannot work with OpenGL 3. Can you help me?

Observation mode "pixel" raises AttributeError

Hi, when trying to create an environment that runs in the observation mode "pixel", the environment crashes due to the attribute self.observation_space not being defined:

from causal_world.envs import CausalWorld
from causal_world.task_generators.task import generate_task

task = generate_task(task_generator_id='stacked_blocks')
env = CausalWorld(task=task,
                  observation_mode='pixel')

Error:

File causalworld.py:238, in CausalWorld._reset_observations_space(self)
    237 def _reset_observations_space(self):
--> 238     if self._observation_mode == "pixel" and self.observation_space is None:
    239         self._stage.select_observations(["goal_image"])
    240         self.observation_space = combine_spaces(
    241             self._robot.get_observation_spaces(),
    242             self._stage.get_observation_spaces())

AttributeError: 'CausalWorld' object has no attribute 'observation_space'

I am not sure if this might be caused by a new gym version. A quick fix is to simply define self.observation_space = None before calling self._reset_observations_space() in the init (line 171).
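
A sketch of that quick fix in CausalWorld's __init__ (line number as reported above):

# causalworld.py, __init__ (around line 171): make the attribute exist before it is checked
self.observation_space = None
self._reset_observations_space()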

Stage using unexpected client ID

Hi, when performing an intervention on the friction of the floor or stage, I noticed that the 'ground truth' info dictionary did not return the new, correct friction values. However, the environment physics did show the effect corresponding to the changed frictions. To me, it seems that this line is a bug: https://github.com/rr-learning/CausalWorld/blob/master/causal_world/envs/scene/stage.py#L516. The client ID there is set to a bool, which is constantly 'True', i.e. interpreted as 1, while the actual client ID is 0. Replacing this line with client = self._pybullet_client_w_o_goal_id fixes the issue for me.

Is this indeed a bug, or a misinterpretation on my side? If a bug, I can set up a pull request if wanted.
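
A sketch of the reporter's suggested one-line fix at the linked location:

# causal_world/envs/scene/stage.py, around L516:
# use the actual (integer) client id instead of a bool that gets interpreted as 1
client = self._pybullet_client_w_o_goal_id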

reward wrong in reaching task

In 'get_reward' in base_task you add the goal_distance to the reward. Given that you typically try to maximize the reward in the RL setting, shouldn't the negative distance be added to the reward instead?
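
A minimal sketch of the sign change the reporter suggests, assuming a goal_distance variable as described in the report:

# base_task.get_reward(): penalize distance to the goal instead of rewarding it
reward += -goal_distance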

Trained baseline models perform poorly

Hello, I have trained the baseline models for all of the tasks and the results are only good for the pushing and picking tasks, and not even that good for the picking one. As for pick and place and stacking, the trained baseline model fails consistently. Is this expected behavior?

I ran the reproduce_experiments.py script and then evaluated each trained model with the evaluation pipeline. I can post the videos and the evaluation script if needed.

Torque-mode wrong achieved_goal

The 'get_achieved_goal' function, in e.g. reaching, returns the end_effector_positions from the '_latest_full_state' dict. This is not updated in 'joint_torque' mode, giving constant values for the achieved goal.

I suggest splitting up the purpose of the last_full_state. There should be a structure to store the end-effector position, e.g. a current_state that is updated just once every step (I think in do_simulation() after the physics update).

The storage of the last_full_state and its update is a different issue that I will report in a moment.
