Spot Sim2Real Infrastructure

License: MIT License


spot-sim2real's Introduction

🤖 Spot-Sim2Real

Spot-Sim2Real is a modular library for development of Spot for embodied AI tasks (e.g., Language-guided Skill Coordination (LSC), Adaptive Skill Coordination (ASC)) -- configuring Spot robots, controlling sensorimotor skills, and coordinating Large Language Models (LLMs).

๐Ÿ“ Setup instructions

Please refer to the setup instructions page for information on how to set up the repo. Note that this repo does not track the dirty status of submodules by default; if you make any intentional changes within the third-party packages, be sure to track them separately.

💻 Connecting to the robot

The computer can be connected to the robot in one of the following modes.

  1. Ethernet (gives the best network speed, but adds cable clutter)
    This mode can be used to create a wired connection with the robot. Useful for teleoperating the robot via the computer.
  2. Access Point Mode
    This is a wireless mode where the robot creates its own wifi network. Connect the robot to this mode to teleoperate it using the controller over long distances. The robot is in Access Point mode if you see a wifi network with a name like spot-BD-*********** (where * is a number).
  3. Client Mode (gives the 2nd best network speed; we usually prefer this)
    This is a wireless mode where the robot is connected to an external wifi network (from a nearby router). The computer should be connected to the same network; a wired connection between the router and the computer will be faster than a wireless one.

Follow the steps from Spot's Network Setup page by Boston Dynamics to connect to the robot.

After setting up Spot in the correct network configuration, add its IP address to your bashrc

echo "export SPOT_IP=<spot's ip address>" >> ~/.bashrc
source ~/.bashrc

Test and ensure that you can ping Spot

ping $SPOT_IP

If you get a response like this, then you are on the right network

(spot_ros) user@linux-machine:~$ ping $SPOT_IP
PING 192.168.1.5 (192.168.1.5) 56(84) bytes of data.
64 bytes from 192.168.1.5: icmp_seq=1 ttl=64 time=8.87 ms
64 bytes from 192.168.1.5: icmp_seq=2 ttl=64 time=7.36 ms

Before starting to run the code, you need to ensure that all ROS env variables are set up properly inside bashrc. Please follow the steps from Setting ROS env variables for proper ROS env var setup.

🖥️ Getting to the repo

Go to the repository

cd /path/to/spot-sim2real/

The code for the demo lies inside the main branch.

# Check your current git branch
git rev-parse --abbrev-ref HEAD

# If you are not in the `main` branch, then checkout to the `main` branch
git checkout main

🚈 Try teleoperating the robot using the keyboard

🚨 Running Emergency Stop

  • Since we do not have a physical emergency stop button (like the large red push buttons), we need to run an e-stop node:
    python -m spot_wrapper.estop
  • Keep this window open at all times; if the robot starts misbehaving, you should be able to quickly press s or the space bar to kill the robot

🎹 Running keyboard teleop

🎮 Instructions to record waypoints (use the joystick to move the robot around)

  • Before running scripts on the robot, waypoints should be recorded. These waypoints live inside the file spot-sim2real/spot_rl_experiments/configs/waypoints.yaml (a quick way to list them is sketched after these steps)

  • Before recording receptacles, make the robot sit at its home position, then run the following command

    spot_reset_home
  • There are 2 types of waypoints that one can record:

    1. clutter - These only require the (x, y, theta) of the target receptacle
    2. place - These require the (x, y, theta) of the target receptacle as well as the (x, y, z) of the exact drop location on the receptacle
  • To record a clutter target, teleoperate the robot (using the joystick) until it is close to the target receptacle, then run the following command

    spot_rl_waypoint_recorder -c <name_for_clutter_receptacle>
  • To record a place target, teleoperate the robot (using the joystick) until it is close to the target receptacle. Then use the joystick's manipulation mode to move the end-effector to the desired (x, y, z) position. Once you are satisfied with the end-effector position, run the following command

    spot_rl_waypoint_recorder -p <name_for_place_receptacle>
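
If you want to quickly check which waypoints have been recorded, something like the following can be used. This is a sketch: it assumes waypoints.yaml has a top-level nav_targets mapping (which matches the WAYPOINTS['nav_targets'] usage later on this page) and that you run it from the repo root.

import yaml

# List the recorded nav targets from the waypoints file (path relative to the repo root)
with open("spot_rl_experiments/configs/waypoints.yaml") as f:
    waypoints = yaml.safe_load(f)

print("Recorded nav targets:", list(waypoints.get("nav_targets", {}).keys()))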

🚀 Running instructions

Running the demo (ASC/LSC/Seq-Experts)

Step1. Run the local launch executable

  • In a new terminal, run the executable as

    spot_rl_launch_local

    This command starts 4 tmux sessions:

    1. roscore
    2. img_publishers
    3. proprioception
    4. tts
  • You can run tmux ls in the terminal to ensure that all 4 tmux sessions are running. All 4 sessions need to stay alive for at least 70 seconds after running spot_rl_launch_local. If any one of them dies before 70 seconds, something has gone wrong and you should rerun spot_rl_launch_local.

  • You should try re-running spot_rl_launch_local at least 2-3 times to see if the issue persists. roscore often takes a while to start, which causes the other nodes to die; re-running can fix this.

  • You can verify if all ros nodes are up and running as expected if the output of rostopic list looks like the following

    (spot_ros) user@linux-machine:~$ rostopic list
    /filtered_hand_depth
    /filtered_head_depth
    /hand_rgb
    /mask_rcnn_detections
    /mask_rcnn_visualizations
    /raw_hand_depth
    /raw_head_depth
    /robot_state
    /rosout
    /rosout_agg
    /text_to_speech
  • If you don't see this output, one of the tmux sessions might be failing. Follow the debugging strategies described in ISSUES.md for triaging and resolving these errors.

Step2. Run ROS image visualization

  • This is the image visualization tool that helps you understand what the robot is seeing and perceiving from the world

    spot_rl_ros_img_vis
  • Running this command will open an image viewer and start printing image frequencies from the different rostopics.

  • If the image frequency corresponding to mask_rcnn_visualizations is large and constant (like below), it means that the bounding box detector has not been fully initialized yet

    raw_head_depth: 9.33 raw_hand_depth: 9.33 hand_rgb: 9.33 filtered_head_depth: 11.20 filtered_hand_depth: 11.20 mask_rcnn_visualizations: 11.20
    raw_head_depth: 9.33 raw_hand_depth: 9.33 hand_rgb: 9.33 filtered_head_depth: 11.20 filtered_hand_depth: 8.57 mask_rcnn_visualizations: 11.20
    raw_head_depth: 9.33 raw_hand_depth: 9.33 hand_rgb: 9.33 filtered_head_depth: 8.34 filtered_hand_depth: 8.57 mask_rcnn_visualizations: 11.20

    Once the mask_rcnn_visualizations frequency starts changing (like below), you can proceed with the next steps

    raw_head_depth: 6.87 raw_hand_depth: 6.88 hand_rgb: 6.86 filtered_head_depth: 4.77 filtered_hand_depth: 5.01 mask_rcnn_visualizations: 6.14
    raw_head_depth: 6.87 raw_hand_depth: 6.88 hand_rgb: 6.86 filtered_head_depth: 4.77 filtered_hand_depth: 5.01 mask_rcnn_visualizations: 5.33
    raw_head_depth: 4.14 raw_hand_depth: 4.15 hand_rgb: 4.13 filtered_head_depth: 4.15 filtered_hand_depth: 4.12 mask_rcnn_visualizations: 4.03
    raw_head_depth: 4.11 raw_hand_depth: 4.12 hand_rgb: 4.10 filtered_head_depth: 4.15 filtered_hand_depth: 4.12 mask_rcnn_visualizations: 4.03

Step3. Reset home in a new terminal

  • This is an important step. Ensure the robot is at its start location and sitting, then run the following command in a new terminal

    spot_reset_home
  • The recorded waypoints are w.r.t the home location. Since the odometry drifts while the robot moves, it is necessary to reset home before the start of every new run

Step4. Emergency stop

Step5. Main demo code in a new terminal

  • Ensure you have correctly added the waypoints of interest by following the instructions to record waypoints

  • In a new terminal you can now run the code of your choice

    1. To run Sequential experts

      spot_rl_mobile_manipulation_env
    2. To run Adaptive skill coordination

      spot_rl_mobile_manipulation_env -m
    3. To run Language instructions with Sequential experts, ensure the usb microphone is connected to the computer

      python spot_rl_experiments/spot_rl/envs/lang_env.py
  • If you are done with the demo of one of the above and want to run another, you do not need to re-run the other sessions and nodes. Running a new command in the same terminal will work just fine. But make sure to bring the robot back to its home location and reset its home using spot_reset_home in the same terminal

Step6. [Optional] Pick with Pose estimation (uses NVIDIA's FoundationPose Model)

  • Ensure FoundationPoseForSpotSim2real is set up as a submodule in the third_party folder; please follow the instructions in SETUP_INSTRUCTIONS.md
  • Currently we only support pose estimation for the bottle, penguin plush toy & paper cup found in FB offices' microkitchen
  • New meshes can be added using a 360° video of the object from any camera (iPhone, Android); the entire process is described in the above repo's README
  • The pose estimation model FoundationPose runs as a microservice & can be communicated with through a pyzmq socket
  • Step 1 also starts the pose estimation service, so no other step is required to start this microservice
  • How to use Pose Estimation? (a sketch follows this list)
    • You can pass two flags, enable_pose_estimation & enable_pose_correction, to the pick skill as skillmanager.pick(enable_pose_estimation=True, enable_pose_correction=True)
    • If you enable pose correction, Spot will first manually correct the object pose (e.g. rotate a horizontal object to be vertical) & place the corrected object back at the same spot
    • Our orientation solver can also correct the object to face the camera, but this incurs an additional pick attempt before place can run, so it is kept False by default
    • Enabling pose estimation helps in two major ways - it informs the grasp API how to approach the object (top-down or from the side), which increases the grasp success probability, & it corrects the object orientation before place is run
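
A minimal sketch of the pick call described above; the import path and the bare constructor are assumptions (check your local tree), only the two flags are taken from these notes.

from spot_rl.envs.skill_manager import SpotSkillManager  # module path assumed

skillmanager = SpotSkillManager()
# Pick with pose estimation and pose correction enabled (flags as documented above)
skillmanager.pick(enable_pose_estimation=True, enable_pose_correction=True)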

Using Spot Data-logger

All logs will be stored inside the data/data_logs directory

Logged keys

The logger will capture Spot's data such that each timestamp's log packet is a dict with the following keys:

"timestamp" : double, # UTC epoch time from time.time()
"datatime": str # human readable corresponding local time as "YY-MM-DD HH:MM:SS"
"camera_data" : [
                    {
                        "src_info" : str, # this is name of camera source as defined in SpotCamIds
                        "raw_image": np.ndarray, # this is spot's camera data as cv2 (see output of Spot.image_response_to_cv2() for more info)
                        "camera_intrinsics": np.ndarray, # this is 3x3 matrix holding camera intrinsics
                        "base_T_camera": np.ndarray, # this is 4x4 transformation matrix of camera w.r.t base frame of robot
                    },
                    ...
                ],
"vision_T_base": np.ndarray, # this is 4x4 transformation matrix of base frame w.r.t vision frame
"base_pose_xyt": np.ndarray, # this is 3 element array representing x,y,yaw w.r.t home frame
"arm_pose": np.array, # this is a 6 element array representing arm joint states (ordering : sh0, sh1, el0, el1, wr0, wr1)
"is_gripper_holding_item": bool, # whether gripper is holding something or not
"gripper_open_percentage": double, # how much is the gripper open
"gripper_force_in_hand": np.ndarray, # force estimate on end-effector in hand frame

Logging data

The data logger is designed to log the data described above at whatever rate sensor data becomes available (which depends on the network setup).

To run the logger asynchronously, simply run the following command in a new terminal

python -m spot_wrapper.data_logger --log_data

This will record data in a while loop; press Ctrl+C to stop the logger. The log file will be saved as data/data_logs/<YY,MM,DD-HH,MM,SS>.pkl
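
To sanity-check a saved log from Python, a sketch like the following can be used; it assumes the pickle holds a list of per-timestamp packet dicts with the keys listed above, and the file name is a placeholder.

import pickle

# Load a saved log (placeholder file name) and print a short summary per packet
with open("data/data_logs/23,11,30-12,00,00.pkl", "rb") as f:
    log_packets = pickle.load(f)

for packet in log_packets:
    print(packet["datatime"], packet["base_pose_xyt"])
    for cam in packet["camera_data"]:
        print("   ", cam["src_info"], cam["raw_image"].shape)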

Warning : This logger will cause motion blur, as camera data is logged while the robot moves. Currently we do not support a Spot-Record-Go protocol for logging

Log replay

It is also possible to replay the logged data (essentially the camera streams that have been logged) using the following command :

python -m spot_wrapper.data_logger --replay="<name_of_log_file>.pkl"

Caution : For replay, the log file SHOULD be a pkl file with the keys listed above

Caution : Please ensure the log file is present inside data/data_logs dir.

🔧 Call skills (non-blocking) without installing spot-sim2real in your home environment

We provide a function that can call skills from a separate conda environment. The skill call itself is non-blocking.

Step1. Follow the Running instructions section to set up the image client in the spot_ros conda environment

Step2. Run skill_executor.py to listen for which skill to use. This will run in the background.

python spot_rl_experiments/spot_rl/utils/skill_executor.py 

Step3. Use ROS to trigger skills from your application. Now you can call skills in a non-blocking way.

# In your application, import rospy to select which skill to run
import rospy # This is the only package you need to install in your environment
rospy.set_param("skill_name_input", "Navigate,desk") # Call the navigation skill to navigate to the desk. This is a non-blocking call.
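
If your application needs to know when the skill has finished, it will need some completion signal from skill_executor.py. The polling sketch below is hypothetical: the convention that the executor clears skill_name_input when the skill is done is an assumption, so check skill_executor.py for the actual mechanism.

import time

import rospy

rospy.set_param("skill_name_input", "Navigate,desk")  # non-blocking request, as above
while rospy.get_param("skill_name_input") != "":      # assumed convention: executor clears the param when done
    time.sleep(0.5)
print("Navigation finished (under the assumed convention)")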

👓 Run Spot-Aria project code

Follow the steps in the project documentation.

โญ Convert pytorch weights to torchscript

To convert pytorch weights to torchscript, please follow Torchscript Conversion Instructions.

📣 Acknowledgement

We thank Naoki Yokoyama for setting up the foundation of the codebase, and Joanne Truong for polishing the codebase. Spot-Sim2Real is built upon Naoki's codebases (bd_spot_wrapper and spot_rl_experiments), adding new features (LLMs, pytest) and improving robustness.

โœ๏ธ Citations

If you find this repository helpful, feel free to cite our papers: Adaptive Skill Coordination (ASC) and Language-guided Skill Coordination (LSC).

@article{yokoyama2023adaptive,
  title={Adaptive Skill Coordination for Robotic Mobile Manipulation},
  author={Yokoyama, Naoki and Clegg, Alexander William and Truong, Joanne and Undersander, Eric  and Yang, Tsung-Yen and Arnaud, Sergio and Ha, Sehoon and Batra, Dhruv and Rai, Akshara},
  journal={arXiv preprint arXiv:2304.00410},
  year={2023}
}

@misc{yang2023adaptive,
    title={LSC: Language-guided Skill Coordination for Open-Vocabulary Mobile Pick-and-Place},
    author={Yang, Tsung-Yen and Arnaud, Sergio and Shah, Kavit and Yokoyama, Naoki and  Clegg, Alexander William and Truong, Joanne and Undersander, Eric and Maksymets, Oleksandr and Ha, Sehoon and Kalakrishnan, Mrinal and Mottaghi, Roozbeh and Batra, Dhruv and Rai, Akshara},
    howpublished={\url{https://languageguidedskillcoordination.github.io/}}
}

License

Spot-Sim2Real is MIT licensed. See the LICENSE file for details.

spot-sim2real's People

Contributors: kavitshah1998, tusharsangam, jimmytyyang, zephirefaith, jdvakil


spot-sim2real's Issues

Record trajectories in super (Skill) class for both base & arm

As suggested by @akshararai in PR #131 -> #131 (comment)

Currently:
Nav skill returns trajectory.
Pick skill returns a dict.

self.gaze_result = {
    "target_object": target_object,
    "time_taken": time.time() - self.start_time,
    "success": self.env.grasp_attempted and success_status_from_user_feedback,
}

Place skill returns a dict

self.place_result = {
    "time_taken": time.time() - self.start_time,
    "success": is_position_within_bounds(
        local_place_target_spot,
        local_ee_pose_spot,
        self.config.SUCC_XY_DIST,
        self.config.SUCC_Z_DIST,
        convention="spot",
    ),
    "place_target": local_place_target_spot,
    "ee_pos": local_ee_pose_spot,
}

We should standardize what they all return when we move to an abstract class.
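
To make the proposal concrete, one possible shape for a standardized skill result is sketched below; the class and field names are hypothetical, not existing repo code.

from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class SkillResult:
    success: bool
    time_taken: float
    trajectory: List[Any] = field(default_factory=list)   # per-step base and/or arm states
    extras: Dict[str, Any] = field(default_factory=dict)  # skill-specific info (e.g. place_target, ee_pos)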

Refactor SpotQRDetector class and Spot Interface

  • Move get_spot_a_tform_b to the main Spot class and replace all get_ABC_T_XYZ calls with this generic function #83
  • Convert all transformations inside class SpotQRDetector to Sophus SE3 instead of magnum
  • Also update the return types inside april_tag_pose_estimator
  • Check all comments and ensure there is no trace of magnum inside aria related files
  • Remove dependence of class Spot from class AprilTagPoseEstimator
  • Refactor convert_XYZ_to_ABC kind of functions to: <library>_<representation>_to_<library>_<representation>, e.g. ros_Pose_to_ros_TransformStamped (see the sketch after this list)
  • Use the SE3 averaging methods from math_utils.py in SpotQRDetector too.
  • Refactor class AprilTagDetector to make it generic enough for spot & aria
  • Update ros_topic_frames.yaml to have a nested frame structure for spot & aria
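
A sketch of the proposed naming convention; the helper below is hypothetical, not existing repo code.

from geometry_msgs.msg import Pose, TransformStamped

def ros_Pose_to_ros_TransformStamped(pose: Pose, frame_id: str, child_frame_id: str) -> TransformStamped:
    """Convert a ROS Pose into a ROS TransformStamped between the two given frames."""
    tf = TransformStamped()
    tf.header.frame_id = frame_id
    tf.child_frame_id = child_frame_id
    tf.transform.translation.x = pose.position.x
    tf.transform.translation.y = pose.position.y
    tf.transform.translation.z = pose.position.z
    tf.transform.rotation = pose.orientation
    return tf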

O

import os

os.environ["OPENAI_API_KEY"] = "<your-openai-api-key>"  # placeholder -- do not hard-code real API keys
import time
from collections import Counter

import magnum as mn
import numpy as np
from spot_wrapper.spot import Spot

from spot_rl.envs.base_env import SpotBaseEnv
from spot_rl.envs.gaze_env import SpotGazeEnv
from spot_rl.real_policy import GazePolicy, MixerPolicy, NavPolicy, PlacePolicy
from spot_rl.utils.remote_spot import RemoteSpot
from spot_rl.utils.utils import (
    WAYPOINTS,
    closest_clutter,
    construct_config,
    get_clutter_amounts,
    get_default_parser,
    nav_target_from_waypoints,
    object_id_to_nav_waypoint,
    place_target_from_waypoints,
)
#from spot_rl.models.whisper import WhisperTranslator
from spot_rl.models.sentence_similarity import SentenceSimilarity

from spot_rl.llm.src.rearrange_llm import RearrangeEasyChain

from hydra import compose, initialize
import subprocess
import time

CLUTTER_AMOUNTS = Counter()
CLUTTER_AMOUNTS.update(get_clutter_amounts())
NUM_OBJECTS = np.sum(list(CLUTTER_AMOUNTS.values()))
DOCK_ID = int(os.environ.get("SPOT_DOCK_ID", 520))

DEBUGGING = False

def main(spot, use_mixer, config, out_path=None):


    #audio_to_text = WhisperTranslator()
    sentence_similarity = SentenceSimilarity()
    #with initialize(config_path='../llm/src/conf'):
    #    llm_config = compose(config_name='config')
    #llm = RearrangeEasyChain(llm_config)
    #print('Give instruction!')
    #audio_to_text.record()
    #instruction = audio_to_text.translate()
    #instruction = 'take the rubik cube from the sining table to the hamper'
    #print(instruction)

    nav_1, pick, nav_2 = 'dining table', 'box', 'hamper'
    print('PARSED', nav_1, pick, nav_2)

    nav_1 = sentence_similarity.get_most_similar_in_list(nav_1, list(WAYPOINTS['nav_targets'].keys()))
    nav_2 = sentence_similarity.get_most_similar_in_list(nav_2, list(WAYPOINTS['nav_targets'].keys()))
    print('Most Similar', nav_1, pick, nav_2)

    print('Calling processes')
    subprocess.Popen(['tmux', 'kill-session', '-t', 'img_pub'])
    cmd = f"tmux new -s img_pub -d '/home/akshara/anaconda3/envs/spot_ros/bin/python -m spot_rl.utils.img_publishers --local --owlvit_label \"{pick}\"'"
    print(cmd)
    s = subprocess.getstatusoutput(cmd)
    print(s)

    policy = SequentialExperts(
            config.WEIGHTS.NAV,
            config.WEIGHTS.GAZE,
            config.WEIGHTS.PLACE,
            device=config.DEVICE,
        )
    env_class = SpotMobileManipulationSeqEnv
    env = env_class(config, spot)
    env.power_robot()
    time.sleep(1)
    count = Counter()
    out_data = []

    waypoint = nav_target_from_waypoints(nav_1)
    observations = env.reset(waypoint=waypoint)
    env.target_obj_name = nav_2

    policy.reset()
    done = False
    expert = Tasks.NAV
    env.stopwatch.reset()
    while not done:
        out_data.append((time.time(), env.x, env.y, env.yaw))

        base_action, arm_action = policy.act(observations, expert=expert)
        nav_silence_only = True
        env.stopwatch.record("policy_inference")
        observations, _, done, info = env.step(
            base_action=base_action,
            arm_action=arm_action,
            nav_silence_only=nav_silence_only,
        )
        
        expert = info["correct_skill"]

        if done:
            env.say("Finished object rearrangement. Heading to dock.")
            waypoint = nav_target_from_waypoints("dock")
            observations = env.reset(waypoint=waypoint)

        policy.nav_policy.reset()
        env.owlvit_pick_up_object_name = pick

        env.stopwatch.print_stats(latest=True)


    # Go to the dock
    env.say("Finished object rearrangement. Heading to dock.")
    waypoint = nav_target_from_waypoints("dock")
    observations = env.reset(waypoint=waypoint)
    expert = Tasks.NAV
    policy.nav_policy.reset()
    while True:

        base_action, arm_action = policy.act(observations, expert=expert)
        nav_silence_only = True
        env.stopwatch.record("policy_inference")
        observations, _, done, info = env.step(
            base_action=base_action,
            arm_action=arm_action,
            nav_silence_only=nav_silence_only,
        )
        try:
            spot.dock(dock_id=DOCK_ID, home_robot=True)
            spot.home_robot()
            break
        except:
            print("Dock not found... trying again")
            time.sleep(0.1)

    print("Done!")

    out_data.append((time.time(), env.x, env.y, env.yaw))

    if out_path is not None:
        data = (
                "\n".join([",".join([str(i) for i in t_x_y_yaw]) for t_x_y_yaw in out_data])
                + "\n"
        )
        with open(out_path, "w") as f:
            f.write(data)


class Tasks:
    r"""Enumeration of types of tasks."""

    NAV = "nav"
    GAZE = "gaze"
    PLACE = "place"


class SequentialExperts:
    def __init__(self, nav_weights, gaze_weights, place_weights, device="cuda"):
        print("Loading nav_policy...")
        self.nav_policy = NavPolicy(nav_weights, device)
        print("Loading gaze_policy...")
        self.gaze_policy = GazePolicy(gaze_weights, device)
        print("Loading place_policy...")
        self.place_policy = PlacePolicy(place_weights, device)
        print("Done loading all policies!")

    def reset(self):
        self.nav_policy.reset()
        self.gaze_policy.reset()
        self.place_policy.reset()

    def act(self, observations, expert):
        base_action, arm_action = None, None
        if expert == Tasks.NAV:
            base_action = self.nav_policy.act(observations)
        elif expert == Tasks.GAZE:
            arm_action = self.gaze_policy.act(observations)
        elif expert == Tasks.PLACE:
            arm_action = self.place_policy.act(observations)

        return base_action, arm_action


class SpotMobileManipulationBaseEnv(SpotGazeEnv):
    node_name = "SpotMobileManipulationBaseEnv"

    def __init__(self, config, spot: Spot):
        super().__init__(config, spot)

        # Nav
        self.goal_xy = None
        self.goal_heading = None
        self.succ_distance = config.SUCCESS_DISTANCE
        self.succ_angle = np.deg2rad(config.SUCCESS_ANGLE_DIST)
        self.gaze_nav_target = None
        self.place_nav_target = None
        self.rho = float("inf")
        self.heading_err = float("inf")

        # Gaze
        self.locked_on_object_count = 0
        self.target_obj_name = config.TARGET_OBJ_NAME

        # Place
        self.place_target = None
        self.ee_gripper_offset = mn.Vector3(config.EE_GRIPPER_OFFSET)
        self.place_target_is_local = False

        # General
        self.max_episode_steps = 1000
        self.navigating_to_place = False

    def reset(self, waypoint=None, *args, **kwargs):
        # Move arm to initial configuration (w/ gripper open)
        self.spot.set_arm_joint_positions(
            positions=np.deg2rad(self.config.GAZE_ARM_JOINT_ANGLES), travel_time=0.75
        )
        # Wait for arm to arrive to position
        # import pdb; pdb.set_trace()
        time.sleep(0.75)
        print("open gripper called in SpotMobileManipulationBaseEnv")
        self.spot.open_gripper()

        # Nav
        if waypoint is None:
            self.goal_xy = None
            self.goal_heading = None
        else:
            self.goal_xy, self.goal_heading = (waypoint[:2], waypoint[2])

        # Place
        self.place_target = mn.Vector3(-1.0, -1.0, -1.0)

        # General
        self.navigating_to_place = False

        return SpotBaseEnv.reset(self)

    def step(self, base_action, arm_action, *args, **kwargs):
        # import pdb; pdb.set_trace()
        _, xy_dist, z_dist = self.get_place_distance()
        place = xy_dist < self.config.SUCC_XY_DIST and z_dist < self.config.SUCC_Z_DIST
        if place:
            print("place is true")

        if self.grasp_attempted:
            print("A")
            grasp = False
        else:
            print("b")
            grasp = self.should_grasp()

        if self.grasp_attempted:
            max_joint_movement_key = "MAX_JOINT_MOVEMENT_2"
        else:
            max_joint_movement_key = "MAX_JOINT_MOVEMENT"

        # Slow the base down if we are close to the nav target for grasp to limit blur
        if (
            not self.grasp_attempted
            and self.rho < 0.5
            and abs(self.heading_err) < np.rad2deg(45)
        ):
            self.slowdown_base = 0.5  # Hz
            print("!!!!!!Slow mode!!!!!!")
        else:
            self.slowdown_base = -1
        disable_oa = False if self.rho > 0.3 and self.config.USE_OA_FOR_NAV else None
        # import ipdb; ipdb.set_trace()
        observations, reward, done, info = SpotBaseEnv.step(
            self,
            base_action=base_action,
            arm_action=arm_action,
            grasp=grasp,
            place=place,
            max_joint_movement_key=max_joint_movement_key,
            disable_oa=disable_oa,
            *args,
            **kwargs,
        )
        if done:
            print("done is true")

        if self.grasp_attempted and not self.navigating_to_place:
            # Determine where to go based on what object we've just grasped

            waypoint_name = self.target_obj_name
            waypoint = nav_target_from_waypoints(waypoint_name)

            #waypoint_name, waypoint = object_id_to_nav_waypoint(self.target_obj_name)
            self.say("Navigating to " + waypoint_name)
            self.place_target = place_target_from_waypoints(waypoint_name)
            self.goal_xy, self.goal_heading = (waypoint[:2], waypoint[2])
            self.navigating_to_place = True
            info["grasp_success"] = True

        return observations, reward, done, info

    def get_observations(self):
        observations = self.get_nav_observation(self.goal_xy, self.goal_heading)
        rho = observations["target_point_goal_gps_and_compass_sensor"][0]
        self.rho = rho
        goal_heading = observations["goal_heading"][0]
        self.heading_err = goal_heading
        self.use_mrcnn = False
        observations.update(super().get_observations())
        observations["obj_start_sensor"] = self.get_place_sensor()

        return observations

    def get_success(self, observations):
        return self.place_attempted


class SpotMobileManipulationSeqEnv(SpotMobileManipulationBaseEnv):
    node_name = "SpotMobileManipulationSeqEnv"

    def __init__(self, config, spot: Spot):
        super().__init__(config, spot)
        self.current_task = Tasks.NAV
        self.timeout_start = float("inf")

    def reset(self, *args, **kwargs):
        observations = super().reset(*args, **kwargs)
        self.current_task = Tasks.NAV
        self.target_obj_name = 0
        self.timeout_start = float("inf")

        return observations

    def step(self, *args, **kwargs):
        pre_step_navigating_to_place = self.navigating_to_place
        observations, reward, done, info = super().step(*args, **kwargs)

        if self.current_task != Tasks.GAZE:
            # Disable target searching if we are not gazing
            self.last_seen_objs = []

        if self.current_task == Tasks.NAV and self.get_nav_success(
            observations, self.succ_distance, self.succ_angle
        ):
            if not self.grasp_attempted:
                self.current_task = Tasks.GAZE
                self.timeout_start = time.time()
                #self.target_obj_name = None
            else:
                self.current_task = Tasks.PLACE
                self.say("Starting place")
                self.timeout_start = time.time()

        if self.current_task == Tasks.PLACE and time.time() > self.timeout_start + 10:
            # call place after 10s of trying
            print("Place failed to reach target")
            done = True


        if not pre_step_navigating_to_place and self.navigating_to_place:
            # This means that the Gaze task has just ended
            self.current_task = Tasks.NAV


        info["correct_skill"] = self.current_task

        self.use_mrcnn = False
        #

        return observations, reward, done, info


if __name__ == "__main__":

    parser = get_default_parser()
    parser.add_argument("-m", "--use-mixer", action="store_true")
    parser.add_argument("--output")
    args = parser.parse_args()
    config = construct_config(args.opts)

    spot = (RemoteSpot if config.USE_REMOTE_SPOT else Spot)("RealSeqEnv")
    if config.USE_REMOTE_SPOT:
        try:
            main(spot, args.use_mixer, config, args.output)
        finally:
            spot.power_off()
    else:
        with spot.get_lease(hijack=True):
            try:
                main(spot, args.use_mixer, config, args.output)
            finally:
                spot.power_off()

Unnecessary Dependencies

  1. File : spot_rl_experiments/spot_rl/envs/spot_base_env.py
    The Spot2Habitat transform SHOULD NOT BE A PART OF the SpotBaseEnv class. It is an unnecessary dependency.

...

Address comments from Heuristic Nav PR

PR whose comments need to be fixed: #99

  • Remove all references to Magnum and use Sophus instead. Reasons -> Magnum installation causes errors with the P1 demo stack; the P1 demo and other external packages all use Sophus, so maintenance is easier
  • Remove images generated (tests/hardware_tests/imagesearch_looking_forward.png)
  • If images are going to be generated in every run, please add the path to those images to .gitignore

Add copyright to each file for open sourcing

Copyright (c) Meta Platforms, Inc. and its affiliates.

This source code is licensed under the MIT license found in the

LICENSE file in the root directory of this source tree.

Move the functionality to power on and shutdown the robot into spot.py

This is an optimization of the current state of the repo.

Currently, the latest "controllers" that are getting added to Nav, Pick and Place each have a shutdown() method which detects the base and docks the robot, or sits down in place if no dock is detected.

This code duplication can be prevented by moving it to spot.py.
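
A sketch of what the consolidated helper in spot.py could look like; dock(dock_id=..., home_robot=...) and power_off() appear elsewhere on this page, while sit() is assumed to exist on the wrapper.

def shutdown(spot, dock_id=None):
    """Dock if a dock id is given and reachable, otherwise sit in place, then power off."""
    try:
        if dock_id is not None:
            spot.dock(dock_id=dock_id, home_robot=True)
        else:
            spot.sit()
    except Exception:
        # Docking failed or dock not found: just sit where we are
        spot.sit()
    finally:
        spot.power_off()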

Open issues from review of P1 demo PR#110

This issue is used to track all open/unresolved conversations from PR - #110

Reviewers
(Please check against your name if you don't have any additional changes / recommendations for PR #110.)

Comments from PR#110 to be reviewed
@zephirefaith
#110 (comment)
#110 (comment)

@jimmytyyang
#110 (comment)
#110 (comment)
#110 (comment)

@akshararai
#110 (comment)
#110 (comment)
#110 (comment)
#110 (comment)
#110 (comment)

Recommended TODOs for improving the PR#110

  • Upgrade the demo_trigger mechanism
  • Please add TODOs as tasks here.

Missing functionality for different default arm configurations

For resetting the arm, right now each environment (that uses the arm) stores the arm's initial configuration differently and resets it inside its own implementation of the reset method.

While this is fine, it blocks the dev from reusing such resets in cases where envs are not to be used (e.g. while using BD APIs directly).

Thus it would make more sense to create default configurations (for stow, carry, ready, etc.) for spot-sim2real and have a wrapper method responsible for achieving those arm configurations, as sketched below.

The envs can further utilize this wrapper method. This will prevent a lot of code duplication and be cleaner in the long run.
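
An illustrative wrapper for named arm configurations; the configuration names and joint angles below are placeholders, not the repo's actual values, and set_arm_joint_positions is used the same way as elsewhere on this page.

import numpy as np

# Placeholder joint angles in degrees, ordered (sh0, sh1, el0, el1, wr0, wr1)
ARM_CONFIGS_DEG = {
    "gaze":  [0, -160, 100, 0, 90, 0],
    "carry": [0, -180, 180, 0, 0, 0],
    "stow":  [0, -180, 180, 0, 0, 0],
}

def set_arm_configuration(spot, name, travel_time=1.0):
    """Move the arm to a named default configuration via the Spot wrapper."""
    positions = np.deg2rad(ARM_CONFIGS_DEG[name])
    spot.set_arm_joint_positions(positions=positions, travel_time=travel_time)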

Fix ROS node init issue in base_env, its parent & its children

The issue is that SpotRobotSubscriberMixin, which is the absolute base class, is the one that initializes the ros node.

SpotBaseEnv, SpotNavEnv, SpotGazeEnv, SpotPlaceEnv keep modifying the ros node name, which finally gets initialized in SpotRobotSubscriberMixin. This system is prone to bugs because SpotSkillManager, which does not inherit from any of the above mentioned classes, will not be able to alter the node name, and a file that utilizes SpotSkillManager cannot have a separate ros node init as that raises a runtime error.

[Spot] `num_steps` stays constant even when Spot is moving, leading to infinite loops

Saw this when MG was called on 12/08/2023. @KavitShah1998 has also seen this in the past for other skills, so it seems like a logic problem at the skill-manager or skill-to-env integration layer.

****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.01, -0.12]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.01, -0.12]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.13]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.14]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.14]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.15]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.15]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.16]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.17]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.17]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.18]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.18]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.19]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.19]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.19]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.20]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 100
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.20]	raw_arm_ac: None
base_action: [0.00, 0.00, -0.11]	arm_action: None
****************************************************num_steps: 101
Execution has not exceeded 500 steps.
raw_base_ac: [0.02, -0.44]	raw_arm_ac: None
base_action: [0.00, 0.00, -0.23]	arm_action: None
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [-0.00, -0.10]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [-0.01, -0.06]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.06]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [-0.00, -0.06]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [-0.00, -0.07]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [-0.00, -0.07]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.08]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.09]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.08]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [-0.01, -0.01]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [-0.00, -0.04]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [-0.00, -0.05]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [-0.00, -0.05]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.06]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.07]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [0.00, -0.07]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102
Execution has not exceeded 500 steps.
raw_base_ac: [0.01, -0.08]	raw_arm_ac: None
base_action: None	arm_action: None
!!!! NO ACTIONS CALLED: moving to next step !!!!
****************************************************num_steps: 102

Gripper does not rotate by 90 deg if place fails

When the robot attempts a place, it approaches the target pose using the Boston Dynamics API. But sometimes the operation fails, and when that happens the robot performs a hardcoded operation to drop the object and retract the arm.

It would be much smoother if, right before placing the object, we also rotate the arm by 90 degrees.

Branch : lang_env

The robot sometimes crashes with an error stating that the command sent to the robot has an end-time too far in the future, which crashes code execution

!!!!!!Slow mode!!!!!!
raw_base_ac: [-0.25, 0.40]	raw_arm_ac: [0.00, 0.00, 0.00, 0.00]
base_action: [-0.06, 0.00, 0.10]	arm_action: None
DETECTIONS STR
None
policy_inference: 0.0065 run_actions: 0.7089 get_observations: 0.0066 total: 0.7219
gater: nav	corrective: 0.007, 0.011, 0.002, 0.048
!!!!!!Slow mode!!!!!!
raw_base_ac: [-0.43, -0.46]	raw_arm_ac: [0.00, 0.00, 0.00, 0.00]
Traceback (most recent call last):
  File "/home/akshara/miniconda3/envs/spot_ros/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/akshara/miniconda3/envs/spot_ros/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/akshara/asc_demo/spot-sim2real/spot_rl_experiments/spot_rl/envs/mobile_manipulation_env.py", line 395, in <module>
    main(spot, args.use_mixer, config, args.output)
  File "/home/akshara/asc_demo/spot-sim2real/spot_rl_experiments/spot_rl/envs/mobile_manipulation_env.py", line 104, in main
    observations, _, done, info = env.step(
  File "/home/akshara/asc_demo/spot-sim2real/spot_rl_experiments/spot_rl/envs/mobile_manipulation_env.py", line 278, in step
    observations, reward, done, info = SpotBaseEnv.step(
  File "/home/akshara/asc_demo/spot-sim2real/spot_rl_experiments/spot_rl/envs/base_env.py", line 331, in step
    self.spot.set_base_velocity(
  File "/home/akshara/asc_demo/spot-sim2real/bd_spot_wrapper/spot_wrapper/spot.py", line 428, in set_base_velocity
    cmd_id = self.command_client.robot_command(
  File "/home/akshara/miniconda3/envs/spot_ros/lib/python3.8/site-packages/bosdyn/client/robot_command.py", line 324, in robot_command
    return self.call(self._stub.RobotCommand, req, _robot_command_value, _robot_command_error,
  File "/home/akshara/miniconda3/envs/spot_ros/lib/python3.8/site-packages/bosdyn/client/common.py", line 237, in processor
    return func(self, rpc_method, request, value_from_response=value_from_response,
  File "/home/akshara/miniconda3/envs/spot_ros/lib/python3.8/site-packages/bosdyn/client/common.py", line 362, in call
    return self.handle_response(response, error_from_response, value_from_response)
  File "/home/akshara/miniconda3/envs/spot_ros/lib/python3.8/site-packages/bosdyn/client/common.py", line 370, in handle_response
    raise exc  # pylint: disable=raising-bad-type
bosdyn.client.robot_command.TooDistantError: bosdyn.api.RobotCommandResponse (TooDistantError): The command end time was too far in the future.
/home/akshara/miniconda3/envs/spot_ros/lib/python3.8/site-packages/rospy/core.py:146: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
  _rospy_logger.warn(msg, *args, **kwargs)
/home/akshara/miniconda3/envs/spot_ros/lib/python3.8/site-packages/rospy/core.py:598: DeprecationWarning: isAlive() is deprecated, use is_alive() instead
  if t.isAlive():
(spot_ros) akshara@akshara-laptop:~/asc_demo/spot-sim2real$ ^C

Observed while running ASC (so far)

Move generic util functions, classes from Aria and Spot to 3rd common package

          This file had some functions such as _rotate_img(), _rectify_image() which are also repeated in aria_live_streamer.py. At some point in the future, we may want to move such functions in some utils file? Similarly, if I am not mistaken, the parse_camera_stream() function in aria_streamer basically does what process_frame() function in aria_live_streamer does but over the entire stream of frames. So perhaps this function can also be refactored into a "utility" to reduce code duplication.

Originally posted by @rutadesai in #110 (comment)

Phantom pick at the end of ASC demo when robot is supposed to go to dock

We saw this several times during the RL-R demo and also during tests inside the apartment space in Fremont.

On finishing all tasks from waypoints.yaml, the robot tries to navigate to the base. While doing so, if it sees any detections, even from a non-registered receptacle, it goes and picks the object (from object_targets inside waypoints.yaml) and takes it to its desired location (again from object_targets inside waypoints.yaml).

On reaching this desired location, it thinks that it has reached the dock & sits down there.

The messages from the log & the attached image are evidence of this. The chair on which the robot detects a dinosaur as a "penguin" is not mapped as a receptacle at all, but Spotty picks from it nevertheless.

policy_inference: 0.0408 run_actions: 0.8101 get_observations: 0.0008 total: 0.8517
[base_env.py]: Saying: Finished object rearrangement. Heading to dock.
open gripper called in SpotMobileManipulationBaseEnv


...

[bounding_box]: Detected: ball, penguin
policy_inference: 0.0361 run_actions: 0.5098 get_observations: 0.0008 total: 0.5467
gater: gaze	corrective: 0.047, -0.025
raw_base_ac: [0.00, -0.00]	raw_arm_ac: [-0.94, 0.90, 0.99, 0.86]
base_action: None	arm_action: [-0.25, -2.56, 2.02, 0.00, 1.57, 0.00]
DETECTIONS STR
penguin,0.09405618906021118,197,120,249,177
[bounding_box]: Detected: penguin
Locked on to target 1 time(s)...
policy_inference: 0.0043 run_actions: 0.5114 get_observations: 0.0012 total: 0.5169
gater: gaze	corrective: 0.062, 0.091
raw_base_ac: [0.00, -0.00]	raw_arm_ac: [-0.34, 0.97, 0.51, -0.98]
base_action: None	arm_action: [-0.28, -2.48, 2.07, 0.00, 1.49, 0.00]
DETECTIONS STR
penguin,0.19008836150169373,175,68,226,133
[bounding_box]: Detected: penguin
Locked on to target 2 time(s)...
policy_inference: 0.0759 run_actions: 0.5098 get_observations: 0.0010 total: 0.5867
gater: gaze	corrective: -0.015, -0.021
raw_base_ac: [0.00, -0.00]	raw_arm_ac: [0.33, 0.98, -0.58, -1.00]
base_action: None	arm_action: [-0.25, -2.40, 2.02, 0.00, 1.40, 0.00]
DETECTIONS STR
penguin,0.33202460408210754,144,51,210,122
[bounding_box]: Detected: penguin
Locked on to target 3 time(s)...
policy_inference: 0.0192 run_actions: 0.5111 get_observations: 0.0009 total: 0.5312
gater: gaze	corrective: -0.009, 0.099
raw_base_ac: [0.00, -0.00]	raw_arm_ac: [0.77, 0.98, -0.76, -1.00]
DETECTIONS STR
penguin,0.36048218607902527,148,37,203,113
[bounding_box]: Detected: penguin
Saved grasp image as /home/kavitshah/fair/spot-sim2real/spot_rl_experiments/spot_rl/grasp_visualizations/1685142245.4967258.png
Lost lock-on!
GRASP CALLED: Aiming at (x, y): (250, 106)!
[base_env.py]: Saying: Grasping penguin
Current grasp_point_in_image state:  MANIP_STATE_SEARCHING_FOR_GRASP
Current grasp_point_in_image state:  MANIP_STATE_SEARCHING_FOR_GRASP
Current grasp_point_in_image state:  MANIP_STATE_SEARCHING_FOR_GRASP
Current grasp_point_in_image state:  MANIP_STATE_SEARCHING_FOR_GRASP
Current grasp_point_in_image state:  MANIP_STATE_MOVING_TO_GRASP
Current grasp_point_in_image state:  MANIP_STATE_MOVING_TO_GRASP
Current grasp_point_in_image state:  MANIP_STATE_MOVING_TO_GRASP
Current grasp_point_in_image state:  MANIP_STATE_MOVING_TO_GRASP
Current grasp_point_in_image state:  MANIP_STATE_MOVING_TO_GRASP
Current grasp_point_in_image state:  MANIP_STATE_MOVING_TO_GRASP
Current grasp_point_in_image state:  MANIP_STATE_MOVING_TO_GRASP
Current grasp_point_in_image state:  MANIP_STATE_MOVING_TO_GRASP
Current grasp_point_in_image state:  MANIP_STATE_MOVING_TO_GRASP
Current grasp_point_in_image state:  MANIP_STATE_GRASP_SUCCEEDED
base_action: None	arm_action: [0.07, -2.76, 2.02, 0.00, 1.22, 0.00]
[base_env.py]: Saying: Navigating to kitchen_sink

...


base_action: [-0.05, 0.00, -0.51]	arm_action: None
policy_inference: 0.0095 run_actions: 0.5094 get_observations: 0.0140 total: 0.5328
gater: nav	corrective: 0.007, 0.011, -0.015, 0.029
raw_base_ac: [0.11, -0.94]	raw_arm_ac: [0.00, 0.00, -0.00, -0.00]
base_action: [0.06, 0.00, -0.49]	arm_action: None
[base_env.py]: Saying: Executing automatic docking
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
Dock not found... trying again
/home/kavitshah/miniconda3/envs/spot_ros/lib/python3.8/site-packages/rospy/core.py:146: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
  _rospy_logger.warn(msg, *args, **kwargs)
/home/kavitshah/miniconda3/envs/spot_ros/lib/python3.8/site-packages/rospy/core.py:598: DeprecationWarning: isAlive() is deprecated, use is_alive() instead
  if t.isAlive():

/home/kavitshah/fair/spot-sim2real/spot_rl_experiments/spot_rl/grasp_visualizations/1685142245.4967258.png

Sleep before grasping

In the base environment there's a sleep of half a second before grasping.

"Briefly pause and get latest gripper image to ensure precise grasp"

I actually think we should not hardcode the 0.5, but instead wait for the next gripper image and continue execution as soon as it arrives (a sketch of this follows). Does this make sense @KavitShah1998?
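
A sketch of the suggested change: block until the next gripper image arrives instead of sleeping for a fixed 0.5 s. The topic and message type here are assumptions (the /hand_rgb topic appears in the rostopic list earlier on this page).

import rospy
from sensor_msgs.msg import Image

def wait_for_next_gripper_image(timeout=1.0):
    """Return the next gripper RGB frame, or None if nothing arrives within the timeout."""
    try:
        return rospy.wait_for_message("/hand_rgb", Image, timeout=timeout)
    except rospy.ROSException:
        return None  # fall back to proceeding without a fresh frame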

<3

from transformers import OwlViTProcessor, OwlViTForObjectDetection
import torch
from PIL import Image
import time
import cv2
import argparse

class OwlVit():
    def __init__(self, labels, score_threshold, show_img):
        #self.device = torch.device('cpu')
        self.device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

        self.model = OwlViTForObjectDetection.from_pretrained('google/owlvit-base-patch32')
        self.model.eval()
        self.model.to(self.device)

        self.processor = OwlViTProcessor.from_pretrained('google/owlvit-base-patch32')

        self.labels = labels
        self.score_threshold = score_threshold
        self.show_img = show_img

    def run_inference(self, img):
        '''
        img: an open cv image in (H, W, C) format
        '''
        # Process inputs
        #img = img.to(self.device)
        inputs = self.processor(text=self.labels, images=img, return_tensors='pt')

        #Target image sizes (height, width) to rescale box predictions [batch_size, 2]
        #target_sizes = torch.Tensor([img.size[::-1]]) this is for PIL images
        target_sizes = torch.Tensor([img.shape[:2]]).to(self.device)
        inputs = inputs.to(self.device)

        # Inference
        with torch.no_grad():
            outputs = self.model(**inputs)

        # Convert outputs (bounding boxes and class logits) to COCO API
        results = self.processor.post_process(outputs=outputs, target_sizes=target_sizes)
        #img = img.to('cpu')

        if self.show_img:
            self.show_img_with_overlaid_bounding_boxes(img, results)
        
        return self.get_most_confident_bounding_box_per_label(results)

    def run_inference_and_return_img(self, img):
        '''
        img: an open cv image in (H, W, C) format
        '''
        #img = img.to(self.device)

        inputs = self.processor(text=self.labels, images=img, return_tensors='pt')
        target_sizes = torch.Tensor([img.shape[:2]]).to(self.device)
        inputs = inputs.to(self.device)
        # Inference
        with torch.no_grad():
            outputs = self.model(**inputs)

        # Convert outputs (bounding boxes and class logits) to COCO API
        results = self.processor.post_process(outputs=outputs, target_sizes=target_sizes)
        #img = img.to('cpu')
        #if self.show_img:
        #    self.show_img_with_overlaid_bounding_boxes(img, results)

        return self.get_most_confident_bounding_box_per_label(results), self.create_img_with_bounding_box(img, results)

    def show_img_with_overlaid_bounding_boxes(self, img, results):
        img = self.create_img_with_bounding_box(img, results)
        cv2.imshow('img', img)
        cv2.waitKey(1)

    def get_bounding_boxes(self, results):
        '''
        Returns all bounding boxes with a score above the threshold
        '''
        boxes, scores, labels = results[0]['boxes'], results[0]['scores'], results[0]['labels']
        boxes = boxes.to('cpu')
        labels = labels.to('cpu')
        scores = scores.to('cpu')

        target_boxes = []
        for box, score, label in zip(boxes, scores, labels):
            box = [round(i, 2) for i in box.tolist()]
            if score >= self.score_threshold:
                target_boxes.append([self.labels[0][label.item()], score.item(), box])
            
        return target_boxes

    def get_most_confident_bounding_box(self, results):
        '''
        Returns the most confident bounding box
        '''
        boxes, scores, labels = results[0]['boxes'], results[0]['scores'], results[0]['labels']
        boxes = boxes.to('cpu')
        labels = labels.to('cpu')
        scores = scores.to('cpu')

        target_box = []
        target_score = -float('inf')

        for box, score, label in zip(boxes, scores, labels):
            box = [round(i, 2) for i in box.tolist()]
            if score >= self.score_threshold:
                if score > target_score:
                    target_score = score
                    target_box = box

        if target_score == -float('inf'):
            return None
        else:
            x1 = int(target_box[0])
            y1 = int(target_box[1])
            x2 = int(target_box[2])
            y2 = int(target_box[3])

            print("location:", x1, y1, x2, y2)
            return x1, y1, x2, y2

    def get_most_confident_bounding_box_per_label(self, results):
        '''
        Returns the most confident bounding box for each label above the threshold
        '''
        boxes, scores, labels = results[0]['boxes'], results[0]['scores'], results[0]['labels']
        boxes = boxes.to('cpu')
        labels = labels.to('cpu')
        scores = scores.to('cpu')

        # Initialize dictionaries to store most confident bounding boxes and scores per label
        target_boxes = {}
        target_scores = {}

        for box, score, label in zip(boxes, scores, labels):
            box = [round(i, 2) for i in box.tolist()]
            if score >= self.score_threshold:
                # If the current score is higher than the stored score for this label, update the target box and score
                if label.item() not in target_scores or score > target_scores[label.item()]:
                    target_scores[label.item()] = score
                    target_boxes[label.item()] = box

        # Format the output
        result = []
        for label, box in target_boxes.items():
            x1 = int(box[0])
            y1 = int(box[1])
            x2 = int(box[2])
            y2 = int(box[3])

            result.append([self.labels[0][label], target_scores[label], [x1, y1, x2, y2]])

        return result

    def create_img_with_bounding_box(self, img, results):
        '''
        Returns an image with all bounding boxes above the threshold overlaid
        '''

        results = self.get_most_confident_bounding_box_per_label(results)
        font = cv2.FONT_HERSHEY_SIMPLEX

        for label, score, box in results:
            img = cv2.rectangle(img, box[:2], box[2:], (255, 0, 0), 5)
            # Draw the label below the box unless it would fall off the bottom of the image
            if box[3] + 25 > img.shape[0]:
                y = box[3] - 10
            else:
                y = box[3] + 25
            img = cv2.putText(
                img, label, (box[0], y), font, 1, (255, 0, 0), 2, cv2.LINE_AA
            )

        return img

    def update_label(self, labels):
        self.labels = labels
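
A hedged usage sketch for the wrapper above: the class name, import path, and constructor arguments (nested labels list, score threshold, show-image flag) are inferred from the attributes used in the methods rather than from a confirmed signature, and the image paths are placeholders.

import cv2

from spot_rl.models import OwlVit  # assumed import path / class name

# Assumed constructor signature; adjust to the actual wrapper definition
detector = OwlVit(labels=[["ball", "cup", "plush toy"]], score_threshold=0.1, show_img=False)

frame = cv2.imread("test_frame.png")  # any BGR (H, W, C) image
detections, annotated = detector.run_inference_and_return_img(frame)

# Each detection is [label, score, [x1, y1, x2, y2]]
for label, score, (x1, y1, x2, y2) in detections:
    print(f"{label}: {float(score):.2f} at ({x1}, {y1}, {x2}, {y2})")

cv2.imwrite("annotated_frame.png", annotated)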

Spot Sim2Real scratch installation on a new desktop

  • pycocotools and tensorflow had to be commented out in environment.yaml for the conda env install
  • the installation guide says to source ~/.bash_profile; change that to ~/.bashrc
  • source ~/.bashrc after adding the exported env lines
  • gcc and g++ had to be installed on the new machines
  • tensorflow (2.9.0) had to be installed separately, specifically for habitat-lab
  • tmux had to be installed on the new machines

Discussion about hardware and software testing

Daily testing for hardware (running an automatic testing script)

  • Low-level motor tests (we do not need these, since we use the BD API)
    • Single joints
    • Whether the base moves when given a command
  • High-level integration tests (see the sketch after this list)
    • Navigation: whether the robot can navigate to the goal point
    • Grasping: whether the robot can grasp the object
    • Perception: whether the robot can return a bounding box
    • Frequency
      • Weekly or daily testing
      • 5-10 minutes asking the robot to run an obstacle course
  • Software
    • CI test for launching the ROS node
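
A minimal pytest-style sketch of the perception integration check above; the import path, constructor arguments, and reference image are assumptions inferred from the OWL-ViT wrapper earlier in these notes, not the repo's actual test harness.

import cv2
import pytest

from spot_rl.models import OwlVit  # assumed import path / class name


@pytest.fixture(scope="module")
def detector():
    # Assumed constructor signature; replace with the real one if it differs
    return OwlVit(labels=[["ball", "cup"]], score_threshold=0.1, show_img=False)


def test_perception_returns_bounding_box(detector):
    img = cv2.imread("tests/assets/ball_on_table.jpg")  # hypothetical test asset
    if img is None:
        pytest.skip("reference image not available on this machine")

    # Each detection is [label, score, [x1, y1, x2, y2]], per the wrapper above
    detections, _ = detector.run_inference_and_return_img(img)
    assert any(label == "ball" for label, _, _ in detections)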

In discussion with Mrinal

Torchscript errors

Issue 1: I am getting this error about missing torchscript weights.

Traceback (most recent call last):
  File "/home/kavitshah/fair/spot-sim2real/spot_rl_experiments/spot_rl/utils/img_publishers.py", line 452, in <module>
    model = OWLVITModel()
  File "/home/kavitshah/fair/spot-sim2real/spot_rl_experiments/spot_rl/utils/img_publishers.py", line 363, in __init__
    self.config = config = construct_config()
  File "/home/kavitshah/fair/spot-sim2real/spot_rl_experiments/spot_rl/utils/utils.py", line 56, in construct_config
    raise KeyError(f"Neither {v} nor {new_v} exist!")
KeyError: 'Neither weights/torchscript/CUTOUT_WT_True_SD_200_ckpt.99.pvp_combined_net.torchscript nor /home/kavitshah/fair/spot-sim2real/spot_rl_experiments/weights/torchscript/CUTOUT_WT_True_SD_200_ckpt.99.pvp_combined_net.torchscript exist!'

Can you add a section in ISSUES.md explaining the steps to resolve it?
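
Until such a section exists, a quick diagnostic sketch (an assumption, not the repo's documented fix) is to compare the file the config expects under weights/torchscript with what is actually on disk, using the paths from the KeyError above:

import os

# Paths taken from the KeyError above; adjust the repo root for your machine
weights_dir = "/home/kavitshah/fair/spot-sim2real/spot_rl_experiments/weights/torchscript"
expected = "CUTOUT_WT_True_SD_200_ckpt.99.pvp_combined_net.torchscript"

present = sorted(os.listdir(weights_dir)) if os.path.isdir(weights_dir) else []
print("Expected:", expected)
print("Present :", present)
if expected not in present:
    print(f"Missing {expected}; download or convert the torchscript weights into {weights_dir}")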

Issue 2: Mobile Manipulation Environment fails (but Sequential Experts run fine)

(spot_ros) kavitshah@frerd001:~/fair/spot-sim2real/spot_rl_experiments/spot_rl/envs$ python mobile_manipulation_env.py -m
/home/kavitshah/miniforge3/envs/spot_ros/lib/python3.8/site-packages/albumentations/augmentations/geometric/functional.py:6: DeprecationWarning: Please use `gaussian_filter` from the `scipy.ndimage` namespace, the `scipy.ndimage.filters` namespace is deprecated.
  from scipy.ndimage.filters import gaussian_filter
/home/kavitshah/miniforge3/envs/spot_ros/lib/python3.8/site-packages/pretrainedmodels/datasets/utils.py:33: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  def __init__(self, size, interpolation=Image.BILINEAR):
/home/kavitshah/fair/spot-sim2real/third_party/mask_rcnn_detectron2/detectron2/detectron2/data/transforms/transform.py:46: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  def __init__(self, src_rect, output_size, interp=Image.BILINEAR, fill=0):
/home/kavitshah/fair/spot-sim2real/third_party/mask_rcnn_detectron2/detectron2/detectron2/data/transforms/augmentation_impl.py:117: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  def __init__(self, shape, interp=Image.BILINEAR):
/home/kavitshah/fair/spot-sim2real/third_party/habitat-lab/habitat/utils/visualizations/maps.py:25: DeprecationWarning: Starting with ImageIO v3 the behavior of this function will switch to that of iio.v3.imread. To keep the current behavior (and make this warning disappear) use `import imageio.v2 as imageio` or call `imageio.v2.imread` directly.
  AGENT_SPRITE = imageio.imread(
/home/kavitshah/miniforge3/envs/spot_ros/lib/python3.8/site-packages/torch/utils/tensorboard/__init__.py:4: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
  if not hasattr(tensorboard, "__version__") or LooseVersion(
2024-02-12 15:44:59,217 - INFO - Current battery charge: 85.0%
2024-02-12 15:44:59,243 - INFO - Starting lease check-in
Loading policy...
/home/kavitshah/miniforge3/envs/spot_ros/lib/python3.8/site-packages/torch/serialization.py:707: UserWarning: 'torch.load' received a zip file that looks like a TorchScript archive dispatching to 'torch.jit.load' (call 'torch.jit.load' directly to silence this warning)
  warnings.warn("'torch.load' received a zip file that looks like a TorchScript archive"
Executing automatic docking
Dock not found... trying again
Dock not found... trying again
2024-02-12 15:45:03,252 - INFO - Robot safely powered off.
2024-02-12 15:45:03,263 - INFO - Lease check-in stopped
2024-02-12 15:45:03,274 - INFO - Returned the lease.
Traceback (most recent call last):
  File "mobile_manipulation_env.py", line 423, in <module>
    main(spot, args.use_mixer, config, args.output)
  File "mobile_manipulation_env.py", line 40, in main
    policy = MixerPolicy(
  File "/home/kavitshah/fair/spot-sim2real/spot_rl_experiments/spot_rl/real_policy.py", line 298, in __init__
    super().__init__(
  File "/home/kavitshah/fair/spot-sim2real/spot_rl_experiments/spot_rl/real_policy.py", line 76, in __init__
    self.policy = policy_class.from_config(
  File "/home/kavitshah/fair/spot-sim2real/third_party/habitat-lab/habitat_baselines/rl/ppo/moe.py", line 172, in from_config
    return cls(
  File "/home/kavitshah/fair/spot-sim2real/third_party/habitat-lab/habitat_baselines/rl/ppo/moe.py", line 442, in __init__
    super().__init__(
  File "/home/kavitshah/fair/spot-sim2real/third_party/habitat-lab/habitat_baselines/rl/ppo/moe.py", line 114, in __init__
    nav_ckpt, self.expert_nav_policy = get_ckpt_policy(nav_ckpt_path)
  File "/home/kavitshah/fair/spot-sim2real/third_party/habitat-lab/habitat_baselines/rl/ppo/moe.py", line 109, in get_ckpt_policy
    ckpt["config"].RL.POLICY["init"] = False
  File "/home/kavitshah/miniforge3/envs/spot_ros/lib/python3.8/site-packages/torch/jit/_script.py", line 838, in __getitem__
    return self.forward_magic_method("__getitem__", idx)
  File "/home/kavitshah/miniforge3/envs/spot_ros/lib/python3.8/site-packages/torch/jit/_script.py", line 831, in forward_magic_method
    raise NotImplementedError()
NotImplementedError
2024-02-12 15:45:03,276 - INFO - signal_shutdown [atexit]
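
The torch.load warning plus the NotImplementedError on ckpt["config"] suggest the nav checkpoint being loaded here is a TorchScript archive, which cannot be indexed like a plain pickled checkpoint. A small diagnostic sketch to tell the two formats apart (the checkpoint path is hypothetical, and this is a debugging aid rather than the actual fix):

import torch

ckpt_path = "weights/nav_ckpt.pth"  # hypothetical path to the checkpoint being loaded

try:
    # TorchScript archives load here and carry no indexable config dict
    torch.jit.load(ckpt_path, map_location="cpu")
    print("TorchScript archive: point the MoE loader at a regular (pickled) checkpoint instead")
except RuntimeError:
    ckpt = torch.load(ckpt_path, map_location="cpu")
    print("Regular checkpoint with keys:", list(ckpt.keys()))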

Pyaudio

To record audio in Python on Linux, you can use the 'pyaudio' library. Here's a simple example showing how to record audio and save it to a WAV file:

  1. First, install 'pyaudio' using pip (if the build fails, you may first need the PortAudio headers, e.g. the portaudio19-dev package on Debian/Ubuntu):
pip install pyaudio
  2. Next, create a Python script with the following code:
import pyaudio
import wave

# Configuration
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
CHUNK = 1024
RECORD_SECONDS = 5
WAV_OUTPUT_FILENAME = "output.wav"

# Initialize PyAudio
audio = pyaudio.PyAudio()

# Open an input (recording) stream
stream = audio.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)

print("Recording...")4gh5p-

frames = []

# Record audio data
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)

print("Finished recording")

# Stop and close the stream
stream.stop_stream()
stream.close()

# Terminate the PortAudio interface
audio.terminate()

# Save the recorded frames to a WAV file
wf = wave.open(WAV_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(audio.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()

print(f"Saved recording to {WAV_OUTPUT_FILENAME}")
