personalrobotics / ada_feeding
Robot-assisted feeding demos and projects for the ADA robot
As it stands, if the moveit2_lock is acquired, the behavior's tick function will block that thread. This is undesirable; instead, it should return RUNNING. (Note that we haven't run into this issue yet because we have designed the trees without Parallels, but it might come up in the future.)
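A minimal sketch of the non-blocking pattern, assuming the lock is a standard threading.Lock stored on the behavior (the class and attribute names are illustrative, not the repo's actual code):

```python
import py_trees


class MoveToExample(py_trees.behaviour.Behaviour):
    """Illustrative behavior that yields instead of blocking on a shared lock."""

    def __init__(self, name: str, moveit2_lock):
        super().__init__(name)
        self.moveit2_lock = moveit2_lock  # e.g., a threading.Lock shared across behaviors

    def update(self) -> py_trees.common.Status:
        # Try to acquire without blocking; if another behavior holds the lock,
        # report RUNNING so the rest of the tree (e.g., a Parallel) keeps ticking.
        if not self.moveit2_lock.acquire(blocking=False):
            return py_trees.common.Status.RUNNING
        try:
            # ... do the MoveIt2 work here ...
            return py_trees.common.Status.SUCCESS
        finally:
            self.moveit2_lock.release()
```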
As discussed in #102, these can be gleaned directly from self.blackboard.namespace and self.blackboard_tree_root.namespace. Additionally, we should be able to get rid of self.blackboard_tree_root entirely.
Proposal: send_goal and get_result will only be run in the root tree, so the distinction is unnecessary in those functions. But get_feedback needs some introspection to work. This can be done with a Visitor. It is generally bad BT design for a given sub-tree to be dependent on knowing the structure of its parent.
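A rough sketch of how a py_trees Visitor could gather feedback while the tree ticks, assuming behaviors that have feedback expose it via an attribute (the attribute name feedback_msg and the aggregation are illustrative):

```python
import py_trees


class FeedbackVisitor(py_trees.visitors.VisitorBase):
    """Collects feedback from whichever behaviors were ticked, without the
    sub-trees needing to know anything about their parents."""

    def __init__(self):
        super().__init__(full=False)
        self.feedback = []

    def initialise(self):
        # Called once per tick, before any behaviors are visited.
        self.feedback = []

    def run(self, behaviour: py_trees.behaviour.Behaviour):
        # Called for every behavior that gets ticked this cycle.
        msg = getattr(behaviour, "feedback_msg", None)  # illustrative attribute
        if msg is not None:
            self.feedback.append(msg)


# Usage: attach to the tree once, then read visitor.feedback after each tick.
# tree = py_trees.trees.BehaviourTree(root)
# visitor = FeedbackVisitor()
# tree.visitors.append(visitor)
```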
Currently, the MoveFromMouth action will sometimes have a cartesian motion back from the mouth, and will then move closer to the user's face during the kinematic motion. That movement closer to the user's face feels scary and unnecessary.

To address this, we should have a wall in front of the user that prevents the robot from moving closer to the user. For safety, we should have this wall for all motions, and only disable it for the cartesian motions to/from the mouth in MoveToMouth / MoveFromMouth. However, if we put this wall too close to the staging pose, it is possible that the end configuration of the cartesian motion in MoveFromMouth leaves the robot in collision with it. Therefore, we might also need to be able to move this wall based on the robot arm pose so it is always just beyond the forktip, essentially preventing the robot from moving closer to the user than it already is.
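One way such a wall could be added is as a box collision object in the MoveIt planning scene. A minimal sketch using standard moveit_msgs messages (frame, dimensions, and pose are placeholders, and the repo may prefer pymoveit2's collision helpers instead):

```python
from geometry_msgs.msg import Pose
from moveit_msgs.msg import CollisionObject, PlanningScene
from shape_msgs.msg import SolidPrimitive


def make_wall_scene_diff(frame_id: str = "root", x_offset: float = 0.4) -> PlanningScene:
    """Build a planning-scene diff that adds a thin vertical wall in front of the user."""
    wall = CollisionObject()
    wall.header.frame_id = frame_id  # placeholder; use the robot's base frame
    wall.id = "in_front_of_user_wall"

    box = SolidPrimitive()
    box.type = SolidPrimitive.BOX
    box.dimensions = [0.01, 1.0, 1.0]  # thin in x, wide/tall in y/z (placeholder sizes)

    pose = Pose()
    pose.position.x = x_offset  # e.g., just beyond the forktip, toward the user
    pose.orientation.w = 1.0

    wall.primitives.append(box)
    wall.primitive_poses.append(pose)
    wall.operation = CollisionObject.ADD

    scene = PlanningScene()
    scene.is_diff = True
    scene.world.collision_objects.append(wall)
    return scene


# Usage: publish the diff on /planning_scene (or apply it via the
# /apply_planning_scene service); re-publish with a new x_offset to keep the
# wall just beyond the forktip as the arm moves.
```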
Currently, the MoveTo behavior executes both planning and execution. However, there might be cases where we want to do multi-stage planning (e.g., plan the movement to the food and the movement within the food) before executing both. To enable something like this, MoveTo must have the option to plan only, execute only, or do both.
We should add GitHub Actions / CI to the ada_feeding repo as a standardized check before anyone merges PRs.
When running screen capture on t0b1 while moving the robot arm, the resulting controller motion is very jerky. My hunch is that this is just due to high CPU utilization, which is preventing the controller from running as fast as it should. I think the best solution is just to not run anything with high CPU utilization while commanding the robot. Nevertheless, I'm creating this issue in case folks notice jerky controller motions in other contexts as well.
Sometimes on LoveLace, during the service calls in pre_moveto_config (i.e., toggle watchdog listener off, re-tare FT sensor, toggle watchdog listener on, and set FT thresholds), the watchdog trips because a watchdog message hasn't been received in 0.5 sec (or 0.1, depending on the parameter setting). However, the update function of that behavior is async, so it should not be blocking. We need to look into why watchdog messages are sometimes not received for a period of time. It may be related to #62, or to this article on deadlock.

Note that both LoveLace and Nano are using Cyclone DDS.
Currently, all the behaviors we developed that take in a ROS node take it in as an argument to __init__. On the other hand, all py_trees_ros behaviors take the node in as a kwarg to setup(). This issue is to move all our behaviors to take in the node during setup, to unify them with py_trees_ros.
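For reference, py_trees_ros passes the node through setup's keyword arguments; a minimal sketch of the target pattern (the behavior and error message are illustrative):

```python
import py_trees


class ExampleBehavior(py_trees.behaviour.Behaviour):
    """Behavior that receives the ROS node at setup time, py_trees_ros style."""

    def __init__(self, name: str):
        super().__init__(name)
        self.node = None

    def setup(self, **kwargs):
        # py_trees_ros trees call setup(node=<rclpy node>, ...) on every behavior.
        try:
            self.node = kwargs["node"]
        except KeyError as error:
            raise KeyError(
                f"Behavior '{self.name}' expected a 'node' kwarg in setup()"
            ) from error

    def update(self) -> py_trees.common.Status:
        self.node.get_logger().debug(f"{self.name} ticked")
        return py_trees.common.Status.SUCCESS
```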
Right now, our collision scene (#39) has several issues:
- We have a wheelchair_collision_object which is meant to expand the wheelchair to account for the user's body. However, users have a variety of body types. We have currently implemented a wheelchair_collision_object focused on a large, tall body type. However, if someone shorter sits in the wheelchair, their head is inside the wheelchair_collision_object, which means the robot cannot move to their head. We currently address this by turning off the wheelchair_collision_object (see #64), but that is dangerous. Instead, we should do one or more of the following:
  - Make the wheelchair_collision_mesh more granular: as opposed to having one mesh that expands the wheelchair and adds the user's body, separate this into two meshes. Then, when we move the head mesh in bite transfer, we should also move the body mesh, and we won't have to turn off the body mesh during transfer.

Now that #102 has been merged, we have a concise way to specify inputs and outputs for every behavior. This issue is to update all behaviors/decorators to subclass BlackboardBehavior, and then update the idioms/trees that use those behaviors accordingly.
So we aren't consuming GPU constantly
Currently, MoveIt2Plan only executes a single cartesian plan. However, due to limitations in the cartesian interpolator, this planning call may only complete a fraction of the requested distance. Empirically, I have found that just making 1-2 additional cartesian planning calls from the end of the previous one can be sufficient to complete the motion. Thus, we should consider adding a flag to MoveIt2Plan that allows it to make up to n cartesian planning attempts until the end of its trajectory is at the goal, and then concatenate them together.
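A rough sketch of the retry-and-concatenate idea; plan_segment and at_goal below are placeholders for whatever MoveIt2Plan uses internally, and only the trajectory stitching is concrete:

```python
import copy

from builtin_interfaces.msg import Duration
from trajectory_msgs.msg import JointTrajectory


def concatenate_trajectories(first: JointTrajectory, second: JointTrajectory) -> JointTrajectory:
    """Append `second` to `first`, shifting time_from_start by `first`'s duration."""
    combined = copy.deepcopy(first)
    offset = first.points[-1].time_from_start if first.points else Duration()
    for point in second.points:
        shifted = copy.deepcopy(point)
        total_ns = (
            (offset.sec + point.time_from_start.sec) * 1_000_000_000
            + offset.nanosec
            + point.time_from_start.nanosec
        )
        shifted.time_from_start = Duration(
            sec=total_ns // 1_000_000_000, nanosec=total_ns % 1_000_000_000
        )
        combined.points.append(shifted)
    return combined


def plan_cartesian_with_retries(plan_segment, goal, at_goal, max_attempts: int = 3):
    """Chain up to `max_attempts` cartesian segments, re-planning from wherever the
    previous segment ended, until the trajectory reaches the goal."""
    trajectory = None
    for _ in range(max_attempts):
        # Placeholder: plan the next segment starting from the end of what we have so far.
        segment = plan_segment(trajectory, goal)
        if segment is None or not segment.points:
            break
        trajectory = segment if trajectory is None else concatenate_trajectories(trajectory, segment)
        if at_goal(trajectory.points[-1], goal):  # placeholder goal check
            break
    return trajectory
```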
Currently, the MoveTo behavior has its own joint state subscriber. However, the MoveIt2 object already has a joint state subscriber that can be accessed via a property. Therefore, we should modify the MoveTo behavior to use the MoveIt2 joint state property.
We should have more consistency (and more documentation/comments) about where we follow which collision constraints. It took me a long time to understand that the collision constraints in the Feeding Demo (e.g., this) are actually empty as far as I can tell, and that to get the world collision constraints you have to call a function in aikido (e.g., this call to this aikido function). Before I realized which collision constraints were being used, I'd freely copy/paste a call to planToConfiguration without considering which collision constraints were being used. This makes it more challenging for new developers to onboard onto ADA.

See #11 for the problem that led me to realize the confusion and inconsistencies in how we treat collisions.

As a solution, I'd propose the following:
- Add a function (e.g., getWorldCollisionConstraint()) for getting the collision constraints that are currently gotten from the aikido function.
- Rework mCollisionFreeConstraint and mCollisionFreeConstraintWithWallFurtherBack. The new function that gets a blank collision constraint should be called something like getEmptyCollisionConstraint().

When feeding a person, sometimes the head is perceived in a very strange orientation. (This is particularly exacerbated if it is a mannequin head, or if the person is wearing a mask.) That perceived head is added to the aikido world. It is not an issue when moving towards the person, because we ignore collision constraints. However, when moving to other poses after feeding the person, we do check for collisions, and the wrongly-detected head makes some configurations wrongly infeasible.
See the attached image of a wrongly detected head that is messing up collision checking.
I see two ways to fix this:
This issue was motivated by a few specific cases, although we should come up with a generic way to handle it:
These are all about checking the start configuration of the robot arm relative to goal/path constraints and adjusting planning accordingly, which is why I think there should be a unified, generic, elegant solution to it.
The ROS2 ADA feeding system needs face detection. This issue encompasses the following:
The user needs a way to start the entire robot software from the app. This will be relevant for two scenarios:
To enable this, we should do the following:
In this way, as long as the web app is running, the user can start the whole system. If the web app is not running (which is easy to detect) or the Flask server is not running (which we need to surface to the user on the web app), the user can have someone restart LoveLace, after which everything should be running.
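A minimal sketch of what such a Flask endpoint could look like; the route, package, and launch file names are placeholders, and this assumes the Flask server itself is started on boot (e.g., via a systemd service):

```python
import subprocess

from flask import Flask, jsonify

app = Flask(__name__)
launch_process = None  # handle to the currently running launch, if any


@app.route("/start", methods=["POST"])
def start_robot_software():
    """Start the ROS2 launch in a separate process if it isn't already running."""
    global launch_process
    if launch_process is not None and launch_process.poll() is None:
        return jsonify(status="already_running")
    # Placeholder package/launch file names; the real ones live in ada_feeding.
    launch_process = subprocess.Popen(
        ["ros2", "launch", "ada_feeding", "ada_feeding_launch.xml"]
    )
    return jsonify(status="started")


@app.route("/status", methods=["GET"])
def status():
    """Let the web app check whether the launch process is alive."""
    running = launch_process is not None and launch_process.poll() is None
    return jsonify(running=running)


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```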
Currently, the pre_moveto_config idiom waits for the re-taring service to succeed and then immediately re-enables the watchdog listener. However, the core question is: do the FT sensor readings pause for ~0.75s before the service returns success, or after?
- If before: pre_moveto_config is fine; just modify dummy_ft_sensor to mimic this behavior.
- If after: the fix belongs in pre_moveto_config; no need to modify dummy_ft_sensor.

Currently, the MoveToMouth tree moves to the staging location, waits to detect the mouth, and then moves to the user. The challenge with this is that while the robot is detecting the face, it is unclear to the user what is going on (the app just shows "thinking"), and hence it is unclear to them what they should do to facilitate successful robot behaviors (e.g., move their head, teleoperate the arm down, etc.).
Instead, we should separate bite transfer into two separate action calls:
- MoveToStagingConfiguration should move to the staging configuration.
- MoveToMouth should take in the results of face detection and move to the mouth.

For robustness, MoveToMouth should have some fallbacks in case the detected face is stale, plus it should convert it to the base frame at the timestamp in the message, in case the robot has moved since.
As referenced in #102: write unit tests for ada_feeding.helpers.quat_between_vectors.
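A starting-point pytest sketch, assuming quat_between_vectors takes two 3-vectors and returns a quaternion in (x, y, z, w) order that rotates the first vector onto the second; adjust the unpacking if the real signature or return type differs:

```python
import numpy as np
import pytest
from scipy.spatial.transform import Rotation

from ada_feeding.helpers import quat_between_vectors


@pytest.mark.parametrize(
    "v_from, v_to",
    [
        ((1, 0, 0), (0, 1, 0)),               # 90-degree rotation about z
        ((1, 0, 0), (-1, 0, 0)),              # antiparallel vectors (edge case)
        ((0.3, -0.2, 0.9), (0.5, 0.5, 0.1)),  # arbitrary directions
    ],
)
def test_quat_rotates_from_onto_to(v_from, v_to):
    quat = quat_between_vectors(np.array(v_from, float), np.array(v_to, float))
    # If the helper returns a geometry_msgs Quaternion, unpack it as
    # [quat.x, quat.y, quat.z, quat.w] instead.
    rotation = Rotation.from_quat(np.array(quat, float))
    rotated = rotation.apply(np.array(v_from, float) / np.linalg.norm(v_from))
    expected = np.array(v_to, float) / np.linalg.norm(v_to)
    assert np.allclose(rotated, expected, atol=1e-6)
```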
There are two issues with the way MoveTo feedback is currently structured:
Currently, the AcquireFood action assumes the camera is in the same pose as it was when the mask was captured from the web app. Although that is mostly true, the one case where it might not be is if the app user refreshes the page while the robot is moving. As the page is unmounting, the app will terminate the action, and then as it is re-mounting, it will call the action with the same parameters as when it was originally called. That is because the React app cannot differentiate between the page mounting when the user just transitions to it versus the page mounting when it refreshes, and the "robot motion" pages are designed to call the robot action as soon as they mount.
To address this, when AcquireFood computes the food frame, it should first use the TF tree to get the transform from the camera to the base frame at the time included in the action goal's header, and then compute the food frame relative to that.
(Note, there is a separate question of whether it is desirable to have the app re-call the robot action if the page is refreshed, and, if not, how we can architect the app to avoid that. However, the above seems like a good improvement to make to acquisition code regardless, and we can also make the change to app architecture if needed.)
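A sketch of the TF lookup in question, assuming the action goal carries a header stamped at capture time; the frame names are placeholders:

```python
from rclpy.duration import Duration
from rclpy.time import Time
from tf2_ros import Buffer, TransformListener


def camera_to_base_at_capture_time(tf_buffer: Buffer, goal_header):
    """Look up the camera-to-base transform at the time the mask was captured,
    rather than at the current time, so a moved camera doesn't skew the food frame."""
    return tf_buffer.lookup_transform(
        target_frame="root",                        # placeholder base frame
        source_frame="camera_color_optical_frame",  # placeholder camera frame
        time=Time.from_msg(goal_header.stamp),      # timestamp from the action goal's header
        timeout=Duration(seconds=0.5),
    )


# Usage (inside AcquireFood's goal processing, roughly):
#   tf_buffer = Buffer()
#   tf_listener = TransformListener(tf_buffer, node)
#   transform = camera_to_base_at_capture_time(tf_buffer, goal.header)
#   ...then compute the food frame relative to `transform`.
```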
Currently, after ada_feeding is launched, the watchdog fails until the e-stop button is clicked once. Further, ada_moveit currently kills itself if the watchdog is failing. As a result, the launch order is:
1. Launch ada_feeding (and click the e-stop button once so the watchdog passes).
2. Launch ada_moveit.

This is non-ideal because we have a user-in-the-loop launch sequence. Ideally, we should be able to launch the code once, and then the user pushes the e-stop button (for the watchdog startup condition) and continues with feeding.
I can think of the following ways to address it:
- The watchdog_listener that kills the controller starts "off" and must be turned on before any motion. The downside of this is that if the programmer forgets to turn it on before a motion, then that motion is not protected by the watchdog.
- The watchdog_listener treats startup conditions differently from regular conditions. Specifically, if a startup condition fails it doesn't kill itself. However, this is also dangerous because a programmer could move the robot without it being protected by the watchdog (if the startup conditions haven't passed).
- Allow a bash command to be specified in the yaml that our code then runs. Open questions: (a) when you run system commands in Python, does the command run in the same process or a different process (the latter is what we want)? (b) what environment variables are set in Python, and does that include all the required environment variables to ros2 launch a node? But even if we get it to work, this feels a bit non-ideal to me, since we are allowing someone to pass in an arbitrary bash command in the yaml that our code will blindly execute.

As you can see, all the above approaches have downsides. Thoughts @taylorkf @egordon?
Some trees make changes (e.g., allowing certain collisions, toggling perception nodes on) that should be undone when the tree ends. The tree can end in a couple of circumstances:

One option is to add cleanup logic in create_action_servers.py that cleans up any state from the action. This Issue is to implement #3.
MoveIt Servo promises to be an easy way to enable users to teleoperate the robot (while respecting collisions), and might make it easier to do cartesian motions of the robot. This issue exists to track progress on implementing MoveIt Servo for ADA. The corresponding PRs are ada_feeding#118, ada_ros2#23, pr_ros_controllers#28, ada_ros2#24, ada_ros2#25.
Anticipated steps are below. Note that this can only be tested on real, since we will be using velocity control which doesn't exist in sim.
Add functionality in ada_feeding to send twist commands to MoveIt Servo, and 0-velocity commands when a key is not being pressed (a rough sketch of such a publisher is below).
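A minimal sketch of publishing twist commands to MoveIt Servo from rclpy; the topic name is a commonly used default for the servo node, and the frame and rate are placeholders that should come from our servo config:

```python
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import TwistStamped


class ServoTwistPublisher(Node):
    """Publishes a non-zero twist while 'active', and zero twists otherwise."""

    def __init__(self):
        super().__init__("servo_twist_publisher")
        # Assumed default topic for MoveIt Servo's twist interface; verify against the servo config.
        self.pub = self.create_publisher(TwistStamped, "/servo_node/delta_twist_cmds", 10)
        self.active = False  # in a real teleop node, set this from keyboard state
        self.timer = self.create_timer(0.02, self.publish_twist)  # ~50 Hz

    def publish_twist(self):
        msg = TwistStamped()
        msg.header.stamp = self.get_clock().now().to_msg()
        msg.header.frame_id = "root"  # placeholder planning/servo frame
        # Non-zero command only while a key is held; otherwise publish zeros so Servo stops.
        msg.twist.linear.x = 0.05 if self.active else 0.0
        self.pub.publish(msg)


def main():
    rclpy.init()
    rclpy.spin(ServoTwistPublisher())


if __name__ == "__main__":
    main()
```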
Our RealSense is running on a Jetson Nano. On the Nano itself, we can run at least 4 concurrent ros2 topic echo /camera/color/image_raw and ros2 topic echo /camera/depth/image_rect_raw subscribers each and not have issues.
On a computer other than the Nano, we can have one concurrent subscriber each. As soon as we add the second subscriber to the color image, all subscribers to the depth image stop receiving images. Switching to compressed images appears to make the problem better, but it doesn't go away entirely.
This may be related to this comment, although our librealsense is configured for CUDA (I verified that GPU utilization increases when we launch the realsense nodes).
One potential way to address this is to create a republisher that is only subscribed to by nodes running on the same non-Nano machine. However, if the republisher subscribes to too many different camera topics (e.g., color raw, color compressed, depth raw, aligned depth raw) the subscribers in the republisher itself stop receiving images.
Hence, we need to converge upon which ~2 topics all our perception nodes will use (probably compressed color and aligned depth) and develop a republisher for just those. Further, we should consider developing the republisher node such that it only publishes images if there is a subscriber, to save compute power.
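A rough sketch of a lazy republisher for a single topic, assuming raw sensor_msgs/Image in and out (topic names are placeholders, and a real version would cover the ~2 chosen topics and possibly compressed transport):

```python
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image


class LazyImageRepublisher(Node):
    """Republishes a camera topic, but only does work while someone is subscribed."""

    def __init__(self):
        super().__init__("lazy_image_republisher")
        self.pub = self.create_publisher(Image, "/local/camera/color/image_raw", 1)
        self.sub = self.create_subscription(
            Image, "/camera/color/image_raw", self.callback, 1
        )

    def callback(self, msg: Image):
        # Skip the publish entirely when no one is listening, to save bandwidth/CPU.
        # (A fuller version could also drop the camera subscription while idle.)
        if self.pub.get_subscription_count() > 0:
            self.pub.publish(msg)


def main():
    rclpy.init()
    rclpy.spin(LazyImageRepublisher())


if __name__ == "__main__":
    main()
```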
YAML has a way to define variables for re-use throughout a file: https://docs.geoserver.org/main/en/user/styling/ysld/reference/variables.html

We should move any values that are re-used across the file (e.g., staging configuration, above-plate configuration, resting configuration, action types, etc.) to YAML variables, to simplify the file and make it more readable.
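For illustration, standard YAML anchors and aliases could express this (keys and values below are placeholders, not the repo's actual config):

```yaml
# Define the value once with an anchor (&), then re-use it with an alias (*).
staging_configuration: &staging_configuration [0.0, -1.5, 0.0, 2.0, 0.0, 1.0]

move_to_staging:
  goal_configuration: *staging_configuration

move_from_mouth:
  end_configuration: *staging_configuration
```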
Right now, all actions in src/actions are undocumented, while the parameters in the FeedingDemo object are.
We should create a separate PR that focuses on documenting each argument for each action.
Currently, the e-stop condition doesn't close pyaudio or the stream. This can result in the number of channels detected for future calls to pyaudio being 0 until the machine restarts.

To address this, each watchdog condition should have a terminate method. The e-stop condition's terminate method specifically should contain:
```python
self.stream.stop_stream()
self.stream.close()
self.audio.terminate()
```
Then, in ada_watchdog, outside of the spin, we should call a terminate method of the watchdog, which in turn calls terminate for every condition.
Currently, FaceDetection publishes a single mouth location, taken from the largest face in view. We may in the future want to rework this code so that it returns instead a list of all detected mouths, which can then be used by MoveToMouth to determine which mouth is most relevant.
Currently, there are a lot of warnings during build.
We want a node that can output a probability of whether or not there is Food on Fork.
As it stands, the MoveFromMouth cartesian motion often doesn't get to a starting pose that is amenable to planning to the resting configuration in the allotted time. Further, I have seen one case where it stops in a location that is just barely in collision with the in-front-of-wheelchair wall. Therefore, this issue is to lower the tolerance on that motion, so the robot gets closer to the pose and doesn't have the above issues.

(If the issue is instead with cartesian planning only succeeding for, e.g., 80% of the cartesian path, we can do the rest in the action itself as opposed to continuing? Gotta keep thinking about that case.)
Currently, when moving away from the mouth, the web app calls the vanilla MoveAbovePlate and MoveToRestingPosition. However, the trouble with this is that near the mouth, the robot is inside the wheelchair_collision object, so no vanilla action will work. We need an action that initially allows collisions with the wheelchair_collision object (e.g., as the robot moves back to the staging location) and then disallows collisions as it either moves above the plate or moves to the resting position.

Note that currently MoveToMouth (#42) permanently allows collisions with the wheelchair_collision object to account for this, which is dangerous. Therefore, addressing this is a top priority.
Currently, most of our nodes use the default callback group, which only lets one topic/service/action/timer process a callback at a time. This is a huge problem, because it slows down execution and can result in some callbacks not getting called for a while because other callbacks are called first.
We should be very intentional with what executor and callback group we are using for each node and each callback within the node. See the below resources for more details:
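As a minimal illustration of the intended pattern, a sketch that separates a slow service callback from a fast timer callback using callback groups and a multi-threaded executor (the node, service, and timings are placeholders):

```python
import rclpy
from rclpy.callback_groups import MutuallyExclusiveCallbackGroup, ReentrantCallbackGroup
from rclpy.executors import MultiThreadedExecutor
from rclpy.node import Node
from std_srvs.srv import Trigger


class ExampleNode(Node):
    """Places long-running and lightweight callbacks in different callback groups
    so one cannot starve the other."""

    def __init__(self):
        super().__init__("example_node")
        # Callbacks in a reentrant group may run concurrently with each other.
        self.service_group = ReentrantCallbackGroup()
        # Callbacks in a mutually exclusive group never overlap with one another.
        self.timer_group = MutuallyExclusiveCallbackGroup()

        self.srv = self.create_service(
            Trigger, "slow_service", self.slow_callback, callback_group=self.service_group
        )
        self.timer = self.create_timer(0.1, self.fast_callback, callback_group=self.timer_group)

    def slow_callback(self, request, response):
        response.success = True
        return response

    def fast_callback(self):
        self.get_logger().debug("fast callback ran")


def main():
    rclpy.init()
    node = ExampleNode()
    # A multi-threaded executor is required for callbacks to actually run in parallel.
    executor = MultiThreadedExecutor()
    executor.add_node(node)
    executor.spin()


if __name__ == "__main__":
    main()
```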
Currently, the create_move_to_tree function is in charge of setting every behavior's logger to the node's logger. But it is easy to forget to do so, and if we use an idiom, it becomes unwieldy to do so for all children of the idiom.

Instead, create_action_servers.py should iterate over the tree as part of setting it up and set every behavior's logger to the node's logger. This also makes sense because create_action_servers.py is the layer where the ROS connection necessarily comes in, not lower layers.
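A sketch of what that setup step could look like, assuming we keep assigning the node's logger directly to each behavior as the issue describes:

```python
import py_trees
from rclpy.node import Node


def set_tree_loggers(root: py_trees.behaviour.Behaviour, node: Node) -> None:
    """Walk the whole tree and point every behavior's logger at the node's logger."""
    for behavior in root.iterate():
        behavior.logger = node.get_logger()
```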
Currently, we have individual MoveTo calls with set constraints/parameters (e.g., in bite transfer). However, some of these constraints can be eliminated if need be. For example, the planning time can be increased. Or the orientation path constraints can be removed.
This issue is to create a generic idiom that lets us specify what constraints/parameters we want on the tree, but also which are optional, how to relax them, and in what order to relax them. The idiom will then produce a Selector that has the default MoveTo, with relaxed versions of it as fallbacks if the initial one fails.
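A rough sketch of such an idiom, where make_move_to is a placeholder factory that builds a MoveTo subtree from a dict of parameters (the Selector signature assumes a py_trees version that takes a memory flag):

```python
from typing import Callable, Dict, List

import py_trees


def move_to_with_relaxations(
    name: str,
    make_move_to: Callable[[Dict], py_trees.behaviour.Behaviour],
    default_params: Dict,
    relaxations: List[Dict],
) -> py_trees.composites.Selector:
    """Build a Selector that tries the default MoveTo first, then progressively
    relaxed versions (e.g., longer planning time, no orientation constraint)."""
    children = [make_move_to(default_params)]
    for relaxation in relaxations:
        children.append(make_move_to({**default_params, **relaxation}))
    return py_trees.composites.Selector(name=name, memory=True, children=children)


# Usage (parameter names are illustrative):
# root = move_to_with_relaxations(
#     "MoveToMouthWithRelaxations",
#     make_move_to=create_move_to_subtree,
#     default_params={"planning_time": 1.0, "orientation_constraint": True},
#     relaxations=[{"planning_time": 5.0}, {"orientation_constraint": False}],
# )
```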
The MVP Bite Transfer in #42 and #56 results in some scary/dangerous motions, for a few reasons:
The MVP way to fix this is fundamentally moving back to the ROS1 bite transfer:
As part of doing that, we should maybe move back to the center staging location for now, as opposed to the side one. We should also consider moving that staging location down, to account for people who are shorter.
Note that part of the issue is about collision constraints. We really only want to turn off collisions with an expanded mesh close to the head, but we end up turning off collisions with the entire expanded wheelchair collision object (meant to represent the user's body). That is problematic. Even if we do add a separate "wheelchair collision object" (meant to represent a user's body) and a separate "head collision object," the challenge is that people are different sizes, and for some people their head will be inside the wheelchair collision object. So maybe we have to get the Octomap working to deal with this?
Followup to #62 and #92. Now that we have revamped the Executors and CallbackGroups, the other part of ROS2's architecture that may be causing issues is our QualityOfService settings. In many cases, we just stick to a pre-implemented QoS, or we just pass in a queue length like 1, without thinking intentionally about what guarantees we want in each dimension of the QoS policy. This Issue is to address that.
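For reference, a sketch of spelling out each QoS dimension explicitly in rclpy rather than passing a bare queue depth (the values shown are illustrative, not recommendations):

```python
from rclpy.qos import (
    DurabilityPolicy,
    HistoryPolicy,
    QoSProfile,
    ReliabilityPolicy,
)
from sensor_msgs.msg import Image

# Each dimension is stated explicitly instead of relying on the default profile.
camera_qos = QoSProfile(
    history=HistoryPolicy.KEEP_LAST,
    depth=1,                                    # only the latest image matters
    reliability=ReliabilityPolicy.BEST_EFFORT,  # drop frames rather than block on retries
    durability=DurabilityPolicy.VOLATILE,       # late joiners don't need old frames
)

# Usage: node.create_subscription(Image, "/camera/color/image_raw", callback, camera_qos)
```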
Currently, we have dummy behavior trees for all the robot motions. Of these, MoveAbovePlate, MoveToRestingPosition, and MoveToStowLocation are the "easier" ones to implement because they involve moving the arm to a hardcoded configuration.
This issue involves the following. Start with MoveAbovePlate, and only after that works move on to MoveToRestingPosition, and MoveToStowLocation.
The behaviors should go into the behaviors folder, and the tree should go into the trees folder and match the interface for ActionServerBT. Note that you will still need to implement one dummy behavior, for the behavior that actually calls or interacts with MoveIt, but the rest should be final code.

As of #119 above, MoveIt Servo works well for local motions, but for large motions like moving from the home configuration to an at-mouth position, it doesn't work well and often gets stuck in a singularity. If we want to enable farther-distance teleoperation like that, we may have to do one or more of the following:
In MoveToMouth (as of #42), the robot arm often gets closer than 5cm to the detected mouth, and also is oftentimes in front of the nose rather than the mouth. The face detection image seems to be correct. Further, the arm does stay 5cm away from the mouth of the head mesh in the MoveIt planning scene (it determines the planning scene head location based on the stomion detected from face detection). This makes me think that the stomion location detected in FaceDetection may be wrong (although it is possible that the issue lies in the MoveToMouth behaviors).
This issue is focused on checking, in a very detailed manner, how well the detected mouth location aligns with the real mouth location. One way to do this can be by using depth_image_proc to visualize the depth image as a pointcloud in RVIZ, and seeing how the actual mouth center aligns with where the robot moved the mouth center in the MoveIt planning scene.
(Note that if you use depth_image_proc, you have to comment out lines 47-49 and 72-74 for the launch file to work.)
Currently, each behavior that needs access to MoveIt (from pymoveit2) creates its own MoveIt object. This works fine as long as the programmer is very careful, but if they make a mistake (e.g., put two MoveTo behaviors in a Parallel composite) it can mess things up. Therefore, we should instead have one global MoveIt object. However, the downside of this is that constraints set by one MoveTo action might still be there for another (e.g., if one MoveTo branch of the tree gets terminated), so maybe we need to add a clear-constraints decorator as well.
Currently, many of our installs, particularly in ada_feeding_perception, use pip. There are two issues with this:
- pip installs are per-user, whereas apt installs run on a computer level.

Therefore, we should systematically go through our pip installation instructions, determine if there is an apt version of the package, and if so change the documentation to use that version instead.
As of #50, the e-stop button and node work on computers with a separate microphone jack. However, lovelace has a headset jack (TRRS) that doesn't take mono microphone (TS) input. We are currently ordering adapters to allow that, but in the process the mono microphone input may be remapped to a stereo microphone input (TRS). As a result, we may need to modify the e-stop condition of the watchdog to get it to work on lovelace.
When planning with orientation constraints, the resulting robot arm motion feels jerkier. It is smooth in cartesian space, but when sitting in the wheelchair, it feels like some joints change motion suddenly, causing the wheelchair to vibrate.
This issue involves:
- This doesn't apply in sim, so if we re-implement it, it should only be for real.

Reduce movement speed on feeding actions by setting the max velocity scaling factor. This should likely be accomplished at the MoveTo level by accessing the MoveIt settings.