This is the official repository of MaskUKF, an instance segmentation aided Unscented Kalman Filter for 6D object pose tracking.
NEW: Code and scripts for evaluation and testing are available for both the real-time and non-real-time scenarios.
- Dependencies
- Instructions for evaluation
- Instructions for testing
- Structure of the results data
- Results
The code has been tested on Arch Linux with the following dependencies at the indicated versions. Please note that the indicated versions are not the minimum required versions.
armadillo (9.500.2-1)
BayesFilters (0.9.100)
Eigen 3 (3.3.7-2)
ICUBcontrib (1.13.0)
mlpack (4.0.0)
OpenCV (3.4.0)
PCL (1.9.1)
YARP (3.2.1)
Note: while we use Eigen for all the mathematical computations, the library mlpack relies on armadillo.
OpenMP (8.0.0-1) (optional, for faster execution)
We use OpenMP for faster evaluation of the UKF measurement model and for faster evaluation of the ADD-S metric. If possible, you should use a version of mlpack compiled against OpenMP to obtain faster execution of the outlier rejection procedure.
These instructions allow downloading precomputed results of the algorithms MaskUKF, DenseFusion and ICP and evaluating the ADD-S and RMSE metrics.
If you need to test the actual algorithms and recompute the results, please follow the Instructions for testing section. If you have already recomputed the results, you can skip to point (4) for the actual evaluation of the metrics.
- Clone the repository, build and install
      git clone https://github.com/robotology/mask-ukf
      cd mask-ukf
      mkdir build
      cd build
      cmake -DCMAKE_INSTALL_PREFIX=<INSTALL_PATH> [-DUSE_OPENMP=ON] ../
      make install
  Building with OpenMP is optional. <INSTALL_PATH> is the path where the executables will be installed. Please make sure that this path is reachable in your environment, e.g. by adding export PATH=$PATH:<INSTALL_PATH>/bin to your shell configuration.
- Download the zip file containing the results of the algorithms MaskUKF, ICP and DenseFusion on the YCB Video Dataset. We provide the output of the algorithm DenseFusion on all the frames of the dataset (not only on the key frames).
      wget https://zenodo.org/record/3466491/files/results.zip
- Extract the zip file
      unzip results.zip -d <mask-ukf>/results
  where <mask-ukf> is the folder where the repository was cloned. More details on the content of the results data are given in the Structure of the results data section.
- Execute the evaluation using the scripts provided in <mask-ukf>/results/scripts (or ~/robot-code/mask-ukf/results/scripts if you followed the instructions in the Instructions for testing section):
  - the add-s folder contains scripts for the ADD-S metric, both < 2 cm and AUC
  - the rmse folder contains scripts for the RMSE metric
  - the rmse_velocity folder contains scripts for the RMSE for the linear and angular velocity
  Each script file name is of the form eval_<alg>_<scenario>_<segmentation>.sh where:
  - <alg> can be mask-ukf, densefusion or icp
  - <scenario> can be nrt (i.e. masks available at each frame) or rt (i.e. masks from Mask R-CNN at 5 fps)
  - <segmentation> can be gt (i.e. ground truth), mrcnn (i.e. Mask R-CNN) or posecnn (i.e. masks from the segmentation network of PoseCNN)
  Not all combinations of <alg>, <scenario> and <segmentation> are available. For example, ADD-S results for DenseFusion are available in their repository.
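The naming scheme of the evaluation scripts can be wrapped in a small helper. The following is a minimal sketch and is not part of the repository: the function names eval_script_name and run_eval are hypothetical, the scripts folder is passed explicitly because its location depends on where the repository was cloned, and launching each script from its own folder is an assumption.

```shell
# Hypothetical helper (not part of the repository): compose the name of an
# evaluation script from <alg>, <scenario> and <segmentation>.
eval_script_name() {
    local alg="$1" scenario="$2" segmentation="$3"
    printf 'eval_%s_%s_%s.sh\n' "$alg" "$scenario" "$segmentation"
}

# Hypothetical helper: run the script found in a metric folder (add-s, rmse
# or rmse_velocity) only if it exists, since not all combinations are provided.
# Arguments: scripts folder, metric folder, alg, scenario, segmentation.
run_eval() {
    local scripts_dir="$1" metric="$2" script
    script="$scripts_dir/$metric/$(eval_script_name "$3" "$4" "$5")"
    if [ ! -f "$script" ]; then
        echo "No script for this combination: $script" >&2
        return 1
    fi
    # Assumption: the script is meant to be launched from its own folder.
    ( cd "$(dirname "$script")" && bash "$(basename "$script")" )
}

# Example (assumes the repository was cloned in ~/robot-code/mask-ukf):
# run_eval ~/robot-code/mask-ukf/results/scripts rmse mask-ukf nrt gt
```

Checking for the file first turns an unavailable combination into an explicit message instead of a generic shell error.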
These instructions allow building the code implementing the MaskUKF algorithm and the ICP procedure used as baseline. Additionally, they allow testing the algorithms and producing the numerical results required to evaluate the ADD-S and RMSE metrics.
For ease of retrieval of configuration files and contexts used by the algorithms, in the following we assume that all the relevant code is built and installed with CMake using the option -DCMAKE_INSTALL_PREFIX=$ROBOT_INSTALL where $ROBOT_INSTALL is a folder of your choice. We further assume that an environment variable YARP_DATA_DIRS pointing to ${ROBOT_INSTALL}/share/ICUBcontrib exists and that the variable PATH is extended so as to point to ${ROBOT_INSTALL}/bin. E.g. your .bashrc should contain something like
export PATH=${PATH}:${ROBOT_INSTALL}/bin
export YARP_DATA_DIRS=${YARP_DATA_DIRS}:${ROBOT_INSTALL}/share/ICUBcontrib
If these instructions are not clear to you, feel free to open an issue.
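The environment assumptions above can be verified before building anything. This is a minimal sketch, not part of the repository: list_contains and check_robot_env are hypothetical names, and the check assumes that ROBOT_INSTALL itself is defined in the current shell.

```shell
# Hypothetical helper (not part of the repository): succeed if directory $2
# is an entry of the colon-separated list $1.
list_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: report whether PATH and YARP_DATA_DIRS contain the
# folders derived from ROBOT_INSTALL, as assumed in the text above.
check_robot_env() {
    local status=0
    if [ -z "${ROBOT_INSTALL:-}" ]; then
        echo "ROBOT_INSTALL is not set" >&2
        return 1
    fi
    if ! list_contains "${PATH:-}" "$ROBOT_INSTALL/bin"; then
        echo "PATH does not contain $ROBOT_INSTALL/bin" >&2
        status=1
    fi
    if ! list_contains "${YARP_DATA_DIRS:-}" "$ROBOT_INSTALL/share/ICUBcontrib"; then
        echo "YARP_DATA_DIRS does not contain $ROBOT_INSTALL/share/ICUBcontrib" >&2
        status=1
    fi
    return "$status"
}

# Example:
# check_robot_env && echo "environment looks consistent"
```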
- Build and install OR install precompiled versions of the libraries armadillo, Eigen, mlpack, OpenCV and PCL.
- Build and install YARP
      mkdir -p ~/robot-code
      cd ~/robot-code
      git clone https://github.com/robotology/yarp
      cd yarp
      git checkout v3.2.1
      mkdir build && cd build && cmake -DCMAKE_INSTALL_PREFIX=$ROBOT_INSTALL ../
      make install
- Install the ICUBcontrib metapackage
      mkdir -p ~/robot-code
      cd ~/robot-code
      git clone https://github.com/robotology/icub-contrib-common
      cd icub-contrib-common
      git checkout v1.13.0
      mkdir build && cd build && cmake -DCMAKE_INSTALL_PREFIX=$ROBOT_INSTALL ../
      make install
- Build and install the BayesFilters filtering library
      mkdir -p ~/robot-code
      cd ~/robot-code
      git clone https://github.com/robotology/bayes-filters-lib
      cd bayes-filters-lib
      git checkout 6af232e
      mkdir build && cd build && cmake -DCMAKE_INSTALL_PREFIX=$ROBOT_INSTALL ../
      make install
- Build and install the MaskUKF and baseline ICP implementations
      mkdir -p ~/robot-code
      cd ~/robot-code
      git clone https://github.com/robotology/mask-ukf
      cd mask-ukf
      mkdir build && cd build
      cmake -DCMAKE_INSTALL_PREFIX=$ROBOT_INSTALL -DBUILD_OBJECT_TRACKING=ON [-DUSE_OPENMP=ON] ../
      make install
  Building with OpenMP is optional.
- Download and extract the dataset for the non-real-time scenario (15.7 GB).
  The dataset consists of a restructured version of the YCB Video Dataset containing RGB images, png masks (ground truth masks, PoseCNN masks and Mask R-CNN masks) and 6D ground truth poses in accessible formats (no MATLAB .mat files involved). Extract the dataset as follows:
      wget https://zenodo.org/record/3466605/files/dataset_nrt.zip
      unzip dataset_nrt.zip -d ~/robot-code/mask-ukf/datasets
- Download and extract the dataset for the real-time scenario (46.7 GB).
  The dataset consists of a restructured version of the YCB Video Dataset containing RGB images, png Mask R-CNN masks and 6D ground truth poses in a YARP data player compatible format. The player allows simulating a real-time scenario with images at 30 fps and masks at 5 fps (maximum frequency declared by the authors of Mask R-CNN).
      wget https://zenodo.org/record/3465685/files/dataset_rt.zip
      unzip dataset_rt.zip -d ~/robot-code/mask-ukf/datasets
- Execute the algorithms on the YCB Video Dataset using the scripts provided in ~/robot-code/mask-ukf/testing/<scenario>, where <scenario> can be nrt for non-real-time or rt for real-time. At the moment, rt scripts cannot be used as the real-time dataset is in the process of being released. Scripts can be run on all the objects of the YCB Video Dataset testing set
      bash test_<alg>.sh <segmentation>
  or on a single object
      bash test_<alg>_single.sh <segmentation> <class_name>
  where <alg> can be mask-ukf or icp, <segmentation> can be gt (i.e. ground truth), mrcnn (i.e. Mask R-CNN) or posecnn (i.e. masks from the segmentation network of PoseCNN) and <class_name> is the class name (e.g. 002_master_chef_can). Scripts for the real-time scenario are available with <segmentation>=mrcnn only.
  During testing, a viewer based on the YARP library will be available in order to inspect the current estimate of the object (the viewer shows a contour representing the projection onto the camera plane of the 6D pose of the object).
  Results are saved in ~/robot-code/mask-ukf/results according to the structure explained in the Structure of the results data section. Each execution of the testing script removes any previously existing results. Evaluation of the ADD-S and RMSE metrics is described in point (4) of the Instructions for evaluation section.
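Since each execution of a testing script removes previously existing results, it can be useful to copy the results folder before re-running a test. The following is a minimal sketch, not part of the repository: backup_results is a hypothetical name and the backup location is an arbitrary choice. It copies the whole folder because the text does not state exactly which subfolders a testing script removes.

```shell
# Hypothetical helper (not part of the repository): copy a results folder to
# a timestamped folder under a backup root and print the path of the copy.
backup_results() {
    local results_dir="$1" backup_root="$2" target
    if [ ! -d "$results_dir" ]; then
        echo "Nothing to back up: $results_dir" >&2
        return 1
    fi
    target="$backup_root/results_$(date +%Y%m%d_%H%M%S)"
    mkdir -p "$backup_root" && cp -R "$results_dir" "$target" && echo "$target"
}

# Example (paths assume the layout used in these instructions; the backup
# folder name is arbitrary):
# backup_results ~/robot-code/mask-ukf/results ~/robot-code/mask-ukf-backups
# cd ~/robot-code/mask-ukf/testing/nrt && bash test_mask-ukf.sh gt
```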
The results data is organized in folders according to the following structure
    <alg>/<scenario>/<segmentation>/validation/<class_name>/<video_id>
where
- <alg> can be mask-ukf, icp or dense_fusion
- <scenario> can be nrt (i.e. masks available at each frame) or rt (i.e. masks from Mask R-CNN at 5 fps)
- <segmentation> can be gt (i.e. ground truth masks), mrcnn (i.e. masks from Mask R-CNN) or posecnn (i.e. masks from the segmentation network of PoseCNN)
- <class_name> is the name of one of the classes belonging to the testing set of the YCB Video Dataset
- <video_id> is the video id of one of the videos belonging to the testing set of the YCB Video Dataset
Please note that not all the combinations are available. For example, DenseFusion is available only in the nrt scenario.
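Because not all combinations are available, it can help to list those actually present on disk. This is a minimal sketch, not part of the repository: results_path and list_combinations are hypothetical names, and the only layout assumed is the folder structure given above.

```shell
# Hypothetical helper (not part of the repository): compose the folder of one
# result sequence.
# Arguments: results root, alg, scenario, segmentation, class name, video id.
results_path() {
    printf '%s/%s/%s/%s/validation/%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Hypothetical helper: print every <alg>/<scenario>/<segmentation> combination
# that has a validation folder under the given results root.
list_combinations() {
    local root="$1" dir
    for dir in "$root"/*/*/*/validation; do
        [ -d "$dir" ] || continue   # skip the unexpanded pattern if nothing matches
        dir="${dir#"$root"/}"       # drop the leading results root
        echo "${dir%/validation}"   # drop the trailing folder name
    done
}

# Example (the video id is illustrative):
# list_combinations ~/robot-code/mask-ukf/results
# results_path ~/robot-code/mask-ukf/results mask-ukf nrt gt 002_master_chef_can 0048
```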
Within each folder, two files are available:
- object-tracking_estimate.txt contains, for each frame, the Cartesian position, the axis angle representation of the orientation, the Cartesian velocity, the angular rates associated to the Euler 'ZYX' representation and the index of the frame
- object-tracking_ground_truth.txt contains, for each frame, starting from column no. 3, the Cartesian position and the axis angle representation of the orientation
For nrt data, the index of the frame corresponds to the same index of the corresponding frame in sequence <video_id> from the YCB Video Dataset. For rt data, instead, the ID of the frame corresponds to the number of frames processed from the beginning of the real-time experiment. Frames might be missing in DenseFusion sequences due to missing frames in the PoseCNN segmentation.
For dense_fusion and icp the velocities are not available and are substituted with zeros.
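As an illustration of how the estimate file could be read, the sketch below prints the frame index followed by the Cartesian position for each frame. It is not part of the repository and rests on assumptions the text does not confirm: that columns are whitespace separated, that they follow the order in which they are listed above (position in the first three columns) and that the frame index is the last column. The name frame_positions is hypothetical.

```shell
# Hypothetical helper (not part of the repository): print "index x y z" for
# each frame of an object-tracking_estimate.txt file.
# Assumptions: whitespace-separated columns, position in columns 1-3 and
# frame index in the last column.
frame_positions() {
    awk 'NF >= 4 { print $NF, $1, $2, $3 }' "$1"
}

# Example (fill in the folder of the sequence of interest):
# frame_positions <alg>/<scenario>/<segmentation>/validation/<class_name>/<video_id>/object-tracking_estimate.txt
```

If the actual column layout differs, only the field numbers in the awk program need to change.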