Downloading and preparing the data

All data used (datasets, models, results, ...) are stored in the $HAPPYPOSE_DATA_DIR directory that you created when following the README. We provide utilities for downloading the required data and models. All of the files can also be downloaded manually.
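
For example, a minimal setup looks like this (the path below is just an example; any directory with enough disk space works):

export HAPPYPOSE_DATA_DIR=$HOME/happypose_data  # example location, adjust as needed
mkdir -p $HAPPYPOSE_DATA_DIR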

BOP Datasets

For both T-LESS and YCB-Video, we use the datasets in the BOP format. If you already have them on your disk, place them in $HAPPYPOSE_DATA_DIR/bop_datasets. Alternatively, you can download them using:

python -m happypose.toolbox.utils.download --bop_dataset ycbv tless
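
After the download, you should end up with one subdirectory per dataset. A quick sanity check (the exact contents depend on the BOP release, so treat this as a sketch):

ls $HAPPYPOSE_DATA_DIR/bop_datasets
# expected: tless/ ycbv/
ls $HAPPYPOSE_DATA_DIR/bop_datasets/ycbv
# typically contains models/, test/, ... plus dataset metadata files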

Additional files that contain information about the datasets are needed to ensure a fair comparison with prior works on both datasets. Download them using:

python -m happypose.toolbox.utils.download --bop_extra_files ycbv tless

We use pybullet for rendering images, which requires object models to be provided in the URDF format. We provide converted URDF files, which can be downloaded using:

python -m happypose.toolbox.utils.download --urdf_models ycbv tless.cad
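
As a quick check, the converted models should appear under a urdfs subdirectory (this location is our assumption based on the default CosyPose data layout):

ls $HAPPYPOSE_DATA_DIR/urdfs
# expected: tless.cad/ ycbv/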

In the BOP format, the YCB objects 002_master_chef_can and 040_large_marker are considered symmetric, but they are not considered symmetric by previous works such as PoseCNN, PVNet and DeepIM. To ensure a fair comparison (using ADD instead of ADD-S as the ADD-(S) metric for these objects; both metrics are recalled below), these objects must not be considered symmetric in the evaluation. To keep the model format uniform, we generate a set of YCB object models, models_bop-compat_eval, that can be used to fairly compare our approach against previous works. You can download them directly:

python -m happypose.toolbox.utils.download --ycbv_compat_models
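
For context, ADD and ADD-S differ only in how point correspondences are formed, which is why the symmetry flag changes the score. With ground-truth pose $(R, t)$, estimated pose $(\hat{R}, \hat{t})$ and object model points $\mathcal{M}$, the standard definitions are:

$$\mathrm{ADD} = \frac{1}{|\mathcal{M}|} \sum_{x \in \mathcal{M}} \lVert (Rx + t) - (\hat{R}x + \hat{t}) \rVert$$

$$\text{ADD-S} = \frac{1}{|\mathcal{M}|} \sum_{x_1 \in \mathcal{M}} \min_{x_2 \in \mathcal{M}} \lVert (Rx_1 + t) - (\hat{R}x_2 + \hat{t}) \rVert$$

Since the min over closest points can only decrease the distance, ADD-S always yields a score at least as good as ADD, hence the need for the compatibility models above.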

Notes:

  • The URDF files were obtained using these commands (requires meshlab to be installed):

    python -m happypose.pose_estimators.cosypose.cosypose.scripts.convert_bop_ds_to_urdf --ds_name=ycbv
    python -m happypose.pose_estimators.cosypose.cosypose.scripts.convert_bop_ds_to_urdf --ds_name=tless.cad
    
  • Compatibility models were obtained using the following script:

    python -m happypose.pose_estimators.cosypose.cosypose.scripts.make_ycbv_compat_models
    

Models for the minimal version

# hope
python -m happypose.toolbox.utils.download --cosypose_models \
          detector-bop-hope-pbr--15246 \
          coarse-bop-hope-pbr--225203 \
          refiner-bop-hope-pbr--955392

# ycbv
python -m happypose.toolbox.utils.download --cosypose_models \
          detector-bop-ycbv-pbr--970850 \
          coarse-bop-ycbv-pbr--724183 \
          refiner-bop-ycbv-pbr--604090

# tless
python -m happypose.toolbox.utils.download --cosypose_models \
          detector-bop-tless-pbr--873074 \
          coarse-bop-tless-pbr--506801 \
          refiner-bop-tless-pbr--233420
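
Each argument to --cosypose_models is a training run identifier. The downloads should land in one directory per run; we assume the default CosyPose layout here, where trained models live under an experiments subdirectory:

ls $HAPPYPOSE_DATA_DIR/experiments
# expected: one directory per downloaded model, e.g. detector-bop-ycbv-pbr--970850/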

Pre-trained models for single-view estimator

The pre-trained models of the single-view pose estimator can be downloaded using:

# YCB-V Single-view refiner
python -m happypose.toolbox.utils.download --cosypose_models ycbv-refiner-finetune--251020

# YCB-V Single-view refiner trained on synthetic data only
# Only download this if you are interested in retraining the above model
python -m happypose.toolbox.utils.download --cosypose_models ycbv-refiner-syntonly--596719

# T-LESS coarse and refiner models
python -m happypose.toolbox.utils.download --cosypose_models tless-coarse--10219 tless-refiner--585928

2D detections

To ensure a fair comparison with prior works on both datasets, we use the same detections as DeepIM (obtained from PoseCNN) on YCB-Video and the same detections as Pix2pose (obtained from a RetinaNet model) on T-LESS. Download the saved 2D detections for both datasets using:

python -m happypose.toolbox.utils.download --detections ycbv_posecnn

# SiSo detections: 1 detection with the highest score per class per image, on all images
# Available for each image of the T-LESS dataset (primesense sensor)
# These are the same detections as used in Pix2pose's experiments
python -m happypose.toolbox.utils.download --detections tless_pix2pose_retinanet_siso_top1

# ViVo detections: All detections for a subset of 1000 images of T-LESS.
# Used in our multi-view experiments.
python -m happypose.toolbox.utils.download --detections tless_pix2pose_retinanet_vivo_all
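
As a quick check, the downloaded detections should appear on disk (the saved_detections location is our assumption based on the default CosyPose data layout):

ls $HAPPYPOSE_DATA_DIR/saved_detections
# expected: one file per downloaded entry (ycbv_posecnn, tless_pix2pose_retinanet_siso_top1, ...)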

If you are interested in re-training a detector, please see the BOP 2020 section.

Notes:

  • The PoseCNN detections (and coarse pose estimates) on YCB-Video were extracted and converted from these PoseCNN results.
  • The Pix2pose detections were extracted using Pix2pose's code. We used the detection model from their paper, see here. For the ViVo detections, their code was slightly modified. The code used to extract the detections can be found here.