LabelMaker 🎨

Automatic Semantic Label Generation from RGB-D Trajectories

3DV 2024

¹ETH Zürich, ²ETH AI Center, ³Google, ⁴Microsoft


LabelMaker bundles a collection of state-of-the-art segmentation models, each predicting a different set of classes, in a neural field. It can refine existing annotations and produce highly accurate 2D and 3D labels on ScanNet (top-right). At the same time, it opens new possibilities to rapidly label large-scale datasets such as ARKitScenes without human effort (bottom-right).
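To make the idea of combining models with different class sets concrete, here is a minimal sketch, assuming each model's native class ids are first remapped to a shared label space and the per-pixel result is then chosen by majority vote. The class list, the mapping dictionaries, the `min_votes` threshold, and the function names `to_common` and `consensus` are all hypothetical and chosen for illustration; this is not the LabelMaker implementation.

```python
import numpy as np

# Hypothetical common label space; id 0 is reserved for "unknown".
COMMON_CLASSES = ["unknown", "wall", "floor", "chair", "table"]
UNKNOWN = 0


def to_common(pred, mapping):
    """Remap a model's native class ids to common ids with a lookup table.

    Native ids that are missing from `mapping` become UNKNOWN.
    """
    pred = np.asarray(pred)
    size = max(max(mapping), int(pred.max())) + 1
    lut = np.full(size, UNKNOWN, dtype=np.int64)
    for native_id, common_id in mapping.items():
        lut[native_id] = common_id
    return lut[pred]


def consensus(preds_common, num_classes, min_votes=2):
    """Per-pixel majority vote over predictions in the common label space.

    Pixels whose winning class has fewer than `min_votes` votes stay UNKNOWN.
    """
    stacked = np.stack(preds_common)  # (num_models, H, W)
    # Count, for every class, how many models predicted it at each pixel.
    votes = np.stack([(stacked == c).sum(axis=0) for c in range(num_classes)])
    votes[UNKNOWN] = 0  # "unknown" never wins a vote
    winner = votes.argmax(axis=0)
    winner[votes.max(axis=0) < min_votes] = UNKNOWN
    return winner
```

For example, three toy models that use different native ids for wall, floor, chair, and table can be remapped with `to_common` and passed as a list to `consensus(..., num_classes=len(COMMON_CLASSES))`. Leaving low-agreement pixels as unknown is one possible design choice: it trades label coverage for precision.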

Abstract

Semantic annotations are indispensable for training and evaluating perception models, yet they are very costly to acquire. This work introduces a fully automated 2D/3D labeling framework that, without any human intervention, can generate labels for RGB-D scans at a level of accuracy equal to (or better than) that of comparable manually annotated datasets such as ScanNet. Our approach is based on an ensemble of state-of-the-art segmentation models and 3D lifting through neural rendering. We demonstrate the effectiveness of our LabelMaker pipeline by generating significantly better labels for the ScanNet dataset and by automatically labeling the previously unlabeled ARKitScenes dataset.
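The 3D lifting step aggregates per-frame 2D labels into one consistent 3D labeling. The sketch below illustrates that multi-view aggregation idea with a deliberately simpler stand-in: instead of neural rendering, it projects 3D points into each labeled frame with a pinhole camera and takes a majority vote. The function name `lift_labels`, its parameters, and the absence of any occlusion check are assumptions made for illustration only.

```python
import numpy as np


def lift_labels(points, frames, K, num_classes):
    """Vote a label for each 3D point from all frames it projects into.

    points: (N, 3) world coordinates.
    frames: iterable of (world_to_cam, labels) pairs, where world_to_cam is a
            4x4 matrix and labels is an (H, W) integer label image.
    K:      3x3 pinhole intrinsics.
    Returns an (N,) integer array; -1 marks points never observed.
    Note: no occlusion test is performed (a simplification of this sketch).
    """
    points = np.asarray(points, dtype=float)
    K = np.asarray(K, dtype=float)
    votes = np.zeros((len(points), num_classes), dtype=np.int64)
    homog = np.hstack([points, np.ones((len(points), 1))])  # (N, 4)

    for world_to_cam, labels in frames:
        labels = np.asarray(labels)
        cam = (np.asarray(world_to_cam, dtype=float) @ homog.T).T[:, :3]
        idx = np.flatnonzero(cam[:, 2] > 1e-6)  # keep points in front of the camera
        uvw = (K @ cam[idx].T).T
        u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)  # pixel column
        v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)  # pixel row
        h, w = labels.shape
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        # Add one vote per visible point for the label at its pixel.
        np.add.at(votes, (idx[inside], labels[v[inside], u[inside]]), 1)

    result = votes.argmax(axis=1)
    result[votes.sum(axis=1) == 0] = -1
    return result
```

Voting across views lets a few inconsistent per-frame predictions be outvoted by the majority, which is the intuition behind lifting 2D predictions into 3D before reading labels back out.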

Video

Qualitative results on ARKitScenes

LabelMaker can automatically create highly accurate 2D and 3D semantic segmentations. Below we show 3D semantic segmentation on ARKitScenes reconstructions.
Ashcan, Backpack, Bag, Bannister, Basket, Bathtub, Beanbag, Bench, Blanket, Book, Bookshelf, Bottle, Box, Bucket, Cabinet, Ceiling, Chair, Chest of drawers, Clock, Column, Curtain, Cushion, Decoration, Display, Door, Doorframe, Floor, Footstool, Furniture, Lamp, Laptop, Light, Microwave, Oven, Paper, Paper cutter, Picture, Plant, Printer, Radiator, Sack, Stairway, Table, Teddy, Tray, Wall, Window
Apparel, Ashcan, Bannister, Basket, Bed, Blanket, Book, Bookshelf, Bottle, Bowl, Box, Cabinet, Ceiling, Chair, Chest of drawers, Clock, Coffee table, Curtain, Cushion, Desk, Display, Door, Doorframe, Floor, Gat, Lamp, Light, Mat, Mirror, Picture, Radiator, Shoe, Shower stall, Soap dispenser, Stand, Stool, Swivel chair, Table, Teddy, Telephone, Towel, Vent, Wall, Window
Ashcan, Backpack, Bed, Blanket, Book, Bookshelf, Bottle, Box, Cabinet, Ceiling, Chair, Chest of drawers, Clock, Computer, Computer keyboard, Countertop, Crate, Curtain, Cushion, Decoration, Desk, Display, Door, Doorframe, Fan, Floor, Footstool, Kettle, Lamp, Mat, Picture, Plant, Printer, Radiator, Swivel chair, Table, Teddy, Telephone, Towel, Tray, Wall, Wardrobe, Window

BibTeX


@inproceedings{Weder2024labelmaker,
  title = {{LabelMaker: Automatic Semantic Label Generation from RGB-D Trajectories}},
  author = {Weder, Silvan and Blum, Hermann and Engelmann, Francis and Pollefeys, Marc},
  booktitle = {International Conference on 3D Vision (3DV)},
  year = {2024}
}