Making Deep Heatmaps Robust to Partial Occlusions for 3D Object Pose Estimation

Markus Oberweger, Mahdi Rad, Vincent Lepetit

Publication: Contribution to journal › Article › Research › Peer-reviewed

Abstract

We introduce a novel method for robust and accurate 3D object pose estimation from a single color image under large occlusions. Following recent approaches, we first predict the 2D projections of 3D points related to the target object and then compute the 3D pose from these correspondences using a geometric method. Unfortunately, as the results of our experiments show, predicting these 2D projections using a regular CNN or a Convolutional Pose Machine is highly sensitive to partial occlusions, even when these methods are trained with partially occluded examples. Our solution is to predict heatmaps from multiple small patches independently and to accumulate the results to obtain accurate and robust predictions. Training subsequently becomes challenging because patches with similar appearances but different positions on the object correspond to different heatmaps. However, we provide a simple yet effective solution to deal with such ambiguities. We show that our approach outperforms existing methods on two challenging datasets: The Occluded LineMOD dataset and the YCB-Video dataset, both exhibiting cluttered scenes with highly occluded objects. Project website: https://www.tugraz.at/institute/icg/research/team-lepetit/research-projects/robust-object-pose-estimation/
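
The accumulation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(top, left, heatmap)` patch layout, and the choice of averaging overlapping predictions are all assumptions made for illustration. The idea shown is only that each patch contributes its own heatmap, so a few unreliable (for example, occluded) patches are outvoted once the per-patch results are combined.

```python
from typing import List, Tuple

Heatmap = List[List[float]]
PatchPrediction = Tuple[int, int, Heatmap]  # (top, left, heatmap); hypothetical layout


def accumulate_heatmaps(patch_maps: List[PatchPrediction],
                        image_size: Tuple[int, int]) -> Heatmap:
    """Average per-patch heatmaps into one image-sized heatmap (illustrative)."""
    height, width = image_size
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for top, left, heatmap in patch_maps:
        for r, row in enumerate(heatmap):
            for c, value in enumerate(row):
                y, x = top + r, left + c
                # Ignore patch pixels that fall outside the image.
                if 0 <= y < height and 0 <= x < width:
                    total[y][x] += value
                    count[y][x] += 1
    # Pixels covered by no patch stay at 0.0.
    return [[total[y][x] / count[y][x] if count[y][x] else 0.0
             for x in range(width)]
            for y in range(height)]


def argmax_2d(heatmap: Heatmap) -> Tuple[int, int]:
    """Return the (row, column) of the largest value in the heatmap."""
    best = (0, 0)
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > heatmap[best[0]][best[1]]:
                best = (y, x)
    return best


# Two overlapping 2x2 patch predictions on a 3x3 image (made-up values).
patches = [
    (0, 0, [[0.1, 0.2], [0.3, 0.9]]),
    (1, 1, [[0.7, 0.1], [0.0, 0.1]]),
]
accumulated = accumulate_heatmaps(patches, (3, 3))
peak = argmax_2d(accumulated)
```

In a full pipeline, the peak of each accumulated heatmap would serve as one 2D projection, and the resulting 2D-3D correspondences would be passed to a geometric pose solver, as the abstract outlines.
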
Original language: English
Number of pages: 26
Journal: arXiv.org e-Print archive
Publication status: Published - 11 Apr 2018
Event: 2018 European Conference on Computer Vision - Munich, Germany
Duration: 8 Sep 2018 to 14 Sep 2018

Keywords

    Cite this

    Making Deep Heatmaps Robust to Partial Occlusions for 3D Object Pose Estimation. / Oberweger, Markus; Rad, Mahdi; Lepetit, Vincent.

    In: arXiv.org e-Print archive, 11 Apr 2018.

    @article{bcd24f7bd03b49bd8f9b87a0dcf6b176,
    title = "Making Deep Heatmaps Robust to Partial Occlusions for 3D Object Pose Estimation",
    abstract = "We introduce a novel method for robust and accurate 3D object pose estimation from a single color image under large occlusions. Following recent approaches, we first predict the 2D projections of 3D points related to the target object and then compute the 3D pose from these correspondences using a geometric method. Unfortunately, as the results of our experiments show, predicting these 2D projections using a regular CNN or a Convolutional Pose Machine is highly sensitive to partial occlusions, even when these methods are trained with partially occluded examples. Our solution is to predict heatmaps from multiple small patches independently and to accumulate the results to obtain accurate and robust predictions. Training subsequently becomes challenging because patches with similar appearances but different positions on the object correspond to different heatmaps. However, we provide a simple yet effective solution to deal with such ambiguities. We show that our approach outperforms existing methods on two challenging datasets: The Occluded LineMOD dataset and the YCB-Video dataset, both exhibiting cluttered scenes with highly occluded objects. Project website: https://www.tugraz.at/institute/icg/research/team-lepetit/research-projects/robust-object-pose-estimation/",
    keywords = "cs.CV",
    author = "Markus Oberweger and Mahdi Rad and Vincent Lepetit",
    year = "2018",
    month = "4",
    day = "11",
    language = "English",
    journal = "arXiv.org e-Print archive",
    publisher = "Cornell University Library",

    }

    TY - JOUR

    T1 - Making Deep Heatmaps Robust to Partial Occlusions for 3D Object Pose Estimation

    AU - Oberweger, Markus

    AU - Rad, Mahdi

    AU - Lepetit, Vincent

    PY - 2018/4/11

    Y1 - 2018/4/11

    AB - We introduce a novel method for robust and accurate 3D object pose estimation from a single color image under large occlusions. Following recent approaches, we first predict the 2D projections of 3D points related to the target object and then compute the 3D pose from these correspondences using a geometric method. Unfortunately, as the results of our experiments show, predicting these 2D projections using a regular CNN or a Convolutional Pose Machine is highly sensitive to partial occlusions, even when these methods are trained with partially occluded examples. Our solution is to predict heatmaps from multiple small patches independently and to accumulate the results to obtain accurate and robust predictions. Training subsequently becomes challenging because patches with similar appearances but different positions on the object correspond to different heatmaps. However, we provide a simple yet effective solution to deal with such ambiguities. We show that our approach outperforms existing methods on two challenging datasets: The Occluded LineMOD dataset and the YCB-Video dataset, both exhibiting cluttered scenes with highly occluded objects. Project website: https://www.tugraz.at/institute/icg/research/team-lepetit/research-projects/robust-object-pose-estimation/

    KW - cs.CV

    M3 - Article

    JO - arXiv.org e-Print archive

    JF - arXiv.org e-Print archive

    ER -