Learning to Find Good Correspondences

Kwang Moo Yi; Eduard Trulls; Yuki Ono; Vincent Lepetit; Mathieu Salzmann; Pascal Fua

Institute of Computer Graphics and Vision (7100)

Research output: Working paper › Preprint

Abstract

We develop a deep architecture to learn to find good correspondences for wide-baseline stereo. Given a set of putative sparse matches and the camera intrinsics, we train our network in an end-to-end fashion to label the correspondences as inliers or outliers, while simultaneously using them to recover the relative pose, as encoded by the essential matrix. Our architecture is based on a multi-layer perceptron operating on pixel coordinates rather than directly on the image, and is thus simple and small. We introduce a novel normalization technique, called Context Normalization, which allows us to process each data point separately while imbuing it with global information, and also makes the network invariant to the order of the correspondences. Our experiments on multiple challenging datasets demonstrate that our method is able to drastically improve the state of the art with little training data.
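The idea behind Context Normalization can be illustrated with a minimal sketch. The abstract does not give the formula, so the code below assumes the operation subtracts a per-channel mean and divides by a per-channel standard deviation, both computed over all correspondences of a single image pair. The function name `context_normalize`, the `eps` term, and the toy match coordinates are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math
from typing import List


def context_normalize(features: List[List[float]], eps: float = 1e-5) -> List[List[float]]:
    """Normalize each channel across the N correspondences of one image pair.

    features: N rows (one per correspondence), each holding C channels.
    Assumption: zero-mean, unit-variance normalization per channel.
    """
    n = len(features)
    c = len(features[0])
    # Per-channel statistics are the "global context" shared by all points.
    means = [sum(row[j] for row in features) / n for j in range(c)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in features) / n)
        for j in range(c)
    ]
    # Each point is then transformed on its own, using only those statistics.
    return [
        [(row[j] - means[j]) / (stds[j] + eps) for j in range(c)]
        for row in features
    ]


# Toy input: 4 putative matches as (x1, y1, x2, y2); the values are made up.
matches = [
    [0.10, 0.20, 0.12, 0.21],
    [0.50, 0.40, 0.55, 0.43],
    [0.90, 0.10, 0.20, 0.80],
    [0.30, 0.70, 0.33, 0.72],
]
normalized = context_normalize(matches)

# Shuffling the matches leaves the statistics unchanged, so each match
# receives the same normalized values regardless of its position.
order = [2, 0, 3, 1]
shuffled = context_normalize([matches[i] for i in order])
```

Because the mean and standard deviation are sums over the whole set, they do not depend on the order of the matches. In this sketch, reordering the input only reorders the output rows, which is consistent with the order invariance the abstract describes.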

Original language	English
Number of pages	13
Publication status	Published - 2018

Publication series

Name	arXiv.org e-Print archive
Publisher	Cornell University Library

Keywords

cs.CV

Access to Document

https://arxiv.org/pdf/1711.05971.pdf

Cite this

@techreport{e07bf7c4bf294215be954d5a759eac61,

title = "Learning to Find Good Correspondences",

abstract = "We develop a deep architecture to learn to find good correspondences for wide-baseline stereo. Given a set of putative sparse matches and the camera intrinsics, we train our network in an end-to-end fashion to label the correspondences as inliers or outliers, while simultaneously using them to recover the relative pose, as encoded by the essential matrix. Our architecture is based on a multi-layer perceptron operating on pixel coordinates rather than directly on the image, and is thus simple and small. We introduce a novel normalization technique, called Context Normalization, which allows us to process each data point separately while imbuing it with global information, and also makes the network invariant to the order of the correspondences. Our experiments on multiple challenging datasets demonstrate that our method is able to drastically improve the state of the art with little training data.",

keywords = "cs.CV",

author = "Yi, {Kwang Moo} and Eduard Trulls and Yuki Ono and Vincent Lepetit and Mathieu Salzmann and Pascal Fua",

note = "CVPR 2018 (Oral)",

year = "2018",

language = "English",

series = "arXiv.org e-Print archive",

publisher = "Cornell University Library",

type = "WorkingPaper",

institution = "Cornell University Library",

}

TY - UNPB

T1 - Learning to Find Good Correspondences

AU - Yi, Kwang Moo

AU - Trulls, Eduard

AU - Ono, Yuki

AU - Lepetit, Vincent

AU - Salzmann, Mathieu

AU - Fua, Pascal

N1 - CVPR 2018 (Oral)

PY - 2018

Y1 - 2018

N2 - We develop a deep architecture to learn to find good correspondences for wide-baseline stereo. Given a set of putative sparse matches and the camera intrinsics, we train our network in an end-to-end fashion to label the correspondences as inliers or outliers, while simultaneously using them to recover the relative pose, as encoded by the essential matrix. Our architecture is based on a multi-layer perceptron operating on pixel coordinates rather than directly on the image, and is thus simple and small. We introduce a novel normalization technique, called Context Normalization, which allows us to process each data point separately while imbuing it with global information, and also makes the network invariant to the order of the correspondences. Our experiments on multiple challenging datasets demonstrate that our method is able to drastically improve the state of the art with little training data.

KW - cs.CV

M3 - Preprint

T3 - arXiv.org e-Print archive

BT - Learning to Find Good Correspondences

ER -