End-to-End Training of Hybrid CNN-CRF Models for Stereo

Patrick Knöbelreiter, Christian Reinbacher, Alexander Shekhovtsov, Thomas Pock

Research output: Contribution to journal > Article > Research > peer-review

Abstract

We propose a novel method for stereo estimation, combining advantages of convolutional neural networks (CNNs) and optimization-based approaches. The optimization, posed as a conditional random field (CRF), takes local matching costs and consistency-enforcing (smoothness) costs as inputs, both estimated by CNN blocks. To perform the inference in the CRF we use an approach based on linear programming relaxation with a fixed number of iterations. We address the challenging problem of training this hybrid model end-to-end. We show that in the discriminative formulation (structured support vector machine) the training is practically feasible. The trained hybrid model with shallow CNNs is comparable to state-of-the-art deep models in both time and performance. The optimization part efficiently replaces sophisticated and not jointly trainable (but commonly applied) post-processing steps by a trainable, well-understood model.
Language: English
Journal: arXiv.org e-Print archive
Status: Published - 30 Nov 2016

Keywords

  • cs.CV

Cite this

End-to-End Training of Hybrid CNN-CRF Models for Stereo. / Knöbelreiter, Patrick; Reinbacher, Christian; Shekhovtsov, Alexander; Pock, Thomas.

In: arXiv.org e-Print archive, 30.11.2016.

@article{9031cfdd54654eea9d0ab5b18ade286c,
title = "End-to-End Training of Hybrid CNN-CRF Models for Stereo",
abstract = "We propose a novel method for stereo estimation, combining advantages of convolutional neural networks (CNNs) and optimization-based approaches. The optimization, posed as a conditional random field (CRF), takes local matching costs and consistency-enforcing (smoothness) costs as inputs, both estimated by CNN blocks. To perform the inference in the CRF we use an approach based on linear programming relaxation with a fixed number of iterations. We address the challenging problem of training this hybrid model end-to-end. We show that in the discriminative formulation (structured support vector machine) the training is practically feasible. The trained hybrid model with shallow CNNs is comparable to state-of-the-art deep models in both time and performance. The optimization part efficiently replaces sophisticated and not jointly trainable (but commonly applied) post-processing steps by a trainable, well-understood model.",
keywords = "cs.CV",
author = "Patrick Kn{\"o}belreiter and Christian Reinbacher and Alexander Shekhovtsov and Thomas Pock",
year = "2016",
month = "11",
day = "30",
language = "English",
journal = "arXiv.org e-Print archive",
publisher = "Cornell University Library"
}

TY  - JOUR
T1  - End-to-End Training of Hybrid CNN-CRF Models for Stereo
AU  - Knöbelreiter, Patrick
AU  - Reinbacher, Christian
AU  - Shekhovtsov, Alexander
AU  - Pock, Thomas
PY  - 2016/11/30
Y1  - 2016/11/30
N2  - We propose a novel method for stereo estimation, combining advantages of convolutional neural networks (CNNs) and optimization-based approaches. The optimization, posed as a conditional random field (CRF), takes local matching costs and consistency-enforcing (smoothness) costs as inputs, both estimated by CNN blocks. To perform the inference in the CRF we use an approach based on linear programming relaxation with a fixed number of iterations. We address the challenging problem of training this hybrid model end-to-end. We show that in the discriminative formulation (structured support vector machine) the training is practically feasible. The trained hybrid model with shallow CNNs is comparable to state-of-the-art deep models in both time and performance. The optimization part efficiently replaces sophisticated and not jointly trainable (but commonly applied) post-processing steps by a trainable, well-understood model.
KW  - cs.CV
M3  - Article
JO  - arXiv.org e-Print archive
T2  - arXiv.org e-Print archive
JF  - arXiv.org e-Print archive
ER  -