End-to-End Training of Hybrid CNN-CRF Models for Semantic Segmentation using Structured Learning

Aleksander Colovic; Patrick Knöbelreiter; Alexander Shekhovtsov; Thomas Pock

Institute of Computer Graphics and Vision (7100)

Publication: Contribution to conference › Paper › Peer-reviewed

Abstract

In this work we tackle the problem of semantic image segmentation with a combination of convolutional neural networks (CNNs) and conditional random fields (CRFs). The CRF takes contrast-sensitive weights in a local neighborhood as input (pairwise interactions) to encourage consistency (smoothness) within the prediction and to align the segmentation boundaries with visual edges. We model the unary terms with a CNN, which outperforms non-data-driven models. We approximate CRF inference with a fixed number of iterations of an approach based on a linear-programming relaxation. We experiment with training the combined model end-to-end using a discriminative formulation (structured support vector machine) and applying stochastic subgradient descent to it.
Our proposed model achieves an intersection-over-union score of 62.4 on the test set of the Cityscapes pixel-level semantic labeling task, which is comparable to state-of-the-art models.
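
Two ingredients named in the abstract, contrast-sensitive pairwise weights and the intersection-over-union score, can be illustrated with a minimal Python sketch. The exponential weight form, the parameter `beta`, the 4-connected neighborhood, and the function names `contrast_weights` and `mean_iou` are assumptions made for illustration; the abstract does not specify them, and the paper's actual formulation may differ.

```python
import math


def contrast_weights(image, beta=1.0):
    """Edge weights w_pq = exp(-beta * |I_p - I_q|) on a 4-connected grid.

    `image` is a list of rows of grayscale intensities. The exponential form
    and `beta` are illustrative assumptions, not taken from the paper.
    Returns (horizontal_weights, vertical_weights).
    """
    height, width = len(image), len(image[0])
    # Weight between each pixel and its right-hand neighbor.
    horiz = [
        [math.exp(-beta * abs(image[y][x] - image[y][x + 1])) for x in range(width - 1)]
        for y in range(height)
    ]
    # Weight between each pixel and the neighbor directly below it.
    vert = [
        [math.exp(-beta * abs(image[y][x] - image[y + 1][x])) for x in range(width)]
        for y in range(height - 1)
    ]
    return horiz, vert


def mean_iou(pred, target, num_classes):
    """Mean over classes of TP / (TP + FP + FN) for one pair of label maps.

    Classes that appear in neither map are skipped.
    """
    ious = []
    for c in range(num_classes):
        tp = fp = fn = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                if p == c and t == c:
                    tp += 1
                elif p == c:
                    fp += 1
                elif t == c:
                    fn += 1
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    image = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
    horiz, vert = contrast_weights(image, beta=2.0)
    print(horiz[0])  # high weight on the flat region, low weight across the edge

    pred = [[0, 0, 1], [0, 1, 1]]
    target = [[0, 0, 1], [0, 0, 1]]
    print(mean_iou(pred, target, num_classes=2))
```

In this sketch a large intensity difference yields a small weight, so label changes are penalized less across visual edges than inside uniform regions. The score here is computed on a single pair of label maps; benchmark evaluations typically accumulate the counts over the whole test set before dividing.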

Original language	English
Publication status	Published - 6 Feb 2017
Event	Computer Vision Winter Workshop: CVWW 2017 - Retz, Austria. Duration: 6 Feb 2017 → 8 Feb 2017

Conference

Conference	Computer Vision Winter Workshop
Short title	CVWW 2017
Country/Territory	Austria
City	Retz
Period	6/02/17 → 8/02/17

Access to document

http://cvww2017.prip.tuwien.ac.at/papers/CVWW2017_paper_14.pdf

Cite this

@conference{07d3f414446549bea9e20f2050200b14,

title = "End-to-End Training of Hybrid CNN-CRF Models for Semantic Segmentation using Structured Learning",

abstract = "In this work we tackle the problem of semantic image segmentation with a combination of convolutional neural networks (CNNs) and conditional random fields (CRFs). The CRF takes contrast-sensitive weights in a local neighborhood as input (pairwise interactions) to encourage consistency (smoothness) within the prediction and to align the segmentation boundaries with visual edges. We model the unary terms with a CNN, which outperforms non-data-driven models. We approximate CRF inference with a fixed number of iterations of an approach based on a linear-programming relaxation. We experiment with training the combined model end-to-end using a discriminative formulation (structured support vector machine) and applying stochastic subgradient descent to it. Our proposed model achieves an intersection-over-union score of 62.4 on the test set of the Cityscapes pixel-level semantic labeling task, which is comparable to state-of-the-art models.",

author = "Aleksander Colovic and Patrick Kn{\"o}belreiter and Alexander Shekhovtsov and Thomas Pock",

year = "2017",

month = feb,

day = "6",

language = "English",

note = "Computer Vision Winter Workshop : CVWW 2017, CVWW 2017 ; Conference date: 06-02-2017 Through 08-02-2017",

}

TY - CONF

T1 - End-to-End Training of Hybrid CNN-CRF Models for Semantic Segmentation using Structured Learning

AU - Colovic, Aleksander

AU - Knöbelreiter, Patrick

AU - Shekhovtsov, Alexander

AU - Pock, Thomas

PY - 2017/2/6

Y1 - 2017/2/6

AB - In this work we tackle the problem of semantic image segmentation with a combination of convolutional neural networks (CNNs) and conditional random fields (CRFs). The CRF takes contrast-sensitive weights in a local neighborhood as input (pairwise interactions) to encourage consistency (smoothness) within the prediction and to align the segmentation boundaries with visual edges. We model the unary terms with a CNN, which outperforms non-data-driven models. We approximate CRF inference with a fixed number of iterations of an approach based on a linear-programming relaxation. We experiment with training the combined model end-to-end using a discriminative formulation (structured support vector machine) and applying stochastic subgradient descent to it. Our proposed model achieves an intersection-over-union score of 62.4 on the test set of the Cityscapes pixel-level semantic labeling task, which is comparable to state-of-the-art models.

M3 - Paper

T2 - Computer Vision Winter Workshop

Y2 - 6 February 2017 through 8 February 2017

ER -
