Temporal Coherence for Active Learning in Videos

Javad Zolfaghari Bengar, Abel Gonzales-Garcia, Gabriel Villalonga, Bogdan Raducanu, Hamed Habibi Aghdam, Mikhail Mozerov, Antonio M. Lopez, Joost van de Weijer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Abstract

Autonomous driving systems require huge amounts of data to train. Manual annotation of this data is time-consuming and prohibitively expensive since it relies on human annotators. Active learning has therefore emerged as an alternative to ease this effort and make data annotation more manageable. In this paper, we introduce a novel active learning approach for object detection in videos that exploits temporal coherence. Our active learning criterion is based on the estimated number of errors, in terms of false positives and false negatives. The detections obtained by the object detector define the nodes of a graph and are tracked forward and backward to temporally link the nodes. Minimizing an energy function defined on this graphical model yields estimates of both false positives and false negatives. Additionally, we introduce a synthetic video dataset, called SYNTHIA-AL, specially designed to evaluate active learning for video object detection in road scenes. Finally, we show that our approach outperforms active learning baselines on two datasets.
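The page reproduces only the abstract, so the paper's exact graph construction and energy function are not given here. As a rough illustration of the idea the abstract describes (detections as graph nodes, temporal links from tracking, and an energy whose minimum flags likely detector errors), the following Python sketch labels each detection as a kept true positive or a suspected false positive, using a simple unary term based on detector confidence plus a pairwise term that rewards agreement along temporal links, minimized with iterated conditional modes. Every name here (Node, energy, minimize_icm, the lam weight) is a hypothetical stand-in rather than the authors' formulation; false-negative estimation, e.g. from tracked boxes with no matching detection, is omitted for brevity.

# A minimal sketch of graph-based error estimation over video detections.
# All modelling choices below are assumptions for illustration, not the
# formulation from the paper.
from dataclasses import dataclass, field

@dataclass
class Node:
    frame: int                                     # frame index of the detection
    score: float                                   # detector confidence in [0, 1]
    label: int = 1                                 # 1 = true positive, 0 = suspected false positive
    neighbors: list = field(default_factory=list)  # indices of temporally linked nodes

def energy(nodes, lam=0.5):
    # Unary term: keeping a low-confidence detection costs (1 - score),
    # discarding a high-confidence one costs score. Pairwise term: temporally
    # linked detections pay lam whenever their labels disagree.
    e = 0.0
    for i, n in enumerate(nodes):
        e += n.label * (1.0 - n.score) + (1 - n.label) * n.score
        for j in n.neighbors:
            if j > i:                              # count each undirected edge once
                e += lam * (n.label != nodes[j].label)
    return e

def minimize_icm(nodes, lam=0.5, max_iters=10):
    # Iterated conditional modes: keep a label flip only if it lowers the energy.
    for _ in range(max_iters):
        changed = False
        for n in nodes:
            before = energy(nodes, lam)
            n.label = 1 - n.label
            if energy(nodes, lam) < before:
                changed = True
            else:
                n.label = 1 - n.label              # revert the flip
        if not changed:
            break
    return nodes

# Toy example: a confident three-frame track plus one isolated low-confidence
# detection, which the minimization flags as the lone suspected false positive.
nodes = [Node(0, 0.9), Node(1, 0.85), Node(2, 0.9), Node(1, 0.2)]
nodes[0].neighbors = [1]
nodes[1].neighbors = [0, 2]
nodes[2].neighbors = [1]
minimize_icm(nodes)
print("estimated false positives:", sum(1 - n.label for n in nodes))  # prints 1

In a sketch like this, the estimated error count per frame or video (here, the nodes relabeled 0) would feed the active learning criterion: the samples with the most estimated errors are the ones worth sending to a human annotator.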
Original language: English
Title of host publication: CVRSUAD 2019
Number of pages: 10
Publication status: Accepted/In press - 2019
Externally published: Yes
Event: CVRSUAD 2019: 7th Workshop on Computer Vision for Road Scene Understanding & Autonomous Driving - Seoul, Korea, Republic of
Duration: 27 Oct 2019 → …

Conference

Conference: CVRSUAD 2019
Country: Korea, Republic of
City: Seoul
Period: 27/10/19 → …

Cite this

Zolfaghari Bengar, J., Gonzales-Garcia, A., Villalonga, G., Raducanu, B., Habibi Aghdam, H., Mozerov, M., ... van de Weijer, J. (Accepted/In press). Temporal Coherence for Active Learning in Videos. In CVRSUAD 2019

@inproceedings{982796ac990e417faad363f401cad17c,
title = "Temporal Coherence for Active Learning in Videos",
abstract = "Autonomous driving systems require huge amounts of data to train. Manual annotation of this data is time-consuming and prohibitively expensive since it involves human resources. Therefore, active learning emerged as an alternative to ease this effort and to make data annotation more manageable. In this paper, we introduce a novel active learning approach for object detection in videos by exploiting temporal coherence. Our active learning criterion is based on the estimated number of errors in terms of false positives and false negatives. The detections obtained by the object detector are used to define the nodes of a graph and tracked forward and backward to temporally link the nodes. Minimizing an energy function defined on this graphical model provides estimates of both false positives and false negatives. Additionally, we introduce a synthetic video dataset, called SYNTHIA-AL, specially designed to evaluate active learning for video object detection in road scenes. Finally, we show that our approach outperforms active learning baselines tested on two datasets.",
author = "{Zolfaghari Bengar}, Javad and Abel Gonzales-Garcia and Gabriel Villalonga and Bogdan Raducanu and {Habibi Aghdam}, Hamed and Mikhail Mozerov and {M. Lopez}, Antonio and {van de Weijer}, Joost",
year = "2019",
language = "English",
booktitle = "CVRSUAD 2019",

}
