CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video

Wei Lin; Anna Kukleva; Kunyang Sun; Horst Possegger; Hilde Kuehne; Horst Bischof

doi:10.1007/978-3-031-20062-5_40

Wei Lin^*, Anna Kukleva, Kunyang Sun, Horst Possegger, Hilde Kuehne, Horst Bischof

^*Corresponding author for this work

Institute of Computer Graphics and Vision (7100)

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

Although action recognition has achieved impressive results over recent years, both collection and annotation of video training data are still time-consuming and cost intensive. Therefore, image-to-video adaptation has been proposed to exploit labeling-free web image source for adapting on unlabeled target videos. This poses two major challenges: (1) spatial domain shift between web images and video frames; (2) modality gap between image and video data. To address these challenges, we propose Cycle Domain Adaptation (CycDA), a cycle-based approach for unsupervised image-to-video domain adaptation by leveraging the joint spatial information in images and videos on the one hand and, on the other hand, training an independent spatio-temporal model to bridge the modality gap. We alternate between the spatial and spatio-temporal learning with knowledge transfer between the two in each cycle. We evaluate our approach on benchmark datasets for image-to-video as well as for mixed-source domain adaptation achieving state-of-the-art results and demonstrating the benefits of our cyclic adaptation.

Original language	English
Title of host publication	Computer Vision – ECCV 2022
Subtitle of host publication	17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part III
Place of Publication	Cham
Publisher	Springer
Pages	698-715
Number of pages	17
ISBN (Electronic)	978-3-031-20062-5
ISBN (Print)	978-3-031-20061-8
DOIs	https://doi.org/10.1007/978-3-031-20062-5_40
Publication status	Published - 2022
Event	2022 European Conference on Computer Vision: ECCV 2022 - Hybrid Event, Tel Aviv, Israel; Duration: 23 Oct 2022 → 27 Oct 2022

Publication series

Name	Lecture Notes in Computer Science
Volume	13663

Conference

Conference	2022 European Conference on Computer Vision
Abbreviated title	ECCV 2022
Country/Territory	Israel
City	Hybrid Event, Tel Aviv
Period	23/10/22 → 27/10/22

Access to Document

10.1007/978-3-031-20062-5_40

https://arxiv.org/abs/2203.16244 (Licence: Other)

Cite this

Lin, W., Kukleva, A., Sun, K., Possegger, H., Kuehne, H., & Bischof, H. (2022). CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video. In Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part III (pp. 698-715). (Lecture Notes in Computer Science; Vol. 13663). Springer. https://doi.org/10.1007/978-3-031-20062-5_40

@inproceedings{d3c89c9c9e8143f2897ec4ea42ddaa0c,
  title = "CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video",
  abstract = "Although action recognition has achieved impressive results over recent years, both collection and annotation of video training data are still time-consuming and cost intensive. Therefore, image-to-video adaptation has been proposed to exploit labeling-free web image source for adapting on unlabeled target videos. This poses two major challenges: (1) spatial domain shift between web images and video frames; (2) modality gap between image and video data. To address these challenges, we propose Cycle Domain Adaptation (CycDA), a cycle-based approach for unsupervised image-to-video domain adaptation by leveraging the joint spatial information in images and videos on the one hand and, on the other hand, training an independent spatio-temporal model to bridge the modality gap. We alternate between the spatial and spatio-temporal learning with knowledge transfer between the two in each cycle. We evaluate our approach on benchmark datasets for image-to-video as well as for mixed-source domain adaptation achieving state-of-the-art results and demonstrating the benefits of our cyclic adaptation.",
  author = "Wei Lin and Anna Kukleva and Kunyang Sun and Horst Possegger and Hilde Kuehne and Horst Bischof",
  year = "2022",
  doi = "10.1007/978-3-031-20062-5_40",
  language = "English",
  isbn = "978-3-031-20061-8",
  series = "Lecture Notes in Computer Science",
  publisher = "Springer",
  pages = "698--715",
  booktitle = "Computer Vision – ECCV 2022",
  note = "2022 European Conference on Computer Vision : ECCV 2022, ECCV 2022 ; Conference date: 23-10-2022 Through 27-10-2022",
}

TY  - GEN
T1  - CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video
AU  - Lin, Wei
AU  - Kukleva, Anna
AU  - Sun, Kunyang
AU  - Possegger, Horst
AU  - Kuehne, Hilde
AU  - Bischof, Horst
PY  - 2022
Y1  - 2022
AB  - Although action recognition has achieved impressive results over recent years, both collection and annotation of video training data are still time-consuming and cost intensive. Therefore, image-to-video adaptation has been proposed to exploit labeling-free web image source for adapting on unlabeled target videos. This poses two major challenges: (1) spatial domain shift between web images and video frames; (2) modality gap between image and video data. To address these challenges, we propose Cycle Domain Adaptation (CycDA), a cycle-based approach for unsupervised image-to-video domain adaptation by leveraging the joint spatial information in images and videos on the one hand and, on the other hand, training an independent spatio-temporal model to bridge the modality gap. We alternate between the spatial and spatio-temporal learning with knowledge transfer between the two in each cycle. We evaluate our approach on benchmark datasets for image-to-video as well as for mixed-source domain adaptation achieving state-of-the-art results and demonstrating the benefits of our cyclic adaptation.
U2  - 10.1007/978-3-031-20062-5_40
DO  - 10.1007/978-3-031-20062-5_40
M3  - Conference paper
SN  - 978-3-031-20061-8
T3  - Lecture Notes in Computer Science
SP  - 698
EP  - 715
BT  - Computer Vision – ECCV 2022
PB  - Springer
CY  - Cham
T2  - 2022 European Conference on Computer Vision
Y2  - 23 October 2022 through 27 October 2022
ER  -
