Dynamic Scene Recognition with Complementary Spatiotemporal Features

Christoph Feichtenhofer, Axel Pinz, Richard Wildes

Research output: Contribution to journal › Article › Research › peer-review

Abstract

This paper presents Dynamically Pooled Complementary Features (DPCF), a unified approach to dynamic scene recognition that analyzes a short video clip in terms of its spatial, temporal and color properties. The complementarity of these properties is preserved through all main steps of processing, including primitive feature extraction, coding and pooling. In the feature extraction step, spatial orientations capture static appearance, spatiotemporal oriented energies capture image dynamics and color statistics capture chromatic information. Subsequently, primitive features are encoded into a mid-level representation that has been learned for the task of dynamic scene recognition. Finally, a novel dynamic spacetime pyramid is introduced. This dynamic pooling approach can handle both global as well as local motion by adapting to the temporal structure, as guided by pooling energies. The resulting system provides online recognition of dynamic scenes that is thoroughly evaluated on the two current benchmark datasets and yields best results to date on both datasets. In-depth analysis reveals the benefits of explicitly modeling feature complementarity in combination with the dynamic spacetime pyramid, indicating that this unified approach should be well-suited to many areas of video analysis.
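As a rough illustration of the pipeline the abstract describes (oriented filter responses rectified into energies, then pooled over a spatial grid as in one level of a spacetime pyramid), the sketch below uses plain finite differences as stand-ins for the paper's 3D oriented filter banks. All function names, the choice of filters, and the pooling grid are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def oriented_energies(clip):
    """Crude stand-in for spatiotemporal oriented energy filtering.

    clip: (T, H, W) grayscale video volume. DPCF uses banks of 3D
    oriented filters; here, finite differences along each axis serve
    as three "oriented" filters (an illustrative simplification).
    """
    dt = np.diff(clip, axis=0)[:, :-1, :-1]   # temporal derivative
    dy = np.diff(clip, axis=1)[:-1, :, :-1]   # vertical spatial derivative
    dx = np.diff(clip, axis=2)[:-1, :-1, :]   # horizontal spatial derivative
    # "Energies": squared (rectified) filter responses, one channel each.
    return np.stack([dx**2, dy**2, dt**2], axis=-1)  # (T-1, H-1, W-1, 3)

def pool_by_grid(energies, grid=(2, 2)):
    """Average-pool energies over a coarse spatial grid (one pyramid level)."""
    T, H, W, C = energies.shape
    gh, gw = grid
    cropped = energies[:, :H - H % gh, :W - W % gw, :]
    cells = cropped.reshape(T, gh, (H - H % gh) // gh,
                            gw, (W - W % gw) // gw, C)
    # Average over time and over the pixels inside each grid cell.
    return cells.mean(axis=(0, 2, 4))          # (gh, gw, C) descriptor

# Usage: an 8-frame 16x16 clip yields a 2x2 grid of 3-channel energies.
clip = np.random.rand(8, 16, 16)
descriptor = pool_by_grid(oriented_energies(clip))
```

The paper's dynamic pooling additionally adapts the grid to the temporal structure (guided by pooling energies); the fixed grid here corresponds only to the static spatial-pyramid baseline case.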
Original language: English
Pages (from-to): 2389-2401
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 38
Issue number: 12
DOIs: 10.1109/TPAMI.2016.2526008
Publication status: Published - 2016

Keywords

  • Dynamic scenes
  • feature representations
  • visual spacetime
  • image dynamics
  • spatiotemporal orientation

Fields of Expertise

  • Information, Communication & Computing

Cite this

Dynamic Scene Recognition with Complementary Spatiotemporal Features. / Feichtenhofer, Christoph; Pinz, Axel; Wildes, Richard.

In: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 38, No. 12, 2016, p. 2389-2401.


@article{3889c94feda849ccbc73e785c5c4062d,
title = "Dynamic Scene Recognition with Complementary Spatiotemporal Features",
abstract = "This paper presents Dynamically Pooled Complementary Features (DPCF), a unified approach to dynamic scene recognition that analyzes a short video clip in terms of its spatial, temporal and color properties. The complementarity of these properties is preserved through all main steps of processing, including primitive feature extraction, coding and pooling. In the feature extraction step, spatial orientations capture static appearance, spatiotemporal oriented energies capture image dynamics and color statistics capture chromatic information. Subsequently, primitive features are encoded into a mid-level representation that has been learned for the task of dynamic scene recognition. Finally, a novel dynamic spacetime pyramid is introduced. This dynamic pooling approach can handle both global as well as local motion by adapting to the temporal structure, as guided by pooling energies. The resulting system provides online recognition of dynamic scenes that is thoroughly evaluated on the two current benchmark datasets and yields best results to date on both datasets. In-depth analysis reveals the benefits of explicitly modeling feature complementarity in combination with the dynamic spacetime pyramid, indicating that this unified approach should be well-suited to many areas of video analysis.",
keywords = "Dynamic scenes, feature representations, visual spacetime, image dynamics, spatiotemporal orientation",
author = "Christoph Feichtenhofer and Axel Pinz and Richard Wildes",
year = "2016",
doi = "10.1109/TPAMI.2016.2526008",
language = "English",
volume = "38",
pages = "2389--2401",
journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",
issn = "0162-8828",
publisher = "IEEE Computer Society",
number = "12",

}

TY - JOUR

T1 - Dynamic Scene Recognition with Complementary Spatiotemporal Features

AU - Feichtenhofer, Christoph

AU - Pinz, Axel

AU - Wildes, Richard

PY - 2016

Y1 - 2016

N2 - This paper presents Dynamically Pooled Complementary Features (DPCF), a unified approach to dynamic scene recognition that analyzes a short video clip in terms of its spatial, temporal and color properties. The complementarity of these properties is preserved through all main steps of processing, including primitive feature extraction, coding and pooling. In the feature extraction step, spatial orientations capture static appearance, spatiotemporal oriented energies capture image dynamics and color statistics capture chromatic information. Subsequently, primitive features are encoded into a mid-level representation that has been learned for the task of dynamic scene recognition. Finally, a novel dynamic spacetime pyramid is introduced. This dynamic pooling approach can handle both global as well as local motion by adapting to the temporal structure, as guided by pooling energies. The resulting system provides online recognition of dynamic scenes that is thoroughly evaluated on the two current benchmark datasets and yields best results to date on both datasets. In-depth analysis reveals the benefits of explicitly modeling feature complementarity in combination with the dynamic spacetime pyramid, indicating that this unified approach should be well-suited to many areas of video analysis.

AB - This paper presents Dynamically Pooled Complementary Features (DPCF), a unified approach to dynamic scene recognition that analyzes a short video clip in terms of its spatial, temporal and color properties. The complementarity of these properties is preserved through all main steps of processing, including primitive feature extraction, coding and pooling. In the feature extraction step, spatial orientations capture static appearance, spatiotemporal oriented energies capture image dynamics and color statistics capture chromatic information. Subsequently, primitive features are encoded into a mid-level representation that has been learned for the task of dynamic scene recognition. Finally, a novel dynamic spacetime pyramid is introduced. This dynamic pooling approach can handle both global as well as local motion by adapting to the temporal structure, as guided by pooling energies. The resulting system provides online recognition of dynamic scenes that is thoroughly evaluated on the two current benchmark datasets and yields best results to date on both datasets. In-depth analysis reveals the benefits of explicitly modeling feature complementarity in combination with the dynamic spacetime pyramid, indicating that this unified approach should be well-suited to many areas of video analysis.

KW - Dynamic scenes

KW - feature representations

KW - visual spacetime

KW - image dynamics

KW - spatiotemporal orientation

U2 - 10.1109/TPAMI.2016.2526008

DO - 10.1109/TPAMI.2016.2526008

M3 - Article

VL - 38

SP - 2389

EP - 2401

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

SN - 0162-8828

IS - 12

ER -