Abstract
This article presents foundations, original research and trends in the
field of object categorization by computer vision methods. The research
goals in object categorization are to detect objects in images and to
determine the objects’ categories. Categorization aims for the recognition
of generic classes of objects, and thus has also been termed
‘generic object recognition’. This is in contrast to the recognition of
specific, individual objects. While humans are usually better at generic
than at specific recognition, categorization is much harder to achieve
for today’s computer architectures and algorithms. Major problems are
related to the concept of a ‘visual category’, where a successful recognition
algorithm has to manage large intra-class variabilities versus
sometimes marginal inter-class differences. It turns out that several
techniques which are useful for specific recognition can also be adapted
to categorization, but there are also a number of recent developments
in learning, representation and detection that are especially tailored to
categorization.
Recent results have established various categorization methods that
are based on local salient structures in the images. Some of these methods
use just a ‘bag of keypoints’ model. Others include a certain amount
of geometric modeling of 2D spatial relations between parts, or ‘constellations’
of parts. There is now a certain maturity in these approaches
and they achieve excellent recognition results on rather complex image
databases. Further work focused on the description of shape and object
contour for categorization is only just emerging. However, there remain
a number of important open questions, which also define current and
future research directions. These issues include localization abilities,
required supervision, the handling of many categories, online and incremental
learning, and the use of a ‘visual alphabet’, to name a few. These
aspects are illustrated by the discussion of several current approaches,
including our own patch-based system and our boundary fragment model.
The article closes with a summary and a discussion of promising
future research directions.
Original language | English |
---|---|
Pages (from - to) | 255-353 |
Journal | Foundations and trends in computer graphics and vision |
Volume | 1 |
Issue number | 4 |
DOIs | |
Publication status | Published - 2006 |
Treatment code (detailed classification)
- Review