Abstract
Learning-based methods produce remarkable results on single image depth tasks when trained on well-established benchmarks. However, there is a large gap between these benchmarks and real-world performance, which is usually obscured by the common practice of fine-tuning on the target dataset. We introduce a new depth dataset that is an order of magnitude larger than previous datasets and, more importantly, contains an unprecedented gamut of locations, camera models and scene types while offering metric depth (not just up-to-scale). Additionally, we investigate the problem of training single image depth networks using images captured with many different cameras, validating an existing approach and proposing a simpler alternative. With our contributions, we achieve excellent results on challenging benchmarks before fine-tuning, and set the state of the art on the popular KITTI dataset after fine-tuning.
The dataset is available at mapillary.com/dataset/depth.
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2020 |
Subtitle of host publication | 16th European Conference, 2020, Proceedings |
Editors | Andrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm |
Pages | 589-604 |
Number of pages | 16 |
Volume | 12347 |
ISBN (Electronic) | 978-3-030-58536-5 |
Publication status | Published - 1 Jan 2020 |
Event | 16th European Conference on Computer Vision: ECCV 2020 - Virtual, Glasgow, United Kingdom; Duration: 23 Aug 2020 → 28 Aug 2020 |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Volume | 12347 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 16th European Conference on Computer Vision |
---|---|
Abbreviated title | ECCV 2020 |
Country/Territory | United Kingdom |
City | Virtual, Glasgow |
Period | 23/08/20 → 28/08/20 |