A method determines a pose of an image capture device. The method includes accessing an image of a scene captured by the image capture device. A semantic segmentation of the image is performed to generate a segmented image. An initial pose of the image capture device is generated using a three-dimensional (3D) tracker. A plurality of 3D renderings of the scene are generated, each of the plurality of 3D renderings corresponding to one of a plurality of poses chosen based on the initial pose. A pose is selected from the plurality of poses, such that the 3D rendering corresponding to the selected pose aligns with the segmented image.
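The selection step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, since the abstract does not say how candidate poses are chosen or how alignment is measured: the pose is reduced to a 2D integer shift instead of a full 6-DoF camera pose, `render_labels` is a hypothetical stand-in for a real 3D renderer, candidates are sampled on a small grid around the tracker's initial pose, and alignment is scored as the fraction of pixels whose class labels agree.

```python
from itertools import product


def render_labels(pose, width=8, height=6):
    """Hypothetical renderer: returns a label image in which a 3x2 block
    (label 1, e.g. "building") sits on a background (label 0), shifted
    by the pose (dx, dy). A real system would render a 3D scene model."""
    dx, dy = pose
    return [
        [1 if 2 + dx <= x < 5 + dx and 2 + dy <= y < 4 + dy else 0
         for x in range(width)]
        for y in range(height)
    ]


def alignment_score(rendering, segmented):
    """Fraction of pixels whose class labels agree (an assumed metric)."""
    total = sum(len(row) for row in segmented)
    agree = sum(
        r == s
        for r_row, s_row in zip(rendering, segmented)
        for r, s in zip(r_row, s_row)
    )
    return agree / total


def candidate_poses(initial_pose, radius=2):
    """Candidate poses on a square grid centred on the initial pose."""
    x0, y0 = initial_pose
    offsets = range(-radius, radius + 1)
    return [(x0 + dx, y0 + dy) for dx, dy in product(offsets, repeat=2)]


def select_pose(segmented, initial_pose, radius=2):
    """Pick the candidate whose rendering best matches the segmented image."""
    return max(
        candidate_poses(initial_pose, radius),
        key=lambda pose: alignment_score(render_labels(pose), segmented),
    )


# Toy usage: the "segmented image" is synthesised from a known pose, and the
# tracker's initial estimate is slightly off.
true_pose = (1, -1)
segmented = render_labels(true_pose)
initial_pose = (0, 0)
best_pose = select_pose(segmented, initial_pose)
print(best_pose)  # (1, -1)
```

An exhaustive grid search keeps the sketch short; with a full 3D pose the candidate set grows quickly, so a practical system would likely need a coarser sampling or an iterative refinement, neither of which the abstract specifies.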
Publication status: Published - 2019