Updated: Mar 27, 2019
This article outlines the principles and mechanisms of SLAM (Simultaneous Localisation and Mapping) technology, as exemplified by KudanSLAM from Kudan.
Simultaneous localisation and mapping (SLAM) has a long history, and various techniques have been adopted to solve it at different times. In this article, we focus on the current mechanism of SLAM as implemented in KudanSLAM.
By definition, every SLAM system solves two problems at the same time: the localisation of a sensing agent in an environment and the creation of a map of that environment. The system can therefore be split into two parts: a tracking part, which solves the localisation problem by comparing new input data with the currently existing map, and a mapping part, which supports the tracking by creating, maintaining and expanding the map itself using the information extracted by the tracking part.
Kudan SLAM is a visual SLAM system, meaning that its main input is a camera feed, possibly augmented by a second camera feed in the case of stereo SLAM, by a depth sensor in the case of RGBD SLAM, and/or by non-visual information such as IMU/GPS data when available. Kudan SLAM divides the tracking and mapping tasks into two main threads that work in parallel to provide the real-time pose of the main camera and the optimised map of the environment.
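The two-thread division of labour can be illustrated with a minimal sketch. This is not Kudan's implementation: the frame stream, the pose placeholder, and the every-third-frame keyframe-selection heuristic are all assumptions made purely for illustration; only the structure (tracking producing poses and keyframes, mapping consuming keyframes into a shared map) reflects the text above.

```python
import threading
import queue

class SharedMap:
    """Minimal stand-in for the SLAM map shared by both threads."""
    def __init__(self):
        self.keyframes = []
        self.lock = threading.Lock()

    def add_keyframe(self, kf):
        with self.lock:
            self.keyframes.append(kf)

def tracking_thread(frames, keyframe_queue, poses):
    # Tracking: estimate a pose for every incoming frame against the
    # current map, and hand selected frames to the mapper as keyframes.
    for i, frame in enumerate(frames):
        pose = ("pose", i)          # placeholder for real pose estimation
        poses.append(pose)
        if i % 3 == 0:              # hypothetical keyframe-selection rule
            keyframe_queue.put((frame, pose))
    keyframe_queue.put(None)        # signal end of the input stream

def mapping_thread(keyframe_queue, slam_map):
    # Mapping: integrate new keyframes into the map; a real system would
    # also triangulate map points and optimise the map here.
    while True:
        item = keyframe_queue.get()
        if item is None:
            break
        slam_map.add_keyframe(item)

frames = [f"frame{i}" for i in range(10)]
poses, slam_map, kf_queue = [], SharedMap(), queue.Queue()
t = threading.Thread(target=tracking_thread, args=(frames, kf_queue, poses))
m = threading.Thread(target=mapping_thread, args=(kf_queue, slam_map))
t.start(); m.start()
t.join(); m.join()
print(len(poses), len(slam_map.keyframes))  # 10 4
```

Note that tracking emits a pose for every frame (real-time localisation), while mapping only processes the sparser keyframe stream, which is what lets the map-optimisation work run without blocking tracking.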
The map structure
Kudan SLAM is a feature-based sparse system, meaning that its fundamental mechanism is based on tracking and maintaining a sparse map of feature points that describe the environment. The map comprises three main components:
The keyframes are the main data structure in the map. Each keyframe mainly contains the camera pose at a given frame and the list of feature-point observations made from that pose.
The observations link the keyframes with the map points and consist mainly of a compact binary descriptor of the visual appearance of a feature point as observed from a particular camera pose.
The map points contain the 3D positions, in map space, of the observed feature points. Each map point can be observed from more than one keyframe; therefore, each map point keeps a list of links to its observations in every keyframe that sees it.
Because the 3D map points can be observed from multiple views, different keyframes can be connected to each other via their observations of the same set of points. Such keyframes are said to be co-visible with each other.
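The three map components and the co-visibility relation can be sketched as simple linked data structures. The class and function names below are hypothetical, chosen only to mirror the terminology of this article; a real system would store 6-DoF poses and fixed-length binary descriptors rather than the placeholders used here.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class MapPoint:
    """A feature point's 3D position in map space, linked to its observations."""
    position: tuple                                    # (x, y, z)
    observations: list = field(default_factory=list)

@dataclass
class Observation:
    """Links one keyframe to one map point, with a compact binary descriptor."""
    keyframe_id: int
    point: "MapPoint"
    descriptor: bytes

@dataclass
class KeyFrame:
    """A camera pose plus the observations made from that pose."""
    kf_id: int
    pose: str                                          # placeholder for a 6-DoF pose
    observations: list = field(default_factory=list)

def observe(keyframe, point, descriptor):
    # Record that this keyframe sees this map point; the observation is
    # linked from both sides, as described in the text.
    obs = Observation(keyframe.kf_id, point, descriptor)
    keyframe.observations.append(obs)
    point.observations.append(obs)

def covisibility(keyframes):
    # Two keyframes are co-visible when they observe shared map points;
    # count the shared points for each ordered keyframe pair.
    counts = defaultdict(int)
    for kf in keyframes:
        for obs in kf.observations:
            for other in obs.point.observations:
                if other.keyframe_id > kf.kf_id:
                    counts[(kf.kf_id, other.keyframe_id)] += 1
    return dict(counts)

p1, p2 = MapPoint((0, 0, 5)), MapPoint((1, 0, 5))
kf0, kf1 = KeyFrame(0, "pose0"), KeyFrame(1, "pose1")
observe(kf0, p1, b"\x1a"); observe(kf0, p2, b"\x2b")
observe(kf1, p1, b"\x1c")               # p1 is seen by both keyframes
print(covisibility([kf0, kf1]))         # {(0, 1): 1}
```

Because each map point links back to all of its observations, co-visibility falls out of the structure directly: following a point's observation list from any one keyframe immediately yields every other keyframe that sees the same point.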