<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
	>

<channel>
	<title>machineperception | Kudan global</title>
	<atom:link href="https://www.kudan.io/blog/tag/machineperception/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.kudan.io</link>
	<description>Kudan has been providing proprietary Artificial Perception technologies based on SLAM to enable use cases with significant market potential and impact on our lives such as autonomous driving, robotics, AR/VR and smart cities</description>
	<lastBuildDate>Mon, 19 Jun 2023 23:37:29 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=5.8.13</generator>

<image>
	<url>https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/05/cropped-NoImage.png?fit=32%2C32&#038;ssl=1</url>
	<title>machineperception | Kudan global</title>
	<link>https://www.kudan.io</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">179852210</site>	<item>
		<title>Visual SLAM: The Basics</title>
		<link>https://www.kudan.io/blog/visual-slam-the-basics/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=visual-slam-the-basics</link>
		
		<dc:creator><![CDATA[user]]></dc:creator>
		<pubDate>Thu, 20 Aug 2020 09:00:08 +0000</pubDate>
				<category><![CDATA[Tech Blog]]></category>
		<category><![CDATA[artificialperception]]></category>
		<category><![CDATA[artisense]]></category>
		<category><![CDATA[computervision]]></category>
		<category><![CDATA[Kudan]]></category>
		<category><![CDATA[KudanSLAM]]></category>
		<category><![CDATA[machineperception]]></category>
		<category><![CDATA[mapping]]></category>
		<category><![CDATA[SLAM]]></category>
		<category><![CDATA[vslam]]></category>
		<guid isPermaLink="false">https://www.kudan.io/?p=433</guid>

					<description><![CDATA[<p>In my last article, we looked at SLAM from a 16km (50,000 feet) perspective, so let’s look at it from 2m. Not close enough to get your hands dirty, but enough to get a good look over someone’s shoulders. SLAM can take on many forms and approaches, but for our purpose, let’s start with feature-based [&#8230;]</p>
<p>The post <a href="https://www.kudan.io/blog/visual-slam-the-basics/">Visual SLAM: The Basics</a> first appeared on <a href="https://www.kudan.io">Kudan global</a>.</p>]]></description>
										<content:encoded><![CDATA[<p><img loading="lazy" class="size-full wp-image-451 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/Rainbow-Shopping-Point-Cloud-Short.gif?resize=640%2C347&#038;ssl=1" alt="" width="640" height="347" data-recalc-dims="1" /></p>
<p>In my <a href="https://www.kudan.io/archives/413" target="_blank" rel="noopener noreferrer">last article</a>, we looked at SLAM from a 16km (50,000 feet) perspective, so let’s look at it from 2m. Not close enough to get your hands dirty, but enough to get a good look over someone’s shoulders. SLAM can take on many forms and approaches, but for our purpose, let’s start with feature-based visual SLAM. I will cover other SLAM approaches such as direct visual SLAM, and those that use cameras with depth sensors, and LiDAR in subsequent articles.</p>
<p>As the name implies, visual SLAM utilizes camera(s) as the primary source of sensor input to sense the surrounding environment. This can be done with a single camera or multiple cameras, with or without an inertial measurement unit (IMU), which measures translational and rotational movements.</p>
<p>Let’s walk through the process chart and see what happens at each stage.</p>
<p><img loading="lazy" class="size-full wp-image-434 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/09/01-Feature-based-VSLAM-Process.png?resize=700%2C72&#038;ssl=1" alt="" width="700" height="72" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/09/01-Feature-based-VSLAM-Process.png?w=700&amp;ssl=1 700w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/09/01-Feature-based-VSLAM-Process.png?resize=300%2C31&amp;ssl=1 300w" sizes="(max-width: 700px) 100vw, 700px" data-recalc-dims="1" /></p>
<h3><strong>Sensor/Camera Measurements: Setup</strong></h3>
<p>In order to make this a bit more concrete, let’s imagine a pair of augmented reality glasses. For simplicity, the glasses have two cameras mounted at the temples and an IMU centered between the cameras. The two cameras provide the stereo vision to make depth estimations easier, and the IMU will help provide better movement estimations.</p>
<p><img loading="lazy" class="size-full wp-image-436 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/09/02-Glasses-2.png?resize=254%2C127&#038;ssl=1" alt="" width="254" height="127" data-recalc-dims="1" /></p>
<p>There are a couple of camera specifications that help with SLAM: a global shutter and a grayscale sensor. Also, the tracking cameras don’t have to be super high-resolution; typically a VGA (640&#215;480 pixel) camera is sufficient (more pixels, more processing). Let’s also assume there is a 6-axis IMU: motion along the x-, y- and z-axes, plus pitch, yaw and roll. Finally, these sensors should be synchronized against a common clock so that their outputs can be matched against each other.</p>
<h3><strong>Let&#8217;s start solving the puzzle</strong></h3>
<p>I find the process of completing a jigsaw puzzle to be a good analogy for some major components of the SLAM process. The processes described below are mostly conceptual and simplified to aid understanding of the overall mechanisms involved.</p>
<p>When the system is initialized, and the cameras are turned on, you are given your first piece of the puzzle. You don’t know how many pieces there are, and you don’t know what part of the puzzle you are looking at. You have your first stereo images and IMU readings.</p>
<h3><strong>Feature Extraction: Distortion correction</strong></h3>
<p>Most camera lenses will introduce some level of distortion to the captured images. There will be distortion from the design of the lenses, as well as distortion in each individual lens from minute differences during manufacturing. We can “undistort” the image through a distortion grid that transforms it back into a close approximation of the true scene.</p>
<p><img loading="lazy" class="size-full wp-image-437 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/09/03-Distortion-Correction-2.png?resize=596%2C500&#038;ssl=1" alt="" width="596" height="500" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/09/03-Distortion-Correction-2.png?w=596&amp;ssl=1 596w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/09/03-Distortion-Correction-2.png?resize=300%2C252&amp;ssl=1 300w" sizes="(max-width: 596px) 100vw, 596px" data-recalc-dims="1" /></p>
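<p>To make the correction concrete, here is a minimal sketch (Python with NumPy, using an assumed two-coefficient radial distortion model rather than any particular camera’s calibration) of distorting and then undistorting normalized image points; the inverse is recovered by fixed-point iteration:</p>

```python
import numpy as np

def distort(pts, k1, k2):
    """Apply a simple radial (Brown-Conrady style) distortion to normalized points."""
    r2 = np.sum(pts**2, axis=1, keepdims=True)
    return pts * (1.0 + k1 * r2 + k2 * r2**2)

def undistort(pts_d, k1, k2, iters=20):
    """Invert the radial distortion by fixed-point iteration."""
    pts = pts_d.copy()
    for _ in range(iters):
        r2 = np.sum(pts**2, axis=1, keepdims=True)
        pts = pts_d / (1.0 + k1 * r2 + k2 * r2**2)
    return pts

pts = np.array([[0.3, -0.2], [0.5, 0.5]])
restored = undistort(distort(pts, k1=-0.25, k2=0.05), k1=-0.25, k2=0.05)
# restored is numerically close to the original pts
```

<p>Real systems use calibrated coefficients (often including tangential terms) obtained from a calibration procedure; the iteration above is the same idea applied per pixel to build the distortion grid.</p>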
<h3><strong>Feature Extraction: Feature points</strong></h3>
<p>Features in computer vision can take on a number of forms, and don’t necessarily correspond to what humans think of as features. Features typically take the form of corners or blobs &#8211; collections of pixels that uniquely stand out and can be consistently identified across images &#8211; and occasionally edges. The figure below depicts the features detected (left) and how they would be represented in a map (right).</p>
<p><img loading="lazy" class="alignnone size-full wp-image-441" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/04-Feature-Points-3.png?resize=960%2C338&#038;ssl=1" alt="" width="960" height="338" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/04-Feature-Points-3.png?w=960&amp;ssl=1 960w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/04-Feature-Points-3.png?resize=300%2C106&amp;ssl=1 300w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/04-Feature-Points-3.png?resize=768%2C270&amp;ssl=1 768w" sizes="(max-width: 960px) 100vw, 960px" data-recalc-dims="1" /></p>
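<p>As an illustration of why corners stand out, the following sketch (NumPy only, with a synthetic image as a toy assumption) computes a Harris-style corner response: it is large only where image gradients vary in two directions, i.e. at corners rather than along edges or in flat regions:</p>

```python
import numpy as np

def harris_response(img, k=0.04):
    """Harris corner response: high where gradients vary in two directions."""
    gy, gx = np.gradient(img.astype(float))
    # Structure tensor entries, smoothed with a crude 3x3 box filter
    def box(a):
        p = np.pad(a, 1, mode="edge")
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0
    Ixx, Iyy, Ixy = box(gx * gx), box(gy * gy), box(gx * gy)
    det = Ixx * Iyy - Ixy**2
    trace = Ixx + Iyy
    return det - k * trace**2

# Synthetic image: a bright square on a dark background
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)
# The strongest responses fall at the square's corners, not along its edges
```

<p>Production systems typically use faster detectors such as FAST or ORB, but the underlying intuition &#8211; reward two-directional gradient structure &#8211; is the same.</p>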
<h3><strong>Feature Extraction: Feature matching and depth estimation</strong></h3>
<p><img loading="lazy" class=" wp-image-442 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/05-Stereo-Features.png?resize=620%2C299&#038;ssl=1" alt="" width="620" height="299" data-recalc-dims="1" /></p>
<p>Given these stereo images, we should be able to see overlapping features between the images. These identical features can then be used to estimate the distance from the sensor. We know the orientation of the cameras and the distance between them. We use this information to perform image rectification &#8211; mapping the pixels of the two images onto a common plane. This is then used to determine the disparity of the common features between the two images. Disparity and distance are inversely related, such that as the distance from the camera increases, the disparity decreases.</p>
<p>Now, we can estimate the depth of each of the features using triangulation.</p>
<p><img loading="lazy" class="size-full wp-image-443 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/06-Triangulation.png?resize=265%2C395&#038;ssl=1" alt="" width="265" height="395" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/06-Triangulation.png?w=265&amp;ssl=1 265w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/06-Triangulation.png?resize=201%2C300&amp;ssl=1 201w" sizes="(max-width: 265px) 100vw, 265px" data-recalc-dims="1" /></p>
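<p>Once rectified, the depth calculation itself is a one-liner. A small sketch with assumed values for the focal length and baseline (real values come from calibration):</p>

```python
import numpy as np

# Assumed (hypothetical) rig: focal length in pixels and stereo baseline in meters
FOCAL_PX = 700.0
BASELINE_M = 0.12

def depth_from_disparity(disparity_px):
    """For a rectified stereo pair, depth is inversely proportional to disparity:
    Z = f * B / d. Larger disparity means a closer point."""
    disparity_px = np.asarray(disparity_px, dtype=float)
    return FOCAL_PX * BASELINE_M / disparity_px

# A feature seen 42 px apart between left and right images is ~2 m away;
# halving the disparity doubles the depth
print(depth_from_disparity([42.0, 21.0]))
```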
<p>In a single camera scenario, we cannot infer depth from a single image, but as the camera moves around, depth can be inferred through parallax by comparing the features in subsequent images.</p>
<h3><strong>Data association</strong></h3>
<p>The data association step takes the features detected, along with their estimated locations in space, and builds a map of these features with regard to the cameras. As this process continues through subsequent frames, the system continually takes new measurements, associates features with known elements of the map, and prunes uncertain features.</p>
<p>As we track the motion of the camera, we can start making predictions based on the known features, and how they should change based on the motion.</p>
<p><img loading="lazy" class="size-full wp-image-444 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/07-Observations.png?resize=465%2C310&#038;ssl=1" alt="" width="465" height="310" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/07-Observations.png?w=465&amp;ssl=1 465w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/07-Observations.png?resize=300%2C200&amp;ssl=1 300w" sizes="(max-width: 465px) 100vw, 465px" data-recalc-dims="1" /></p>
<p>The constraint of computing resources and time (especially real-time requirements) creates a forcing function for SLAM, where the process becomes a tradeoff between map accuracy and processing resources and time. As the measurements of features and of location/pose accumulate over time, the representation of the observed environment has to be constrained and optimized. We’ll take a look at some of these tools and different approaches to optimizing the model.</p>
<h3><strong>Location, Pose and Map Update: Kalman filters</strong></h3>
<p>As the camera moves through space, there is increasing noise and uncertainty between the images the camera captures and its associated motion. Kalman filters reduce the effects of noise and uncertainty among different measurements to model a linear system more accurately, by continually making predictions and then updating and refining the model against the observed measurements. For SLAM systems, we typically use the extended Kalman filter (EKF), which handles nonlinear systems by linearizing the predictions and measurements around their mean.</p>
<p><img loading="lazy" class="alignnone size-full wp-image-445" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/08-SLAM-with-Filters-Process.png?resize=907%2C213&#038;ssl=1" alt="" width="907" height="213" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/08-SLAM-with-Filters-Process.png?w=907&amp;ssl=1 907w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/08-SLAM-with-Filters-Process.png?resize=300%2C70&amp;ssl=1 300w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/08-SLAM-with-Filters-Process.png?resize=768%2C180&amp;ssl=1 768w" sizes="(max-width: 907px) 100vw, 907px" data-recalc-dims="1" /></p>
<p>Utilizing a probabilistic approach, Kalman filters take into account all the previous measurements and associate the features to the latest camera pose through the use of a state vector and a covariance matrix relating each feature to the others. However, all noise and states are assumed to be Gaussian. As you can imagine, as the number of tracked points grows, the computation becomes quite expensive and harder to scale.</p>
<p><img loading="lazy" class="size-full wp-image-446 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/09-After-Kalman-Filter.png?resize=487%2C373&#038;ssl=1" alt="" width="487" height="373" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/09-After-Kalman-Filter.png?w=487&amp;ssl=1 487w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/09-After-Kalman-Filter.png?resize=300%2C230&amp;ssl=1 300w" sizes="(max-width: 487px) 100vw, 487px" data-recalc-dims="1" /></p>
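<p>For intuition, here is a minimal linear Kalman filter (the EKF adds the linearization step around the current estimate) tracking a single point with a constant-velocity model; the state and measurement matrices are toy assumptions, not a real SLAM state:</p>

```python
import numpy as np

def kalman_step(x, P, z, F, Q, H, R):
    """One predict/update cycle of a linear Kalman filter.
    x: state mean, P: state covariance, z: new measurement."""
    # Predict: propagate the state and grow the uncertainty
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: blend prediction and measurement, weighted by the Kalman gain
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# 1-D position+velocity toy: track a point moving at 1 unit per step
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # constant-velocity motion model
H = np.array([[1.0, 0.0]])               # we only measure position
Q, R = 1e-4 * np.eye(2), np.array([[0.25]])
x, P = np.zeros(2), np.eye(2)
for t in range(1, 30):
    x, P = kalman_step(x, P, np.array([float(t)]), F, Q, H, R)
# x[0] converges near the true position, x[1] near the true velocity of 1
```

<p>In a real EKF-SLAM state vector, the camera pose and every landmark live in one joint state, which is why the covariance matrix &#8211; and the update cost &#8211; grows quadratically with the number of landmarks.</p>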
<h3><strong>Location, Pose and Map Update: Particle filters</strong></h3>
<p>In contrast to Kalman filters, particle filters treat each feature point as a particle in space with some level of positional uncertainty. At each measurement this uncertainty is updated (normalized and re-weighted) against the predicted position with regard to the camera movement. Unlike Kalman filters, particle filters can handle noise from any distribution, and states can have a multi-modal distribution.</p>
<p><img loading="lazy" class="size-full wp-image-447 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/10-After-Particle-Filter.png?resize=517%2C386&#038;ssl=1" alt="" width="517" height="386" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/10-After-Particle-Filter.png?w=517&amp;ssl=1 517w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/10-After-Particle-Filter.png?resize=300%2C224&amp;ssl=1 300w" sizes="(max-width: 517px) 100vw, 517px" data-recalc-dims="1" /></p>
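<p>A toy one-dimensional sketch of this cycle &#8211; predict, re-weight, resample &#8211; with assumed motion and measurement noise values:</p>

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, motion, measurement, noise=0.5):
    """One step: move particles, re-weight by measurement likelihood, resample."""
    # Predict: apply the motion model with a little diffusion
    particles = particles + motion + rng.normal(0.0, 0.1, len(particles))
    # Update: weight each particle by how well it explains the measurement
    weights = weights * np.exp(-0.5 * ((particles - measurement) / noise) ** 2)
    weights = weights / weights.sum()          # normalize
    # Resample: draw particles in proportion to their weights
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

particles = rng.uniform(-10, 10, 500)          # no idea where we are at first
weights = np.full(500, 1.0 / 500)
for t in range(1, 11):                         # true position moves +1 per step
    particles, weights = particle_filter_step(particles, weights, 1.0, float(t))
# the particle cloud concentrates around the true position of 10
```

<p>Note that nothing here assumes Gaussian noise: the likelihood function and the particle cloud can represent any distribution, including multi-modal ones, which is exactly the advantage over the Kalman filter.</p>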
<h3><strong>Location, Pose and Map Update: Bundle adjustment</strong></h3>
<p>As the number of points being tracked in space, along with the corresponding camera poses, increases, bundle adjustment provides an optimization step that performs a nonlinear least squares operation on the current model. Imagine a “bundle” of light rays connecting all the features to each of the camera observations, “adjusted” to optimize these connections directly against the sensor position and orientation, as in the figure below.</p>
<p><img loading="lazy" class="alignnone size-full wp-image-448" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/11-After-Bundle-Adjustment.png?resize=901%2C310&#038;ssl=1" alt="" width="901" height="310" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/11-After-Bundle-Adjustment.png?w=901&amp;ssl=1 901w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/11-After-Bundle-Adjustment.png?resize=300%2C103&amp;ssl=1 300w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/11-After-Bundle-Adjustment.png?resize=768%2C264&amp;ssl=1 768w" sizes="(max-width: 901px) 100vw, 901px" data-recalc-dims="1" /></p>
<p>Bundle adjustment is a batch operation, and not performed on every captured frame.</p>
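<p>To give a flavor of the underlying least squares problem, the sketch below adjusts a single landmark against several fixed, identity-rotation cameras by Gauss-Newton on the reprojection error. Full bundle adjustment also optimizes the camera poses and handles rotations, so treat this as a deliberately tiny slice of the problem with assumed values throughout:</p>

```python
import numpy as np

F = 500.0                                    # assumed focal length (pixels)
cams = np.array([[0.0, 0.0], [0.3, 0.0], [0.0, 0.3], [0.3, 0.3]])  # camera centers (x, y) at z = 0

def project(p, c):
    """Pinhole projection for identity-rotation cameras at z = 0 (a toy model)."""
    return F * (p[:2] - c) / p[2]

# Generate observations of one true landmark, then perturb our estimate of it
p_true = np.array([1.0, 0.5, 4.0])
obs = np.array([project(p_true, c) for c in cams])
p = p_true + np.array([0.2, -0.3, 0.5])      # bad initial estimate

for _ in range(15):                          # Gauss-Newton on the reprojection error
    J, r = [], []
    for c, z in zip(cams, obs):
        u = project(p, c)
        r.extend(u - z)                      # residual: predicted minus observed pixel
        J.append([F / p[2], 0.0, -F * (p[0] - c[0]) / p[2] ** 2])
        J.append([0.0, F / p[2], -F * (p[1] - c[1]) / p[2] ** 2])
    J, r = np.array(J), np.array(r)
    p = p - np.linalg.solve(J.T @ J, J.T @ r)
# p has been pulled back toward p_true by minimizing the reprojection error
```

<p>Real bundle adjusters exploit the sparsity of this Jacobian (each landmark touches only the frames that observe it), which is what makes the batch operation tractable.</p>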
<h3><strong>Location, Pose and Map Update: Keyframe</strong></h3>
<p>Keyframes are select observations by the camera that capture a “good” representation of the environment. Some approaches will perform a bundle adjustment after every keyframe. Filtering becomes extremely computationally expensive as the map model grows; keyframe-based approaches, by contrast, enable more feature points or larger maps, with a balanced tradeoff between accuracy and efficiency.</p>
<p><img loading="lazy" class="size-full wp-image-449 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/12-After-Keyframe-Selection.png?resize=487%2C355&#038;ssl=1" alt="" width="487" height="355" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/12-After-Keyframe-Selection.png?w=487&amp;ssl=1 487w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/12-After-Keyframe-Selection.png?resize=300%2C219&amp;ssl=1 300w" sizes="(max-width: 487px) 100vw, 487px" data-recalc-dims="1" /></p>
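<p>Keyframe selection policies differ between systems, but a common style of heuristic looks something like this sketch (thresholds and the overlap test are illustrative, not from any particular system):</p>

```python
def should_insert_keyframe(tracked_ids, last_keyframe_ids,
                           min_overlap=0.6, min_frames=10, frames_since=0):
    """Add a keyframe when the current frame shares too few features with the
    last keyframe, but not too often, so the map and the bundle adjustment
    problem stay small."""
    if frames_since < min_frames:
        return False                         # too soon after the last keyframe
    if not last_keyframe_ids:
        return True                          # no keyframe yet: take one
    overlap = len(set(tracked_ids) & set(last_keyframe_ids)) / len(last_keyframe_ids)
    return overlap < min_overlap

# Plenty of shared features -> keep going; mostly new scenery -> new keyframe
print(should_insert_keyframe([1, 2, 3, 4], [1, 2, 3, 4, 5], frames_since=20))  # False (80% overlap)
print(should_insert_keyframe([7, 8, 9, 4], [1, 2, 3, 4, 5], frames_since=20))  # True (20% overlap)
```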
<h3><strong>Post-update</strong></h3>
<p>Once the update step completes, the 3D map of the current environment is updated, and the position and orientation of the sensor within this map is known. There are two important concepts that loosely fit into this final step &#8211; a test to see if the system has been here before, and what happens when the system loses tracking or gets lost.</p>
<h3><strong>Post update: Loop closure</strong></h3>
<p>As the system continues to move through space and build a model of its environment, the system will continue to accumulate measurement errors and sensor drift, which will be reflected in the map being generated. Loop closure occurs when the system recognizes that it is revisiting a previously mapped area, and connects previously unconnected parts of the map into a loop, correcting the accumulated errors in the map.</p>
<p><img loading="lazy" class="alignnone size-full wp-image-450" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/13-Loop-Closure.png?resize=960%2C344&#038;ssl=1" alt="" width="960" height="344" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/13-Loop-Closure.png?w=960&amp;ssl=1 960w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/13-Loop-Closure.png?resize=300%2C108&amp;ssl=1 300w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/13-Loop-Closure.png?resize=768%2C275&amp;ssl=1 768w" sizes="(max-width: 960px) 100vw, 960px" data-recalc-dims="1" /></p>
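<p>Real systems correct the map with pose-graph optimization; as a crude illustration of the effect, the sketch below simply spreads the accumulated drift linearly along the trajectory once the loop is detected:</p>

```python
import numpy as np

def close_loop(trajectory):
    """When the last pose is recognized as a revisit of the first, spread the
    accumulated drift linearly along the trajectory (a crude stand-in for
    pose-graph optimization)."""
    drift = trajectory[-1] - trajectory[0]   # the end should coincide with the start
    n = len(trajectory) - 1
    corrections = np.outer(np.arange(len(trajectory)) / n, drift)
    return trajectory - corrections

# A square path whose estimate drifts and fails to come back to the origin
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0.08, 0.06]], dtype=float)
closed = close_loop(square)
# After closure the endpoint lands back on the start point
```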
<h3><strong>Post update: Relocalization</strong></h3>
<p>The term localization in SLAM refers to the system’s awareness of its orientation and position within the given environment and space. Relocalization occurs when a system loses tracking (or is initialized in a new environment) and needs to assess its location based on currently observable features. If the system is able to match the features it observes against the available map, it will localize itself to the corresponding pose in the map and continue the SLAM process.</p>
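<p>The feature-matching test at the heart of relocalization can be sketched as a nearest-neighbor search over descriptors with a ratio test to reject ambiguous matches (the descriptors below are toy values; real systems use descriptors such as ORB, usually with a vocabulary for speed):</p>

```python
import numpy as np

def relocalize_matches(observed, map_descs, ratio=0.7):
    """Match observed feature descriptors against the map, accepting only
    matches clearly better than the second-best candidate (Lowe-style ratio test)."""
    matches = []
    for i, d in enumerate(observed):
        dists = np.linalg.norm(map_descs - d, axis=1)
        j, k = np.argsort(dists)[:2]
        if dists[j] < ratio * dists[k]:
            matches.append((i, j))
    return matches

map_descs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
observed = np.array([[0.95, 0.05], [0.5, 0.5]])   # first is distinctive, second ambiguous
print(relocalize_matches(observed, map_descs))     # only the unambiguous match survives
```

<p>Given enough such matches, the pose can then be solved from the 2D&#8211;3D correspondences, and the SLAM process resumes from there.</p>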
<h3><strong>Final words</strong></h3>
<p>My goal was not to be too mathy and technical, yet to stay conceptually descriptive enough to build a solid understanding of the processes that take place within one type of SLAM system. That pushes this piece a bit beyond my “5 minute read” target, but I think it’s essential to cover these fundamental concepts of visual SLAM to help with future topics.</p>
<p>Let me know your thoughts, comments and questions.</p><p>The post <a href="https://www.kudan.io/blog/visual-slam-the-basics/">Visual SLAM: The Basics</a> first appeared on <a href="https://www.kudan.io">Kudan global</a>.</p>]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">433</post-id>	</item>
		<item>
		<title>Simultaneous Localization and Mapping (SLAM): An Introduction</title>
		<link>https://www.kudan.io/blog/lidar-simultaneous-localization-mapping-an-introduction/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=lidar-simultaneous-localization-mapping-an-introduction</link>
		
		<dc:creator><![CDATA[user]]></dc:creator>
		<pubDate>Wed, 05 Aug 2020 04:36:24 +0000</pubDate>
				<category><![CDATA[Tech Blog]]></category>
		<category><![CDATA[artificialperception]]></category>
		<category><![CDATA[artisense]]></category>
		<category><![CDATA[computervision]]></category>
		<category><![CDATA[Kudan]]></category>
		<category><![CDATA[KudanSLAM]]></category>
		<category><![CDATA[lidarslam]]></category>
		<category><![CDATA[machineperception]]></category>
		<category><![CDATA[mapping]]></category>
		<category><![CDATA[SLAM]]></category>
		<category><![CDATA[vslam]]></category>
		<guid isPermaLink="false">https://www.kudan.io/?p=413</guid>

					<description><![CDATA[<p>The technology industry is inundated with references to AI (artificial intelligence), ML (machine learning), DNNs (deep neural networks), CV (computer vision), CNNs (convolutional neural networks), RNNs (recurrent neural networks), etc. What these acronyms represent are some of the components that make up the field of Artificial Intelligence. Imagine an artificial being, and what it needs [&#8230;]</p>
<p>The post <a href="https://www.kudan.io/blog/lidar-simultaneous-localization-mapping-an-introduction/">Simultaneous Localization and Mapping (SLAM): An Introduction</a> first appeared on <a href="https://www.kudan.io">Kudan global</a>.</p>]]></description>
										<content:encoded><![CDATA[<p>The technology industry is inundated with references to AI (artificial intelligence), ML (machine learning), DNNs (deep neural networks), CV (computer vision), CNNs (convolutional neural networks), RNNs (recurrent neural networks), etc.</p>
<p><img loading="lazy" class="size-full wp-image-415 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/AI-Technology-Map.png?resize=290%2C316&#038;ssl=1" alt="" width="290" height="316" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/AI-Technology-Map.png?w=290&amp;ssl=1 290w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/AI-Technology-Map.png?resize=275%2C300&amp;ssl=1 275w" sizes="(max-width: 290px) 100vw, 290px" data-recalc-dims="1" /></p>
<p>What these acronyms represent are some of the components that make up the field of Artificial Intelligence. Imagine an artificial being, and what it needs to successfully interact with the world around it &#8211; the ability to sense and perceive its environment (machine perception), the ability to understand speech (natural language processing), the ability to remember information, learn new things, and make inferences (machine learning, knowledge management and reasoning), the ability to plan and execute actions (automated planning), and the ability to interact with its environment (robotics).</p>
<p>Machine perception encompasses the capabilities enabling machines to understand input from the five senses &#8211; visual, auditory, tactile, olfactory, and gustatory. (Yes, there really are machines that analyze smell and taste.)</p>
<p><strong>Computer Vision and SLAM</strong></p>
<p>Buried among these acronyms, you may have come across references to computer vision and SLAM. Let’s dive into the arena of computer vision and where SLAM fits in. There are a number of different flavors of SLAM, such as topological, semantic and various hybrid approaches, but we’ll start with an illustration of metric SLAM.</p>
<p><img loading="lazy" class="size-full wp-image-417 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/CV-Technology-Hierarchy.png?resize=550%2C358&#038;ssl=1" alt="" width="550" height="358" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/CV-Technology-Hierarchy.png?w=550&amp;ssl=1 550w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/CV-Technology-Hierarchy.png?resize=300%2C195&amp;ssl=1 300w" sizes="(max-width: 550px) 100vw, 550px" data-recalc-dims="1" /></p>
<p>As the name suggests (intuitively or not), Simultaneous Localization and Mapping is the capability for a machine agent to sense and create (and constantly update) a representation of its surrounding environment (this is the mapping part), and understand its position and orientation within that environment (this is the localization part). Most humans do this well enough without much effort, but trying to get a computer to do this is another matter.</p>
<p>There are many types of sensors that can detect the surrounding environment, including camera(s), <a href="https://en.wikipedia.org/wiki/Lidar" target="_blank" rel="noopener noreferrer">Lidar</a>, <a href="https://en.wikipedia.org/wiki/Radar" target="_blank" rel="noopener noreferrer">radar</a>, and <a href="https://en.wikipedia.org/wiki/Sonar" target="_blank" rel="noopener noreferrer">sonar</a>. As the machine agent carrying such sensors moves through space, a snapshot of the environment is created, while the relative position of the machine agent within that space is tracked. Thus, a picture is formed from features represented as points in space, including their distances relative to the observer and to each other.</p>
<p><img loading="lazy" class="size-full wp-image-416 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/Camera-Moving-Through-Space.png?resize=347%2C222&#038;ssl=1" alt="" width="347" height="222" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/Camera-Moving-Through-Space.png?w=347&amp;ssl=1 347w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/Camera-Moving-Through-Space.png?resize=300%2C192&amp;ssl=1 300w" sizes="(max-width: 347px) 100vw, 347px" data-recalc-dims="1" /></p>
<p>Over time, this collection of feature points and their registered position in space grow together to form a point cloud, a 3-dimensional representation of the environment. This is the “mapping” part in SLAM.</p>
<p>As the map is being created, the machine agent tracks its relative position and orientation within that point cloud, enabling the “localization” part in SLAM. Once a map is available, any arbitrary machine agent using the map would be able to “relocalize” within that space &#8211; i.e., determine its location on the map from what it perceives around it.</p>
<div id="attachment_420" style="width: 810px" class="wp-caption aligncenter"><img aria-describedby="caption-attachment-420" loading="lazy" class="wp-image-420" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/スクリーンショット-2020-08-20-13.45.56-1024x529.png?resize=800%2C413&#038;ssl=1" alt="" width="800" height="413" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/スクリーンショット-2020-08-20-13.45.56.png?resize=1024%2C529&amp;ssl=1 1024w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/スクリーンショット-2020-08-20-13.45.56.png?resize=300%2C155&amp;ssl=1 300w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/スクリーンショット-2020-08-20-13.45.56.png?resize=768%2C397&amp;ssl=1 768w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/スクリーンショット-2020-08-20-13.45.56.png?w=1490&amp;ssl=1 1490w" sizes="(max-width: 800px) 100vw, 800px" data-recalc-dims="1" /><p id="caption-attachment-420" class="wp-caption-text">Point cloud model created using Lidar scans</p></div>
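<p>The registration step behind such a point cloud can be sketched in a few lines: each new scan is transformed from the sensor’s frame into the shared world frame using the current pose estimate, and the transformed points are appended to the map (the pose values below are illustrative):</p>

```python
import numpy as np

def to_world(points_cam, R, t):
    """Transform points from the sensor frame into the world frame using the
    sensor's pose (rotation R, translation t): p_world = R @ p_cam + t."""
    return points_cam @ R.T + t

# As the agent moves, each scan is registered into one shared map
cloud = []
yaw = np.pi / 2                               # agent has turned 90 degrees left
R = np.array([[np.cos(yaw), -np.sin(yaw), 0],
              [np.sin(yaw),  np.cos(yaw), 0],
              [0,            0,           1]])
scan = np.array([[1.0, 0.0, 0.0]])            # a point 1 m ahead of the sensor
cloud.extend(to_world(scan, R, t=np.array([2.0, 0.0, 0.0])))
# the point lands at (2, 1, 0) in the world map
```

<p>The hard part of SLAM is, of course, estimating R and t in the first place; once the pose is known, growing the map is just this transformation applied to every new measurement.</p>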
<p><strong>Sounds simple enough</strong></p>
<p>That doesn’t sound too hard, but let’s think about this from a processing perspective.</p>
<p>Let’s assume that we have a stereo camera system performing feature-based visual SLAM.</p>
<p><img loading="lazy" class="size-full wp-image-418 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/Feature-based-VSLAM-Process.png?resize=700%2C72&#038;ssl=1" alt="" width="700" height="72" srcset="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/Feature-based-VSLAM-Process.png?w=700&amp;ssl=1 700w, https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/Feature-based-VSLAM-Process.png?resize=300%2C31&amp;ssl=1 300w" sizes="(max-width: 700px) 100vw, 700px" data-recalc-dims="1" /></p>
<p>Now imagine a machine agent needing to keep track of hundreds or thousands of points, each with a level of error and drift that needs to be tracked and corrected, while the cameras continue to deliver 30 (or more) frames per second. With each frame, the agent estimates depth or distance based on the disparity between the stereo camera images, looks for features within the image, matches them to previously tracked features, checks whether a loop closure can be performed, adds new features that are captured, and localizes the agent’s new position with respect to all the tracked features.</p>
<p>The figure below shows SLAM in action with the feature points highlighted in the central image as the video is captured, and the view of the map being constructed with those feature points in the top left corner along with the camera trajectory.</p>
<p><img loading="lazy" class="size-full wp-image-419 aligncenter" src="https://i0.wp.com/www.kudan.io/wp-content/uploads/2020/08/KudanSLAM.gif?resize=320%2C175&#038;ssl=1" alt="" width="320" height="175" data-recalc-dims="1" /></p>
<p>As you can imagine, this quickly becomes an optimization and approximation exercise for SLAM to run in real-time, or near real-time. Many of the initial applications of SLAM revolved around autonomous vehicles and autonomous robots, where the ability to navigate in an unfamiliar environment, while avoiding obstacles and collisions, in real-time was a critical requirement. As the series continues, I’ll explore various aspects of SLAM, such as Kalman filters, loop closure, bundle adjustment, etc., and delve into what Kudan does with our approach to SLAM. We will also dive into use cases and applications of SLAM and its challenges.</p>
<p>For further reading:</p>
<p>These are some of the early seminal works that defined SLAM in the 1980’s and 90’s.</p>
<p>Smith, R.C.; Cheeseman, P. (1986). &#8220;On the Representation and Estimation of Spatial Uncertainty&#8221; (<a href="https://frc.ri.cmu.edu/~hpm/project.archive/reference.file/Smith&amp;Cheeseman.pdf" target="_blank" rel="noopener noreferrer">PDF</a>).</p>
<p>Smith, R.C.; Self, M.; Cheeseman, P. (1986). &#8220;Estimating Uncertain Spatial Relationships in Robotics&#8221; (<a href="https://web.archive.org/web/20100702155505/http://www-robotics.usc.edu/~maja/teaching/cs584/papers/smith90stochastic.pdf" target="_blank" rel="noopener noreferrer">PDF</a>).</p>
<p>Leonard, J.J.; Durrant-Whyte, H.F. (1991). &#8220;Simultaneous map building and localization for an autonomous mobile robot&#8221; (<a href="https://marinerobotics.mit.edu/sites/default/files/Leonard91iros.pdf" target="_blank" rel="noopener noreferrer">PDF</a>).</p><p>The post <a href="https://www.kudan.io/blog/lidar-simultaneous-localization-mapping-an-introduction/">Simultaneous Localization and Mapping (SLAM): An Introduction</a> first appeared on <a href="https://www.kudan.io">Kudan global</a>.</p>]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">413</post-id>	</item>
	</channel>
</rss>
