2Fast-2Lamaa: Large-scale lidar-inertial localization and mapping with continuous distance fields

Abstract

This paper introduces 2Fast-2Lamaa, a lidar-inertial state estimation framework for odometry, mapping, and localization. Its first key component is the optimization-based undistortion of lidar scans, which uses continuous IMU preintegration to model the system’s pose at every lidar point timestamp. The continuous trajectory over 100–200 ms is parameterized only by the initial scan conditions (linear velocity and gravity orientation) and IMU biases, yielding eleven state variables. These are estimated by minimizing point-to-line and point-to-plane distances between lidar-extracted features without relying on previous estimates, resulting in a prior-less motion-distortion correction strategy. Because the method performs local state estimation, it directly provides scan-to-scan odometry. To maintain geometric consistency over longer periods, undistorted scans are used for scan-to-map registration. The map representation employs Gaussian Processes to form a continuous distance field, enabling point-to-surface distance queries anywhere in space. Poses of the undistorted scans are refined by minimizing these distances through non-linear least-squares optimization. For odometry and mapping, the map is built incrementally in real time; for pure localization, existing maps are reused. The incremental map construction also includes mechanisms for removing dynamic objects. We benchmark 2Fast-2Lamaa on over 750 km of public and self-collected datasets from both automotive and handheld systems. The framework achieves state-of-the-art performance across diverse and challenging scenarios, reaching odometry and localization errors as low as 0.22% and 0.06 m, respectively. The real-time implementation is publicly available at https://github.com/clegenti/2fast2lamaa.

Keywords

localization mapping state estimation lidar-inertial navigation

1. Introduction

Over the past decades, the robotics community has devoted considerable effort to solving odometry, that is, estimating a system’s ego-motion in unknown environments. In our previous work (Le Gentil et al., 2025), we showed that modern sensing capabilities provide highly accurate odometry even in challenging automotive scenarios. For the majority of real-world robot deployments, however, odometry alone is not sufficient to estimate the system’s state. In applications such as self-driving vehicles or autonomous conveying, the robot is expected to navigate between known locations in a previously mapped environment. In this context, the critical enabling component for autonomous operations is localization, not odometry. The latter is simply used as an initial guess/prior between localization steps. In this paper, we present 2Fast-2Lamaa, which stands for Fast Field-based Agent-subtracted Truly coupled Lidar Localization and Mapping with Accelerometer and Angular-rate. This convoluted acronym refers to a lidar-inertial framework that addresses both mapping and localization by tightly coupling lidar and inertial measurement unit (IMU) measurements and leveraging continuous distance fields to represent the environment.

The first step of most lidar-based systems is motion-distortion correction. Commonly used lidars do not capture instantaneous snapshots of the environment; instead, they sweep one or more laser beams through the scene to collect 3D scans. Accordingly, any motion of the sensor during a scan’s duration creates motion distortion in the data. Many lidar-based state estimation algorithms leverage data from an IMU to address this issue (Lee et al., 2024). The most naive approach consists of using the latest pose and velocity estimates and integrating the inertial measurements to approximate the system’s trajectory for the duration of the incoming scan. With that movement prediction, the scan can be undistorted before being used in a non-linear optimization for scan-to-scan or scan-to-map rigid registration, alongside inertial constraints between consecutive scans. While this approach has been qualified as ‘tightly-coupled’ lidar-inertial estimation, for example, by Ye et al. (2019), this undistortion strategy is a ‘one-off’ operation that can be thought of as an ‘open-loop’ process. Because it freezes the scans based on prior information, it decouples the problem of lidar data undistortion from the estimation of scan-to-scan motion. This can lead to unrecoverable errors if the prior state estimate is not accurate enough due to accumulated drift or outlier data. Other approaches to lidar-inertial state estimation consider the lidar points individually through some continuous motion representation. There, the IMU measurements can be used as residuals to constrain the continuous state (Talbot et al., 2025) or directly used to locally parameterize the trajectory (Le Gentil et al., 2018). These methods estimate the system’s trajectory at the same time as correcting motion distortion in a truly coupled manner. 
Building upon our previous work (Le Gentil et al., 2024a), 2Fast-2Lamaa falls into the latter category by fully characterizing the trajectory during a scan based on IMU measurements without the need for any explicit motion model.

Once a scan is undistorted, 2Fast-2Lamaa performs localization by estimating a rigid transformation through scan-to-map registration. Note that when used for odometry or mapping, 2Fast-2Lamaa builds the map incrementally, whereas it leverages a previously built map when performing pure localization. A key component of a localization framework is the choice of map representation. The majority of lidar-based state estimation frameworks model the environment with point clouds (with or without normal vectors). In such a case, scan registration is generally performed with a variant of the iterative closest point (ICP) algorithm (Besl and McKay, 1992). 2Fast-2Lamaa differs from this paradigm as it leverages distance fields. These are continuous functions that can be queried at any location and that return an approximation of the Euclidean distance to the closest object/surface in the environment. Accordingly, the registration process consists of directly minimizing distance queries in a non-linear least-squares formulation, neither requiring explicit geometric primitives to model the environment nor any data association steps. The distance field presented in this work is based on Gaussian process (GP) regression (Rasmussen and Williams, 2006), similarly to Le Gentil et al. (2024b). While the concept of a GP-based distance field is not new, 2Fast-2Lamaa is the first real-time state estimation framework that successfully builds upon this idea for large-scale operations (sequences over 10 km-long), as illustrated in Figure 1. Note that this integration is not trivial, as standard GP regression suffers from cubic computational complexity. By leveraging efficient data structures and computing GPs locally, 2Fast-2Lamaa’s novel distance field approximation displays a log(N) computational complexity. The efficient data structures also enable real-time dynamic object removal through simple ray tracing.

Figure 1.

2Fast-2Lamaa performs odometry, mapping, and localization over large-scale environments. It relies on GP-based distance fields for scan-to-map registration. Thanks to efficient data structures, the map can contain details of the environment’s geometry while allowing large-scale operations. The images here are visualizations of the map created with an 8 km-long Suburbs sequence from the Boreas-RT dataset. Despite the length of the trajectory, the map can represent all the observed geometry (a), without sacrificing details (b).

Regardless of the choice of geometric representation (point cloud, mesh, distance field, etc.), mapping and localization frameworks can opt for different high-level mapping strategies. The most natural option is to use a single globally consistent map. However, it has been demonstrated that accurate localization does not require geometrically consistent global maps (Baumgartner and Skaar, 1994; Brooks, 1987). Using a topometric approach that only requires local geometric consistency, Furgale and Barfoot (2010) perform navigation by moving in a set of topologically connected submaps, alleviating the need for perfect odometry/trajectory estimation when building the map. This approach is especially suited for Teach and Repeat tasks, where the robot is manually piloted to map the environment (teach/mapping) before performing autonomous path tracking to follow the original trajectory (repeat/localization). Our proposed framework can operate with either a globally consistent map or in a topometric manner, utilizing a succession of submaps to localize the robot along repeated trajectories.

As any odometry pipeline is bound to drift, building maps purely on odometry estimates leads to inconsistencies at the global scale when considering trajectories that include loops that revisit previously explored areas. While not the core focus of the present work, 2Fast-2Lamaa can also be used for globally consistent trajectory estimation by performing loop-closure detection and batch pose-graph optimization, similarly to simultaneous localisation and mapping (SLAM) frameworks. This process is performed online when running odometry with submaps (topometric mapping mode). The proposed loop-closure detection and correction pipeline relies on the projection of each submap into 2D image-like data structures that represent the environment’s elevation changes. Similar to the work by Giubilato et al. (2022), visual features are extracted from these image-like structures, matched to detect loop-closures, and then used to provide the associated rough SE(2) geometric transformation between submaps. After SE(3) submap-to-submap pose refinement with the aforementioned GP-based distance fields, the global pose of each submap is estimated in a batch pose-graph optimization.

To summarise, our contributions are:

• The integration of truly coupled lidar-inertial motion-distortion correction in a localization and mapping framework named 2Fast-2Lamaa.

• The development of an efficient GP-based distance field for large-scale online mapping in the presence of dynamic objects.

• The integration of an online loop-closure detection and correction mechanism to extend 2Fast-2Lamaa’s capabilities beyond odometry and incremental mapping.

• An extensive odometry and localization evaluation of 2Fast-2Lamaa with more than 750 km of automotive and handheld data.

• An open-source real-time implementation.

2. Related work

2.1. Lidar state estimation

There are many different approaches to state estimation in the robotics literature. In this section, we mainly consider optimization-based lidar(-inertial) frameworks. The majority of such methods can be classified into three distinct state estimation categories illustrated in Figure 2: (a) discrete, (b) continuous, and (c) hybrid. As mentioned in the introduction, motion distortion is addressed differently depending on the state representation. The first one (a) dissociates the act of undistortion from state estimation. Thus, motion correction is an open-loop process that preprocesses the scans based on previous estimates, and state estimation is discretely solved at the starting timestamp of each scan. With (b) and (c), state estimation and motion distortion are coupled in a single problem that estimates the continuous trajectory of the sensor (at least locally). Fully continuous approaches (b) generally assume a certain motion model to represent the sensor’s movement with a continuous function. IMU measurements can be used as residuals or control inputs of the system’s trajectory. Hybrid approaches (c) attempt to leverage the best of both (a) and (b) by locally parameterizing the trajectory based on inertial information, while keeping discrete variables at the global scale for simplicity and computational efficiency. The rest of this subsection provides a brief literature review of the three types of methods.

Figure 2.

Most optimization-based lidar state estimation frameworks can be classified into three categories. (a) represents discrete-time estimation where the system’s pose is estimated at a finite set of timestamps. The trajectory between timestamps is not estimated. (b) leverages a function over the whole duration of operations to represent the motion continuously. Such a method generally relies on some motion model and is parameterized by a set of supporting points. (c) shows a hybrid approach that locally characterizes the trajectory continuously based on inertial data, but uses discrete state variables at the global scale. This approach does not require an explicit motion model, but the global trajectory shows discontinuities when switching to a new timestamp. 2Fast-2Lamaa is built on the latter paradigm.

Considered as state-of-the-art odometry frameworks, Fast-LIO2 (Xu et al., 2022) and DLIO (Chen et al., 2023) are examples of one-off undistortion processes based on discrete-state estimation. The former uses an iterated Kalman filter, with the core registration step consisting of optimizing point-to-plane residuals between the open-loop-undistorted scans and a global map built incrementally. DLIO introduces a more precise way to undistort the point clouds using a constant-jerk motion model for continuous integration of the IMU measurements. After motion-distortion correction, DLIO performs a variant of Generalized-ICP (Segal et al., 2009) and uses a ‘hierarchical geometric observer’ (Lopez, 2023) to estimate the system’s pose efficiently. Note that DLIO and Fast-LIO2 only optimize for the last sensor pose. Other works leverage a similar undistortion process but optimize over a window of poses in a factor-graph formulation (Shan et al., 2020; Ye et al., 2019). Note that the aforementioned frameworks require sufficient geometric features in the system’s surroundings. To address geometrically degenerated environments, COIN-LIO (Pfreundschuh et al., 2024) proposes to leverage intensity information from the lidar data to extract distinctive features even on flat surfaces. More discrete-state lidar-inertial pipelines are discussed in a recent survey paper (Lee et al., 2024).

As shown in another recent survey paper (Talbot et al., 2025), there is a wide variety of continuous-time state formulations for robotics. Some of the early lidar(-inertial) work built upon continuous-time representations that make strong assumptions about the motion. For example, Zebedee (Bosse et al., 2012) uses piece-wise linear functions to model the system’s trajectory. This corresponds to a strict assumption of constant velocity between control points. Using inertial residuals, it demonstrated the ability to map a 3D environment using a randomly oscillating 2D lidar. Later, LOAM (Zhang and Singh, 2014) used the same constant velocity assumption with both actuated 2D and 3D lidars. This assumption is still used in recent frameworks like CT-LIO (Dellenbach et al., 2022) or MOLA (Blanco-Claraco, 2025). A key contribution of LOAM was the introduction of planar and edge features extracted from the raw lidar data for efficient registration. This concept has been reused in many works, including LeGo-LOAM (Shan and Englot, 2018) and the present work (using a different extraction method).

Throughout the years, other continuous representations have been used to model more complex lidar trajectories. A recurrent formulation is based on splines (Cao et al., 2025; Droeschel and Behnke, 2018). The benefit of splines with respect to discrete state variables has been demonstrated in the context of vision-based SLAM by Cioffi et al. (2022), at the expense of a higher computational cost. Another branch of continuous-time estimation is based on a particular class of sparse GPs (Barfoot, 2024). This approach elegantly models the trajectory based on a probabilistic motion model. For both lidars and radars, continuous GP states with a white-noise-on-acceleration motion prior have demonstrated high levels of accuracy and real-time computation (Burnett et al., 2025a). Note that different motion priors are possible, as demonstrated by Tang et al. (2019) with a white-noise-on-jerk prior. Inertial data can also be used as control inputs (Burnett et al., 2025b; Lilge and Barfoot, 2025).

The hybrid approach shown in Figure 2(c) relies on IMU measurements to locally represent the system’s trajectory. An issue with inertial-based motion prediction/characterization is the strong dependence on initial conditions and biases when integrating and double-integrating the gyroscope and accelerometer measurements. To prevent the reintegration of the IMU data every time the initial conditions are updated during the state optimization process, Lupton and Sukkarieh (2012) introduced the concept of preintegration. It corresponds to the creation of pseudo-measurements that combine the information from multiple IMU measurements without the need to know the initial conditions. Numerous works have improved on the original preintegration method (Eckenhoff et al., 2019; Forster et al., 2017; Yang et al., 2020). The inertial chapter of the SLAM Handbook provides an overview and comparison of several of these works (Huang et al., 2026). Note that, unlike the original preintegration use case, the hybrid state estimation approach does not attempt to combine IMU measurements into a small number of pseudo-measurements but rather requires an ‘upsampling’ of the inertial information for it to be available at any timestamp. This approach was originally introduced to address the issue of lidar-IMU extrinsic calibration by first upsampling the raw inertial readings before performing preintegration at a higher frequency (Le Gentil et al., 2018). Later approaches elegantly addressed preintegration under the scope of continuous state representation (Le Gentil et al., 2020b; Le Gentil and Vidal-Calleja, 2023), enabling full-batch lidar-inertial localization and mapping (Le Gentil et al., 2021). As this hybrid approach is well-suited to asynchronous inertial-aided estimation, other works with event cameras (Le Gentil et al., 2020a; Li et al., 2024) and radars (Hatleskog et al., 2025) have adopted preintegration as their state representation. 
2Fast-2Lamaa also uses continuous preintegration to characterize the system’s motion during each lidar scan. The corresponding discrete state consists of the gravity vector orientation, the linear velocity at the beginning of the scan, and the IMU biases, resulting in only 11 degrees of freedom (DoF) to optimize (2 for the gravity orientation, 3 for the velocity, and 6 for the two biases).

2.2. Map representation

Traditionally, robotic maps consist of geometric landmarks jointly estimated with the system’s pose (Dissanayake et al., 2001). Over the past decades, we have seen a constant increase in sensor bandwidth and computational power. Naturally, the robotics community has harnessed these new hardware capabilities with algorithms that build and use denser and denser maps of the environment. Nowadays, the most common representation for lidar state estimation consists of voxel maps where each cell stores the centroid and normal vector of all the points that fell within the cell. Fast-LIO2 (Xu et al., 2022) is an example of such methods, leveraging point-to-plane registration constraints. Other works, such as KISS-ICP (Vizzo et al., 2023) and KISS-SLAM (Guadagnino et al., 2025), chose to use point-to-point residuals, but suffer from significantly lower performance on benchmarks such as the Newer College dataset (Ramezani et al., 2020). Other frameworks keep more information in each cell, for example, the local distribution of the points with normal distributions (Biber and Strasser, 2003; Magnusson, 2009). Another way to model the environment is to use sets of surface primitives. Surfels might be the simplest of these primitives and have been used in numerous lidar frameworks (Bosse et al., 2012; Droeschel and Behnke, 2018; Park et al., 2018). Some recent works deal with more complex primitives by directly building and using a mesh of the environment (Lin et al., 2023; Ruan et al., 2023).

The aforementioned representations (point clouds, surfels, meshes, etc.) store information about the environment’s surface. While memory-efficient, they contain limited information about the rest of the space explored by the system. Volumetric approaches, on the other hand, store information over the whole space and can track occupancy, truncated signed distance field (TSDF), etc. These are less explored for lidar-based estimation, but are crucial for autonomous navigation (e.g. to plan safe trajectories through the environment). Voxblox (Oleynikova et al., 2017) and FIESTA (Han et al., 2019) are prime examples of such mapping techniques. Leveraging advances in computer graphics, many mapping frameworks have been built atop the OpenVDB structure (Museth, 2013; Museth et al., 2013). Some fuse information from multiple scans at the TSDF level (Vizzo et al., 2022), whereas others use the occupancy (Zhu et al., 2021), or the Euclidean distance field (Wu et al., 2025). Focusing on state estimation, VoxGraph (Reijgwart et al., 2020) extends Voxblox (Oleynikova et al., 2017) and achieves real-time performance. It leverages SDF submaps to compute submap-to-submap pose constraints, which are then integrated into a pose-graph optimization framework. More specific to lidar systems, D-LIO (Coto-Elena et al., 2026) uses a fast truncated distance field stored in a multi-level hashmap to perform scan-to-map registration. While the authors claim scalability to large-scale environments, their experiments on the VBR dataset (Brizi et al., 2024) show a large RAM requirement that cannot be fulfilled without removing information from memory. Another interesting work is shown by Boche et al. (2025) with OKVIS2-X, where vision-based estimation can be complemented with lidar-to-occupancy-map factors (Boche et al., 2024) using the Supereight2 data structure (Funk et al., 2021). Focusing on exploration, Schmid et al. (2021) also leverage volumetric mapping with a submap-based strategy to account for large state estimate drift.

With the democratization of GPU computing, many robotics works have explored the use of neural representations and Gaussian splatting for map representation. Point-SLAM (Sandström et al., 2023) and Gs-icp SLAM (Ha et al., 2024) are examples of both approaches for SLAM with RGB-D sensors. DeepSDF (Park et al., 2019) proposes an auto-decoder architecture to learn and infer the signed distance function at object-scale. Later, iSDF (Ortiz et al., 2022) demonstrates the ability to model and learn such a field online for room-scale environments. While less popular, neural representations can be used for lidar-based state estimation. LocNDF (Wiesmann et al., 2023) addresses the localization problem with a neural field close to a TSDF, but cannot be run at sensor framerate, even in small-scale environments, while Nerf-LOAM (Deng et al., 2023) performs odometry and mapping in large-scale environments. More recently, PIN-SLAM (Pan et al., 2024) showcases top performance on the VBR dataset (Brizi et al., 2024) via TSDF modelling with real-time operations. PINGS (Pan et al., 2025) builds upon PIN-SLAM by coupling Gaussian splats with neural distance fields for improved rendering capacities using both lidar and vision-based sensing.

2Fast-2Lamaa follows a line of work that mixes both surface-based and volumetric information. Similar to the former, it only stores information on the surface of elements in the environment, but allows for the query of a Euclidean distance field approximation over the whole space without large memory requirements. This trend originated from the goal of surface reconstruction with GP implicit surfaces (Williams and Fitzgibbon, 2006), which models a signed distance field close to the objects’ surface. Later, Wu et al. (2021) show the ability to approximate the distance to the closest surface anywhere in space by applying a non-linear operation over a GP-inferred field. It is demonstrated with offline experiments that such GP-based approaches enable lidar state estimation and planning (Wu et al., 2023). Le Gentil et al. (2024b) significantly improve the distance approximation while alleviating the original tradeoff between accuracy and surface interpolation. It represents the foundation of the fast distance field derived in the present work, which does not consider a single and inefficient GP, but breaks down the distance field modelling into many small local GPs. Consequently, our new field formulation enables registration in kilometre-long maps in real time. In contrast, the original work by Le Gentil et al. (2024b) can barely run object-level odometry at 2 Hz.
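To make the idea of a non-linear operation applied to a GP-inferred field concrete, the following minimal sketch regresses a field that equals 1 on the surface with a squared-exponential kernel, then inverts the kernel on the inferred value to obtain a distance estimate. It is one simple construction in this spirit, under stated assumptions: the kernel choice, the inversion formula, the single global GP, and all function names (`se_kernel`, `fit`, `distance`) are illustrative and are not taken from 2Fast-2Lamaa or the cited works.

```python
import math

def se_kernel(a, b, length):
    """Squared-exponential kernel between two points (assumed kernel choice)."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit(surface_points, length, noise=1e-6):
    """GP weights alpha = (K + noise * I)^-1 * 1 for a field equal to 1 on the surface."""
    n = len(surface_points)
    K = [[se_kernel(surface_points[i], surface_points[j], length)
          + (noise if i == j else 0.0) for j in range(n)] for i in range(n)]
    return solve(K, [1.0] * n)

def distance(query, surface_points, alpha, length):
    """Apply the inverse of the kernel to the GP mean to estimate a distance."""
    o = sum(a * se_kernel(query, p, length) for a, p in zip(alpha, surface_points))
    o = min(max(o, 1e-300), 1.0)  # keep the logarithm well-defined
    return math.sqrt(-2.0 * length ** 2 * math.log(o))
```

With a single surface point and negligible noise, the inferred field is the kernel itself, so the inversion returns the Euclidean distance to that point. The linear solve in `fit` is the step with cubic cost in the number of surface points, which is the scaling issue that motivates splitting the field into many small local GPs.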

2.3. Dynamic object rejection

A common assumption of many robotic state estimation algorithms is to consider the system’s environment to be static. This assumption is very rarely verified in real-world applications. While robust techniques like RANSAC or m-estimators enable a robot to deal with a certain level of dynamicity in the scene (considering dynamic object points as outliers), there has been a growing interest in performing dynamic object detection to create maps that only contain static elements (Duberg et al., 2024; Falque et al., 2023; Jia et al., 2024a; Schmid et al., 2023; Wu et al., 2024; Yoon et al., 2019). However, most methods that focus on state estimation in dynamic environments first classify the points before performing standard scan registration (Pfreundschuh et al., 2021), and optionally tracking objects (Jia et al., 2024b). Other methods simply detect dynamic objects after lidar data registration (Le Gentil et al., 2024a; Lichtenfeld et al., 2024), relying on robust estimators to ‘ignore’ dynamic objects during the scan-alignment process. An exception to these two-step approaches is BTSA (Chen et al., 2025), which introduces a dynamic-aware ICP algorithm that couples the problems of dynamic point detection and state estimation in a single process. Another method, HiMo (Zhang et al., 2025), does not address the problem of ego-motion distortion, but tackles the moving object point cloud undistortion through scene flow estimation. We refer the reader to the relevant chapter of the SLAM Handbook (Schmid et al., 2026) for a deeper dive into state estimation in dynamic and deformable environments.

As the problem of dynamic object detection is not the core focus of 2Fast-2Lamaa, the proposed optimization-based motion-distortion correction step relies on robust loss functions to consider dynamic objects as outlier information. However, after scan-to-map registration, dynamic points can be removed from the map in an online or offline manner through some sort of ray tracing, similarly to the work from Pomerleau et al. (2014). In our ablation study, we evaluate the impact of the presence of dynamic points in the map on localization.

3. Framework overview

Let us consider a 6-DoF IMU (3-axis gyroscope and 3-axis accelerometer) and a 3D lidar rigidly mounted together. The lidar collects 3D points denoted as $x_{L}^{j}$. The gyroscope measures the IMU angular velocity $ω$, and the accelerometer the proper acceleration $f$. The homogeneous transformation $T_{I}^{L}$ represents the extrinsic calibration between the two sensors. We aim to estimate the pose of the IMU $T_{W}^{I_{t}}$ at time $t$ with respect to a world frame $F_{W}$, with or without a prior map of the environment. The timestamp of the start of a lidar scan is denoted $τ_{i}$, and $F_{I_{t}}$ refers to the IMU frame at time $t$. As shown in Figure 3, the proposed framework consists of two main components: one module for tightly-coupled lidar-inertial motion-distortion correction and a second one for scan-to-map localization.

Figure 3.

2Fast-2Lamaa consists of two functional blocks. The first one is an optimization-based motion correction module that undistorts lidar scans using continuous IMU preintegration to characterize the system motion with only 11 DoF. Once corrected, the scans are used for scan-to-map registration to estimate the global pose of the system.

To undistort lidar scans, 2Fast-2Lamaa first extracts lidar features from the raw scans and performs continuous preintegration with the IMU data. Using continuous IMU preintegration, the system’s trajectory is parameterized by the gravity vector $g_{τ_{i}}$ in $F_{I_{τ_{i}}}$, the velocity $v_{I_{τ_{i}}}^{τ_{i}}$ at time $τ_{i}$ in $F_{I_{τ_{i}}}$, and both the accelerometer and gyroscope biases $b_{f}^{i}$ and $b_{ω}^{i}$ (considered constant for the duration of the scan). After associating features together, the motion during the lidar scans is estimated by minimizing lidar feature distances (point-to-plane and point-to-line, depending on the feature’s nature) in a non-linear least-squares formulation. The output of this module consists of motion-corrected lidar point clouds $U_{i}$ and associated scan-to-scan incremental motion ${\tilde{T}}_{I_{τ_{i}}}^{I_{τ_{i + 1}}}$.

For trajectory estimation, the localization module integrates the aforementioned incremental motion estimates and refines the global pose ${\tilde{T}}_{W}^{I_{τ_{i}}}$ of $U_{i}$ with scan-to-map registration. The proposed framework can build the map $M$ incrementally (mapping mode) or directly leverage an existing map of the environment (localization mode). Using GP regression, the map provides an approximation of the scene’s Euclidean distance field. With the ability to efficiently query the distance d(x) to the closest surface for any $x \in R^{3}$ , the registration is the direct minimization of d for a subset of points from $U_{i}$ . Once an undistorted scan is registered to the map, it can be used to update the underlying data structure: extending the area covered by $M$ or removing elements that are no longer present in the environment (such as dynamic objects or structural changes). Note that for repeating trajectories, both localization and mapping can be performed in a topometric manner by considering a chain of connected submaps instead of a unique global map. When mapping the environment with submaps, an optional online loop-closure detection and correction mechanism, presented in Section 7, can be used for globally consistent trajectory estimation.
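The registration principle described above, minimizing distance-field queries directly over the pose, can be sketched on a toy 2D problem. The sketch below is an illustration under stated assumptions and not the 2Fast-2Lamaa implementation: an analytic signed distance to two circles stands in for the GP-based field, the pose is a planar (theta, tx, ty), the solver is a plain Gauss-Newton loop without robust loss, and the names `CIRCLES`, `distance_and_gradient`, `solve3` and `register` are hypothetical.

```python
import math

# Hypothetical toy map: two circles given as (centre_x, centre_y, radius).
CIRCLES = [(0.0, 0.0, 1.0), (3.0, 0.0, 0.5)]

def distance_and_gradient(q):
    """Signed distance from q to the closest circle, and its spatial gradient."""
    best = None
    for cx, cy, radius in CIRCLES:
        dx, dy = q[0] - cx, q[1] - cy
        norm = math.hypot(dx, dy)
        d = norm - radius
        if best is None or abs(d) < abs(best[0]):
            best = (d, dx / norm, dy / norm)
    return best

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= factor * M[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (M[i][3] - sum(M[i][j] * x[j] for j in range(i + 1, 3))) / M[i][i]
    return x

def register(scan, pose=(0.0, 0.0, 0.0), iterations=10):
    """Gauss-Newton on sum_j d(T x_j)^2 over the planar pose (theta, tx, ty)."""
    theta, tx, ty = pose
    for _ in range(iterations):
        c, s = math.cos(theta), math.sin(theta)
        H = [[0.0] * 3 for _ in range(3)]
        g = [0.0] * 3
        for px, py in scan:
            rx, ry = c * px - s * py, s * px + c * py  # rotated scan point
            d, nx, ny = distance_and_gradient((rx + tx, ry + ty))
            # Jacobian of the distance query w.r.t. (theta, tx, ty).
            J = [nx * (-ry) + ny * rx, nx, ny]
            for a in range(3):
                g[a] += J[a] * d
                for b in range(3):
                    H[a][b] += J[a] * J[b]
        step = solve3(H, [-v for v in g])
        theta, tx, ty = theta + step[0], tx + step[1], ty + step[2]
    return theta, tx, ty
```

Calling `register(scan)` with scan points expressed in the sensor frame returns the pose that places them on the map surfaces. Each residual is a single field query, so no explicit point-to-point or point-to-plane correspondence is built by the solver.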

4. Undistortion-based odometry

The undistortion module in 2Fast-2Lamaa is based on previous work for map-less and initialization-free dynamic object detection (Le Gentil et al., 2024a). The main differences are a novel approach for feature extraction and a change in the definition of the temporal window used for motion estimation. 2Fast-2Lamaa offers a faster front-end and reduces the time delay in the state computation by estimating the motion for the last incoming lidar scan, not waiting for subsequent scans as originally done.

4.1. IMU preintegrated continuous state

To undistort the incoming lidar data, let us consider the lidar points and IMU readings that have been collected over a short period of time. For convenience and implementation efficiency, this temporal window is chosen to span two lidar scans (each defined as a full revolution for spinning sensors), with the previous scan from $τ_{i - 1}$ to $τ_{i}$ and the current scan from $τ_{i}$ to $τ_{i + 1}$. As formulated for SE(3) by Forster et al. (2017), the rotation, velocity and translation preintegrated measurements from time $τ_{i - 1}$, denoted $Δ R_{τ_{i - 1}}^{t}$, $Δ v_{τ_{i - 1}}^{t}$ and $Δ p_{τ_{i - 1}}^{t}$, correspond to

\begin{aligned} Δ R_{τ_{i - 1}}^{t} &= \prod_{τ_{i - 1}}^{t} {Exp (ω (s) - b_{ω} (s))}^{d s}, \\ Δ v_{τ_{i - 1}}^{t} &= \int_{τ_{i - 1}}^{t} Δ R_{τ_{i - 1}}^{s} (f (s) - b_{f} (s)) d s, \\ Δ p_{τ_{i - 1}}^{t} &= \int_{τ_{i - 1}}^{t} \int_{τ_{i - 1}}^{s} Δ R_{τ_{i - 1}}^{u} (f (u) - b_{f} (u)) d u d s, \end{aligned}

(1)

with $b_{ω}$ and $b_{f}$ slowly varying additive biases. Using the linear preintegrated measurements (LPMs) (Le Gentil and Vidal-Calleja, 2023) to compute (1) based on a piece-wise linear continuous representation of the inertial data, the IMU pose and velocity through time are defined as

\begin{aligned} T_{I_{τ_{i - 1}}}^{I_{t}} = [\begin{matrix} R_{I_{τ_{i - 1}}}^{t} & p_{I_{τ_{i - 1}}}^{I_{t}} \\ 0 & 1 \end{matrix}] \text{ with } R_{I_{τ_{i - 1}}}^{t} = Δ R_{τ_{i - 1}}^{t}, \\ v_{I_{τ_{i - 1}}}^{I_{t}} = v_{I_{τ_{i - 1}}}^{I_{τ_{i - 1}}} + (t - τ_{i - 1}) g_{τ_{i - 1}} + Δ v_{τ_{i - 1}}^{t}, \\ p_{I_{τ_{i - 1}}}^{I_{t}} = (t - τ_{i - 1}) v_{I_{τ_{i - 1}}}^{I_{τ_{i - 1}}} + \frac{{(t - τ_{i - 1})}^{2}}{2} g_{τ_{i - 1}} + Δ p_{τ_{i - 1}}^{t} . \end{aligned}

(2)
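As a minimal sketch of the velocity and position lines of (2), the snippet below evaluates both quantities for a given time offset. The function name `propagate` and the stationary test case are illustrative assumptions, not part of the released implementation; the test case assumes a bias-free accelerometer that measures the specific force −g when at rest, which is the convention implied by (2).

```python
def propagate(v0, g, dv, dp, dt):
    """Evaluate the velocity and position of (2) at time offset dt.

    v0 and g are the initial velocity and gravity vector, both expressed in
    the frame at the start of the window; dv and dp are the preintegrated
    velocity and position measurements at the same time offset.
    """
    v = [v0[k] + dt * g[k] + dv[k] for k in range(3)]
    p = [dt * v0[k] + 0.5 * dt * dt * g[k] + dp[k] for k in range(3)]
    return v, p


# Sanity check (assumed scenario): a stationary IMU without rotation measures
# f = -g, so dv = -g * dt and dp = -g * dt^2 / 2, and the state stays at rest.
g = [0.0, 0.0, -9.81]
dt = 0.1
dv = [-g[k] * dt for k in range(3)]
dp = [-0.5 * g[k] * dt * dt for k in range(3)]
v, p = propagate([0.0, 0.0, 0.0], g, dv, dp, dt)
```

In this stationary case, the gravity terms and the preintegrated measurements cancel, which illustrates why gravity and accelerometer bias are difficult to separate over short windows.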

Note that the preintegrated measurements $Δ R_{τ_{i - 1}}^{t}$, $Δ v_{τ_{i - 1}}^{t}$, and $Δ p_{τ_{i - 1}}^{t}$ are computed using prior knowledge of the biases ${\bar{b}}_{f}^{i - 1}$ and ${\bar{b}}_{ω}^{i - 1}$. Similar to the original on-manifold preintegration work (Forster et al., 2017), the preintegrated measurements are corrected through a first-order Taylor expansion to account for the unknown nature of the biases:

\begin{align} Δ R_{τ_{i - 1}}^{t} \approx {\bar{Δ R}}_{τ_{i - 1}}^{t} Exp (\frac{\partial Δ R_{τ_{i - 1}}^{t}}{\partial b_{ω}^{i - 1}} Δ b_{ω}^{i - 1}), \\ Δ v_{τ_{i - 1}}^{t} \approx {\bar{Δ v}}_{τ_{i - 1}}^{t} + \frac{\partial Δ v_{τ_{i - 1}}^{t}}{\partial b_{ω}^{i - 1}} Δ b_{ω}^{i - 1} + \frac{\partial Δ v_{τ_{i - 1}}^{t}}{\partial b_{f}^{i - 1}} Δ b_{f}^{i - 1}, \\ Δ p_{τ_{i - 1}}^{t} \approx {\bar{Δ p}}_{τ_{i - 1}}^{t} + \frac{\partial Δ p_{τ_{i - 1}}^{t}}{\partial b_{ω}^{i - 1}} Δ b_{ω}^{i - 1} + \frac{\partial Δ p_{τ_{i - 1}}^{t}}{\partial b_{f}^{i - 1}} Δ b_{f}^{i - 1}, \end{align}

(3)

with $Δ b_{f}^{i - 1} = b_{f}^{i - 1} - {\bar{b}}_{f}^{i - 1}$ and $Δ b_{ω}^{i - 1} = b_{ω}^{i - 1} - {\bar{b}}_{ω}^{i - 1}$. Thus, the system’s trajectory during the temporal window is fully characterized by $S = \{g_{τ_{i - 1}}, v_{I_{τ_{i - 1}}}^{I_{τ_{i - 1}}}, b_{f}^{i - 1}, b_{ω}^{i - 1}\}$. Accordingly, a lidar point x_L collected at time t can be projected to the $F_{I_{τ_{i - 1}}}$ frame as

[\begin{matrix} x_{I_{τ_{i - 1}}} \\ 1 \end{matrix}] = T_{I_{τ_{i - 1}}}^{I_{t}} T_{I}^{L} [\begin{matrix} x_{L} \\ 1 \end{matrix}] .

(4)

4.2. Lidar features and data association

Using the raw lidar point clouds, features are extracted independently in the two scans that constitute the temporal window [τ_i-1, τ_i+1]. Similarly to the work by Zhang and Singh (2014), features can be planar or edge points. However, the feature point selection does not follow any type of curvature/roughness score. The planar points $P_{i - 1}$ and $P_{i}$ are obtained by voxel-subsampling the corresponding scans and retaining only one point per voxel (random selection among the points in each voxel). For the edge points, the proposed method considers each lidar channel/ring independently and detects jumps in range values between consecutively collected points. The latter form the sets of edges $E_{i - 1}$ and $E_{i}$ . Note that the sets $P_{i}$ and $E_{i}$ are mutually exclusive. Figure 4 shows an example of features obtained over 3 s of data in a suburban environment.

Figure 4.

Accumulated lidar features (planar in green, edge in magenta) over 3 s of data in a Suburbs sequence from the Boreas-RT dataset.

Then, the features from the two scans are associated together, between $P_{i - 1}$ and $P_{i}$ or between $E_{i - 1}$ and $E_{i}$, with a k-closest-neighbour search that enforces conditions that depend on the feature type. For planar points, each point in $P_{i}$ is associated with three features from $P_{i - 1}$, enforcing a minimum spread in terms of distance and angle to avoid collinearity (a minimum of 0.05 m and 75° in our implementation). Similarly, each edge feature in $E_{i}$ is matched with two sufficiently spaced points from $E_{i - 1}$ (typically 0.05 m). For each point, the association begins with a radius search that provides the set of points $A$. Starting with the closest point in $A$, the process iteratively tests the type-specific conditions for each point in $A$ by order of distance, until the required number of valid points is reached (successful association) or there are no more points to test in $A$ (failed association).
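The greedy association loop described above can be sketched as follows. This is an illustrative simplification under stated assumptions: the radius search is a brute-force scan rather than a spatial index, the names `associate` and `min_spacing` are hypothetical, and only the spacing condition for edge features is shown (the distance-and-angle condition for planar features would be another predicate with the same signature).

```python
import math


def associate(query, candidates, radius, n_required, is_valid):
    """Greedily associate one feature with n_required candidate points.

    Candidates within `radius` of `query` are tested by increasing distance.
    `is_valid(selected, c)` encodes the type-specific condition.
    Returns the selected points, or None if the association fails.
    """
    in_radius = sorted(
        (c for c in candidates if math.dist(query, c) <= radius),
        key=lambda c: math.dist(query, c),
    )
    selected = []
    for c in in_radius:
        if is_valid(selected, c):
            selected.append(c)
            if len(selected) == n_required:
                return selected  # successful association
    return None  # no more points to test: failed association


def min_spacing(spacing):
    """Edge-type condition: every selected point is at least `spacing` away."""
    return lambda selected, c: all(math.dist(s, c) >= spacing for s in selected)


edges_prev = [(0.10, 0.0, 0.0), (0.11, 0.0, 0.0), (0.30, 0.0, 0.0)]
pair = associate((0.0, 0.0, 0.0), edges_prev, 1.0, 2, min_spacing(0.05))
```

Here the second candidate is rejected because it lies 0.01 m from the first, so the pair is formed with the first and third candidates.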

4.3. Motion correction

The motion is estimated via a non-linear least-squares optimization of point-to-plane and point-to-line distances

\tilde{S} = \underset{S}{\arg \min} \sum_{x_{L}^{j} \in P_{i}} r_{P_{j}}^{2} + \sum_{x_{L}^{j} \in E_{i}} r_{E_{j}}^{2},

(5)

where $r_{P_{j}}$ is the point-to-plane distance between a planar feature $x_{L}^{j}$ and the associated trio $x_{L}^{k}$, $x_{L}^{l}$, and $x_{L}^{m}$, and $r_{E_{j}}$ is the point-to-line distance between an edge feature $x_{L}^{j}$ and the associated pair $x_{L}^{k}$ and $x_{L}^{l}$. Using (2) and (4) for point projection, the residuals are computed as

r_{P_{j}} = \frac{{(x_{I_{τ_{i - 1}}}^{j} - x_{I_{τ_{i - 1}}}^{k})}^{⊤} ((x_{I_{τ_{i - 1}}}^{k} - x_{I_{τ_{i - 1}}}^{l}) \times (x_{I_{τ_{i - 1}}}^{k} - x_{I_{τ_{i - 1}}}^{m}))}{‖ (x_{I_{τ_{i - 1}}}^{k} - x_{I_{τ_{i - 1}}}^{l}) \times (x_{I_{τ_{i - 1}}}^{k} - x_{I_{τ_{i - 1}}}^{m}) ‖},

(6)

and

r_{E_{j}} = \frac{‖ (x_{I_{τ_{i - 1}}}^{j} - x_{I_{τ_{i - 1}}}^{k}) \times (x_{I_{τ_{i - 1}}}^{j} - x_{I_{τ_{i - 1}}}^{l}) ‖}{‖ x_{I_{τ_{i - 1}}}^{k} - x_{I_{τ_{i - 1}}}^{l} ‖} .

(7)
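For illustration, the two residuals (6) and (7) can be evaluated as in the sketch below. It assumes that all points have already been projected to the common frame with (4); the helper and function names are hypothetical.

```python
import math


def sub(a, b):
    return [a[k] - b[k] for k in range(3)]


def dot(a, b):
    return sum(a[k] * b[k] for k in range(3))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def norm(a):
    return math.sqrt(dot(a, a))


def point_to_plane(xj, xk, xl, xm):
    """Signed distance (6) from xj to the plane through xk, xl, and xm."""
    n = cross(sub(xk, xl), sub(xk, xm))
    return dot(sub(xj, xk), n) / norm(n)


def point_to_line(xj, xk, xl):
    """Distance (7) from xj to the line through xk and xl."""
    return norm(cross(sub(xj, xk), sub(xj, xl))) / norm(sub(xk, xl))


# A point 2 m above the plane z = 0, and a point 3 m away from the x-axis.
r_p = point_to_plane([0.3, 0.2, 2.0], [0, 0, 0], [1, 0, 0], [0, 1, 0])
r_e = point_to_line([0.5, 3.0, 0.0], [0, 0, 0], [1, 0, 0])
```

The denominators vanish when the associated points coincide or are collinear, which is one reason the association step enforces a minimum spread between them.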

As the magnitude of the gravity vector $g_{τ_{i - 1}}$ is known, the optimization (5) leverages the 2-sphere manifold, resulting in a total of eleven DoFs. It is solved using the Levenberg-Marquardt algorithm. After convergence, the estimated state $\tilde{S}$ and (2) allow the computation of ${\tilde{T}}_{I_{τ_{i}}}^{t}$ to correct the current scan’s motion distortion and provide the scan-to-scan pose increment ${\tilde{T}}_{I_{τ_{i}}}^{I_{τ_{i + 1}}}$. The undistorted point cloud $U_{i}$ and ${\tilde{T}}_{I_{τ_{i}}}^{I_{τ_{i + 1}}}$ are passed to the localization module (next section) for global pose estimation. The velocity and gravity estimates projected to τ_i, as well as an average of the past bias estimates, are used as the state’s initial guess for the next temporal window.

The undistortion process presented in this section is performed for every incoming scan and estimates the continuous trajectory between τ_i-1 and τ_i+1. It is done without fixing the motion between τ_i-1 and τ_i in any way. Computing the motion for the duration of two scans at every scan creates an overlap between consecutive temporal windows. That is why, at each step, the module solely outputs the transformation between τ_i and τ_i+1, ignoring the estimate between τ_i-1 and τ_i. It is important to note that the trajectory over a small temporal window (typically 200 ms) is often close to translation-only movement, or can be close to rotation-only in some scenarios. These motions are not informative enough to make the accelerometer bias observable, as there is an ambiguity with gravity, as detailed by Tereshkov (2015). To address this issue, a weak zero-mean prior residual $r_{b_{f}} = c b_{f}^{i - 1}$, with c a constant, is added to prevent unrealistically high accelerometer bias estimates. Thus, (5) becomes

\tilde{S} = \underset{S}{\arg \min} \sum_{x_{L}^{j} \in P_{i}} r_{P_{j}}^{2} + \sum_{x_{L}^{j} \in E_{i}} r_{E_{j}}^{2} + ‖ r_{b_{f}} ‖^{2} .

(8)

5. Localization

5.1. Scan-to-map registration

Given a motion-corrected point cloud $U_{i}$ , 2Fast-2Lamaa estimates the scan-to-map transformation $T_{W}^{I_{τ_{i}}}$ using a map $M$ (given or incrementally built as detailed in Section 6), which provides a distance field d(x) that can be queried for any point x. As the undistortion step provides decent pose increment estimates, we do not estimate $T_{W}^{I_{τ_{i}}}$ directly, but perform a right-hand perturbation

T_{W}^{I_{τ_{i}}} = {\tilde{T}}_{W}^{I_{τ_{i - 1}}} {\tilde{T}}_{I_{τ_{i - 1}}}^{I_{τ_{i}}} [\begin{matrix} Exp (Δ r) & Δ p \\ 0 & 1 \end{matrix}],

(9)

with ${\tilde{T}}_{W}^{I_{τ_{i - 1}}}$ the global pose estimate of the previous scan, ${\tilde{T}}_{I_{τ_{i - 1}}}^{I_{τ_{i}}}$ the pose increment from the undistortion step, and Δr and Δp the rotational and translational perturbations, respectively. Thus, using (9), the global registration estimates

\tilde{Δ r}, \tilde{Δ p} = \underset{Δ r, Δ p}{\arg \min} \sum_{x_{j} \in U_{i}} {(d (T_{W}^{I_{τ_{i}}} [\begin{matrix} x_{j} \\ 1 \end{matrix}]))}^{2}

(10)

using the Levenberg-Marquardt algorithm. The key component of 2Fast-2Lamaa is the computation of the distance field d, which is detailed in Section 6.

5.2. Gravity-alignment residual

Inspired by Noh et al. (2025), the scan-to-map registration (10) can be complemented with a gravity-alignment residual $r_{g_{i}}$ when used for odometry or mapping, as opposed to pure localization. First, the gravity vector g_W in the original sensor frame at τ₀ is estimated alongside the system’s linear velocities $v_{W}^{I_{τ_{i}}}$ and IMU biases b_f and b_ω, using the first M estimated poses $T_{W}^{I_{τ_{i}}}$ obtained through scan-to-map registration. M is programmatically chosen to guarantee that enough translation and rotation occurred in the trajectory to make the accelerometer biases observable. Defining the optimized state $S_{g} = \{g_{W}, v_{W}^{I_{τ_{0}}}, \dots, v_{W}^{I_{τ_{M - 1}}}, b_{f}, b_{ω}\}$, the corresponding optimization is

{\tilde{S}}_{g} = \underset{S_{g}}{\arg \min} \sum_{i = 0}^{M - 2} ‖ r_{I_{i}} ‖^{2},

(11)

with the residual

r_{I_{i}} = [\begin{matrix} Log ({R_{W}^{I_{τ_{i}}}}^{⊤} R_{W}^{I_{τ_{i + 1}}} {Δ R_{τ_{i}}^{τ_{i + 1}}}^{⊤}) \\ v_{W}^{I_{τ_{i + 1}}} - v_{W}^{I_{τ_{i}}} - Δ τ_{i} g_{W} - R_{W}^{I_{τ_{i}}} Δ v_{τ_{i}}^{τ_{i + 1}} \\ p_{W}^{I_{τ_{i + 1}}} - p_{W}^{I_{τ_{i}}} - Δ τ_{i} v_{W}^{I_{τ_{i}}} - \frac{Δ τ_{i}^{2}}{2} g_{W} - R_{W}^{I_{τ_{i}}} Δ p_{τ_{i}}^{τ_{i + 1}} \end{matrix}] .

(12)

Then, for any incoming undistorted scan and the associated linear body velocity $v_{I_{τ_{i}}}^{I_{τ_{i}}}$ , a measurement of the gravity $g_{τ_{i}}$ in the current frame is obtained as

g_{τ_{i}} = \frac{v_{I_{τ_{i}}}^{I_{τ_{i}}} - {Δ R_{τ_{i - 1}}^{τ_{i}}}^{⊤} (v_{I_{τ_{i - 1}}}^{I_{τ_{i - 1}}} + Δ v_{τ_{i - 1}}^{τ_{i}})}{Δ τ_{i - 1}} .

(13)

Finally, the gravity residual $r_{g_{i}}$ is defined as

r_{g_{i}} = \arccos (\frac{g_{W}^{⊤} R_{W}^{I_{τ_{i}}} g_{τ_{i}}}{‖ g_{W} ‖ ‖ g_{τ_{i}} ‖}),

(14)

and its squared norm is added to the right-hand side of (10).
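As a small illustration of (14), the residual is the angle between the world-frame gravity and the local gravity measurement rotated into the world frame. The sketch below is an assumption-laden example: rotations are plain 3×3 nested lists, the function name is hypothetical, and the clamping of the cosine is an added numerical safeguard that is not stated in the text.

```python
import math


def dot(a, b):
    return sum(a[k] * b[k] for k in range(3))


def norm(a):
    return math.sqrt(dot(a, a))


def gravity_residual(g_w, R_w_i, g_i):
    """Angle in radians between g_w and the rotated local measurement R_w_i g_i."""
    g_rot = [dot(R_w_i[r], g_i) for r in range(3)]
    cosine = dot(g_w, g_rot) / (norm(g_w) * norm(g_rot))
    # Clamp against rounding errors before calling acos (added safeguard).
    return math.acos(max(-1.0, min(1.0, cosine)))


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rot_x_90 = [[1.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
g_w = [0.0, 0.0, -9.81]
aligned = gravity_residual(g_w, identity, [0.0, 0.0, -9.81])
tilted = gravity_residual(g_w, rot_x_90, [0.0, 0.0, -9.81])
```

A perfectly aligned orientation gives a zero residual, while the 90° roll error in the second call gives π/2.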

5.3. Topometric localization

In addition to localization in a globally consistent map, 2Fast-2Lamaa enables topometric localization (and mapping) for repeated robot operations along the same path. This is inspired by the Teach and Repeat framework (Furgale and Barfoot, 2010), which represents the environment with a graph of overlapping geometric submaps topologically linked based on a demonstrated robot trajectory. The localization process has two ‘layers’. At the high level, it consists of moving from submap to submap (node to node) in the topological graph. At the low level, it performs scan-to-submap geometric registration. This approach alleviates the need for a globally consistent map while allowing high-precision localization for autonomous navigation. We have integrated this mapping and localization strategy into the proposed framework. In its current version, 2Fast-2Lamaa only considers the simplest type of graph: an undirected chain. Concretely, the localization in that chain consists of performing scan-to-submap registration using (10) and the submap associated with the current node in the chain. Then, if the state estimate reaches the edge of the submap, the next node (or previous one, depending on the direction of motion) becomes the current node. Figure 5 illustrates a topometric map in the Suburbs environment of our self-collected dataset. While the trajectory used to create the map drifted from the ground truth, the topometric nature of the localization enables accurate localization for repeating trajectories (<3 cm lateral position root mean squared error (RMSE); see Section 9.3).

Figure 5.

Example of a topometric map obtained on a Suburbs sequence of the Boreas-RT dataset. It consists of a succession of submaps (shown as point clouds of different colours) and a topological graph (shown in red, with the spheres being the submaps’ centroids) that connects the submaps. This topometric map does not require global consistency to enable state-of-the-art localization in repeating trajectories.

6. Mapping

This section presents the proposed GP-based distance field for efficient scan-to-map registration in Section 5. The map can be provided with a single point cloud of the robot’s environment or incrementally built according to the scan-to-map odometry estimates. The incremental map creation can be done using a single map or a collection of submaps for the topometric localization mode. The process presented in this section applies to all of these mapping strategies. The only difference for the topometric approach is that a new submap is created after a user-defined distance travelled since the last submap creation.

6.1. GP-based distance field

The proposed mapping process is based on GP-based distance fields (Le Gentil et al., 2024b). Using a point cloud as input, it performs standard GP regression to obtain a latent field o(x) that can be seen as an occupancy field equal to 1 on object surfaces (where the lidar points lie), and that decreases to zero further away from it. Applying a specific non-linear function over o(x) provides an approximation of the Euclidean distance between the query location x and the closest surface. Formally, o is modelled with a zero-mean GP

o (x) \sim \mathcal{GP} (0, k (x, x^{'})),

(15)

with k(x, x′) the covariance kernel function that specifies the covariance between two instances of the field o(x) and o(x′). Given a set of N observations/points (equal to one) at locations X, and following standard GP regression (Rasmussen and Williams, 2006), the mean and variance of the latent field can be inferred as

\begin{align} \hat{o} (x) = k_{x, X} {(K_{X, X} + σ I)}^{- 1} \mathbf{1}, \\ var (\hat{o} (x)) = k (x, x) - k_{x, X} {(K_{X, X} + σ I)}^{- 1} k_{x, X}^{⊤}, \end{align}

(16)

where $k_{x, X}$ is the vector of covariance kernels between the query point x and the input observation locations X, $K_{X, X}$ the kernel-based covariance matrix of X, and σ the observation noise. Following Le Gentil et al. (2024b), the use of a ‘reverting function’ r over o approximates the Euclidean distance field

d (x) = r (o (x)) .

(17)

The definition of r depends on the kernel function k

r (k (x, x^{'})) ≜ ‖ x - x^{'} ‖ .

(18)

With an unscaled, stationary, isotropic kernel (that can be written as a function of the distance k(x, x′) → k(‖x − x′‖)), the reverting function is the inverse of k. Note that due to the non-linear nature of r, the distance field d is not a GP. Accordingly, the inferred variance from (16) is not really informative about the distance estimate’s uncertainty. While not used in 2Fast-2Lamaa, we propose a novel uncertainty proxy in Appendix A.1.
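The following sketch illustrates (16) and (17) on a toy problem, under several assumptions: it uses a rational-quadratic kernel with α = 2 (the kernel family named in Section 8.3), a lengthscale of 0.6 m, a small hand-written Gaussian elimination instead of an optimized solver, and hypothetical function names. It is not the released implementation.

```python
import math

ALPHA = 2.0  # rational-quadratic shape parameter (assumed value)


def kernel(r, l):
    """Rational-quadratic kernel as a function of the distance r."""
    return (1.0 + r * r / (2.0 * ALPHA * l * l)) ** (-ALPHA)


def revert(o, l):
    """Reverting function: inverse of the kernel, from occupancy to distance."""
    o = min(max(o, 1e-12), 1.0)  # clamp to the kernel's range (added safeguard)
    return l * math.sqrt(2.0 * ALPHA * (o ** (-1.0 / ALPHA) - 1.0))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def distance_field(query, points, l=0.6, sigma=1e-6):
    """Approximate distance from `query` to the surface sampled by `points`."""
    K = [[kernel(math.dist(a, b), l) + (sigma if i == j else 0.0)
          for j, b in enumerate(points)] for i, a in enumerate(points)]
    weights = solve(K, [1.0] * len(points))  # (K + sigma I)^-1 1
    o = sum(kernel(math.dist(query, p), l) * w for p, w in zip(points, weights))
    return revert(o, l)


d = distance_field((0.5, 0.0, 0.0), [(0.0, 0.0, 0.0)])
```

With a single observation and negligible noise, the occupancy at the query is simply the kernel value, so reverting it recovers the true distance of 0.5 m up to the effect of σ. With several observations, the result is only an approximation of the Euclidean distance.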

6.2. Efficient large-scale GP distance field

An issue with naively applying standard GP regression is the cubic computational complexity O(N³) with respect to the number N of points in the map due to the matrix inversion in (16). In this work, we propose a novel and efficient strategy for GP-based distance field by computing local GPs instead of a single global one. The core principle behind this strategy is the spatially limited impact of each input point due to the lengthscale of the kernel.¹ In other words, performing GP regression considering a large-enough local neighbourhood of points will (locally) yield a similar inferred mean when compared with using all the available observations.

The key to our approach is to leverage both a sparse voxelized representation $V$ of the dense 3D point cloud of the environment, and a spatial index $I$ that allows for efficient point insertion/removal and closest/radius neighbour searches.² The voxelized representation consists of a hashmap that maps 3-integer tuples to cells, which store the centroid of all the lidar/map points that occurred in the voxel. The tuples are the cell’s grid index that can be computed directly from the 3D coordinates of a point and the fixed cell size.

Integrating new points in the map simply consists of checking if the corresponding cell index is already present in the hashmap. If yes, the cell’s centroid is updated incrementally with the new point location. Otherwise, a new cell is created, and its position is added to the spatial index. Thanks to the hashmap properties, accessing an existing cell or inserting a new one is constant-time O(1) on average. The insertion of a new cell in a spatial index is slower, with an O(log(N)) average complexity at best. Fortunately, thanks to the voxelized nature of the stored information, only a small portion of the incoming lidar points lead to cell creation (e.g. a 7.9 km-long suburban trajectory in the Boreas-RT dataset contains over 1.8 billion lidar points, but the final number of cells in the map is around 10 million).
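The point integration step can be sketched with a dictionary keyed by integer grid indices. This is a simplified illustration: the class name `VoxelMap` is hypothetical, a Python `dict` stands in for the hashmap, and the spatial index is only hinted at through the return value of `insert`.

```python
import math


class VoxelMap:
    """Sparse voxel grid storing a running centroid and a point counter per cell."""

    def __init__(self, voxel_size):
        self.voxel_size = voxel_size
        self.cells = {}  # (i, j, k) -> [centroid, count]

    def key(self, p):
        """Grid index computed from the 3D coordinates and the fixed cell size."""
        return tuple(math.floor(c / self.voxel_size) for c in p)

    def insert(self, p):
        """Insert a point; return True if a new cell was created."""
        k = self.key(p)
        cell = self.cells.get(k)
        if cell is None:
            self.cells[k] = [list(p), 1]
            return True  # the caller would add this cell to the spatial index
        centroid, n = cell
        for a in range(3):
            centroid[a] += (p[a] - centroid[a]) / (n + 1)  # incremental mean
        cell[1] = n + 1
        return False


vm = VoxelMap(0.3)
created = [vm.insert(p) for p in [(0.05, 0.05, 0.05), (0.25, 0.05, 0.05), (0.35, 0.0, 0.0)]]
```

Only the first and third points create a cell here; the second one falls into an existing voxel and merely updates its centroid, which is the common case that keeps spatial-index insertions rare.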

To query d(x) at any location x, the closest cell v_i in $V$ is queried using the spatial index $I$. Then, the local neighbourhood around v_i is obtained through a radius search in $I$. This set of points represents the surface around the closest element to x in the map. An estimate of d(x) can therefore be obtained using (16) and (17) with the aforementioned local neighbourhood as input observations. Considering the voxel nature of $V$ and a fixed radius for the local neighbour search, there is a maximum number of points M used for GP inference. Knowing that a query of $I$ is O(log(N)) on average, the overall distance field inference is O(log(N) + M³). As M is fixed, the global complexity for large-scale mapping only grows logarithmically with the size of the observed environment.

There are two main caveats to the proposed field. The first is linked to the use of local GPs, which creates a discrete behaviour when switching from one local GP to the next while querying two locations that are side by side. In other words, the global field is not guaranteed to be continuous over the full space $R^{3}$. However, the ‘jumps’ are generally small enough, in terms of value and gradient, that this does not prevent the use of the field in gradient-based optimization problems, as illustrated in this work. The second drawback is that the voxelized representation degrades the overall probabilistic properties of GP-based distance inference. Simply put, naively feeding the voxels’ centroids to the GP inference using σI as the measurement noise gives the same importance to each cell. However, not all cells provide the same amount of information: a cell in which only one point occurred should not have the same impact as a cell that averaged the position of 100 points. Figure 6(a) shows the corresponding negative impact on the field: non-smooth surface and thicker walls. To alleviate this issue, we propose to ‘weight’ the cell centroid observations based on the number of points that occurred in each voxel. Let us denote the per-voxel number of observations as the counter c_i of a cell. Accordingly, in (16), we replace the measurement uncertainty model σI with diag(w), where w_i (components of w) are computed as a function of the cell’s counter and the maximum cell counter c_max in the neighbourhood. Note that a lower w_i means that the cell can be trusted more than a cell with a higher w_i. Thus, the function that transforms c_i to w_i has to be decreasing. We picked a decreasing sigmoid function in the shape of 1/(1 + exp(c_i/c_max)). Figure 6(b) illustrates the improved efficient field inference with the proposed weighting mechanism. Appendix A.2 presents a brief quantitative analysis of the proposed efficient GP-based distance field with and without cell weighting.

Figure 6.

Illustration of the impact of the proposed weighting mechanism for our efficient GP-based distance field inference with sparse voxelized observations. The cell centroids are in red, and the colourmap represents the distance field (the colour is purposely saturated at 1 m for the sake of visibility). (a) is the inference with equal weight for each voxel observation. (b) is the inference with the proposed weighting based on the number of lidar points that occurred in each cell. One can clearly see the improvement in terms of smoothness and ‘wall thickness’, and thus in the accuracy of the field.

6.3. Free-space carving

In self-driving scenarios (among others), it is impossible to guarantee that the environment is static at the time of mapping. Thus, a mechanism for rejecting dynamic objects from the map can help build cleaner maps for later reuse in a localization-only phase. 2Fast-2Lamaa includes a free-space carving step to remove dynamic points previously inserted in the map and account for changes in the environment, such as parked cars that are no longer present. The principle of the proposed mechanism relies on the comparison between the spherical projection of the current scan and that of the map cells in the vicinity of the current sensor position. First, the scan is projected onto an image-like data structure by discretizing the azimuth and elevation of each point using a resolution of around 1°. Each pixel retains the range value of the closest point that is projected into it. Then, the map points within a radius around the sensor position are queried and projected in the aforementioned image-like data structure using spherical coordinates. If the range of a map point is smaller than the corresponding scan range, using a margin equal to the voxel resolution, it is removed from the map as it occurred between a surface currently observed and the sensor. Figure 7 illustrates this process with dense simulated data for the sake of clarity. Free-space carving can be performed online before inserting a freshly registered point cloud into the map, and offline by revisiting the whole trajectory and scans.
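The carving rule can be sketched as below. This is a simplified illustration under stated assumptions: the image-like structure is a dictionary from (azimuth, elevation) pixels to ranges, the margin is subtracted from the scan range, pixels without a scan return never trigger a removal, and all names are hypothetical.

```python
import math


def to_pixel(p, sensor, res_deg=1.0):
    """Spherical projection of p around the sensor: (pixel, range)."""
    d = [p[k] - sensor[k] for k in range(3)]
    rng = math.sqrt(sum(c * c for c in d))
    az = math.degrees(math.atan2(d[1], d[0]))
    el = math.degrees(math.asin(d[2] / rng)) if rng > 0.0 else 0.0
    return (math.floor(az / res_deg), math.floor(el / res_deg)), rng


def carve(map_points, scan_points, sensor, radius, margin, res_deg=1.0):
    """Return the map points that survive free-space carving."""
    range_image = {}
    for p in scan_points:
        pix, rng = to_pixel(p, sensor, res_deg)
        if rng < range_image.get(pix, math.inf):
            range_image[pix] = rng  # keep the closest return per pixel
    kept = []
    for m in map_points:
        pix, rng = to_pixel(m, sensor, res_deg)
        seen = range_image.get(pix)
        in_free_space = rng <= radius and seen is not None and rng < seen - margin
        if not in_free_space:
            kept.append(m)
    return kept


scan = [(10.0, 0.05, 0.05)]                 # a wall return 10 m ahead
old_map = [(5.0, 0.025, 0.025),             # e.g. a car that has since left
           (10.0, 0.05, 0.05),              # the wall itself
           (5.0, 5.0, 0.0)]                 # not observed by this scan
kept = carve(old_map, scan, (0.0, 0.0, 0.0), radius=50.0, margin=0.3)
```

Only the first map point is removed: it lies in the same pixel as the wall return but 5 m closer to the sensor, so the lidar beam must have passed through its location.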

Figure 7.

Illustration of the free-space carving process of the proposed mapping framework (using Handa et al. (2014)’s living room simulated environment).

7. Online pose-graph optimization

When used in mapping mode with a single global map incrementally built based on odometry estimates, the proposed scan-to-map strategy cannot guarantee global consistency when the robot’s trajectory forms loops due to the inherent drift in any odometry algorithm. While using submaps in a topometric approach alleviates this issue for any robotic application that involves repeated motion (teach-and-repeat-like), one might be interested in obtaining a globally consistent trajectory for globally consistent mapping. Toward this goal, 2Fast-2Lamaa also integrates an optional³ online loop-closure detection and correction mechanism based on feature descriptor matching and pose-graph optimization, similar to the work by Giubilato et al. (2022).

Figure 8 shows the proposed pipeline. First, 2Fast-2Lamaa’s topometric mapping mode is leveraged to obtain submaps and submap-to-submap transformation estimates ${\tilde{T}}_{s_{i}}^{s_{i + 1}}$. The voxelized representation of each gravity-aligned submap is projected in individual image-like data structures which correspond to the elevation gradient of the environment. Unlike the work by Giubilato et al. (2022), no GP is used to generate the image-like data. 2Fast-2Lamaa uses a simple top-down orthographic projection. The pixel size is one and a half times the voxel resolution. Then, SIFT features (Lowe, 1999) are extracted and matched between the current and past submaps as illustrated in Figure 9. Loop-closure candidates are obtained by matching and registering features between submaps. To avoid considering wrong associations and limit the computational burden, the brute-force feature matching is only performed between submaps for which pose-to-pose distances are under a threshold computed based on the typical odometry drift in percentage (user-defined) and the distance travelled between pairs. Given the SIFT associations of a submap pair, an SE(2) transformation between the two submaps is estimated with RANSAC (Fischler and Bolles, 1981) and elevation correction is performed so that the mean elevation is the same in both submaps. Combining these together, a rough loop-closure SE(3) transformation is obtained. Using the proposed continuous distance field and the registration (10), the SE(3) loop-closure constraint is refined individually between four scans of the oldest of the two matched submaps and the distance field of the current submap. Finally, the odometry relative pose estimates and the loop-closure transformations are used in a pose-graph optimization. This non-linear least-squares problem is solved with the Levenberg-Marquardt algorithm.

Figure 8.

Overview of the proposed loop-closure detection and correction based on the output of 2Fast-2Lamaa in topometric mapping mode. First, the 3D submaps are projected into image-like elevation gradient maps (2.5D). Using visual features, the 2.5D submaps are coarsely aligned by SE(2) RANSAC registration and elevation alignment. Thresholds on the number of inliers and the elevation coherence are used to discard non-loop-closure submap pairs. Based on the proposed GP-based distance field, coarse loop-closure transforms are refined before being used as residuals in a pose-graph optimization that leverages the odometry pose estimates between consecutive submaps.

Figure 9.

Illustration of the loop-closure detection and SE(2) registration used in our online pose-graph optimization of 2Fast-2Lamaa’s submaps. The proposed pipeline relies on the extraction, matching, and registration of visual features in image-like elevation representations of each submap.

8. Implementation

In this section, we present and discuss some specific details of our ROS2 C++ implementation.

8.1. Low-level data structures

The hashmaps in this work use the ankerl::unordered_dense::map implementation (Leitner-Ankerl, 2022a). It has been benchmarked against numerous C++ hashmap implementations and displayed the best overall performance (Leitner-Ankerl, 2022b). The cells are stored via pointers in the hashmap to ensure the compactness of the hashmap storage.

The constraints on the choice of the spatial index $I$ are as follows:

• Fast and scalable insertion of new elements.

• Possibility to remove points without the need to recreate the index.

• Allowing for N-closest neighbour and radius search.

The ikd-Trees (Cai et al., 2021), PH-Trees (Zäschke et al., 2014), and i-Octrees (Zhu et al., 2024) match the aforementioned requirements. Appendix A.3 shows a toy-example evaluation. We observed a significant advantage for the i-Octree. For efficiency, the spatial indexing structure is not updated with the changing centroid of the voxels, but the hashmap is. The spatial index relies on the coordinates of the first point that occurred in each voxel. Accordingly, the closest neighbour or radius searches performed with the ‘out-of-date’ spatial index are only approximations. We found this not to be a problem empirically, as the GPs are computed with the up-to-date voxel centroids of the local neighbourhood that is more than twice as large as the voxel resolution. For some specific scenarios, such as queries in the middle of a corridor, the query for the closest voxel in the spatial index, whether it is out-of-date or up-to-date, can erroneously trigger GP computations using points from the furthest of the two sides. A simple workaround is to perform K closest neighbour searches with associated local GP-based distance field computation for each query. The final distance would be the smallest of the K resulting GP-inferred distances. Note that this corridor-like scenario is unlikely when performing scan-to-map registration thanks to the accurate initial guess from 2Fast-2Lamaa’s undistortion module. Thus, our implementation uses K = 1 for registration. For other queries, the default is K = 2.

8.2. Non-linear optimizations

All the optimizations in this work (lidar-inertial undistortion and scan-to-map registration) are based on Ceres, an open-source non-linear least-squares solver (Agarwal and Mierle, 2022). Both (5) and (10) leverage Cauchy loss functions to attenuate the impact of outliers. The analytical Jacobians of the residuals are provided to the solver. To lower the computational cost of the overall pipeline, we adopted a basic keyframing strategy for the scan-to-map registration, performing (10) only when the sensor has moved sufficiently or after a fixed time period.

8.3. GP kernel and distance field

While it was shown that the squared-exponential kernel provides the best accuracy for distance field estimation (Le Gentil et al., 2024b), 2Fast-2Lamaa’s implementation uses the rational-quadratic kernel ${(1 + ‖ x - x^{'} ‖^{2} / (2 α l^{2}))}^{- α}$ with α = 2 for efficiency, as it avoids operations such as exponentials or logarithms, which are computationally costly. The lengthscale l is defined as twice the voxel resolution. To compute the local GP around a voxel, the neighbourhood search uses a radius of two and a half times the voxel resolution.

8.4. Point-cloud filtering

Lidar data is generally not affected by different levels of luminosity in the environment. However, it can be degraded by floating particles in the air, such as dust or precipitation. To increase robustness in more challenging scenarios, 2Fast-2Lamaa offers the possibility to filter incoming point clouds based on the sensor-provided per-point intensity. While not thoroughly analyzed in this paper, we have observed a significant improvement of 2Fast-2Lamaa’s odometry and localization performance on the Farm sequences of the Boreas-RT dataset (Lisus et al., 2026), which were collected on an extremely dusty road. Additionally, a ‘density check’ is always performed on each undistorted point cloud in case the intensity filtering does not sufficiently remove amorphous point clusters. Using the user-provided map resolution, the scan is discretized into voxels. Points that fall into voxels that have more than twelve neighbours are ignored for registration.

8.5. Submapping strategy

When used with submaps, as opposed to building a single global map, 2Fast-2Lamaa uses a very simple submapping strategy: a scan is integrated in a submap if the sensor has travelled less than a user-defined distance γ since the submap creation. Note that the submaps overlap by a ratio β (equal to 0.2 in our experiments). Concretely, it means that a new submap is created when the distance travelled goes above (1 − β)γ. During the overlap, each scan is added to both submaps. When performing topometric localization, the change from one submap to the next is done when the current pose estimate is closer to the poses in the second half of the overlap than to the first half.
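The submap creation rule can be illustrated with the sketch below, which returns the submaps that receive a scan as a function of the cumulative distance travelled. It is a simplification under stated assumptions: distance is treated as a single cumulative scalar, the overlap ratio satisfies β < 0.5 so that at most two submaps overlap, and the function name is hypothetical.

```python
import math


def active_submaps(travelled, gamma, beta=0.2):
    """Indices of the submaps that receive a scan taken after `travelled` metres.

    Submap k is created at k * (1 - beta) * gamma and accepts scans for the
    next gamma metres. Assumes beta < 0.5 (at most two overlapping submaps).
    """
    stride = (1.0 - beta) * gamma  # distance between two submap creations
    newest = math.floor(travelled / stride)
    return [k for k in range(max(0, newest - 1), newest + 1)
            if travelled - k * stride < gamma]


# With 300 m submaps and a 20% overlap, a new submap starts every 240 m.
first_only = active_submaps(100.0, 300.0)   # inside submap 0 only
overlap = active_submaps(250.0, 300.0)      # overlap between submaps 0 and 1
second_only = active_submaps(310.0, 300.0)  # submap 0 is closed
```

During the overlap, the scan is integrated in both submaps, which is what provides the shared geometry used when switching nodes during topometric localization.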

9. Experiments

We validate the proposed framework for both localization and odometry using various public datasets as follows: Section 9.2 uses the Boreas dataset; Section 9.3 is based on part of the Boreas-RT dataset; and Section 9.4 the Newer College dataset. We also leverage the VBR dataset for SLAM evaluation in Section 9.5. On top of this extensive benchmarking, we conduct an ablation using part of the Boreas-RT dataset in Section 9.6. Finally, Section 9.7 provides the reader with information about the computational requirements of 2Fast-2Lamaa.

9.1. Parameters

Aside from hardware-related variables such as extrinsic calibrations or maximum lidar range, 2Fast-2Lamaa can be tailored to specific applications or environments through a set of user-defined parameters. Table 1 provides an overview of the parameters that differ from dataset to dataset. Notably, the voxel size is smaller for the Newer College dataset, and a single global map is used as the scale of the environment is smaller than in other datasets, and the trajectory does not form large loops with loss of co-visibility. As the VBR dataset contains some indoor and handheld sequences, we use a shorter submap length when compared to the Boreas datasets to increase the potential for submap-to-submap loop closures. Note that the VBR dataset evaluation is the only set of experiments that leverages the proposed loop-closure mechanism. As the Boreas and Boreas-RT datasets’ IMUs are of good quality, the optimization-based undistortion provides good odometry information. 2Fast-2Lamaa thus leverages a keyframing strategy to limit the number of scan-to-map registrations, allowing for the use of a higher number of points for registration. Finally, free-space carving is disabled in the Newer College dataset evaluation as very few dynamic objects are present in the scene. Please note that 2Fast-2Lamaa exposes many more parameters for the sake of generality, but only the ones mentioned here have been changed in our evaluation. Performance improvements could be attained by tuning parameters for each specific sequence type.

Table 1.

List and description of the parameters that differ between datasets.

	Voxel size	Global/submaps	Keyframing	Num pts registration	Planar only	Free-space carving
Description	Size of the voxels for lidar feature extraction and map	Selection between the use of a single global map or submaps of given length γ	Travelled distance, orientation, and time threshold for keyframing scan-to-map registration	Maximum number of points for scan-to-map registration	Using only planar features	Radius for free-space carving
Boreas/Boreas-RT	0.30 m	Submaps 300 m	10 m/15°/0.5 s	8000	No	50 m
VBR	0.30 m	Submaps 50 m + online pose-graph	No	2000	No	50 m
Newer College	0.15 m	Global map	No	2000	Yes	No
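
For readers who want to reproduce these settings, the rows of Table 1 can be captured in a small configuration structure. The sketch below is purely illustrative: the class name, field names, and dictionary keys are hypothetical and do not correspond to the parameter names of the public implementation; only the values are taken from Table 1.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DatasetParams:
    """Hypothetical container mirroring the columns of Table 1."""
    voxel_size_m: float
    submap_length_m: Optional[float]   # None -> single global map
    pose_graph: bool                   # online pose-graph (loop closure)
    keyframing: Optional[Tuple[float, float, float]]  # (m, deg, s); None -> off
    max_registration_points: int
    planar_only: bool
    carving_radius_m: Optional[float]  # None -> free-space carving disabled


# Values copied from Table 1; the keys are illustrative labels only.
TABLE_1 = {
    "boreas": DatasetParams(0.30, 300.0, False, (10.0, 15.0, 0.5), 8000, False, 50.0),
    "vbr": DatasetParams(0.30, 50.0, True, None, 2000, False, 50.0),
    "newer_college": DatasetParams(0.15, None, False, None, 2000, True, None),
}
```

Encoding "disabled" options as `None` keeps the distinction between "off" and "zero" explicit, which is a design choice of this sketch rather than of the framework.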

9.2. Boreas dataset

9.2.1. Dataset description

The Boreas dataset (Burnett et al., 2023) consists of 44 sequences collected along a single 8 km suburban route with an automotive sensing platform over the course of a year. The particularity of this dataset is the wide range of weather conditions across sequences, from bright sunny days to heavy rainfall and snowstorms. The vehicle is equipped with a Velodyne Alpha Prime lidar, a Navtech RAS6 imaging radar, a FLIR Blackfly S camera, and an Applanix RTK-GNSS-IMU ground-truthing solution. Figure 10 illustrates the impact of extreme weather conditions on lidar data. The Boreas dataset withholds the ground-truth trajectory for 13 of the 44 sequences, corresponding to more than 100 km of data collection. These are used to benchmark various state estimation methods with a public leaderboard. While not optimal due to data contamination between the ground-truth solution and the input of the various algorithms, the Applanix IMU is used in conjunction with the lidar in the rest of this section, as it is the only inertial sensor aboard the vehicle.

Figure 10.

Impact of a snowstorm on the lidar data in the Boreas dataset (sequences 2020-12-04-14-00 and 2021-01-26-10-59). Ice has formed on the front-facing part of the lidar, completely blocking part of the sensor’s FoV. Additionally, falling snow creates many points floating in the air (z-coloured), making state estimation challenging.

9.2.2. Baselines and metrics

For odometry, we report the leaderboard’s KITTI odometry metric. Succinctly, this metric aligns trajectory chunks (every 10 lidar frames) of length (100 m, 200 m, …, 800 m) based on the first pose of each chunk, computes the error in position and orientation at the end of each segment, and reports the average translation and rotation error relative to the distance travelled. By submitting our results to the leaderboard, we benchmark our method against LTR (Burnett et al., 2022), STEAM-LIO (Burnett et al., 2025a), and OG (Le Gentil et al., 2025). Both LTR and STEAM-LIO are based on continuous-time ICP for lidar state estimation. The continuous trajectory formulation leverages efficient GPs with noise-on-acceleration motion priors. A major difference is the integration of inertial measurements in STEAM-LIO. The OG baseline does not use the lidar but solely the 3D gyroscope and wheel encoder. It represents the simplest and most efficient method for automotive odometry. Nonetheless, OG provides odometry accuracy on par with state-of-the-art exteroceptive frameworks.
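
The chunk-based metric described above can be sketched as follows. This is a minimal illustration assuming poses are given as (N, 4, 4) homogeneous matrices; the function names are ours, and the official leaderboard evaluation may differ in details such as frame conventions or chunk selection.

```python
import numpy as np

LENGTHS_M = (100, 200, 300, 400, 500, 600, 700, 800)
STEP = 10  # a new chunk starts every 10 lidar frames


def cumulative_distance(poses):
    """Distance travelled up to each pose, from (N, 4, 4) homogeneous poses."""
    steps = np.linalg.norm(np.diff(poses[:, :3, 3], axis=0), axis=1)
    return np.concatenate(([0.0], np.cumsum(steps)))


def kitti_odometry_errors(gt, est):
    """Return (translation error in %, rotation error in deg per 100 m)."""
    dist = cumulative_distance(gt)
    t_errs, r_errs = [], []
    for first in range(0, len(gt), STEP):
        for length in LENGTHS_M:
            # First frame at least `length` metres after the chunk start.
            last = int(np.searchsorted(dist, dist[first] + length))
            if last >= len(gt):
                break
            # Relative motion over the chunk, expressed in the first pose's
            # frame; this is the "alignment on the first pose" of the chunk.
            d_gt = np.linalg.inv(gt[first]) @ gt[last]
            d_est = np.linalg.inv(est[first]) @ est[last]
            err = np.linalg.inv(d_est) @ d_gt
            cos_angle = np.clip((np.trace(err[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
            t_errs.append(np.linalg.norm(err[:3, 3]) / length)
            r_errs.append(np.arccos(cos_angle) / length)
    # Mean over all chunks; rotation converted from rad/m to deg/100 m.
    return 100.0 * np.mean(t_errs), np.degrees(np.mean(r_errs)) * 100.0
```

Because each chunk is anchored on a single pose, any orientation error at that pose is amplified by the chunk length, which is why this metric is sensitive to rotational accuracy.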

To assess the localization ability of 2Fast-2Lamaa, we leverage the localization part of the leaderboard. It consists of creating a map of the route given one sequence, and localizing within that map using 10 other sequences. The metric is the RMSE of the relative pose between the trajectory of the mapping sequence and the localization ones. Note that only LTR has been submitted to the Boreas benchmark (more baselines are used with a subset of the Boreas-RT dataset in Section 9.3). LTR’s strategy for localization is performing scan-to-local-map registration while accounting for past estimates and odometry in the form of a prior on the state.

9.2.3. Odometry benchmark

Table 2 reports the odometry accuracy of 2Fast-2Lamaa and the different baselines. Except for OG, which displays a very small rotational error, 2Fast-2Lamaa is the most accurate method, significantly outperforming the other frameworks. In this experiment, 2Fast-2Lamaa uses submaps (topometric mapping) as it provides slightly better odometry results than the global map version on the Boreas-RT dataset. Section 9.6 provides more details about this difference. As shown in Table 3, 2Fast-2Lamaa’s accuracy is consistent throughout the different sequences, except for 2021-01-26-10-59, which was collected in a snowstorm during which ice accumulated on the front-facing part of the lidar, blocking a significant portion of the sensor’s field of view (cf. Figure 10). As all the methods provide decent odometry estimates, it is difficult to draw any strong conclusion on why our method performs better. Note that STEAM-LIO and LTR elegantly perform motion-distortion correction as part of their continuous-time state estimation without ‘fixing’ the undistortion in an open-loop fashion based on previous state estimates. Thus, the difference in the undistortion strategy is unlikely to explain the difference in the global performance. We believe that our continuous map representation makes a difference, as LTR and STEAM-LIO rely on discrete downsampled point clouds with normals to represent the environment. The impact of the continuous map is investigated in Section 9.6.

Table 2.

Average relative pose accuracy of the proposed method and several baselines on the Boreas dataset leaderboard (Burnett et al., 2023).

Method	Trans. (%)	Rot. (°/100 m)
OG (Le Gentil et al., 2025)	0.54	0.07
LTR (Burnett et al., 2022)	0.54	0.16
STEAM-LIO (Burnett et al., 2025a)	0.45	0.15
2Fast-2Lamaa (ours)	0.27	0.10

Table 3.

Per-sequence odometry error (KITTI metric) of 2Fast-2Lamaa on the Boreas dataset leaderboard.

9.2.4. Localization benchmark

For consistency with the odometry benchmark and to match the topometric nature of LTR’s maps, we used 2Fast-2Lamaa with submaps (as opposed to a single global map). 2Fast-2Lamaa’s submaps correspond to 300 m of trajectory while LTR uses a threshold of 30 m to create a new submap. Table 4 provides the leaderboard results. LTR provides slightly better performance over most of the axes. We believe that the state prior in LTR’s formulation helps stabilize the pose estimate between consecutive scans, thus providing more accurate localization abilities. This is supported by the higher roll and pitch rotational error of 2Fast-2Lamaa: our pure scan-to-map registration does not enforce any motion model nor leverage information from previous state estimates. Analyzing the per-sequence results for 2Fast-2Lamaa, there is no strong correlation between both the lateral and longitudinal error and the weather conditions. However, sequences with snow display a notably higher error along the vertical axis. This is expected as the mapping sequence boreas-2020-11-26-13-58 is free from snow. Accordingly, the snow cover on the ground creates a vertical offset in the estimated poses. Overall, despite slightly worse performance compared to LTR, both methods offer high levels of accuracy with under 5 cm lateral RMSE, which is less than the width of any road lane marking.

Table 4.

RMSE localization error for 2Fast-2Lamaa and LTR on the Boreas dataset leaderboard.

9.3. Boreas-RT

9.3.1. Dataset description

To evaluate the proposed framework, we use data from the Boreas-RT dataset (Lisus et al., 2026). It uses the same sensing platform as the Boreas dataset with the main hardware differences being a different radar firmware and an additional 6-DoF Silicon Sensing DMU41 IMU to prevent data contamination between ground-truth and raw sensor data. In this section, we leverage 16 sequences in four different environments, totalling 120 km of data for benchmarking against various baselines. The different routes are Suburbs, Regional, Tunnel, and Skyway, listed in increasing level of difficulty. We also provide 2Fast-2Lamaa’s results on all of the 60 sequences of Boreas-RT (≈650 km) in Appendix A.4.

Suburbs sequences follow the same route as in the Boreas dataset. The road traffic varies between sequences, but a lot of geometric features are present in the surroundings. The Regional data is collected at a higher speed with fewer geometric features to constrain the vehicle’s pose. The Tunnel route goes through a kilometre-long tunnel with very few geometric features to leverage. Finally, the Skyway sequences go over a bridge/skyway with only a few lampposts as static features. A lot of the non-ground lidar points correspond to moving vehicles. Figure 11 provides images from the onboard camera as well as the OpenStreetMap overlay of the trajectories. Both the Suburbs and Skyway routes form loops, whereas Regional and Tunnel sequences do not revisit previously driven areas. The Regional data present an extra challenge for localization: the route is driven twice in each direction, thus, when mapping with one sequence and localizing with the other three, two of the localization runs experience quite different viewpoints than the one of the mapping step (cf. Figure 12). Similarly, the tunnel sequences are collected while driving in both directions. However, the tunnel consists of two separate tubes (one for each traffic direction) without co-visibility. Thus, our Tunnel localization experiments are run for each driving direction independently (two times: one sequence for mapping, one sequence for localization), and the quantitative results are the average of both.

Figure 11.

Images from the onboard camera of the Boreas-RT dataset and trajectory overlays (blue lines) on OpenStreetMap.

Figure 12.

The four Regional sequences have been collected in both directions (two each way). When mapping with one sequence, part of the environment is only visible from the opposite traffic lane, thus leaving gaps in the map (as illustrated here under a bridge, with the map in grey, and the mapping trajectory in green). This makes localization challenging: the second line represents the localization trajectory of the second sequence and is coloured with the localization error (purple low and red high).

9.3.2. Baselines and metrics

The metrics used for odometry and localization are the same as for the Boreas experiments in Section 9.2. Regarding the odometry baselines, we benchmark 2Fast-2Lamaa against MOLA (Blanco-Claraco, 2025), Fast-LIO2 (Xu et al., 2022), and LTR with gyroscope (Burnett et al., 2022). Each of these methods addresses motion distortion in a different manner: MOLA assumes strict constant velocity during each lidar scan (optimizing for the velocity at each ICP step), Fast-LIO2 undistorts the incoming data based on the previous estimate and the open-loop propagation of the IMU readings, and LTR performs fully continuous-time estimation with GPs. Note that the LTR version here is different from the one present on the Boreas leaderboard. The main difference is the integration of angular velocity readings from the gyroscope as extra factors in the continuous-time optimization. For evaluating 2Fast-2Lamaa’s localization abilities, we use LTR and Fast-LIO-localization (HViktorTsoi, 2020) as baselines. The latter is an open-source project that performs localization in maps built with Fast-LIO2.

Sensor-specific parameters are updated for each baseline. Our experiments with MOLA use the default algorithmic parameters from the public implementation. Fast-LIO2 does not expose algorithmic parameters to the user. For LTR, the parameters have been tuned for the best average performance. Fast-LIO localization has been tuned to limit the number of critical failures.

9.3.3. Odometry benchmark

The odometry errors obtained with our method and the baselines are reported in Table 5. Both MOLA and Fast-LIO2 experienced failures in at least one sequence (large drift above 10 %, or complete state divergence), with MOLA failing in half of the total sequences. For fairness, we only display the average across the successful runs. When both methods succeed in estimating the vehicle’s trajectory, they reach a similar level of accuracy. LTR performs well when the environment is rich in geometric features (Suburbs and Regional), but its odometry error significantly increases for more challenging routes. 2Fast-2Lamaa is the only method that provides consistent results across all environments without failures. Its translation accuracy outperforms all the baselines. The Suburbs and Regional results of both LTR and 2Fast-2Lamaa highlight the accuracy gain of elegantly modelling the continuous motion of the sensors, thus addressing the problem of motion-distortion correction in a principled way. We believe that the increased difference in accuracy for more challenging sequences can be explained by the use of our continuous dense distance fields that better represent the environment.

Table 5.

Average relative pose accuracy of the proposed method and several baselines on a subset of the Boreas-RT dataset.

Sequence type (length, avg./max. vel.)	MOLA	Fast-LIO2	LTR w/gyro	2Fast-2Lamaa
Sequence type (length, avg./max. vel.)	(Blanco-Claraco, 2025)	(Xu et al., 2022)	(Burnett et al., 2022)	Topometric (ours)
Suburbs (4 × 7.9 km, 8.1/18.6 m/s)	0.74/0.27	0.74/0.20	0.29/0.09	0.22/0.08
Regional (4 × 9.3 km, 11.1/27.0 m/s)	0.81/0.23^a	0.88/0.19^b	0.35/0.09	0.20/0.08
Tunnel (4 × 1.9 km, 8.9/28.2 m/s)	1.72/0.30^a	0.83/0.17	2.49/0.09	0.34/0.11
Skyway (4 × 11.1 km, 18.2/31.7 m/s)	X^a	0.98/0.20	0.59/0.08	0.30/0.09

KITTI odometry metric reported as XX/YY with XX (%) and YY (°/100 m) the translation and orientation errors, respectively.

^aMOLA displayed high drift (error > 10 %) on three sequences from Regional, one from Tunnel, and all from Skyway. Average computed by omitting these failed sequences.

^bFast-LIO2 failed on one of the Regional sequences (trajectory divergence). Average computed by omitting that sequence.

9.3.4. Localization benchmark

Table 6 shows the results of our evaluation. On the Suburbs sequences, 2Fast-2Lamaa and LTR perform similarly with a slight advantage for LTR, as in the Boreas dataset. However, for all the other routes, 2Fast-2Lamaa is the best performing method with the smallest translational errors. When considering the more challenging environments (Tunnel and Skyway), the difference between our method and the baselines is larger. 2Fast-2Lamaa’s error stays contained with RMSEs under 0.20 m, while LTR and Fast-LIO-localization display errors of a couple of metres. Note that overall Fast-LIO-localization performs poorly and fails on many sequences (RMSE > 10 m). Figure 13 shows the localization error (position) for all methods and sequences as a function of the distance travelled. We believe that the Fast-LIO-localization implementation does not fuse the odometry information and the regular localization steps appropriately, and potential dropouts exacerbate the issue due to high computational load. This is supported by the noisy nature of Fast-LIO-localization’s plots, which oscillate between centimetre- and metre-level errors. Eventually, most sequences end by drifting away from the original trajectory.

Table 6.

RMSE localization error for 2Fast-2Lamaa and different baselines on a subset of the Boreas-RT dataset.

Seq. type and method	Long. (m)	Lat. (m)	Vert. (m)	Roll (°)	Pitch (°)	Yaw (°)
Suburbs
Fast-LIO-localization (HViktorTsoi, 2020)^a	X	X	X	X	X	X
LTR w/gyro (Burnett et al., 2022)	0.032	0.021	0.032	0.013	0.011	0.021
2Fast-2Lamaa topometric (ours)	0.024	0.021	0.034	0.036	0.025	0.026
Regional
Fast-LIO-localization (HViktorTsoi, 2020)^a	2.031	0.510	0.603	0.310	0.660	1.183
LTR w/gyro (Burnett et al., 2022)	0.056	0.055	0.061	0.037	0.039	0.035
2Fast-2Lamaa topometric (ours)	0.042	0.035	0.029	0.038	0.036	0.033
Tunnel
Fast-LIO-localization (HViktorTsoi, 2020)^a	2.212	0.100	0.048	0.205	0.192	0.715
LTR w/gyro (Burnett et al., 2022)	2.960	0.111	0.044	0.063	0.061	0.071
2Fast-2Lamaa topometric (ours)	0.178	0.024	0.026	0.087	0.020	0.026
Skyway
Fast-LIO-localization (HViktorTsoi, 2020)^a	6.105	1.734	0.888	0.482	0.538	0.968
LTR w/gyro (Burnett et al., 2022)^b	7.502	0.203	0.085	0.054	0.042	0.175
2Fast-2Lamaa topometric (ours)	0.059	0.026	0.035	0.050	0.030	0.028

^aFast-LIO-localization succeeded (RMSE < 10 m) only on 2 out of 3 Regional sequences, 1 out of 2 Tunnel, and 1 out of 3 Skyway.

^bOnly 1 out of 3 Skyway sequences ran to completion for LTR.

Figure 13.

Localization error (position [m] with logarithmic scale) against travelled distance of 2Fast-2Lamaa and the different baselines on the Boreas-RT dataset. The different sequences are shown with different shades of colour.

We also want to remind the reader that the Regional sequences are run in both directions for localization, while the mapping is done only in one, as previously illustrated in Figure 12. When considering the sequence where the car is travelling in the same direction as the mapping run, the RMSEs are: Long. = 0.030 m, Lat. = 0.025 m, Vert. = 0.029 m, Roll = 0.029°, Pitch = 0.022°, and Yaw = 0.028°. These values are similar to those obtained in the easiest environment (Suburbs), demonstrating robustness to high-velocity trajectories.

9.4. Newer College dataset

9.4.1. Dataset description

To demonstrate the versatility of 2Fast-2Lamaa, we benchmark it using a dataset collected with a handheld device, the Newer College Dataset (Ramezani et al., 2020). It consists of five sequences collected while walking around the New College at the University of Oxford (UK) with a sensor suite equipped with an Intel RealSense D435i stereo camera and an Ouster OS1-64 lidar. Both sensors possess an embedded IMU. In our experiments, we only use the lidar and its internal IMU. The dataset also provides a detailed map of the environment as a coloured point cloud acquired with a Leica BLK360 survey lidar. The ground-truth trajectory of the sensors is obtained by registering the individual lidar scans to the map with ICP.

9.4.2. Baselines and metrics

For odometry, we adopt the standard absolute trajectory error (ATE) to benchmark against various lidar-inertial frameworks. The ATE is defined as the RMSE between the positions of the estimated trajectory and the ones of the ground-truth after alignment using the Umeyama algorithm (Umeyama, 1991). Similarly to the evaluation with the Boreas-RT dataset, we use Fast-LIO2 (Xu et al., 2022) and STEAM-LIO (Burnett et al., 2025a) as baselines. We also include DLIO (Chen et al., 2023) as it is one of the best performing methods on the Newer College dataset.
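
As an illustration of this definition, the ATE computation can be sketched in a few lines of NumPy. The sketch assumes a rigid alignment (rotation and translation, scale fixed to 1) and already-associated position pairs; the function names are ours and the evaluation scripts used for the reported numbers may differ.

```python
import numpy as np


def umeyama_rigid(src, dst):
    """Least-squares rotation R and translation t such that dst ~ R @ src + t.

    src, dst: (N, 3) arrays of corresponding positions. Scale is fixed to 1.
    """
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    cov = (dst - mu_dst).T @ (src - mu_src) / len(src)
    U, _, Vt = np.linalg.svd(cov)
    # Guard against a reflection in the SVD solution.
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0
    R = U @ S @ Vt
    t = mu_dst - R @ mu_src
    return R, t


def ate_rmse(est_xyz, gt_xyz):
    """RMSE of the position differences after aligning est onto gt."""
    R, t = umeyama_rigid(est_xyz, gt_xyz)
    aligned = est_xyz @ R.T + t
    return float(np.sqrt(np.mean(np.sum((aligned - gt_xyz) ** 2, axis=1))))
```

Since the alignment absorbs any global rotation and translation, the ATE only measures the residual shape difference between the two trajectories.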

It is important to note that the ground-truth provided in the Newer College dataset is not as accurate as that of the other datasets in our experiments. In their paper, Ramezani et al. (2020) showed that when the system is static at the start of the dataset, the ground-truth position jitters in a sphere of around 15 cm in diameter. Additionally, we empirically found a non-negligible rotational error (up to 2°) that hinders the use of the Boreas localization metric, thus preventing the evaluation of localization from one sequence to the other.⁴ Accordingly, we only perform localization against the provided map with 2Fast-2Lamaa as a demonstration of its ability to perform localization within an existing global map. We compute the localization RMSE directly between the provided ground-truth positions and the estimated ones as a sanity check.

9.4.3. Odometry benchmark

Table 7 shows the ATE obtained for each of the sequences. Note that the baseline results are the ones reported by Burnett et al. (2025a) for STEAM-LIO and by Chen et al. (2023) for Fast-LIO2 and DLIO. Overall, all the methods provide a similar level of accuracy, with a slight advantage for 2Fast-2Lamaa for three out of the five sequences. This benchmark demonstrates that our method competes with state-of-the-art frameworks even when considering handheld sensor suites. It also shows that our method can be used with a low-cost IMU, such as the one embedded in the Ouster lidar.

Table 7.

ATE and RMSE localization error (m) for 2Fast-2Lamaa and various baselines on the Newer College dataset sequences.

Metric and method	01_short (1609 m, 1530 s)	02_long (3063 m, 2180 s)	05_quad (479 m, 398 s)	06_spinning (97 m, 120 s)	07_parkland (696 m, 500 s)
ATE
Fast-LIO2 (Xu et al., 2022)	0.378	0.332	0.088	0.078	0.148
DLIO (Chen et al., 2023)	0.360	0.327	0.084	0.061	0.120
STEAM-LIO (Burnett et al., 2025a)	0.304	0.337	0.109	0.082	0.144
2Fast-2Lamaa global map (ours)	0.262	0.266	0.078	0.085	0.115
Localization RMSE
2Fast-2Lamaa global map (ours)	0.128	0.154	0.089	0.091	0.155

9.4.4. Localization benchmark

The localization RMSE of 2Fast-2Lamaa is reported in the last line of Table 7. As mentioned previously, the ground-truth accuracy level does not allow for a thorough analysis of the proposed pipeline’s localization performance. Accordingly, it is difficult to draw any conclusion from these figures. However, these results demonstrate the soundness of the proposed approach and its ability to perform localization in a map generated with another sensor and mapping algorithm.

9.5. VBR-SLAM dataset

9.5.1. Dataset description

To further demonstrate the versatility of 2Fast-2Lamaa and the proposed online loop-closure detection and correction mechanism, we benchmark our method on the VBR-SLAM dataset (Brizi et al., 2024). It consists of 16 sequences collected with a lidar, stereo vision, and an RTK-GPS-IMU solution in 6 different environments. Half of the sequences use an automotive platform with an Ouster OS0-64 lidar at 20 Hz (environments Campus and Ciampino). The other half is collected using a handheld sensor suite comprising an Ouster OS1-128 lidar, which collects scans at 10 Hz (environments Colosseo, Pincio, Spagna, and DIAG). In both scenarios, 2Fast-2Lamaa uses the lidar’s embedded IMU. Out of the 16 sequences, only 8 have publicly available ground-truth. The ground-truth of the other sequences is held out for use in the dataset’s public leaderboard. In this section, we only leverage sequences with provided ground-truth.⁵

9.5.2. Baselines and metrics

As the trajectories form large loops, we use 2Fast-2Lamaa with the online loop-closure detection and correction mechanism from Section 7. We use PIN-SLAM (Pan et al., 2024), SMLE (Bhandari et al., 2024), and KISS-ICP (Vizzo et al., 2023) as lidar-based baselines. PIN-SLAM is especially designed for state estimation with loop closures. As done in the VBR-SLAM public benchmark, we report both the RMSE ATE, and the relative pose error (RPE) as computed with the VBR-provided script. The RPE corresponds to the KITTI odometry metric, but computed with trajectory chunks of different sizes proportional to the whole trajectory length. The VBR-SLAM dataset also provides an overall score that aggregates the results throughout all the sequences to rank the different methods (the higher the better).

9.5.3. SLAM benchmark

Table 8 shows the results obtained with 2Fast-2Lamaa and the different baselines on both the training and testing sequences. While the ranking of the methods varies from one sequence to another, on average, our method displays the best performance, as shown by the overall score. Note that the second-best method, PIN-SLAM, requires a high-grade GPU to perform close to real-time operations. As discussed later in Section 9.7, 2Fast-2Lamaa only uses a fraction of a consumer-grade laptop CPU to achieve better performance. Our future work includes the formulation of a full-batch optimization built on point-to-surface residuals, with the proposed distance field, for better global map accuracy.

Table 8.

ATE obtained using the VBR-SLAM dataset.

Method	Colosseo (m)	Campus 0 (m)	Campus 1 (m)	Pincio (m)	Spagna (m)	DIAG (m)	Ciamp. 0 (m)	Ciamp. 1 (m)	Overall score
Training sequences
PIN-SLAM (Pan et al., 2024)	0.506	0.555	0.392	0.647	0.480	0.362	1.366	1.253	74.44
SMLE (Bhandari et al., 2024)	1.411	1.196	0.994	0.793	0.696	0.457	1.495	1.308	71.65
KISS-ICP (Vizzo et al., 2023)	1.517	1.026	1.039	0.785	1.048	1.397	2.808	2.153	68.23
2Fast-2Lamaa pose-graph (ours)	0.634	0.900	0.584	0.624	0.442	0.359	1.141	0.831	74.49
Testing sequences
PIN-SLAM (Pan et al., 2024)	0.422	0.570	0.864	1.063	0.373	0.394	1.790	1.521	73.00
SMLE (Bhandari et al., 2024)	0.454	0.897	1.022	1.071	0.247	0.456	1.350	1.396	73.11
KISS-ICP (Vizzo et al., 2023)	0.525	1.019	1.289	0.864	0.271	0.463	1.700	2.566	71.30
2Fast-2Lamaa pose-graph (ours)	0.402	0.484	0.874	0.756	0.350	0.405	1.154	1.558	74.02

9.5.4. Odometry benchmark

For completeness, we also provide the RPE metrics as reported on the VBR-SLAM leaderboard in Table 9. SMLE displays the best performance on this benchmark. 2Fast-2Lamaa’s results are, respectively, second and last on the training and testing sequences. It is important to note that the ground-truth of the VBR-SLAM dataset is obtained as a full-batch optimization of the lidar data using the RTK-GNSS measurements as priors. This approach does not provide an independent ground-truth and highly depends on the registration algorithm used. While sufficient for global trajectory consistency analysis (via the ATE), we believe it creates biases in the RPE evaluation. This latter metric is highly sensitive to the rotational accuracy of the ground-truth due to the lever effect of orientation inaccuracies when using long trajectory chunks as done in this leaderboard. For example, a rotation error of 0.1° corresponds to a translation error of around 0.17 m after 100 m travelled, which represents a minimum relative error of 0.17 %. To the best of our knowledge, at the time of writing, the VBR-SLAM ground-truth is computed without accounting for motion distortion. Accordingly, the rotational accuracy of the ground-truth is probably suboptimal, especially on handheld sequences where the sensors’ orientation can change a lot during each lidar scan (≫0.1°). This correlates with the fact that the best performing method, SMLE, does not perform motion-distortion correction, and that 2Fast-2Lamaa’s performance is significantly lower on the handheld sequences as opposed to the automotive ones.
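
The lever-arm figure quoted above can be checked with a short computation. The sketch assumes the simplest possible case, a straight chunk whose estimate is rotated by a constant angle about its first pose; the function name is ours.

```python
import math


def lever_arm_error(rotation_error_deg, chunk_length_m):
    """End-point offset (m) and relative error (%) of a straight chunk
    rotated by a constant angle about its first pose."""
    half_angle = math.radians(rotation_error_deg) / 2.0
    # Chord between the true and the rotated end points.
    offset_m = 2.0 * chunk_length_m * math.sin(half_angle)
    return offset_m, 100.0 * offset_m / chunk_length_m


offset, relative = lever_arm_error(0.1, 100.0)
print(f"{offset:.3f} m, {relative:.3f} %")  # 0.175 m, 0.175 %
```

In this idealized case the relative error does not depend on the chunk length, so the ground-truth orientation noise sets a floor on the achievable RPE regardless of how long the chunks are.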

Table 9.

RPE obtained using the VBR-SLAM dataset.

Method	Colosseo (%)	Campus 0 (%)	Campus 1 (%)	Pincio (%)	Spagna (%)	DIAG (%)	Ciamp. 0 (%)	Ciamp. 1 (%)	Overall score
Training sequences
PIN-SLAM (Pan et al., 2024)	0.490	0.243	0.155	0.454	11.61	0.468	0.738	0.648	66.80
SMLE (Bhandari et al., 2024)	0.431	0.613	0.312	0.461	0.310	0.457	0.424	0.293	76.70
KISS-ICP (Vizzo et al., 2023)	0.453	0.515	0.300	0.486	0.399	1.791	0.650	0.395	75.01
2Fast-2Lamaa pose-graph (ours)	0.905	0.599	0.390	0.821	0.581	0.718	0.525	0.372	75.09
Testing sequences
PIN-SLAM (Pan et al., 2024)	0.496	0.228	0.526	1.082	0.456	0.550	1.018	0.769	74.88
SMLE (Bhandari et al., 2024)	0.393	0.251	0.639	0.557	0.323	0.515	0.612	0.282	76.43
KISS-ICP (Vizzo et al., 2023)	0.453	0.250	0.655	0.488	0.410	0.776	0.761	0.475	75.73
2Fast-2Lamaa pose-graph (ours)	0.969	0.356	0.703	0.932	0.701	1.025	0.653	0.629	74.03

9.6. Ablation study

2Fast-2Lamaa is a fairly modular framework. Thus, numerous combinations of components/features can be used (global map vs submaps, using edge features or not, cleaning the map or not, …) to tailor the performance to a specific environment or application. In this section, we perform an ablation study using our 16-sequence subset of Boreas-RT (presented in Section 9.3). We refer to the full framework used to generate 2Fast-2Lamaa’s results on Boreas and Boreas-RT datasets (Sections 9.2 and 9.3, respectively) as the ‘baseline’. Each variant corresponds to a single difference from the baseline. All the results are presented in Tables 10 and 11 for the odometry and localization modes, respectively, and are discussed by theme in the following subsections.

Table 10.

Ablation study of 2Fast-2Lamaa based on the Boreas-RT dataset and using the average translational KITTI odometry metric (%).

Variant	Suburbs	Regional	Tunnel	Skyway
Baseline	0.22	0.20	0.34	0.30
No edge	0.22	0.20	0.46	0.26
No field	0.29	0.24	0.35	0.31
Global map	0.27	0.20	0.32	0.32
No map	0.55	0.66	0.56	0.85
No carving	0.31	0.20	0.33	0.31^a
No acc.	0.23	0.21	0.33	0.30
No IMU	0.25	0.25	0.36	0.33
No keyframe	0.25	0.26	0.31	0.36

Average computed with successful runs.

^a2Fast-2Lamaa failed on one sequence.

Table 11.

Ablation study of 2Fast-2Lamaa based on the Boreas-RT dataset and using the RMSE position error (m).

Seq. type and variant	Long.	Lat.	Vert.
Suburbs
Baseline	0.023	0.021	0.034
Global map	0.024	0.022	0.034
Online-carved map	0.024	0.022	0.034
Uncarved map	0.024	0.022	0.034
Regional
Baseline	0.042	0.035	0.029
Global map	0.044	0.033	0.028
Online-carved map	0.042	0.034	0.030
Uncarved map	0.042	0.033	0.029
Tunnel
Baseline	0.178	0.024	0.026
Global map	0.216	0.025	0.025
Online-carved map	0.174	0.025	0.026
Uncarved map	0.147	0.025	0.025
Skyway
Baseline	0.059	0.026	0.035
Global map	0.061	0.034	0.060
Online-carved map	0.057	0.026	0.036
Uncarved map	0.052	0.026	0.036

9.6.1. Features

2Fast-2Lamaa’s undistortion modules extract planar and edge features from the raw data, and these are used for point-to-plane and point-to-line distance residuals in the optimization. As a reminder, the first type is obtained by simply subsampling the incoming data, while the second corresponds to jumps in range. Because lidar data are collected at a finite angular resolution, the probability of acquiring a point exactly on the physical edge of an object is essentially zero. On the other hand, a point that belongs to a plane in the real world is not affected by the sensor resolution: multiple consecutive points are likely to belong to the same planar patch (given that the observed plane is large enough). Thus, when considering the reality of lidar data collection and the downstream distance-based residuals, edge features create noisier constraints in the optimization process. However, each edge residual is more constraining: a point matched to a line is free to move along a single DoF (along the line), whereas a point matched to a plane is free along two. In environments that are rich with structural elements, we expect the use of both feature types to perform similarly or worse than using solely planar points. Conversely, edge features provide crucial complementary information in geometrically challenging environments such as tunnels, where planes alone do not constrain the longitudinal motion. This reasoning is empirically verified in Table 10 with the No edge variant, which performs better than the baseline on Skyway and significantly worse on Tunnel sequences. Note that even without the edge features on these sequences, 2Fast-2Lamaa still outperforms the methods benchmarked in Table 5.
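
The difference in constraining power between the two residual types can be made concrete with a toy example. The sketch below uses textbook point-to-plane and point-to-line distances with hypothetical function names; it is not taken from the 2Fast-2Lamaa implementation.

```python
import numpy as np


def point_to_plane(p, q, n):
    """Signed distance from p to the plane through q with unit normal n."""
    return float(n @ (p - q))


def point_to_line(p, q, d):
    """Distance from p to the line through q with unit direction d."""
    return float(np.linalg.norm(np.cross(p - q, d)))


q = np.zeros(3)
n = np.array([0.0, 0.0, 1.0])   # plane z = 0
d = np.array([1.0, 0.0, 0.0])   # line along the x axis
p = np.array([0.5, 0.2, 0.1])

# Sliding p within the plane (along x or y) leaves the planar residual
# unchanged: two translational directions are unobserved by this residual.
slide_in_plane = p + np.array([1.0, 2.0, 0.0])

# Sliding p along the line (x only) leaves the edge residual unchanged:
# a single translational direction is unobserved.
slide_along_line = p + np.array([3.0, 0.0, 0.0])
```

In a tunnel whose walls, floor, and ceiling are all parallel to the driving direction, every planar residual leaves that direction unobserved, whereas an edge that is not parallel to it does constrain it.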

9.6.2. Continuous distance field

In this paragraph, we discuss the contribution of the continuous distance field map towards the global performance of the framework. 2Fast-2Lamaa allows for the direct use of point-to-point distances with the centroids of $V$ in the scan-to-map registration step, instead of the GP-based distance field. The voxel size is the same as for the baseline underlying voxelized data structure: 0.3 m in all our automotive experiments. The corresponding No field variant in Table 10 shows a drop in performance with the relative translation error increasing by 20 to 30% in the more structured environments. While the difference is limited, using the proposed distance field has a non-negligible positive impact on the overall performance. Note that the planar motion in the Skyway sequences is mostly constrained by lampposts along the road. We believe that the fairly dense voxel-based data structure provides enough points along the posts, making the point-to-point residuals sufficient in effectively constraining the motion. Section 9.7 provides information on the difference in computation time between the baseline and the No field variant.
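
A toy example helps build intuition for why point-to-point residuals on voxel centroids are less accurate than a continuous field. The sketch below is not the GP-based field of the paper: it uses a hypothetical flat ground sampled at 0.3 m and the exact plane distance as a stand-in for an ideal continuous distance field.

```python
import numpy as np

VOXEL = 0.3  # m, as in the automotive experiments

# Centroids of a voxelized ground plane z = 0 (hypothetical toy map).
axis = np.arange(-3.0, 3.0 + 1e-9, VOXEL)
gx, gy = np.meshgrid(axis, axis)
centroids = np.column_stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)])


def nearest_centroid_distance(p):
    """Point-to-point distance to the closest voxel centroid."""
    return float(np.min(np.linalg.norm(centroids - p, axis=1)))


def surface_distance(p):
    """Stand-in for a continuous field: exact distance to the plane z = 0."""
    return abs(float(p[2]))


# Query point 10 cm above the ground, halfway between four centroids.
p = np.array([0.15, 0.15, 0.10])
d_point = nearest_centroid_distance(p)  # inflated by the in-plane offset
d_field = surface_distance(p)           # true distance to the surface
```

Here the point-to-point distance is more than twice the true surface distance, and its gradient pulls the point sideways towards a centroid rather than straight down to the surface, which is the kind of bias a continuous field avoids.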

9.6.3. Global map versus submaps

As discussed in Appendix A.3, the use of i-Octree enables fast distance queries in incrementally built large-scale maps. If the system’s trajectory does not form large loops, the drift of the estimated state when revisiting previously mapped locations might be small enough to enable the use of a single global map instead of a succession of submaps, which in turn enables the estimation of globally consistent trajectories and maps. Figure 14 illustrates the trajectory obtained when using an incrementally built global map on a Suburbs sequence. Note that 2Fast-2Lamaa is the only method that can provide a consistent trajectory (and map) as the other frameworks end up ‘forgetting’ previously traversed areas. Looking at the odometry metrics for Global map in Table 10, ‘single-way’ routes (Regional and Tunnel) are not affected by the change from submap to global map for scan-to-map registration, as expected. Interestingly, the routes that loop before heading back to the starting position are impacted negatively. This is due to the high sensitivity of the KITTI odometry metric to the orientation estimates. When the trajectory revisits a previously mapped area, due to some drift, the current state estimate can deviate from the locally consistent odometry to align the current data to the ‘old part’ of the map. This creates small ‘jumps’ in the trajectory estimate. As the KITTI odometry errors are computed after aligning a single pose per trajectory chunk, this deviation, especially in orientation due to a significant lever arm effect for trajectory chunks from 100 to 800 m, locally degrades the metrics.

Figure 14.

Estimated trajectories with 2Fast-2Lamaa and the different baselines on a Suburbs sequence. 2Fast-2Lamaa, with an incrementally built global map, is the only method that produces a trajectory that ends at the same location as the ground-truth. Note that none of the frameworks perform explicit loop-closure correction.

This last phenomenon does not really impact the localization performance within the global map in well-structured environments. As shown in Table 11, the results do not significantly change between the baseline and the Global map variant except for the Tunnel and Skyway sequences. We believe that the inherent drift of 2Fast-2Lamaa when building the global map reduces the sharpness of the few features present in the Tunnel sequences (mostly corresponding to sparse service doors). Having ‘blurred’ features in the map prevents high accuracy along the longitudinal axis. For the Skyway sequences, the vertical axis is significantly impacted, while the lateral error is slightly higher. This is due to the drift of the mapping trajectory creating ‘double grounds and walls’ in areas of the map. Note that the estimated pose errors stay similar for the other axes. While not demonstrated here, an advantage of building globally consistent maps for later localization is the simplicity of the localization process. For 2Fast-2Lamaa, it is a simple scan-to-map registration. There is no need for a complex module to perform topometric graph navigation.

9.6.4. Map cleaning

Another component of our ablation study concerns the removal of dynamic objects from the submaps when performing odometry and localization. For odometry, the baseline includes an online free-space carving mechanism. The No carving variant does not possess this feature. The results in Table 10 show no difference on Regional and Tunnel, but Suburbs and Skyway are negatively impacted, with a higher error and a failure case, respectively.

As mentioned in the methodology part of this paper, the free-space carving can also be performed offline, given the mapping trajectory estimate, the map(s), and the undistorted scans. The localization baseline leverages both online and offline free-space carving during the mapping stage. The Online-carved map variant refers to the use solely of the online carving, and the No carving one does not perform any dynamic object removal. Figure 15 illustrates the difference between the uncarved, online-carved, and offline-carved maps. Overall, the presence of dynamic points in the map does not seem to have a significant impact on localization on the Boreas-RT dataset. A deeper analysis of the impact of dynamic points for localization with more datasets and sensors is required before drawing any final conclusion.
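For readers unfamiliar with free-space carving, the following is a generic voxel-based sketch of the idea: map points located in space that a lidar ray has traversed before reaching its return are considered dynamic and removed. This is not the mechanism described in the methodology of this paper. The function name `carve_free_space`, the half-voxel ray sampling, and the `margin` parameter are all assumptions made for illustration.

```python
import numpy as np


def carve_free_space(map_points, sensor_origin, scan_points,
                     voxel_size=0.3, margin=0.5):
    """Drop map points in voxels crossed by a lidar ray before its return.

    margin (m) leaves the last stretch of each ray untouched so that the
    observed surface itself is not carved away.
    """
    free_voxels = set()
    for p in scan_points:
        ray = p - sensor_origin
        rng = np.linalg.norm(ray)
        if rng <= margin:
            continue
        direction = ray / rng
        # Sample the ray at half-voxel steps up to (range - margin).
        for s in np.arange(0.0, rng - margin, 0.5 * voxel_size):
            sample = sensor_origin + s * direction
            voxel = np.floor(sample / voxel_size).astype(int)
            free_voxels.add(tuple(voxel.tolist()))
    map_voxels = np.floor(map_points / voxel_size).astype(int)
    keep = np.array(
        [tuple(v.tolist()) not in free_voxels for v in map_voxels], dtype=bool
    )
    return map_points[keep]


# Hypothetical toy scene: a wall at x = 10 m seen through a spot where a
# (since departed) vehicle left a point in the map at x = 5 m.
origin = np.zeros(3)
scan = np.array([[10.0, 0.0, 0.0]])
old_map = np.array([
    [5.0, 0.05, 0.05],   # on the ray, in front of the wall: carved
    [10.0, 0.05, 0.05],  # the wall itself: kept thanks to the margin
    [5.0, 3.0, 0.0],     # off the ray: kept
])
carved = carve_free_space(old_map, origin, scan)
print(carved)
```

The same routine can in principle be run online (with the current scan) or offline (replaying all undistorted scans along the mapping trajectory), which mirrors the online/offline distinction discussed above.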

Figure 15.

Illustration of the dynamic object removal results in a map slice from a Suburbs sequence. Most of the dynamic objects are removed by the online free-space carving process.

9.6.5. IMU versus constant velocity

2Fast-2Lamaa’s motion-distortion correction is based on continuous preintegration to characterize the trajectory of the system. This continuous state can be replaced with a strict motion model during the interval $[\tau_{i-1}, \tau_{i+1}]$, similarly to MOLA. In this setup, we test two motion-model combinations: gyroscope preintegration with constant linear velocity (noted No acc.), and both linear and angular constant velocities (noted No IMU). The results in Table 10 show a drop in performance for both variants when compared to the baseline, although they remain competitive. As expected, the performance of No acc. is better than that of No IMU. It is important to note that this analysis is based on automotive data. Thus, the movement of the platform is close to a constant velocity model. We expect a larger difference in performance with handheld or drone-mounted sensors.
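As an illustration of what a constant-velocity model implies for undistortion, the sketch below re-expresses each lidar point in the sensor frame at a reference time. It is a minimal example under our own assumptions, not the code of the No IMU variant: the linear velocity is taken as constant in the reference frame, the angular velocity as constant in the body frame, rotation and translation are treated separately, and the names `so3_exp` and `undistort_constant_velocity` are hypothetical.

```python
import numpy as np


def so3_exp(rotvec: np.ndarray) -> np.ndarray:
    """Rodrigues' formula: rotation vector (rad) to a 3x3 rotation matrix."""
    angle = np.linalg.norm(rotvec)
    if angle < 1e-12:
        return np.eye(3)
    k = rotvec / angle
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def undistort_constant_velocity(points, stamps, t_ref, lin_vel, ang_vel):
    """Express each point in the sensor frame at t_ref.

    points: (N, 3) raw points, each in the sensor frame at its own timestamp.
    stamps: (N,) per-point timestamps in seconds.
    lin_vel: constant linear velocity (m/s), expressed in the frame at t_ref.
    ang_vel: constant body-frame angular velocity (rad/s).
    """
    out = np.empty_like(points, dtype=float)
    for i, (p, t) in enumerate(zip(points, stamps)):
        dt = t - t_ref
        # Sensor pose at time t relative to the pose at t_ref.
        out[i] = so3_exp(ang_vel * dt) @ p + lin_vel * dt
    return out


# Hypothetical example: 10 m/s forward motion, point captured 50 ms after t_ref.
pts = np.array([[1.0, 0.0, 0.0]])
fixed = undistort_constant_velocity(
    pts, np.array([0.05]), 0.0, np.array([10.0, 0.0, 0.0]), np.zeros(3)
)
print(fixed)
```

In this picture, a variant in the spirit of No acc. would replace `so3_exp(ang_vel * dt)` with rotations integrated from gyroscope measurements while keeping the constant linear velocity term.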

9.6.6. Keyframing

To provide deeper insight into our framework’s parameters, we evaluate 2Fast-2Lamaa’s performance without keyframing and with a lower number of points (2000) for scan-to-map registration, matching the parameters used on the VBR and Newer College datasets. This variant is denoted No keyframe in Table 10. Our method still outperforms state-of-the-art baselines from Table 5 on average. The total computational load is moderately impacted by this change of parameters, with 9.60 % total CPU load for the Baseline and 15.4 % for No keyframe.

9.7. Computational requirements

All the variants of 2Fast-2Lamaa presented in this paper run in real time on a laptop equipped with an Intel i7-1370P CPU and 32 GB of RAM. Table 12 shows the CPU load and final RAM usage for a selection of variants from our ablation study. All the variants are lightweight and do not use more than 16 % of the CPU. Whether it is for odometry or localization, the main difference between using a global map or submaps resides in the larger RAM usage when using a global map. The similar CPU load in both scenarios highlights the efficiency of the i-Octree used for spatial indexing. Not using GPs or free-space carving results in a non-negligible drop in computational load. An even lighter version could be run by using neither of the two features. The No keyframe variant leads to the highest computational load. However, this is still moderate and leaves a lot of free CPU resources for other components in a standard robot navigation stack. In terms of computation time for the Baseline, on a typical Suburbs sequence, the undistortion/odometry optimization takes 41 ms on average, the scan-to-map registration (for each keyframe) requires 61 ms, and the subsequent map update lasts 54 ms with free-space carving and 19 ms without. These timings can be adjusted by changing the number of threads allocated to each module. Note that the baselines (Fast-LIO2, MOLA, and LTR) all required a more powerful computer to run in real time (Intel i9-12900K).

Table 12.

Computational load of various configurations of 2Fast-2Lamaa. The ‘core usage’ corresponds to the average number of CPU cores used by the specific component.

Variant	Undist. core usage	Loc. core usage	Total CPU (%)	Undist. RAM (GB)	Loc. RAM (GB)
Odometry
Baseline	1.19	0.73	9.60	0.17	0.33
Global map	1.14	0.88	10.4	0.17	1.57
No field	1.09	0.36	7.24	0.17	0.31
No carving	1.04	0.59	8.12	0.19	0.33
No keyframe	1.42	1.66	15.4	0.18	0.50
Localization
Baseline	1.06	0.71	8.86	0.17	0.32
Global map	1.05	0.78	9.16	0.16	2.24
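The relation between the ‘core usage’ columns and the ‘Total CPU’ column of Table 12 can be checked with a few lines. The sketch below assumes that the total is the sum of both core usages divided by the number of logical processors, and that this number is 20 for the laptop CPU; neither assumption is stated in the table, so the result should be read as a plausibility check only.

```python
LOGICAL_CPUS = 20  # assumption: logical processor count of the laptop CPU

# (undistortion core usage, localization core usage, reported total CPU %)
table_12 = {
    "Odometry / Baseline": (1.19, 0.73, 9.60),
    "Odometry / Global map": (1.14, 0.88, 10.4),
    "Odometry / No field": (1.09, 0.36, 7.24),
    "Odometry / No carving": (1.04, 0.59, 8.12),
    "Odometry / No keyframe": (1.42, 1.66, 15.4),
    "Localization / Baseline": (1.06, 0.71, 8.86),
    "Localization / Global map": (1.05, 0.78, 9.16),
}


def total_cpu_percent(undist_cores, loc_cores, logical_cpus=LOGICAL_CPUS):
    """Total CPU load (%) implied by the per-component core usages."""
    return 100.0 * (undist_cores + loc_cores) / logical_cpus


differences = {}
for name, (undist, loc, reported) in table_12.items():
    implied = total_cpu_percent(undist, loc)
    differences[name] = abs(implied - reported)
    print(f"{name:26s} implied {implied:5.2f} %  reported {reported:5.2f} %")
```

Under these assumptions, the implied totals agree with the reported ones to within roughly 0.3 percentage points, with the odometry Global map row showing the largest gap.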

10. Conclusion

In this paper, we present 2Fast-2Lamaa, a lidar-inertial mapping and localization framework that consists of two main components: an optimization-based motion-distortion correction module and a scan-to-map localization algorithm that can build maps incrementally. Given sufficient geometric features in the lidar’s surroundings, the undistortion step estimates the continuous trajectory of the sensor through point-to-line and point-to-plane distance minimization between consecutive scans. The local trajectory is characterized by the inertial data directly through continuous IMU preintegration. Thus, the initial conditions (gravity direction, linear velocity, and biases) are the only state variables to estimate, making the problem efficient to solve. For motion consistency over longer horizons, the scan-to-map registration is performed using GP-based distance fields. Thanks to the use of efficient data structures and locally computed GPs, the scan alignment can be done in real time. In odometry/mapping mode, the map is incremented after every scan-to-map registration. An optional online loop-closure detection and correction step, based on a pose-graph optimization, gives 2Fast-2Lamaa the ability to estimate globally consistent trajectories if required.

Through an extensive experimental analysis using data from automotive and handheld platforms, 2Fast-2Lamaa demonstrates state-of-the-art accuracy for odometry, localization, and SLAM. Results were consistent across the different test environments, even with challenging routes going through feature-limited tunnels where the baselines’ performance declined significantly. Note that these environments are not fully feature-deprived, as regular features such as service doors are present along the traffic lane. To address fully feature-deprived scenarios, our future work follows three directions: the integration of intensity information in both the undistortion and localization modules, the fusion with vision sensing, and the exploration of Doppler velocity measurements with frequency-modulated continuous-wave (FMCW) lidars.

Footnotes

ORCID iDs

Cedric Le Gentil

Raphael Falque

Daniil Lisus

Timothy D. Barfoot

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
