In our work [], we fuse high-rate pose (3D position and orientation) estimates from visual-inertial odometry (VIO, []) with velocity estimates from foot contacts and leg kinematics of a quadruped robot (Solo12). The leg kinematics allow the height above the ground to be measured, while the visual-inertial odometry complements this by estimating lateral motion and yaw rotation with respect to a local map.
Implementing dynamic locomotion behaviors on legged robots requires a high-quality state estimation module. Especially when the motion includes flight phases, state-of-the-art approaches fail to produce reliable estimates of the robot posture, in particular the base height. In this paper, we propose a novel approach for combining visual-inertial odometry (VIO) with leg odometry in an extended Kalman filter (EKF) based state estimator. The VIO module uses a stereo camera and an IMU to yield low-drift 3D position and yaw orientation and drift-free pitch and roll orientation of the robot base link in the inertial frame. However, these estimates arrive with considerable latency due to image processing and optimization, and their update rate is too low for low-level control. To reduce the latency, we predict the VIO state estimate at the rate of the IMU measurements of the VIO sensor. The EKF module takes the base pose and linear velocity predicted by VIO, fuses them with measurements from a second high-rate IMU and leg odometry, and produces robot state estimates at a high frequency and with a latency small enough for control. We integrate this lightweight estimation framework with a nonlinear model predictive controller and show successful implementation of a set of agile locomotion behaviors, including trotting and jumping at varying horizontal speeds, on a torque-controlled quadruped robot.
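The idea of predicting a delayed VIO estimate forward at IMU rate can be illustrated with a minimal sketch. This is not the implementation from the paper: the function names, the z-up gravity convention, and the omission of IMU biases and noise are all assumptions made for illustration. The sketch takes the latest (delayed) VIO state and propagates it through the IMU samples received since that estimate was computed.

```python
import numpy as np

# Assumed convention: world frame with z pointing up.
GRAVITY = np.array([0.0, 0.0, -9.81])

def skew(w):
    """Skew-symmetric matrix such that skew(w) @ x == np.cross(w, x)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(phi):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(phi)
    K = skew(phi)
    if theta < 1e-9:
        return np.eye(3) + K  # first-order approximation for tiny angles
    return (np.eye(3)
            + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * (K @ K))

def predict_state(R, p, v, imu_samples):
    """Propagate orientation R, position p, and velocity v (world frame)
    through buffered IMU samples given as (gyro, accel, dt) tuples.
    Hypothetical helper: biases and noise are ignored in this sketch."""
    for gyro, accel, dt in imu_samples:
        a_world = R @ accel + GRAVITY          # remove gravity in world frame
        p = p + v * dt + 0.5 * a_world * dt**2
        v = v + a_world * dt
        R = R @ exp_so3(gyro * dt)             # body-frame angular rate
    return R, p, v
```

In such a scheme, each new VIO estimate would reset the starting state, and only the IMU samples newer than that estimate would be replayed, so the prediction horizon stays as short as the VIO latency.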
Cameras and inertial measurement units are complementary sensors for
ego-motion estimation and environment mapping. Their combination makes
visual-inertial odometry (VIO) systems more accurate and robust. For
globally consistent mapping, however, combining visual and inertial
information is not straightforward. To estimate the motion and geometry
from a set of images, large baselines are required. Because of that,
most systems operate on keyframes that are separated by large time
intervals. Inertial data, on the other hand, quickly degrades
with the duration of the intervals, and after several seconds of
integration it typically contains little useful information.
In this paper, we propose to extract relevant information for
visual-inertial mapping from visual-inertial odometry using non-linear
factor recovery. We reconstruct a set of non-linear factors that make
an optimal approximation of the information on the trajectory
accumulated by VIO. To obtain a globally consistent map we combine
these factors with loop-closing constraints using bundle adjustment.
The VIO factors make the roll and pitch angles of the global map
observable, and improve the robustness and the accuracy of the mapping.
In experiments on a public benchmark, we demonstrate superior
performance of our method over the state-of-the-art approaches.
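
The remark that inertial data degrades with the integration interval can be made concrete with a small sketch. It assumes a constant, uncompensated accelerometer bias as the only error source, which is a simplification and not a model taken from the paper; the function name and the bias value in the usage note are hypothetical. Double integration of a constant bias b yields a position error that grows as 0.5 * b * t^2.

```python
def position_drift(accel_bias, duration, dt=0.005):
    """Position error (m) accumulated by double-integrating a constant
    accelerometer bias (m/s^2) over `duration` seconds.
    Illustrative only: real IMU errors also include noise, gyro bias,
    and orientation-dependent gravity leakage."""
    v_err = 0.0
    p_err = 0.0
    steps = int(round(duration / dt))
    for _ in range(steps):
        p_err += v_err * dt + 0.5 * accel_bias * dt * dt
        v_err += accel_bias * dt
    return p_err

def closed_form_drift(accel_bias, duration):
    """Analytic result for the same constant-bias model."""
    return 0.5 * accel_bias * duration**2
```

Calling `position_drift(0.05, 1.0)` and `position_drift(0.05, 10.0)` with an assumed bias of 0.05 m/s^2 shows the quadratic growth: a tenfold longer interval gives a hundredfold larger error, which is why pre-integrated inertial measurements between widely spaced keyframes carry little information.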