In our work EKFPhys [], we use state-of-the-art 3D object detection and pose estimation (Lipson et al., Coupled Iterative Refinement for 6D Multi-Object Pose Estimation, CVPR 2022) to detect objects with known shape and texture in RGB-D images. The 6-DoF object pose (3D rotation and translation) is filtered jointly with the Coulomb friction coefficient between the object and the underlying surface using an extended Kalman filter (EKF). The filter uses the detected object poses as observations and a differentiable physics simulation as the state-transition model. We propose new synthetic and real benchmark datasets and evaluate how accurately our approach estimates object pose and friction parameters.
Physics-based understanding of object interactions from sensory observations is an essential capability in augmented reality and robotics. It makes it possible to capture the properties of a scene for simulation and control. In this paper, we propose a novel approach for real-to-sim that tracks rigid objects in 3D from RGB-D images and infers physical properties of the objects. We use a differentiable physics simulation, which can model contact and friction for arbitrary mesh-based shapes, as the state-transition model in an extended Kalman filter, and in this way estimate physically plausible trajectories. We demonstrate that our approach can filter position, orientation, and velocities while concurrently estimating the coefficient of friction of the objects. We analyze our approach on various sliding scenarios in synthetic image sequences of single objects and colliding objects. We also demonstrate and evaluate our approach on a real-world dataset. We make our benchmark datasets publicly available to foster future research in this problem setting and comparison with our method.
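
To illustrate the idea of filtering motion and friction jointly, the following is a minimal sketch and not the implementation used in this work. It assumes a toy one-dimensional scenario: a block sliding on a plane, with state (position, velocity, friction coefficient) and noisy position measurements standing in for detected poses. The transition function `step`, the finite-difference Jacobian (a stand-in for gradients from a differentiable simulator), the noise covariances, and all names are hypothetical choices made for illustration.

```python
import numpy as np

G = 9.81  # gravitational acceleration [m/s^2]


def step(state, dt):
    """Toy state transition: block sliding on a plane with Coulomb friction."""
    x, v, mu = state
    # Friction decelerates the block but never reverses its motion.
    v_next = max(v - mu * G * dt, 0.0)
    x_next = x + v_next * dt
    # The friction coefficient is modeled as constant over time.
    return np.array([x_next, v_next, mu])


def numerical_jacobian(f, state, dt, eps=1e-6):
    """Central finite differences; assumed stand-in for simulator gradients."""
    n = state.size
    J = np.zeros((n, n))
    for i in range(n):
        d = np.zeros(n)
        d[i] = eps
        J[:, i] = (f(state + d, dt) - f(state - d, dt)) / (2.0 * eps)
    return J


# Only the position is observed (analogous to a detected pose).
H = np.array([[1.0, 0.0, 0.0]])


def ekf_update(mean, cov, z, dt, Q, R):
    """One EKF predict/correct cycle on the augmented state."""
    F = numerical_jacobian(step, mean, dt)
    mean_pred = step(mean, dt)
    cov_pred = F @ cov @ F.T + Q
    innovation = z - H @ mean_pred
    S = H @ cov_pred @ H.T + R
    K = cov_pred @ H.T @ np.linalg.inv(S)
    mean_new = mean_pred + K @ innovation
    cov_new = (np.eye(mean.size) - K @ H) @ cov_pred
    return mean_new, cov_new


def run_demo(true_mu=0.3, n_steps=60, dt=0.01, noise_std=1e-3, seed=0):
    """Simulate a sliding block and filter it from noisy positions."""
    rng = np.random.default_rng(seed)
    truth = np.array([0.0, 2.0, true_mu])
    mean = np.array([0.0, 1.5, 0.6])    # deliberately wrong initial guess
    cov = np.diag([1e-4, 1.0, 0.25])    # assumed initial uncertainty
    Q = np.diag([1e-8, 1e-6, 1e-8])     # assumed process noise
    R = np.array([[noise_std ** 2]])    # measurement noise
    for _ in range(n_steps):
        truth = step(truth, dt)
        z = truth[:1] + rng.normal(0.0, noise_std, size=1)
        mean, cov = ekf_update(mean, cov, z, dt, Q, R)
    return mean, cov, truth


if __name__ == "__main__":
    mean, cov, truth = run_demo()
    print("estimated state:", mean)
    print("true state:     ", truth)
```

Appending the friction coefficient to the state lets the position measurements correct it through the cross-covariance with position and velocity. The block must still be sliding for friction to be observable in this toy model; once the velocity is clamped to zero, the measurements carry no further information about it.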