Talk
17 November 2025 at 10:00 - 11:30 | MPI-IS Tübingen, N0.002

Special Talk: From Scaling Data to Scaling Capability: Compositional, Multimodal, and Embodied Robot Learning

ORGANIZERS
Moritz Hardt (Social Foundations of Computation, Director)
Eva Lämmerhirt (Scientific Coordination Office, Institute Management Officer)
Nisha Tyagi (Scientific Coordination Office)

Scaling robot learning has led to impressive progress, but also to an unsustainable dependence on large datasets collected mainly through robot teleoperation. Moving forward, we need approaches that scale capability rather than data. In this talk, I will present three complementary efforts in this direction. Masquerade exploits the abundance of human egocentric videos to scale robot policy learning from rich but unstructured demonstrations. DexForce couples visual and tactile modalities to enable dexterous, contact-rich manipulation. MobiPi generalizes manipulation policies to mobile robots through compositional control, avoiding exponential growth in data requirements as embodiment complexity increases. Together, these projects argue for capability scaling driven by compositional learning, multimodal grounding, and embodied intelligence.

Speaker Biography

Jeannette Bohg (Stanford University)

Assistant Professor of Computer Science

Jeannette Bohg is an Assistant Professor of Computer Science at Stanford University, where she leads research at the intersection of robotics, machine learning, and computer vision with a focus on autonomous robotic manipulation. Her lab aims to uncover the principles of robust sensorimotor coordination and implement them on real robots. Before joining Stanford, she was a group leader in the Autonomous Motion Department (AMD) at the Max Planck Institute for Intelligent Systems from 2012 to 2017. She earned her Ph.D. in the Division of Robotics, Perception, and Learning (RPL) at KTH Royal Institute of Technology in Stockholm, where her thesis introduced novel methods for multi-modal scene understanding in robotic grasping. She also studied at Chalmers University of Technology in Gothenburg and the Technical University of Dresden, receiving an M.Sc. in Art and Technology and a Diploma in Computer Science, respectively. Bohg’s work has been recognized with multiple Early Career and Best Paper awards, including the 2019 IEEE Robotics and Automation Society Early Career Award, the 2020 Robotics: Science and Systems Early Career Award, and the 2023 Sloan Research Fellowship.