Abstract
Recent progress in vision-language-action models has made embodied intelligence increasingly promising, but current robotic demonstrations still expose several system-level bottlenecks, including execution mismatch, inference latency, and limited safety integration at the planning level. In this talk, I will present my research toward autonomous learning in real-world robotics through the joint lens of control, learning, and optimization. I will first introduce a model-based online learning framework for adaptive control, with rigorous convergence guarantees and successful evaluation on a pneumatic table-tennis robot, a soft robotic system, and a heavy-duty excavator. I will then discuss constraint-aware generative planning through a diffusion-based planner for obstacle avoidance in autonomous racing, where constraints are incorporated directly into the planning process. Finally, I will present my work on efficient inference of large foundation models on edge devices under memory and compute constraints, aiming to make large-model capabilities practical for real robotic deployment. Together, these directions form a system-level framework for embodied intelligence that is adaptive, safe, and deployable, and I will conclude by discussing future opportunities in vision-based and multimodal robot learning for contact-rich manipulation.
About the speaker
Hao Ma is currently a Postdoctoral Researcher at ETH Zurich and a Scientific Researcher at the Max Planck Institute for Intelligent Systems. He received his Bachelor’s degree in Energy and Power Engineering from Jilin University in 2017, completed his Master’s degree in Automotive Engineering at the Technical University of Munich (2019 to 2021), and earned his Doctorate in Dynamic Systems and Control at ETH Zurich (2022 to 2025). During his Ph.D., he was affiliated with both ETH Zurich and the Max Planck Institute for Intelligent Systems through the highly competitive Max Planck-ETH Center for Learning Systems Fellowship. His research lies at the intersection of control theory and machine learning, with a focus on enabling robots to learn autonomously in the real world. His current interests include vision-based and multimodal robot learning, contact-rich manipulation, and on-device intelligence, with an emphasis on system-level solutions for real-world robotic autonomy.
