Hao Li / 李 皓

I am a PhD student in Mechanical Engineering at the Biomimetics & Dexterous Manipulation Laboratory, advised by Prof. Mark Cutkosky. Additionally, I am actively engaged in the OceanOneK project and have the privilege of working with Prof. Oussama Khatib.

I was an MSc student in the Stanford Vision and Learning Lab, working with Prof. Jiajun Wu and Prof. Fei-Fei Li. I previously received my dual B.S. in Mechanical Engineering from Shanghai Jiao Tong University and Purdue University, where I was fortunate to be advised by Karthik Ramani on Human-Computer Interaction.

I support diversity, equity, and inclusion. If you would like to chat with me regarding research, career plans, or anything else, feel free to reach out! I would be happy to support people from underrepresented groups in the STEM research community, and hope my expertise can help you.

Email  /  Resume  /  Google Scholar  /  Twitter  /  GitHub

  • [Oct 2023] Passed my PhD Qualifying Exam!
  • [Jun 2023] "The Design of a Virtual Prototyping System for Authoring Interactive VR Environments from Real World Scans" is accepted by JCISE.
  • [Feb 2023] "The ObjectFolder Benchmark: Multisensory Object-Centric Learning with Neural and Real Objects" is accepted by CVPR'23.
  • [Jan 2023] "Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear" is accepted by ICRA'23.
  • [Aug 2022] "See, Hear, Feel: Smart Sensory Fusion for Robotic Manipulation" is accepted by CoRL'22.
  • [Sep 2021] Started at Stanford as an MSc student in Mechanical Engineering.
Research

I'm broadly interested in artificial intelligence and robotics, including but not limited to perception, planning, control, hardware design, and human-centered AI. The goal of my research is to build agents that can achieve human-level learning and adapt to novel and challenging scenarios by leveraging multisensory information such as vision, audio, and touch.

Navigation and 3D Surface Reconstruction from Passive Whisker Sensing
Michael A. Lin, Hao Li, Chengyi Xing, Mark Cutkosky
Under Review
project page / arXiv / video / code

We present a method for using highly flexible, curved, passive whiskers mounted along a robot arm to gather sensory data as they brush past objects during normal robot motion.

The ObjectFolder Benchmark: Multisensory Object-Centric Learning with Neural and Real Objects
Ruohan Gao*, Yiming Dou*, Hao Li*, Tanmay Agarwal, Jeannette Bohg, Yunzhu Li, Li Fei-Fei, Jiajun Wu (*Equal Contribution)
Computer Vision and Pattern Recognition (CVPR), 2023
dataset demo / project page / arXiv / video / code

We introduce the OBJECTFOLDER BENCHMARK, a benchmark suite of 10 tasks for multisensory object-centric learning, and the OBJECTFOLDER REAL dataset, including the multisensory measurements for 100 real-world household objects.

Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear
Ruohan Gao*, Hao Li*, Gokul Dharan, Zhuzhu Wang, Chengshu Li, Fei Xia, Silvio Savarese, Li Fei-Fei, Jiajun Wu (*Equal Contribution, in alphabetical order)
International Conference on Robotics and Automation (ICRA), 2023
project page / arXiv / video / code

We introduce SONICVERSE, a multisensory simulation platform with integrated audio-visual simulation for training household agents that can both see and hear. We demonstrate SONICVERSE’s realism via sim-to-real transfer.

See, Hear, Feel: Smart Sensory Fusion for Robotic Manipulation
Hao Li*, Yizhi Zhang*, Junzhe Zhu, Shaoxiong Wang, Michelle A. Lee, Huazhe Xu, Edward Adelson, Li Fei-Fei, Ruohan Gao†, Jiajun Wu (*Equal Contribution) (†Equal Advising)
Conference on Robot Learning (CoRL), 2022
project page / arXiv / video / code

We build a robot system that can see with a camera, hear with a contact microphone, and feel with a vision-based tactile sensor, with all three sensory modalities fused with a self-attention model.

Academic Services

Reviewer for CoRL, RAL, CHI


Course Assistant in AA274A: Principles of Robot Autonomy, Stanford University, 2022

Course Assistant in CS231N: Deep Learning for Computer Vision, Stanford University, 2023

Template from Jon Barron's website