Hao Li

I am an MSc student in Mechanical Engineering at Stanford University, advised by Jiajun Wu, Fei-Fei Li, and Jeannette Bohg.

I previously received my B.S. in Mechanical Engineering from Purdue University and Shanghai Jiao Tong University, where I was fortunate to be advised by Karthik Ramani on Human-Computer Interaction. I am currently looking for Ph.D. positions. Please let me know if you are interested in my research.

I support diversity, equity, and inclusion. If you would like to chat with me about research, career plans, or anything else, feel free to reach out! I would be happy to support people from underrepresented groups in the STEM research community, and I hope my expertise can help you.

Email  /  Resume  /  Google Scholar  /  Twitter  /  GitHub

  • [Sep 2021] Started at Stanford as an MSc student in Mechanical Engineering.
  • [Aug 2022] "See, Hear, Feel: Smart Sensory Fusion for Robotic Manipulation" was accepted to CoRL'22.
  • [Jan 2023] "Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear" was accepted to ICRA'23.
  • [Feb 2023] "The ObjectFolder Benchmark: Multisensory Object-Centric Learning with Neural and Real Objects" was accepted to CVPR'23.

I'm broadly interested in artificial intelligence and robotics, including but not limited to perception, planning, control, hardware design, and human-centered AI. The goal of my research is to build agents that can achieve human-level learning and adapt to novel and challenging scenarios by leveraging multisensory information such as vision, audio, and touch.

The ObjectFolder Benchmark: Multisensory Object-Centric Learning with Neural and Real Objects
Ruohan Gao*, Yiming Dou*, Hao Li*, Tanmay Agarwal, Jeannette Bohg, Yunzhu Li, Li Fei-Fei, Jiajun Wu (*Equal Contribution)
Computer Vision and Pattern Recognition (CVPR), Accepted, 2023
dataset demo / project page / arXiv

We introduce the OBJECTFOLDER BENCHMARK, a benchmark suite of 10 tasks for multisensory object-centric learning, and the OBJECTFOLDER REAL dataset, including the multisensory measurements for 100 real-world household objects.

Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear
Ruohan Gao*, Hao Li*, Gokul Dharan, Zhuzhu Wang, Chengshu Li, Fei Xia, Silvio Savarese, Li Fei-Fei, Jiajun Wu (*Equal Contribution, in alphabetical order)
International Conference on Robotics and Automation (ICRA), Accepted, 2023
project page / arXiv

We introduce SONICVERSE, a multisensory simulation platform with integrated audio-visual simulation for training household agents that can both see and hear. We demonstrate SONICVERSE’s realism via sim-to-real transfer.

See, Hear, Feel: Smart Sensory Fusion for Robotic Manipulation
Hao Li*, Yizhi Zhang*, Junzhe Zhu, Shaoxiong Wang, Michelle A. Lee, Huazhe Xu, Edward Adelson, Li Fei-Fei, Ruohan Gao†, Jiajun Wu (*Equal Contribution) (†Equal Advising)
Conference on Robot Learning (CoRL), 2022
project page / arXiv

We build a robot system that can see with a camera, hear with a contact microphone, and feel with a vision-based tactile sensor, fusing all three sensory modalities with a self-attention model.

VRFromX: From Scanned Reality to Interactive Virtual Experience with Human-in-the-Loop
Ananya Ipsita, Hao Li, Runlin Duan, Yuanzhi Cao, Subramanian Chidambaram, Min Liu, Karthik Ramani
Conference on Human Factors in Computing Systems (CHI), 2021
project page / arXiv / video

Using our VRFromX system, users can select region(s) of interest (ROI) in a scanned point cloud, or sketch in mid-air using a brush tool, to retrieve virtual models and then attach behavioral properties to them.


Course Assistant for AA274A: Principles of Robot Autonomy, Stanford University, 2022

Template from Jon Barron's website