Hao Li

I am a PhD student in Mechanical Engineering at the Biomimetics & Dexterous Manipulation Laboratory, advised by Prof. Mark Cutkosky. Additionally, I am actively engaged in the OceanOneK project and have the privilege of working with Prof. Oussama Khatib.

I was an MSc student in the Stanford Vision and Learning Lab, working with Prof. Jiajun Wu and Prof. Fei-Fei Li. I previously received my dual B.S. in Mechanical Engineering from Shanghai Jiao Tong University and Purdue University, where I was fortunate to be advised by Prof. Karthik Ramani on Human-Computer Interaction.

I support diversity, equity, and inclusion. If you would like to chat with me regarding research, career plans, or anything else, feel free to reach out! I would be happy to support people from underrepresented groups in the STEM research community, and I hope my expertise can help you.

Email  /  Resume  /  Google Scholar  /  Twitter  /  GitHub

News
  • [Oct 2023] Passed my PhD Qualifying Exam!
  • [Jun 2023] "The Design of a Virtual Prototyping System for Authoring Interactive VR Environments from Real World Scans" was accepted to JCISE.
  • [Feb 2023] "The ObjectFolder Benchmark: Multisensory Object-Centric Learning with Neural and Real Objects" was accepted to CVPR'23.
  • [Jan 2023] "Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear" was accepted to ICRA'23.
  • [Aug 2022] "See, Hear, Feel: Smart Sensory Fusion for Robotic Manipulation" was accepted to CoRL'22.
  • [Sep 2021] Started at Stanford as an MSc student in Mechanical Engineering.
Research

I'm broadly interested in artificial intelligence and robotics, including but not limited to perception, planning, control, hardware design, and human-centered AI. The goal of my research is to build agents that achieve human-level learning and adapt to novel, challenging scenarios by leveraging multisensory information such as vision, audio, and touch.

The Design of a Virtual Prototyping System for Authoring Interactive VR Environments from Real World Scans
Ananya Ipsita*, Runlin Duan*, Hao Li*, Subramanian Chidambaram, Yuanzhi Cao, Min Liu, Alexander J Quinn, Karthik Ramani (*Equal Contribution)
Journal of Computing and Information Science in Engineering (JCISE)
arXiv

Using our VRFromX system, we conducted a usability evaluation with 20 domain users (DUs), 12 of whom were novices in VR programming, on a welding use case.

The ObjectFolder Benchmark: Multisensory Object-Centric Learning with Neural and Real Objects
Ruohan Gao*, Yiming Dou*, Hao Li*, Tanmay Agarwal, Jeannette Bohg, Yunzhu Li, Li Fei-Fei, Jiajun Wu (*Equal Contribution)
Conference on Computer Vision and Pattern Recognition (CVPR), 2023
dataset demo / project page / arXiv

We introduce the OBJECTFOLDER BENCHMARK, a benchmark suite of 10 tasks for multisensory object-centric learning, and the OBJECTFOLDER REAL dataset, including the multisensory measurements for 100 real-world household objects.

Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear
Ruohan Gao*, Hao Li*, Gokul Dharan, Zhuzhu Wang, Chengshu Li, Fei Xia, Silvio Savarese, Li Fei-Fei, Jiajun Wu (*Equal Contribution, in alphabetical order)
International Conference on Robotics and Automation (ICRA), 2023
project page / arXiv

We introduce SONICVERSE, a multisensory simulation platform with integrated audio-visual simulation for training household agents that can both see and hear. We demonstrate SONICVERSE’s realism via sim-to-real transfer.

See, Hear, Feel: Smart Sensory Fusion for Robotic Manipulation
Hao Li*, Yizhi Zhang*, Junzhe Zhu, Shaoxiong Wang, Michelle A. Lee, Huazhe Xu, Edward Adelson, Li Fei-Fei, Ruohan Gao†, Jiajun Wu† (*Equal Contribution) (†Equal Advising)
Conference on Robot Learning (CoRL), 2022
project page / arXiv

We build a robot system that can see with a camera, hear with a contact microphone, and feel with a vision-based tactile sensor, with all three sensory modalities fused with a self-attention model.

VRFromX: From Scanned Reality to Interactive Virtual Experience with Human-in-the-Loop
Ananya Ipsita, Hao Li, Runlin Duan, Yuanzhi Cao, Subramanian Chidambaram, Min Liu, Karthik Ramani
Conference on Human Factors in Computing Systems (CHI), 2021
project page / arXiv / video

Using our VRFromX system, users can select regions of interest (ROIs) in a scanned point cloud or sketch in mid-air using a brush tool to retrieve virtual models, and then attach behavioral properties to them.

Academic Services

Reviewer for CoRL, RAL, CHI

Teaching

Course Assistant in AA274A: Principles of Robot Autonomy, Stanford University, 2022

Course Assistant in CS231N: Deep Learning for Computer Vision, Stanford University, 2023


Template from Jon Barron's website