Research Assistant Professor
Shanghai Jiao Tong University, School of Artificial Intelligence
Research Assistant Professor and Master's Student Advisor, School of Artificial Intelligence, Shanghai Jiao Tong University
Member of the Machine Vision and Intelligence Group (MVIG) at SJTU
Email: siriusyang at sjtu dot edu dot cn
Office: Bldg. SAI, No. 1954 Huashan Rd., Xuhui Dist., Shanghai, 200230, China
About.  I am a Research Assistant Professor at Shanghai Jiao Tong University (SJTU),
affiliated with the School of Artificial Intelligence (SAI),
which I joined in September 2024.
I obtained my Ph.D. degree in Computer Science from SJTU in 2023, advised by Prof. Cewu
Lu at the Machine Vision and Intelligence Group, and
my M.S. degree in Mechanical Engineering, also from SJTU.
My research interests include 3D Vision and Robotics.
Currently, I focus on modeling and imitating how hands manipulate objects,
including 3D hand/object pose/shape estimation,
grasp/motion generation, imitation learning, and dexterous manipulation.
Join Us.  I am looking for Master's students at SJTU SAI and self-motivated research
interns. Contact me if you are interested in the above topics.
We are sincerely looking for research interns (paid positions). Let's do interesting research together.
A cross-embodiment framework that transfers wheeled-humanoid data to bipedal VLA models via morphology-agnostic 6D end-effector trajectories and a heuristic-enhanced online DAgger controller.
Multi-view Hand Reconstruction with a Point-Embedded Transformer
POEM-v2 is a generalizable multi-view 3D hand reconstruction model trained on large-scale multi-view datasets. It enables accurate, flexible, and occlusion-robust hand mesh recovery across arbitrary multi-view setups.
AirExo-2: Scaling up Generalizable Robotic Imitation Learning with Low-Cost Exoskeletons
A low-cost exoskeleton system for large-scale in-the-wild demonstration collection. It transforms the collected in-the-wild demonstrations into pseudo-robot demonstrations.
Dense Policy: Bidirectional Autoregressive Learning of Actions
A bidirectional autoregressive robot policy that infers trajectories by gradually expanding actions from sparse keyframes, and is demonstrated to outperform diffusion policies.
Motion Before Action: Diffusing Object Motion as Manipulation Condition
An MLLM-based method that infuses language instructions into grasp generation, and a new language-pose dataset, CapGrasp, featuring detailed captions of grasping poses.
OakInk2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion
Extends CPF into a unified spring-mass-style contact field that couples attraction, repulsion, and geometry constraints for robust hand-object reconstruction.
CPF: Learning a Contact Potential Field to Model the Hand-Object Interaction
Human-Robot Data Companion: Pipeline and Representation.
[2025.09] SII TechFest workshop: Embodied AI Reasoning and Scaling. Thanks to Panpan Cai for hosting.
Paving the Way for Understanding Human Interactions with Objects: The OakInk2 Dataset.
[2023.08] ICCV 2023 HANDS Workshop. Thanks to Linlin Yang for hosting.