Thanks for stopping by! 👋
I am currently a PhD candidate at the HCP-I2 Lab at Sun Yat-sen University (SYSU), advised by Prof. Xiaodan Liang. Previously, I obtained both my Bachelor’s and Master’s degrees at SYSU under the supervision of Prof. Shancheng Jiang.
I am interested in building generalizable multimodal reasoning systems grounded in a data-centric perspective. I am working to answer two questions:
1) Inward: How can VLMs enhance perception and reasoning for real-world comprehension?
2) Outward: How can they facilitate modeling and interaction for physical engagement?
Full publication list on Google Scholar. (* denotes equal contribution)
SeePhys: Does Seeing Help Thinking? – Benchmarking Vision-Based Physics Reasoning
Neural Information Processing Systems (NeurIPS), 2025
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2025
Toward Robust Diagnosis: A Contour Attention Preserving Adversarial Defense for COVID-19 Detection
Proceedings of the AAAI Conference on Artificial Intelligence, 2023
Applied Soft Computing, 2021
Outstanding Graduate of SYSU
National Scholarship
Postgraduate Scholarship of SYSU