Sicheng Mo

I am a first-year PhD student in Computer Science at UCLA, advised by Prof. Bolei Zhou. I earned my Bachelor’s degree in Computer Science and Applied Mathematics from the University of Wisconsin-Madison, where I conducted research under the guidance of Prof. Yin Li and Prof. Fred Sala.

My research lies in computer vision. I am particularly interested in large generative models for visual content generation and understanding. I have also worked on scalable language-driven video understanding and 3D reconstruction.

In the past, my work also extended to machine learning, specifically causal inference for out-of-distribution generalization.

news

Jun 25, 2025	X-Fusion has been accepted to ICCV 2025 and received the Best Paper Award at the CVPR 2025 T4V Workshop.
Jun 09, 2025	I will join Adobe Research again as an intern this summer, working with Dr. Yuheng Li.
Oct 01, 2024	Two papers accepted to NeurIPS 2024! Check out our Ctrl-X and SimGen.
Jun 26, 2024	I will join Adobe Research as an intern this summer.
Feb 26, 2024	Two papers accepted to CVPR 2024! Thanks to my collaborators and advisors!

selected publications

Dreamland: Controllable World Creation with Simulator and Generative Models

Sicheng Mo*, Ziyang Leng*, Leon Liu, Weizhen Wang, Honglin He, and Bolei Zhou

In arXiv, 2025

X-Fusion: Introducing New Modality to Frozen Large Language Models

Sicheng Mo, Thao Nguyen, Xun Huang, Siddharth Srinivasan Iyer, Yijun Li, Yuchen Liu, Abhishek Tandon, Eli Shechtman, Krishna Kumar Singh, Yong Jae Lee, and 2 more authors

In International Conference on Computer Vision (ICCV), 2025 (Best Paper at CVPR 2025 T4V Workshop)

Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance

Kuan Heng Lin*, Sicheng Mo*, Ben Klingher, Fangzhou Mu, and Bolei Zhou

In Advances in Neural Information Processing Systems (NeurIPS), 2024

FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition

Sicheng Mo*, Fangzhou Mu*, Kuan Heng Lin, Yanli Liu, Bochen Guan, Yin Li, and Bolei Zhou

In Computer Vision and Pattern Recognition (CVPR), 2024

SnAG: Scalable and Accurate Video Grounding

Fangzhou Mu*, Sicheng Mo*, and Yin Li

In Computer Vision and Pattern Recognition (CVPR), 2024

Physics to the Rescue: Deep Non-line-of-sight Reconstruction for High-speed Imaging

Fangzhou Mu, Sicheng Mo, Jiayong Peng, Xiaochun Liu, Ji Hyun Nam, Siddeshwar Raghavan, Andreas Velten, and Yin Li

In ICCP/TPAMI, Jun 2022
