Sicheng Mo

I am a first-year PhD student in Computer Science at UCLA, advised by Prof. Bolei Zhou. I earned my Bachelor’s degree in Computer Science and Applied Mathematics from the University of Wisconsin-Madison, where I conducted research under the guidance of Prof. Yin Li and Prof. Fred Sala.

My research lies in Computer Vision. I am particularly interested in large generative models for visual content generation and understanding. I have also worked on scalable language-driven video understanding and 3D reconstruction.

In the past, I also worked in machine learning, specifically on causal inference for out-of-distribution generalization.

news

Jun 25, 2025 X-Fusion has been accepted to ICCV 2025 and received the Best Paper Award at the CVPR 2025 T4V Workshop.
Jun 09, 2025 I will join Adobe Research again as an intern this summer, working with Dr. Yuheng Li.
Oct 01, 2024 Two papers accepted to NeurIPS 2024! Check out Ctrl-X and SimGen.
Jun 26, 2024 I will join Adobe Research as an intern this summer.
Feb 26, 2024 Two papers accepted to CVPR 2024! Thanks to my collaborators and advisors!

selected publications

  1. Dreamland: Controllable World Creation with Simulator and Generative Models
    Sicheng Mo*, Ziyang Leng*, Leon Liu, Weizhen Wang, Honglin He, and Bolei Zhou
    In arXiv, 2025
  2. X-Fusion: Introducing New Modality to Frozen Large Language Models
    Sicheng Mo, Thao Nguyen, Xun Huang, Siddharth Srinivasan Iyer, Yijun Li, Yuchen Liu, Abhishek Tandon, Eli Shechtman, Krishna Kumar Singh, Yong Jae Lee, and 2 more authors
    In International Conference on Computer Vision (ICCV), 2025
    Best Paper at CVPR 2025 T4V Workshop
  3. Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance
    Kuan Heng Lin*, Sicheng Mo*, Ben Klingher, Fangzhou Mu, and Bolei Zhou
    In Advances in Neural Information Processing Systems (NeurIPS), 2024
  4. FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition
    Sicheng Mo*, Fangzhou Mu*, Kuan Heng Lin, Yanli Liu, Bochen Guan, Yin Li, and Bolei Zhou
    In Computer Vision and Pattern Recognition (CVPR), 2024
  5. SnAG: Scalable and Accurate Video Grounding
    Fangzhou Mu*, Sicheng Mo*, and Yin Li
    In Computer Vision and Pattern Recognition (CVPR), 2024
  6. Physics to the Rescue: Deep Non-line-of-sight Reconstruction for High-speed Imaging
    Fangzhou Mu, Sicheng Mo, Jiayong Peng, Xiaochun Liu, Ji Hyun Nam, Siddeshwar Raghavan, Andreas Velten, and Yin Li
    In ICCP/TPAMI, Jun 2022