publications

2025

arXiv

Group Diffusion: Enhancing Image Generation by Unlocking Cross-Sample Collaboration

Sicheng Mo, Thao Nguyen, Richard Zhang, Nicholas Kolkin, Siddharth Srinivasan Iyer, Eli Shechtman, Krishna Kumar Singh, Yong Jae Lee, Bolei Zhou, and Yuheng Li

In arXiv, 2025

Website PDF Code

arXiv

Relational Visual Similarity

Thao Nguyen, Sicheng Mo, Krishna Kumar Singh, Yilin Wang, Jing Shi, Nicholas Kolkin, Eli Shechtman, Yong Jae Lee, and Yuheng Li

In arXiv, 2025

Website PDF Code

arXiv

Dreamland: Controllable World Creation with Simulator and Generative Models

Sicheng Mo^*, Ziyang Leng^*, Leon Liu, Weizhen Wang, Honglin He, Zhang Huizhi, and Bolei Zhou

In arXiv, 2025

Website PDF

arXiv

X-Fusion: Introducing New Modality to Frozen Large Language Models

Sicheng Mo, Thao Nguyen, Xun Huang, Siddharth Srinivasan Iyer, Yijun Li, Yuchen Liu, Abhishek Tandon, Eli Shechtman, Krishna Kumar Singh, Yong Jae Lee, Bolei Zhou, and Yuheng Li

In International Conference on Computer Vision (ICCV), 2025

Best Paper at CVPR 2025 T4V Workshop

Website PDF

2024

NeurIPS

SimGen: Simulator-conditioned Driving Scene Generation

Yunsong Zhou, Michael Simon, Zhenghao Peng, Sicheng Mo, Hongzhi Zhu, Minyi Guo, and Bolei Zhou

In Advances in Neural Information Processing Systems (NeurIPS), 2024

Website PDF

NeurIPS

Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance

Kuan Heng Lin^*, Sicheng Mo^*, Ben Klingher, Fangzhou Mu, and Bolei Zhou

In Advances in Neural Information Processing Systems (NeurIPS), 2024

Website PDF

CVPR

FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition

Sicheng Mo^*, Fangzhou Mu^*, Kuan Heng Lin, Yanli Liu, Bochen Guan, Yin Li, and Bolei Zhou

In Computer Vision and Pattern Recognition (CVPR), 2024

Website PDF Code

CVPR

SnAG: Scalable and Accurate Video Grounding

Fangzhou Mu^*, Sicheng Mo^*, and Yin Li

In Computer Vision and Pattern Recognition (CVPR), 2024

Website PDF

2022

arXiv

A Simple Transformer-Based Model for Ego4D Natural Language Queries Challenge

Sicheng Mo, Fangzhou Mu, and Yin Li

In arXiv, Oct 2022

Website PDF

arXiv

Where a Strong Backbone Meets Strong Features – ActionFormer for Ego4D Moment Queries Challenge

Fangzhou Mu, Sicheng Mo, Gillian Wang, and Yin Li

In arXiv, Oct 2022

Website PDF

ICMLW

Causal Omnivore: Fusing Noisy Estimates of Spurious Correlations

Dyah Adila, Sonia Cromp, Sicheng Mo, and Frederic Sala

In ICML’22 Workshop on Spurious Correlations, Invariance, and Stability, Jun 2022

PDF

TPAMI

Physics to the Rescue: Deep Non-line-of-sight Reconstruction for High-speed Imaging

Fangzhou Mu, Sicheng Mo, Jiayong Peng, Xiaochun Liu, Ji Hyun Nam, Siddeshwar Raghavan, Andreas Velten, and Yin Li

In ICCP/TPAMI, Jun 2022

Website PDF