I am currently a first-year Ph.D. student in the Individualized Interdisciplinary Program at The Hong Kong University of Science and Technology (HKUST), advised by Prof. Anyi Rao and Prof. Huamin Qu. I received my B.Eng. and M.Eng. degrees from Beihang University (BUAA), where I was advised by Prof. Si Liu.

I was fortunate to have interned at several leading tech companies, including Tencent, Alibaba Group, Baidu Inc., and Meituan, where I gained valuable industry experience and insights into real-world applications of computer vision and generative models.

My current research interests include interactive videos, visual content generation and editing, visual creativity, and storytelling. I am passionate about creating immersive visual experiences for users through interactive videos and about exploring how advanced generative models can enhance human creativity and storytelling through visual media.

🔥 News

Feb. 2026: 🎉🎉 BiCo has been accepted to CVPR 2026!
Feb. 2026: 🎉🎉 SesaHand has been accepted to ICLR 2026!
Dec. 2025: Excited to introduce BiCo, which enables flexible concept composition from images and videos!

📝 Selected Papers

*: equal contribution, †: corresponding author

CVPR 2026

Composing Concepts from Images and Videos via Concept-prompt Binding

Xianghao Kong, Zeyu Zhang, Yuwei Guo, Zhuoran Zhao, Songchun Zhang, Anyi Rao

Project | Code

We introduce Bind & Compose (BiCo), a one-shot method that enables flexible visual concept composition by binding visual concepts with the corresponding prompt tokens and composing the target prompt with bound tokens from various sources.

arXiv Preprint

Taming Flow-based I2V Models for Creative Video Editing

Xianghao Kong, Hansheng Chen, Yuwei Guo, Lvmin Zhang, Gordon Wetzstein, Maneesh Agrawala, Anyi Rao

We propose IF-V2V, an Inversion-Free method that can adapt off-the-shelf flow-matching-based I2V models for video editing without significant computational overhead.

arXiv Preprint

ProFashion: Prototype-guided Fashion Video Generation with Multiple Reference Images

Xianghao Kong, Qiaosong Qi, Yuanbin Wang, Anyi Rao, Biaolong Chen, Aixi Zhang, Si Liu, Hao Jiang

ProFashion is a prototype-guided fashion video generation framework leveraging multiple reference images and human keypoint motion flow to achieve improved view consistency and temporal coherence.

ECCV 2024

Controllable Navigation Instruction Generation with Chain of Thought Prompting

Xianghao Kong*, Jinyu Chen*, Wenguan Wang†, Hang Su, Xiaolin Hu, Yi Yang, Si Liu†

Code

We propose C-Instructor, which uses chain-of-thought-style prompts for style-controllable and content-controllable instruction generation.

ACM MM 2023

DUSA: Decoupled Unsupervised Sim2Real Adaptation for Vehicle-to-Everything Collaborative Perception

Xianghao Kong, Wentao Jiang, Jinrang Jia, Yifeng Shi†, Runsheng Xu, Si Liu

Code

DUSA decouples the V2X collaborative sim2real domain adaptation problem into two sub-problems: sim2real adaptation and inter-agent adaptation.

CVPR 2022 (Oral)

3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection

Junyu Luo*, Jiahui Fu*, Xianghao Kong, Chen Gao†, Haibing Ren, Hao Shen, Huaxia Xia, Si Liu

Code

3D-SPS bridges the gap between detection and matching in the 3D visual grounding task and localizes the target in a single stage.

ICLR 2026

SesaHand: Enhancing 3D Hand Reconstruction via Controllable Generation with Semantic and Structural Alignment

Zhuoran Zhao, Xianghao Kong, Linlin Yang, Zheng Wei, Pan Hui, Anyi Rao

👨‍🎓 Education

Sept. 2025 - Present, Ph.D. Student in Individualized Interdisciplinary Program, The Hong Kong University of Science and Technology (HKUST)
- Current Research Directions: Interactive Videos, Visual Content Generation and Editing, Visual Creativity
- Supervisors: Prof. Anyi Rao, Prof. Huamin Qu
Sept. 2022 - Jan. 2025, M.Eng. in Computer Science and Technology, Beihang University (BUAA)
- Main Research Directions: Visual Content Generation, Vision & Language, Collaborative Perception
- Supervisor: Prof. Si Liu
- Standardized Tests: TOEFL 114 / 120, GRE 332 / 340
Sept. 2018 - June 2022, B.Eng. in Computer Science and Technology, Beihang University (BUAA)
- Comprehensive Performance Ranking: 1st / 193

💻 Internships

Feb. 2026 - Present, LightSpeed Studios, Tencent
June 2024 - Nov. 2024, Taobao & Tmall Group, Alibaba Group
Nov. 2022 - July 2023, Baidu Inc.
July 2022 - Nov. 2022, Alibaba Group
July 2021 - July 2022, Meituan

🎖 Achievements

Hong Kong PhD Fellowship (HKPFS)
National Scholarship of China (both undergraduate and postgraduate levels)
Samsung Scholarship and multiple university-level and school-level scholarships
Outstanding Graduate of Beijing
Merit Student of Beihang University, Outstanding Student of Beihang University

🕴️ Academic Services

Organizer of the 7th CVEU Workshop at SIGGRAPH 2025
Conference Reviewer for CVPR, ECCV, SIGGRAPH Asia, AAAI, ACM MM, and BMVC

Xianghao Kong （孔祥浩）