I am currently a first-year Ph.D. student in the Individualized Interdisciplinary Program at The Hong Kong University of Science and Technology (HKUST), advised by Prof. Anyi Rao and Prof. Huamin Qu. I received my B.Eng. and M.Eng. degrees from Beihang University (BUAA), where I was advised by Prof. Si Liu.

I have been fortunate to intern at several leading tech companies, including Tencent, Alibaba Group, Baidu Inc., and Meituan, where I gained industry experience and insight into real-world applications of computer vision and generative models.

My current research interests include interactive videos, visual content generation and editing, visual creativity, and storytelling. I am passionate about creating immersive visual experiences through interactive videos and exploring how advanced generative models can enhance human creativity and storytelling in visual media.

🔥 News

  • Feb. 2026: 🎉🎉 BiCo has been accepted to CVPR 2026!
  • Feb. 2026: 🎉🎉 SesaHand has been accepted to ICLR 2026!
  • Dec. 2025: Excited to introduce BiCo, which enables flexible concept composition from images and videos!

📝 Selected Papers

*: equal contribution, †: corresponding author

CVPR 2026

Composing Concepts from Images and Videos via Concept-prompt Binding

Xianghao Kong, Zeyu Zhang, Yuwei Guo, Zhuoran Zhao, Songchun Zhang, Anyi Rao

Project | Code

  • We introduce Bind & Compose (BiCo), a one-shot method that enables flexible visual concept composition by binding visual concepts with the corresponding prompt tokens and composing the target prompt with bound tokens from various sources.

arXiv Preprint

Taming Flow-based I2V Models for Creative Video Editing

Xianghao Kong, Hansheng Chen, Yuwei Guo, Lvmin Zhang, Gordon Wetzstein, Maneesh Agrawala, Anyi Rao

  • We propose IF-V2V, an inversion-free method that adapts off-the-shelf flow-matching-based I2V models for video editing without significant computational overhead.

arXiv Preprint

ProFashion: Prototype-guided Fashion Video Generation with Multiple Reference Images

Xianghao Kong, Qiaosong Qi, Yuanbin Wang, Anyi Rao, Biaolong Chen, Aixi Zhang, Si Liu, Hao Jiang

  • ProFashion is a prototype-guided fashion video generation framework that leverages multiple reference images and human keypoint motion flow to improve view consistency and temporal coherence.

ECCV 2024

Controllable Navigation Instruction Generation with Chain of Thought Prompting

Xianghao Kong*, Jinyu Chen*, Wenguan Wang, Hang Su, Xiaolin Hu, Yi Yang, Si Liu

Code

  • We propose C-Instructor, which uses chain-of-thought-style prompts for style-controllable and content-controllable instruction generation.

ACM MM 2023

DUSA: Decoupled Unsupervised Sim2Real Adaptation for Vehicle-to-Everything Collaborative Perception

Xianghao Kong, Wentao Jiang, Jinrang Jia, Yifeng Shi, Runsheng Xu, Si Liu

Code

  • DUSA decouples the V2X collaborative sim2real domain adaptation problem into two sub-problems: sim2real adaptation and inter-agent adaptation.

CVPR 2022 (Oral)

3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection

Junyu Luo*, Jiahui Fu*, Xianghao Kong, Chen Gao, Haibing Ren, Hao Shen, Huaxia Xia, Si Liu

Code

  • 3D-SPS bridges the gap between detection and matching in the 3D visual grounding task and localizes the target in a single stage.

👨‍🎓 Education

  • Sept. 2025 - Present, Ph.D. Student in Individualized Interdisciplinary Program, The Hong Kong University of Science and Technology (HKUST)
    • Current Research Directions: Interactive Videos, Visual Content Generation and Editing, Visual Creativity
    • Supervisors: Prof. Anyi Rao, Prof. Huamin Qu
  • Sept. 2022 - Jan. 2025, M.Eng. in Computer Science and Technology, Beihang University (BUAA)
    • Main Research Directions: Visual Content Generation, Vision & Language, Collaborative Perception
    • Supervisor: Prof. Si Liu
    • Standardized Tests: TOEFL 114 / 120, GRE 332 / 340
  • Sept. 2018 - June 2022, B.Eng. in Computer Science and Technology, Beihang University (BUAA)
    • Comprehensive Performance Ranking: 1st / 193

💻 Internships

  • Feb. 2026 - Present, LightSpeed Studios, Tencent
  • June 2024 - Nov. 2024, Taobao & Tmall Group, Alibaba Group
  • Nov. 2022 - July 2023, Baidu Inc.
  • July 2022 - Nov. 2022, Alibaba Group
  • July 2021 - July 2022, Meituan

🎖 Achievements

  • Hong Kong PhD Fellowship (HKPFS)
  • National Scholarship of China (both undergraduate and postgraduate levels)
  • Samsung Scholarship, and many university-level and school-level scholarships
  • Outstanding Graduate of Beijing
  • Merit Student of Beihang University, Outstanding Student of Beihang University

🕴️ Academic Services

  • Organizer of the 7th CVEU Workshop at SIGGRAPH 2025
  • Conference Reviewer for CVPR, ECCV, SIGGRAPH Asia, AAAI, ACM MM, and BMVC