Abstract: The rapid development of video generation technologies has recently garnered significant attention from both academia and industry. This interest is driven not only by improvements in visual quality but also by an enhanced understanding of real-world physics, powerful generative priors enabled by large-scale pretraining, and advanced controllable generation mechanisms. These advances have paved the way for applications across various fields, including AIGC, embodied AI, gaming, and autonomous driving. Despite the vast application potential of interactive video generation, current research still lacks a systematic and in-depth exploration. We begin by reviewing existing work in interactive video generation and propose five core modules that drive its technological development: generation, control, memory, dynamics, and intelligence. This framework outlines the ultimate form of interactive video generation technology, the progress made in each component, and future research directions. Furthermore, our recent work provides a detailed study of the generation, control, and memory components. Specifically, through our two works, GameFactory and Context-as-Memory, we have made significant advances in streaming video generation, generalizable control, and memory for maintaining static scene consistency.
Speaker bio: Xihui Liu is an Assistant Professor in the Department of Electrical and Electronic Engineering and the Institute of Data Science at The University of Hong Kong. Before joining HKU, she was a postdoctoral scholar at UC Berkeley. She obtained her Ph.D. degree from the Multimedia Lab (MMLab) at The Chinese University of Hong Kong and received her bachelor's degree from Tsinghua University. Her research interests cover computer vision, machine learning, and artificial intelligence, with special emphasis on visual synthesis, generative models, and multimodal AI. She was awarded the Adobe Research Fellowship 2020, MIT EECS Rising Stars 2021, and the WAIC Rising Stars Award 2022. She serves as an area chair for CVPR, NeurIPS, ICLR, AAAI, and ACM MM.