TVG: A Training-free Transition Video Generation Method with Diffusion Models

1 Sobey Media Intelligence Laboratory, Chengdu, China
2 School of Cyber Science and Engineering, Sichuan University, Chengdu, China
3 Key Laboratory of Data Protection and Intelligent Management (Sichuan University), Ministry of Education, China
4 University of Electronic Science and Technology of China, Chengdu, China

Abstract

Transition videos play a crucial role in media production, enhancing the flow and coherence of visual narratives. Traditional methods such as morphing often lack artistic appeal and require specialized skills, which limits their effectiveness. Recent advances in diffusion model-based video generation offer new possibilities for creating transitions, but they face challenges such as poor modeling of inter-frame relationships and abrupt content changes. We propose a novel Transition Video Generation (TVG) approach that uses video-level diffusion models and addresses these limitations without additional training. Our method leverages Gaussian Process Regression (GPR) to model latent representations, ensuring smooth and dynamic transitions between frames. Additionally, we introduce interpolation-based conditional controls and a Frequency-aware Bidirectional Fusion (FBiF) architecture to enhance temporal control and transition reliability. Evaluations on benchmark datasets and custom image pairs demonstrate the effectiveness of our approach in generating high-quality, smooth transition videos.


Approach

Our Proposed Training-free TVG Method.
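
The idea of using Gaussian Process Regression to place intermediate latents between two endpoint frames can be illustrated with a small sketch. This is not the authors' implementation: the function names (`rbf`, `gpr_transition`), the RBF kernel, the use of a normalized frame time in [0, 1] as the regression input, the constant prior mean, and the flat-list latent format are all assumptions made for illustration. The sketch conditions a GP on the two endpoint latents and returns the posterior mean for each frame.

```python
import math


def rbf(a, b, length_scale=0.5):
    """Squared-exponential kernel between two scalar time stamps (assumed kernel)."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def gpr_transition(z_start, z_end, num_frames, length_scale=0.5, noise=1e-6):
    """Return GP posterior-mean latents for evenly spaced frame times in [0, 1].

    z_start, z_end: flat lists of floats standing in for the two endpoint latents.
    The GP is conditioned on exactly two observations, at t=0 and t=1.
    """
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    if len(z_start) != len(z_end):
        raise ValueError("endpoint latents must have the same length")

    # 2x2 kernel matrix over the training times {0, 1}, inverted in closed form.
    k00 = rbf(0.0, 0.0, length_scale) + noise
    k01 = rbf(0.0, 1.0, length_scale)
    k11 = rbf(1.0, 1.0, length_scale) + noise
    det = k00 * k11 - k01 * k01
    inv = [[k11 / det, -k01 / det], [-k01 / det, k00 / det]]

    # Assumed constant prior mean: the average of the two endpoint latents.
    mean = [(a + b) / 2.0 for a, b in zip(z_start, z_end)]

    frames = []
    for i in range(num_frames):
        t = i / (num_frames - 1)
        ks0, ks1 = rbf(t, 0.0, length_scale), rbf(t, 1.0, length_scale)
        # Weights w = k_*^T K^{-1}, applied to the mean-centered observations.
        w0 = ks0 * inv[0][0] + ks1 * inv[1][0]
        w1 = ks0 * inv[0][1] + ks1 * inv[1][1]
        frames.append(
            [m + w0 * (a - m) + w1 * (b - m) for m, a, b in zip(mean, z_start, z_end)]
        )
    return frames


if __name__ == "__main__":
    for frame in gpr_transition([0.0, 2.0], [1.0, -2.0], num_frames=5):
        print([round(v, 4) for v in frame])
```

In this sketch the first and last frames reproduce the endpoint latents up to the small noise term, and the `length_scale` parameter controls how quickly the path moves away from each endpoint. A real pipeline would operate on tensor-shaped latents from the diffusion model rather than flat lists.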


1. TC-Bench Dataset.

"Purple liquid in a transparent cup." "Red liquid in a transparent cup." "Transition Video"
"A black cup." "A cup with a collage of photos." "Transition Video"
"A smooth metal surface." "A corroded metal surface." "Transition Video"
"A daddy pig with a newspaper." "A daddy pig with a paper boat." "Transition Video"

2. MorphBench Dataset.

"A cake." "A burger." "Transition Video"
"A dog." "A wolf." "Transition Video"
"An American man." "An American man." "Transition Video"
"A van." "A jeep." "Transition Video"
"A portrait of Vincent van Gogh." "A portrait of a woman with a serene expression, dark hair, and a subtle smile." "Transition Video"

Comparison with Commercial Products

"TVG" "LUMA AI" "KLing AI" "Jimeng AI"





BibTeX

If you find our work useful, please cite our paper. The BibTeX entry is provided below:
@inproceedings{zhang2024tvg,
        title = {TVG: A Training-free Transition Video Generation Method with Diffusion Models},
        author = {Rui Zhang and Yaosen Chen and Yuegen Liu and Wei Wang and Xuming Wen and Hongxia Wang},
        year = {2024},
        booktitle = {arXiv}
}