Synchronization of Multiple Videos

Avihai Naaman*, Ron Shapira Weber*, Oren Freifeld
Ben-Gurion University of the Negev, Israel
ICCV 2025

*Indicates Equal Contribution
Teaser video: joint alignment of a set of seagull videos.

Temporal Prototype Learning (TPL) uses an off-the-shelf feature extractor, denoted by φ, to generate initial multichannel action-progression sequences for videos of the same action (e.g., Ball pitch). Colors indicate different, temporally-misaligned videos of the same action. TPL then produces the joint alignment together with a prototypical sequence that anchors key events (e.g., Ball Release); a minimal sketch of the first stage follows.
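Conceptually, this first stage runs every frame of a video through a frozen backbone φ and reduces the per-frame embeddings to a few comparable channels. The sketch below illustrates the idea under explicit assumptions: ResNet-18 as φ and a fixed, shared random projection are hypothetical stand-ins, not the paper's actual choices.

# A minimal sketch of the first TPL stage, assuming PyTorch/torchvision.
# ResNet-18 as the extractor phi and the fixed random projection are
# illustrative stand-ins for whatever backbone and reduction TPL uses.
import torch
import torchvision.models as models

def make_phi():
    # Pretrained backbone with its classification head removed.
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()
    return backbone.eval()

@torch.no_grad()
def action_progression(frames, phi, out_channels=8):
    # frames: (T, 3, 224, 224) preprocessed video frames.
    feats = phi(frames)                        # (T, 512) per-frame embeddings
    g = torch.Generator()
    g.manual_seed(0)                           # same projection for every video
    proj = torch.randn(feats.shape[1], out_channels, generator=g)
    seq = feats @ proj                         # (T, out_channels) sequence
    return (seq - seq.mean(0)) / (seq.std(0) + 1e-8)

# Usage: a random 40-frame clip stands in for a real, normalized video.
phi = make_phi()
seq = action_progression(torch.rand(40, 3, 224, 224), phi)
print(seq.shape)                               # torch.Size([40, 8])

Because the projection is fixed and shared across videos, the resulting channels are directly comparable, which is what makes the joint alignment in the next stage meaningful.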

Abstract

Synchronizing videos captured simultaneously from multiple cameras in the same scene is often easy and typically requires only simple time shifts. However, synchronizing videos from different scenes or, more recently, generative AI videos, poses a far more complex challenge due to diverse subjects, backgrounds, and nonlinear temporal misalignment. We propose Temporal Prototype Learning (TPL), a prototype-based framework that constructs a shared, compact 1D representation from high-dimensional embeddings extracted by any of several off-the-shelf pretrained models. TPL robustly aligns videos by learning a unified prototype sequence that anchors key action phases, thereby avoiding exhaustive pairwise matching. Our experiments show that TPL improves synchronization accuracy, efficiency, and robustness across diverse datasets, including on fine-grained frame retrieval and phase classification tasks. Importantly, TPL is the first approach to mitigate synchronization issues in multiple generative AI videos depicting the same action.
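To make the prototype idea concrete, the following is a minimal sketch in which classic DTW barycenter averaging (DBA) stands in for TPL's learned prototype and warps; the DTW-based alignment and the frame-wise averaging here are illustrative assumptions, not the paper's algorithm. It shows the central point: each of the N input sequences is aligned only to a single prototype rather than to all the others.

# A minimal sketch of the prototype idea, assuming multichannel
# sequences of shape (T_i, C) as inputs. Classic DTW barycenter
# averaging (DBA) stands in here for TPL's learned prototype and
# warps; it is an illustrative substitute, not the paper's method.
import numpy as np

def dtw_path(a, b):
    # Standard DTW alignment path between sequences (Ta, C) and (Tb, C).
    Ta, Tb = len(a), len(b)
    D = np.full((Ta + 1, Tb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    i, j, path = Ta, Tb, []
    while i > 0 and j > 0:  # backtrack from the end toward (0, 0)
        path.append((i - 1, j - 1))
        step = int(np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]]))
        i, j = (i - 1, j - 1) if step == 0 else (i - 1, j) if step == 1 else (i, j - 1)
    return path[::-1]

def learn_prototype(seqs, iters=5):
    proto = seqs[0].copy()                     # init from an arbitrary video
    for _ in range(iters):
        sums = np.zeros_like(proto)
        counts = np.zeros(len(proto))
        for s in seqs:                         # align each video to the prototype only
            for i, j in dtw_path(s, proto):
                sums[j] += s[i]
                counts[j] += 1
        proto = sums / counts[:, None]         # re-estimate the prototype frame-wise
    return proto

# Usage: three nonlinearly misaligned, noisy copies of one 1-channel "action".
t = np.linspace(0, 1, 60)
seqs = [np.sin(2 * np.pi * t ** p)[:, None] + 0.05 * np.random.randn(60, 1)
        for p in (0.7, 1.0, 1.4)]
proto = learn_prototype(seqs)
print(proto.shape)                             # (60, 1)

Per iteration this costs N alignments against the prototype, whereas exhaustive pairwise matching of all videos would require N(N-1)/2 alignments.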

Video Synchronization by TPL - GEN-MVS Dataset

Top - Original Videos. Bottom - Videos After Synchronization.

Video Synchronization by TPL - Penn Action Dataset

Top - Original Videos. Bottom - Videos After Synchronization.