Open-Source Lip-Sync Models in the Period 2020-2025: A Structured Comparative Analysis

Authors

  • Bilge Nur Saglam Aktif Investment Bank, Inc.
  • Mustafa Keles Aktif Investment Bank, Inc.
  • Mehmet Kutanoglu Aktif Investment Bank, Inc.

DOI:

https://doi.org/10.55549/epess.971

Keywords:

Lip synchronization, Open-source, Artificial intelligence, Deep learning, GAN, Diffusion models, Wav2Lip, GeneFace

Abstract

Recent advances in artificial intelligence have driven significant progress in lip synchronization (lip-sync). This paper presents a systematic literature review of popular open-source lip-sync models developed between 2020 and 2025, a period marked by the rapid evolution of deep generative models. Our aim is to examine and classify the prominent models of this period by architecture, performance, and technological approach. The literature was gathered by searching the IEEE Xplore and Scopus databases. The review is organized around the three approaches most commonly used in the field: Generative Adversarial Networks (GANs), Transformers, and Diffusion Models. Each approach is analyzed in detail through a popular representative: Wav2Lip (GAN), GeneFace (Transformer/NeRF), and Diff2Lip (Diffusion). We compare the training processes, architectural features, and performance of these models, with performance measured in terms of video quality, synchronization accuracy, and computational cost. Our findings indicate that diffusion models have recently gained prominence because they offer photorealistic outputs and stable training, although GAN-based models such as Wav2Lip remain widely used. The review summarizes the current state of the art as a guide for researchers and discusses the open challenges facing lip-sync technologies together with future research directions (e.g., real-time performance and multilingual support).

Published

2025-11-30

Issue

Section

Articles

How to Cite

Saglam, B. N., Keles, M., & Kutanoglu, M. (2025). Open-Source Lip-Sync Models in the Period 2020-2025: A Structured Comparative Analysis. The Eurasia Proceedings of Educational and Social Sciences, 46, 137-143. https://doi.org/10.55549/epess.971