Comparison of CNN, ResNet50, and Xception for Deepfake Image Detection
Keywords:
Deepfake, Convolutional Neural Network (CNN), ResNet50, Xception, Transfer Learning

Abstract
This study compares the performance of three deep learning architectures, a Convolutional Neural Network (CNN), ResNet50, and Xception, for frame-based deepfake image detection and identifies the most effective model in terms of accuracy, precision, recall, F1-score, and generalization. The study followed the Knowledge Discovery in Databases (KDD) framework using the Deepfake Detection Dataset (DFD Entire Original) from Kaggle, which consists of 3,432 videos: 3,068 fake and 364 real. Videos were converted into frames with OpenCV, followed by face detection and cropping with MTCNN. The resulting face images were resized to 224×224 pixels, normalized, augmented, and labeled. To reduce classification bias caused by class imbalance, the training data were balanced with random undersampling, yielding equal numbers of real and fake frames. The dataset was then split into training, validation, and testing sets using a stratified 60:20:20 ratio. The results show that Xception achieved the best performance of the three models, with an accuracy of 95.21%, a precision of 0.95, a recall of 0.95, and an F1-score of 0.95, followed by ResNet50 with an accuracy of 93.42% and the CNN with an accuracy of 87.65%. These findings indicate that transfer learning-based architectures, particularly Xception, are more effective than conventional CNNs for deepfake image detection under a consistent experimental setting. Because this study is limited to a single dataset and frame-based evaluation, future work will explore hybrid models, such as a Vision Transformer (ViT) combined with Capsule Networks, to improve detection performance and to address challenges such as temporal analysis and cross-dataset validation.
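For readers who wish to reproduce the preprocessing step, the following is a minimal sketch of the frame extraction and face cropping described in the abstract, assuming the opencv-python and mtcnn packages; the function name extract_face_frames and the sampling interval every_n are illustrative choices, not details taken from the paper.

# Hypothetical sketch: sample frames with OpenCV, crop faces with MTCNN,
# resize to 224x224, and scale pixel values to [0, 1].
import cv2
from mtcnn import MTCNN

detector = MTCNN()

def extract_face_frames(video_path, every_n=30, size=(224, 224)):
    """Sample every every_n-th frame, detect a face, crop, resize, normalize."""
    cap = cv2.VideoCapture(video_path)
    faces, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # MTCNN expects RGB
            detections = detector.detect_faces(rgb)
            if detections:
                x, y, w, h = detections[0]["box"]  # take the first detected face
                x, y = max(x, 0), max(y, 0)
                crop = cv2.resize(rgb[y:y + h, x:x + w], size)
                faces.append(crop / 255.0)  # normalize to [0, 1]
        idx += 1
    cap.release()
    return faces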
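The class balancing and stratified 60:20:20 split could be implemented as below; this is a sketch under the assumption that scikit-learn is used, with X as the array of face crops and y as 0/1 labels, and the helper names undersample and split_60_20_20 are hypothetical.

# Hedged sketch: random undersampling of the majority class, then a
# stratified 60:20:20 train/validation/test split.
import numpy as np
from sklearn.model_selection import train_test_split

def undersample(X, y, seed=42):
    """Randomly drop majority-class samples until both classes are equal."""
    rng = np.random.default_rng(seed)
    idx0, idx1 = np.where(y == 0)[0], np.where(y == 1)[0]
    n = min(len(idx0), len(idx1))
    keep = np.concatenate([rng.choice(idx0, n, replace=False),
                           rng.choice(idx1, n, replace=False)])
    rng.shuffle(keep)
    return X[keep], y[keep]

def split_60_20_20(X, y, seed=42):
    """Stratified split: 60% train, then the remaining 40% halved."""
    X_train, X_tmp, y_train, y_tmp = train_test_split(
        X, y, test_size=0.4, stratify=y, random_state=seed)
    X_val, X_test, y_val, y_test = train_test_split(
        X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)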
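Finally, a minimal sketch of the Xception transfer-learning setup the abstract describes, assuming TensorFlow/Keras with ImageNet weights; the classification head, dropout rate, and optimizer settings here are illustrative, not the authors' exact configuration.

# Hedged sketch: frozen ImageNet-pretrained Xception backbone with a small
# binary classification head for real-vs-fake face crops.
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.Xception(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze pretrained features; optionally fine-tune later

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),                     # illustrative regularization
    layers.Dense(1, activation="sigmoid"),   # real vs. fake
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy",
                       tf.keras.metrics.Precision(),
                       tf.keras.metrics.Recall()])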
License
Copyright (c) 2026 Rachmat, Mohammad Zainuddin, Handini Arga Damar Rani

This work is licensed under a Creative Commons Attribution 4.0 International License.