Attention-Driven CNN-LSTM Fusion for Robust Deepfake Detection in Digital Media
DOI:
https://doi.org/10.26437/z15r9q80
Keywords:
Attention mechanism, CNN, deepfake, deep learning, LSTM
Abstract
Purpose: This paper aims to address the growing challenge of deepfake detection, driven by the increasing impact of synthetic media on digital integrity, privacy, and security.
Design/Methodology/Approach: The proposed approach is a hybrid deep learning architecture that combines Convolutional Neural Networks (CNNs), which extract spatial features from individual frames, with a Long Short-Term Memory (LSTM) network, which models temporal relationships across frames. An attention mechanism weights the most informative features so that subtle manipulation patterns are emphasised. Video preprocessing consists of frame extraction, face detection, alignment, and normalisation, followed by sequence-level classification.
Research Limitation: The study is limited by its reliance on benchmark datasets, which may not fully represent real-world scenarios, and by potential challenges in generalising to unseen manipulations. Additionally, no funding support was reported.
Findings: The model is evaluated on the FaceForensics++ dataset using standard metrics, including accuracy, precision, recall, F1-score, and ROC-AUC, demonstrating improved performance in detecting deepfake videos.
Practical Implication: The proposed model can be applied in security systems, social media platforms, and digital forensics to detect and prevent the spread of manipulated video content.
Social Implication: The work contributes to reducing misinformation, enhancing trust in digital media, and protecting user privacy and societal security.
Originality/Value: The originality of this work lies in integrating CNN, LSTM, and attention mechanisms into a unified framework for spatiotemporal feature learning, providing a scalable and effective solution for deepfake detection.
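As an illustration only, the attention step described in the abstract can be sketched as a weighted pooling over per-frame representations. The sketch below is a minimal, assumption-laden example and not the paper's implementation: the toy vectors stand in for LSTM outputs computed from CNN frame features, and the names `attention_pool`, `fake_probability`, `query`, and `classifier_weights` are hypothetical. It uses dot-product scoring with a softmax, one common form of attention; the abstract does not state which form the authors use.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frame_states, query):
    """Collapse T per-frame vectors into a single clip-level vector.

    frame_states: list of T equal-length vectors (hypothetical stand-ins
                  for LSTM outputs over CNN frame features).
    query:        hypothetical learned vector used to score each frame.
    Returns (weights, pooled), where weights sum to 1.
    """
    # Score each frame by its dot product with the query vector.
    scores = [sum(h * q for h, q in zip(state, query)) for state in frame_states]
    weights = softmax(scores)
    dim = len(frame_states[0])
    # Weighted sum of frame vectors, one dimension at a time.
    pooled = [
        sum(w * state[d] for w, state in zip(weights, frame_states))
        for d in range(dim)
    ]
    return weights, pooled


def fake_probability(pooled, classifier_weights, bias=0.0):
    """Logistic output for the sequence-level real/fake decision."""
    logit = sum(p * w for p, w in zip(pooled, classifier_weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy clip of three frames; the third frame has the largest activations.
frame_states = [[0.1, 0.0], [0.2, 0.1], [2.0, 1.5]]
query = [1.0, 1.0]                 # assumed, not taken from the paper
classifier_weights = [0.5, 0.5]    # assumed, not taken from the paper

weights, pooled = attention_pool(frame_states, query)
prob = fake_probability(pooled, classifier_weights)
```

In this toy clip the third frame receives the largest attention weight because it has the highest score against the query, so it dominates the pooled vector that feeds the classifier. In a trained model the query and classifier weights would be learned jointly with the CNN and LSTM parameters.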
License
Copyright (c) 2026 AFRICAN JOURNAL OF APPLIED RESEARCH

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
By submitting and publishing your articles in the African Journal of Applied Research, you agree to transfer the copyright of the Article from the authors to the Journal (African Journal of Applied Research).