Attention-Driven CNN-LSTM Fusion for Robust Deepfake Detection in Digital Media

S. E. Hammed; S. Al-Darraji

doi:10.26437/z15r9q80

Authors

S. E. Hammed, University of Basra, Basra, Iraq.
S. Al-Darraji, University of Basra, Basra, Iraq.

DOI:

https://doi.org/10.26437/z15r9q80

Keywords:

Attention mechanism, CNN, deepfake, deep learning, LSTM

Abstract

Purpose: This paper aims to address the growing challenge of deepfake detection, driven by the increasing impact of synthetic media on digital integrity, privacy, and security.

Design/Methodology/Approach: The proposed approach uses a hybrid deep learning architecture that combines Convolutional Neural Networks (CNNs) for spatial feature extraction with a Long Short-Term Memory (LSTM) network for modelling temporal relationships across frames. An attention mechanism directs the model towards the most informative features and subtle manipulation patterns. Video preprocessing comprises frame extraction, face detection, alignment, and normalisation, followed by sequence-level classification.
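The attention step described above can be illustrated with a minimal, dependency-free sketch. The abstract does not specify the form of attention used, so this example assumes scaled dot-product scoring against a single query vector; the names `attention_pool`, `frames`, and `query` are hypothetical and are not taken from the paper. It shows only how per-frame feature vectors (for example, LSTM outputs) might be collapsed into one sequence-level vector, not the authors' full model.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden_states: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Collapse per-frame hidden states into one sequence-level vector.

    hidden_states: one vector per frame (e.g. LSTM outputs), all the same length.
    query: hypothetical learned vector used to score each frame (an assumption).
    Returns (context_vector, attention_weights).
    """
    dim = len(query)
    scale = math.sqrt(dim)
    # Score each frame by its scaled dot product with the query.
    scores = [
        sum(h_d * q_d for h_d, q_d in zip(h, query)) / scale
        for h in hidden_states
    ]
    weights = softmax(scores)
    # Weighted sum of the frame vectors, one output value per dimension.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return context, weights


# Toy input: three frames with 2-dimensional hidden states.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
context, weights = attention_pool(frames, query)
```

In this toy input the third frame aligns best with the query, so it receives the largest weight; in a trained model the query (or a small scoring network) would be learned jointly with the CNN and LSTM, and the context vector would feed the final classifier.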

Research Limitation: The study is limited by its reliance on benchmark datasets, which may not fully represent real-world scenarios, and by potential challenges in generalising to unseen manipulations. Additionally, no funding support was reported.

Findings: The model is evaluated on the FaceForensics++ dataset using standard metrics, including accuracy, precision, recall, F1-score, and ROC-AUC, demonstrating improved performance in detecting deepfake videos.
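For readers unfamiliar with the metrics listed above, the following sketch computes them for a binary real/fake classifier using only the Python standard library. The labels and scores are made-up illustrative values, not results from the paper, and the function name `binary_metrics` and the 0.5 decision threshold are assumptions. ROC-AUC is computed here by pairwise comparison of positive and negative scores, which is equivalent to the area under the ROC curve for small samples.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Accuracy, precision, recall, F1, and ROC-AUC for labels in {0, 1}.

    y_score holds the predicted probability of the positive (fake) class.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )

    # ROC-AUC: probability that a random positive outscores a random
    # negative, counting ties as one half.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum(
            1.0 if p > n else 0.5 if p == n else 0.0
            for p in pos
            for n in neg
        )
        auc = wins / (len(pos) * len(neg))
    else:
        auc = float("nan")  # undefined when only one class is present

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "roc_auc": auc,
    }


# Illustrative values only (1 = fake, 0 = real).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.6, 0.8, 0.1]
metrics = binary_metrics(labels, scores)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas ROC-AUC summarises ranking quality across all thresholds, which is why the two kinds of metric are usually reported together.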

Practical Implication: The proposed model can be applied in security systems, social media platforms, and digital forensics to detect and prevent the spread of manipulated video content.

Social Implication: The work contributes to reducing misinformation, enhancing trust in digital media, and protecting user privacy and societal security.

Originality/Value: The originality of this work lies in integrating CNN, LSTM, and attention mechanisms into a unified framework for spatiotemporal feature learning, providing a scalable and effective solution for deepfake detection.

Author Biographies

S. E. Hammed, University of Basra, Basra, Iraq.

Shahad E. Hammed is a postgraduate student in the Department of Computer Science, College of Computer Science and Information Technology at the University of Basra, Basra, Iraq.

S. Al-Darraji, University of Basra, Basra, Iraq.

Dr. Salah Al-Darraji is an Assistant Professor in the Department of Computer Science, College of Computer Science and Information Technology at the University of Basra, Basra, Iraq.

