Deep falsification of multimedia content, especially videos and photos, threatens social cohesion through rumour propagation, extortion, and distortion of the truth, and it cannot be ignored. Countering it requires effective detection methods. Most studies suggest that convolutional neural networks (CNNs) alone may not extract the complex features introduced during deepfake production. Hybrid approaches that capture such features and act as powerful descriptors for binary classification are therefore needed to separate fake content from genuine content. In this paper, a hybrid algorithm is developed that combines gated recurrent units (GRUs) with a CNN. The proposed model aims to improve the extraction of complex features by capturing temporal and spatial features simultaneously. This permits the extraction of implicit features that are vital to the final classification, especially for the frame sequences within video content. Finally, a dense neural network classifies these features. Two datasets were used to train the proposed model: FaceForensics++ (FF++) and the DeepFake Detection Challenge (DFDC) dataset. On FF++, the proposed model reached an Area Under the Curve (AUC) of 0.88 and an F1-score of 0.85; on DFDC, it reached 0.95 and 0.86 for the same metrics, respectively.
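The data flow described in the abstract (per-frame spatial features from a CNN, a GRU over the frame sequence, then a dense classifier) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the abstract gives no layer sizes, so the single convolution layer, the 3×3 kernels, the hidden size of 6, the grayscale 8×8 frames, and all function names are assumptions chosen for illustration, and the weights are random rather than trained. It uses only the Python standard library so that each step stays visible.

```python
import math
import random

def conv2d_valid(frame, kernel):
    """Valid 2-D cross-correlation of one grayscale frame with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(frame) - kh + 1, len(frame[0]) - kw + 1
    return [[sum(frame[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def spatial_features(frame, kernels):
    """CNN stand-in (assumed): conv -> ReLU -> global average pool, one value per kernel."""
    feats = []
    for kernel in kernels:
        fmap = conv2d_valid(frame, kernel)
        vals = [max(0.0, v) for row in fmap for v in row]
        feats.append(sum(vals) / len(vals))
    return feats

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gru_step(x, h, p):
    """One GRU update: update gate z, reset gate r, candidate state n."""
    z = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]
    r = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b + c) for a, b, c in
         zip(matvec(p["Wn"], x), matvec(p["Un"], rh), p["bn"])]
    # Blend the previous state with the candidate state.
    return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

def init_gru(n_in, n_hid, rng):
    """Random, untrained GRU parameters (illustration only)."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    p = {}
    for gate in "zrn":
        p["W" + gate] = mat(n_hid, n_in)
        p["U" + gate] = mat(n_hid, n_hid)
        p["b" + gate] = [0.0] * n_hid
    return p

def predict_fake_probability(video, kernels, gru, w_out, b_out):
    """Spatial features per frame -> GRU across frames -> dense sigmoid output."""
    h = [0.0] * len(w_out)
    for frame in video:
        h = gru_step(spatial_features(frame, kernels), h, gru)
    return sigmoid(sum(w * v for w, v in zip(w_out, h)) + b_out)

# Toy input: 5 random 8x8 "frames"; real inputs would be face crops from FF++ or DFDC.
rng = random.Random(0)
video = [[[rng.random() for _ in range(8)] for _ in range(8)] for _ in range(5)]
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)] for _ in range(4)]
gru = init_gru(n_in=4, n_hid=6, rng=rng)
w_out = [rng.uniform(-0.5, 0.5) for _ in range(6)]
score = predict_fake_probability(video, kernels, gru, w_out, 0.0)
print(f"P(fake) = {score:.3f}")
```

Because the weights are untrained, the printed score carries no meaning; the sketch only shows how the final hidden state of the GRU summarizes the whole frame sequence before the dense layer makes the binary decision. A practical version would use a deep learning framework, a deeper CNN, and training on labelled videos.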