Abstract
In recent years, face recognition has become prevalent in applications such as mobile devices, access control, and financial transactions, which makes it crucial to address the vulnerabilities that attackers might exploit. This study introduces a method for face presentation attack detection (PAD) that uses convolutional neural networks (CNNs) and exploits the diversity of modalities provided by some cameras and sensors to detect face spoofing. To assess the effectiveness of the approach in real-world scenarios, the wide multi-channel presentation attack (WMCA) dataset is used. The method draws on the multi-modal data, comprising RGB, depth, infrared (IR), and thermal channels, to improve system performance, and it examines two fusion scenarios: pre-fusion and post-fusion. In the pre-fusion scenario, data from the four channels are combined, yielding an average classification error rate (ACER) of 0.19%. In the post-fusion scenario, the results of the individual modalities are fused using majority voting, weighted voting, average pooling, or a stacking classifier. The stacking classifier gives the best result, with an ACER of 0.03%, which is notably better than state-of-the-art methods.
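As an illustration of the post-fusion idea and of the reported metric, the following is a minimal sketch and not the implementation used in this study. It assumes each modality outputs an attack score in [0, 1], uses a hypothetical decision threshold of 0.5, and uses made-up scores and channel weights. ACER is computed as the mean of the attack presentation classification error rate (APCER) and the bona fide presentation classification error rate (BPCER), with all attack types pooled for simplicity. The stacking classifier is omitted because it requires training a meta-classifier on the per-channel outputs.

```python
# Illustrative sketch only: labels use 1 = attack, 0 = bona fide.
# All scores, weights, and the 0.5 threshold below are hypothetical.

def acer(y_true, y_pred):
    """Mean of APCER and BPCER, with all attack types pooled (simplification)."""
    attacks = [p for t, p in zip(y_true, y_pred) if t == 1]
    bona_fide = [p for t, p in zip(y_true, y_pred) if t == 0]
    apcer = sum(1 for p in attacks if p == 0) / len(attacks)      # attacks accepted
    bpcer = sum(1 for p in bona_fide if p == 1) / len(bona_fide)  # bona fide rejected
    return (apcer + bpcer) / 2

def majority_vote(scores, threshold=0.5):
    """Each channel casts one vote; a tie is resolved as bona fide."""
    votes = sum(1 for s in scores if s >= threshold)
    return 1 if 2 * votes > len(scores) else 0

def weighted_vote(scores, weights, threshold=0.5):
    """Each channel's vote counts in proportion to its weight."""
    attack_weight = sum(w for s, w in zip(scores, weights) if s >= threshold)
    return 1 if 2 * attack_weight > sum(weights) else 0

def average_fusion(scores, threshold=0.5):
    """Average the per-channel scores, then apply the threshold."""
    return 1 if sum(scores) / len(scores) >= threshold else 0

# Hypothetical per-sample scores in the order [RGB, depth, IR, thermal].
samples = [
    [0.9, 0.8, 0.4, 0.7],
    [0.2, 0.1, 0.6, 0.3],
    [0.6, 0.3, 0.4, 0.9],
    [0.1, 0.2, 0.3, 0.1],
]
labels = [1, 0, 1, 0]
weights = [1, 1, 1, 2]  # hypothetical: thermal trusted twice as much

print(acer(labels, [majority_vote(s) for s in samples]))           # 0.25
print(acer(labels, [weighted_vote(s, weights) for s in samples]))  # 0.0
print(acer(labels, [average_fusion(s) for s in samples]))          # 0.0
```

In this toy data the third sample splits the channels two against two, so majority voting misses the attack while the weighted and averaged variants catch it. This shows only why the choice of fusion rule can change the ACER; it says nothing about the results on WMCA.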