In recent years, there has been a considerable rise in applications for which object or image categorization is beneficial, for example analyzing medical images, helping people organize their photo collections, recognizing the surroundings of self-driving vehicles, and many more. These applications require accurately labeled datasets, most of which involve an extensive diversity of image types, from cats and dogs to roads, landscapes, and so forth. The fundamental aim of image categorization is to predict the category or class of an input image by specifying the class to which it belongs. For human beings this is not difficult; however, teaching computers to perceive is a hard problem that has become a broad area of research interest, and both computer vision techniques and deep learning algorithms have evolved to address it. Conventional techniques use local descriptors to find likeness between images, but progress in technology has enabled the use of deep learning algorithms, especially Convolutional Neural Networks (CNNs), to automatically extract representative image patterns and features for classification. The fundamental aim of this paper is to inspect and explain how to utilize deep learning algorithms and technologies to accurately classify a dataset of images into their respective categories while keeping model complexity to a minimum. To achieve this aim, the work focuses on categorizing objects or images into their respective categories with excellent results and on identifying the best deep-learning-based models for image processing and categorization. CNN-based models are developed, and several pre-trained models (VGG19, DenseNet201, ResNet152V2, MobileNetV2, and InceptionV3) are presented; all of these models are trained on the Caltech-101 and Caltech-256 datasets.
Extensive comparative experiments were conducted on these datasets, and the obtained results demonstrate the effectiveness of the proposed models. The accuracies on the Caltech-101 and Caltech-256 datasets were 98.06% and 90%, respectively.
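The transfer-learning recipe this abstract describes, a pre-trained backbone kept frozen while only a new classification head is trained, can be illustrated with a minimal numpy sketch. The frozen CNN features are simulated here with synthetic vectors; all shapes, sizes, and the softmax-regression head are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated "frozen backbone" outputs: in practice these feature vectors
# would come from a pre-trained network such as VGG19 or MobileNetV2
# with its top classification layers removed.
n_samples, n_features, n_classes = 300, 64, 3
centers = rng.normal(0, 3, size=(n_classes, n_features))
y = rng.integers(0, n_classes, size=n_samples)
X = centers[y] + rng.normal(size=(n_samples, n_features))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Train only the new classification head (multinomial logistic regression);
# the "backbone" that produced X is never updated.
W = np.zeros((n_features, n_classes))
b = np.zeros(n_classes)
onehot = np.eye(n_classes)[y]
for _ in range(200):
    p = softmax(X @ W + b)
    grad = X.T @ (p - onehot) / n_samples
    W -= 0.1 * grad
    b -= 0.1 * (p - onehot).mean(axis=0)

accuracy = (np.argmax(X @ W + b, axis=1) == y).mean()
```

In a real pipeline, the only change is that `X` holds backbone activations for the Caltech images instead of synthetic vectors; keeping the backbone frozen is what keeps model complexity low.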
Clustering is a fundamental data analysis task that presents challenges. Choosing a proper centroid initialization technique is critical to the success of clustering algorithms such as k-means. The current work investigates six established methods (random, Forgy, k-means++, PCA, hierarchical clustering, and naive sharding) and three innovative swarm-intelligence-based approaches, Spider Monkey Optimization (SMO), the Whale Optimization Algorithm (WOA), and the Grey Wolf Optimizer (GWO), applied to k-means clustering (SMOKM, WOAKM, and GWOKM). The results on ten well-known datasets strongly favor the swarm-intelligence-based techniques, with SMOKM consistently outperforming WOAKM and GWOKM. This finding provides critical insights into selecting and evaluating centroid initialization techniques in k-means clustering. The current work is valuable because it provides guidance for those seeking optimal solutions for clustering diverse datasets. Swarm intelligence, especially SMOKM, effectively generates distinct and well-separated clusters, which is valuable in resource-constrained settings. The research also sheds light on the performance of traditional methods such as hierarchical clustering, PCA, and k-means++, which, while promising for specific datasets, consistently underperform the swarm-intelligence-based alternatives. In conclusion, the current work contributes essential insights into selecting and evaluating centroid initialization techniques for k-means clustering. It highlights the superiority of swarm intelligence, particularly SMOKM, and provides actionable guidance for addressing various clustering challenges.
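Of the baselines this abstract compares, k-means++ is the easiest to make concrete: each new centroid is sampled with probability proportional to its squared distance from the nearest centroid already chosen. The sketch below shows that seeding step only (the swarm-based variants SMOKM/WOAKM/GWOKM are beyond a short example); the synthetic three-blob data is illustrative.

```python
import numpy as np

def kmeans_pp_init(X, k, rng):
    """k-means++ seeding: pick each new centroid with probability
    proportional to its squared distance from the nearest chosen one."""
    n = X.shape[0]
    centroids = [X[rng.integers(n)]]  # first centroid: uniform at random
    for _ in range(k - 1):
        d2 = np.min(
            ((X[:, None, :] - np.array(centroids)[None, :, :]) ** 2).sum(-1),
            axis=1,
        )
        probs = d2 / d2.sum()          # D^2 weighting
        centroids.append(X[rng.choice(n, p=probs)])
    return np.array(centroids)

rng = np.random.default_rng(42)
# Three well-separated synthetic blobs in 2-D.
X = np.vstack([rng.normal(c, 0.2, size=(50, 2)) for c in [(0, 0), (5, 5), (0, 5)]])
C = kmeans_pp_init(X, 3, rng)
```

The D^2 weighting makes it very likely that the three seeds land in three different blobs, which is exactly the property the initialization comparisons in the paper are probing.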
Low-quality data can be dangerous for machine learning models, especially in critical situations. Some large-scale datasets contain low-quality data and false labels, and image datasets may also contain artifacts and biases arising from measurement errors. Automatic algorithms that can recognize low-quality data are therefore needed. In this paper, the Shapley value, a data valuation metric, is used to quantify the contribution of training data to the performance of a classification algorithm on the large ImageNet dataset. We assess the success of data Shapley in distinguishing low-quality from valuable data for classification. We find that model performance increases when data with low Shapley values are removed, whereas classification performance declines when data with high Shapley values are removed. Moreover, there were more correctly labeled samples among high-Shapley-value data and more mislabeled samples among low-Shapley-value data. The results show that mislabeled or poor-quality images tend to have low Shapley values, while data valuable for classification have high Shapley values.
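The mechanics of data Shapley can be shown on a toy problem: average, over random permutations of the training points, each point's marginal change in validation accuracy (a Monte Carlo approximation of the Shapley value). This is a deliberately tiny sketch with a nearest-centroid classifier and a planted mislabeled point; the data, classifier, and permutation count are all assumptions for illustration, not the paper's ImageNet setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D, two-class training set; the last point is deliberately mislabeled.
X_train = np.array([-2.0, -1.5, -1.0, 1.0, 1.5, 2.0, -1.8])
y_train = np.array([0, 0, 0, 1, 1, 1, 1])  # the last label is wrong
X_val = np.array([-2.2, -1.2, 1.2, 2.2])
y_val = np.array([0, 0, 1, 1])

def score(idx):
    """Validation accuracy of a nearest-centroid rule trained on subset idx."""
    if len(idx) == 0 or len(set(y_train[idx])) < 2:
        return 0.5  # chance level when a class is missing
    c0 = X_train[idx][y_train[idx] == 0].mean()
    c1 = X_train[idx][y_train[idx] == 1].mean()
    pred = (np.abs(X_val - c1) < np.abs(X_val - c0)).astype(int)
    return (pred == y_val).mean()

n = len(X_train)
values = np.zeros(n)
n_perms = 200
for _ in range(n_perms):           # Monte Carlo over permutations
    perm = rng.permutation(n)
    prev = score([])
    for j, i in enumerate(perm):   # marginal contribution of point i
        cur = score(list(perm[: j + 1]))
        values[i] += cur - prev
        prev = cur
values /= n_perms
```

By construction the values telescope, so they sum to `score(all) - score(empty)`, and the planted mislabeled point ends up with a clearly lower value than the clean points, mirroring the paper's finding that low-Shapley-value data is where the bad labels live.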
Growing interest in nature-inspired computing and bio-inspired optimization techniques has led to powerful tools for solving learning problems and analyzing large datasets. Several methods have been utilized to create high-performance optimization algorithms. However, certain applications, such as nonlinear real-time systems, are difficult to describe using accurate mathematical models. Such large-scale, highly nonlinear modeling problems are solved using soft computing techniques. In this paper, the researchers incorporate one of the most advanced plant-inspired algorithms, the Venus Flytrap Optimization (VFO) algorithm, with soft-computing techniques, specifically an inverse-model Adaptive Neuro-Fuzzy Inference System (ANFIS), to control the real-time temperature of a microwave cavity that heats oil. MATLAB was successfully integrated with the LabVIEW platform, and wide ranges of input and output variables were tested. Challenges arose from heating-system conditions such as reflected power, variations in oil temperature, oil inlet absorption, and cavity temperatures affecting the oil temperature, in addition to the temperature's effect on viscosity. The LabVIEW design is described, and the results demonstrate the performance of the VFO inverse-ANFIS controller.
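The inverse-model control idea underlying this work can be sketched generically: fit a model of the plant's inverse (desired output to required input) from sampled data, then feed the setpoint through that inverse to get the control command. The sketch below is a heavily simplified stand-in, with a made-up static plant and a polynomial fit in place of the trained ANFIS inverse model and VFO tuning; every function and number here is an assumption for illustration only.

```python
import numpy as np

def plant(u):
    """Hypothetical static nonlinear plant: microwave power -> steady-state
    oil temperature (degrees C). Purely illustrative, not the real cavity."""
    return 20 + 8 * np.sqrt(u) + 0.05 * u

# Learn an inverse model (temperature -> power) from sampled plant data.
# A polynomial fit stands in for the ANFIS inverse model of the paper.
u_train = np.linspace(0, 100, 200)
T_train = plant(u_train)
inverse_model = np.poly1d(np.polyfit(T_train, u_train, deg=4))

# Control step: push the desired temperature through the inverse model.
setpoint = 75.0                       # desired oil temperature, degrees C
u_cmd = float(inverse_model(setpoint))
T_out = plant(u_cmd)                  # plant response to the commanded power
```

In the paper's setup the inverse map is an ANFIS trained on plant data and its parameters are optimized with VFO; this sketch only shows why a good inverse model makes the plant output land near the setpoint in open loop.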
Kinship (familial relationship) detection is crucial in many fields, with applications in biometric security, adoption, forensic investigations, and more. It is also essential during wars and natural disasters such as earthquakes, since it may aid in reunions, missing-person searches, establishing emergency contacts, and providing psychological support. The most common method of determining kinship is DNA analysis, which is highly accurate. Another, noninvasive approach uses facial photos with computer vision and machine learning algorithms for kinship estimation. Each part of the human body carries embedded information that can be extracted and used for identification, verification, or classification of a person. Kinship recognition is based on finding traits shared within a family. We investigate the use of hand geometry for kinship detection, which is a new approach. Because available hand image datasets do not contain kinship ground truth, we created our own dataset. This paper describes the tools, methodology, and details of the collected MKH (Mosul Kinship Hand) image dataset. The MKH images were collected using a mobile phone camera with a suitable setup and consist of 648 images of 81 individuals from 14 families (eight hand poses per person). This paper also presents the use of this dataset for kinship prediction using machine learning. Google MediaPipe was used for hand detection, segmentation, and finding geometrical key points. Handcrafted feature extraction was used to derive 43 distinctive geometrical features from each image. A neural network classifier was designed and trained to predict kinship, yielding about 93% prediction accuracy. The results of this novel approach demonstrate that the hand possesses biometric characteristics that can be used to establish kinship and that the suggested method is a promising kinship indicator.
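A core step in this pipeline, turning detected key points into scale-invariant geometric features, can be sketched in a few lines. MediaPipe Hands reports 21 landmarks per hand (index 0 is the wrist, 12 the middle fingertip); the sketch below uses synthetic coordinates in their place and computes normalized pairwise distances as an illustrative feature set, not the paper's exact 43 features.

```python
import numpy as np

def hand_features(landmarks):
    """Scale-invariant geometric features from 2-D hand key points:
    all pairwise distances, normalized by the wrist-to-middle-fingertip
    length so that camera distance does not matter."""
    d = np.linalg.norm(landmarks[:, None, :] - landmarks[None, :, :], axis=-1)
    ref = d[0, 12]  # wrist (0) to middle fingertip (12) in MediaPipe indexing
    iu = np.triu_indices(len(landmarks), k=1)
    return d[iu] / ref

rng = np.random.default_rng(3)
lm = rng.random((21, 2))          # stand-in for 21 MediaPipe hand landmarks
feats = hand_features(lm)
scaled = hand_features(lm * 4.7)  # same hand captured at a different scale
```

The normalization is what makes features comparable across photos of the same family taken at different distances; a real feature extractor would pick a curated subset (such as finger lengths and palm widths) rather than all 210 pairwise distances.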
Recently, numerous studies have emphasized the importance of professional inspection and repair when faults are suspected in photovoltaic (PV) systems. By leveraging electrical and environmental features, machine learning models can provide valuable insights into the operational status of PV systems. In this study, different machine learning models for PV fault detection were developed and evaluated using a simulated 0.25 MW PV power system. The training and testing datasets encompassed normal operation and various fault scenarios, including string-to-string, on-string, and string-to-ground faults. Multiple electrical and environmental variables, such as current, voltage, power, temperature, and irradiance, were measured and used as features. Four algorithms (Tree, LDA, SVM, and ANN) were tested using 5-fold cross-validation to identify faults in the PV system. The performance evaluation of the models revealed promising results, with all algorithms demonstrating high accuracy. The Tree and LDA algorithms exhibited the best performance, achieving accuracies of 99.544% on the training data and 98.058% on the testing data. LDA achieved perfect accuracy (100%) on the testing data, while SVM and ANN achieved 95.145% and 89.320% accuracy, respectively. These findings underscore the potential of machine learning algorithms for accurately detecting and classifying various types of PV faults.
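The 5-fold cross-validation protocol used here is worth making concrete: shuffle the data once, split it into five folds, and average the accuracy of five train/test rounds. The sketch below implements that split in plain numpy with a nearest-centroid classifier standing in for the Tree/LDA/SVM/ANN models, on synthetic "normal vs. fault" feature vectors; everything here is illustrative, not the paper's simulated PV data.

```python
import numpy as np

def k_fold_indices(n, k, rng):
    """Shuffle indices and split them into k near-equal folds."""
    idx = rng.permutation(n)
    return np.array_split(idx, k)

def cross_validate(X, y, k, rng):
    folds = k_fold_indices(len(X), k, rng)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Nearest-centroid stand-in for the Tree/LDA/SVM/ANN models.
        centroids = {c: X[train][y[train] == c].mean(axis=0)
                     for c in np.unique(y[train])}
        labels = np.array(sorted(centroids))
        dists = np.stack([np.linalg.norm(X[test] - centroids[c], axis=1)
                          for c in labels])
        pred = labels[np.argmin(dists, axis=0)]
        accs.append((pred == y[test]).mean())
    return np.mean(accs)

rng = np.random.default_rng(7)
# Synthetic "normal vs. fault" vectors (current, voltage, power, temp, irradiance).
X = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(4, 1, (100, 5))])
y = np.array([0] * 100 + [1] * 100)
acc = cross_validate(X, y, 5, rng)
```

Because every sample serves as test data exactly once, the averaged accuracy is a less optimistic estimate than a single train/test split, which is why the paper reports cross-validated figures.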
Breast cancer is one of the most critical diseases affecting people around the world and one of the most common medical risks they will face. It is considered a leading cause of death worldwide, and early detection is difficult. In healthcare, where early diagnosis based on machine learning (ML) helps save patients' lives, better-performing diagnostic procedures are crucial. ML models have been used to improve the effectiveness of early diagnosis. In this paper, we propose a new feature selection method that combines two filter methods, Pearson correlation and mutual information (PC-MI), to analyse the correlation amongst features and then select important features before passing them to a classification model. Our method enables early breast cancer prediction and relies on a soft voting classifier that combines a set of ML models (decision tree, logistic regression and support vector machine) into one model that carries the strengths of the combined models, yielding the best prediction accuracy. Our work is evaluated on the Wisconsin Diagnostic Breast Cancer dataset. The proposed methodology outperforms previous work, achieving 99.3% accuracy, an F1 score of 0.9922, a recall of 0.9846, a precision of 1 and an AUC of 0.9923. Furthermore, the accuracy under 10-fold cross-validation is 98.2%.
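The PC-MI filter idea, scoring each feature by both its Pearson correlation with the label and its mutual information, can be sketched directly. How the paper combines the two scores is not specified in the abstract, so the simple sum used below is an assumption; the histogram-based MI estimator and the synthetic data with two planted informative features are likewise illustrative.

```python
import numpy as np

def mutual_info(x, y, bins=8):
    """Histogram-based mutual information between a feature and the labels."""
    xb = np.digitize(x, np.histogram_bin_edges(x, bins)[1:-1])
    joint = np.zeros((bins, len(np.unique(y))))
    for xi, yi in zip(xb, y):
        joint[xi, yi] += 1
    joint /= joint.sum()
    px = joint.sum(1, keepdims=True)
    py = joint.sum(0, keepdims=True)
    nz = joint > 0
    return (joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum()

def pc_mi_select(X, y, k):
    """Rank features by |Pearson correlation with y| plus mutual information
    (an assumed way of fusing the two filter scores) and keep the top k."""
    pc = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    mi = np.array([mutual_info(X[:, j], y) for j in range(X.shape[1])])
    return np.argsort(pc + mi)[::-1][:k]

rng = np.random.default_rng(5)
y = rng.integers(0, 2, 400)
X = rng.normal(size=(400, 6))
X[:, 2] += 3 * y          # feature 2: strongly informative
X[:, 4] += 1.5 * y        # feature 4: moderately informative
selected = pc_mi_select(X, y, 2)
```

The two planted features dominate both scores, so the filter recovers them; the selected columns would then be passed to the soft voting classifier described in the abstract.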
Due to their vital applications in many real-world situations, researchers continue to present numerous methods for better analysis of motor imagery (MI) electroencephalograph (EEG) signals. In general, however, EEG signals are complex because of their nonstationarity and high dimensionality. Therefore, great care must be taken in both feature extraction and classification. In this paper, several hybrid classification models are built and their performance is compared. Three well-known wavelet mother functions are used to generate scalograms from the raw signals. The scalograms are used for transfer learning with the well-known VGG-16 deep network, and one of six classifiers then determines the class of the input signal. The performance of different combinations of mother functions and classifiers is compared on two MI EEG datasets. Several evaluation metrics show that a model combining the VGG-16 feature extractor with a neural network classifier and the Amor mother wavelet function outperforms the results of state-of-the-art studies.
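The scalogram step, turning a 1-D signal into a 2-D time-scale image a CNN can consume, can be sketched with a direct continuous wavelet transform. The "Amor" wavelet (MATLAB's name for the analytic Morlet) is approximated below by a simple complex Morlet kernel; the signal, sampling rate, and scale grid are illustrative assumptions, not the paper's EEG processing chain.

```python
import numpy as np

def morlet_scalogram(signal, scales, w0=6.0):
    """CWT magnitude with a (simplified) complex Morlet mother wavelet,
    yielding an (n_scales, n_samples) scalogram image."""
    n = len(signal)
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        t = np.arange(-4 * s, 4 * s + 1)           # wavelet support
        wavelet = np.exp(1j * w0 * t / s) * np.exp(-0.5 * (t / s) ** 2)
        wavelet /= np.sqrt(s)
        out[i] = np.abs(np.convolve(signal, wavelet, mode="same"))
    return out

fs = 128                                  # assumed sampling rate, Hz
t = np.arange(0, 2, 1 / fs)
sig = np.sin(2 * np.pi * 10 * t)          # stand-in for one EEG channel
scalo = morlet_scalogram(sig, scales=np.arange(1, 31))
```

In the paper's pipeline, images like `scalo` (rendered in color and resized) are fed to VGG-16, whose convolutional activations become the features for the six downstream classifiers.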
COVID-19 is an infectious viral disease that mostly affects the lungs and spreads quickly across the world. Early detection of the virus boosts patients' chances of a quick recovery. Many radiographic techniques, such as X-rays, are used to diagnose an infected person, and deep learning technology based on large numbers of chest X-ray images is used to diagnose COVID-19. Because of the scarcity of available COVID-19 X-ray images, the limited COVID-19 datasets are insufficient for efficient deep learning detection models. Another problem with a limited dataset is that trained models suffer from overfitting and their predictions do not generalize. To address these problems, we developed a Conditional Generative Adversarial Network (CGAN) to produce synthetic images close to real images for the COVID-19 class, along with traditional augmentation, to expand the limited dataset, which was then used to train a customized deep detection model. The customized deep learning model obtained an excellent detection accuracy of 97% with only ten epochs. The proposed augmentation outperforms other augmentation techniques. The augmented dataset includes 6988 high-quality, high-resolution COVID-19 X-ray images, while the original COVID-19 X-ray images number only 587.
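What makes a GAN "conditional" is that the class label is fed to the generator alongside the noise vector, so one network can synthesize images of a requested class. The sketch below shows only that conditioning mechanism, an untrained toy generator with random weights and made-up sizes; it produces no meaningful images and is not the paper's CGAN architecture.

```python
import numpy as np

rng = np.random.default_rng(9)

def conditional_generator(z, label, n_classes=2, img_side=28):
    """Forward pass of an untrained toy conditional generator: the class
    label is one-hot encoded, concatenated with the noise vector, and
    mapped through a small random MLP to a fake image in [-1, 1]."""
    onehot = np.eye(n_classes)[label]
    x = np.concatenate([z, onehot])           # conditioning happens here
    W1 = rng.normal(0, 0.1, (x.size, 128))    # random, untrained weights
    W2 = rng.normal(0, 0.1, (128, img_side * img_side))
    h = np.tanh(x @ W1)
    img = np.tanh(h @ W2)
    return img.reshape(img_side, img_side)

z = rng.normal(size=100)                      # latent noise vector
fake_covid = conditional_generator(z, label=1)  # request the "COVID-19" class
```

In training, a discriminator receiving the same label would push such outputs toward realistic class-specific X-rays; after training, sampling with `label=1` yields the synthetic COVID-19 images used to grow the dataset from 587 to 6988 images.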