Page 272 - 2024-Vol20-Issue2

268 |                                                                                    Qadir, Abdalla & Abd

for training, with only 10% of the training dataset utilized. Chaunzwa et al. constructed a supervised CNN detection system to identify lung cancer patients with early-stage adenocarcinoma (ADC) and squamous cell carcinoma (SCC). The CNN was tested on real-time non-SCLC data from patients who were affected in the early stages, collected at Massachusetts General Hospital [14]. About 311 data phases were gathered, all of which are present in the database. The resulting CNN-based detection system reached an AUC of 71 percent, which was inadequate.

Chaturvedi et al. evaluated the methods of the most recent studies on detecting and classifying lung cancer. The Super Bowl Dataset 2016, LUNA 16, and the standard LIDC-IDRI dataset are commonly used with supervised learning algorithms such as SVM, CNN, and KNN. According to the authors of the article, these algorithms are often used in the identification of diseases and on CT data [15]. Naik et al. [16] provided a detailed description of a pulmonary nodule classification system utilizing a fractal network. The Fractalnet model was trained and validated on the LUNA16 dataset, resulting in an accuracy of 94.7%. Nasser et al. [17] presented a LungNet-SVM approach for the efficient segmentation and classification of pulmonary nodules in CT images into just two classes. LungNet-SVM is a modified iteration of the AlexNet architecture, with a support vector machine (SVM) algorithm employed as the classifier. During the training and validation phases, the model considers three different input image sizes (16 × 16, 32 × 32, and 48 × 48) and is optimized with three different optimizers (Adam, RMSprop, and SGD) in order to fine-tune the model for optimal accuracy. The experimental results reveal that the LungNet-SVM model, particularly with the SGD optimizer, attains its highest accuracy, 97.64%, on 48 × 48 input images.

The current approaches and datasets utilized in various algorithms are summarized in Table I.

III. METHODOLOGY

The presented model consists of two major components. The first is a deep transfer learning model, namely "VGG-16", which is used as a feature extractor. The second is a machine learning algorithm, namely the "XGBoost classifier". The proposed model was trained and tested on the IQ-OTH/NCCD dataset.

A. Performance Metrics

We used several performance metrics to evaluate our model (Hybrid-LCSCDM), including:

1) Confusion Matrix
A confusion matrix displays the model's predictions. It is intended to show the instances in which the model handled the data correctly and those in which it did not. Fig. 1 shows the components of a confusion matrix; it provides details on the performance indicators used to confirm the validity of the suggested Hybrid-LCSCDM model and shows four outcomes. The first two are the true positive (TP) and true negative (TN), which are the cases correctly predicted by the proposed model. The other two are false positive (FP) and false negative (FN), which are the cases that the proposed model failed to predict correctly. The rows reflect the possible classifications, while the columns represent the correct categorization of the data [19].

Fig. 1. Performance measures that are used to validate the suggested Hybrid-LCSCDM model.

2) Accuracy
One of the most popular and widely used metrics in machine learning, accuracy shows the proportion of correct predictions with respect to the overall data [19].

\[
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}
\]

3) Precision
Precision indicates the ratio of correctly predicted positive outcomes to the total number of predicted positive outcomes [19] and is described as:

\[
\text{Precision} = \frac{TP}{TP + FP} \tag{2}
\]

4) Recall
Recall is used to calculate the proportion of correctly predicted positive outcomes relative to the total number of actual outcomes in a given class [19] and may be defined as follows: