Page 273 - 2024-Vol20-Issue2

269 |                                                                          Qadir, Abdalla & Abd

TABLE I.
SOME OF THE CURRENT APPROACHES AND DATASETS UTILIZED IN VARIOUS ALGORITHMS FOR THE DETECTION OF LUNG CANCER

Study | Year | Methodology | Dataset(s) | Performance | Analysis
[9] | 2011 | Algorithms for unsupervised learning | LIDC-IDRI | Acc. (94.3%) | Focus has been placed on a tissue-based categorization approach
[10] | 2018 | 3D-CNN model | BR-Dataset | Acc. (92.65%) | Additional trending algorithms might improve the accuracy, precision, and recall
[18] | 2020 | DenseNet classifier | LIDC-IDRI | Acc. (90.85%) | Simply utilized a small database
[11] | 2020 | BPSO-DT | LUNA | Acc. (88.25%) | Other trending algorithms can increase the accuracy
[12] | 2020 | RNN, CNN | LUNA16, LIDC-IDRI, and ANODE09 | Acc. (91%), AUC (0.78) | The proposed model, compared to the other approaches, achieved a lower accuracy rate
[13] | 2021 | 3D CNN-AlexNet | LUNA | Acc. (89%) | With 10% of the data evaluated, the AlexNet model’s flaw was shown to be ineffective for real-time medical evaluation
[14] | 2021 | CNN | LUAD | Acc. (71%) | This model’s limitations are not concentrated on the segmentation and preprocessing that increase the accuracy of the model
[15] | 2021 | KNN, SVM classifiers | LUNA16, LIDC-IDRI | Acc. (91%) | The complexity of elapsed time is considerable
[16] | 2022 | SqueezeNet + ResNet | LUNA16 | Acc. (94.87%) |
[17] | 2023 | Modified CNN + SVM | LUNA16 | Acc. (97.64%) | The proposed LungNet-SVM classifies lung cancer into only two categories: benign and malignant

\[ \text{Recall} = \frac{TP}{TP + FN} \tag{3} \]

5) F1 Score
The F1 score is the harmonic mean of precision and recall, resulting in a value between 0 and 1. The F1 score is considered a better performance statistic than accuracy [19] and is defined as follows:

\[ \text{F1 score} = \frac{2 \times (\text{recall} \times \text{precision})}{\text{recall} + \text{precision}} \tag{4} \]

The choice between the F1 score and accuracy depends on the distribution of the data. When the classes are highly imbalanced, the F1 score tends to be a more appropriate option than accuracy, since most real-world classification problems have an uneven distribution of classes. Accuracy is suitable when the class distribution is comparable; its disadvantage is that it does not consider the ratio of the distribution between the classes, which can affect the obtained results and lead to incorrect conclusions [20].

B. Dataset
The dataset used in this research is the Iraq-Oncology Teaching Hospital/National Center for Cancer Diseases (IQ-OTH/NCCD) lung cancer dataset [21]; the data was collected over three months in the autumn of 2019. It includes CT images of healthy individuals as well as patients with lung cancer at varying stages. Radiologists and oncologists at these two institutions annotated the IQ-OTH/NCCD images. The entire collection consists of 1190 images corresponding to CT scans, obtained from 110 patients. A Siemens SOMATOM scanner was originally employed, and the original format of CT