Page 273 - 2024-Vol20-Issue2
269 | Qadir, Abdalla & Abd
TABLE I.
SOME OF THE CURRENT APPROACHES AND DATASETS UTILIZED IN VARIOUS ALGORITHMS FOR THE DETECTION OF LUNG CANCER

| Study | Year | Methodology | Dataset(s) | Performance | Analysis |
|---|---|---|---|---|---|
| [9] | 2011 | Algorithms for unsupervised learning | LIDC-IDRI | Acc. (94.3%) | Focus has been placed on a tissue-based categorization approach |
| [10] | 2018 | 3D-CNN model | BR-Dataset | Acc. (92.65%) | Additional trending algorithms might improve the accuracy, precision, and recall |
| [18] | 2020 | DenseNet classifier | LIDC-IDRI | Acc. (90.85%) | Simply utilized a small database |
| [11] | 2020 | BPSO-DT | LUNA | Acc. (88.25%) | Other trending algorithms can increase the accuracy |
| [12] | 2020 | RNN, CNN | LUNA16, LIDC-IDRI, and ANODE09 | Acc. (91%), AUC (0.78) | The proposed model, compared to the other approaches, achieved a lower accuracy rate |
| [13] | 2021 | 3D CNN-AlexNet | LUNA | Acc. (89%) | With 10% of the data evaluated, the AlexNet model’s flaw was shown to be ineffective for real-time medical evaluation |
| [14] | 2021 | CNN | LUAD | Acc. (71%) | This model’s limitations are not concentrated on the segmentation and preprocessing that increase the accuracy of the model |
| [15] | 2021 | KNN, SVM classifiers | LUNA16, LIDC-IDRI | Acc. (91%) | The complexity of elapsed time is considerable |
| [16] | 2022 | SqueezeNet + ResNet | LUNA16 | Acc. (94.87%) | |
| [17] | 2023 | Modified CNN + SVM | LUNA16 | Acc. (97.64%) | The proposed LungNet-SVM classifies lung cancer into only two categories: benign and malignant |
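
To make the metric definitions in this section concrete (recall in Eq. (3) and F1 score in Eq. (4)), the following is a minimal Python sketch; it is not the authors' code. The function names `confusion_counts` and `binary_metrics`, the standard precision definition TP / (TP + FP), and the toy labels (95 negatives, 5 positives) are assumptions made purely for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary problem (hypothetical helper)."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall (Eq. 3) and F1 score (Eq. 4)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Assumed toy data: 95 negative and 5 positive samples (highly imbalanced).
y_true = [0] * 95 + [1] * 5
# Assumed classifier output: only 1 of the 5 positives is detected.
y_pred = [0] * 95 + [1, 0, 0, 0, 0]

scores = binary_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

On this hypothetical sample, accuracy is 0.96 while recall is 0.2 and the F1 score is about 0.33, which illustrates why accuracy alone can be misleading when the classes are imbalanced.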
$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$

5) F1 Score
The F1 score is the harmonic mean of precision and recall, resulting in a value between 0 and 1. It is considered a more informative performance statistic than accuracy [19] and is defined as follows:

$$F1_{score} = \frac{2 \times (\text{recall} \times \text{precision})}{\text{recall} + \text{precision}} \tag{4}$$

It should be mentioned that the distribution of the data is the determining factor in choosing between the F1 score and accuracy. When the classes are highly imbalanced, the F1 score tends to be a more appropriate option than accuracy, since most real-world classification problems have an uneven distribution of classes. Accuracy is suitable when the class distribution is roughly balanced; its disadvantage is that it does not consider the ratio between the classes, which can distort the obtained results and lead to incorrect conclusions [20].

B. Dataset
The dataset used in this research is the Iraq-Oncology Teaching Hospital/National Center for Cancer Diseases (IQ-OTH/NCCD) dataset [21]; the data was collected over three months in the autumn of 2019. It includes CT images of healthy individuals as well as patients with varying stages of lung cancer. Radiologists and oncologists at these two institutions annotated the IQ-OTH/NCCD images. The entire collection consists of 1190 images corresponding to CT scans, obtained from 110 patients. Originally, Siemens SOMATOM was employed as the scanner, and the original format of CT