Page 274 - 2024-Vol20-Issue2

270 |                                                              Qadir, Abdalla & Abd

Fig. 2. Samples of the dataset: (A) Benign, (B) Malignant, and (C) Normal.

images was the DICOM format. The CT procedure consists of 120 kV, window widths of 350 to 1200 HU, a slice thickness of 1 mm, and window centers ranging from 50 to 600, read at full inspiration with breath-hold. Before the analysis process, all images underwent de-identification. Each scan includes a number of slices, ranging between 80 and 200, and each slice is an image of the patient's chest shown from different perspectives and sides. The benign patients differed in age, gender, geographical location, educational level, and living conditions. Some of these patients were workers and farmers, while others were employees of the Transport and Oil Ministry of Iraq. Most lived in the provinces of Babylon, Wasit, Salahuddin, Diyala, and Baghdad. Fig. 2 shows samples of our dataset. The dataset was divided into three categories: malignant (40), benign (15), and normal (55).

Benign lesions, such as benign tumors or cysts, are non-cancerous but may resemble cancer on scans. They typically have well-defined borders and a uniform appearance, unlike malignant tumors, which are often irregular and rapidly changing. A common benign lung lesion is the hamartoma, characterized by its slow growth and well-circumscribed nature [22], while normal lung imaging shows healthy lung tissue with no abnormalities. These images are essential for comparison to identify pathological changes in subsequent scans [23].

C. Preprocessing

In this study, two major preprocessing steps are utilized. First, the dataset images came with an original size of 512x512 pixels; these images were resized to 224x224, the default input size of the VGG-16 network. Second, all of the pixels were normalized to the range of 0 to 1 in order to reduce the computational complexity of the image recognition process. In our study, we addressed the challenge of class imbalance in our dataset through the application of stratified 5-fold cross-validation. This technique is particularly beneficial for imbalanced datasets, as it ensures that each fold maintains a consistent ratio of the different class labels, closely mirroring the original dataset's class distribution. By employing this method, we aimed to achieve a more reliable and unbiased evaluation of our model's performance. Each of the five folds was used once as a validation set, while the remaining four folds served as the training set. This approach not only allowed for a thorough assessment of the model across different subsets of the data but also helped mitigate the potential bias that could arise from the uneven class distribution. The consistent representation of classes in each fold helped validate the robustness and generalizability of our model, ensuring that the validation metrics were reflective of its performance across the diverse class spectrum [24].

D. Feature Extraction using VGG-16

The VGG network design was first proposed in a study by Simonyan and Zisserman [25]. Two variations of the model were created, a 16-layer (VGG-16) and a 19-layer (VGG-19) network, with the aim of entering these models in the 2014 ImageNet challenge, where the Visual Geometry Group (VGG) team took first place in the localization track and second place in the classification track. The architecture of VGG-16 is composed of five convolutional blocks followed by three fully connected layers. The convolutional layers use 3x3 kernels with a padding and stride of 1, ensuring that each activation map keeps the same spatial dimensions as the layer below it.
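The size preservation described above follows from the standard convolution output formula; a minimal check (the helper name is ours, not from the paper):

```python
def conv_out(size, kernel=3, padding=1, stride=1):
    """Spatial output size of a convolution: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

# A 3x3 kernel with padding 1 and stride 1 leaves a 224x224 input unchanged.
print(conv_out(224))  # -> 224
```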
After each convolution, a rectified linear unit (ReLU) activation is applied, and a max pooling operation then reduces the spatial dimension. The max pooling layers use 2x2 kernels with a stride of 2 and no padding, so each spatial dimension of the activation map from the previous layer is cut in half. The final layer is a softmax layer with 1,000 units, which follows two fully connected layers of 4,096 units each with ReLU activation functions.
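Since each of the five convolutional blocks ends in a 2x2, stride-2 max pool, the 224x224 input is halved five times before reaching the fully connected layers; a quick sketch (the 512-channel count is the standard VGG-16 value, not stated in the text above):

```python
# Track the feature-map side length through VGG-16's five pooling stages.
size = 224
for block in range(5):
    size //= 2  # each 2x2, stride-2 max pool halves both spatial dimensions
    print(f"after block {block + 1}: {size}x{size}")

# The last block yields 7x7 maps with 512 channels, i.e. 7 * 7 * 512 = 25088
# values flattened into the first fully connected layer.
print(size * size * 512)  # -> 25088
```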
At the same time, one of the drawbacks of the VGG-16 model is that it is expensive to analyze, requiring a lot of processing power in terms of memory and parameters. The model contains almost 138 million parameters. In the proposed model, an XGBoost classifier replaces the fully connected layers, in which approximately 123 million parameters reside.
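The roughly 123 million figure can be verified from the layer sizes given above (two 4,096-unit layers and a 1,000-unit output), assuming the standard 7x7x512 = 25088-dimensional flattened input to the first fully connected layer:

```python
# Parameters (weights + biases) of VGG-16's three fully connected layers.
fc1 = 25088 * 4096 + 4096   # flattened 7x7x512 maps -> 4096 units
fc2 = 4096 * 4096 + 4096    # 4096 units -> 4096 units
fc3 = 4096 * 1000 + 1000    # 4096 units -> 1000-way softmax output
total = fc1 + fc2 + fc3
print(f"{total:,}")  # -> 123,642,856, i.e. ~123.6M of the ~138M total
```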
                                                                   These layers are considered the bulk of these parameters; this