Page 54 - 2023-Vol19-Issue2

50 |                                                                                                               Hashim & Yassin

selected through the proposed method (PC-MI) and passed to the proposed model (a soft voting classifier) to give the best classification of the tumour type, benign or malignant.

IV. RESULTS AND DISCUSSION

In this section, the performance results of the proposed methodology are presented and discussed in terms of the F1 score, precision, recall, accuracy, AUC, and ROC curves. We conduct three experiments. The first experiment compares the performance of the soft voting classifier with that of each of its constituent models (LR, SVM and DT) separately. The second experiment presents the results of the soft voting classifier and compares them with previous work. Both of these experiments use a train-test split to divide the dataset. In the third experiment, the dataset is split into 10 folds, and the performance results of the soft voting classifier are presented. In addition, we explain where our proposed methodology stands and what it contributes to the practical side of early breast cancer diagnosis.

Fig. 6. The proposed soft voting classifier.

Experiment (1): In this experiment, we compare the performance of the individual models (LR, DT and SVM) with that of the proposed model (soft voting classifier) under the presented methodology, with a train-test split used to divide the dataset. Table IV presents the comparison results for these models.

TABLE IV.
Comparison between the performance of our work and the machine learning models used

Model                                  Accuracy (%)   Precision   Recall   F1 Score
Logistic Regression (LR)               98.6           0.9846      0.9846   0.9846
Decision Tree (DT)                     96.5           0.9412      0.9846   0.9624
Support Vector Machine (SVM)           96.5           0.9839      0.9385   0.9606
Soft Voting Classifier [LR, DT, SVM]   99.3           1           0.9846   0.9922

Table IV shows that the soft voting classifier obtains the highest accuracy (99.3%), F1 score (0.9922), and precision (1), and matches the best recall (0.9846, also reached by LR and DT). We attribute this to the voting classifier integrating the three models into one model that carries their combined strengths, which leads to the best prediction accuracy.

Figure 7 shows the ROC curves for the soft voting classifier and for the models included in it (LR, SVM and DT).

Fig. 7. ROC curves for the models used.

Experiment (2): In this experiment, we use the balanced dataset after selecting the best features through the proposed method PC-MI, so that only 18 features are used. A train-test split is used to divide the dataset, and the data are fed into the soft voting classifier to predict whether a patient's tumour is benign or malignant. Fig. 8 displays the performance of the soft voting classifier in terms of the main performance metrics.
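To make the soft voting step concrete, the following is a minimal sketch in plain Python and not the implementation used in this work. It assumes each base model (LR, DT, SVM) outputs a probability vector [P(benign), P(malignant)] and that all models are weighted equally. The function names `soft_vote` and `f1_from`, and the probability values, are hypothetical and chosen only for illustration. The last lines recompute the F1 score of the soft voting classifier from the precision and recall reported in Table IV.

```python
def soft_vote(prob_rows, weights=None):
    """Average the class probabilities of several models and pick the argmax.

    prob_rows: one probability vector per model, e.g. [P(benign), P(malignant)].
    weights:   optional per-model weights (assumed equal when omitted).
    """
    if weights is None:
        weights = [1.0] * len(prob_rows)
    total = sum(weights)
    n_classes = len(prob_rows[0])
    averaged = [
        sum(w * row[c] for w, row in zip(weights, prob_rows)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=averaged.__getitem__)
    return label, averaged


def f1_from(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical outputs for one sample (class 0 = benign, class 1 = malignant).
lr_probs = [0.55, 0.45]
dt_probs = [0.00, 1.00]
svm_probs = [0.60, 0.40]

label, averaged = soft_vote([lr_probs, dt_probs, svm_probs])

# Consistency check on the soft voting row of Table IV (precision 1, recall 0.9846).
table_f1 = round(f1_from(1.0, 0.9846), 4)
```

In this illustrative sample two of the three models lean towards benign, so a majority (hard) vote would return benign, whereas averaging the probabilities returns malignant because the decision tree is fully confident. This is the sense in which soft voting uses the confidence of each model and not only its predicted label.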