Page 55 - 2023-Vol19-Issue2


51 | Hashim & Yassin

Fig. 8. Performance results of soft voting classifier.

Experiment (3): In this experiment, we use the 18 features obtained from the proposed feature selection method (PC-MI), with the dataset split into 10 folds for training and testing. The training data are passed to the voting classifier, and the proposed model is evaluated by cross-validation. The proposed soft voting classifier (LR, DT and SVM) achieves 98.2% test accuracy.

Results on Applied Side: We have proposed a methodology for building an applied health system (a web page) that helps health institutions diagnose the type of breast cancer tumour quickly and accurately using ML models. This method helps preserve the patient's life through early treatment and removal of the tumour. The page is implemented using Spyder, a development environment for building software applications in Python.

First, a sample of the mass in the breast is obtained and analysed by a specialist called a pathologist. The values of the required features are then extracted and entered into the Breast Cancer Tumor Diagnostic website, which is built on our proposed model (soft voting classifier) and predicts whether the tumour is benign or malignant, as shown in Fig. 9.

Table V compares our proposed method with related studies that apply feature selection to the WDBC dataset. The table shows that our proposed method gives the highest accuracy, 99.3%, making it superior to all the studies from recent years that we compared. This gain in accuracy is due to the methods we have used, namely dataset balancing, the proposed feature selection and the proposed model (soft voting classifier).

V. CONCLUSION

Breast cancer should be detected early for effective treatment. Because it is one of the top causes of mortality in women, early diagnosis is crucial. The developed ML models enhance early breast cancer tumour prediction. However, false positive and false negative instances are important in medical research. Therefore, we focus not just on accuracy in our work but also on F1 score, precision, recall, AUC and the ROC curve. In this work, feature selection is performed by developing a method that combines two filtering techniques, PC and mutual information (PC-MI), to select the best features before passing them to a classification model. The proposed model (soft voting classifier), which combines three models (LR, SVM and DT), is used to enhance performance. The performance of these individual models is compared with that of the proposed model to demonstrate the efficiency and strength of the proposed model in the prediction process. The proposed methodology outperforms previous work, achieving 99.3% accuracy, an F1 score of 0.9922, a recall of 0.9846, a precision of 1 and an AUC of 0.9923. Furthermore, the accuracy of 10-fold cross-validation is 98.2%. Finally, a web page is created using Spyder and Streamlit to make the proposed methodology usable in practice, thereby helping health institutions diagnose the type of breast cancer tumour quickly and accurately. This study's future goals include using more feature selection techniques in conjunction with the WDBC dataset to improve breast cancer diagnosis. In addition, deep learning models will be used for breast cancer detection.

CONFLICT OF INTEREST

The authors have no conflict of interest relevant to this article.

REFERENCES

[1] World Health Organization (WHO), “http://www.who.int/cancer/prevention/diagnosis-screening/breast-cancer/en/,” World Breast Cancer Rep., 2020.

[2] A. B. Nassif, M. A. Talib, Q. Nasir, Y. Afadar, and O. Elgendy, “Breast cancer detection using artificial intelligence techniques: A systematic literature review,” Artificial Intelligence in Medicine, vol. 127, p. 102276, 2022.

[3] A. Haleem, M. Javaid, and I. H. Khan, “Current status and applications of artificial intelligence (AI) in medical field: An overview,” Current Medicine Research and Practice, vol. 9, no. 6, pp. 231–237, 2019.

[4] H. Asri, H. Mousannif, H. Al Moatassime, and T. Noel, “Using machine learning algorithms for breast cancer risk prediction and diagnosis,” Procedia Computer Science, vol. 83, pp. 1064–1069, 2016.

[5] S. Guo, Y. Liu, R. Chen, X. Sun, and X. Wang, “Improved SMOTE algorithm to deal with imbalanced activity classes in smart homes,” Neural Processing Letters, vol. 50, pp. 1503–1526, 2019.
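
To illustrate the evaluation summarized in the conclusion, the following is a minimal sketch, not the authors' implementation, of a soft voting classifier (LR, SVM and DT) scored by 10-fold cross-validation on accuracy, precision, recall, F1 and AUC. It assumes scikit-learn and its bundled copy of the WDBC data. The hyperparameters, the standardization step and the mutual-information filter used as a stand-in for PC-MI are all assumptions, and dataset balancing is omitted, so the printed scores should not be expected to match the figures reported above.

```python
from functools import partial

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# WDBC as bundled with scikit-learn: 569 samples, 30 features.
# The bundled labels are 0 = malignant, 1 = benign; flip them so that
# malignant is the positive class for precision, recall and F1.
X, y = load_breast_cancer(return_X_y=True)
y = 1 - y

# Soft voting averages the class probabilities of the base models,
# so SVC needs probability=True to expose predict_proba.
voter = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("svm", SVC(probability=True, random_state=0)),
        ("dt", DecisionTreeClassifier(random_state=0)),
    ],
    voting="soft",
)

# Assumption: a plain mutual-information filter keeping 18 features
# stands in for PC-MI; the paper's combined filter is not reproduced.
selector = SelectKBest(partial(mutual_info_classif, random_state=0), k=18)

# Scaling and feature selection sit inside the pipeline so that they
# are refitted on the training part of every fold.
model = make_pipeline(StandardScaler(), selector, voter)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scoring = ["accuracy", "precision", "recall", "f1", "roc_auc"]
scores = cross_validate(model, X, y, cv=cv, scoring=scoring)

# Mean of each metric over the 10 test folds.
mean_scores = {name: scores[f"test_{name}"].mean() for name in scoring}
for name, value in mean_scores.items():
    print(f"{name}: {value:.4f}")
```

Keeping the selector inside the pipeline avoids choosing features with knowledge of the test folds, which would otherwise inflate the cross-validated scores.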
