models. Therefore, standardization is applied using the standard deviation, so that each feature has a mean of zero and the resulting distribution has a standard deviation of one.
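
A minimal sketch of this standardization step is shown below, assuming a scikit-learn-style workflow (the paper does not name the library used); the small matrix X is a hypothetical stand-in for the dataset features:

    import numpy as np
    from sklearn.preprocessing import StandardScaler

    # Hypothetical stand-in for the dataset's numeric feature matrix.
    X = np.array([[35.0, 1.0], [62.0, 0.0], [47.0, 1.0]])

    # StandardScaler subtracts the column mean and divides by the column
    # standard deviation, giving each feature mean 0 and standard deviation 1.
    X_scaled = StandardScaler().fit_transform(X)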

E. Dataset Splitting

The dataset must be split into a training set and a test set before the classification algorithms are applied. The dataset is divided into 80% for the training set and 20% for the test set.
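
A minimal sketch of the 80/20 split under the same scikit-learn assumption; the synthetic X and y stand in for the pre-processed features and labels:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for the pre-processed feature matrix X and labels y.
    X, y = make_classification(n_samples=500, n_features=10, random_state=42)

    # 80% of the samples for training, 20% for testing, as described above.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)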

F. Classification Algorithms

There are many classification algorithms in machine learning. In the first classification model, we use Stochastic Gradient Descent (SGD), Naïve Bayes (NB), Logistic Regression (LR), and Random Forest (RF). In the second model, we use Support Vector Machines (SVM), Decision Tree (DT), eXtreme Gradient Boosting (XGBoost), and K-Nearest Neighbors (KNN).

1) Naïve Bayes
NB provides a way to predict the probabilities of the different classes based on the various features. It is mostly used in text classification and in multi-class problems [14].
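
A minimal illustrative sketch, reusing the X_train/X_test split from the sketch above; the paper does not state which Naïve Bayes variant was used, so Gaussian NB is assumed here:

    from sklearn.naive_bayes import GaussianNB

    # Class probabilities are derived from per-feature likelihoods.
    nb = GaussianNB()
    nb.fit(X_train, y_train)
    print("NB accuracy:", nb.score(X_test, y_test))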

2) Logistic Regression
LR is a supervised learning classification algorithm used to predict the probability of a target variable. Because the dependent variable (the target) is dichotomous in nature, there are two possible classes. LR models the relationship between a set of predictor variables and a categorical outcome variable [15].
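
A minimal LR sketch under the same assumptions, reusing the split from above:

    from sklearn.linear_model import LogisticRegression

    # Predicts the probability of the positive class for a dichotomous target.
    lr = LogisticRegression(max_iter=1000)
    lr.fit(X_train, y_train)
    print("LR accuracy:", lr.score(X_test, y_test))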

3) Stochastic Gradient Descent
It is a very efficient method, well suited to linear classifiers based on convex loss functions. SGD has been applied successfully to sparse, large-scale ML problems in text classification and natural language processing [16]. In SGD, the classifier fits regularized linear models; for every sample, the gradient of the loss is estimated and the model is updated according to the learning rate [17].
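
A minimal SGD sketch under the same assumptions; the hinge loss shown here is one possible choice of convex loss:

    from sklearn.linear_model import SGDClassifier

    # A regularized linear classifier fitted with stochastic gradient descent;
    # the model is updated per sample according to the learning rate schedule.
    sgd = SGDClassifier(loss="hinge", random_state=42)
    sgd.fit(X_train, y_train)
    print("SGD accuracy:", sgd.score(X_test, y_test))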

4) Random Forests
It is a supervised learning algorithm used in classification problems. It builds a collection of decision trees on samples of the training data, using a technique known as bagging. Each decision tree produces a class prediction; the votes of the trees are collected, and the final output is the class with the most votes [18].
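
A minimal RF sketch under the same assumptions; 100 trees is an illustrative choice, not the paper's setting:

    from sklearn.ensemble import RandomForestClassifier

    # An ensemble of decision trees grown on bootstrap samples (bagging);
    # the final prediction is the majority vote of the trees.
    rf = RandomForestClassifier(n_estimators=100, random_state=42)
    rf.fit(X_train, y_train)
    print("RF accuracy:", rf.score(X_test, y_test))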

5) Support Vector Machines
It is a supervised learning algorithm based on the idea of decision planes. It separates the data by creating a hyperplane, and these hyperplanes are used to classify the given set of classes [19]. It works to discover the boundaries that classify the training dataset correctly while lying as far as possible from the nearest data points [20].
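
A minimal SVM sketch under the same assumptions; the RBF kernel is an illustrative choice, not necessarily the paper's:

    from sklearn.svm import SVC

    # Finds the separating hyperplane with the largest margin to the
    # nearest training points (the support vectors).
    svm = SVC(kernel="rbf", C=1.0)
    svm.fit(X_train, y_train)
    print("SVM accuracy:", svm.score(X_test, y_test))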

6) eXtreme Gradient Boosting
XGBoost is a supervised learning algorithm used for regression and classification. It is a common and effective open-source implementation of gradient-boosted tree models. It attempts to predict the target variable accurately by combining the estimates of a set of weaker, simpler models. The idea of the algorithm is to continuously add trees and apply feature splitting to grow each tree [21].
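
A minimal XGBoost sketch, assuming the xgboost Python package and reusing the split from above; the hyperparameters are illustrative:

    from xgboost import XGBClassifier

    # Trees are added one at a time, each correcting the errors of the
    # current ensemble (gradient boosting on tree models).
    xgb = XGBClassifier(n_estimators=100, learning_rate=0.1, random_state=42)
    xgb.fit(X_train, y_train)
    print("XGBoost accuracy:", xgb.score(X_test, y_test))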

7) K-Nearest Neighbors
It is a supervised learning algorithm used in classification problems. It uses 'feature similarity' to classify new data values: the class assigned to a new value is based on how closely it matches the values in the training set. The training phase only stores the dataset, while the test phase assigns new data to the class of the most similar records in the dataset [22].
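
A minimal KNN sketch under the same assumptions; k = 5 is an illustrative choice:

    from sklearn.neighbors import KNeighborsClassifier

    # Training only stores the data; prediction assigns the majority class
    # of the k nearest (most similar) training samples.
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(X_train, y_train)
    print("KNN accuracy:", knn.score(X_test, y_test))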

8) Decision Tree
A Decision Tree is a supervised algorithm used for regression and classification problems. It works for both continuous and categorical output variables. A decision tree involves two stages, learning and predicting. In the learning stage, the system is trained using the given training data; in the predicting stage, the trained tree is used to predict the outcome for the test data [23].
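
A minimal DT sketch under the same assumptions, showing the two stages:

    from sklearn.tree import DecisionTreeClassifier

    # Learning stage: the tree is grown from the training data.
    dt = DecisionTreeClassifier(random_state=42)
    dt.fit(X_train, y_train)
    # Predicting stage: the fitted tree labels the unseen test data.
    print("DT accuracy:", dt.score(X_test, y_test))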

IV. THE RESULTS

This part details the dataset, the learning system, the algorithms used to classify COVID-19, and the performance metrics, and then presents the results of comparing the performance of the algorithms.

A. Characterizing Data

Characterizing the data is a significant procedure in the data preparation stage; it visualizes the variables of the dataset used. Table I displays a description of the variables in the dataset.

B. Pre-Processing Results

Pre-processing is the operation of preparing the data for statistical analysis. The results of each phase are the input of the next phase, so the data need to be prepared accordingly. In this phase, missing values are handled: if a missing value is numeric, it is substituted with the mean of the values in its column, and if it is nominal, it is substituted with a neighboring value. The dataset is then ready for the next stage.
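
A minimal sketch of this missing-value handling, assuming pandas and scikit-learn; the mean fills numeric gaps, while the most frequent category is used here as a stand-in for the paper's 'neighboring value' substitution:

    import numpy as np
    import pandas as pd
    from sklearn.impute import SimpleImputer

    # Hypothetical frame with a numeric and a nominal column containing gaps.
    df = pd.DataFrame({"age": [35.0, np.nan, 62.0],
                       "gender": ["M", "F", np.nan]})

    # Numeric gaps -> column mean; nominal gaps -> most frequent value
    # (an approximation of substituting a neighbouring value).
    df[["age"]] = SimpleImputer(strategy="mean").fit_transform(df[["age"]])
    df[["gender"]] = SimpleImputer(strategy="most_frequent").fit_transform(df[["gender"]])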

C. Data Analysis Results

In this stage, the assembled data are summarized and interpreted through logical reasoning and analysis to identify patterns, trends, and links, and the pre-processed data are described so that their features can be understood. Table (2) displays the total number of patients. Table (3) displays the age distribution of all patients. Table (4) displays the age distribution of the patients who needed the ICU.