
models. Therefore, standardization is applied using the standard deviation, so that the mean of each feature becomes zero and the standard deviation of its distribution becomes one.
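As a minimal sketch of this standardization step (the paper does not show its implementation, so the scikit-learn call and the placeholder values below are assumptions):

    from sklearn.preprocessing import StandardScaler
    import numpy as np

    # Hypothetical feature matrix standing in for the dataset's numeric columns.
    X = np.array([[34.0, 98.6], [71.0, 102.1], [52.0, 99.4]])

    scaler = StandardScaler()            # z = (x - mean) / std
    X_scaled = scaler.fit_transform(X)   # each column now has mean 0 and std 1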
  E. Dataset Splitting

The dataset must be split into a training set and a test set before the classification algorithms are applied. The dataset is divided into 80% for the training set and 20% for the test set.
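A common way to obtain the 80%/20% split described above is scikit-learn's train_test_split; the sketch below uses placeholder data rather than the authors' actual dataset:

    from sklearn.model_selection import train_test_split
    from sklearn.datasets import make_classification

    # Placeholder data standing in for the pre-processed COVID-19 features and labels.
    X, y = make_classification(n_samples=500, n_features=10, random_state=0)

    # 80% of the rows go to the training set and 20% to the test set.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.20, random_state=0)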
  F. Classification Algorithms

There are many classification algorithms in machine learning. In the first classification model, we use Stochastic Gradient Descent (SGD), Naïve Bayes (NB), Logistic Regression (LR), and Random Forest (RF). In the second model, we use Support Vector Machines (SVM), Decision Tree (DT), eXtreme Gradient Boosting (XGBoost), and K-Nearest Neighbors (KNN).

    1) Naïve Bayes
NB provides a way to predict the probabilities of the different classes based on the various features. It is mostly used in text classification and in multi-class problems [14].
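As an illustrative sketch only (the paper does not state which NB variant was used; Gaussian NB and synthetic data are assumed here):

    from sklearn.naive_bayes import GaussianNB
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=200, random_state=0)  # placeholder data
    nb = GaussianNB().fit(X, y)
    print(nb.predict_proba(X[:3]))  # probability of each class for new samples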
    2) Logistic Regression
LR is a supervised learning classification algorithm used to predict the probability of a target variable. Because the dependent variable (the target) is dichotomous in nature, there are only two possible classes. LR models the relationship between a set of predictor variables and the categorical outcome variable [15].
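A minimal sketch of this binary setting, assuming scikit-learn and placeholder data (not the authors' code):

    from sklearn.linear_model import LogisticRegression
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=200, random_state=0)  # placeholder binary data
    lr = LogisticRegression(max_iter=1000).fit(X, y)
    print(lr.predict_proba(X[:3])[:, 1])  # estimated probability of the positive class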
    3) Stochastic Gradient Descent
SGD is a highly efficient method suited to linear classifiers based on convex loss functions. SGD has been applied successfully to sparse, large-scale ML problems in text classification and natural language processing [16]. In SGD, the classifier fits regularized linear models: for every sample, the gradient of the loss is estimated and the model is updated according to the learning rate [17].
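A sketch of such a regularized linear classifier trained by per-sample gradient updates, assuming a recent scikit-learn version and placeholder data:

    from sklearn.linear_model import SGDClassifier
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=200, random_state=0)  # placeholder data
    # Regularized linear model fitted by stochastic gradient steps on a
    # convex loss (logistic loss here), with an L2 penalty.
    sgd = SGDClassifier(loss="log_loss", penalty="l2", learning_rate="optimal")
    sgd.fit(X, y)
    print(sgd.predict(X[:3]))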
    4) Random Forests
RF is a supervised learning algorithm used in classification problems. It builds a number of decision trees on samples of the training data, using in classification a method known as bagging. Each decision tree gives a class prediction; the votes of all the decision trees are collected, and the final class is the one with the most votes [18].
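As a sketch of the bagging-and-voting idea (scikit-learn and synthetic data are assumptions, not the paper's setup):

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=200, random_state=0)  # placeholder data
    # 100 trees are grown on bootstrap samples (bagging); the predicted class
    # is the one receiving the most votes across the trees.
    rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    print(rf.predict(X[:3]))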
    5) Support Vector Machines
SVM is a supervised learning algorithm based on the idea of decision planes. It separates the data by creating a hyperplane, and the hyperplanes are used to classify the set of specific classes [19]. It works to discover the boundaries that classify the training dataset correctly and to determine the line nearest to the data points [20].
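A minimal sketch of a separating-hyperplane classifier, assuming scikit-learn's SVC with a linear kernel and placeholder data:

    from sklearn.svm import SVC
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=200, random_state=0)  # placeholder data
    # A linear-kernel SVM finds the separating hyperplane with the largest
    # margin to the nearest training points (the support vectors).
    svm = SVC(kernel="linear").fit(X, y)
    print(svm.predict(X[:3]))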

    6) eXtreme Gradient Boosting
XGBoost is a supervised learning algorithm used for regression and classification. It is a common and effective open-source implementation of the gradient boosted tree model. It attempts to predict the target variables accurately by combining the estimates of a set of weaker, simpler models. The idea of the algorithm is to continuously add trees, applying feature splits to grow each tree [21].
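A sketch of this boosting idea using the xgboost package with assumed hyperparameters and placeholder data (not the authors' configuration):

    from xgboost import XGBClassifier
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=200, random_state=0)  # placeholder data
    # Trees are added sequentially; each new tree corrects the errors of the
    # current ensemble, so many weak learners combine into a strong model.
    xgb = XGBClassifier(n_estimators=100, learning_rate=0.1, max_depth=4)
    xgb.fit(X, y)
    print(xgb.predict(X[:3]))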
    7) K-Nearest Neighbors
KNN is a supervised learning algorithm used in classification problems. It uses 'feature similarity' to assign values to new data points; the assigned values are based on how similar the new points are to the points in the training set. The training phase only stores the dataset, while the test phase classifies new data according to the class of the most similar points in the dataset [22].
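As a sketch of this store-then-compare behavior, assuming scikit-learn, k = 5, and synthetic data:

    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=200, random_state=0)  # placeholder data
    # fit() only stores the training data; predict() labels each new point
    # with the majority class of its k most similar (nearest) neighbors.
    knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
    print(knn.predict(X[:3]))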
    8) Decision Tree
A Decision Tree is a supervised algorithm used for regression and classification problems. It works for both continuous and categorical output variables. Classification with a decision tree involves two stages, learning and predicting. In the learning stage, the system is trained using the given training data; in the predicting stage, the trained tree is used to predict the outcome for the test data [23].
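A minimal sketch of the two stages, assuming scikit-learn and placeholder data:

    from sklearn.tree import DecisionTreeClassifier
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=200, random_state=0)  # placeholder data
    tree = DecisionTreeClassifier(max_depth=5, random_state=0)
    tree.fit(X, y)              # learning stage: the tree is grown from training data
    print(tree.predict(X[:3]))  # predicting stage: outcomes for new/test data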
                          IV. THE RESULTS

This part details the dataset, the learning system, the algorithms used to classify COVID-19, and the performance metrics, and presents the results of comparing the performance of the algorithms.

  A. Characterizing Data

Characterizing data is a significant procedure in the data preparation stage; it visualizes the data by showing the variables of the dataset used. Table I displays a description of the variables in the dataset.

  B. Pre-Processing Results

Pre-processing is an operation that processes the data and prepares it for statistical analysis. The results of each phase are the input to the next phase, so the data need to be prepared accordingly. In this phase, missing values are handled: if a missing value is numeric, it is replaced with the mean of the values in its column; if it is nominal, it is replaced with a neighboring value. The dataset is then ready for the next stage.
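A minimal sketch of this imputation rule (mean for numeric columns, a neighboring value for nominal ones), assuming pandas/scikit-learn and hypothetical column names:

    import pandas as pd
    from sklearn.impute import SimpleImputer

    # Hypothetical columns standing in for the dataset's numeric and nominal variables.
    df = pd.DataFrame({"age": [34.0, None, 52.0],
                       "gender": ["male", None, "female"]})

    # Numeric gaps are replaced with the column mean.
    df[["age"]] = SimpleImputer(strategy="mean").fit_transform(df[["age"]])
    # Nominal gaps are filled from a neighboring row (approximated here
    # with a forward/backward fill).
    df["gender"] = df["gender"].ffill().bfill()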
  C. Data Analysis Results

In this stage, the assembled data is summarized and interpreted through logical reasoning and analysis to identify patterns, trends, and links, and to describe the pre-processed data so that its features can be understood. Table (2) displays the total number of patients. Table (3) displays the distribution of ages for all patients. Table (4) displays the distribution of ages for the patients who needed the ICU.
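As an illustrative sketch only (the actual table contents come from the paper's dataset, and the column names below are assumptions), summaries of this kind can be produced with pandas:

    import pandas as pd

    # Hypothetical records; "age_group" and "icu" are assumed column names.
    df = pd.DataFrame({"age_group": ["0-20", "21-40", "41-60", "21-40"],
                       "icu":       [0,      1,       1,       0]})

    print(len(df))                                          # total number of patients
    print(df["age_group"].value_counts())                   # age distribution, all patients
    print(df[df["icu"] == 1]["age_group"].value_counts())   # age distribution, ICU patients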