Reliance on networks and systems has grown rapidly in recent years, increasing vulnerability to cyberattacks. The Distributed Denial-of-Service (DDoS) attack is a threat that can cause great financial losses and reputational damage. To address this problem, Machine Learning (ML) algorithms have gained considerable attention, as they enable the detection and prevention of DDoS attacks. In this study, we propose a novel security mechanism to counter DDoS attacks. Using an ensemble learning methodology, it aims to differentiate between normal network traffic and the malicious flood of DDoS attack traffic. The study also evaluates the performance of two well-known ML algorithms, namely the decision tree and the random forest, which were used to implement the proposed method, in defending against DDoS attacks. We test the models using a publicly available dataset called TIME SERIES DATASET FOR DISTRIBUTED DENIAL OF SERVICE ATTACK DETECTION and compare their performance using a list of evaluation metrics. Developing the model involves fetching the data, preprocessing it, splitting it into training and testing subsets, model selection, and validation. When applied to a dataset of nearly 11,000 time series, the proposed approach showed promising results and, in some cases, reached an accuracy (ACC) of up to 100%. Ultimately, the proposed method detects and mitigates DDoS attacks, helping to secure communication systems against this growing cyber threat by preventing attacks from succeeding.
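The workflow described in this abstract (split the data, fit several base models, combine them, score the result) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the three traffic features, the toy values, and the use of one-feature threshold rules (decision stumps) as stand-ins for the decision tree and random forest models used in the study. It shows hard majority voting only, not the authors' actual ensemble.

```python
import random
from collections import Counter


def train_test_split(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle row indices reproducibly, then split into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (
        [rows[i] for i in train_idx],
        [rows[i] for i in test_idx],
        [labels[i] for i in train_idx],
        [labels[i] for i in test_idx],
    )


def fit_stump(rows, labels, feature):
    """Fit a depth-1 tree on one feature: predict 1 (attack) above a threshold."""
    values = sorted({row[feature] for row in rows})
    # Candidate thresholds are midpoints between consecutive observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
    best_threshold, best_correct = candidates[0], -1
    for threshold in candidates:
        correct = sum(
            int(row[feature] > threshold) == label
            for row, label in zip(rows, labels)
        )
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return lambda row: int(row[feature] > best_threshold)


def majority_vote(models, row):
    """Hard voting: return the label predicted by most base models."""
    votes = Counter(model(row) for model in models)
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical per-window features: (packets/s, distinct source IPs, SYN ratio).
normal = [(120, 8, 0.10), (150, 10, 0.12), (90, 6, 0.08),
          (200, 12, 0.15), (170, 9, 0.11), (130, 7, 0.09)]
attack = [(5000, 900, 0.95), (7200, 1500, 0.97), (4100, 700, 0.90),
          (8800, 2100, 0.99), (6100, 1200, 0.93), (5600, 1000, 0.96)]
rows = normal + attack
labels = [0] * len(normal) + [1] * len(attack)

x_train, x_test, y_train, y_test = train_test_split(rows, labels)
models = [fit_stump(x_train, y_train, feature) for feature in range(3)]
predictions = [majority_vote(models, row) for row in x_test]
test_accuracy = accuracy(y_test, predictions)
print(f"test accuracy: {test_accuracy:.2f}")
```

The toy classes are deliberately well separated, so a perfect score here says nothing about real traffic; on real data a held-out test set and several metrics, as the abstract describes, are what make the comparison meaningful.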
The ability of the human brain to communicate with its environment has become a reality through Brain-Computer Interface (BCI)-based mechanisms. Electroencephalography (EEG) has gained popularity as a non-invasive way of connecting to the brain. Traditionally, EEG devices were used in clinical settings to detect various brain diseases. However, as technology advances, companies such as Emotiv and NeuroSky are developing low-cost, easily portable, consumer-grade EEG devices that can be used in application domains such as gaming and education. This article discusses the areas in which EEG has been applied and how it has proven beneficial for people with severe motor disorders, both for rehabilitation and as a means of communicating with the outside world. It also examines the use of the support vector machine (SVM), k-nearest neighbor (k-NN), and decision tree algorithms to classify EEG signals. To reduce the complexity of the data, the maximal overlap discrete wavelet transform (MODWT) is used to extract EEG features. The mean of each window of samples is calculated using the sliding window technique. The resulting feature vectors are fed to the SVM, k-NN, and optimized decision tree classifiers.
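The sliding-window mean mentioned in this abstract can be sketched as below. The window size, the step, and the sample values are assumptions chosen for illustration; the abstract does not state them, and in the study the input would be MODWT coefficients rather than this toy sequence.

```python
def sliding_window_means(signal, window_size, step):
    """Return the mean of each full window of `window_size` samples.

    Windows start every `step` samples; a trailing partial window is dropped.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    return [
        sum(signal[start:start + window_size]) / window_size
        for start in range(0, len(signal) - window_size + 1, step)
    ]


# Toy sequence standing in for one channel of wavelet coefficients.
coefficients = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
features = sliding_window_means(coefficients, window_size=4, step=2)
print(features)  # [2.5, 4.5]
```

With a step smaller than the window size the windows overlap, which yields more feature values per signal at the cost of correlation between neighbouring features.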
Early in the 20th century, as a result of technological advancements, the importance of digital marketing significantly increased as the necessity for digital customer experience, promotion, and distribution emerged. Since 1988, when the term “Digital Marketing” first appeared, the business sector has undergone drastic growth, moving from small startups to massive corporations on a global scale. The marketer must navigate a chaotic environment caused by the vast volume of generated data, and decision-makers must contend with the fact that user data is dynamic and changes every day. Smart applications must be used within enterprises to better evaluate, classify, enhance, and target audiences. Tech-savvy customers are pushing businesses to make bigger financial investments and use cutting-edge technologies. It was only natural that marketing and trade would be among the areas to adopt such developments, which speed up the spread of advertisements and make it easier to reach and win customers. In this study, we utilized machine learning (ML) algorithms, namely Decision Tree (DT), K-Nearest Neighbor (KNN), CatBoost, and Random Forest (RF), to classify customer data and improve the ability to forecast customer behavior, so that more business can be gained from customers more quickly and easily. The suggested system was tested on a marketing dataset. The results show that the system can accurately predict whether a customer will buy something or not: RF had an accuracy of 0.97, DT an accuracy of 0.95, and KNN an accuracy of 0.91, while the CatBoost algorithm had an execution time of 15.04 seconds and gave the best result, with the highest F1 score and accuracy (0.91 and 0.98, respectively). Finally, the study’s future goals involve creating a web page, thereby helping many banking institutions with speed and forecast accuracy.
A further goal is to apply more feature selection techniques to the marketing dataset to improve prediction.
COVID-19 emerged in China in 2019, spread rapidly worldwide, and caused many illnesses and deaths. Accurate and early detection of COVID-19 can improve patients' long-term survival and help curb the spread of the epidemic. COVID-19 case classification techniques help health organizations quickly identify and treat severe cases. Classification algorithms are essential tools for forecasting and decision-making: they assist diagnosis and early identification of COVID-19 and identify the cases that require admission to the intensive care unit (ICU) so that treatment is delivered at the appropriate time. This paper compares machine learning classification algorithms for diagnosing COVID-19 cases, measuring their performance with several metrics and measuring mislabeling (false positives and false negatives) to identify the best algorithms in terms of diagnostic speed and accuracy. We load the dataset and perform dataset preparation, preprocessing, data analysis, feature selection, data splitting, and classification. In the first experiment, four classification algorithms (Stochastic Gradient Descent, Logistic Regression, Random Forest, and Naive Bayes) achieved accuracies of 99.61%, 94.82%, 98.37%, and 96.57%, with execution times of 0.01 s, 0.7 s, 0.20 s, and 0.04 s, respectively; Stochastic Gradient Descent performed best with respect to mislabeling. In the second experiment, four further classification algorithms (eXtreme Gradient Boosting, Decision Tree, Support Vector Machines, and K-Nearest Neighbors) achieved accuracies of 98.37%, 99%, 97%, and 88.4%, with execution times of 0.18 s, 0.02 s, 0.3 s, and 0.01 s, respectively; Decision Tree performed best with respect to mislabeling. Using machine learning helps improve the allocation of medical resources and maximize their utilization.
Classifying the clinical data of confirmed COVID-19 cases can help predict whether or not a patient will need to be moved to the ICU; a global dataset of COVID-19 cases is used because of its accuracy and quality.
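The mislabeling measure used in this abstract (false positives and false negatives) can be computed from paired true and predicted labels, as in the sketch below. The label vectors are made-up illustrative values, not results from the study, and the choice of 1 for a case needing intensive care is an assumption.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


# Hypothetical labels: 1 = needs intensive care, 0 = does not.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
mislabeled = counts["fp"] + counts["fn"]
accuracy = (counts["tp"] + counts["tn"]) / len(y_true)
print(counts, mislabeled, accuracy)
```

Reporting false negatives separately matters in this setting because a missed severe case and a false alarm have very different costs, which a single accuracy figure hides.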
Data-intensive science is a critical scientific paradigm that intersects with all other sciences. Data mining (DM) is a powerful and useful technology with a wide range of potential users; it focuses on finding important, meaningful patterns and discovering new knowledge in a collected dataset. Any predictive task in DM uses some attributes to classify an instance of unknown class. Classification algorithms are a prominent class of mathematical techniques in DM, and constructing a model is their core aspect. However, their performance depends heavily on how the algorithm behaves when the data are manipulated. Focusing on binarization as a preprocessing approach, this paper analyzes and evaluates different classification algorithms, comparing the models they construct by their accuracy in the classification task. The Modified National Institute of Standards and Technology (MNIST) handwritten digits dataset provided by Yann LeCun is used in the evaluation. The paper focuses on machine learning approaches for handwritten digit detection, including classification methods such as K-Nearest Neighbor (KNN), Decision Tree (DT), and Neural Networks (NN). The results show that the knowledge-based method, i.e., the NN algorithm, is more accurate in determining the digits, as it reduces the error rate. This evaluation provides essential insights for computer scientists and practitioners in choosing the DM technique that suits their data.
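Binarization as a preprocessing step, followed by a nearest-neighbour classifier, can be sketched as below. The threshold of 128, the value of k, the Hamming distance, and the tiny 3x3 "images" are all assumptions for illustration; they are not MNIST data and are not taken from the paper.

```python
from collections import Counter


def binarize(pixels, threshold=128):
    """Map grayscale values (0-255) to 0/1 using a fixed threshold."""
    return [1 if value >= threshold else 0 for value in pixels]


def hamming(a, b):
    """Number of positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_images, train_labels, query, k=3):
    """Label the query by majority vote among its k nearest training images."""
    order = sorted(
        range(len(train_images)),
        key=lambda i: hamming(train_images[i], query),
    )
    nearest_labels = [train_labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy 3x3 grayscale images, flattened row by row (hypothetical data).
bars = [  # a vertical stroke, labelled 1
    [0, 255, 0, 0, 255, 0, 0, 255, 0],
    [10, 240, 5, 0, 200, 20, 15, 250, 0],
    [0, 180, 0, 30, 255, 0, 0, 220, 40],
]
rings = [  # a closed loop, labelled 0
    [255, 255, 255, 255, 0, 255, 255, 255, 255],
    [230, 250, 240, 200, 10, 255, 245, 220, 235],
    [255, 210, 255, 250, 30, 240, 225, 255, 200],
]
train_images = [binarize(image) for image in bars + rings]
train_labels = [1] * len(bars) + [0] * len(rings)

query = binarize([20, 230, 10, 5, 140, 0, 0, 90, 30])  # a faint stroke
predicted = knn_predict(train_images, train_labels, query)
print(predicted)  # 1
```

After binarization the noisy grayscale variants of each shape collapse to the same binary pattern, which is the kind of effect on classifier behaviour that the evaluation in the abstract examines.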
Due to the changing flow conditions during a pipeline's operation, erosion, damage, and failure occur at several locations. Leak prevention and early leak detection techniques are the best pipeline risk mitigation measures. To reduce detection time, pipeline models that can simulate these breaches are essential. In this study, numerical modeling using COMSOL Multiphysics is suggested for different fluid types, velocities, pressure distributions, and temperature distributions. The system consists of 12 meters of 8-inch pipe with a movable ball, 5 inches in diameter, placed inside. The findings show that dead zones occur more often in oil than in gas. Pipe insulation is facilitated by the gas phase's thermal inefficiency (thermal conductivity). The fluid mixing is improved by 2.5 m/s when the temperature is the lowest. Oil viscosity and dead zones lower the maximum pressure more than is the case for water and gas. Pressure decreases with maximum velocity and vice versa. The acquired oil data set is used to calibrate the Support Vector Machine (SVM) and Decision Tree (DT) techniques in MATLAB R2021a, ensuring the precision of the measurement. The classification results show that the SVM and DT models achieve average accuracies of 98.8% and 99.87%, respectively.
Object detection has become faster and more precise owing to improved computer vision systems, and detection results have improved dramatically with the introduction of machine learning methods. This study incorporated cutting-edge methods for object detection to obtain high-quality results in a competitive timeframe comparable to human perception. Object detection systems often suffer from poor performance. Therefore, this study proposed a comprehensive method to address this problem using six distinct machine learning approaches: stochastic gradient descent, logistic regression, random forest, decision trees, k-nearest neighbor, and naive Bayes. The system was trained using Common Objects in Context (COCO), the most challenging publicly available dataset. Notably, a yearly object detection challenge is held using COCO. The resulting system is quick and precise, making it suitable for applications requiring an object detection accuracy of 97%.
Breast cancer is one of the most critical diseases suffered by many people around the world, making it the most common medical risk they will face. This disease is considered the leading cause of death around the world, and early detection is difficult. In healthcare, where early diagnosis based on machine learning (ML) helps save patients’ lives, better-performing diagnostic procedures are crucial, and ML models have been used to improve the effectiveness of early diagnosis. In this paper, we propose a new feature selection method that combines two filter methods, Pearson correlation and mutual information (PC-MI), to analyse the correlation amongst features and then select important features before passing them to a classification model. Our method is capable of early breast cancer prediction and relies on a soft voting classifier that combines a set of ML models (decision tree, logistic regression and support vector machine) into one model that carries the strengths of its constituent models, yielding the best prediction accuracy. Our work is evaluated on the Wisconsin Diagnostic Breast Cancer dataset. The proposed methodology outperforms previous work, achieving 99.3% accuracy, an F1 score of 0.9922, a recall of 0.9846, a precision of 1 and an AUC of 0.9923. Furthermore, the 10-fold cross-validation accuracy is 98.2%.
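Soft voting, as used by the classifier in this abstract, averages the class probabilities produced by each base model and picks the class with the highest average. The sketch below assumes equal model weights and uses made-up probabilities; it does not reproduce the paper's trained models or its PC-MI feature selection.

```python
def soft_vote(model_probabilities):
    """Average per-class probabilities across models and pick the best class.

    `model_probabilities` holds one list per model, each giving the
    probability of every class in the same order.
    """
    n_models = len(model_probabilities)
    n_classes = len(model_probabilities[0])
    averaged = [
        sum(probs[c] for probs in model_probabilities) / n_models
        for c in range(n_classes)
    ]
    best_class = max(range(n_classes), key=averaged.__getitem__)
    return best_class, averaged


# Hypothetical outputs for one sample: [P(benign), P(malignant)].
probabilities = [
    [0.55, 0.45],  # decision tree
    [0.55, 0.45],  # logistic regression
    [0.10, 0.90],  # support vector machine
]
label, averaged = soft_vote(probabilities)
print(label, [round(p, 2) for p in averaged])
```

In this example two of the three models lean slightly towards class 0, so a hard majority vote would return 0, while soft voting returns 1 because the third model is far more confident. That sensitivity to confidence is the usual reason for preferring soft voting when the base models give usable probabilities.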
The learning process in online lectures delivered through a Learning Management System (LMS) produces a learning flow that is recorded in the event log. Assessment across a group of parallel classes is expected to follow the same assessment point of view, based on the semester lesson plan. However, this does not rule out that the way each class is run leads to unequal fairness in assessment. Factors considered to influence assessment in the classroom include the flow of learning, different lecturers, class composition, the time and type of assessment, and student attendance. Process mining is applied to fairness assessment to determine the extent to which the learning flow plays a role in the assessment of ten parallel classes, including international classes. Moreover, a decision tree algorithm is applied to determine the root cause in the student assessment analysis based on the causal factors. As a result, three variables are found to affect student graduation and assessment, i.e., attendance, class, and gender. The lecturer variable does not have much impact on the assessment but does influence the learning flow.
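One common first step in process mining is to turn an event log into per-case traces and count which activity directly follows which. The sketch below illustrates that step on a made-up LMS log; the case identifiers, activity names, and the tuple layout are assumptions, and the abstract does not say which process mining technique the authors used.

```python
from collections import Counter, defaultdict


def directly_follows(event_log):
    """Count how often one activity is immediately followed by another.

    `event_log` is an iterable of (case_id, timestamp, activity) tuples.
    Events are grouped per case and ordered by timestamp before counting.
    """
    traces = defaultdict(list)
    for case_id, _timestamp, activity in sorted(
        event_log, key=lambda event: (event[0], event[1])
    ):
        traces[case_id].append(activity)

    counts = Counter()
    for activities in traces.values():
        counts.update(zip(activities, activities[1:]))
    return counts


# Hypothetical LMS events: (student, time step, activity).
log = [
    ("s1", 1, "view_material"), ("s1", 2, "take_quiz"),
    ("s1", 3, "submit_assignment"),
    ("s2", 2, "submit_assignment"), ("s2", 1, "view_material"),
    ("s3", 1, "view_material"), ("s3", 2, "take_quiz"),
]
flow = directly_follows(log)
for (source, target), count in flow.most_common():
    print(f"{source} -> {target}: {count}")
```

Comparing such counts between parallel classes is one simple way to see whether their learning flows differ before relating the flow to assessment outcomes.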