This work studies segmentation methods for image processing. Image segmentation is a vital step in digital image processing and is used in applications including object co-segmentation, recognition tasks, medical imaging, content-based image retrieval, object detection, machine vision, and video surveillance. Many approaches have been developed for image segmentation. The main goal of segmentation is to simplify or change the representation of an image into something more meaningful and easier to analyze. Segmentation approaches split an image into several parts on the basis of image features such as texture, color, and pixel intensity. In this study, many image segmentation approaches are reviewed and discussed, together with the pros and cons of each method. The techniques can be categorized into six classes. First, thresholding techniques: global thresholding (iterative thresholding, minimum error thresholding, Otsu's method, optimal thresholding, histogram concavity analysis, and entropy-based thresholding), local thresholding (Sauvola's approach, T. R. Singh's approach, Niblack's approach, Bernsen's approach, Yanowitz and Bruckstein's method, and Local Adaptive Automatic Binarization), and dynamic thresholding. Second, edge-based techniques: the gray-histogram technique and gradient-based approaches (Laplacian of Gaussian, the differential coefficient approach, and the Canny, Prewitt, Roberts, and Sobel operators). Third, region-based approaches: region growing (seeded region growing (SRG), statistical region growing, and unseeded region growing (UsRG)) and region splitting and merging. Fourth, clustering approaches: soft clustering (fuzzy C-means (FCM)) and hard clustering (K-means). Fifth, deep neural network techniques such as convolutional neural networks, recurrent neural networks (RNNs), encoder-decoder and autoencoder models, and support vector machines. Finally, hybrid techniques such as evolutionary approaches, fuzzy logic, and swarm intelligence (PSO and ABC techniques).
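As an illustration of the global thresholding class, the following is a minimal pure-Python sketch of Otsu's method, which selects the gray level that maximizes the between-class variance of the histogram. The function name `otsu_threshold` and the flat-list input format are assumptions made for this example and are not taken from the reviewed work.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    `pixels` is a flat list of integer gray levels in [0, levels).
    Pixels with value <= t form one class, the remaining pixels the other.
    """
    # Build the gray-level histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # pixel count of the lower class
    sum_bg = 0  # sum of gray levels in the lower class
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # lower class still empty
        w_fg = total - w_bg
        if w_fg == 0:
            break  # upper class empty from here on
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


if __name__ == "__main__":
    image = [10] * 50 + [200] * 50  # hypothetical two-level image
    t = otsu_threshold(image)
    foreground = [p > t for p in image]
    print(t, sum(foreground))
```

Local thresholding methods such as Niblack's or Sauvola's differ mainly in that the threshold is computed per pixel neighborhood rather than once for the whole image.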
Brain tumors are collections of abnormal tissue within the brain. As a tumor grows within the confined space of the skull, it may affect normal brain function. Accurate detection of brain tumors is critical for improving treatment options and patient survival rates. Manually diagnosing cancer from numerous magnetic resonance imaging (MRI) images is a complex and time-consuming task, so brain tumor segmentation should be carried out automatically. This paper develops a strategy for brain tumor segmentation in which images are segmented using region-based and edge-based approaches. The Brain Tumor Segmentation 2020 (BraTS2020) dataset is used in this study. A comparative analysis of edge-based and region-based segmentation is performed using a U-Net architecture with a ResNet50 encoder. The edge-based segmentation model outperformed the region-based model on all performance metrics, achieving a Dice loss of 0.008768, an IoU score of 0.7542, an F1 score of 0.9870, an accuracy of 0.9935, a precision of 0.9852, a recall of 0.9888, and a specificity of 0.9951.
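The overlap and classification metrics named above can be computed from a predicted and a reference binary mask. The sketch below uses the standard definitions; the function name `segmentation_metrics` and the flat-list mask format are assumptions for illustration, and the study's own evaluation code (including how scores are averaged over images) is not described in the abstract.

```python
def segmentation_metrics(pred, truth):
    """Compute common metrics for two flat binary masks of equal length."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)

    def ratio(num, den):
        # Guard against empty classes (assumed convention: return 0.0).
        return num / den if den else 0.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),  # equals F1 for binary masks
        "iou": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]  # hypothetical predicted mask
    reference = [1, 0, 0, 0, 1, 1]  # hypothetical ground-truth mask
    for name, value in segmentation_metrics(predicted, reference).items():
        print(f"{name}: {value:.4f}")
```

A Dice loss is commonly defined as one minus the Dice coefficient, although the exact loss formulation used in the study is an assumption here.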
Image segmentation is a broad research topic on which a large amount of work has been carried out. It is a crucial procedure, since most object detection, image recognition, feature extraction, and classification tasks depend on the quality of the segmentation process. Image segmentation divides an image into a number of homogeneous segments, and representing an image in such simple forms increases the effectiveness of pattern recognition. The effectiveness of each approach varies with the arrangement of objects, lighting, shadows, and other factors. There is no generic approach that successfully segments all images, although some approaches have proven more effective than others. The major goal of this study is to summarize the advantages and disadvantages of each of the reviewed image segmentation approaches.
Aerial images have very high resolution. Automated map generation and semantic segmentation of aerial images are challenging problems. Semantic segmentation does not give precise details of remote sensing images when the aerial images have low resolution. Hence, we propose an algorithm based on the U-Net architecture to solve this problem. The architecture consists of two paths. The first is the contracting path (also called the encoder), which captures the context of the image. The encoder is simply a stack of convolutional and max pooling layers. The second is the symmetric expanding path (also called the decoder), which enables precise localization using transposed convolutions. This task is commonly referred to as dense prediction, because a label is produced for every pixel; it should not be confused with dense (fully connected) layers, in which each neuron is connected to all neurons of the preceding layer. U-Net is thus an end-to-end fully convolutional network (FCN): it contains only convolutional layers and no dense layers, which is why it can accept images of any size. The performance of the model will be evaluated by processing images with the proposed U-Net method and comparing the resulting accuracy with that of previous methods.
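To make the two-path structure concrete, the sketch below traces feature-map shapes through an encoder-decoder of this kind without using any deep learning library. It assumes padded convolutions that preserve spatial size, 2x2 max pooling in the encoder, stride-2 transposed convolutions in the decoder, and channel counts that double at each level starting from 64; these settings follow the common U-Net configuration and are assumptions, not details given in the abstract.

```python
def unet_shapes(height, width, depth=4, base_channels=64):
    """Trace (stage, channels, height, width) through a U-Net-style network.

    Assumes size-preserving convolutions, 2x2 max pooling (encoder) and
    stride-2 transposed convolutions (decoder).
    """
    if height % (2 ** depth) or width % (2 ** depth):
        # With these assumptions, skip connections only align when the input
        # size is divisible by 2**depth.
        raise ValueError("height and width must be divisible by 2**depth")

    shapes = []
    c, h, w = base_channels, height, width
    for _ in range(depth):
        shapes.append(("encoder", c, h, w))  # convolutions at this level
        h, w = h // 2, w // 2                # max pooling halves the size
        c *= 2                               # channels double going down
    shapes.append(("bottleneck", c, h, w))
    for _ in range(depth):
        h, w = h * 2, w * 2                  # transposed conv doubles the size
        c //= 2                              # channels halve going up
        shapes.append(("decoder", c, h, w))  # output matches encoder level
    return shapes


if __name__ == "__main__":
    for stage in unet_shapes(256, 256):
        print(stage)
```

Because no stage depends on a fixed number of input pixels, the same function works for any input size that satisfies the divisibility check, which mirrors the fully convolutional property described above.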
Video prediction has progressed quickly, especially since the rise of deep learning methods. Prediction architectures based on pixel generation produce blurry forecasts, but they are preferred in many applications because they operate on frames only and need no supporting information such as segmentation or flow maps, which makes suitable datasets very difficult to obtain. In this work, we present a novel end-to-end video forecasting framework that predicts the dynamic relationship between pixels in time and space. A 3D CNN encoder estimates the dynamic motion, while the decoder reconstructs the next frame with the help of 3D CNN and ConvLSTM2D layers added to the skip connections. This novel form of skip connection plays an important role in reducing blur in the predictions and preserving spatial and dynamic information, which increases the accuracy of the whole model. The KITTI and Cityscapes datasets are used for training, and the Caltech dataset is used for inference. The proposed framework achieves PSNR = 33.14, MSE = 0.00101, and SSIM = 0.924 with a small number of parameters (2.3 M).
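Two of the reported quality measures have simple closed forms: the mean squared error between a predicted and a reference frame, and the peak signal-to-noise ratio, PSNR = 10 log10(MAX^2 / MSE). The sketch below assumes frames stored as nested lists with pixel values scaled to [0, 1]; the function names and the value range are assumptions, and how the study averages scores over frames is not stated.

```python
import math


def mse(frame_a, frame_b):
    """Mean squared error between two frames given as lists of rows."""
    flat_a = [v for row in frame_a for v in row]
    flat_b = [v for row in frame_b for v in row]
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


def psnr(frame_a, frame_b, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    err = mse(frame_a, frame_b)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)


if __name__ == "__main__":
    target = [[0.0, 0.5], [1.0, 0.25]]     # hypothetical ground-truth frame
    predicted = [[0.1, 0.5], [0.9, 0.25]]  # hypothetical predicted frame
    print(mse(target, predicted), psnr(target, predicted))
```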
Given the role that pipelines play in transporting crude oil, which is considered the basis of the global economy, across different environments, hundreds of studies have addressed how to protect them. Various technologies have been employed in this pursuit, differing in cost, reliability, and efficiency, among other factors. Computer vision has emerged as a prominent technique in this field, although it requires a robust image-processing algorithm for spill detection. This study employs image segmentation techniques to enable the computer to interpret visual information effectively. The research focuses on detecting spills from oil pipelines caused by leakage, using images captured by a drone equipped with a Raspberry Pi and a Pi camera. These images, along with their global positioning system (GPS) locations, are transmitted to the base station using the Message Queuing Telemetry Transport (MQTT) Internet of Things (IoT) protocol. At the base station, deep learning techniques, specifically Holistically-Nested Edge Detection (HED) and extreme inception (Xception) networks, are employed to process the images and identify contours. The proposed algorithm can detect multiple contours in an image. To pinpoint black contours, which are representative of an oil spill, the CIELAB (LAB) color space is used, which effectively removes shadow effects. If a contour is detected, its area and perimeter are calculated to determine whether they exceed a certain threshold. The effectiveness of the proposed system was tested on Iraqi oil pipeline systems, demonstrating its capability to detect spills of different sizes.
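The final decision step, measuring a contour and comparing it against a threshold, can be sketched as follows. The contour is assumed to be a closed polygon given as a list of (x, y) pixel coordinates; the area uses the shoelace formula. The function names and the threshold values are hypothetical, since the abstract does not state the thresholds used.

```python
import math


def contour_area(points):
    """Area of a closed polygon via the shoelace formula."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the contour
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


def contour_perimeter(points):
    """Length of the closed polygon boundary."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def is_spill_candidate(points, min_area=500.0, min_perimeter=80.0):
    """Flag a contour whose area and perimeter both reach the thresholds.

    The default thresholds are illustrative placeholders, not values
    reported by the study.
    """
    return (contour_area(points) >= min_area
            and contour_perimeter(points) >= min_perimeter)


if __name__ == "__main__":
    large = [(0, 0), (30, 0), (30, 30), (0, 30)]  # hypothetical contour
    small = [(0, 0), (5, 0), (5, 5), (0, 5)]      # hypothetical contour
    print(is_spill_candidate(large), is_spill_candidate(small))
```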
This study proposes a blind speech separation algorithm that employs a single-channel technique. The input to the algorithm is a segment of mixed speech from two speakers. First, filter bank analysis transforms the input from the time domain to the time-frequency domain (spectrogram). The filter bank has 257 sub-bands. Non-Negative Matrix Factorization (NNMF) factorizes each sub-band output into 28 sub-signals. A binary mask separates each of the 257 × 28 = 7196 sub-signals into two groups, one belonging to the first speaker and the other to the second speaker. This separation alone cannot identify which speaker each group belongs to, so the speaker of each sub-signal is identified by speaker clustering algorithms. Since speaker clustering cannot proceed without speaker segmentation, standard windowed, overlapping frames are used to partition the speech. The speaker clustering process takes the phase angle extracted from the spectrogram of the mixed speech and merges it into the spectrogram of the recovered speech. Filter bank synthesis then combines these signals to produce a full-band speech signal for each speaker. Subjective tests indicate that the results of the algorithm are acceptable. For objective evaluation, the researchers tested the algorithm on 66 speech mixtures (6 female and 6 male speakers). The average SIR is 11.1 dB, the average SDR is 1.7 dB, and the average SAR is 2.8 dB.
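The binary masking step can be illustrated with a small sketch. It assumes that magnitude estimates for the two sources are available per time-frequency bin and assigns each bin of the mixture to whichever estimate is larger; this "dominant source" rule is a common choice and an assumption here, since the abstract does not specify how the mask is derived. All function names are hypothetical.

```python
N_SUBBANDS = 257    # sub-bands of the filter bank (from the abstract)
N_COMPONENTS = 28   # NNMF sub-signals per sub-band (from the abstract)


def binary_masks(est_a, est_b):
    """Return complementary 0/1 masks from two magnitude estimates."""
    mask_a = [[1 if a >= b else 0 for a, b in zip(row_a, row_b)]
              for row_a, row_b in zip(est_a, est_b)]
    mask_b = [[1 - m for m in row] for row in mask_a]
    return mask_a, mask_b


def apply_mask(mixture, mask):
    """Keep only the time-frequency bins selected by the mask."""
    return [[x * m for x, m in zip(row_x, row_m)]
            for row_x, row_m in zip(mixture, mask)]


if __name__ == "__main__":
    print(N_SUBBANDS * N_COMPONENTS)  # total number of sub-signals
    est_a = [[3, 1], [0, 2]]          # hypothetical estimate for speaker A
    est_b = [[1, 4], [5, 2]]          # hypothetical estimate for speaker B
    mixture = [[4, 5], [5, 4]]        # hypothetical mixture magnitudes
    mask_a, mask_b = binary_masks(est_a, est_b)
    print(apply_mask(mixture, mask_a), apply_mask(mixture, mask_b))
```

Because the two masks are complementary, the masked outputs always sum back to the mixture magnitudes, which is the defining property of a binary (hard) mask.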
The novel coronavirus disease (COVID-19) first appeared in December 2019 in the Chinese city of Wuhan. It has spread rapidly through most parts of the world and has become a global epidemic. Its effects are devastating for public health, daily life, and the global economy. According to World Health Organization statistics of August 11, the number of COVID-19 cases had reached nearly 17 million, with infections distributed across most European and Asian countries, and the number of deaths had reached 700 thousand worldwide. Positive cases must be detected as early as possible in order to prevent the spread of the epidemic and to treat infected patients quickly. This paper presents the current literature on methods used to detect COVID-19 and reviews studies that apply different artificial intelligence techniques to this task. In one such study, convolutional neural network models (ResNet50, ResNet101, ResNet152, InceptionV3, and Inception-ResNetV2) were proposed for identifying patients infected with coronavirus pneumonia from chest X-ray radiographs. Using 5-fold cross-validation, three different binary classifications involving four classes (COVID-19, normal (healthy), viral pneumonia, and bacterial pneumonia) were implemented. Based on the performance results obtained, the pre-trained ResNet50 model offers the highest classification performance (96.1 percent accuracy for Dataset-1, 99.5 percent accuracy for Dataset-2, and 99.7 percent accuracy for Dataset-3).
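The 5-fold cross-validation protocol mentioned above can be sketched in a few lines: the samples are shuffled once, split into five folds, and each fold serves once as the test set while the remaining four are used for training. The function name `k_fold_indices`, the fixed seed, and the sample count in the usage example are assumptions for illustration; the reviewed study's exact splitting procedure is not described.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for fold_number, (train, test) in enumerate(k_fold_indices(23), start=1):
        print(fold_number, len(train), len(test))
```

The reported accuracy would then typically be the average over the five test folds, although that averaging step is an assumption here.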
There is a growing trend toward using artificial intelligence techniques in place of the human mind for problem solving. Vehicle License Plate Recognition (VLPR) is one such problem, in which the computer outperforms humans in both processing speed and accuracy. The emergence of deep learning techniques has enhanced and simplified this task. This work focuses on detecting Iraqi license plates using the SSD deep learning algorithm. The plate is then segmented using horizontal and vertical shredding. Finally, the K-Nearest Neighbors (KNN) algorithm is used to specify the type of car. The proposed system was evaluated on a set of 500 different Iraqi vehicles. The results show a success rate of 98% for plate detection and 96% for segmentation.
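One plausible reading of horizontal and vertical shredding is a projection-profile cut: a binarized plate is split wherever a column (or row) contains no foreground pixels. The sketch below shows the column direction only and assumes this interpretation; the function name `column_segments` and the 0/1 image format are hypothetical, and the study's actual procedure may differ.

```python
def column_segments(binary_plate):
    """Return (start, end) column ranges, end exclusive, that contain ink.

    `binary_plate` is a list of rows with 1 for character pixels, 0 otherwise.
    """
    width = len(binary_plate[0])
    # Vertical projection profile: foreground pixel count per column.
    profile = [sum(row[c] for row in binary_plate) for c in range(width)]

    segments = []
    start = None
    for c, count in enumerate(profile):
        if count > 0 and start is None:
            start = c                    # a character region begins
        elif count == 0 and start is not None:
            segments.append((start, c))  # an empty column ends the region
            start = None
    if start is not None:
        segments.append((start, width))  # region touches the right border
    return segments


if __name__ == "__main__":
    plate = [
        [0, 1, 1, 0, 0, 1, 0],
        [0, 1, 0, 0, 0, 1, 1],
        [0, 0, 0, 0, 0, 0, 0],
    ]  # hypothetical binarized plate with two characters
    print(column_segments(plate))
```

Applying the same function to the transposed image would give the corresponding row cuts.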
Kinship (familial relationship) detection is crucial in many fields and has applications in biometric security, adoption, forensic investigations, and more. It is also essential during wars and natural disasters such as earthquakes, since it may aid in family reunion, missing person searches, establishing emergency contacts, and providing psychological support. The most common method of determining kinship is DNA analysis, which is highly accurate. Another, noninvasive approach uses facial photos with computer vision and machine learning algorithms for kinship estimation. Each part of the human body carries its own embedded information that can be extracted and used for identification, verification, or classification of a person. Kinship recognition is based on finding traits that are shared by members of a family. We investigate the use of hand geometry for kinship detection, which is a new approach. Because the available hand image datasets do not contain kinship ground truth, we created our own dataset. This paper describes the tools, methodology, and details of the collected Mosul Kinship Hand (MKH) image dataset. The MKH images were collected using a mobile phone camera with a suitable setup; the dataset consists of 648 images of 81 individuals from 14 families (8 hand situations per person). This paper also presents the use of this dataset for kinship prediction using machine learning. Google MediaPipe was used for hand detection, segmentation, and locating geometrical key points. Handcrafted feature extraction was used to extract 43 distinctive geometrical features from each image. A neural network classifier was designed and trained to predict kinship, yielding about 93% prediction accuracy. The results of this novel approach demonstrate that the hand possesses biometric characteristics that may be used to establish kinship and that the suggested method is a promising kinship indicator.
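As an illustration of handcrafted geometrical features, the sketch below computes finger lengths normalized by palm width from 21 hand key points. The index layout (0 for the wrist, 1 to 4 for the thumb, 5 to 8 for the index finger, and so on up to 17 to 20 for the little finger) follows the MediaPipe Hands landmark convention. The specific features chosen here are hypothetical examples; the 43 features used in the study are not listed in the abstract.

```python
import math

# Landmark indices per finger, base joint to tip (MediaPipe Hands layout).
FINGERS = {
    "thumb": (1, 2, 3, 4),
    "index": (5, 6, 7, 8),
    "middle": (9, 10, 11, 12),
    "ring": (13, 14, 15, 16),
    "pinky": (17, 18, 19, 20),
}


def finger_length(landmarks, joints):
    """Sum of segment lengths along one finger."""
    return sum(math.dist(landmarks[a], landmarks[b])
               for a, b in zip(joints, joints[1:]))


def hand_features(landmarks):
    """Return finger lengths divided by palm width (scale-invariant ratios).

    Palm width is taken as the distance between the index and pinky base
    joints (landmarks 5 and 17); this choice is an assumption.
    """
    palm_width = math.dist(landmarks[5], landmarks[17])
    return {name: finger_length(landmarks, joints) / palm_width
            for name, joints in FINGERS.items()}


if __name__ == "__main__":
    # Hypothetical key points: only the index and pinky fingers are laid out.
    points = [(0.0, 0.0)] * 21
    points[5:9] = [(0, 0), (0, 1), (0, 2), (0, 3)]
    points[17:21] = [(2, 0), (2, 1), (2, 2), (2, 2.5)]
    print(hand_features(points))
```

Normalizing by a palm measurement makes the ratios independent of camera distance, which matters when images are captured with a handheld phone.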