output feature map consists of many channels, with one channel for each class that a pixel can be predicted to belong to [5][6]. Semantic segmentation therefore assigns a class label to every pixel in the image [7].
Fig. 1: Semantic segmentation
The degree of complexity of the task varies from image to image, depending on the job and on the complexity with which the neural network is trained. The purpose of semantic image segmentation is to produce more than just class labels and bounding-box parameters as the expected output [8]. The output is a high-resolution image (usually the same size as the input) in which each pixel is assigned to a class.
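As a concrete illustration of this per-pixel output, the following minimal sketch (Python with NumPy; the class count and image size are assumed purely for illustration) shows how a multi-channel score map is turned into a label image of the same resolution:

import numpy as np

# Minimal sketch: the network's output feature map has one channel per class
# (6 classes here, chosen only for illustration) at the input resolution.
num_classes = 6
height, width = 256, 256

# Stand-in for the network's raw per-class scores (logits) for one image.
logits = np.random.randn(num_classes, height, width)

# Softmax over the class (channel) axis turns scores into per-pixel
# class probabilities.
exp = np.exp(logits - logits.max(axis=0, keepdims=True))
probs = exp / exp.sum(axis=0, keepdims=True)

# The segmentation map assigns each pixel the class with the highest
# probability, giving a label image the same size as the input.
segmentation_map = probs.argmax(axis=0)   # shape: (height, width)

print(segmentation_map.shape)  # (256, 256), one class index per pixel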
Computer vision is a subfield of computer science that tries to create intelligent applications that can comprehend the content of images in the same way that humans can. Image data can come in a variety of formats, including sequential images (video), scenes captured by several cameras, and multi-dimensional data obtained from medical imaging devices [9][10]. Typical tasks include:
• Recognition: one or more objects previously known to the system are recognized, often under different positions or camera angles.
• Selection: a single match for a defined object is selected, for example identifying the face of a particular person, the fingerprint of a particular person, or a vehicle of a particular type.
• Investigation: the image data is searched to find a specific object, for example investigating the presence of diseased cells in a medical picture or the presence of a car on a highway.
• Content-based image retrieval: images stored in a database are retrieved according to content and concepts similar to the query. One of the most popular query methods in CBIR systems is the query-by-image method, where an image is entered and the output is a set of similar images (see the sketch after this list).
• Contributions: aerial imagery of Dubai containing 72 satellite images was first applied to the proposed U-Net method to evaluate the efficiency of U-Net for semantic segmentation, and the results were compared with a CNN. Efficient results were obtained in this field, and this was proven with the CNN comparison.
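The query-by-image mechanism mentioned in the retrieval item above can be summarized with the following minimal sketch (Python with NumPy). The color-histogram feature extractor and the function names are placeholders chosen purely for illustration; practical CBIR systems typically rely on learned features:

import numpy as np

def color_histogram(image, bins=8):
    # Toy feature extractor: a normalized RGB color histogram.
    # Real CBIR systems usually use learned CNN features instead.
    hist, _ = np.histogramdd(
        image.reshape(-1, 3), bins=(bins, bins, bins), range=[(0, 256)] * 3
    )
    hist = hist.flatten()
    return hist / (hist.sum() + 1e-8)

def query_by_image(query_image, database_images, top_k=5):
    # Return indices of the top_k database images most similar to the query.
    q = color_histogram(query_image)
    feats = np.stack([color_histogram(img) for img in database_images])
    # Cosine similarity between the query feature and each database feature.
    sims = feats @ q / (np.linalg.norm(feats, axis=1) * np.linalg.norm(q) + 1e-8)
    return np.argsort(-sims)[:top_k]

# Usage with random stand-in images (H x W x 3, 8-bit RGB):
database = [np.random.randint(0, 256, (64, 64, 3)) for _ in range(20)]
query = np.random.randint(0, 256, (64, 64, 3))
print(query_by_image(query, database, top_k=3))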
II. RELATED WORKS

One of the most important problems in computer vision is what is called semantic segmentation [11]. It is widely utilized in image processing to obtain a complete understanding of a scene. Because of the rapid advancement of deep learning architectures in recent years, they have been used to solve the majority of semantic segmentation difficulties [12]. Convolutional neural networks are among the most efficient and accurate deep learning architectures.
Since 2012, various convolutional neural network based architectures such as VGG16 [13], ResNet [14], MobileNet [15], U-Net, and the recently developed EfficientNet have evolved and set standards in image classification. The application of these CNNs as feature extractors has lately made substantial progress in the field of semantic segmentation. Fully Convolutional Networks (FCN) were used in one of the first attempts at semantic segmentation with CNNs [16]. However, the progressive downsampling of the original image resolution in CNNs causes a loss of spatial information for small and thin objects. The notion of dilated convolution was introduced to solve this problem [17]: it increases the resolution of the feature map while maintaining the receptive field of the neurons. Dilated residual networks were proposed by Yu et al., which solved the problem of gridding artifacts [18].
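To make the dilated-convolution idea concrete, the following minimal sketch (assuming the PyTorch library; channel counts and feature-map size are illustrative only) shows that a 3x3 convolution with dilation 2 covers a 5x5 receptive field while keeping the feature map at the input resolution, so no downsampling is needed to enlarge the receptive field:

import torch
import torch.nn as nn

x = torch.randn(1, 64, 128, 128)  # (batch, channels, height, width)

# Standard 3x3 convolution: 3x3 receptive field per layer.
standard = nn.Conv2d(64, 64, kernel_size=3, padding=1)

# Dilated 3x3 convolution (dilation=2): effective 5x5 receptive field,
# same number of weights, and the output keeps the input resolution.
dilated = nn.Conv2d(64, 64, kernel_size=3, padding=2, dilation=2)

print(standard(x).shape)  # torch.Size([1, 64, 128, 128])
print(dilated(x).shape)   # torch.Size([1, 64, 128, 128]) -- same resolution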
III. DATA SETS PREPARING

Computer vision systems vary greatly, ranging from large and complex systems that perform general and comprehensive tasks to small systems that perform simple and customized tasks. Most computer vision systems, however, mainly include the following components:
• Image acquisition: the image is obtained from image sensors, which include many kinds of cameras with light sensors, distance sensors, radiographic devices, radar, ultrasound cameras, and others. Depending on the type of sensor, the resulting image can be 2D, 3D, or a series of sequential images. The value of each pixel in the image depends on one or more light intensity levels (gray-scale or color images) and can represent many physical measurements such as absorption, reflection of electromagnetic waves, or distance.
• Pre-processing: before applying a computer vision algorithm to the image and obtaining the required information, it is necessary to ensure that the data fed to the algorithm is suitable. This includes, first, resetting the resolution and clarity to ensure the correctness of the image coordinate system; second, minimizing noise to ensure that the sensor is not giving any false information; third, increasing the variance