Page 63 - IJEEE-2022-Vol18-ISSUE-1
P. 63

Hussein & Ali                                                                                                                       | 59

output feature map consists of many channels where there             Efficient results were obtained in this field and this was
are classes for each pixel to predict from [5][6]. So, Semantic      proven with CNN.
Segmentation is assigned class to each pixel in the image to a
class label [7].                                                                        II. RELATED WORKS

                Fig. 1: Show is Semantic Segmentation                   The most important problem with computer vision is
                                                                 what is called semantic segmentation [11]. It's widely
      The degree of complexity of the issue varies from          utilized in image processing to get a full picture of a
image to image, depending on the job and according to the        situation. Because of the rapid advancement In recent years,
degree of complexity that you train the neural network. The      deep learning architectures have been used to solve the
purpose of semantic image segmentation is to produce more        majority of semantic segmentation difficulties [12].
than just labels and bounding box parameters as expected         Convolutional neural networks are the most efficient and
output [8]. The output is a high-resolution image (usually the   accurate deep learning designs.
same size as the input) with each pixel classified into a
different class.                                                       Since 2012, various Convolutional Neural Network
                                                                 based architectures like VGG16 [13], ResNet [14], Mobile
    Computer vision is a subfield of computer science that       Net [15], U-Net, as well as the recently developed Efficient
tries to create intelligent applications that can comprehend     Net, have evolved and set standards in picture classification.
the information of images in the same way that humans can.       The application of these CNNs as feature extractors has
Where picture data can be in a variety of formats, including     lately made substantial progress in the field of semantic
sequential images (video), scenes captured by several            segmentation. Fully Convolutional Neural Networks were
cameras, and data with multiple dimensions obtained from a       used in one of the first attempts at semantic segmentation
medical imaging device [9][10]:                                  using CNN (FCN) [16]. The loss of spatial information of
? Recognition: one or some of the objects that were              small and thin objects is hampered by the CNNs' progressive
                                                                 down sampling of the original image resolution. The notion
    previously marked to the computer are recognized, often      of dilated convolution was invented in to solve this problem
    with their different positions or different camera angles.   [17] to increase the resolution of the feature map while
? Select: select a single match for the defined object. For      maintaining the receptive field of the neuron. Dilated
    example: identifying the face of a particular person or      residual networks were proposed by Yu et al., which solved
    identifying the fingerprint of a particular person or a      the problem of gridding artifacts [18].
    vehicle of a particular type.
? Investigation: The image data is searched to find a                               III. DATA SETS PREPARING
    specific object.
    Example: investigating the presence of diseased cells in a          Computer vision systems vary greatly, ranging from
    medical picture, investigating the presence of a car on a    large and complex systems that perform general and
    highway.                                                     comprehensive tasks, and between small systems that
? Image retrieval based on content: Images stored in a           perform simple and customized tasks. But most computer
    specific database are retrieved based on the content and     vision systems mainly include the following components:
    concepts similar to the query from within the database.      ? Image acquisition: From the image sensors we get the
    One of the most popular query methods in CBIR systems
    is the Query Image query, where an image is entered and          image used, these include many cameras with light
    the output is a set of similar images.                           sensors, distance sensors, radiographic devices, radar,
? Contributions aerial images containing 72 satellite                ultrasound cameras, and others. Depending on the type of
    images of Dubai were first applied with the proposed             sensor, the resulting image can be 2D, 3D, or a series of
    method U-Net to find out the efficiency of U-Net for             sequential images. The value of each pixel in the image
    semantic segmentation, and it was compared with CNN.             depends on one or more light intensity levels (gray scale
                                                                     images or color images) and can indicate many physical
                                                                     measurements such as absorption, reflection of
                                                                     electromagnetic waves or distance.
                                                                 ? Pre-processes: It is necessary to ensure that the data
                                                                     provides the specific data onto the algorithm before
                                                                     applying the computer vision algorithm to the image and
                                                                     then obtaining the required information, including
                                                                     resetting the resolution and clarity to ensure the
                                                                     correctness of the image coordinates system. Second,
                                                                     Minimize noise in order to ensure that the sensor is not
                                                                     giving any false information. Third, Increase the variance
   58   59   60   61   62   63   64   65   66   67   68