Page 153 - 2023-Vol19-Issue2

149 |                                                              Mohammed, Oraibi & Hussain

Fig. 1. CBIR system structure. The CBIR system is separated into online and offline subsystems in accordance with two distinct information processing methods, with a common feature extraction block shared between them.

feature-based picture representation is essential. The CBIR system is built technologically on database search and picture representation. CBIR research may therefore be reviewed in terms of advancements in database search and picture representation, respectively.

B. Feature Extraction
The crucial stage in CBIR is the image representation, which involves taking the important elements from an image and turning them into a fixed-size vector (the so-called feature vector). The extracted features may generally be separated into three broad categories: conventional features, classification CNN features, and retrieval CNN features. In the next section, we present the techniques for image representation for CBIR based on each of these three feature groups.

IV. METHODOLOGY

In this section, we present the dataset used in our work. In addition, the three DL architectures used in this paper are described. Finally, we discuss the methods we use for feature extraction in CBIR, as Figure 2 illustrates.

A. Dataset
Our dataset, designed for CBIR, comprises 3,843 JPEG images. These images are arranged into 20 different classes: TajMahal, Bottle, Shark, Lotus, Eiffel Tower, Jeans, Ship, Dalmatian, Obama, Apple, Maggi, Clock, Buddha, Modi, Helmet, Mobile, Peacock, Soccer Ball, Tabla, Horse. A sample image from each class is presented in Figure 3. Each class is further divided into subclasses based on different attributes. For instance, a "car" class would be subdivided into types of cars, such as sedans, SUVs, and sports cars, or by car brand. This hierarchical organization supports a more nuanced and detailed retrieval process, thus enhancing the accuracy of the system.

The number and type of images in our dataset play a critical role in the performance of our CBIR system. The count of 3,843 images is substantial and helps in achieving diversity and generalization in our image retrieval model. This broad set of images enables the model to learn and differentiate between a wide range of categories and their subclasses. However, as with most machine learning tasks, more data generally helps. So while our number is a good start, we may need to supplement our dataset to enhance the model’s ability to distinguish between complex and nuanced differences within and across categories.

The fact that the dataset contains only JPEG images is also important. JPEG is a common image format, and its widespread use is partly why we chose it. However, the JPEG format uses lossy compression, which may result in some loss of image detail. This could pose a challenge when the retrieval task requires fine-grained identification or discrimination. Moreover, different image formats may exhibit distinct characteristics or encode different levels of detail, which could affect the feature extraction process.

The collection of metadata such as image resolution, color depth, and file size helps in the retrieval process by providing additional dimensions for searching and matching. However, inconsistency in these metadata parameters (such as differing resolutions) can add to the complexity of the task and potentially affect the system’s performance. In the CBIR task, different feature extraction methods can be used on our dataset, including Xception, MobileNet, Inception, and Ensemble Learning. These methods convert raw image data into a form that our model can process. The choice of method may significantly affect the retrieval performance, and hence selecting an appropriate feature extraction strategy is another challenge with our dataset.

In conclusion, our dataset, while a robust starting point for our CBIR system, does pose certain challenges that need to be addressed to ensure optimal performance. It’s a reminder that dataset creation and management is as crucial a step as model selection and tuning in machine learning projects.
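
As a rough illustration of how the metadata discussed above (resolution, color depth, file size) could provide an additional matching dimension, the sketch below pre-filters database images before any feature comparison. It is not the procedure used in this work: the `ImageMeta` record, the `prefilter` function, the `max_scale` threshold, and the sample file names and values are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ImageMeta:
    """Hypothetical metadata record for one image (not from the paper)."""
    path: str
    width: int       # pixels, assumed > 0
    height: int      # pixels, assumed > 0
    bit_depth: int   # bits per pixel, e.g. 24 for 8-bit RGB
    file_size: int   # bytes


def prefilter(query: ImageMeta, database: list[ImageMeta],
              max_scale: float = 2.0) -> list[ImageMeta]:
    """Keep database images whose bit depth equals the query's and whose
    pixel count is within a factor `max_scale` of the query's."""
    q_pixels = query.width * query.height
    kept = []
    for item in database:
        pixels = item.width * item.height
        ratio = max(pixels, q_pixels) / min(pixels, q_pixels)
        if item.bit_depth == query.bit_depth and ratio <= max_scale:
            kept.append(item)
    return kept


# Made-up sample records, for illustration only.
database = [
    ImageMeta("horse/001.jpg", 640, 480, 24, 91_000),
    ImageMeta("horse/002.jpg", 3200, 2400, 24, 2_400_000),
    ImageMeta("clock/001.jpg", 600, 400, 8, 35_000),
]
query = ImageMeta("query.jpg", 800, 600, 24, 120_000)

candidates = prefilter(query, database)
print([m.path for m in candidates])  # ['horse/001.jpg']
```

A filter like this also shows the downside noted above: when resolutions vary widely across the dataset, a strict threshold may discard relevant images, so any such threshold would need tuning.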

B. Inception Model
Inception, a CNN architecture, is used here to extract features from both query images and database images [30, 31]. It benefits from factorizing convolutions into distinct branches that operate on the channel and spatial dimensions in succession, and it applies a range of strategies to optimize the network. The core idea behind the Inception framework is to replace large kernels with several smaller ones, applied in parallel branches, in order to learn multiscale representations while lowering the number of parameters and the computational complexity [32, 33].
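
The effect of kernel factorization on model size can be checked with simple arithmetic. The sketch below compares the weight count of a single 5×5 convolution with that of two stacked 3×3 convolutions, which cover the same 5×5 receptive field; this is the factorization commonly described for later Inception versions. The helper `conv_params` and the channel width of 64 are assumptions for illustration, and biases are ignored.

```python
def conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Number of weights in one kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


channels = 64  # assumed channel width, chosen only for illustration

single_5x5 = conv_params(5, channels, channels)       # one 5x5 layer
stacked_3x3 = 2 * conv_params(3, channels, channels)  # two 3x3 layers in sequence
saving = 1 - stacked_3x3 / single_5x5

print(single_5x5)        # 102400
print(stacked_3x3)       # 73728
print(round(saving, 2))  # 0.28
```

Under these assumptions the stacked form uses 18/25 of the weights of the single large kernel, a saving of 28%, independent of the channel width as long as it is the same in both layers.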