TABLE I.
A CONCISE COMPARISON OF HAND GESTURE RECOGNITION APPROACHES

Approach        | Pros                                       | Cons
Vision-based    | Natural interaction, low cost              | Lighting variations, complex backgrounds, clutter, time and computation requirements
Glove-based     | Accurate computation of hand configuration | Expensive, difficult to connect to computers, unsuitable for virtual reality
Colored-markers | Simple and inexpensive                     | Limited natural interaction

ity environments [22]. This data-obtaining technique is often used in sign language [23] and gaming [24]. As Moore's Law continues to make sensors smaller and more affordable, data gloves, including MIT, CyberGlove II, CyberGlove III, Fifth Dimension Sensor Glove Ultra, P5, and X-IST, are expected to become more prevalent.
C. Colored-Markers Approaches
Colored gloves with markers, and wool gloves, have been used to track the hand and to locate the fingers and palm [25]. Geometric characteristics extracted from the marked regions outline the hand's shape. However, despite advances in sensors and data gloves, natural interaction between humans and computers remains insufficient.
Table I shows a concise comparison of hand gesture recognition approaches.
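In practice, the colored-marker approach reduces to color-based segmentation followed by simple geometric measurements of the segmented region. The Python/OpenCV sketch below is only a hedged illustration of that idea; the green hue range and the returned measurements are assumptions for illustration, not details taken from [25].

```python
import cv2
import numpy as np

# Assumed HSV range for a green glove; a real system would calibrate this.
GLOVE_LOW, GLOVE_HIGH = np.array([40, 60, 60]), np.array([80, 255, 255])

def marker_hand_shape(bgr_frame):
    """Segment the colored glove and return simple geometric shape measures."""
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, GLOVE_LOW, GLOVE_HIGH)         # glove-colored pixels only
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    hand = max(contours, key=cv2.contourArea)              # largest colored region
    x, y, w, h = cv2.boundingRect(hand)
    return {
        "area": cv2.contourArea(hand),
        "perimeter": cv2.arcLength(hand, True),
        "palm_centre": (x + w / 2.0, y + h / 2.0),
        "aspect_ratio": w / float(h),
    }
```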
IV. HAND GESTURE RECOGNITION APPROACHES

Hand gesture recognition techniques work on the bare hand and extract data from it directly, which makes them simple and easy to apply [26]. They provide a direct connection with the computer and can be categorized into two types.
A. Appearance-Based Approach
Appearance-based approaches are designed by extracting features from input hand images and comparing them with stored features [27]. This method is simpler and easier than three-dimensional models but can be affected by lighting and background objects. These approaches are part of a general pattern recognition problem consisting of three tasks: extracting features, training a classifier from labeled samples, and classifying unknown samples.
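To make the three tasks concrete, the hedged Python sketch below pairs Hu-moment shape features with a k-nearest-neighbour classifier. The choice of features and classifier is an illustrative assumption, not the method prescribed in [27].

```python
import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def hand_features(gray_image):
    """Task 1: binarize the hand silhouette and describe its shape with Hu moments."""
    _, silhouette = cv2.threshold(gray_image, 0, 255,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    hu = cv2.HuMoments(cv2.moments(silhouette)).flatten()
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)   # log scale for stability

def recognize(train_images, train_labels, unknown_image):
    """Tasks 2 and 3: learn from labeled samples, then classify an unknown one."""
    classifier = KNeighborsClassifier(n_neighbors=3)
    classifier.fit([hand_features(img) for img in train_images], train_labels)
    return classifier.predict([hand_features(unknown_image)])[0]
```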
B. 3D Model-based Approach
The hand's form is modeled and examined using a 3D model description [28], including the kinematic parameters required to project the three-dimensional model onto a two-dimensional one; this projection, however, can lead to a loss of features. The model types include volumetric, geometrical, and skeletal models [29]. Volumetric models capture the three-dimensional visual appearance of the human hand, while skeletal models reduce the set of parameters needed to form the hand's shape. Geometrical models mimic the visual appearance of the hand but require several parameters and a time-consuming fitting process. Geometric shapes, such as mesh polygons and cardboard models, can also be used to approximate the visual form. However, this approach has several disadvantages [30], including the need for initial parameters close to the solution for each image, noise introduced in the imaging phase, difficulty in extracting features, and an inability to handle singularities caused by ambiguous views. In general, appearance-based detection performs better in real time than 3D model-based methods, although the 3D model can represent a wider range of hand gestures.
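As a hedged illustration of what projecting a skeletal model's kinematic parameters into two dimensions means, the toy Python sketch below computes the 3D joint positions of a single finger from joint angles and segment lengths, then projects them through an assumed pinhole camera. None of the specific lengths, angles, or camera values come from [28] or [29].

```python
import numpy as np

def finger_joints_3d(base, lengths, angles):
    """Forward kinematics of one finger: accumulate planar joint flexions."""
    joints = [np.asarray(base, float)]
    position, bend = joints[0], 0.0
    for length, angle in zip(lengths, angles):
        bend += angle                                    # accumulated flexion angle
        step = length * np.array([0.0, np.cos(bend), -np.sin(bend)])
        position = position + step
        joints.append(position)
    return np.array(joints)

def project_to_image(points_3d, focal_length=500.0, camera_distance=10.0):
    """Pinhole projection of the 3D joints onto the 2D image plane."""
    x, y, z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    depth = camera_distance + z                          # assumed camera geometry
    return np.stack([focal_length * x / depth, focal_length * y / depth], axis=1)

# Example: an index finger with three segments, slightly flexed at each joint.
joints = finger_joints_3d(base=[2.0, 0.0, 0.0],
                          lengths=[4.0, 2.5, 1.8],
                          angles=np.radians([20, 25, 15]))
print(project_to_image(joints))                          # 2D joint coordinates
```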
V. FEATURE EXTRACTION

Feature extraction techniques collect data on gestures' position, direction, posture, and temporal progression. They process and analyze low-level data (pixel values) to produce higher-level data such as object contours. A manual algorithm is created to recognize and encode specific image features, such as texture, shape, and color: given an image as a collection of pixels, the manually defined algorithm is applied to obtain a feature vector that describes the image's contents. The resulting feature vectors are used as inputs for machine-learning models. Feature extraction reduces data dimensionality by encoding the relevant information into a compressed representation and eliminating less discriminative data. Because the efficiency of gesture recognition relies heavily on feature extraction, the most critical design decision in hand motion and gesture recognition is which features to work with and how to extract them. The following subsections discuss popular feature types and computation techniques, including shape and motion features.
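Before turning to those subsections, the hedged Python/OpenCV sketch below grounds the general pipeline: a manually defined extractor that encodes color, shape, and texture cues from a hand image into one compact feature vector for a downstream machine-learning model. The particular measurements are assumptions chosen for illustration, not features prescribed by the surveyed work.

```python
import cv2
import numpy as np

def extract_feature_vector(bgr_image):
    """Manually defined extractor: encode color, shape, and texture cues."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)

    # Color: a coarse 8-bin hue histogram, normalized to sum to one.
    hue_hist = cv2.calcHist([hsv], [0], None, [8], [0, 180]).flatten()
    hue_hist /= hue_hist.sum() + 1e-9

    # Shape: aspect ratio and fill ratio of the thresholded hand silhouette.
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    x, y, w, h = cv2.boundingRect(mask)
    box_area = float(max(w * h, 1))
    shape = np.array([w / float(max(h, 1)), cv2.countNonZero(mask) / box_area])

    # Texture: mean and spread of the local gradient magnitude.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    magnitude = cv2.magnitude(gx, gy)
    texture = np.array([magnitude.mean(), magnitude.std()])

    # The compressed representation fed to a machine-learning model.
    return np.concatenate([hue_hist, shape, texture])
```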
A. Motion
Motion cues obtained from frame-to-frame comparisons offer a computationally efficient way to find foreground objects and to detect their motion and position, and researchers have used them to detect hands. This method relies on assumptions such as a static background, image pre-processing, and a stable camera. Binh et al. [31] used the Kalman filter to predict the hand's position in one frame based on its observed position in the previous frame. Modern approaches combine motion information with other visual cues to enhance detection.
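A hedged Python/OpenCV sketch of this idea follows: frame differencing supplies the observed hand position under the static-background assumption, and a constant-velocity Kalman filter predicts where the hand will be in the next frame, in the spirit of the prediction step attributed to [31]. The difference threshold and noise settings are assumptions, not values reported in that work.

```python
import cv2
import numpy as np

def make_constant_velocity_filter():
    """Kalman filter with state (x, y, vx, vy) and position-only measurements."""
    kf = cv2.KalmanFilter(4, 2)
    kf.transitionMatrix = np.array([[1, 0, 1, 0], [0, 1, 0, 1],
                                    [0, 0, 1, 0], [0, 0, 0, 1]], np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-3
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
    return kf

def track_moving_hand(video_path):
    """Yield (predicted, observed) hand positions from frame differencing."""
    capture = cv2.VideoCapture(video_path)
    kalman = make_constant_velocity_filter()
    ok, frame = capture.read()
    previous = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) if ok else None
    while ok:
        ok, frame = capture.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        predicted = kalman.predict()                      # position expected now
        # Frame differencing: pixels that changed since the previous frame.
        moving = cv2.threshold(cv2.absdiff(previous, gray), 25, 255,
                               cv2.THRESH_BINARY)[1]
        m = cv2.moments(moving)
        observed = None
        if m["m00"] > 0:                                  # centroid of the motion blob
            observed = (m["m10"] / m["m00"], m["m01"] / m["m00"])
            kalman.correct(np.float32([[observed[0]], [observed[1]]]))
        previous = gray
        yield (float(predicted[0, 0]), float(predicted[1, 0])), observed
```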