Iraqi Journal for Electrical and Electronic Engineering

Search Results for speaker-recognition

Article
Speaker Verification Based on Mel Frequency Cepestral Coefficients and Correlation

Abdalem A. Rasheed

Pages: 77-85

Abstract

Speaker recognition refers to identifying a speaker by his or her voice. People talk in a variety of tones, and each speaking voice has features that distinguish one person from another. Speaker verification (SV) involves comparing a set of measures of the speaker’s utterances with a reference for the person whose identity is being claimed, in order to accept or reject that claim. Speaker verification consists of two steps: feature extraction and feature matching. In this work, an analysis of the correlations of Mel-scale coefficients of the uttered voice is presented as a means of identifying the intended speaker. Both short text-dependent words and text-independent words are considered in this study. For text-dependent speech, the correlation ranged from 98% to 99% for user1 (same speaker), whereas the correlation of user1 with other speakers was 83% for text-dependent and 61% for text-independent speech. Furthermore, an MFCC feature extraction approach based on the distributed Discrete Cosine Transform (DCT) is provided in this research. SV tests are carried out using the MFCC feature extraction method, yielding close variance for the target speaker and distant variance for other speakers. Additionally, principal component analysis (PCA) is applied to improve the discriminative performance of the system: PCA chooses the optimal path between every pair of highly confusable speakers. The results obtained from PCA were similar to the correlation findings from the Mel-scale results, while enhancing the discriminative information and lowering the dimension of the MFCC data.
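
As a rough illustration of the feature-matching step described in this abstract, the sketch below computes a Pearson correlation between two coefficient vectors and accepts the identity claim when it exceeds a threshold. It is not the author's implementation: the function names, the 0.9 threshold, and the numeric vectors are all hypothetical stand-ins, and real Mel-scale coefficients would come from a feature extractor that is not shown here.

```python
import numpy as np

def pearson_correlation(a, b):
    """Pearson correlation between two equal-length feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()  # remove the mean so only the shape of the vector matters
    b = b - b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(reference, test, threshold=0.9):
    """Accept the identity claim if the correlation reaches the threshold.

    The threshold value is an assumption for illustration only.
    """
    return pearson_correlation(reference, test) >= threshold

# Made-up vectors standing in for averaged Mel-scale coefficients (not real data).
reference = [12.1, -3.4, 5.6, 0.8, -1.2, 2.3]
same_speaker = [11.8, -3.1, 5.9, 0.5, -1.0, 2.0]
other_speaker = [2.0, 4.5, -6.1, 3.3, 0.7, -2.8]

print(verify(reference, same_speaker))   # claim accepted
print(verify(reference, other_speaker))  # claim rejected
```

In practice the threshold would be tuned on enrollment data, since the abstract reports fairly high correlations (83% and 61%) even between different speakers.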

Article
A Review on Voice-based Interface for Human-Robot Interaction

Ameer A. Badr, Alia K. Abdul-Hassan

Pages: 91-102

Abstract

With recent developments in technology and advances in artificial intelligence and machine learning techniques, it has become possible for a robot to understand and respond to voice as part of Human-Robot Interaction (HRI). A voice-based interface robot can recognize speech information from humans so that it can interact more naturally with its human counterpart in different environments. In this work, a review of voice-based interfaces for HRI systems is presented. The review focuses on voice-based perception in HRI systems from three facets: feature extraction, dimensionality reduction, and semantic understanding. For feature extraction, numerous types of features are reviewed in various domains, such as the time, frequency, cepstral (i.e., the inverse Fourier transform of the logarithm of the signal spectrum), and deep domains. For dimensionality reduction, subspace learning can be used to eliminate the redundancies of high-dimensional features by further processing the extracted features so that they better reflect their semantic information. For semantic understanding, the aim is to infer objects or human behaviors from the extracted features. Numerous types of semantic understanding are reviewed, such as speech recognition, speaker recognition, speaker gender detection, speaker gender and age estimation, and speaker localization. Finally, some existing issues with voice-based interfaces and recommendations for future work are outlined.
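
The cepstral domain mentioned in this abstract can be sketched in a few lines: take the Fourier transform of a frame, take the logarithm of its magnitude, and apply the inverse transform. The example below is a minimal sketch of the real cepstrum under that definition; the function name, the small `eps` guard against `log(0)`, and the test frame are assumptions for illustration, not something taken from the reviewed paper.

```python
import numpy as np

def real_cepstrum(frame, eps=1e-12):
    """Real cepstrum: inverse FFT of the log-magnitude spectrum of one frame."""
    frame = np.asarray(frame, dtype=float)
    spectrum = np.fft.rfft(frame)                    # one-sided spectrum of a real signal
    log_magnitude = np.log(np.abs(spectrum) + eps)   # eps avoids log(0) on empty bins
    return np.fft.irfft(log_magnitude, n=len(frame))  # back to the quefrency domain

# Hypothetical frame: a windowed 200 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(256) / sample_rate
frame = np.hamming(256) * np.sin(2 * np.pi * 200 * t)

cepstrum = real_cepstrum(frame)
print(cepstrum.shape)
```

MFCCs follow the same idea but insert a Mel-scale filter bank before the logarithm and use a DCT in place of the inverse FFT.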

College of Engineering, University of Basrah

Licensing & Open Access

Licensed under CC-BY-4.0

This journal provides immediate open access to its content.

Copyright © 2026 College of Engineering, University of Basrah, its licensors, and contributors. All rights reserved, including those for text and data mining, AI training, and similar technologies. For all open access content, the relevant licensing terms apply.