Iraqi Journal for Electrical and Electronic Engineering

Search Results for nnmf

Article
NNMF with Speaker Clustering in a Uniform Filter-Bank for Blind Speech Separation

Ruaa N. Ismael, Hasan M. Kadhim

Pages: 111-121

Abstract

This study proposes a single-channel blind speech separation algorithm. The input is a segment of speech in which two speakers are mixed. First, filter-bank analysis transforms the input from the time domain to the time-frequency domain (a spectrogram); the filter bank has 257 sub-bands. Non-Negative Matrix Factorization (NNMF) factorizes each sub-band output into 28 sub-signals, giving 257 × 28 = 7196 sub-signals in total. A binary mask splits each of these sub-signals into two groups, one for each speaker. The mask alone cannot determine which speaker a group belongs to, so speaker clustering algorithms assign each sub-signal to a speaker. Because speaker clustering requires speaker segmentation, the speech is partitioned into standard windowed, overlapping frames. The speaker clustering process takes the phase angle extracted from the spectrogram of the mixture and merges it into the spectrogram of the recovered speech. Filter-bank synthesis then combines these signals to produce a full-band speech signal for each speaker. Subjective tests indicate that the results are acceptable. For objective evaluation, the researchers tested the algorithm on 66 speech mixtures (6 female and 6 male speakers). The average SIR is 11.1 dB, the average SDR is 1.7 dB, and the average SAR is 2.8 dB.
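
The binary-masking and phase-reuse steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the mask is computed, so the sketch assumes each time-frequency bin is given to whichever of two magnitude estimates is larger. The function name `binary_mask_separate` and the inputs `est_a` and `est_b` are hypothetical (they stand in for per-speaker magnitude estimates, for example from an NNMF model), and the toy values are made up for illustration.

```python
import cmath


def binary_mask_separate(mixture, est_a, est_b):
    """Split a complex mixture spectrogram into two with a binary mask.

    mixture: list of rows (frequency bins) of complex values (time frames).
    est_a, est_b: non-negative magnitude estimates with the same shape
    (hypothetical inputs, assumed to come from some per-speaker model).
    Each output bin keeps the phase of the mixture.
    """
    out_a, out_b = [], []
    for mix_row, a_row, b_row in zip(mixture, est_a, est_b):
        row_a, row_b = [], []
        for x, a, b in zip(mix_row, a_row, b_row):
            magnitude, phase = cmath.polar(x)
            # Assumption: the bin goes entirely to the larger estimate.
            mask = 1.0 if a >= b else 0.0
            row_a.append(cmath.rect(mask * magnitude, phase))
            row_b.append(cmath.rect((1.0 - mask) * magnitude, phase))
        out_a.append(row_a)
        out_b.append(row_b)
    return out_a, out_b


# Toy 2x2 "spectrogram" (2 frequency bins, 2 frames); values are illustrative.
mixture = [[3 + 4j, 1 - 1j], [0 + 2j, -2 + 0j]]
est_a = [[0.9, 0.1], [0.2, 0.7]]
est_b = [[0.3, 0.8], [0.6, 0.1]]
speaker_a, speaker_b = binary_mask_separate(mixture, est_a, est_b)
```

Because the mask is binary, every bin of the mixture ends up in exactly one of the two outputs, so the two recovered spectrograms sum back to the mixture. A soft (ratio) mask would be a different design choice that the abstract does not describe.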


College of Engineering, University of Basrah

Licensing & Open Access

Licensed under CC-BY-4.0

This journal provides immediate open access to its content.

Peer-review powered by Elsevier’s Editorial Manager®

Copyright © 2025 College of Engineering, University of Basrah. All rights reserved, including those for text and data mining, AI training, and similar technologies.