IEEE TASLP Article


This paper presents a robust beamformer for stereo noise reduction in hearing aid applications. The worst-case optimization method was applied to the binaural minimum-variance distortionless-response (BMVDR) beamformer to provide robustness against parameter estimation inaccuracies.
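As a rough illustration of the (monaural) MVDR structure underlying such beamformers, the sketch below computes distortionless weights, with simple diagonal loading standing in for the paper's worst-case robustness. The steering vector, covariance matrix, and loading level are made-up examples, not values from the paper:

```python
import numpy as np

M = 4                                    # number of microphones (assumed)
d = np.ones(M, dtype=complex)            # example steering vector toward the target
R = np.eye(M) + 0.5 * np.ones((M, M))    # example noise covariance matrix
eps = 1e-2                               # diagonal loading for robustness

Rl = R + eps * np.eye(M)                 # loaded covariance
w = np.linalg.solve(Rl, d)               # unnormalized MVDR solution R^-1 d
w /= d.conj() @ w                        # enforce distortionless constraint w^H d = 1

print(abs(w.conj() @ d))                 # unit gain toward the target direction
```

Diagonal loading is the simplest robust variant; the paper's worst-case optimization instead bounds the steering-vector mismatch explicitly.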

The filtered-x least-mean-square (FxLMS) algorithm has been widely used for active noise control. A fundamental analysis of the convergence behavior of the FxLMS algorithm, covering both transient and steady-state performance, could provide new insights into the algorithm and can also be helpful for its practical applications, e.g., in choosing the step size.
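As context for such analyses, a minimal single-channel FxLMS loop can be sketched as follows. The primary path, secondary path, filter length, and step size are toy assumptions, and the secondary-path estimate is taken to be exact:

```python
import numpy as np

rng = np.random.default_rng(0)

s = np.array([0.6, 0.3, 0.1])          # secondary path: speaker -> error mic (assumed)
s_hat = s.copy()                        # its estimate (perfect in this sketch)
p = np.array([0.9, 0.5, 0.2])           # primary path: noise -> error mic (assumed)
L, mu = 16, 0.02                        # adaptive filter length and step size

x = rng.standard_normal(5000)           # reference noise signal
d = np.convolve(x, p)[:len(x)]          # disturbance arriving at the error mic

w = np.zeros(L)                         # adaptive filter weights
xbuf = np.zeros(L)                      # recent reference samples
sbuf = np.zeros(len(s))                 # recent anti-noise samples
fxbuf = np.zeros(L)                     # filtered-reference samples
err = np.zeros(len(x))
for n in range(len(x)):
    xbuf[1:] = xbuf[:-1]; xbuf[0] = x[n]
    y = w @ xbuf                        # anti-noise output
    sbuf[1:] = sbuf[:-1]; sbuf[0] = y
    err[n] = d[n] - s @ sbuf            # residual heard at the error mic
    fx = s_hat @ xbuf[:len(s_hat)]      # reference filtered by the path estimate
    fxbuf[1:] = fxbuf[:-1]; fxbuf[0] = fx
    w += mu * err[n] * fxbuf            # FxLMS weight update

print(np.mean(err[:500]**2), np.mean(err[-500:]**2))  # error power should drop
```

The filtering of the reference by the secondary-path estimate (the "x" in FxLMS) is what makes the gradient estimate consistent; it is also the source of the non-trivial transient behavior the analysis above concerns.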

Active noise control (ANC) is a technology that lowers the noise level by exploiting the destructive interference of sound waves. Even though recent developments in digital signal processing (DSP) have made it possible to implement ANC algorithms in real time, insufficient computational power remains one of the challenges to be solved. In previous research, as a way of overcoming this lack of computational power, a CPU-GPU architecture was proposed so that ANC algorithms can utilize the massive computing power of the GPU without suffering from the block data transfer between CPU and GPU memories.

This article investigates deep learning based single- and multi-channel speech dereverberation. For single-channel processing, we extend magnitude-domain masking and mapping based dereverberation to complex-domain mapping, where deep neural networks (DNNs) are trained to predict the real and imaginary (RI) components of the direct-path signal from reverberant (and noisy) ones.
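A minimal sketch of the complex-domain training setup: the network input is the real and imaginary (RI) components of the reverberant STFT, and the regression target is the RI components of the direct-path STFT. The signals, impulse-response tail, and window settings below are illustrative placeholders, not the paper's data:

```python
import numpy as np
from scipy.signal import stft

fs = 16000
rng = np.random.default_rng(0)
direct = rng.standard_normal(fs)                      # stand-in direct-path signal
rir_tail = 0.3 * rng.standard_normal(800)             # toy reverberant tail
reverberant = direct + np.convolve(direct, rir_tail)[:fs]

_, _, D = stft(direct, fs=fs, nperseg=512)            # direct-path STFT
_, _, Y = stft(reverberant, fs=fs, nperseg=512)       # reverberant STFT

# DNN input: RI of the reverberant STFT; training target: RI of the direct STFT
x_feat = np.stack([Y.real, Y.imag])                   # shape (2, F, T)
target = np.stack([D.real, D.imag])                   # shape (2, F, T)
print(x_feat.shape == target.shape)
```

Predicting RI components directly, rather than a magnitude mask, lets the model recover phase as well as magnitude of the direct-path signal.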

The problem of blind audio source separation (BASS) in noisy and reverberant conditions is addressed by a novel approach, termed Global and LOcal Simplex Separation (GLOSS), which integrates full- and narrow-band simplex representations. We show that the eigenvectors of the correlation matrix between time frames in a certain frequency band form a simplex that organizes the frames according to the speaker activities in the corresponding band. 
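The local simplex claim can be illustrated on synthetic data: frames dominated by the same speaker yield highly correlated columns, so the leading eigenvectors of the frame-correlation matrix cluster the frames by active speaker. The spectral signatures, activity pattern, and noise level below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
F = 64                                    # frequency bins in the band (assumed)
a1, a2 = rng.random(F), rng.random(F)     # spectral signatures of two speakers
act = np.array([0, 0, 1, 1, 0, 1, 0, 1])  # which speaker dominates each frame
X = np.stack([a2 if s else a1 for s in act], axis=1)  # F x T band spectrogram
X += 0.01 * rng.standard_normal(X.shape)  # small additive noise

Xn = X / np.linalg.norm(X, axis=0)        # unit-normalize each frame
W = Xn.T @ Xn                             # T x T frame-correlation matrix
vals, vecs = np.linalg.eigh(W)            # eigendecomposition (ascending order)
u = vecs[:, -2]                           # 2nd leading eigenvector
labels = (u > np.median(u)).astype(int)   # its sign pattern splits the speakers
print(labels)
```

With two sources the eigenvector entries lie near two simplex vertices, one per speaker; GLOSS combines this structure across full- and narrow-band representations.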

This work presents a method that turns acoustic reflections into a favorable property for sound source localization. Whilst most real-world spatial audio applications utilize prior knowledge of sound source positions, estimating such positions in reverberant environments is still considered a difficult problem because of acoustic reflections.

Differential microphone arrays (DMAs) often suffer from white noise amplification, especially at low frequencies. If the array geometry and the number of microphones are fixed, white noise amplification can be mitigated by reducing the DMA order. With existing differential beamforming methods, however, the DMA order can only be a positive integer.
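The low-frequency white noise amplification can be seen in the simplest case, a first-order two-microphone DMA built by delay-and-subtract. The spacing and evaluation frequencies below are illustrative assumptions:

```python
import numpy as np

c, delta = 343.0, 0.01                  # speed of sound (m/s), 1 cm mic spacing
tau = delta / c                         # delay placing the null at the rear (cardioid)
f = np.array([100.0, 500.0, 2000.0])    # evaluation frequencies (Hz)
omega = 2 * np.pi * f

# On-axis (theta = 0) response magnitude of y[n] = x1[n] - x2[n - tau]
on_axis = np.abs(1 - np.exp(-1j * omega * (tau + delta / c)))

# White noise gain of the equalized beamformer: |on-axis gain|^2 / ||h||^2,
# with h = [1, -1]; a small on-axis gain at low f means strong amplification.
wng = on_axis**2 / 2
amplification_db = -10 * np.log10(wng)  # grows sharply as f decreases
print(amplification_db.round(1))
```

Because the subtractive response scales roughly with frequency, equalizing it back to unit on-axis gain boosts the (flat) sensor self-noise most at low frequencies, which is the trade-off a fractional, non-integer order would soften.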

Recurrent neural networks (RNNs) can predict fundamental frequency (F0) for statistical parametric speech synthesis systems, given linguistic features as input. However, these models assume conditional independence between consecutive F0 values, given the RNN state. In a previous study, we proposed autoregressive (AR) neural F0 models to capture the causal dependency of successive F0 values.
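The difference between the two assumptions can be illustrated with a toy generator: sampling each F0 value independently around a predicted mean track versus conditioning each sample on the previous one. The mean track, noise level, and AR coefficient are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
mu = 120 + 20 * np.sin(np.linspace(0, 4 * np.pi, T))  # toy predicted F0 mean track (Hz)

# Conditionally independent sampling: each frame only sees the mean track
indep = mu + 8 * rng.standard_normal(T)

# AR(1) sampling: each frame also conditions on the previous F0 value
phi = 0.9                                # AR coefficient (assumed)
ar = np.empty(T)
ar[0] = mu[0]
for t in range(1, T):
    innov = 8 * np.sqrt(1 - phi**2) * rng.standard_normal()
    ar[t] = mu[t] + phi * (ar[t - 1] - mu[t - 1]) + innov

# The AR trajectory is smoother: smaller average frame-to-frame jumps
print(np.mean(np.abs(np.diff(indep))) > np.mean(np.abs(np.diff(ar))))
```

Both sequences have the same marginal variance around the mean track, but only the AR version reproduces the smooth frame-to-frame continuity of natural F0 contours.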

This article addresses the problem of distance estimation using binaural hearing aid microphones in reverberant rooms. Among several distance indicators, the direct-to-reverberant energy ratio (DRR) has been shown to be more effective than other features. Therefore, we present two novel approaches to estimate the DRR of binaural signals.
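When the room impulse response is available, the DRR is simply the energy ratio between a short window around the direct-path peak and the remaining reverberant tail; the estimation problem the article addresses is recovering it from signals alone. The synthetic exponential-decay impulse response and window length below are illustrative assumptions:

```python
import numpy as np

fs = 16000
rng = np.random.default_rng(1)
t = np.arange(int(0.3 * fs)) / fs
h = rng.standard_normal(len(t)) * np.exp(-t / 0.05)   # toy diffuse reverberant tail
h[0] = 5.0                                            # strong direct-path peak

peak = np.argmax(np.abs(h))
win = int(0.0025 * fs)                                # +/- 2.5 ms direct-path window
lo, hi = max(0, peak - win), peak + win + 1
direct = np.sum(h[lo:hi] ** 2)                        # direct-path energy
reverb = np.sum(h[:lo] ** 2) + np.sum(h[hi:] ** 2)    # reverberant energy
drr_db = 10 * np.log10(direct / reverb)
print(round(drr_db, 1))
```

Larger source-receiver distance shrinks the direct-path energy while the diffuse tail stays roughly constant, which is why the DRR is an effective distance cue.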

Visual cues such as lip movements, when available, play an important role in speech communication. They are especially helpful for the hearing impaired population or in noisy environments. When not available, having a system to automatically generate talking faces in sync with input speech would enhance speech communication and enable many novel applications. 
