IEEE SPL Article

A Novel Modified Mel-DCT Filter Bank Structure With Application to Voice Activity Detection

We propose a novel modified Mel-discrete cosine transform (MMD) filter bank structure, which restricts the overlap of each filter response to its immediate neighbor. In contrast to the well-known triangular filters employed in the extraction of the Mel-frequency cepstral coefficients (MFCC), the proposed filter structure has a smoother response and offers discrete cosine transformation and Mel-scale filtering in a single operation.
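For reference, the conventional triangular Mel filter bank that the abstract contrasts against can be sketched as below. This is a minimal illustration of the well-known MFCC baseline, not the proposed MMD structure; the function names, the `2595 * log10(1 + f / 700)` Mel formula, and the parameters `n_filters`, `n_fft`, and `sample_rate` are assumptions chosen for the example.

```python
import math


def hz_to_mel(f_hz):
    # A commonly used Hz-to-Mel mapping (assumed here, not taken from the article).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def triangular_mel_filterbank(n_filters, n_fft, sample_rate):
    """Return n_filters rows of triangular weights over n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, equally spaced on the Mel scale.
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]

    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising edge
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling edge
            else:
                row.append(0.0)                     # outside the triangle
        bank.append(row)
    return bank
```

In the conventional MFCC pipeline, these weights are applied to the power spectrum, and the logarithm and a discrete cosine transform follow as separate steps.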

Learning With Learned Loss Function: Speech Enhancement With Quality-Net to Improve Perceptual Evaluation of Speech Quality

Training speech enhancement models with a human-perception-related objective function has recently become a popular topic, mainly because the conventional mean squared error (MSE) loss does not represent auditory perception well. One typical human-perception-related metric, the perceptual evaluation of speech quality (PESQ), has been shown to correlate highly with quality scores rated by humans.

Efficient Sensing of Correlated Spatiotemporal Signals: A Stochastic Gradient Approach

A low-cost, tractable progressive learning approach is proposed and discussed for efficient spatiotemporal monitoring of a completely unknown, two-dimensional correlated signal distribution in a localized wireless sensor field. The spatial distribution is compressed into a number of its contour lines, and only those sensors whose observations fall within a margin of the contour levels report to the information fusion center (FC).
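The reporting rule described above can be sketched as follows. This is only an illustration of the selection criterion, assuming scalar sensor readings and a fixed symmetric margin; the function name `reporting_sensors` and the sample values are hypothetical, and the article's stochastic gradient learning of the contour levels is not shown.

```python
def reporting_sensors(observations, contour_levels, margin):
    """Return indices of sensors whose reading lies within `margin` of any contour level."""
    return [
        index
        for index, value in enumerate(observations)
        if any(abs(value - level) <= margin for level in contour_levels)
    ]


# Hypothetical readings from five sensors and two contour levels.
readings = [0.12, 0.48, 0.95, 1.52, 2.30]
levels = [0.5, 1.5]

print(reporting_sensors(readings, levels, margin=0.05))  # prints [1, 3]
```

Only two of the five sensors transmit in this example, which is the source of the communication savings: sensors far from every contour level stay silent.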
