Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification

A number of studies have extracted bottleneck (BN) features from deep neural networks (DNNs) trained to discriminate speakers, pass-phrases, and triphone states in order to improve the performance of text-dependent speaker verification (TD-SV). However, only moderate success has been achieved.

Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation

Single-channel, speaker-independent speech separation methods have recently seen great progress. However, the accuracy, latency, and computational cost of such methods remain insufficient. The majority of the previous methods have formulated the separation problem through the time–frequency representation of the mixed signal, which has several drawbacks, including the decoupling of the phase...

A Multi-Stage Algorithm for Acoustic Physical Model Parameters Estimation

One of the challenges in computational acoustics is the identification of models that can simulate and predict the physical behavior of a system generating an acoustic signal. Whenever such models are used for commercial applications, an additional constraint is the time to market, making automation of the sound design process desirable.

Robust Joint Estimation of Multimicrophone Signal Model Parameters

One of the biggest challenges in multimicrophone applications is the estimation of the parameters of the signal model, such as the power spectral densities (PSDs) of the sources, the early (relative) acoustic transfer functions of the sources with respect to the microphones, the PSD of late reverberation, and the PSDs of microphone self-noise.