TASLP Featured Articles

You are here

Top Reasons to Join SPS Today!

1. IEEE Signal Processing Magazine
2. Signal Processing Digital Library*
3. Inside Signal Processing Newsletter
4. SPS Resource Center
5. Career advancement & recognition
6. Discounts on conferences and publications
7. Professional networking
8. Communities for students, young professionals, and women
9. Volunteer opportunities
10. Coming soon! PDH/CEU credits
Click here to learn more.

TASLP Featured Articles

The transfer of acoustic data across languages has been shown to improve keyword search (KWS) performance in data-scarce settings. In this paper, we propose a way of performing this transfer that reduces the impact of the prevalence of out-of-vocabulary (OOV) terms on KWS in such a setting.

Recently, the binaural auditory-model-based quality prediction (BAM-Q) was successfully applied to predict binaural audio quality degradations, while the generalized power-spectrum model for quality (GPSM q ) has been demonstrated to account for a large variety of monaural signal distortions.

The avoidance of spatial aliasing is a major challenge in the practical implementation of sound field synthesis. Such methods aim at a physically accurate reconstruction of a desired sound field inside a target region using a finite ensemble of loudspeakers. In the past, different theoretical treatises of the inherent spatial sampling process led to anti-aliasing criteria for simple loudspeaker array arrangements, e.g., lines and circles, and fundamental sound fields, e.g., plane and spherical waves. Many criteria were independent of the listener's position inside the target region.

Recently, generative neural network models which operate directly on raw audio, such as WaveNet, have improved the state of the art in text-to-speech synthesis (TTS). Moreover, there is increasing interest in using these models as statistical vocoders for generating speech waveforms from various acoustic features. However, there is also a need to reduce the model complexity, without compromising the synthesis quality.

Multi-channel linear prediction (MCLP) can model the late reverberation in the short-time Fourier transform domain using a delayed linear predictor and the prediction residual is taken as the desired early reflection component. Traditionally, a Gaussian source model with time-dependent precision (inverse of variance) is considered for the desired signal. In this paper, we propose a Student's t-distribution model for the desired signal, which is realized as a Gaussian source with a Gamma distributed precision.

Each edition of the challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) contained several tasks involving sound event detection in different setups. DCASE 2017 presented participants with three such tasks, each having specific datasets and detection requirements: Task 2, in which target sound events were very rare in both training and testing data, Task 3 having overlapping events annotated in real-life audio, and Task 4, in which only weakly labeled data were available for training.

Pages

Table of Contents:

TASLP Featured Articles

SPS on Twitter

SPS Videos


Signal Processing in Home Assistants

 


Multimedia Forensics


Careers in Signal Processing             

 


Under the Radar