TASLP Volume 27 Issue 6

2019

Avoiding spatial aliasing is a major challenge in the practical implementation of sound field synthesis. Such methods aim at a physically accurate reconstruction of a desired sound field inside a target region using a finite ensemble of loudspeakers. In the past, different theoretical treatments of the inherent spatial sampling process have led to anti-aliasing criteria for simple loudspeaker array arrangements, e.g., lines and circles, and for fundamental sound fields, e.g., plane and spherical waves. Many of these criteria were independent of the listener's position inside the target region.
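
As a point of reference for the kind of criterion mentioned above, the textbook half-wavelength rule for a uniformly spaced linear array can be sketched as follows. This is an illustration under stated assumptions, not the criterion derived in the paper: the function name `aliasing_frequency`, the default speed of sound of 343 m/s, and the 10 cm spacing are all assumed for the example.

```python
def aliasing_frequency(spacing_m, speed_of_sound=343.0):
    """Worst-case spatial aliasing frequency (Hz) of a uniform linear array.

    Assumes the classic half-wavelength rule: no aliasing while the
    spacing stays below half the wavelength, i.e. f < c / (2 * spacing).
    """
    if spacing_m <= 0:
        raise ValueError("spacing_m must be positive")
    return speed_of_sound / (2.0 * spacing_m)


# Hypothetical array with 10 cm loudspeaker spacing (assumed value).
f_al = aliasing_frequency(0.10)
print(f"{f_al:.0f} Hz")
```

Note that this bound depends only on the spacing and the speed of sound, so it is one of the listener-independent criteria the abstract refers to.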

Recently, generative neural network models that operate directly on raw audio, such as WaveNet, have improved the state of the art in text-to-speech synthesis (TTS). Moreover, there is increasing interest in using these models as statistical vocoders for generating speech waveforms from various acoustic features. However, there is also a need to reduce the model complexity without compromising the synthesis quality.

Multi-channel linear prediction (MCLP) can model the late reverberation in the short-time Fourier transform domain using a delayed linear predictor, and the prediction residual is taken as the desired early reflection component. Traditionally, a Gaussian source model with time-dependent precision (inverse of variance) is considered for the desired signal. In this paper, we propose a Student's t-distribution model for the desired signal, which is realized as a Gaussian source with a Gamma-distributed precision.
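
The statement that a Student's t-distribution can be realized as a Gaussian with a Gamma-distributed precision can be checked numerically. The following is a minimal sketch, not the model from the paper: it assumes a real-valued scalar instead of complex STFT coefficients, and the helper name `sample_scale_mixture`, the degrees of freedom `nu = 10`, and the sample size are hypothetical choices made for the example.

```python
import random
import statistics


def sample_scale_mixture(nu, n, seed=0):
    """Draw n samples of x | precision ~ N(0, 1/precision),
    with precision ~ Gamma(shape=nu/2, rate=nu/2).

    Marginally, x follows a Student's t-distribution with nu degrees
    of freedom (zero mean, unit scale).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # gammavariate takes (shape, scale), so rate nu/2 becomes scale 2/nu.
        precision = rng.gammavariate(nu / 2.0, 2.0 / nu)
        samples.append(rng.gauss(0.0, precision ** -0.5))
    return samples


nu = 10.0  # assumed degrees of freedom
x = sample_scale_mixture(nu, 50000)
empirical_var = statistics.pvariance(x)
theoretical_var = nu / (nu - 2.0)  # variance of a Student's t for nu > 2
print(empirical_var, theoretical_var)
```

The empirical variance should land close to nu / (nu - 2), which is larger than the unit variance of the conditional Gaussian at precision 1 and reflects the heavier tails of the marginal distribution.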

Each edition of the challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) contained several tasks involving sound event detection in different setups. DCASE 2017 presented participants with three such tasks, each having specific datasets and detection requirements: Task 2, in which target sound events were very rare in both training and testing data; Task 3, which had overlapping events annotated in real-life audio; and Task 4, in which only weakly labeled data were available for training.
