TASLP Articles

Speaker diarization is an important and topical problem, and it is especially useful as a preprocessor for applications involving conversational speech. The objective of this article is two-fold: (i) segment initialization by uniformly distributing speaker information across the initial segments, and (ii) incorporating speaker-discriminative features within the unsupervised diarization framework. In the first part of the work, a varying-length segment initialization technique is proposed for an Information Bottleneck (IB) based speaker diarization system, using phoneme rate as side information. This initialization distributes speaker information uniformly across the segments and provides a better starting point for IB-based clustering.

One practical requirement of music copyright management is the estimation of music relative loudness, which is mostly ignored in existing music detection work. To address this, we study the joint task of music detection and music relative loudness estimation. Specifically, we observe that the joint task has two characteristics, temporality and hierarchy, that can help in finding a solution. For example, a tiny fragment of audio is temporally related to its neighboring fragments because they may all belong to the same event, and the event classes of a fragment in the two tasks have a hierarchical relationship. Based on this observation, we reformulate the joint task as a hierarchical event detection and localization problem. To solve it, we propose Hierarchical Regulated Iterative Networks (HRIN), which come in two variants, HRIN-r and HRIN-cr, based on recurrent and convolutional recurrent modules, respectively.

We consider the problem of localizing a source using range and range-difference measurements. Both problems are non-convex and non-smooth, and are challenging to solve. In this article, we develop an iterative algorithm, Source Localization Via an Iterative technique (SOLVIT), to localize the source using all the distinct range-difference measurements, i.e., without choosing a reference sensor.
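To make the range-based formulation concrete, the sketch below minimizes the range least-squares cost, the sum over sensors of (||p - s_i|| - r_i)^2, with plain gradient descent in 2-D. This is not SOLVIT and does not use range differences; it is only a baseline illustration. The function name `localize_from_ranges`, the sensor layout, the step size, and the noiseless ranges are all assumptions made for the example. Because the cost is non-convex, a poor starting point could end in a local minimum.

```python
import math

def localize_from_ranges(sensors, ranges, start, step_size=0.1, iterations=200):
    """Gradient descent on sum_i (||p - s_i|| - r_i)^2 (hypothetical helper)."""
    x, y = start
    for _ in range(iterations):
        grad_x = grad_y = 0.0
        for (sx, sy), r in zip(sensors, ranges):
            # Guard against division by zero if the estimate lands on a sensor.
            dist = max(math.hypot(x - sx, y - sy), 1e-12)
            residual = dist - r
            grad_x += 2.0 * residual * (x - sx) / dist
            grad_y += 2.0 * residual * (y - sy) / dist
        x -= step_size * grad_x
        y -= step_size * grad_y
    return x, y

# Assumed geometry: four sensors on the corners of a 10 x 10 square.
sensors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
true_source = (3.0, 7.0)
# Noiseless range measurements, for illustration only.
ranges = [math.hypot(true_source[0] - sx, true_source[1] - sy) for sx, sy in sensors]

estimate = localize_from_ranges(sensors, ranges, start=(5.0, 5.0))
print(estimate)
```

Starting from the centroid of the sensors keeps this toy case well behaved; with noisy measurements the estimate would only approximate the true position.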

Personal Sound Zones (PSZ) systems aim to render independent sound signals to multiple listeners within a room by using arrays of loudspeakers. One of the algorithms used to provide PSZ is Weighted Pressure Matching (wPM), which computes the filters required to render a desired response in the listening zones while reducing the acoustic energy arriving at the quiet zones.

This paper presents a robust beamformer for stereo noise reduction in hearing aid applications. The worst-case optimization method is applied to the binaural minimum-variance distortionless-response (BMVDR) beamformer to provide robustness against parameter estimation inaccuracies.
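For orientation, the standard (non-robust) MVDR weights for a single frequency bin are w = R^{-1} d / (d^H R^{-1} d), where R is the noise covariance matrix and d the steering vector. The sketch below evaluates this for two microphones and checks the distortionless constraint w^H d = 1. It is not the worst-case robust BMVDR design from the paper, and it produces a single output rather than a binaural pair; the function name `mvdr_weights_2ch` and the numeric values are assumptions chosen for illustration.

```python
import cmath

def mvdr_weights_2ch(noise_cov, steering):
    """Plain two-channel MVDR weights w = R^-1 d / (d^H R^-1 d) (hypothetical helper)."""
    (a, b), (c, d) = noise_cov
    det = a * d - b * c
    # Closed-form inverse of the 2 x 2 covariance matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    r_inv_d = [
        inv[0][0] * steering[0] + inv[0][1] * steering[1],
        inv[1][0] * steering[0] + inv[1][1] * steering[1],
    ]
    denom = sum(s.conjugate() * v for s, v in zip(steering, r_inv_d))
    return [v / denom for v in r_inv_d]

# Assumed Hermitian, positive-definite noise covariance for one frequency bin.
noise_cov = [[1.0 + 0j, 0.3 + 0.1j], [0.3 - 0.1j, 1.5 + 0j]]
# Assumed steering vector: second microphone attenuated and phase-shifted.
steering = [1.0 + 0j, 0.8 * cmath.exp(-0.5j)]

weights = mvdr_weights_2ch(noise_cov, steering)
# Response of the beamformer toward the target, w^H d.
response = sum(w.conjugate() * s for w, s in zip(weights, steering))
print(abs(response))
```

In practice the steering vector and covariance are estimated from data, and errors in those estimates are what a robust design is meant to tolerate.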

The filtered-x least-mean-square (FxLMS) algorithm has been widely used for active noise control. A fundamental analysis of the convergence behavior of the FxLMS algorithm, including its transient and steady-state performance, could provide new insights into the algorithm and can also be helpful for its practical application, e.g., in choosing the step size.
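As a reminder of what the algorithm does, here is a minimal single-channel FxLMS sketch. The reference signal is filtered through an estimate of the secondary path before it drives the LMS weight update, w <- w + mu * e * x', where x' is the filtered reference. The tone frequency, path coefficients, tap count, and step size are assumptions for the demo, and the secondary-path estimate is taken to be exact, which real systems cannot assume.

```python
import math

def fxlms(reference, disturbance, secondary_path, secondary_path_estimate,
          num_taps, step_size):
    """Single-channel FxLMS; returns the residual error at the error microphone."""
    w = [0.0] * num_taps                           # adaptive controller weights
    x_buf = [0.0] * num_taps                       # reference history, newest first
    fx_buf = [0.0] * num_taps                      # filtered-reference history
    y_buf = [0.0] * len(secondary_path)            # controller output history
    xs_buf = [0.0] * len(secondary_path_estimate)  # input to the path estimate
    errors = []
    for x, d in zip(reference, disturbance):
        x_buf = [x] + x_buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x_buf))
        y_buf = [y] + y_buf[:-1]
        # Anti-noise as heard at the error microphone (after the secondary path).
        y_s = sum(s * yi for s, yi in zip(secondary_path, y_buf))
        e = d - y_s
        # Filter the reference through the secondary-path estimate.
        xs_buf = [x] + xs_buf[:-1]
        fx = sum(s * xi for s, xi in zip(secondary_path_estimate, xs_buf))
        fx_buf = [fx] + fx_buf[:-1]
        # LMS update driven by the filtered reference.
        w = [wi + step_size * e * fxi for wi, fxi in zip(w, fx_buf)]
        errors.append(e)
    return errors

n_samples = 2000
reference = [math.sin(2 * math.pi * 0.05 * n) for n in range(n_samples)]
# Assumed primary path: two-sample delay with gain 0.8.
disturbance = [0.0, 0.0] + [0.8 * v for v in reference[:-2]]
# Assumed secondary path: one-sample delay with gain 0.5, known exactly.
secondary_path = [0.0, 0.5]

errors = fxlms(reference, disturbance, secondary_path, secondary_path,
               num_taps=8, step_size=0.05)
early = sum(e * e for e in errors[:200]) / 200
late = sum(e * e for e in errors[-200:]) / 200
print(early, late)
```

The step size governs the trade-off the abstract alludes to: larger values adapt faster but move toward instability, especially as the secondary-path delay grows.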

Active noise control (ANC) is a technology that lowers the noise level by using the principle of destructive interference of sound waves. Even though recent developments in digital signal processing (DSP) have made it possible to implement ANC algorithms in real time, insufficient computational power remains one of the challenges. In previous research, a CPU-GPU architecture was proposed to overcome the lack of computational power, so that ANC algorithms can utilize the massive computing power of the GPU without suffering from the block data transfer between CPU and GPU memories.

This article investigates deep learning based single- and multi-channel speech dereverberation. For single-channel processing, we extend magnitude-domain masking and mapping based dereverberation to complex-domain mapping, where deep neural networks (DNNs) are trained to predict the real and imaginary (RI) components of the direct-path signal from reverberant (and noisy) ones.

The problem of blind audio source separation (BASS) in noisy and reverberant conditions is addressed by a novel approach, termed Global and LOcal Simplex Separation (GLOSS), which integrates full- and narrow-band simplex representations. We show that the eigenvectors of the correlation matrix between time frames in a certain frequency band form a simplex that organizes the frames according to the speaker activities in the corresponding band. 

This work presents a method that turns acoustic reflections into a favorable property for sound source localization. Whilst most real-world spatial audio applications utilize prior knowledge of the sound source position, estimating such positions in reverberant environments is still considered a difficult problem due to acoustic reflections.
