SPM May 2023

Neural Target Speech Extraction: An overview

Humans can listen to a target speaker even in challenging acoustic conditions with noise, reverberation, and interfering speakers. This phenomenon is known as the cocktail party effect. For decades, researchers have worked toward matching this human listening ability. One critical issue is handling interfering speakers: because the target and nontarget speech signals share similar characteristics, they are difficult to tell apart.


Bounded-Magnitude Discrete Fourier Transform

Analyzing the magnitude response of a finite-length sequence is a ubiquitous task in signal processing. However, the discrete Fourier transform (DFT) samples that response only at discrete frequency points. This work introduces bounds on the magnitude response that can be computed efficiently without additional zero padding. The proposed bounds can make visualizations more informative and indicate whether additional frequency resolution or zero padding is required.
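To illustrate the sampling issue the abstract describes, here is a minimal sketch using only the Python standard library. It does not implement the bounds proposed in the article, which are not given in this summary. It compares the peak magnitude seen at the N DFT bins with the peak seen on a denser frequency grid (equivalent to zero padding), and checks both against the elementary triangle-inequality bound, the sum of |x[n]|. The test signal, the function names, and the padding factor of 8 are assumptions chosen for illustration.

```python
import cmath
import math


def dtft_magnitude(x, omega):
    """Magnitude of the DTFT of finite sequence x at omega (rad/sample)."""
    return abs(sum(xn * cmath.exp(-1j * omega * n) for n, xn in enumerate(x)))


def dft_magnitudes(x):
    """Magnitude response sampled at the len(x) DFT bin frequencies."""
    n = len(x)
    return [dtft_magnitude(x, 2 * math.pi * k / n) for k in range(n)]


def dense_magnitudes(x, factor):
    """Magnitude response on a grid 'factor' times denser than the DFT bins.

    This matches the DFT of x zero padded to factor * len(x) samples.
    """
    m = factor * len(x)
    return [dtft_magnitude(x, 2 * math.pi * k / m) for k in range(m)]


# Hypothetical test signal: a sinusoid whose frequency lies halfway
# between DFT bins 2 and 3, so no bin lands on the true peak.
N = 16
x = [math.cos(2 * math.pi * 2.5 * n / N) for n in range(N)]

coarse_peak = max(dft_magnitudes(x))       # peak seen by the plain DFT
dense_peak = max(dense_magnitudes(x, 8))   # peak seen with 8x zero padding
l1_bound = sum(abs(v) for v in x)          # trivial bound valid at every frequency

print(f"peak at DFT bins:        {coarse_peak:.3f}")
print(f"peak on 8x denser grid:  {dense_peak:.3f}")
print(f"triangle-inequality cap: {l1_bound:.3f}")
```

The plain DFT underestimates the peak here because the sinusoid falls between bins. The triangle-inequality cap holds at every frequency but is loose, which is the kind of gap that tighter bounds would aim to close.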


Historical Audio Search and Preservation: Finding Waldo Within the Fearless Steps Apollo 11 Naturalistic Audio Corpus

Apollo 11 was the first manned space mission to successfully bring astronauts to the Moon and return them safely. As part of NASA’s goal of assessing team and mission success, all voice communications among mission control, astronauts, and support staff were captured using a multichannel analog system, and these recordings had never been made available until recently. More than 400 personnel served as mission specialists or support staff and communicated across 30 audio loops, resulting in 9,000+ h of data. It is essential to identify each speaker’s role during Apollo and to analyze how the group communicated to achieve a common goal.
