Update from the AASP-TC

Published in TC News on 1 February 2016

by Ivan Tashev (AASP-TC Member) and Patrick A. Naylor (AASP-TC Chair)

The Audio and Acoustic Signal Processing TC is proud of the mix of industrial applications and fundamental research within its scope. This is also echoed in its membership. Two TC members, one from industry and one from academia, introduce the topic of audio as a key technical component of augmented reality.

Sound plays a crucial role in helping us to understand our environment, be it real, virtual or augmented. Creating a convincing impression of virtual or augmented reality is the job of devices, usually head-worn, that control the audio and visual scene experienced by the user.

Augmented reality devices differ from virtual reality devices by not separating the user from the real world but instead adding to that reality. Transparent displays are used and the audio is delivered using acoustically transparent headphones or small loudspeakers near the ears. These augmented reality devices are expected to synthesize virtual objects (visually and acoustically) which should be perceived as existing within the surrounding reality. This is a much more complicated task than virtual reality, because synchronizing real and virtual objects requires very precise tracking of the head orientation and detailed information about the surroundings (walls, tables, chairs) in order to provide stable placement and proper occlusion of the virtual objects. The usage scenarios for augmented reality cover a wide range of activities including science, design, productivity, navigation and entertainment. A representative augmented reality device is Microsoft’s HoloLens [1].

Unlike vision, where humans have a 90° field of view, human hearing covers all directions. This means that the audio system of augmented reality devices is expected to provide realistic rendering of sound objects anywhere in 3D space.

Humans locate sounds in three dimensions using only two ears by exploiting the complicated directivity patterns of these two sensors. The directivity can be modeled using Head-Related Transfer Functions (HRTFs) that include the effects of occlusion from the head, reflections from the shoulders, and the complex shape of the pinna. Besides the differences in the time-of-arrival of sounds at the two ears (interaural time differences) and in their magnitudes (interaural level differences), the human brain uses small movements of the head to disambiguate the direction of sound sources. Additional cues, such as level, reverberation and reflections from the ground, are used to determine the approximate distance. To provide a realistic sensation of the sound source position in augmented reality devices, all of these cues should be synthesized properly. A major problem, however, is that HRTFs vary from person to person because of differences in head size and pinna shape.
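As a rough illustration of the interaural time difference cue, the sketch below uses Woodworth's classical spherical-head approximation, which is not taken from this article. The head radius, the speed of sound and the function names are assumptions chosen for illustration; a real HRTF captures far more than this single number.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value
HEAD_RADIUS = 0.0875    # m, an assumed average; individual heads differ


def fold_to_front(azimuth_deg):
    """Mirror a rear azimuth onto the front hemisphere (-90..90 degrees)."""
    az = (azimuth_deg + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
    if az > 90.0:
        return 180.0 - az
    if az < -90.0:
        return -180.0 - az
    return az


def woodworth_itd(azimuth_deg, radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Approximate ITD in seconds for a distant source on the horizontal plane.

    Azimuth is measured from straight ahead, positive to the right.
    Spherical-head model: ITD = (r / c) * (theta + sin(theta)).
    """
    theta = math.radians(fold_to_front(azimuth_deg))
    return (radius / c) * (theta + math.sin(theta))


if __name__ == "__main__":
    for az in (0, 30, 90, 150):
        print(f"{az:4d} deg -> {woodworth_itd(az) * 1e6:6.1f} microseconds")
```

In this simplified model a source at 30° and one at 150° produce the same time difference, which is one way to see why the pinna cues and small head movements mentioned above are needed to tell front from back.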

Binaural (‘two ears’) recording and rendering technology is not at all new. Using a dummy head with two microphones near or inside the ears, recordings of concerts and other sounds have been made, analyzed and found to have several deficiencies:

  • the entire audio scene rotates when the user moves their own head, for example when trying to disambiguate the direction to the sound source;
  • the differences between the HRTFs of the dummy head and the HRTFs of the listener cause front/back and up/down confusion, and reduce the sharpness of the spatial audio image.

Both virtual and augmented reality devices have integrated head orientation tracking, which potentially enables the first problem to be addressed. Unfortunately, even well measured and designed average HRTFs cannot provide a good listening experience for the majority of listeners. Accordingly, HRTF personalization is of high interest in the audio research community.
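The way head tracking addresses the first problem can be sketched in a few lines: the renderer keeps each source fixed in the world and recomputes its direction relative to the head whenever the tracker reports a new orientation. The sketch below is a minimal illustration that assumes yaw-only rotation and hypothetical function names; a real device would use the full 3D orientation.

```python
def wrap_degrees(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def relative_azimuth(source_world_deg, head_yaw_deg):
    """Azimuth of a world-fixed source as seen from the rotated head.

    Both angles are in degrees, positive to the right. The result is the
    direction for which the renderer would select an HRTF.
    """
    return wrap_degrees(source_world_deg - head_yaw_deg)


if __name__ == "__main__":
    # A source straight ahead in the room; the listener turns 30 degrees right.
    # Relative to the head, the source is now 30 degrees to the left.
    print(relative_azimuth(0.0, 30.0))
```

Rendering with the relative azimuth rather than the world azimuth keeps the virtual source anchored in the room, so the scene no longer rotates with the head.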

Spatial audio rendering technology with personalized HRTFs has a much broader scope than just virtual and augmented reality. Even the experience of a simple task, such as listening to stereo music through headphones, can be improved by rendering the two channels as if coming from two loudspeakers in front of the listener.
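The virtual loudspeaker idea can be sketched as four convolutions: each stereo channel is filtered with the head-related impulse responses from its virtual loudspeaker to each ear, and the results are summed per ear. The impulse responses below are toy placeholders (a direct impulse for the near ear, a delayed and attenuated impulse for the far ear), not measured or personalized HRTFs, and all names are hypothetical. Production code would typically use an FFT-based convolution from a numerical library.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def toy_hrirs(delay_samples=12, contra_gain=0.5, length=32):
    """Placeholder impulse responses, keyed as 'speaker->ear'.

    Assumption: 12 samples is roughly the interaural delay for a
    loudspeaker 30 degrees off center at a 48 kHz sampling rate.
    """
    ipsi = [0.0] * length
    ipsi[0] = 1.0                        # near ear: direct, full level
    contra = [0.0] * length
    contra[delay_samples] = contra_gain  # far ear: later and quieter
    return {"L->l": ipsi, "L->r": contra, "R->l": contra, "R->r": ipsi}


def render_virtual_speakers(left, right, hrirs):
    """Return (left_ear, right_ear) signals for equal-length stereo input."""
    ear_l = [a + b for a, b in zip(convolve(left, hrirs["L->l"]),
                                   convolve(right, hrirs["R->l"]))]
    ear_r = [a + b for a, b in zip(convolve(left, hrirs["L->r"]),
                                   convolve(right, hrirs["R->r"]))]
    return ear_l, ear_r


if __name__ == "__main__":
    # A click in the left channel only.
    ear_l, ear_r = render_virtual_speakers([1.0, 0.0, 0.0, 0.0],
                                           [0.0, 0.0, 0.0, 0.0],
                                           toy_hrirs())
    print(ear_l[0], ear_r[12])
```

With plain headphone playback the left channel reaches only the left ear; here it also reaches the right ear, later and quieter, which is what moves the image out of the head and in front of the listener.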

Such spatial audio technology can also be found, perhaps surprisingly, in very diverse applications. Cities Unlocked [2] is a joint project between Guide Dogs (the UK’s oldest charity, which provides guide dogs for visually impaired people) and Microsoft. The idea is to assist navigation in lightly instrumented environments. The hardware consists of bone-conducting headphones (to keep the ears open) with an inertial measurement unit to track the head orientation. A Windows Phone app can track passive markers placed on lampposts, trees, etc. and convert their positions into pinging sounds, helping the visually impaired user to move around a city. The first stage of the project was deployed to 50 people in 2014 and the second stage to more than 500 in 2015.

Currently, spatial audio processing is taking its first ‘serious’ steps in industry. There are still many issues to be addressed, researched and converted to industrial-strength algorithms and technologies: capture, compression, representation and rendering of spatial audio. This is just the beginning of providing a richer audio experience in nearly all aspects of our lives: from listening to stereo music to augmented reality and beyond.
