NEWS AND RESOURCES FOR MEMBERS OF THE IEEE SIGNAL PROCESSING SOCIETY

Joachim Thiemann (McGill University), “A Sparse Auditory Envelope Representation with Iterative Reconstruction for Audio Coding” (2011)

Advisor: Prof. Peter Kabal

In this thesis, the author investigates perceptual-domain coding using a representation designed to contain only the audible information, regardless of whether reconstruction can be performed efficiently. The perceptual representation is based on a multichannel basilar membrane model, where each channel is decomposed into envelope and carrier components. The carrier components are discarded on the assumption that the information they carry is also present in the envelopes. The envelope components are sparsified using a transmultiplexing masking model and form a sparse auditory envelope representation (SAER).

An iterative reconstruction algorithm for the SAER is evaluated using subjective and objective testing on speech and audio signals. Some types of audio signals are reproduced very well using this method, whereas others exhibit audible distortion. Except in specific cases where part of the carrier information is required, most of the audible information is present in the SAER and can be reconstructed using iterative methods.

For details, please access the full thesis or contact the author.
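
To make the envelope and carrier decomposition mentioned above concrete, here is a minimal Python sketch for a single channel. It is not the thesis's method: this summary does not describe how the basilar membrane model extracts envelopes, so the sketch assumes a common stand-in, the analytic signal (Hilbert transform), computed with a naive DFT. The function names `analytic_signal` and `envelope_and_carrier` and the test tone are hypothetical, chosen only for illustration.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence, using a naive O(N^2) DFT.

    Assumption: a Hilbert-style envelope stands in for the auditory
    model's envelope extraction, which this summary does not specify.
    """
    n = len(x)
    # Forward DFT.
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist when n is even), double the positive
    # frequencies, and zero the negative frequencies.
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue
        if k < (n + 1) // 2:
            spectrum[k] *= 2
        else:
            spectrum[k] = 0
    # Inverse DFT.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def envelope_and_carrier(channel):
    """Split one channel into a slowly varying envelope and a unit-amplitude carrier."""
    z = analytic_signal(channel)
    envelope = [abs(v) for v in z]
    carrier = [math.cos(cmath.phase(v)) for v in z]
    return envelope, carrier


# Hypothetical test channel: an amplitude-modulated tone.
n = 256
modulator = [1.0 + 0.5 * math.cos(2 * math.pi * 2 * t / n) for t in range(n)]
channel = [modulator[t] * math.cos(2 * math.pi * 32 * t / n) for t in range(n)]

envelope, carrier = envelope_and_carrier(channel)
```

In this sketch the product of `envelope` and `carrier` gives back the channel signal, so dropping `carrier` shows what a decoder working from envelopes alone would have to recover across channels.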