The 42nd International Conference on Acoustics, Speech, and Signal Processing (ICASSP) was recently held in New Orleans, Louisiana, from March 5-9, 2017. The theme of the conference was "The Internet of Signals." In this article, we highlight the conference theme as well as emerging and enduring trends at ICASSP.
ICASSP presents what is new and upcoming in speech and signal processing research, observable both through the keynote theme and the ever-growing diversity of the parallel sessions. This ICASSP was of particular note both for that diversity and for the sheer enjoyment of being embedded in a vibrant city with one's research colleagues. ICASSP was not merely "in" New Orleans; it was located deep in the heart of the French Quarter. The location created many opportunities for research conversations in both formal settings (the conference venue itself) and informal ones (over fresh beignets or at any of the numerous jazz clubs). This combination is the bread and butter of innovation, sparking creativity by removing barriers, both conversational and spatial. I look forward to attending ICASSP next year to see the new directions that will result from these interactions.
Conference Theme: The Internet of Signals
Two of the four plenary talks at ICASSP this year focused on the Internet of Things (IoT). Dr. K. J. Ray Liu discussed the technology behind smart radios and demonstrated the plethora of applications that could effectively leverage these advances, including home/office monitoring and security, radio human biometrics, vital signs detection, wireless charging, and 5G communications. Dr. Jan Rabaey presented work on the growing and expanding field of IoT and discussed how the miniaturization of sensor technologies will enable the evolution of pervasive human-body-oriented sensor networks, described as the Human Intranet. He discussed critical aspects of robustness, safety, security, and privacy.
The fields of affective computing and paralinguistic modeling have had an increasing presence at ICASSP over the last several years. This year, three sessions were devoted to these topics: two lecture sessions and one poster session. These topics also appeared in lecture sessions focused on biomedical signal processing and deep learning, in addition to poster sessions on spoken language understanding, applications of machine learning in signal processing, and supervised and semi-supervised learning. The strength of this trend was underscored by the opening-day plenary, delivered by Dr. Rana El-Kaliouby of Affectiva. Her talk highlighted the commercial applicability of these technologies and existing commercial-academic collaborations.
There was also a lecture session devoted to speech processing for medical diagnostics. The talks in this session covered assistive technology applications, ranging from feature design to machine learning methods. Critically, these talks also focused on methods to enhance robustness, which is necessary for real-world deployment.
The inclusion of these research themes is thrilling. Success in these areas means two things: (1) the engineering community will have methods that can handle the inherent variability in human communication and/or disease expression and (2) the community, including and extending beyond engineering, will have new knowledge about human behavior. ICASSP’s continued willingness to highlight these emerging research areas will result in fundamental advances in how we think about human-centered technologies and their contributions beyond the so-called ivory tower.
The field of Automatic Speech Recognition (ASR) once again had a strong showing at ICASSP. Topics included advances in deep learning, end-to-end modeling, robustness to noise, spoken term detection, and speech enhancement. Progress in ASR was highlighted by the plenary on the third day, delivered by Dr. David Nahamoo. In addition to describing the history and development of ASR methods, he highlighted IBM's recent milestone of obtaining a 5.5% word error rate (WER) on a benchmark speech corpus. Once again, there were strong submissions in source separation, speaker localization, speech synthesis, and speaker verification.