1. IEEE Signal Processing Magazine
2. Signal Processing Digital Library*
3. Inside Signal Processing Newsletter
4. SPS Resource Center
5. Career advancement & recognition
6. Discounts on conferences and publications
7. Professional networking
8. Communities for students, young professionals, and women
9. Volunteer opportunities
10. Coming soon! PDH/CEU credits
Date: 17 April 2025
Time: 11:00 AM ET (New York Time)
Presenter(s): Ms. Karn N. Watcharasupat
Date: 26 March 2025
Time: 8:00 AM ET (New York Time)
Presenter(s): Dr. Nhan Thanh Nguyen, Dr. Nir Shlezinger
Date: 3 March 2025
Chapter: Tokyo Joint Chapter
Chapter Chair: Nobutaka Ono
Title: Spatial Audio Intelligence: From Representation to Understanding and Control of Auditory Environments
Tampere University has several open professor positions related to AI and its applications, covering various areas of signal processing. The positions include a substantial starting package that covers funding for multiple research group members. Strong researchers are encouraged to apply! The deadline for applications is 9 March 2025.
Large language models (LLMs) have demonstrated increasingly powerful reasoning capabilities, especially in text. The project aims to explore and advance these capabilities across multiple data modalities, including but not limited to text, speech, and audio. Integrating multiple modalities can lead to more robust and general systems capable of understanding and reasoning about the world in a more human-like manner.
Date: 26 February 2025
Time: 10:00 AM ET (New York time)
Presenter(s): Ivan Dokmanić
Date: 23 May 2025
Chapter: Kerala Chapter
Chapter Chair: Reshna Ayoob
Title: Unveil a Better Solution with the Toyota Production System
Date: 2-6 June 2025
Location: College Park, MD, USA
Most current models for analyzing multimodal sequences disregard the imbalanced contributions of individual modal representations caused by varying information densities, as well as the inherent multi-relational interactions across distinct modalities. This can foster a biased understanding of the intricate interplay among modalities, limiting prediction accuracy and effectiveness.
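To make the idea of imbalanced modality contributions concrete, here is a minimal sketch (not the method from the abstract; all names and the gating rule are illustrative assumptions) in which each modality's representation is weighted by a learned gate before fusion, so that modalities with different information densities do not contribute equally by default:

```python
# A minimal gated-fusion sketch (illustrative assumption, not the paper's model).
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dims, hidden=128):
        super().__init__()
        # Project each modality (e.g. text/audio/vision) to a shared space.
        self.proj = nn.ModuleList([nn.Linear(d, hidden) for d in dims])
        # One scalar gate per modality, conditioned on its own features.
        self.gate = nn.ModuleList([nn.Linear(d, 1) for d in dims])

    def forward(self, feats):
        fused = 0.0
        for proj, gate, x in zip(self.proj, self.gate, feats):
            w = torch.sigmoid(gate(x))           # per-sample modality weight
            fused = fused + w * torch.tanh(proj(x))
        return fused                              # (batch, hidden)

# Usage: three modalities with different feature sizes.
text, audio, video = torch.randn(4, 300), torch.randn(4, 74), torch.randn(4, 35)
model = GatedFusion(dims=[300, 74, 35])
print(model([text, audio, video]).shape)          # torch.Size([4, 128])
```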
Audio and visual signals complement each other in human speech perception, and the same applies to automatic speech recognition. For speech perception, the visual signal carries less evidence than the acoustic signal, but it is more robust in complex acoustic environments.
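A toy late-fusion sketch can illustrate this complementarity. The weighting rule below is an assumption made purely for illustration (not a method from the article): per-frame class scores from an acoustic and a visual model are combined, with the weight shifting toward the visual stream as the estimated SNR drops.

```python
# Toy audio-visual late fusion; the SNR-to-weight mapping is an assumption.
import numpy as np

def fuse_av(log_p_audio, log_p_video, snr_db, snr_lo=0.0, snr_hi=20.0):
    # Map SNR to an audio weight in [0.2, 0.8]: noisy input trusts video more.
    alpha = np.clip((snr_db - snr_lo) / (snr_hi - snr_lo), 0.0, 1.0)
    w_audio = 0.2 + 0.6 * alpha
    return w_audio * log_p_audio + (1.0 - w_audio) * log_p_video

log_p_a = np.log(np.array([0.6, 0.3, 0.1]))   # acoustic posteriors
log_p_v = np.log(np.array([0.2, 0.7, 0.1]))   # visual (lip-reading) posteriors
print(fuse_av(log_p_a, log_p_v, snr_db=5.0))  # fused log scores
```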
Conventional fine-tuning encounters increasing difficulties given the size of current pre-trained language models, which has made parameter-efficient tuning the focal point of frontier research. Recent advances in this field are unified tuning methods that aim to tune the representations of both multi-head attention (MHA) and the fully connected feed-forward network (FFN) simultaneously, but they rely on existing tuning methods and do not explicitly model domain knowledge for downstream tasks.
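As a rough illustration of tuning both MHA and FFN representations while keeping the pre-trained weights frozen, here is a minimal adapter-style sketch. It assumes a generic Transformer layer and is not the specific unified method discussed in the abstract; all module names are illustrative.

```python
# Parameter-efficient tuning sketch: bottleneck adapters on MHA and FFN outputs,
# pre-trained sub-layers frozen. Illustrative assumption, not the paper's method.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, d_model, bottleneck=16):
        super().__init__()
        self.down, self.up = nn.Linear(d_model, bottleneck), nn.Linear(bottleneck, d_model)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))   # residual bottleneck

class AdaptedLayer(nn.Module):
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.mha = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                                 nn.Linear(4 * d_model, d_model))
        self.mha_adapter, self.ffn_adapter = Adapter(d_model), Adapter(d_model)
        # Freeze the "pre-trained" sub-layers; only the adapters remain trainable.
        for p in list(self.mha.parameters()) + list(self.ffn.parameters()):
            p.requires_grad = False

    def forward(self, x):
        attn_out, _ = self.mha(x, x, x)
        x = x + self.mha_adapter(attn_out)
        return x + self.ffn_adapter(self.ffn(x))

layer = AdaptedLayer()
print(layer(torch.randn(2, 10, 256)).shape)   # torch.Size([2, 10, 256])
```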
Existing High Efficiency Video Coding (HEVC) selective encryption algorithms only consider the encoding characteristics of syntax elements to maintain format compliance but ignore the semantic features of the video content, which may lead to unnecessary computational and bit-rate costs. To tackle this problem, we present a content-aware tunable selective encryption (CATSE) scheme for HEVC. First, a deep hashing network is adopted to retrieve groups of pictures (GOPs) containing sensitive objects.
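The skeleton below is an illustrative assumption, not the CATSE algorithm: step 1 mimics deep-hashing retrieval by comparing binary GOP hash codes against a query code for a sensitive object, and step 2 encrypts only the bytes of the selected GOPs that are marked as encryptable, leaving the remaining syntax untouched so the bitstream structure is preserved.

```python
# Content-aware selective encryption skeleton (assumptions throughout).
import os

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def select_sensitive_gops(gop_hashes, query_hash, radius=2):
    # gop_hashes: {gop_id: binary code from a (hypothetical) deep hashing network}
    return [g for g, h in gop_hashes.items() if hamming(h, query_hash) <= radius]

def encrypt_selected(gop_bytes, encryptable_mask, key=None):
    # XOR the encryptable positions with a keystream; positions outside the
    # mask (headers, non-encryptable syntax) are copied unchanged.
    key = key or os.urandom(len(gop_bytes))
    return bytes(b ^ k if m else b for b, k, m in zip(gop_bytes, key, encryptable_mask))

gop_hashes = {0: "1010", 1: "0110", 2: "1011"}
print(select_sensitive_gops(gop_hashes, query_hash="1010", radius=1))  # [0, 2]
data = b"\x01\x02\x03\x04"
mask = [False, True, True, False]   # pretend only bytes 1-2 are encryptable
print(encrypt_selected(data, mask).hex())
```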
Image set compression (ISC) refers to compressing sets of semantically similar images. Traditional ISC methods typically aim to eliminate redundancy among images in either the signal or the frequency domain, but they often struggle to handle complex geometric deformations across different images effectively.
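A toy sketch of the classical signal-domain idea (an assumption for illustration, not the paper's method): pick one image as a reference and store only residuals for the other, semantically similar images. This removes shared redundancy but, as the abstract notes, cannot account for large geometric deformations.

```python
# Reference-plus-residual coding for an image set (illustrative only).
import numpy as np

def encode_set(images):
    ref = images[0]
    residuals = [img.astype(np.int16) - ref.astype(np.int16) for img in images[1:]]
    return ref, residuals

def decode_set(ref, residuals):
    return [ref] + [(ref.astype(np.int16) + r).astype(np.uint8) for r in residuals]

imgs = [np.random.randint(0, 256, (8, 8), dtype=np.uint8) for _ in range(3)]
ref, res = encode_set(imgs)
rec = decode_set(ref, res)
print(all(np.array_equal(a, b) for a, b in zip(imgs, rec)))   # True (lossless)
```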
Explanatory Visual Question Answering (EVQA) is a recently proposed multimodal reasoning task that consists of answering a visual question and generating multimodal explanations for the reasoning process. Unlike the traditional Visual Question Answering (VQA) task, which only aims at predicting answers to visual questions, EVQA also aims to generate user-friendly explanations to improve the explainability and credibility of reasoning models.
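To make the task setting concrete, here is a minimal two-head sketch (illustrative assumptions: pre-extracted question/image features, a fixed answer vocabulary, and a GRU decoder for the textual part of the explanation; this is not a model from the abstract): one head predicts the answer, the other generates explanation tokens.

```python
# Minimal EVQA-style model: an answer classifier plus an explanation decoder.
import torch
import torch.nn as nn

class EVQAModel(nn.Module):
    def __init__(self, q_dim=512, v_dim=512, hidden=512, n_answers=1000, vocab=5000):
        super().__init__()
        self.fuse = nn.Linear(q_dim + v_dim, hidden)
        self.answer_head = nn.Linear(hidden, n_answers)           # predicts the answer
        self.embed = nn.Embedding(vocab, hidden)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)   # decodes explanation tokens
        self.word_head = nn.Linear(hidden, vocab)

    def forward(self, q_feat, v_feat, expl_tokens):
        h = torch.relu(self.fuse(torch.cat([q_feat, v_feat], dim=-1)))
        answer_logits = self.answer_head(h)
        dec_out, _ = self.decoder(self.embed(expl_tokens), h.unsqueeze(0))
        return answer_logits, self.word_head(dec_out)             # answer + explanation logits

model = EVQAModel()
q, v = torch.randn(2, 512), torch.randn(2, 512)
tokens = torch.randint(0, 5000, (2, 12))
ans, expl = model(q, v, tokens)
print(ans.shape, expl.shape)   # torch.Size([2, 1000]) torch.Size([2, 12, 5000])
```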