1. IEEE Signal Processing Magazine
2. Signal Processing Digital Library*
3. Inside Signal Processing Newsletter
4. SPS Resource Center
5. Career advancement & recognition
6. Discounts on conferences and publications
7. Professional networking
8. Communities for students, young professionals, and women
9. Volunteer opportunities
10. Coming soon! PDH/CEU credits
Click here to learn more.
Large language models(LLMs) have demonstrated increasingly powerful capabilities for reasoning tasks, especially in text. The project aims to explore and advance these capabilities in reasoning across multiple data modalities, including but not limited to text, speech and audio. The integration of multiple modalities can lead to more robust and general systems capable of understading and reasoning about the world in a more human-like manner.
Date: 19 February 2025
Time: 9:00 AM ET (New York time)
Presenter(s): Ivan Dokmanić
Date: 23 May 2025
Chapter: Kerala Chapter
Chapter Chair: Reshna Ayoob
Title: Unveil a Better Solution with the Toyota Production System
Date: 2-6 June 2025
Location: College Park, MD, USA
Most current models for analyzing multimodal sequences often disregard the imbalanced contributions of individual modal representations caused by varying information densities, as well as the inherent multi-relational interactions across distinct modalities. Consequently, a biased understanding of the intricate interplay among modalities may be fostered, limiting prediction accuracy and effectiveness.
Audio and visual signals complement each other in human speech perception, and the same applies to automatic speech recognition. The visual signal is less evident than the acoustic signal, but more robust in a complex acoustic environment, as far as speech perception is concerned.
Conventional fine-tuning encounters increasing difficulties given the size of current Pre-trained Language Models, which makes parameter-efficient tuning become the focal point of frontier research. Recent advances in this field is the unified tuning methods that aim to tune the representations of both multi-head attention (MHA) and fully connected feed-forward network (FFN) simultaneously, but they rely on existing tuning methods and do not explicitly model domain knowledge for downstream tasks.
Existing High Efficiency Video Coding (HEVC) selective encryption algorithms only consider the encoding characteristics of syntax elements to keep format compliance, but ignore the semantic features of video content, which may lead to unnecessary computational and bit rate costs. To tackle this problem, we present a content-aware tunable selective encryption (CATSE) scheme for HEVC. First, a deep hashing network is adopted to retrieve groups of pictures (GOPs) containing sensitive objects.
Image set compression (ISC) refers to compressing the sets of semantically similar images. Traditional ISC methods typically aim to eliminate redundancy among images at either signal or frequency domain, but often struggle to handle complex geometric deformations across different images effectively.
Explanatory Visual Question Answering (EVQA) is a recently proposed multimodal reasoning task consisting of answering the visual question and generating multimodal explanations for the reasoning processes. Unlike traditional Visual Question Answering (VQA) task that only aims at predicting answers for visual questions, EVQA also aims to generate user-friendly explanations to improve the explainability and credibility of reasoning models.
Existing JPEG encryption approaches pose a security risk due to the difficulty in changing all block-feature values while considering format compatibility and file size expansion. To address these concerns, this paper introduces a novel JPEG image encryption scheme. First, the security of sketch information against chosen-plaintext attacks is improved by increasing the change rate of block-feature values.
Existing JPEG encryption approaches pose a security risk due to the difficulty in changing all block-feature values while considering format compatibility and file size expansion. To address these concerns, this paper introduces a novel JPEG image encryption scheme. First, the security of sketch information against chosen-plaintext attacks is improved by increasing the change rate of block-feature values.
Deep neural networks have demonstrated considerable effectiveness in recognizing complex communications signals through their applications in the tasks of automatic modulation recognition. However, the resilience of these networks is undermined by the introduction of carefully designed adversarial examples that compromise the reliability of the decision processes.
Date: 30 January 2025 (Virtual)
Chapter: Gujarat Chapter
Chapter Chair: Mita C. Paunwala
Title: Brain-Computer Interfaces Applications and Challanges