Nonlinear acoustic echo cancellation (AEC) is a highly challenging task with a single microphone; hence, AEC with a microphone array has also been considered to reduce the residual echo more effectively. However, these algorithms track only a linear acoustic path between the loudspeaker and the microphone array.
The importance of normalizing biometric features or matching scores is understood in the multimodal biometric case, but the unimodal case has received less attention. Prior reports assess the effectiveness of normalization directly on biometric performance. We propose that this process logically comprises two independent steps: (1) methods to equalize the effect of each biometric feature on the similarity scores calculated from all the features together...
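To make the idea of equalizing each feature's effect on a similarity score concrete, here is a minimal sketch. It assumes z-score normalization and a squared Euclidean distance, neither of which is confirmed by the abstract as the authors' choice; the function names and the toy data are hypothetical.

```python
import numpy as np


def zscore_normalize(features, eps=1e-12):
    """Scale every feature column to zero mean and unit variance.

    Assumption: z-score scaling stands in for whatever equalization
    method the paper actually uses.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)


def feature_share_of_distance(features):
    """Average share of the squared Euclidean distance owed to each feature."""
    diff = features[:, None, :] - features[None, :, :]
    per_feature = (diff ** 2).sum(axis=(0, 1))
    return per_feature / per_feature.sum()


# Hypothetical toy data: two features measured on very different scales.
rng = np.random.default_rng(0)
raw = np.column_stack([
    rng.normal(0.0, 1.0, 200),      # small-scale feature
    rng.normal(0.0, 1000.0, 200),   # large-scale feature
])
normalized = zscore_normalize(raw)

raw_share = feature_share_of_distance(raw)
normalized_share = feature_share_of_distance(normalized)
```

In this toy setting the large-scale feature dominates the raw distances, while after normalization both features contribute equally, which is the kind of effect step (1) is meant to control.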
Sparse coding-based anomaly detection has shown promising performance, of which the keys are feature learning, sparse representation, and dictionary learning. In this paper, we propose a new neural network for anomaly detection (termed AnomalyNet) by deeply achieving feature learning, sparse representation, and dictionary learning in three joint neural processing blocks. Specifically, to learn better features,...
Constrained image splicing detection and localization (CISDL), which investigates two input suspected images and identifies whether one image has suspected regions pasted from the other, is a newly proposed challenging task for image forensics. In this paper, we propose a novel adversarial learning framework to learn a deep matching network for CISDL.
A spectrum auction is an effective approach to improving spectrum utilization by leasing idle spectrum from primary users to secondary users. Recently, a few differentially private spectrum auction mechanisms have been proposed but, as far as we know, none of them addresses differential privacy in the setting of double spectrum auctions.
In this paper, the achievable secrecy rate of a relay-assisted massive multiple-input multiple-output (MIMO) downlink is investigated in the presence of a multi-antenna active/passive eavesdropper. The excess degrees-of-freedom offered by a massive MIMO base-station (BS) are exploited for sending artificial noise (AN) via random and null-space precoders.
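As an illustration of null-space artificial noise, the sketch below builds an orthonormal basis for the null space of a legitimate channel matrix and checks that noise sent in that subspace does not reach the legitimate receivers. It is a simplified single-hop model under stated assumptions: the channel `H`, the dimensions `K` and `M`, and the helper `null_space_basis` are hypothetical, and the relay and eavesdropper of the paper's system model are not reproduced.

```python
import numpy as np


def null_space_basis(H, tol=1e-10):
    """Return an orthonormal basis (as columns) of the null space of H.

    H has shape (K, M): K single-antenna receivers, M transmit antennas.
    """
    _, s, vh = np.linalg.svd(H, full_matrices=True)
    rank = int((s > tol * s.max()).sum())
    # Rows of vh beyond the rank are the conjugate-transposed null-space vectors.
    return vh[rank:].conj().T


rng = np.random.default_rng(1)
K, M = 4, 32  # assumed sizes: few users, many base-station antennas

# Hypothetical i.i.d. Rayleigh-fading channel to the legitimate receivers.
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)

V = null_space_basis(H)  # shape (M, M - K) when H has full row rank

# Draw complex Gaussian noise in the M - K excess dimensions and precode it.
z = (rng.standard_normal(V.shape[1]) + 1j * rng.standard_normal(V.shape[1])) / np.sqrt(2)
artificial_noise = V @ z

# Norm of the noise as seen by the legitimate receivers (numerically zero).
leakage = np.linalg.norm(H @ artificial_noise)
```

Because `M` is much larger than `K`, the null space has `M - K` dimensions available for artificial noise; an eavesdropper whose channel differs from `H` would generally still receive that noise.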
The procedure for extracting a cryptographic key from noisy sources, such as biometrics and physical unclonable functions (PUFs), is known as a fuzzy extractor (FE). Although FE constructions deal with discrete sources, most noisy sources are continuous. In the continuous case, the source must first be transformed into a discrete one.
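One simple way to turn a continuous reading into a discrete symbol is uniform quantization, sketched below. This is only an assumed stand-in: the excerpt does not say which transformation the paper uses, and the function name, range, and bin count are hypothetical.

```python
import numpy as np


def uniform_quantize(x, low, high, n_bins):
    """Map continuous values to integer symbols 0 .. n_bins - 1.

    Values below `low` fall into bin 0 and values at or above `high`
    fall into the last bin.
    """
    x = np.asarray(x, dtype=float)
    edges = np.linspace(low, high, n_bins + 1)
    # Use interior edges only so that the result stays within 0 .. n_bins - 1.
    return np.digitize(x, edges[1:-1])


# Hypothetical enrollment reading, quantized into 4 symbols on [0, 1).
enrolled = uniform_quantize([0.10, 0.26, 0.74, 0.90], 0.0, 1.0, 4)

# A slightly noisy re-reading of the same source: the second value has
# drifted across the 0.25 bin boundary, so its symbol changes.
noisy = uniform_quantize([0.11, 0.24, 0.73, 0.91], 0.0, 1.0, 4)

n_symbol_errors = int((enrolled != noisy).sum())
```

The example shows why discretization alone is not enough: small noise near a bin boundary flips a symbol, and it is this kind of residual error that a fuzzy extractor has to tolerate.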
The aim of this paper is to present a new method for skin tumor segmentation in 3D ultrasound images. We consider a variational formulation, the energy of which combines a diffuse interface phase field model (regularization term) and a log-likelihood computed using nonparametric estimates (data attachment term).
What sparked your interest in speech and language processing?
Early on, I was amazed by the ambiguity of natural language: so many sentences can in fact be parsed and understood in different ways, and yet we can often easily communicate with each other and interpret what we hear or read with the intended semantics. Then I found out in college that speech and natural language processing is actually quite an active area of computer science.
This year's ICASSP was one of the largest that I have attended, with more than 3,000 participants. During the opening remarks, SPS President Ali H. Sayed announced that the membership fee for students has been set to $1, that there will be an open access journal for signal processing, that IEEE SPS has a formal policy statement of commitment to diversity, and that an e-learning center is being initiated, all of which are great steps toward creating an open society.