1. IEEE Signal Processing Magazine
2. Signal Processing Digital Library*
3. Inside Signal Processing Newsletter
4. SPS Resource Center
5. Career advancement & recognition
6. Discounts on conferences and publications
7. Professional networking
8. Communities for students, young professionals, and women
9. Volunteer opportunities
10. Coming soon! PDH/CEU credits
Click here to learn more.
Constrained image splicing detection and localization (CISDL), which investigates two input suspected images and identifies whether one image has suspected regions pasted from the other, is a newly proposed challenging task for image forensics. In this paper, we propose a novel adversarial learning framework to learn a deep matching network for CISDL.
The importance of normalizing biometric features or matching scores is understood in the multimodal biometric case, but there is less attention to the unimodal case. Prior reports assess the effectiveness of normalization directly on biometric performance. We propose that this process is logically comprised of two independent steps: (1) methods to equalize the effect of each biometric feature on the similarity scores calculated from all the features together...
Sparse coding-based anomaly detection has shown promising performance, of which the keys are feature learning, sparse representation, and dictionary learning. In this paper, we propose a new neural network for anomaly detection (termed AnomalyNet) by deeply achieving feature learning, sparse representation, and dictionary learning in three joint neural processing blocks. Specifically, to learn better features,...
Nonlinear acoustic echo cancellation (AEC) is a highly challenging task in a single-microphone; hence, the AEC technique with a microphone array has also been considered to more effectively reduce the residual echo. However, these algorithms track only a linear acoustic path between the loudspeaker and the microphone array.
In this paper, we present an algorithm to estimate the relative acoustic transfer function (RTF) of a target source in wireless acoustic sensor networks (WASNs). Two well-known methods to estimate the RTF are the covariance subtraction (CS) method and the covariance whitening (CW) approach, the latter based on the generalized eigenvalue decomposition.
Lexical-based metrics such as BLEU, NIST, and WER have been widely used in machine translation (MT) evaluation. However, these metrics badly represent semantic relationships and impose strict identity matching, leading to moderate correlation with human judgments. In this paper, we propose a novel MT automatic evaluation metric Semantic Travel Distance (STD) based on word embeddings. STD incorporates both semantic and lexical features (word embeddings and n -gram and word order) into one metric.
Previous studies have shown that attention mechanisms and shortest dependency paths have a positive effect on relation classification. In this paper, a keyword-attentive sentence mechanism is proposed to effectively combine the two methods. Furthermore, to effectively handle the imbalanced classification problem, this paper proposes a new loss function called the synthetic stimulation loss , which uses a modulating factor to allow the model to focus on hard-to-classify samples.
Dialogue policy plays an important role in task-oriented spoken dialogue systems. It determines how to respond to users. The recently proposed deep reinforcement learning (DRL) approaches have been used for policy optimization. However, these deep models are still challenging for two reasons: first, many DRL-based policies are not sample efficient; and second, most models do not have the capability of policy transfer between different domains.
This paper addresses the problem of multichannel online dereverberation. The proposed method is carried out in the short-time Fourier transform (STFT) domain, and for each frequency band independently. In the STFT domain, the time-domain room impulse response is approximately represented by the convolutive transfer function (CTF).
While substantial noise reduction and speech enhancement can be achieved with multiple microphones organized in an array, in some cases, such as when the microphone spacings are quite close, it can also be quite limited. This degradation can, however, be resolved by the introduction of one or more external microphones (
A novel method is proposed to construct arbitrary length perfect integer sequences based on all-pass polynomial. It uses a fact that associative sequence of an all-pass polynomial is guaranteed to be perfect. In addition, geometric series method is our special case.
Epochs are the instants of significant excitation to vocal tract system. Existing methods can extract epochs accurately from clean speech signals. However, identification of epoch locations from band-limited telephonic speech is challenging due to the attenuation of fundamental frequency component and degradation caused by channel effect.
In many applications, different populations are compared using data that are sampled in a biased manner. Under sampling biases, standard methods that estimate the difference between the population means yield unreliable inferences.
Effectively utilizing the common information in a set of video frames is a vital aspect in video segmentation. However, existing methods that transport the common information from a prior frame to the current frame do not make use of the common information effectively.
Postdoctoral Fellow in Machine Learning for Modeling Alzheimer's Disease Development