TASLPRO Featured Articles

Noise-Resilient Training Method for Face Landmark Generation From Speech

Visual cues such as lip movements, when available, play an important role in speech communication. They are especially helpful for the hearing-impaired population and in noisy environments. When visual cues are not available, a system that automatically generates talking faces in sync with input speech would enhance speech communication and enable many novel applications.

Advancing Acoustic-to-Word CTC Model With Attention and Mixed-Units

TASLP Volume 27 Issue 12

The acoustic-to-word model based on the Connectionist Temporal Classification (CTC) criterion is a natural end-to-end (E2E) system that directly targets words as output units. Two issues exist in the system. First, the current output of the CTC model relies only on the current input and does not account for context-weighted inputs. This is the hard alignment issue.

Beyond Error Propagation: Language Branching Also Affects the Accuracy of Sequence Generation

TASLP Volume 27 Issue 12

Sequence generation tasks, such as neural machine translation (NMT) and abstractive summarization, usually suffer from exposure bias as well as the error propagation problem due to the autoregressive training and generation. Many previous works have discussed the relationship between error propagation and the accuracy drop problem (i.e., the right part of the generated sentence is often worse than its left part in left-to-right decoding models).

Three-Dimensional Sound Field Reproduction Based on Weighted Mode-Matching Method

TASLP Volume 27 Issue 12

A sound field reproduction method based on the spherical wavefunction expansion of sound fields is proposed, which can be flexibly applied to various array geometries and directivities. First, we formulate sound field synthesis as the minimization of a norm of the difference between the desired and synthesized sound fields, and then the optimal driving signals are derived by using the spherical wavefunction expansion of the sound fields.
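
As a rough illustration of the minimization described in this teaser, the sketch below solves a generic weighted, regularized least-squares problem for loudspeaker driving signals. It is not the paper's method: the matrix `C` (expansion coefficients of each loudspeaker's field), the target vector `b`, the weight matrix `W`, the regularization value `reg`, and the function name are all hypothetical, and random numbers stand in for real expansion coefficients.

```python
import numpy as np

def weighted_least_squares_driving_signals(C, b, W, reg=1e-3):
    """Minimize (C d - b)^H W (C d - b) + reg * ||d||^2 over d.

    C   : (num_modes, num_speakers) hypothetical coefficient matrix
    b   : (num_modes,) hypothetical desired-field coefficients
    W   : (num_modes, num_modes) Hermitian positive-definite weights
    reg : small positive regularization constant (assumed value)
    """
    CH_W = C.conj().T @ W
    # Normal equations of the weighted, regularized problem
    A = CH_W @ C + reg * np.eye(C.shape[1])
    return np.linalg.solve(A, CH_W @ b)

# Synthetic stand-in data: 16 expansion coefficients, 8 loudspeakers
rng = np.random.default_rng(0)
num_modes, num_speakers = 16, 8
C = (rng.standard_normal((num_modes, num_speakers))
     + 1j * rng.standard_normal((num_modes, num_speakers)))
d_true = (rng.standard_normal(num_speakers)
          + 1j * rng.standard_normal(num_speakers))
b = C @ d_true  # a target field that the array can reproduce exactly
W = np.diag(np.linspace(1.0, 0.1, num_modes))  # illustrative mode weights

d = weighted_least_squares_driving_signals(C, b, W, reg=1e-9)
```

Because the synthetic target lies in the range of `C` and the regularization is tiny, the recovered `d` should match `d_true` closely. With real data, the choice of weights and regularization would determine how reproduction error is traded against driving-signal power.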

Discriminative Neural Embedding Learning for Short-Duration Text-Independent Speaker Verification

TASLP Volume 27 Issue 11

Short-duration text-independent speaker verification has remained an active research topic in recent years, and deep-neural-network-based embeddings have shown impressive results in such conditions. Good speaker embeddings must exhibit both small intra-class variation and large inter-class difference, which are critical for discrimination and generalization.

Speech Emotion Classification Using Attention-Based LSTM

TASLP Volume 27 Issue 11

Automatic speech emotion recognition has been a research hotspot in the field of human-computer interaction over the past decade. However, due to the lack of research on the inherent temporal relationship of the speech waveform, the current recognition accuracy needs improvement.

Effective Subword Segmentation for Text Comprehension

TASLP Volume 27 Issue 11

Representation learning is the foundation of machine reading comprehension and inference. In state-of-the-art models, character-level representations have been broadly adopted to alleviate the problem of effectively representing rare or complex words. However, a character is not a natural minimal linguistic unit for representation or for composing word embeddings, because it ignores the linguistic coherence of consecutive characters inside a word.

Adversarial Learning for Constrained Image Splicing Detection and Localization Based on Atrous Convolution

TIFS Volume 14 Issue 10

Constrained image splicing detection and localization (CISDL), which investigates two input suspected images and identifies whether one image has suspected regions pasted from the other, is a newly proposed challenging task for image forensics. In this paper, we propose a novel adversarial learning framework to learn a deep matching network for CISDL.

AnomalyNet: An Anomaly Detection Network for Video Surveillance

TIFS Volume 14 Issue 10

Sparse coding-based anomaly detection has shown promising performance; its key components are feature learning, sparse representation, and dictionary learning. In this paper, we propose a new neural network for anomaly detection (termed AnomalyNet) that performs feature learning, sparse representation, and dictionary learning in three joint neural processing blocks. Specifically, to learn better features,...

Assessment of the Effectiveness of Seven Biometric Feature Normalization Techniques

TIFS Volume 14 Issue 10

The importance of normalizing biometric features or matching scores is understood in the multimodal biometric case, but the unimodal case has received less attention. Prior reports assess the effectiveness of normalization directly on biometric performance. We propose that this process is logically composed of two independent steps: (1) methods to equalize the effect of each biometric feature on the similarity scores calculated from all the features together...
