IEEE TASLP Article

The acoustic-to-word model based on the Connectionist Temporal Classification (CTC) criterion is a natural end-to-end (E2E) system that directly targets words as output units. Two issues exist in the system. First, the current output of the CTC model relies on the current input alone and does not account for context-weighted inputs. This is the hard alignment issue.

Beyond Error Propagation: Language Branching Also Affects the Accuracy of Sequence Generation

TASLP Volume 27 Issue 12

TASLPRO Featured Articles

Sequence generation tasks, such as neural machine translation (NMT) and abstractive summarization, usually suffer from exposure bias as well as the error propagation problem due to the autoregressive training and generation. Many previous works have discussed the relationship between error propagation and the accuracy drop problem (i.e., the right part of the generated sentence is often worse than its left part in left-to-right decoding models).

Three-Dimensional Sound Field Reproduction Based on Weighted Mode-Matching Method

TASLP Volume 27 Issue 12

TASLPRO Featured Articles

A sound field reproduction method based on the spherical wavefunction expansion of sound fields is proposed, which can be flexibly applied to various array geometries and directivities. First, we formulate sound field synthesis as a minimization problem of some norm on the difference between the desired and synthesized sound fields, and then the optimal driving signals are derived by using the spherical wavefunction expansion of the sound fields.

Discriminative Neural Embedding Learning for Short-Duration Text-Independent Speaker Verification

TASLP Volume 27 Issue 11

TASLPRO Featured Articles

Short-duration text-independent speaker verification has remained a hot research topic in recent years, and deep neural network based embeddings have shown impressive results in such conditions. Good speaker embeddings require both small intra-class variation and large inter-class difference, which are critical for discrimination and generalization.

Speech Emotion Classification Using Attention-Based LSTM

TASLP Volume 27 Issue 11

TASLPRO Featured Articles

Automatic speech emotion recognition has been a research hotspot in the field of human-computer interaction over the past decade. However, due to the lack of research on the inherent temporal relationship of the speech waveform, the current recognition accuracy needs improvement.

Effective Subword Segmentation for Text Comprehension

TASLP Volume 27 Issue 11

TASLPRO Featured Articles

Representation learning is the foundation of machine reading comprehension and inference. In state-of-the-art models, character-level representations have been broadly adopted to alleviate the problem of effectively representing rare or complex words. However, the character itself is not a natural minimal linguistic unit for representation or for composing word embeddings, because it ignores the linguistic coherence of consecutive characters inside a word.

State-Space Microphone Array Nonlinear Acoustic Echo Cancellation Using Multi-Microphone Near-End Speech Covariance

TASLP Volume 27 Issue 10

TASLPRO Featured Articles

Nonlinear acoustic echo cancellation (AEC) is a highly challenging task in a single-microphone setup; hence, AEC with a microphone array has also been considered to reduce the residual echo more effectively. However, these algorithms track only a linear acoustic path between the loudspeaker and the microphone array.

Relative Acoustic Transfer Function Estimation in Wireless Acoustic Sensor Networks

TASLP Volume 27 Issue 10

TASLPRO Featured Articles

In this paper, we present an algorithm to estimate the relative acoustic transfer function (RTF) of a target source in wireless acoustic sensor networks (WASNs). Two well-known methods to estimate the RTF are the covariance subtraction (CS) method and the covariance whitening (CW) approach, the latter based on the generalized eigenvalue decomposition.
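As a rough illustration of the two baseline estimators named in this abstract, the sketch below computes an RTF from a noisy-speech covariance matrix and a noise covariance matrix. It is not the algorithm proposed in the article: the function names, the `ref` reference-microphone argument, and the assumptions of a single target source and perfectly known covariance matrices are all hypothetical choices made for this example. The whitening variant uses a Cholesky factor of the noise covariance, which is one common way to carry out the generalized eigenvalue decomposition.

```python
import numpy as np


def rtf_covariance_subtraction(R_y, R_n, ref=0):
    """Covariance subtraction (CS): RTF from one column of R_y - R_n.

    Assumes R_y = R_x + R_n with a rank-1 speech covariance R_x.
    """
    R_x = R_y - R_n              # estimated speech covariance
    col = R_x[:, ref]            # column of the reference microphone
    return col / col[ref]        # normalize so the reference entry is 1


def rtf_covariance_whitening(R_y, R_n, ref=0):
    """Covariance whitening (CW): RTF from the principal eigenvector
    of the noise-whitened covariance, then de-whitened."""
    L = np.linalg.cholesky(R_n)              # R_n = L L^H
    L_inv = np.linalg.inv(L)
    R_w = L_inv @ R_y @ L_inv.conj().T       # whitened noisy covariance
    _, eigvecs = np.linalg.eigh(R_w)         # eigenvalues in ascending order
    u = eigvecs[:, -1]                       # principal eigenvector
    h = L @ u                                # de-whiten
    return h / h[ref]                        # normalize to the reference mic
```

With exact covariance matrices both functions return the same vector; with matrices estimated from finite data they generally differ, which is why the two methods are usually treated as distinct estimators.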

STD: An Automatic Evaluation Metric for Machine Translation Based on Word Embeddings

TASLP Volume 27 Issue 10

TASLPRO Featured Articles

Lexical-based metrics such as BLEU, NIST, and WER have been widely used in machine translation (MT) evaluation. However, these metrics represent semantic relationships poorly and impose strict identity matching, leading to only moderate correlation with human judgments. In this paper, we propose a novel automatic MT evaluation metric, Semantic Travel Distance (STD), based on word embeddings. STD incorporates both semantic and lexical features (word embeddings, n-grams, and word order) into one metric.

Relation Classification via Keyword-Attentive Sentence Mechanism and Synthetic Stimulation Loss

TASLP Volume 27 Issue 9

TASLPRO Featured Articles

Previous studies have shown that attention mechanisms and shortest dependency paths have a positive effect on relation classification. In this paper, a keyword-attentive sentence mechanism is proposed to effectively combine the two methods. Furthermore, to effectively handle the imbalanced classification problem, this paper proposes a new loss function called the synthetic stimulation loss, which uses a modulating factor to allow the model to focus on hard-to-classify samples.
