TASLP Volume 31 | 2023

Deep Learning-Based Non-Intrusive Multi-Objective Speech Assessment Model With Cross-Domain Features

TASLPRO Articles

This study proposes a cross-domain multi-objective speech assessment model, called MOSA-Net, which can simultaneously estimate the speech quality, intelligibility, and distortion assessment scores of an input speech signal. MOSA-Net comprises a convolutional neural network and bidirectional long short-term memory architecture for representation extraction, and a multiplicative attention layer and a fully connected layer for each assessment metric prediction. Additionally, cross-domain features (spectral and time-domain features) and latent representations from self-supervised learned (SSL) models are used as inputs to combine rich acoustic information to obtain more accurate assessments.
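The per-metric output stage described above (one multiplicative attention layer and one fully connected layer per assessment score) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the CNN-BLSTM front end is replaced by random stand-in features, the weights are untrained, the class name `MetricHead`, the feature dimension, and the averaging of frame scores into one utterance score are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class MetricHead:
    """Hypothetical task head: multiplicative self-attention, then a linear layer."""

    def __init__(self, dim, rng):
        # Untrained random weights; a real model would learn these.
        self.W_att = rng.standard_normal((dim, dim)) / np.sqrt(dim)
        self.w_out = rng.standard_normal(dim) / np.sqrt(dim)
        self.b_out = 0.0

    def __call__(self, frames):
        # frames: (T, dim) shared frame-level representation.
        scores = frames @ self.W_att @ frames.T          # (T, T) multiplicative scores
        weights = softmax(scores, axis=-1)               # each row sums to 1
        context = weights @ frames                       # (T, dim) attended features
        frame_scores = context @ self.w_out + self.b_out # (T,) one score per frame
        # Assumption: the utterance score is the mean of the frame scores.
        return float(frame_scores.mean())


def assess(frames, heads):
    """Run every metric head on the same shared representation."""
    return {name: head(frames) for name, head in heads.items()}


rng = np.random.default_rng(0)
dim = 16
heads = {
    name: MetricHead(dim, rng)
    for name in ("quality", "intelligibility", "distortion")
}
frames = rng.standard_normal((50, dim))  # stand-in for the CNN-BLSTM output
scores = assess(frames, heads)
print(scores)
```

The point of the design is that all three heads read the same shared representation, so a single forward pass yields all three estimates.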

A Diffeomorphic Flow-Based Variational Framework for Multi-Speaker Emotion Conversion

This paper introduces a new framework for non-parallel emotion conversion in speech. Our framework is based on two key contributions. First, we propose a stochastic version of the popular Cycle-GAN model. Our modified loss function introduces a Kullback–Leibler (KL) divergence term that aligns the source and target data distributions learned by the generators, thus overcoming the limitations of sample-wise generation. By using a variational approximation to this stochastic loss function, we show that our KL divergence term can be implemented via a paired density discriminator.
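For readers unfamiliar with the KL divergence term mentioned above, the following sketch computes it for two discrete distributions. It only illustrates the quantity itself; the paper's variational approximation via a paired density discriminator is not reproduced here, and the function name `kl_divergence` and the example distributions are assumptions.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Normalise so that both inputs sum to 1.
    p = p / p.sum()
    q = q / q.sum()
    # Terms with p == 0 contribute nothing; eps guards against division by zero in q.
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))


source = [0.5, 0.5]
target = [0.9, 0.1]
print(kl_divergence(source, target))  # divergence of source from target
print(kl_divergence(target, source))  # differs: KL is not symmetric
```

The divergence is zero only when the two distributions match, which is why it is usable as a loss term for aligning the distributions learned by the generators.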

Integrating Lattice-Free MMI Into End-to-End Speech Recognition

In automatic speech recognition (ASR) research, discriminative criteria have achieved superior performance in DNN-HMM systems. Given this success, the adoption of discriminative criteria is promising to boost the performance of end-to-end (E2E) ASR systems. With this motivation, previous works have introduced the minimum Bayesian risk (MBR, one of the discriminative criteria) into E2E ASR systems. However, the effectiveness and efficiency of the MBR-based methods are compromised: the MBR criterion is only used in system training, which creates a mismatch between training and decoding;

Decoupling Speaker-Independent Emotions for Voice Conversion via Source-Filter Networks

Emotional voice conversion (VC) aims to convert a neutral voice to an emotional one while retaining the linguistic information and speaker identity. We note that decoupling emotional features from other speech information (such as content and speaker identity) is the key to achieving promising performance. Some recent attempts at speech representation decoupling on neutral speech do not work well on emotional speech, because acoustic properties are more complexly entangled in the latter.

Clean vs. Overlapped Speech-Music Detection Using Harmonic-Percussive Features and Multi-Task Learning

Detection of speech and music signals in isolated and overlapped conditions is an essential preprocessing step for many audio applications. Speech signals have wavy and continuous harmonics, while music signals exhibit horizontally linear and discontinuous harmonic patterns. Music signals also contain more percussive components than speech signals, manifested as vertical striations in the spectrograms.
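The horizontal-versus-vertical distinction described above can be made concrete with median-filtering harmonic-percussive separation. This is a well-known generic technique, swapped in here for illustration; the abstract does not say that the paper's features are computed this way. The function names, the filter length, and the toy spectrogram are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter_1d(spec, size, axis):
    """Median-filter a 2-D array along one axis (size must be odd)."""
    pad = size // 2
    pad_width = [(0, 0), (0, 0)]
    pad_width[axis] = (pad, pad)
    padded = np.pad(spec, pad_width, mode="edge")
    # The windows land on a new trailing axis; the median collapses it again.
    windows = sliding_window_view(padded, size, axis=axis)
    return np.median(windows, axis=-1)


def hpss_masks(spec, size=9):
    """Binary harmonic/percussive masks for a (frequency, time) magnitude spectrogram."""
    # Filtering along time keeps horizontal ridges (sustained harmonics).
    harmonic = median_filter_1d(spec, size, axis=1)
    # Filtering along frequency keeps vertical striations (percussive onsets).
    percussive = median_filter_1d(spec, size, axis=0)
    harmonic_mask = harmonic >= percussive  # ties are assigned to the harmonic mask
    return harmonic_mask, ~harmonic_mask


# Toy spectrogram: one sustained tone (row 10) and one click (column 25).
spec = np.zeros((40, 40))
spec[10, :] = 1.0
spec[:, 25] = 1.0
harmonic_mask, percussive_mask = hpss_masks(spec)
print(harmonic_mask[10, 5], percussive_mask[30, 25])
```

On the toy input, a bin on the sustained tone falls in the harmonic mask and a bin on the click falls in the percussive mask, mirroring the horizontal and vertical patterns seen in real spectrograms.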
