GlotNet—A Raw Waveform Model for the Glottal Excitation in Statistical Parametric Speech Synthesis

TASLP Volume 27 Issue 6

TASLPRO Featured Articles

By: Lauri Juvela, Bajibabu Bollepalli, Vassilis Tsiaras, Paavo Alku

Recently, generative neural network models that operate directly on raw audio, such as WaveNet, have improved the state of the art in text-to-speech synthesis (TTS). Moreover, there is increasing interest in using these models as statistical vocoders for generating speech waveforms from various acoustic features. However, there is also a need to reduce the model complexity without compromising the synthesis quality. Previously, glottal pulseforms (i.e., time-domain waveforms corresponding to the source of the human voice production mechanism) have been successfully synthesized in TTS by glottal vocoders using straightforward deep feedforward neural networks. Therefore, it is natural to extend the glottal waveform modeling domain to use the more powerful WaveNet-like architecture. Furthermore, due to their inherent simplicity, glottal excitation waveforms permit scaling down the waveform generator architecture. In this study, we present a raw waveform glottal excitation model, called GlotNet, and compare its performance with the corresponding direct speech waveform model, WaveNet, using equivalent architectures. The models are evaluated as part of a statistical parametric TTS system. Listening test results show that both approaches are rated highly in voice similarity to the target speaker and obtain similar quality ratings with large models. Furthermore, when the model size is reduced, the quality degradation is less severe for GlotNet.
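The difference between modeling the excitation and modeling the speech waveform directly can be illustrated with a toy source-filter round trip. The sketch below is not the authors' implementation: it assumes a fixed all-pole vocal tract filter with made-up coefficients `a`, and uses a plain pulse train as a stand-in for a glottal excitation. The function names are hypothetical. It shows only that inverse filtering speech yields the excitation signal, and that filtering the excitation again reproduces the speech, which is the signal path an excitation model would sit in.

```python
# Toy source-filter round trip in pure Python (illustrative assumptions only).

def synthesis_filter(excitation, a):
    """All-pole filter: s[n] = e[n] - sum_k a[k] * s[n-k], k = 1..len(a)."""
    out = []
    for n in range(len(excitation)):
        acc = excitation[n]
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * out[n - k]
        out.append(acc)
    return out

def inverse_filter(speech, a):
    """Matching FIR inverse: e[n] = s[n] + sum_k a[k] * s[n-k]."""
    out = []
    for n in range(len(speech)):
        acc = speech[n]
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc += ak * speech[n - k]
        out.append(acc)
    return out

# Hypothetical filter coefficients (stable: poles have magnitude sqrt(0.5)).
a = [-0.9, 0.5]

# Stand-in excitation: one unit pulse every 20 samples.
excitation = [1.0 if n % 20 == 0 else 0.0 for n in range(100)]

speech = synthesis_filter(excitation, a)   # excitation -> "speech"
recovered = inverse_filter(speech, a)      # "speech" -> excitation

max_err = max(abs(r - e) for r, e in zip(recovered, excitation))
print("round-trip error below tolerance:", max_err < 1e-9)
```

In a glottal vocoder setting, a waveform generator would produce the excitation sample by sample and a filter stage like `synthesis_filter` would turn it into speech; in the direct approach, the generator outputs the speech samples themselves.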
