Noise-Resilient Training Method for Face Landmark Generation From Speech

Top Reasons to Join SPS Today!

1. IEEE Signal Processing Magazine
2. Signal Processing Digital Library*
3. Inside Signal Processing Newsletter
4. SPS Resource Center
5. Career advancement & recognition
6. Discounts on conferences and publications
7. Professional networking
8. Communities for students, young professionals, and women
9. Volunteer opportunities
10. Coming soon! PDH/CEU credits
Click here to learn more.

TASLP Volume 28 | 2020

Noise-Resilient Training Method for Face Landmark Generation From Speech

TASLPRO Featured Articles

By:

Sefik Emre Eskimez; Ross K. Maddox; Chenliang Xu; Zhiyao Duan

Visual cues such as lip movements, when available, play an important role in speech communication. They are especially helpful for the hearing impaired population or in noisy environments. When not available, having a system to automatically generate talking faces in sync with input speech would enhance speech communication and enable many novel applications. In this article, we present a new system that can generate 3D talking face landmarks from speech in an online fashion. We employ a neural network that accepts the raw waveform as an input. The network contains convolutional layers with 1D kernels and outputs the active shape model (ASM) coefficients of face landmarks. To promote smoother transitions between video frames, we present a variant of the model that has the same architecture but also accepts the previous frame's ASM coefficients as an additional input. To cope with background noise, we propose a new training method to incorporate speech enhancement ideas at the feature level. Objective evaluations on landmark prediction show that the proposed system yields statistically significantly smaller errors than two state-of-the-art baseline methods on both a single-speaker dataset and a multi-speaker dataset. Experiments on noisy speech input with five types of non-stationary unseen noise show statistically significant improvements of the system performance thanks to the noise-resilient training method. Finally, subjective evaluations show that the generated talking faces have a significantly more convincing match with the input audio, achieving a similarly convincing level of realism as the ground-truth landmarks.

Read on IEEE Xplore

Tags:

IEEE TASLP Article

SPS Social Media

IEEE SPS Facebook Page https://www.facebook.com/ieeeSPS
IEEE SPS X Page https://x.com/IEEEsps
IEEE SPS Instagram Page https://www.instagram.com/ieeesps/?hl=en
IEEE SPS LinkedIn Page https://www.linkedin.com/company/ieeesps/
IEEE SPS YouTube Channel https://www.youtube.com/ieeeSPS

IEEE SPS Educational Resources

IEEE SPS Resource Center

IEEE SPS YouTube Channel

© Copyright 2025 IEEE - All rights reserved. Use of this website signifies your agreement to the IEEE Terms and Conditions.
A public charity, IEEE is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity.

TMM.png

New Editor-in-Chief (EIC) of the IEEE Transactions on Multimedia (T-MM)

ICASSP 2026 Blog Header.png

(ICASSP 2026) 2026 IEEE International Conference on Acoustics, Speech, and Signal Processing

mentor_help_general_3.jpg

Call for Mentors: 2025 IEEE SPS SigMA Program - Signal Processing Mentorship Academy

What is Signal Processing?

Popular Pages

Today's:

All time:

Last viewed:

Noise-Resilient Training Method for Face Landmark Generation From Speech

Publications & Resources

For Authors

TMM.png

mentor_help_general_3.jpg

general_get_involved_tc_article_full.jpg

Top Reasons to Join SPS Today!

Noise-Resilient Training Method for Face Landmark Generation From Speech

SPS Social Media

IEEE SPS Educational Resources

What is Signal Processing?

Popular Pages

Today's:

All time:

Last viewed:

Noise-Resilient Training Method for Face Landmark Generation From Speech

Search form

You are here

Publications & Resources

For Authors

Top Reasons to Join SPS Today!

Noise-Resilient Training Method for Face Landmark Generation From Speech

SPS Social Media

IEEE SPS Educational Resources