SPS Webinar: 25 October 2022, An Overview of Voice Conversion and Its Challenges: From Statistical Modeling to Deep Learning

News and Resources for Members of the IEEE Signal Processing Society

October 2022

Education & Resources

Upcoming SPS Webinar!

Title: An Overview of Voice Conversion and Its Challenges: From Statistical Modeling to Deep Learning
Date: 25 October 2022
Time: 10:00 AM Eastern (New York time)
Duration: Approximately 1 Hour
Presenters: Dr. Berrak Sisman, Dr. Simon King, Dr. Junichi Yamagishi, Dr. Haizhou Li

Based on the IEEE Xplore® article: An Overview of Voice Conversion and Its Challenges: From Statistical Modeling to Deep Learning
Published: IEEE/ACM Transactions on Audio, Speech, and Language Processing, November 2020, available in IEEE Xplore®

Download article: The original article is available for download.

Abstract:

Voice conversion (VC) is a significant aspect of artificial intelligence. It is the study of how to convert one’s voice to sound like that of another without changing the linguistic content. Voice conversion belongs to the general technical field of speech synthesis, which converts text to speech or changes the properties of speech, for example, voice identity, emotion, and accent. Voice conversion involves multiple speech processing techniques, such as speech analysis, spectral conversion, prosody conversion, speaker characterization, and vocoding. With recent advances in theory and practice, we are now able to produce human-like voice quality with high speaker similarity. In this talk, we provide a comprehensive overview of state-of-the-art voice conversion techniques and their performance evaluation methods, from statistical approaches to deep learning, and discuss their promise and limitations. We will also present the recent Voice Conversion Challenges (VCC) and the performance of the current state of technology, and provide a summary of the available resources for voice conversion research.

Biography:

Dr. Berrak Sisman (Member, IEEE) received the Ph.D. degree in electrical and computer engineering from the National University of Singapore in 2020, fully funded by the A*STAR Graduate Academy under the Singapore International Graduate Award (SINGA).

She is currently a tenure-track Assistant Professor in the Department of Electrical and Computer Engineering of the Erik Jonsson School at the University of Texas at Dallas, United States. Prior to joining UT Dallas, she was a faculty member at the Singapore University of Technology and Design (2020-2022). She was a Postdoctoral Research Fellow at the National University of Singapore (2019-2020). She was an exchange doctoral student at the University of Edinburgh and a visiting scholar at the Centre for Speech Technology Research (CSTR), University of Edinburgh (2019). She was a visiting researcher at the RIKEN Advanced Intelligence Project in Japan (2018). Her research focuses on machine learning, signal processing, emotion, speech synthesis, and voice conversion.

Dr. Sisman has served as an Area Chair at INTERSPEECH 2021, INTERSPEECH 2022, and IEEE SLT 2022, and as the Publication Chair at ICASSP 2022. She has been elected as a member of the IEEE Speech and Language Processing Technical Committee (SLTC) in the area of Speech Synthesis for the term from January 2022 to December 2024. She plays leadership roles in conference organization and is active in technical committees. She has served as the General Coordinator of the Student Advisory Committee (SAC) of the International Speech Communication Association (ISCA).

Dr. Simon King (Fellow, IEEE) received the M.A. (Cantab) and M.Phil. degrees from the University of Cambridge, Cambridge, U.K., and the Ph.D. degree from the University of Edinburgh, Edinburgh, U.K.

Since 1993, he has been with the Centre for Speech Technology Research, University of Edinburgh, where he is currently Professor of Speech Processing and the Director of the Centre. His research interests include speech synthesis, recognition, and signal processing, and he has approximately 230 publications across these areas.

Prof. King has served on the ISCA SynSIG Board and currently co-organizes the Blizzard Challenge. He has previously served on the IEEE SLTC and as an Associate Editor of the IEEE/ACM Transactions on Audio, Speech, and Language Processing, and is currently an Associate Editor of Computer Speech and Language.

Dr. Junichi Yamagishi (SM'13) received the Ph.D. degree from the Tokyo Institute of Technology (Tokyo Tech), Tokyo, Japan, in 2006.

From 2007 to 2013, he was a research fellow in the Centre for Speech Technology Research (CSTR) at the University of Edinburgh, UK. He was appointed Associate Professor at the National Institute of Informatics, Japan, in 2013. He is currently a Professor at the National Institute of Informatics (NII), Japan. His research topics include speech processing, machine learning, signal processing, biometrics, digital media cloning, and media forensics.

Dr. Yamagishi previously served as a co-organizer of the bi-annual ASVspoof Challenge and the bi-annual Voice Conversion Challenge. He also served as a member of the IEEE Speech and Language Technical Committee (2013-2019), an Associate Editor of the IEEE/ACM Transactions on Audio, Speech, and Language Processing (2014-2017), and a chairperson of ISCA SynSIG (2017-2021). He is currently a PI of the JST-CREST and ANR-supported VoicePersona project and a Senior Area Editor of the IEEE/ACM Transactions on Audio, Speech, and Language Processing.

Dr. Haizhou Li (Fellow, IEEE) received the B.Sc., M.Sc., and Ph.D. degrees in electrical and electronic engineering from South China University of Technology, Guangzhou, China, in 1984, 1987, and 1990, respectively.

He is currently a Professor at the School of Data Science, the Chinese University of Hong Kong, Shenzhen, China, and the Department of Electrical and Computer Engineering, National University of Singapore (NUS). His research interests include automatic speech recognition, speaker and language recognition, and natural language processing. Prior to joining NUS, he taught at the University of Hong Kong (1988-1990) and South China University of Technology (1990-1994). He was a Visiting Professor at CRIN in France (1994-1995), Research Manager at the Apple-ISS Research Centre (1996-1998), Research Director in Lernout and Hauspie Asia Pacific (1999-2001), Vice President in InfoTalk Corporation, Ltd. (2001-2003), and the Principal Scientist and Department Head of Human Language Technology in the Institute for Infocomm Research, Singapore (2003-2016).

Dr. Li served as the Editor-in-Chief of the IEEE/ACM Transactions on Audio, Speech, and Language Processing (2015-2018) and as a Member of the Editorial Board of Computer Speech and Language (2012-2018). He was an elected Member of the IEEE Speech and Language Processing Technical Committee (2013-2015), the President of the International Speech Communication Association (2015-2017), the President of the Asia Pacific Signal and Information Processing Association (2015-2016), and the President of the Asian Federation of Natural Language Processing (2017-2018). He was the General Chair of ACL 2012, INTERSPEECH 2014, and ASRU 2019. Dr. Li is a Fellow of the IEEE and the ISCA. He was a recipient of the National Infocomm Award 2002 and the President’s Technology Award 2013 in Singapore. He was named one of the two Nokia Visiting Professors in 2009 by the Nokia Foundation, University of Bremen Excellence Chair Professor in 2019, and Fellow of the Academy of Engineering Singapore in 2022.

