Associated SPS Event: IEEE ICASSP 2021 Grand Challenge
Text-to-speech (TTS), or speech synthesis, has improved significantly with the help of deep learning. The latest advances in the end-to-end text-to-speech paradigm and in neural vocoders make it possible to produce very realistic, natural-sounding synthetic speech that approaches human parity. But this ability is still limited to ideal scenarios with a large, single-speaker, less expressive training set. Speech quality, target similarity, expressiveness and robustness are still unsatisfactory for synthetic speech with different speakers and various styles, especially in real-world low-resource conditions, e.g., when each speaker has only a few samples at hand. The current open solutions are also not robust enough to handle unseen speakers. We call this challenging task multi-speaker multi-style voice cloning (M2VoC).
Recent advances in transfer learning, style transfer, speaker embedding and factor disentanglement have shed light on potential solutions to low-resource voice cloning.
As an ICASSP 2021 Signal Processing Grand Challenge, the M2VoC challenge aims to provide a common, sizable dataset as well as a fair testbed for benchmarking the voice cloning task. We strongly encourage researchers from both academia and industry to join the challenge, discuss it in depth and collaborate. For further details, visit the website.