SPS SLTC/AASP Webinar: Simple, Training-Free Methods for SSL-Based Speech Processing

Date: 21-May-2026
Time: 9:00 AM ET (New York Time)
Presenter: Dr. Herman Kamper

Abstract

Most modern speech systems are built on models trained with self-supervised learning (SSL) on vast amounts of unlabeled speech audio. These SSL models are often integrated into even larger architectures, which are then trained with supervised objectives for tasks like speech recognition and synthesis. In this talk, the presenter argues that we may be underestimating what SSL representations already provide. With a careful understanding of their structure, many speech processing tasks can be solved without any additional training at all. He will demonstrate simple, training-free approaches for voice conversion, voice manipulation, unsupervised segmentation, and pure speech language modelling. The goal is to show that, instead of developing ever larger and more complex models, we can achieve strong results through careful analysis and thoughtful design using the representations we already have.

Biography

Dr. Herman Kamper

Herman Kamper received the B.Eng. and M.Sc.Eng. degrees in electrical and electronic engineering from Stellenbosch University, South Africa, in 2009 and 2012, respectively. He received the Ph.D. degree in informatics in 2017 from the University of Edinburgh, UK.

He is currently a Professor in Electrical and Electronic Engineering at Stellenbosch University. His group works on machine learning methods that would allow machines to acquire language autonomously, using as little supervision as possible. This is similar to the problem faced by human infants during early language learning. By trying to mimic this in machines, he hopes to gain new insights into both machine and human learning.

Dr. Kamper currently serves as an Associate Editor for the IEEE Open Journal of Signal Processing. His awards include two Google Faculty Awards and the ISCA Best Journal Paper award in Computer Speech and Language for the period 2016 to 2020.