SLTC Newsletter - Winter 2013-2014 | IEEE Signal Processing Society

December 2013

The Winter 2013-2014 edition of the IEEE Speech and Language Processing Technical Committee’s Newsletter is now online. It includes announcements from the TC chair, as well as a number of articles collated by the editorial board. Subscribe to the newsletter to be automatically notified of new editions. We believe the newsletter is an ideal forum for updates, reports, announcements and editorials, and encourage interested individuals to send us their contributions.

Dilek Hakkani-Tür, Editor-in-chief; William Campbell, Editor; Haizhou Li, Editor; Patrick Nguyen, Editor

ANNOUNCEMENTS

From the IEEE SLTC chair

Douglas O’Shaughnessy

ARTICLES

The SLaTE 2013 Workshop

Pierre Badin, Thomas Hueber, Gérard Bailly, Martin Russell, Helmer Strik

SLaTE 2013 was the 5th workshop organised by the ISCA Special Interest Group on Speech and Language Technology for Education. It took place from 30th August to 1st September 2013 in Grenoble, France, as a satellite workshop of Interspeech 2013. The workshop was attended by 68 participants from 20 countries. Thirty-eight submitted papers and 14 demonstrations were presented in oral and poster sessions.

The INTERSPEECH 2013 Computational Paralinguistics Challenge - A Brief Review

Björn Schuller, Stefan Steidl, Anton Batliner, Alessandro Vinciarelli, Klaus Scherer, Fabien Ringeval, Mohamed Chetouani

The INTERSPEECH 2013 Computational Paralinguistics Challenge was held in conjunction with INTERSPEECH 2013 in Lyon, France, 25-29 August 2013. This Challenge was the fifth in a series held at INTERSPEECH since 2009 as an open evaluation of speech-based speaker state and trait recognition systems. Four tasks were addressed, namely social signals (such as laughter), conflict, emotion, and autism. Sixty-five teams participated, and the baseline given by the organisers was exceeded. A new reference feature set produced by the openSMILE feature extractor and the four corpora used are publicly available at the repository of the series.

An Overview of the Base Period of the Babel Program

Tara N. Sainath, Brian Kingsbury, Florian Metze, Nelson Morgan, Stavros Tsakalidis

The goal of the Babel program is to rapidly develop speech recognition capability for keyword search in previously unstudied languages, working with speech recorded in a variety of conditions with limited amounts of transcription. Several issues and observations frame the challenges driving the Babel program. The speech recognition community has spent years improving the performance of English automatic speech recognition (ASR) systems. However, applying techniques commonly used for English ASR to other languages has often resulted in huge performance gaps for those languages. In addition, there is an increasing number of languages for which there is a vital need for speech recognition technology but few existing training resources. It is easy to envision a situation where a large amount of recorded data in a language contains important information, but there are very few people to analyze the language and no existing speech recognition technologies. Having keyword search in that language to pick out important phrases would be extremely beneficial.

MSR Identity Toolbox v1.0: A MATLAB Toolbox for Speaker-Recognition Research

Seyed Omid Sadjadi, Malcolm Slaney, and Larry Heck

We are happy to announce the release of the MSR Identity Toolbox: A MATLAB toolbox for speaker-recognition research. This toolbox contains a collection of MATLAB tools and routines that can be used for research and development in speaker recognition. It provides researchers with a test bed for developing new front-end and back-end techniques, allowing replicable evaluation of new advancements. It will also help newcomers in the field by lowering the "barrier to entry," enabling them to quickly build baseline systems for their experiments. Although the focus of this toolbox is on speaker recognition, it can also be used for other speech-related applications such as language, dialect, and accent identification. Additionally, it provides many of the functionalities available in other open-source speaker recognition toolkits (e.g., ALIZE) but with a simpler design, which makes it easier for users to understand and modify the algorithms.

The REAL Challenge

Maxine Eskenazi

The Dialog Research Center at Carnegie Mellon (DialRC) is organizing the REAL Challenge. The goal of the REAL Challenge (dialrc.org/realchallenge) is to build speech systems that are used regularly by real users to accomplish real tasks. These systems will give the speech and spoken dialog communities steady streams of research data as well as platforms they can use to carry out studies. The Challenge will engage both seasoned researchers and high school and undergraduate students in an effort to find the next great speech applications.

SPASR workshop brings together speech production and its use in speech technologies

Karen Livescu

The Workshop on Speech Production in Automatic Speech Recognition (SPASR) was recently held in Lyon on August 30 as a satellite workshop of Interspeech 2013.

Speaker Identification: Screaming, Stress and Non-Neutral Speech, is there speaker content?

John H.L. Hansen, Navid Shokouhi

The field of speaker recognition has evolved significantly over the past twenty years, with great efforts worldwide from many groups, laboratories, and universities, especially those participating in the biennial U.S. NIST Speaker Recognition Evaluation (SRE). Recently, there has been great interest in performing effective speaker identification when speech is not produced in "neutral" conditions. Effective speaker recognition requires knowledge and careful signal processing and modeling strategies to address any mismatch between the training and testing conditions. This article considers some past and recent efforts in speaker recognition, as well as suggested directions, as subjects move from a "neutral" speaking style, through increased vocal effort, and ultimately to pure "screaming". In the United States recently, there has been discussion in the news regarding the ability to accurately perform speaker recognition when the audio stream consists of a subject screaming. Here, we illustrate a probe experiment, preceded by some background on speech under non-neutral conditions.
