December 12, 2004


Welcome to the IEEE Signal Processing Society Speech Technical Committee (STC) newsletter.  As always, contributions of events, publications, workshops, and career information to the newsletter are welcome.

New Speech Technical Committee Members   (Rick Rose)
STC ICASSP 2005 Paper Review Process   (Mazin Rahim and Tatsuya Kawahara)

InterSpeech 2005 / EuroSpeech Call for Papers  (Isabel Trancoso)
IEEE 2005 Automatic Speech Recognition and Understanding Preliminary Call for Papers 
IEEE 2005 Workshop on Applications of Signal Processing to Audio and Acoustics

Speech Communication Journal: Spoken Language Understanding for Conversational Systems  (Gokhan Tur)
EURASIP Journal on Applied Signal Processing: Advances in Multimicrophone Speech Processing

James Flanagan Retires from Rutgers University  (Larry Rabiner and Rick Rose)
Position Open: Software Development Engineer in Speech Recognition  (Yifan Gong)
Transitions: ASR Researchers Take New Positions 

Links to conferences and workshops organized by date  (Rick Rose)

Seven New Members Elected to the STC

In October, seven new members were elected to the Speech Technical Committee, joining the existing members listed on the STC web page.  Prospective members are nominated by existing committee members in technology areas with retiring members, and the STC elects new members from these nominees.  Each member serves a three-year term.  The new members are introduced below, along with short descriptions of their backgrounds:


Hisashi Kawai

Hisashi Kawai is the head of the Department of Speech Synthesis at ATR Spoken Language Translation Research Laboratories in Japan. He received the Dr. Eng. degree from the University of Tokyo in 1989.  Since then he has worked on all parts of TTS, including text processing, prosody generation, and waveform generation, and he also has experience in telephone-based speech recognition.  He now directs the development of ATR's third-generation TTS system, XIMERA, a corpus-based system that is among the highest-quality TTS systems available at present. 

Tomoko Matsui

Professor Matsui has been at the frontier of speaker recognition research ever since she joined NTT Speech Research in 1988.  With Dr. Sadaoki Furui, she developed key technologies in this area, including posterior-probability-based score normalization, text-prompted speaker recognition, and noise-robust HMM composition.  She worked at ATR, Kyoto, Japan, from 1998 to 2002 and visited Speech Research at Bell Labs, Murray Hill, NJ, in 2001, where she extended speaker recognition/verification algorithms to topic detection and utterance verification.  She joined the Institute of Statistical Mathematics, Tokyo, Japan, in 2003 as an associate professor. Currently, she is investigating an unconventional approach to speaker recognition, "a dual penalized logistic regression machine," which has the potential to deliver even better discrimination performance than the current mainstream GMM-based approach.  Professionally, she has been active in volunteer work for the IEEE, the Acoustical Society of Japan, and the IEICE (Japan), including key posts such as secretariat of HSC2001, associate editor of the IEICE Transactions on Information and Systems, and steering committee member of the society.


Jean-Claude Junqua
Dr. Junqua is currently with the AV Core Technology Development Center (ACC) at Matsushita Electric Industrial Co. in Osaka, Japan.  He previously served as Vice President and Director of the Panasonic Speech Technology Laboratory.  He has served as chairman at several international conferences and has participated in several international scientific committees.  He is the author or co-author of more than 100 articles and 80 patents, and has written two books, "Robustness in Automatic Speech Recognition" and "Robust Speech Recognition in Embedded Systems and PC Applications".  He co-edited the book "Robustness in Languages and Speech Technology", has served as associate editor for the IEEE Transactions on SAP, and is currently a member of the Speech Communication and ACM editorial boards.

Shrikanth Narayanan
Shrikanth Narayanan (Ph.D. '95, UCLA) was with AT&T (both AT&T Bell Labs and AT&T Research) from 1995 to 2000. Currently he is an Associate Professor of Electrical Engineering, with joint appointments in Linguistics and Computer Science, at the University of Southern California (USC). He is a research area director of the Integrated Media Systems Center, an NSF Engineering Research Center on multimedia at USC. He is a member of Tau Beta Pi, Phi Kappa Phi, and Eta Kappa Nu, a senior member of the IEEE, and a recipient of an NSF CAREER award, the 2003 USC Engineering Faculty Research Award, and a fellowship from the Center for Interdisciplinary Research. His research interests are in multimodal signal processing and interpretation with applications to human-machine interfaces; his work is supported by NSF, NIH, and DARPA. He has published over 130 papers and holds 3 U.S. patents.  He has served as technical chair of several workshops and as associate editor for the SAP Transactions.

Roger Moore
Dr. Moore is currently with the Speech and Hearing Research Group (SPandH) in the Department of Computer Science at the University of Sheffield, UK. He served as head of the UK Government's Speech Research Unit (SRU) from 1985 until its privatisation in 1999, and then as Chief Scientific Officer of the resulting company, 20/20 Speech Ltd.  He has authored and co-authored over 100 scientific publications on speech technology algorithms, applications, and assessment.  He has also been a visiting Professor in the Department of Phonetics & Linguistics, University College London.

Dr. Moore is the immediate past-President of both the International Speech Communication Association (ISCA) and the Permanent Council of the International Conferences on Spoken Language Processing (ICSLP).  He is a member of the Editorial/Advisory Boards for the scientific Journals Computer Speech & Language and Speech Communication.  He was awarded the UK Institute of Acoustics Tyndall Medal for "distinguished work in the field of speech research and technology" and was awarded the NATO RTO Scientific Achievement Award for "repeated contribution in scientific and technological cooperation".

Yunxin Zhao

Dr. Yunxin Zhao is currently a professor in the Department of Computer Science, University of Missouri at Columbia. Her research interests are in spoken language processing, signal and speech processing, statistical pattern recognition, multimedia interfaces, and biomedical applications. Dr. Zhao was an associate editor of the IEEE Transactions on Speech and Audio Processing. She has published over 100 journal and conference papers and holds six United States patents. She received the 1995 NSF CAREER award and several other research grant awards from US federal and private funding agencies. Dr. Zhao was listed in American Men and Women of Science, February 1998.

Peter Kabal
Professor Kabal is Professor of Electrical and Computer Engineering at McGill University in Montreal.  Over a period of more than 25 years he has carried out research on the application of signal processing to speech coding and enhancement, data transmission, filtering and multiplexing, and multi-sensor arrays, contributing to nearly 100 journal and conference papers in these areas over the last 10 years alone. He has supervised over sixty graduate students who are now working in academia, government, and industry. His deep background in speech, signal processing, and communications brings a broad perspective on topics in the SPS that the STC currently lacks. Prof. Kabal has held visiting positions in laboratories in Sweden, Australia, and the United States, and has served as a consultant to major telecommunications companies including Nortel Networks and AT&T.  He served as Co-Technical Chairman for the IEEE Int. Conf. on Acoustics, Speech and Signal Processing held in Montreal in May 2004, has served for many years as a member of the Technical Committee for the IEEE Speech Coding Workshop, and was Vice-Technical Program Chair for the IEEE International Conference on Communications held in Montreal in June 1997 (ICC'97).

Review Process for ICASSP 2005

The Speech Technical Committee (STC) received 572 papers for ICASSP 2005, nearly a 9% increase over ICASSP 2004 and ICASSP 2003. About 51% of the papers were rejected. A rigorous review process was implemented in which each paper was reviewed by at least three people: one STC member and two external reviewers. The review process involved 150 external reviewers and 46 STC members; 9 of the STC members acted as Area Chairs, working closely with the STC team members to integrate the final scores, decide accept/reject, form technical sessions, and identify session chairs.

For ICASSP 2005, the STC adopted a new speech EDICS classification of topics to align ICASSP with recent advances in speech and language processing research. As a result of this change, we saw a nearly 50% increase in speaker recognition papers, a nearly 50% increase in spoken language technology papers, and a modest 5% of new submissions in the areas of speech data mining and multimodal/multimedia human/computer interfaces. Detailed statistics on the paper submissions are shown in the table below.

The STC has been successful in attracting two very strong special sessions on Human Language Technology, organized by M. Ostendorf, E. Shriberg, and A. Stolcke. We have also been successful in hosting a unique tutorial on "Machine Learning for Speech and Language Processing" to be given by J. Bilmes and P. Haffner.

Find out more on the STC website: 


[Table: number of papers submitted in each area: Speech Production, Speech Perception, Speech Analysis, Speech Synthesis, Speech Coding, Speech Enhancement, Acoustic Modeling, General Topics in ASR, Language Modeling, Large-vocabulary ASR, Speaker Recognition, Spoken Language Processing, Human/Computer Interaction, and Data Mining and Search]


Interspeech 2005 / Eurospeech

September 4--8, 2005 

Theme: Ubiquitous Speech Processing


ISCA, together with the Interspeech 2005 - Eurospeech organizing committee, encourages the submission of papers for the upcoming conference.



The deadline for full paper submission (4 pages) is April 8, 2005. Papers are submitted exclusively via the conference website, following the submission guidelines. Previously published papers should not be submitted. Each corresponding author will be notified by e-mail of the acceptance of the paper by June 10, 2005. Minor updates will be allowed from June 10 to June 17.


We encourage proposals for half-day pre-conference tutorials to be held on September 4, 2005. Those interested in organizing a tutorial should send a one-page description to by January 14, 2005.

INTERSPEECH'2005 also welcomes proposals for special sessions. Each special session normally consists of 6 invited papers, although other formats may be considered. The topics of the special sessions should be important, new, emerging areas of interest to the speech processing community with little overlap with the regular sessions. They could also be interdisciplinary topics that encourage cross-fertilization of fields, or topics investigated by members of other societies that are becoming of keen interest to the speech and language community. Special session papers follow the same submission format as regular papers. Proposals for special sessions should be sent to by January 14, 2005.


Proposals for Tutorials and Special Sessions due by:  January 14, 2005
Full paper submission deadline:                       April 8, 2005
Notification of paper acceptance/rejection:           June 10, 2005
Early registration deadline:                          June 25, 2005
For further information:                               
or send email to                                       


L2F - Spoken Language Systems Laboratory, INESC ID Lisboa
Rua Alves Redol, 9 - 1000-029 Lisbon - Portugal
Phone: +351 213100268    Fax: +351 213145843   www.l2f.inesc-id.pt



Automatic Speech Recognition and Understanding Workshop

Cancun, Mexico
November 27 – December 1, 2005

The ninth biennial IEEE workshop on Automatic Speech Recognition and Understanding (ASRU) will be held November 27 - December 1, 2005.  The ASRU workshops have a tradition of bringing together researchers from academia and industry in an intimate and collegial setting to discuss problems of common interest in automatic speech recognition and understanding. Submissions are encouraged in all areas of human language technology, with emphasis on automatic speech recognition and understanding technology, speech-to-text systems, spoken dialog systems, multilingual language processing, robustness in ASR, spoken document retrieval, and speech-to-speech translation.
The workshop program will consist of invited lectures, oral and poster presentations, and panel discussions. Ample time will be allowed for informal discussions and for enjoying the impressive tropical setting.  The workshop website will be accessible by January 2005.

Prospective authors are invited to submit full-length, 4-6 page papers, including figures and references, to . All papers will be handled and reviewed electronically. The ASRU 2005 website will provide you with further details. Please note that the submission dates for papers are strict deadlines.

Special sessions proposals should be submitted by June 15, 2005, to and must include a topical title, rationale, session outline, contact information, and a description of how the session will be organized.


May 1, 2005              Workshop registration opens
July 1, 2005             Paper submission deadline
August 15, 2005          Paper acceptance / rejection notices mailed
Sept. 15, 2005           Revised papers due and author registration deadline
Oct. 1, 2005             Hotel reservation and workshop registration deadline
Nov. 27 - Dec. 1, 2005   Workshop

General Chairs
    Jim Glass, MIT, USA
    Richard Rose, McGill University, Canada
Technical Chairs
    Michael Picheny, IBM, USA
    Renato de Mori, Avignon, France
    Richard Stern, CMU, USA
Publicity Chair
    Ruhi Sarikaya, IBM, USA
Publications Chair
    Dilek Hakkani-Tur, AT&T, USA
Local Arrangements Chair:
    Juan Nolazco, Monterrey, Mexico
Demonstrations Chair
    Anand Venkataraman, SRI, USA

Call for Papers

2005 IEEE Workshop on

Applications of Signal Processing

to Audio and Acoustics

Mohonk Mountain House
New Paltz, New York
October 16-19, 2005

The 2005 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA'05) will be held at the Mohonk Mountain House in New Paltz, New York, and is sponsored by the Audio & Electroacoustics committee of the IEEE Signal Processing Society. The objective of this workshop is to provide an informal environment for the discussion of problems in audio and acoustics and the signal processing techniques leading to novel solutions. Technical sessions will be scheduled in the morning and before and after dinner. Afternoons will be left free for informal meetings among workshop participants. Papers describing original research and new concepts are solicited for technical sessions on, but not limited to, the following topics:

Acoustic Scenes
- Scene Analysis: Source Localization, Source Separation, Room Acoustics
- Signal Enhancement: Echo Cancellation, Dereverberation, Noise Reduction, Restoration
- Multichannel Signal Processing for Audio Acquisition and Reproduction
- Virtual Acoustics via Loudspeakers or Headphones
Audio Coding
- Waveform Coding and Parameter Coding
- Spatial Audio Coding
- Internet Audio
- Musical Signal Analysis: Segmentation, Classification, Transcription
- Digital Rights
- Mobile Devices
Hearing and Perception
- Auditory Perception, Spatial Hearing, Quality Assessment
- Hearing Aids
- Signal Analysis and Synthesis Tools
- Creation of Musical Sounds: Waveforms, Instrument Models, Singing

Submission of four-page paper: April 15, 2005
Notification of acceptance: June 27, 2005
Early registration until: September 1, 2005

Workshop Committee

General Chair:
Walter Kellermann
Multimedia Communications and Signal Processing
University of Erlangen-Nuremberg, Germany

Technical Program Chair:
Rudolf Rabenstein

Finance Chair:
Michael Brandstein

Heinz Teutsch

Schuyler Quackenbush
Audio Research Labs, USA

Local Arrangements:
Jingdong Chen

Ursula Arnold

Far East Liaison:
Shoji Makino
NTT Communication Science Laboratories, Japan

Call for Papers

Speech Communication Journal

Special Issue on

Spoken Language Understanding for Conversational Systems

Paper submission deadline has been extended by one month to January 1st, 2005!

The success of a conversational system depends on a synergistic integration of technologies such as speech recognition, spoken language understanding (SLU), dialog modeling, natural language generation, speech synthesis and user interface design. In this special issue, we will address the SLU component of a conversational system and its relation to the speech recognizer and the dialog model. In particular, we aim to bring together techniques that address the issue of robustness of SLU to speech recognition errors, language variability and dysfluencies in speech with issues of output representations from SLU that provide greater flexibility to the dialog model.

The topic of robust SLU received much attention during the DARPA-funded ATIS program of the 1990s and, more recently, the DARPA Communicator program. In parallel with that research, a number of real-world conversational systems have been deployed to date. However, techniques for robust SLU have branched out in many different directions, influenced by related areas such as information extraction, question answering, and machine learning.

The objective of this issue is to provide the speech and language processing community with a forum for presenting recent advances, perspectives, and research directions in SLU for conversational systems. The special issue follows on from the related HLT/NAACL 2004 Workshop and will address topics such as:

Guest Editors:

Dr. Srinivas Bangalore, AT&T Labs - Research,
Dr. Dilek Hakkani-Tür, AT&T Labs - Research,
Dr. Gokhan Tur, AT&T Labs - Research,

Important Dates:

Extended Submission deadline: January 1st, 2005 (early submission is encouraged)
Notification of acceptance: April 1st, 2005
Final manuscript due: June 1st, 2005
Tentative publication date: September 1st, 2005

Submission Procedure:

Prospective authors should follow the regular guidelines of the Speech Communication Journal for electronic submission. During submission, authors must select the Section as "Special Issue Paper", not "Regular Paper", and the title of the special issue should be referenced on the "Comments" page along with any other information.

EURASIP Journal on Applied Signal Processing

Advances in Multimicrophone Speech Processing

Call for Papers

Speech quality may significantly deteriorate in the presence of interference, especially when the speech signal is also subject to reverberation. Consequently, modern communications systems, such as cellular phones, employ some speech enhancement procedure at the preprocessing stage, prior to further processing (e.g., speech coding).

Generally, the performance of single-microphone techniques is limited, since these techniques can utilize only spectral information. For the dereverberation problem especially, no adequate single-microphone enhancement techniques are presently available. Hence, in many applications, such as hands-free mobile telephony, voice-controlled systems, teleconferencing, and hearing instruments, there is a growing tendency to move from single-microphone to multimicrophone systems. Although multimicrophone systems come at an increased cost, they have the advantage of incorporating both spatial and spectral information.

The use of multimicrophone systems raises many practical considerations such as tracking the desired speech source and robustness to unknown microphone positions. Furthermore, due to the increased computational load, real-time algorithms are more difficult to obtain and hence the efficiency of the algorithms becomes a major issue.

The main focus of this special issue is on emerging methods for speech processing using multimicrophone arrays.

Topics of interest include (but are not limited to):

Authors should follow the EURASIP JASP manuscript format described at the journal site. Prospective authors should submit an electronic copy of their complete manuscript through the EURASIP JASP manuscript tracking system, according to the following timetable:

Manuscript Due February 1, 2005
Acceptance Notification June 1, 2005
Final Manuscript Due September 1, 2005
Publication Date 1st Quarter, 2006

Guest Editors:

Jacob Benesty, Université du Québec, Canada

Joerg Bitzer, University of Applied Sciences Oldenburg, Germany

Israel Cohen, Technion – IIT, Israel

Simon Doclo, Katholieke Universiteit Leuven, Belgium

Sharon Gannot, Bar-Ilan University, Israel

Rainer Martin, Ruhr-Universität Bochum, Germany

Sven Nordholm, Curtin University of Technology, Australia


Jim Flanagan Retires from CAIP Center and From Rutgers University

Jim Flanagan and Larry Rabiner sharing a story at Jim's retirement dinner

Jim Flanagan retired from his position as director of the Rutgers Center for Advanced Information Processing (CAIP) and as Vice President for Research at Rutgers University in September.  He has been responsible for so many advances in so many areas of acoustics, speech science, and signal processing that a special session was organized in his honor by Sorin Dusan and Larry Rabiner at the November 2004 meeting of the Acoustical Society of America.  The session was entitled "Speech communication and signal processing in acoustics: Fifty years of progress in speech communication: Honoring the contributions of James L. Flanagan."  The following announcement is an excerpt from the CAIP Update Newsletter, Volume 16, Number 2, December 2004:

CAIP Director Jim Flanagan Retires, Receives Many Honors

Jim Flanagan, Director of the CAIP Center, Rutgers Vice President for Research, and Board of Governors Professor in Electrical and Computer Engineering, retired effective 30 September 2004.

Jim has since been receiving a steady stream of thanks and accolades for his services to this Center, to the University, and to the scientific research community at large. A few examples of this recognition follow. 

The William Gould Dow Distinguished Lecture
On 20 October 2004, Jim presented the sixth William Gould Dow Distinguished Lecture, sponsored by the Department of Electrical Engineering and Computer Science at the University of Michigan’s College of Engineering. The lectureship is the highest external honor bestowed by the department and recognizes outstanding contributions in the field of electrical engineering and computer science.

ASA Special Session
The Acoustical Society of America’s 148th Meeting, November 15-19, in San Diego, California held a Special Session in Jim’s honor entitled “Speech Communication and Signal Processing in Acoustics: Fifty Years of Progress in Speech Communication: Honoring the Contributions of James L. Flanagan.”

IEEE Medal of Honor
The Board of Directors of the Institute of Electrical and Electronics Engineers (IEEE) has named Jim Flanagan a recipient of the 2005 IEEE Medal of Honor, its highest accolade and one of the most prestigious honors in the field. The award is in recognition of Jim’s “sustained leadership and outstanding contributions to speech technology.” Details of the award presentation will be announced.

CAIP & University Recognition
Closer to home, Jim was feted at a CAIP Center luncheon on 29 September and at a Rutgers-wide reception held 27 October at the President’s House, where Jim’s industry and university colleagues thanked him for his major contributions to the field and to their own careers. The remarks by former Rutgers President Lawrence, who labored many years alongside Jim, were especially poignant. Anne Thomas, past Chair of the Rutgers Board of Governors, cited Jim’s constant ability to quickly assemble information of the highest quality in support of the Board’s decision-making.
If recent events give any clue, Professor Flanagan appears to be embarked on a course of continued distinguished scientific and technical contributions in both the academic and industry sectors.

Software development engineer, Speech Recognition

Speech Component Group
Microsoft Corporation
Redmond WA
Contact: Yifan Gong (

Speech recognition (SR) technology is being used more and more as part of natural user interfaces, and Microsoft has shipped multiple products with SR technology inside. The speech component group is looking for a developer who is passionate about the opportunities speech technology provides and who will work on our acoustic and language modeling technologies. The position offers opportunities for innovation on the next wave of Microsoft platforms in the server, embedded, and desktop markets, for various languages.


Job responsibilities include the design, implementation, and analysis of advanced acoustic and language modeling technologies, and the design, development, and optimization of speech technology processes used to advance the state of the art in automatic speech recognition. You will design efficient algorithms to improve speech recognition performance, run experiments, and create tools and scripts to process large datasets based on internal and external research. Rigorous attention to detail is required, and the ability to analyze results and dynamically choose an optimal course for development is a must.


Qualifications include a passion for speech technology; an MS or PhD degree in Computer Science or a related discipline; 3 or more years of experience programming in C/C++ and strong computer science skills; the ability to multitask and handle ambiguity; and the ability to identify and solve problems in complicated SR systems. A strong background in speech recognition technology, statistical modeling, pattern recognition, or signal processing is highly preferred.


ASR Researchers Take New Positions

The STC Newsletter would like to provide announcements of professors, researchers, and developers in the speech area who are taking new positions.  If you have moved lately or are in the process of moving to a new position in the near future, send your new contact information to the STC Newsletter so it can be posted in the next edition.  

Bill Byrne
University Lecturer in Speech Processing
Machine Intelligence Laboratory
Cambridge University Engineering Dept.
Trumpington Street, Cambridge CB2 1PZ U.K.

Yifan Gong
Room 2387 Building 17
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052-6399



Links to Upcoming Conferences and Workshops

(Organized by Date)

ISCSLP 2004 - 4th International Symposium on Chinese Spoken Language Processing
Hong Kong, China, December 15-18, 2004

Philadelphia, Pennsylvania, May, 2005

SIGdial Workshop on Discourse and Dialog
Lisbon, Portugal , September 2-3, 2005

EUROSPEECH 2005 9th European Conference on Speech Communication and Technology
Lisbon, Portugal, September 4-8, 2005

Disfluency in Spontaneous Speech
Aix-en-Provence, September 10-12, 2005

IEEE WASPAA2005 Workshop on Applications of Signal Processing to Audio and Acoustics

New Paltz, New York, October 16-19, 2005

SPECOM 2005 - 10th International Conf. on Speech and Computers
Patras, Greece, October 17-19, 2005

IEEE ASRU2005 Automatic Speech Recognition and Understanding Workshop
Cancun, Mexico, November 27 - December 1, 2005

Toulouse, France May 15-19, 2006

Pittsburgh, PA, USA September 17-21, 2006

Hawaii, USA, 2007

Antwerp, Belgium, August 27-31, 2007
