December 12, 2004


Welcome to the IEEE Signal Processing Society Speech Technical Committee (STC) newsletter.  As always, contributions of events, publications, workshops, and career information to the newsletter are welcome.

New Speech Technical Committee Members   (Rick Rose)
STC ICASSP 2005 Paper Review Process   (Mazin Rahim and Tatsuya Kawahara)

InterSpeech 2005 / EuroSpeech Call for Papers  (Isabel Trancoso)
IEEE 2005 Automatic Speech Recognition and Understanding Preliminary Call for Papers 
IEEE 2005 Workshop on Applications of Signal Processing to Audio and Acoustics

Speech Communication Journal: Spoken Language Understanding for Conversational Systems  (Gokhan Tur)
EURASIP Journal on Applied Signal Processing: Advances in Multimicrophone Speech Processing

James Flanagan Retires from Rutgers University  (Larry Rabiner and Rick Rose)
Position Open: Software Development Engineer in Speech Recognition  (Yifan Gong)
Transitions: ASR Researchers Take New Positions 

Links to conferences and workshops organized by date  (Rick Rose)

Seven New Members Elected to the STC

In October, seven new members were elected to the Speech Technical Committee, joining the existing members listed on the STC web page.  Prospective members are nominated by existing committee members in technology areas with retiring members, and the STC elects new members from these nominees.  Each member serves a three-year term.  The new members are introduced below, along with short descriptions of their backgrounds:


Hisashi Kawai

Hisashi Kawai is the head of the Department of Speech Synthesis at ATR Spoken Language Translation Research Laboratories in Japan. He received the Dr. Eng. degree from the University of Tokyo in 1989.  Since then he has worked on all parts of TTS, including text processing, prosody generation, and waveform generation, and he also has experience in telephone-based speech recognition.  He now directs the development of ATR's third-generation TTS system, XIMERA, a corpus-based system that is among the highest-quality TTS systems available at present. 

Tomoko Matsui

Professor Matsui has been at the frontier of speaker recognition research ever since she joined NTT Speech Research in 1988.  With Dr. Sadaoki Furui, she developed key technologies in this area, including posterior-probability-based score normalization, text-prompted speaker recognition, and noise-robust HMM composition.  She worked at ATR, Kyoto, Japan, from 1998 to 2002 and visited Speech Research at Bell Labs, Murray Hill, NJ, in 2001, where she extended speaker recognition/verification algorithms to topic detection and utterance verification.  She joined the Institute of Statistical Mathematics, Tokyo, Japan, in 2003 as an associate professor. Currently, she is investigating an unconventional approach to speaker recognition, "a dual penalized logistic regression machine," which has the potential to deliver even better discrimination performance than the current mainstream GMM-based approach.  Professionally, she has been active in volunteer work for the IEEE, the Acoustical Society of Japan, and the IEICE (Japan), including key posts such as secretariat of HSC2001, associate editor of the IEICE Transactions on Information and Systems, and steering committee member of the society.


Jean-Claude Junqua
Dr. Junqua is currently with the AV Core Technology Development Center (ACC) at Matsushita Electric Industrial Co. in Osaka, Japan.  He previously served as Vice President and Director of the Panasonic Speech Technology Laboratory.  He has served as chairman at several international conferences and has participated in several international scientific committees.  He is the author or co-author of more than 100 articles and 80 patents, and has written two books, "Robustness in Automatic Speech Recognition" and "Robust Speech Recognition in Embedded Systems and PC Applications".  He co-edited the book "Robustness in Languages and Speech Technology", has served as associate editor for the IEEE Transactions on SAP, and is currently a member of the Speech Communication and ACM editorial boards.

Shrikanth Narayanan
Shrikanth Narayanan (Ph.D. '95, UCLA) was with AT&T (both AT&T Bell Labs and AT&T Research) from 1995 to 2000. Currently he is an Associate Professor of Electrical Engineering, with joint appointments in Linguistics and Computer Science, at the University of Southern California (USC). He is a research area director of the Integrated Media Systems Center, an NSF Engineering Research Center on multimedia at USC. He is a member of Tau Beta Pi, Phi Kappa Phi, and Eta Kappa Nu, a senior member of the IEEE, and a recipient of an NSF CAREER award, the 2003 USC Engineering Faculty Research Award, and a fellowship from the Center for Interdisciplinary Research. His research interests are in multimodal signal processing and interpretation with applications to human-machine interfaces; his work is supported by NSF, NIH, and DARPA. He has published over 130 papers and holds 3 U.S. patents.  He has served as technical chair of several workshops and as associate editor for the SAP Transactions.

Roger Moore
Dr. Moore is currently with the Speech and Hearing Research Group (SPandH) in the Department of Computer Science at the University of Sheffield, UK. He served as head of the UK Government's Speech Research Unit (SRU) from 1985 until its privatisation in 1999, and then as Chief Scientific Officer of the resulting company, 20/20 Speech Ltd.  He has authored and co-authored over 100 scientific publications on speech technology algorithms, applications, and assessment.  He has also been a visiting Professor in the Department of Phonetics & Linguistics, University College London.

Dr. Moore is the immediate past-President of both the International Speech Communication Association (ISCA) and the Permanent Council of the International Conferences on Spoken Language Processing (ICSLP).  He is a member of the Editorial/Advisory Boards for the scientific Journals Computer Speech & Language and Speech Communication.  He was awarded the UK Institute of Acoustics Tyndall Medal for "distinguished work in the field of speech research and technology" and was awarded the NATO RTO Scientific Achievement Award for "repeated contribution in scientific and technological cooperation".

Yunxin Zhao

Dr. Yunxin Zhao is currently a professor in the Department of Computer Science, University of Missouri at Columbia. Her research interests are in spoken language processing, signal and speech processing, statistical pattern recognition, multimedia interfaces, and biomedical applications. Dr. Zhao was an associate editor of the IEEE Transactions on Speech and Audio Processing. She has published over 100 journal and conference papers and holds six United States patents. She received the 1995 NSF CAREER award and several other research grant awards from US federal and private funding agencies. Dr. Zhao was listed in American Men and Women of Science, February 1998.

Peter Kabal
Professor Kabal is Professor of Electrical and Computer Engineering at McGill University in Montreal.  Over a period of more than 25 years he has carried out research on the application of signal processing to speech coding and enhancement, data transmission, filtering and multiplexing, and multi-sensor arrays, contributing to nearly 100 journal and conference papers in these areas over the last 10 years alone. He has supervised over sixty graduate students who are now working in academia, government, and industry. His deep background in speech, signal processing, and communications brings a broad perspective on topics in the SPS that the STC currently lacks. Prof. Kabal has held visiting positions in laboratories in Sweden, Australia, and the United States, and has served as a consultant to major telecommunications companies including Nortel Networks and AT&T.  He served as Co-Technical Chairman for the IEEE Int. Conf. on Acoustics, Speech and Signal Processing held in Montreal in May 2004, has served for many years as a member of the Technical Committee for the IEEE Speech Coding Workshop, and was Vice-Technical Program Chair for the IEEE International Conference on Communications held in Montreal in June 1997 (ICC'97).

Review Process for ICASSP 2005

The Speech Technical Committee (STC) received 572 papers for ICASSP 2005, nearly a 9% increase over ICASSP 2004 and ICASSP 2003. About 51% of the papers were rejected. A rigorous review process was implemented in which each paper was reviewed by at least three people: one STC member and two external reviewers. The review process involved 150 external reviewers and 46 STC members; 9 of the STC members acted as Area Chairs, working closely with the STC team members to integrate the final scores, decide accept/reject, form technical sessions, and identify session chairs.

For ICASSP 2005, the STC adopted a new speech EDICS classification of topics to align ICASSP with recent advances in speech and language processing research. As a result of this change, we saw a nearly 50% increase in speaker recognition papers, a nearly 50% increase in spoken language technology papers, and a modest 5% of new submissions in the areas of speech data mining and multimodal/multimedia human/computer interfaces. Detailed statistics on the paper submissions are shown in the table below.

The STC has been successful in attracting two very strong special sessions on Human Language Technology, organized by M. Ostendorf, E. Shriberg, and A. Stolcke. We have also been successful in hosting a unique tutorial on "Machine Learning for Speech and Language Processing" to be given by J. Bilmes and P. Haffner.

Find out more on the STC website: 


[Table: number of papers submitted in each area: Speech Production, Speech Perception, Speech Analysis, Speech Synthesis, Speech Coding, Speech Enhancement, Acoustic Modeling, General Topics in ASR, Language Modeling, Large-vocabulary ASR, Speaker Recognition, Spoken Language Processing, Human/Computer Interaction, and Data Mining and Search]


Interspeech 2005 / Eurospeech

September 4--8, 2005 

Theme: Ubiquitous Speech Processing


ISCA, together with the Interspeech 2005 - Eurospeech organizing committee, encourages the submission of papers for the upcoming conference.



The deadline for full paper submission (4 pages) is April 8, 2005. Papers are submitted exclusively via the conference website, following the submission guidelines. Previously published papers should not be submitted. Each corresponding author will be notified by e-mail of the acceptance of the paper by June 10, 2005. Minor updates will be allowed from June 10 to June 17.


We encourage proposals for half-day pre-conference tutorials to be held on September 4, 2005. Those interested in organizing a tutorial should send a one-page description to by January 14, 2005.

INTERSPEECH'2005 also welcomes proposals for special sessions. Each special session normally consists of 6 invited papers, although other formats may be considered. The topics of the special sessions should be important, new, emerging areas of interest to the speech processing community with little overlap with the regular sessions. They could also be interdisciplinary topics that encourage cross-fertilization of fields, or topics investigated by members of other societies that are becoming of keen interest to the speech and language community. Special session papers follow the same submission format as regular papers. Proposals for special sessions should be sent to by January 14, 2005.


Proposals for Tutorials and Special Sessions due by:  January 14, 2005
Full paper submission deadline:                       April 8, 2005
Notification of paper acceptance/rejection:           June 10, 2005
Early registration deadline:                          June 25, 2005
For further information:                               
or send email to                                       


L2F - Spoken Language Systems Laboratory, INESC ID Lisboa
Rua Alves Redol, 9 - 1000-029 Lisbon - Portugal
Phone: +351 213100268    Fax: +351 213145843   www.l2f.inesc-id.pt



Automatic Speech Recognition and Understanding Workshop

Cancun, Mexico
November 27 – December 1, 2005

The ninth biennial IEEE workshop on Automatic Speech Recognition and Understanding (ASRU) will be held November 27 - December 1, 2005.  The ASRU workshops have a tradition of bringing together researchers from academia and industry in an intimate and collegial setting to discuss problems of common interest in automatic speech recognition and understanding. Submissions are encouraged in all areas of human language technology, with emphasis on automatic speech recognition and understanding technology, speech-to-text systems, spoken dialog systems, multilingual language processing, robustness in ASR, spoken document retrieval, and speech-to-speech translation.
The workshop program will consist of invited lectures, oral and poster presentations, and panel discussions. Ample time will be allowed for informal discussions and for enjoying the impressive tropical setting.  The workshop website will be accessible by January 2005.

Prospective authors are invited to submit full-length, 4-6 page papers, including figures and references, to . All papers will be handled and reviewed electronically. The ASRU 2005 website will provide you with further details. Please note that the submission dates for papers are strict deadlines.

Special sessions proposals should be submitted by June 15, 2005, to and must include a topical title, rationale, session outline, contact information, and a description of how the session will be organized.


May 1, 2005              Workshop registration opens
July 1, 2005             Paper submission deadline
August 15, 2005          Paper acceptance / rejection notices mailed
Sept. 15, 2005           Revised papers due and author registration deadline
Oct. 1, 2005             Hotel reservation and workshop registration deadline
Nov. 27 - Dec. 1, 2005   Workshop

General Chairs
    Jim Glass, MIT, USA
    Richard Rose, McGill University, Canada
Technical Chairs
    Michael Picheny, IBM, USA
    Renato de Mori, Avignon, France
    Richard Stern, CMU, USA
Publicity Chair
    Ruhi Sarikaya, IBM, USA
Publications Chair
    Dilek Hakkani-Tur, AT&T, USA
Local Arrangements Chair:
    Juan Nolazco, Monterrey, Mexico
Demonstrations Chair
    Anand Venkataraman, SRI, USA

Call for Papers

2005 IEEE Workshop on

Applications of Signal Processing

to Audio and Acoustics

Mohonk Mountain House
New Paltz, New York
October 16-19, 2005

The 2005 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA'05) will be held at the Mohonk Mountain House in New Paltz, New York, and is sponsored by the Audio & Electroacoustics committee of the IEEE Signal Processing Society. The objective of this workshop is to provide an informal environment for the discussion of problems in audio and acoustics and the signal processing techniques leading to novel solutions. Technical sessions will be scheduled in the morning and before and after dinner. Afternoons will be left free for informal meetings among workshop participants. Papers describing original research and new concepts are solicited for technical sessions on, but not limited to, the following topics:

Acoustic Scenes
- Scene Analysis: Source Localization, Source Separation, Room Acoustics
- Signal Enhancement: Echo Cancellation, Dereverberation, Noise Reduction, Restoration
- Multichannel Signal Processing for Audio Acquisition and Reproduction
- Virtual Acoustics via Loudspeakers or Headphones
Audio Coding
- Waveform Coding and Parameter Coding
- Spatial Audio Coding
- Internet Audio
- Musical Signal Analysis: Segmentation, Classification, Transcription
- Digital Rights
- Mobile Devices
Hearing and Perception
- Auditory Perception, Spatial Hearing, Quality Assessment
- Hearing Aids
- Signal Analysis and Synthesis Tools
- Creation of Musical Sounds: Waveforms, Instrument Models, Singing

Submission of four-page paper: April 15, 2005
Notification of acceptance: June 27, 2005
Early registration until: September 1, 2005

Workshop Committee

General Chair:
Walter Kellermann
Multimedia Communications and Signal Processing
University of Erlangen-Nuremberg, Germany

Technical Program Chair:
Rudolf Rabenstein

Finance Chair:
Michael Brandstein

Heinz Teutsch

Schuyler Quackenbush
Audio Research Labs, USA

Local Arrangements:
Jingdong Chen

Ursula Arnold

Far East Liaison:
Shoji Makino
NTT Communication Science Laboratories, Japan

Call for Papers

Speech Communication Journal

Special Issue on

Spoken Language Understanding for Conversational Systems

Paper submission deadline has been extended by one month to January 1st, 2005!

The success of a conversational system depends on a synergistic integration of technologies such as speech recognition, spoken language understanding (SLU), dialog modeling, natural language generation, speech synthesis and user interface design. In this special issue, we will address the SLU component of a conversational system and its relation to the speech recognizer and the dialog model. In particular, we aim to bring together techniques that address the issue of robustness of SLU to speech recognition errors, language variability and dysfluencies in speech with issues of output representations from SLU that provide greater flexibility to the dialog model.

The topic of robust SLU received much attention during the DARPA-funded ATIS program of the 1990s and, more recently, the DARPA Communicator program. In parallel with that research, a number of real-world conversational systems have been deployed to date. However, techniques for robust SLU have branched out in many different directions, influenced by related areas such as information extraction, question answering, and machine learning.

The objective of this issue is to provide the speech and language processing community with a forum for presenting recent advances, perspectives, and research directions in SLU for conversational systems. The special issue follows on from the related HLT/NAACL 2004 Workshop and will address topics such as:

Guest Editors:

Dr. Srinivas Bangalore, AT&T Labs - Research,
Dr. Dilek Hakkani-Tür, AT&T Labs - Research,
Dr. Gokhan Tur, AT&T Labs - Research,

Important Dates:

Extended Submission deadline: January 1st, 2005 (early submission is encouraged)
Notification of acceptance: April 1st, 2005
Final manuscript due: June 1st, 2005
Tentative publication date: September 1st, 2005

Submission Procedure:

Prospective authors should follow the regular guidelines of the Speech Communication Journal for electronic submission. During submission, authors must select the Section as "Special Issue Paper", not "Regular Paper", and the title of the special issue should be referenced on the "Comments" page along with any other information.

EURASIP Journal on Applied Signal Processing

Advances in Multimicrophone Speech Processing

Call for Papers

Speech quality may significantly deteriorate in the presence of interference, especially when the speech signal is also subject to reverberation. Consequently, modern communications systems, such as cellular phones, employ some speech enhancement procedure at the preprocessing stage, prior to further processing (e.g., speech coding).

Generally, the performance of single-microphone techniques is limited, since these techniques can utilize only spectral information. For the dereverberation problem especially, no adequate single-microphone enhancement techniques are presently available. Hence, in many applications, such as hands-free mobile telephony, voice-controlled systems, teleconferencing, and hearing instruments, there is a growing tendency to move from single-microphone to multimicrophone systems. Although multimicrophone systems come at an increased cost, they have the advantage of incorporating both spatial and spectral information.

The use of multimicrophone systems raises many practical considerations such as tracking the desired speech source and robustness to unknown microphone positions. Furthermore, due to the increased computational load, real-time algorithms are more difficult to obtain and hence the efficiency of the algorithms becomes a major issue.

The main focus of this special issue is on emerging methods for speech processing using multimicrophone arrays.

Topics of interest include (but are not limited to):

Authors should follow the EURASIP JASP manuscript format described at the journal site. Prospective authors should submit an electronic copy of their complete manuscript through the EURASIP JASP manuscript tracking system, according to the following timetable:

Manuscript Due February 1, 2005
Acceptance Notification June 1, 2005
Final Manuscript Due September 1, 2005
Publication Date 1st Quarter, 2006

Guest Editors:

Jacob Benesty, Université du Québec, Canada

Joerg Bitzer, University of Applied Sciences Oldenburg, Germany

Israel Cohen, Technion – IIT, Israel

Simon Doclo, Katholieke Universiteit Leuven, Belgium

Sharon Gannot, Bar-Ilan University, Israel

Rainer Martin, Ruhr-Universität Bochum, Germany

Sven Nordholm, Curtin University of Technology, Australia


Jim Flanagan Retires from CAIP Center and From Rutgers University

Jim Flanagan and Larry Rabiner sharing a story at Jim's retirement dinner

Jim Flanagan retired from his position as director of the Rutgers Center for Advanced Information Processing (CAIP) and as Vice President for Research at Rutgers University in September.  He has been responsible for so many advances in so many areas of acoustics, speech science, and signal processing that a special session was organized in his honor by Sorin Dusan and Larry Rabiner at the November 2004 meeting of the Acoustical Society of America.  The session was entitled "Speech communication and signal processing in acoustics: Fifty years of progress in speech communication: Honoring the contributions of James L. Flanagan."  The following announcement is an excerpt from the CAIP Update Newsletter, Volume 16, Number 2, December 2004:

CAIP Director Jim Flanagan Retires, Receives Many Honors

Jim Flanagan, Director of the CAIP Center, Rutgers Vice President for Research, and Board of Governors Professor in Electrical and Computer Engineering, retired effective 30 September 2004.

Jim has since been receiving a steady stream of thanks and accolades for his services to this Center, to the University, and to the scientific research community at large. A few examples of this recognition follow. 

The William Gould Dow Distinguished Lecture
On 20 October 2004, Jim presented the sixth William Gould Dow Distinguished Lecture, sponsored by the Department of Electrical Engineering and Computer Science at the University of Michigan’s College of Engineering. The lectureship is the highest external honor bestowed by the department and recognizes outstanding contributions in the field of electrical engineering and computer science.

ASA Special Session
The Acoustical Society of America’s 148th Meeting, November 15-19, in San Diego, California held a Special Session in Jim’s honor entitled “Speech Communication and Signal Processing in Acoustics: Fifty Years of Progress in Speech Communication: Honoring the Contributions of James L. Flanagan.”

IEEE Medal of Honor
The Board of Directors of the Institute of Electrical and Electronics Engineers (IEEE) has named Jim Flanagan a recipient of the 2005 IEEE Medal of Honor, its highest accolade and one of the most prestigious honors in the field. The award is in recognition of Jim’s “sustained leadership and outstanding contributions to speech technology.” Details of the award presentation will be announced.

CAIP & University Recognition
Closer to home, Jim was feted at a CAIP Center luncheon on 29 September and at a Rutgers-wide reception held 27 October at the President’s House, where Jim’s industry and university colleagues thanked him for his major contributions to the field and to their own careers. The remarks by former Rutgers President Lawrence, who labored many years alongside Jim, were especially poignant. Anne Thomas, past Chair of the Rutgers Board of Governors, cited Jim’s constant ability to quickly assemble information of the highest quality in support of the Board’s decision-making.
If recent events give any clue, Professor Flanagan appears to be embarked on a course of continued distinguished scientific and technical contributions in both the academic and industry sectors.

Software development engineer, Speech Recognition

Speech Component Group
Microsoft Corporation
Redmond WA
Contact: Yifan Gong (

Speech recognition (SR) technology is being used more and more as part of natural user interfaces, and Microsoft has shipped multiple products with SR technology inside. The speech component group is looking for a developer who is passionate about the opportunities speech technology provides and who will work on our acoustic and language modeling technologies. The position offers opportunities for innovation on the next wave of Microsoft platforms in the server, embedded, and desktop markets, for various languages.


Job responsibilities include the design, implementation, and analysis of advanced acoustic and language modeling technologies, and the design, development, and optimization of speech technology processes used to advance the state of the art in automatic speech recognition. You will design efficient algorithms to improve speech recognition performance, run experiments, and create tools and scripts to process large datasets based on internal and external research. Rigorous attention to detail is required, and the ability to analyze results and dynamically choose an optimal course for development is a must.


Qualifications include a passion for speech technology; an MS or PhD degree in Computer Science or a related discipline; 3 or more years of experience programming in C/C++ and strong computer science skills; the ability to multitask and handle ambiguity; and the ability to identify and solve problems in complicated SR systems. A strong background in speech recognition technology, statistical modeling, pattern recognition, or signal processing is highly preferred.


ASR Researchers Take New Positions

The STC Newsletter would like to provide announcements of professors, researchers, and developers in the speech area who are taking new positions.  If you have moved lately or are in the process of moving to a new position in the near future, send your new contact information to the STC Newsletter so it can be posted in the next edition.  

Bill Byrne
University Lecturer in Speech Processing
Machine Intelligence Laboratory
Cambridge University Engineering Dept.
Trumpington Street, Cambridge CB2 1PZ U.K.

Yifan Gong
Room 2387 Building 17
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052-6399



Links to Upcoming Conferences and Workshops

(Organized by Date)

ISCSLP 2004 - 4th International Symposium on Chinese Spoken Language Processing
Hong Kong, China, December 15-18, 2004

Philadelphia, Pennsylvania, May, 2005

SIGdial Workshop on Discourse and Dialog
Lisbon, Portugal , September 2-3, 2005

EUROSPEECH 2005 9th European Conference on Speech Communication and Technology
Lisbon, Portugal, September 4-8, 2005

Disfluency in Spontaneous Speech
Aix-en-Provence, September 10-12, 2005

IEEE WASPAA2005 Workshop on Applications of Signal Processing to Audio and Acoustics

New Paltz, New York, October 16-19, 2005

SPECOM 2005 - 10th International Conf. on Speech and Computers
Patras, Greece, October 17-19, 2005

IEEE ASRU2005 Automatic Speech Recognition and Understanding Workshop
Cancun, Mexico, November 27 - December 1, 2005

Toulouse, France May 15-19, 2006

Pittsburgh, PA, USA September 17-21, 2006

Hawaii, USA, 2007

Antwerp, Belgium, August 27-31, 2007
