stcnewsletter August 2003

Welcome to the ninth IEEE Signal Processing Society Speech Technical Committee (STC) newsletter. Contributions of events, publications, workshops, and career information to the newsletter are welcome (rose@ece.mcgill.ca).

Summary of ICASSP 2004 Submissions from the

Conference Technical Program Chairs

4th International Symposium on

Chinese Spoken Language Processing

December 15-18, 2004

Hong Kong

http://www.se.cuhk.edu.hk/~iscslp/index.html

Preliminary Call for Papers

The 4th International Symposium on Chinese Spoken Language Processing (ISCSLP'04) will be held December 15-18, 2004 in Hong Kong. ISCSLP is a conference for scientists, researchers, and practitioners to report and discuss the latest progress in all scientific and technological aspects of Chinese spoken language processing. The conference series has been held biennially in different Asia Pacific cities: 1998 in Singapore, 2000 in Beijing, and 2002 in Taipei. ISCSLP has become the world's largest and most comprehensive technical conference focused on Chinese spoken language processing and its applications. ISCSLP'04 will feature world-class plenary speakers, tutorials, and a number of lecture and poster sessions on the following topics:

    * Speech Production and Perception
    * Phonetics and Phonology
    * Speech Analysis
    * Speech Coding
    * Speech Enhancement
    * Speech Recognition
    * Speech Synthesis
    * Language Modeling and Spoken Language Understanding
    * Spoken Dialog Systems
    * Spoken Language Translation
    * Speaker and Language Recognition
    * Indexing, Retrieval and Authoring of Speech Signals
    * Multi-Modal Interface including Spoken Language Processing
    * Spoken Language Resources and Technology Evaluation
    * Applications of Spoken Language Processing Technology
    * Others

Hong Kong, better known as the Pearl of the Orient, is a place where East meets West. Shopping, dining, sightseeing, as well as world-class events and attractions are all conveniently available within a short distance. As the "City of Life" in Asia, with its multicultural heritage and kaleidoscopic lifestyle, Hong Kong buzzes with unique tourist attractions that are beyond compare in the region. You are cordially invited to attend ISCSLP'04 and to experience the fascination of Hong Kong that is unmatched anywhere in the world.

The working language of ISCSLP is English. Prospective authors are invited to submit full-length, four-page papers for presentation in any of the areas listed above. All ISCSLP'04 papers will be handled and reviewed electronically; details can be found on the conference website, http://www.iscslp2004.org. Please note the following important dates and plan your schedule well in advance.

Schedule of Important Dates:
Four-page full paper submission to be received by: July 23, 2004
Notification of acceptance mailed out by: September 20, 2004
Camera-ready papers to be received by: October 8, 2004
Early registration: November 12, 2004

Call for participation

HLT-NAACL 2004 Workshop on

Spoken Language Understanding for Conversational Systems and

Higher Level Linguistic Information for Speech Processing

Friday, May 7, 2004
Park Plaza Hotel, Boston, USA
http://www.research.att.com/~dtur/NAACL04-Workshop/
http://www.speech.sri.com/hlt-workshop/

The success of a conversational system depends on a synergistic integration of technologies such as speech recognition, spoken language understanding (SLU), dialog modeling, natural language generation, speech synthesis and user interface design. In this workshop, we address the issue of improving the robustness of the speech recognition and SLU components by exploiting higher level linguistic knowledge, meta-information and machine learning techniques.

The first part of the workshop will focus on robust SLU in conversational systems, which received much attention during the DARPA-funded ATIS program of the 1990s and, more recently, the DARPA Communicator program. In parallel to that research, a number of real-world conversational systems have been deployed to date. However, the techniques for robust SLU have branched out in many different directions. They have been influenced by many recent areas such as information extraction, question answering and machine learning. Data-driven approaches to understanding are rapidly gaining prominence. There has been a substantial increase in interest in information extraction from the NLP community, question answering in the information retrieval community, and spoken dialog systems in the speech processing community. Spoken language understanding is an especially attractive topic for cross-fertilization of ideas between speech, IR, and NLP communities.

Going beyond SLU and dialog systems, the second part of the workshop will address the use of high-level knowledge for improved speech recognition accuracy. The challenging robustness issues in speech recognition, such as compensation for acoustic confusability resulting from noisy environments and unexpected channel and speaker mismatch, can potentially be aided by the use of linguistic information such as prosody, syntax, semantics, and pragmatics, and even high-level meta-information such as personal information stored in a database or dialogue and pragmatic coherence constraints. However, current state-of-the-art speech recognizers do not explicitly use such information and rely mainly on information encoded in statistical N-gram language models. The papers here show the potential of high-level information not only to improve word accuracy but also to help disambiguate the recognized words, thus benefiting downstream processing and SLU in particular.

----------------------------------------------------------------

Invited Talks:

Renato De Mori, Univ Avignon, France
Sentence Interpretation using Stochastic Finite State Transducers

Roberto Pieraccini, IBM TJ Watson Research Center, USA
Spoken Language Understanding: The Research/Industry Chasm

----------------------------------------------------------------

Program:

8:45-9:00 Welcome

9:00-9:50 Invited Talk: Sentence Interpretation using Stochastic
Finite State Transducers, Renato De Mori

9:50-10:00 Break

10:00-10:30 Hybrid Statistical and Structural Semantic Modeling for
Thai Multi-Stage Spoken Language Understanding, Chai Wutiwiwatchai and
Sadaoki Furui

10:30-11:00 Interactive Machine Learning Techniques for Improving SLU
Models, Lee Begeja, Bernard Renger, David Gibbon, Zhu Liu and Behzad
Shahraray

11:00-11:30 Virtual Modality: a Framework for Testing and Building
Multimodal Applications, Peter Pal Boda and Edward Filisko

11:30-12:00 Automatic Call Routing with Multiple Language Models,
Qiang Huang and Stephen Cox

12:00-1:00 Lunch

1:00-1:30 Error Detection and Recovery in Spoken Dialogue Systems,
Edward Filisko and Stephanie Seneff

1:30-2:00 Robustness Issues in a Data-Driven Spoken Language
Understanding System, Yulan He and Steve Young

2:00-2:50 Invited Talk: Spoken Language Understanding: the
Research/Industry Chasm, Roberto Pieraccini

2:50-3:00 Break

3:00-3:30 Using Higher-level Linguistic Knowledge for Speech
Recognition Error Correction in a Spoken Q/A Dialog, Minwoo Jeong,
Byeongchang Kim and Gary Geunbae Lee

3:30-4:00 Speech Recognition Models of the Interdependence Among
Syntax, Prosody, and Segmental Acoustics, Mark Hasegawa-Johnson,
Jennifer Cole, Chilin Shih, Ken Chen, Aaron Cohen, Sandra Chavarria,
Heejin Kim, Taejin Yoon, Sarah Borys and Jeung-Yoon Choi

4:00-4:30 Modeling Prosodic Consistency for Automatic Speech
Recognition: Preliminary Investigations, Ernest Pusateri and James
Glass

4:30-5:00 Assigning Domains to Speech Recognition Hypotheses, Klaus
Rüggenmann and Iryna Gurevych

5:00-5:30 Context Sensing using Speech and Common Sense, Nathan Eagle
and Push Singh

----------------------------------------------------------------

Co-chairs:

Srinivas Bangalore, AT&T Labs - Research
Dilek Hakkani-Tür, AT&T Labs - Research
Gokhan Tur, AT&T Labs - Research
Yuqing Gao, IBM TJ Watson Research Center
Hong-Kwang Jeff Kuo, IBM TJ Watson Research Center
Andreas Stolcke, SRI & ICSI

----------------------------------------------------------------

Program Committee:

Frederic Bechet, Univ. of Avignon, France
Jerome Bellegarda, Apple Computer, USA
Jennifer Chu-Carroll, IBM TJ Watson Research Center, USA
Ciprian Chelba, Microsoft, USA
Stephen Cox, Univ. of East Anglia, UK
Sadaoki Furui, Tokyo Institute of Technology, Japan
Allen Gorin, AT&T Labs - Research, USA
Roberto Gretter, ITC-IRST, Italy
Julia Hirschberg, Columbia University, USA
Dan Jurafsky, University of Colorado, USA
Sanjeev Khudanpur, Johns Hopkins University, USA
Helen Meng, CUHK, Hong Kong
Prem Natarajan, BBN, USA
Hermann Ney, RWTH Aachen, Germany
Martha Palmer, University of Pennsylvania, USA
Barbara Peskin, ICSI, USA
Roberto Pieraccini, IBM TJ Watson Research Center, USA
Manny Rayner, NASA, USA
Brian Roark, AT&T Labs - Research, USA
Roni Rosenfeld, Carnegie Mellon University, USA
Stephanie Seneff, MIT, USA
Elizabeth Shriberg, SRI, USA
Amanda Stent, Stony Brook Univ., USA

CALL FOR PAPERS

Robust 2004: COST278 Workshop on

Robustness Issues in Conversational Interaction

August 30 and 31, 2004

University of East Anglia, Norwich, UK

http://www.cmp.uea.ac.uk/robust04/

A workshop on robustness issues for conversational interaction, organized by COST (European Cooperation in the field of Scientific and Technical Research) action 278, "Spoken Language Interaction in Telecommunication", will be held on August 30th and 31st, 2004 at the University of East Anglia, Norwich, UK.

The objective of this two-day workshop is to bring together researchers from both universities and industry to consider different methods of achieving robustness in conversational interaction.

The workshop is aimed at robustness against all effects which are known to degrade the performance of each individual component of a conversational interaction system.

Different approaches to compensating for these effects will form the main theme of the workshop. A broad list of topics includes (but is not limited to):

Submission and further details
Prospective authors are invited to submit four-page papers describing original work in any of the areas relevant to the workshop.
Email enquiries can be sent to robust04@cmp.uea.ac.uk
Participation in the workshop will be restricted to around 50 people.

Important dates
Submission deadline: June 18th 2004
Notification of acceptance: July 9th 2004
Workshop: August 30th and 31st 2004

Rich Transcription 2004 Meeting Recognition Workshop

ICASSP 2004 in Montreal

May 17, 2004

NIST is conducting a community-wide evaluation of speech-based meeting recognition technologies in March and a 1-day workshop, "Rich Transcription 2004 Meeting Recognition Workshop", on May 17 at ICASSP 2004 in Montreal. While a portion of the workshop will be devoted to discussion of the results of the evaluation, the goal of the workshop is to provide an overview of the state-of-the-art in meeting recognition technologies and discuss plans for future work and collaborations.

Huge efforts are being expended in mining information in newswire, news broadcasts, and conversational speech and in developing interfaces to metadata extracted in these domains. However, until recently, relatively little has been done to address such applications in the more challenging and equally important meeting domain.

The development of smart meeting room core technologies that can automatically recognize and extract important information from multi-media sensor inputs will provide an invaluable resource for a variety of business, academic, and governmental applications. Such metadata will provide the basis for the development of second-tier meeting applications that can automatically process, categorize, and index meetings. Third-tier applications will provide a context-aware collaborative interface between live meeting participants, remote participants, meeting archives and vast online resources.

The meeting domain has several important properties that are not found in other domains and are not currently a focus of other research programs: multiple forums and vocabularies, highly interactive and simultaneous speech, multiple distant microphones, multiple camera views, and multi-media/multi-modal information integration.

The Rich Transcription 2004 Spring Meeting Recognition Workshop at ICASSP 2004 on May 17 in Montreal will bring together the community of researchers working in this new and challenging domain to discuss the challenges, the current state-of-the-art, and future plans and collaborations. Discussions will include the results of the March 2004 Rich Transcription Meeting Recognition Evaluation including both Speech-to-Text Transcription and Speaker Segmentation technologies, related research work in the meeting domain, related governmental programs, and future collaborations.

Workshop Participation

While RT-04 Spring Recognition Evaluation participants will have automatic slots in the workshop, researchers working in related areas (speech technologies, vision technologies, behavioral sciences, etc.) in the meeting domain will also present their work. Additionally, a certain number of non-presenters will be permitted to attend the workshop on an invited basis. Please contact us at rteval@nist.gov if you are interested in attending.

Evaluation

The RT-04 Spring Recognition Evaluation is part of the NIST Rich Transcription Evaluation series and will include both speaker segmentation and speech-to-text transcription tasks in the meeting domain. The test set will be approximately 90 minutes in length and will consist of eight meeting excerpts of about 11 minutes each, collected at CMU, ICSI, the LDC, and NIST.

Colloquium in Honor of Ron Schafer

Georgia Institute of Technology, Atlanta, Georgia

Friday, October 31, 2003 GCATT Building

The position is associated with the European project DIVINES, a STREP of the 6th Framework Programme. The aim of the project is to analyse why recognizers are unable to reach human recognition rates, even when semantic content is absent. Weaknesses will be analyzed at the level of feature extraction, phone models, and lexical models. The focus will be on the intrinsic variabilities of speech in quiet and noisy environments, as well as in read and spontaneous speech. The analysis will not be restricted to tests on several databases with different features and models but will examine the detailed behavior of the algorithms and models. New solutions will be proposed and tested. The project will run for 3 years.

Links to Upcoming Conferences and Workshops

ICA2004 18th International Congress on Acoustics
Kyoto, Japan, April 4-9, 2004
http://www.ica2004.or.jp

ITCC04 - International Conference on Information Technology Coding and Computing
Las Vegas, Nevada, April 5-7, 2004
http://www.itcc.info

NIST Rich Transcription 2004 Meeting Recognition Workshop
Montreal, Canada, May 17, 2004
john.garofolo@nist.gov

Odyssey2004 - ISCA Workshop on Speaker and Language Recognition
Toledo, Spain, May 31 - June 1, 2004
http://www.odyssey04.org/

IEEE2004 Workshop on Signal Processing Advances in Wireless Communications
Lisbon, Portugal, July 11 - 14, 2004
http://spawc2004.isr.ist.utl.pt

SCI2004 - 8th World Conference on Systemics, Cybernetics, and Informatics
Orlando, Florida, July 18 - 21, 2004
http://www.iisci.org/sci2004

ISCSLP2004 - 4th International Symposium on Chinese Spoken Language Processing
Hong Kong, China, December 15-18, 2004

ICSLP2004 - INTERSPEECH 8th Biennial International Conference on Spoken Language Processing
Jeju Island, Korea, October 4-8, 2004
http://www.icslp2004.org

EUROSPEECH 2005 9th European Conference on Speech Communication and Technology
Lisbon, Portugal, September 4-8, 2005
http://www.interspeech2005.org/

Area	Submissions
Speech Production/Synthesis	46
Speech Analysis / Feature Extraction	86
Speech Coding	46
Speech Enhancement	45
Acoustic Modeling for ASR	61
Robust ASR	100
Confidence/Lexical/Language/LVCSR	44
Adaptation	25
Spoken Language Systems	27
Speaker Rec / Language ID	47

Technical Committee	Submissions
Speech Processing	542
Signal Processing Theory and Methods	362
Signal Processing for Communications	368
Image & Multidimensional Signal Processing	357
Sensor Array & Multi-channel Signal Processing	184
Audio & Electroacoustics	153
Industry Technology Track	105
Design & Implementation of SP Systems	110
Multimedia Signal Processing	81
Machine Learning for Signal Processing	187
Signal Processing Education	11
Special Sessions	84


The newly retired Ronald W. Schafer	Ron Schafer with Larry Rabiner
STC Newsletter archive photos of R. W. Schafer and L. R. Rabiner. Actually, the STC Newsletter has no archive; these were scanned from the IEEE Transactions on Audio and Electroacoustics.