March 13, 2005


Welcome to the IEEE Signal Processing Society Speech Technical Committee (STC) newsletter.  As always, contributions of events, publications, workshops, and career information to the newsletter are welcome.  Please send them to Rick Rose.  Archives of recent STC Newsletters can be found on the STC website.

ICASSP 2005 Technical Program Preparation  (Kenneth Barner and Jean-Christophe Pesquet)

STC Awards  (Ananth Sankar)

Call for Papers for a Special Issue of the IEEE Transactions on SAP: Progress in Rich Transcription
Call for Papers for a Special Issue of the IEEE Transactions on SAP: Expressive Speech Synthesis

2005 Human Language Technology Conference on Empirical Methods in Natural Language Processing
2005 AAAI Workshop on Spoken Language Understanding
IEEE 2005 Automatic Speech Recognition and Understanding (ASRU) Workshop
National Institute of Standards and Technology 2005 Speaker Recognition Evaluation

Postdoctoral Fellow Position in Speech Recognition and Speech Modeling Available at McGill University  (Rick Rose)
Senior R&D Positions Available in Speech Technology  (Vishu Viswanathan)
Transitions: ASR Researchers Take New Positions

Links to conferences and workshops organized by date  (Rick Rose)

Summary of ICASSP 2005 Submissions from the Conference Technical Program Chairs

The following summary of the overall ICASSP 2005 technical program preparation was provided by the Technical Program Co-Chairs, Kenneth Barner and Jean-Christophe Pesquet.

The papers submitted to ICASSP were routed to the appropriate TCs for review. The TCs have worked very hard, with the help of external reviewers. To ensure that the papers were thoroughly and fairly reviewed, most submissions received three reviews this year.  The review process is a monumental task. The high degree of professionalism demonstrated by the TCs is a major factor contributing to the success of ICASSP. Much of the credit goes to the TC members and reviewers, who worked hard under the TC leadership of Mazin Rahim, Antonio Ortega, Alle-Jan van der Veen, Ananthram Swami, Petar Djuric, Alex Gershman, Max Wong, Michael Zoltowski, Tülay Adali, Michael Brandstein, Wayne Burleson, Yu Hen Hu, Eli Saber, and Huseyin Abut. Several TC chairs were also ably assisted by area coordinators, each responsible for a group of expert reviewers.

We, as Technical Program Chairs of the conference, worked closely with the TC chairs to put together the final technical program. Conference Management Services, and in particular Lance Cotton and Billene Mercer, provided the excellent infrastructure and support that enabled the technical program to come together. We want to express our special thanks to all these people, to all the contributing authors, and to the special session chairs who organized outstanding sessions on timely topics.

In this year's ICASSP Technical Program, we have organized the papers into 11 technical tracks, comprising 70 lecture and 80 poster sessions. Most of the 1430 accepted papers will be presented as posters. The choice of oral or poster presentation was made by the TCs based entirely on subject grouping.

The breakdown of submissions by technical committee is given in the following table:
Technical Committee                               Submissions
Speech Processing 571
Image & Multidimensional Signal Processing 520
Signal Processing for Communications 405
Signal Processing Theory and Methods 378
Sensor Array & Multi-channel Signal Processing 189
Machine Learning for Signal Processing 179
Audio & Electroacoustics 152
Design & Implementation of SP Systems 82
Multimedia Signal Processing 79
Industry Technology Track 60
Signal Processing Education 18
Special Sessions 104


2004 IEEE Signal Processing Awards

2004 was a very successful year for the Speech area. Four out of the seven SPS awards were won by Speech researchers.  In 2004, the Speech Technical Committee (STC) formed an Awards Subcommittee to coordinate the process of nominating candidates for the three Paper Awards and the four Major Awards. The committee members were Ramesh Gopinath, Li Deng, Alan Black, Kazunori Mano, Isabel Trancoso, and Ananth Sankar. The STC Awards Committee received a total of 23 nominations for the 7 categories from 12 individual nominators. This was followed by a vote within the STC to choose the final 7 nominations for the Speech area. The final nominations were then reviewed and revised by the Awards Subcommittee before submission to the Awards Board. Particular care was taken to highlight the contributions of the final nominees.

Speech won two of the three Paper Awards and two of the four Major Awards. There were multiple winners in some categories. The Speech winners are listed below, with the total number of winners in each category given in parentheses.

 Technical Achievement Award (2):  Prof. Steve Young

Meritorious Service Award (1):  Prof. Andreas Spanias

Young Author Paper Award (2):  G. Tzanetakis and P. Cook, "Musical genre classification of audio signals," IEEE Trans. on Speech and Audio Processing, vol. 10, pp. 293-302, July 2002. The young author was George Tzanetakis.

Best Paper Award (4):  E. Bocchieri and B. K.-W. Mak, "Subspace distribution clustering hidden Markov model," IEEE Trans. on Speech and Audio Processing, vol. 9, pp. 264-275, March 2001.

The STC takes great pride in the achievements of our Speech winners and extends our heartiest congratulations to them.

Special Issue of
The IEEE Transactions on Speech and Audio Processing
Progress in Rich Transcription

Over the past several years, Rich Transcription has emerged as an interdisciplinary field combining automatic speech recognition, speaker identification, and natural language processing with the goal of producing richly annotated speech transcriptions that are useful both to human readers and to automated programs for indexing, retrieval, and analysis. The key problems include developing more accurate speech transcription technology; improving speaker recognition technology; developing fundamentally new techniques for annotating dialog with semantic intent; and enriching ASR output to present it in a maximally informative manner. These various goals interact with each other, and exploiting synergistic uses of the disparate forms of analysis is critical. With its focus on fundamental research in human communication, Rich Transcription is key to governmental applications in data mining, and to commercial applications such as call center automation and monitoring.

The purpose of this special issue is to present recent advances in all areas of Rich Transcription for Speech, Audio, and Spoken Language Dialog. Original, previously unpublished submissions in the following areas are encouraged:

Submission procedure:

Prospective authors should prepare manuscripts according to the Information for Authors as published in any recent issue of the Transactions and as available on the web. Note that all rules will apply with regard to submission lengths, mandatory overlength page charges, and color charges.

Manuscripts should be submitted electronically through the online IEEE manuscript submission system. When selecting a manuscript type, authors must click on "Special Issue of T-SA on Progress in Rich Transcription." Authors should follow the instructions for the IEEE Transactions on Speech and Audio Processing and indicate in the Comments to the Editor-in-Chief that the manuscript is submitted for publication in the Special Issue on Progress in Rich Transcription. A completed and signed copyright form must be faxed to 1-732-562-8905 at the time of submission, with the manuscript number indicated at the top of the page.

Submission deadline: 1 October 2005
Notification of acceptance: 1 April 2006
Final manuscript due: 31 May 2006
Tentative publication date: September 2006

Guest Editors:
Dr. Geoffrey Zweig, IBM, Yorktown Heights, NY
Dr. John Makhoul, BBN Technologies, Cambridge, MA
Dr. Barbara Peskin, ICSI, Berkeley, CA
Dr. Phil Woodland, Cambridge University, Cambridge, U.K.
Dr. Andreas Stolcke, SRI International, Menlo Park, CA

Special Issue of
The IEEE Transactions on Speech and Audio Processing
Expressive Speech Synthesis

Expressive Speech Synthesis (ESS) is a multidisciplinary research area that addresses one of the most complex problems in speech and language processing. The challenges posed by ESS have been the subject of several collaborative research projects across universities and laboratories around the world. Over the last decade ESS has benefited from advances in speech and language processing as well as from the availability of large conversational-speech databases. These advances have spurred research on the expressiveness of speech and on conveying paralinguistic information including emotion, speaker-state, and speaker-listener relationships. There have also been substantial efforts towards automating database creation and evaluating the quality of speech synthesised for a variety of tasks that require not just the transmission of information, but also the expression of affect.

The purpose of this special issue is to present recent advances in Expressive Speech Synthesis. Original, previously unpublished research is sought in all areas relevant to the field. In particular, submissions on theory and methods for the following areas are encouraged:

Submission procedure:

Prospective authors should prepare manuscripts according to the Information for Authors as published in any recent issue of the Transactions and as available on the web. Note that all rules will apply with regard to submission lengths, mandatory overlength page charges, and color charges.

Manuscripts should be submitted electronically through the online IEEE manuscript submission system. When selecting a manuscript type, authors must click on "Special Issue of T-SA on Expressive Speech Synthesis." Authors should follow the instructions for the IEEE Transactions on Speech and Audio Processing and indicate in the Comments to the Editor-in-Chief that the manuscript is submitted for publication in the Special Issue on Expressive Speech Synthesis. A completed and signed copyright form must be faxed to 1-732-562-8905 at the time of submission, with the manuscript number indicated at the top of the page.

Submission deadline: 1 June 2005
Notification of acceptance: 1 December 2005
Final manuscript due: 28 February 2006
Tentative publication date: May 2006

Guest Editors:
Dr. Nick Campbell, ATR Network Informatics Research Labs, Kyoto, Japan
Dr. Wael Hamza, IBM T.J. Watson Research Center, Yorktown Heights, USA
Dr. Harald Höge, Siemens AG Central Technology, Germany
Dr. Tao Jianhua, Pattern Recognition Laboratory, Chinese Academy of Sciences, China
Dr. Gérard Bailly, Institut de la Communication Parlée, France

HLT/EMNLP 2005 Call for Papers

Human Language Technology Conference
on Empirical Methods in Natural Language Processing

October 6-8, 2005
Vancouver, B.C., Canada

Submission deadline: June 3, 2005

HLT/EMNLP 2005 continues the conference series jointly sponsored by the Human Language Technology Advisory Board (HLT) and the Association for Computational Linguistics (ACL). This year's conference is co-sponsored by SIGDAT, the ACL's special interest group on linguistic data and corpus-based approaches to NLP, which has traditionally sponsored the Empirical Methods in Natural Language Processing (EMNLP) Conferences. The joint conference provides a unified forum for researchers across a spectrum of disciplines to present recent, high-quality, cutting-edge work, to exchange ideas, and to explore emerging new research directions. The conference especially encourages submissions that discuss synergistic combinations of language technologies (e.g., Speech with Information Retrieval, Machine Translation with Speech, Question Answering with Natural Language Processing, etc.). Particular consideration will be given to papers addressing novel learning tasks and evaluation metrics in speech, natural language processing and information retrieval, including e.g.:

We are interested in papers from academia, government, and industry on all areas of traditional interest to the HLT and SIGDAT communities, as well as aligned fields, including but not limited to:

Important Dates:

Submission deadline June 3, 2005
Notification of acceptance July 29, 2005
Submission of camera-ready papers August 12, 2005
Conference October 6-8, 2005


Submissions must describe original, completed, unpublished work, and include concrete evaluation results when appropriate. Papers being submitted to other meetings must provide this information (see submission format). In the event of multiple acceptances, authors are requested to immediately notify the HLT/EMNLP program chair and to choose which meeting to present and publish the work at as soon as possible. We cannot accept for publication or presentation work that will be (or has been) published elsewhere.

Papers must be submitted electronically in Postscript (PS) or Portable Document Format (PDF). They should follow the ACL formatting guidelines and should not exceed eight (8) pages in two-column format, including references and illustrations. Papers exceeding the maximum length may be rejected without review. Authors are encouraged to use the style files provided on the HLT/EMNLP 2005 website. We strongly prefer submissions in PS format. Any author who submits in PDF must assume the responsibility for ensuring that fonts are treated properly so that the paper will print (not just view) anywhere. (This may involve reading the manual.) DOC/RTF formats cannot be accepted.

Reviewing will be blind. No information identifying the authors should be in the paper: this includes not only the authors' names and affiliations, but also self-references that reveal authors' identities; for example, "We have previously shown (Smith 1999)" should be changed to "Smith (1999) has previously shown". Names and affiliations should be listed on a separate identification page.

Papers must be submitted electronically by 12 a.m. GMT on June 3, 2005, through the conference website. In addition, information about each paper must be provided, including:

Authors who cannot submit a file electronically should contact the program chairs before the due date to arrange alternative forms of submission.

After notifications of acceptance have been issued, authors will have the opportunity to revise their submissions in accordance with reviewers' comments. The due date for the final submission of camera-ready papers is August 12, 2005.

SLU 2005

AAAI Workshop on

Spoken Language Understanding

Held in conjunction with
The Twentieth National Conference on Artificial Intelligence - AAAI 2005

July 9 or 10, 2005, Pittsburgh, Pennsylvania


Call for Papers



Workshop Description

Natural language processing (NLP) has been one of the defining subtopics of AI since its early days. In recent times, NLP has predominantly been about text understanding and building associated resources for the purposes of information extraction, question answering, and text mining. Many of these tasks have nourished the creation and development of extensive ontologies, practical semantic representations, and novel machine learning techniques. In a spirit similar to the workshop at HLT-NAACL 2004 on this topic, our attempt is to broaden the scope of language understanding to include understanding of spoken language (SLU) in the context of applications such as speech mining and human-machine interactive spoken dialog systems. We aim to bring together techniques that address the issue of robustness of SLU to speech recognition errors, language variability and dysfluencies in speech with issues of semantic representation that provide greater flexibility and portability to a dialog model. We believe spoken language understanding is an especially attractive topic for cross-fertilization of ideas between the AI, IR, NLP, Speech, and Semantic Web communities.


Workshop Topics

We invite submissions covering the full range of topics related to Spoken Language Understanding. Topics of interest include (but are not limited to):

  • Approaches to building an SLU system
    • rule-based, data-driven, or hybrid
    • automatic adaptation across domains
  • Approaches to robustness in SLU
    • Handling uncertain and erroneous input
    • Handling dysfluencies and language variations
  • Tighter integration of Speech Recognition and SLU
    • Exploiting weighted packed representation of hypotheses
    • Exploiting prosodic and emotional cues from speech
  • Approaches to semantic representations provided by SLU
    • Combining shallow and deep representations
    • Representations permitting robust inference mechanisms
  • Tools and Data Resources
  • Issues and metrics for evaluation of SLU
  • SLU in the context of Applications
    • multilingual systems
    • multimodal systems
    • tutoring systems
    • speech mining systems
    • spoken dialog systems


Paper Submission

All submissions must be sent by e-mail with the subject line "AAAI-05 SLU Workshop paper submission". Please use the AAAI prescribed formatting instructions. Papers must be 5 to 8 pages long, including all references and figures. All papers must be submitted in either PDF (preferred) or PostScript format. If any special fonts are used, they must be included in the submission. Papers must be original and previously unpublished. Note that reviewing will NOT be blind; submissions may include the authors' names and affiliations.


Important Dates

  • April 20, 2005: Deadline for electronic submission
  • May 11, 2005: Notification of acceptance or rejection
  • May 18, 2005: Submission of camera-ready papers


Workshop Co-Chairs



Previous Workshops








Automatic Speech Recognition and Understanding Workshop


Fiesta Americana Grand Coral Beach Resort 

Cancun, Mexico

November 27 – December 1, 2005

The ninth biennial IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) will be held November 27 - December 1, 2005.  The ASRU workshops have a tradition of bringing together researchers from academia and industry in an intimate and collegial setting to discuss problems of common interest in automatic speech recognition and understanding. Submissions are encouraged in all areas of human language technology, with emphasis placed on automatic speech recognition and understanding technology, speech-to-text systems, spoken dialog systems, multilingual language processing, robustness in ASR, spoken document retrieval, and speech-to-speech translation.
The workshop program will consist of invited lectures, oral and poster presentations, and panel discussions. Ample time will be allowed for informal discussions and for enjoying the impressive tropical setting.  The workshop website will be accessible by January 2005.

Prospective authors are invited to submit full-length, 4-6 page papers, including figures and references. All papers will be handled and reviewed electronically. The ASRU 2005 website will provide further details. Please note that the submission dates for papers are strict deadlines.

Special session proposals should be submitted by June 15, 2005, and must include a topical title, rationale, session outline, contact information, and a description of how the session will be organized.


May 1, 2005              Workshop registration opens
July 1, 2005             Paper submission deadline
August 15, 2005          Paper acceptance/rejection notices mailed
Sept. 15, 2005           Revised papers due and author registration deadline
Oct. 1, 2005             Hotel reservation and workshop registration deadline
Nov. 27 - Dec. 1, 2005   Workshop

General Chairs
    Jim Glass, MIT, USA
    Richard Rose, McGill University, Canada
Technical Chairs
    Michael Picheny, IBM, USA
    Renato de Mori, Avignon, France
    Richard Stern, CMU, USA
Publicity Chair
    Ruhi Sarikaya, IBM, USA
Publications Chair
    Dilek Hakkani-Tur, AT&T, USA
Local Arrangements Chair:
    Juan Nolazco, Monterrey, Mexico
Demonstrations Chair
    Anand Venkataraman, SRI, USA

Alex Acero
Srini Bangalore
Jerome Bellegarda, Apple
Mary Harper
Julia Hirschberg, Columbia University
Helen Meng, CUHK
Roberto Pieraccini, IBM
Alex Rudnicky, CMU
Stephanie Seneff, MIT
Liz Shriberg, SRI
Gokhan Tur, AT&T
Wayne Ward, Univ. of Colorado
Steve Young, Cambridge University
Eric Fosler-Lussier, Ohio State University
Sadaoki Furui, Tokyo Institute of Technology
J.L. Gauvain, LIMSI
Yuqing Gao, IBM
Hermann Ney, RWTH Aachen
Joe Picone, Mississippi State University
Abeer Alwan, UCLA
Jeff Bilmes, Univ. of Washington
Herve Bourlard, IDIAP
Dan Ellis, Columbia University
Mark Hasegawa-Johnson, UIUC
Hynek Hermansky, IDIAP
Chris Wellekens, EURECOM
Chin-Hui Lee, Georgia Tech
Shri Narayanan, USC


The 2005 National Institute of Standards and Technology

Speaker Recognition Evaluation

NIST has been coordinating Speaker Recognition Evaluations since 1996. Each evaluation begins with the announcement of the official evaluation plan, which clearly states the rules and tasks involved in the evaluation. The evaluation culminates in a follow-up workshop, where NIST reports the official results and researchers share their findings.

Brief History

Since 1996, over 40 research sites have participated in our evaluations. Each year, new researchers in industry and universities are encouraged to participate, and collaboration between universities and industry is also welcomed. The overall goals of the evaluations have always been to drive the technology forward, to measure the state of the art, and to find the most promising algorithmic approaches.

The 2005 NIST Speaker Recognition Evaluation

The 2005 NIST Speaker Recognition Evaluation is part of an ongoing series of yearly evaluations conducted by NIST. These evaluations provide an important contribution to the direction of research efforts and the calibration of technical capabilities. They are intended to be of interest to all researchers working on the general problem of text-independent speaker recognition. To this end, the evaluation was designed to be simple, to focus on core technology issues, to be fully supported, and to be accessible.

Non-LDC members are required to sign the LDC's license agreement before being granted access to SRE-05 data.

Evaluation Schedule
Posting of the official evaluation specification document
Last day to register for participation
Evaluation begins
Site submissions due at NIST
First release of results to the participants
Site workshop presentations/talks due at NIST
June 6-8, 2005 Evaluation workshop, Eastern United States

More Information

To find out more about previous evaluations and to access related publications, go to the NIST speaker identification web page.  To register to participate in future evaluations, to obtain more information about our evaluations, or to be notified of the next evaluation or developments thereof, please e-mail Dr. Alvin Martin at NIST.


Two Senior R&D Positions Available 

Speech Technologies R&D Lab 

Texas Instruments, Dallas, TX

Handset Acoustic Signal Processing: The available position involves work as part of an R&D team designing, testing, and tuning acoustic solutions for wireless handsets in support of TI's wireless business. Responsibilities also include working collaboratively with our wireless product group and providing consulting support.
Qualifications:  Ph.D. in EE, or MSEE with equivalent experience. Strong background and experience in digital signal processing, with emphasis on speech processing and its applications. Demonstrated experience over 3-5 years in design, testing, and tuning of acoustic processing functions, including acoustic echo cancellation, noise suppression, AGC, and compressor/limiter, especially as they relate to wireless handsets. Knowledge of and experience in multi-microphone speech acquisition is desirable. Demonstrated software development experience in C/Unix and Matlab. Effective oral and written communication skills. Background and experience in speech coding algorithms is a plus, as the position is in TI's speech coding R&D team.
Speech Recognition:  The available position requires a candidate with strong prior experience in automatic speech recognition research and development.  In R&D support of TI product groups, the Speech Technologies R&D laboratory designs and develops speaker-independent, speaker-dependent, and speaker-adaptive speech recognition systems for hand-held and hands-free voice input, focusing on recognition accuracy and robustness under adverse conditions as in mobile environments, small foot-print solutions, dynamic vocabulary and grammar, and multiple language recognition.

Qualifications:  The candidate must have a PhD, or MS with equivalent experience, in EE or CS; a strong background and experience in speech recognition technology; 3-5 years of experience in design and implementation of robust, high-performance speech recognition algorithms and associated application development, including APIs; an interest in and commitment to solving real-world problems and bringing algorithms to products; and demonstrated programming abilities.  Prior experience in letter-to-phoneme mapping, as would be needed for robust speaker-independent name recognition, and in dealing with multiple languages will be a plus.
Qualified candidates may send a letter and a resume by e-mail to Vishu Viswanathan.

Postdoctoral Fellowship Position in Speech Recognition

and Speech Modeling at McGill University

We are seeking a postdoctoral fellow to perform basic research as part of a three-year project in the general area of automatic speech recognition.  The position will be in the Electrical and Computer Engineering Department at McGill University in Montreal, Canada.  The candidate should have hands-on experience with speech processing systems and a strong background in statistical modeling, signal processing, and/or speech analysis.  The candidate should also be proficient with high-level programming languages for developing prototype systems and simulations.  Fluency in English is required, and the ability to work in a small team is also important.

The position is in support of a Canadian NSERC-funded project that is being conducted in association with the European 6th Framework Project DIVINES.  The overall goal of the project is to overcome deficiencies in existing acoustic feature analysis and phonetic and lexical modeling techniques.  This will be accomplished through a methodology involving the diagnosis and modeling of intrinsic variabilities in ASR under a variety of conditions.

McGill University is located in Montreal, an exciting cosmopolitan city in the Province of Quebec. Montreal is home to a number of speech recognition research and development laboratories, including the Centre de Recherche Informatique de Montréal (CRIM), ScanSoft Canada, Nuance Canada, Nu Echo, and others.  Montreal has a bilingual population with a blend of European and North American culture. Qualified applicants are invited to submit a resume together with the names and addresses of two references by email to:

Richard Rose   
McGill University
Department of Electrical and Computer Engineering
McConnell Engineering Building, Room 813
3480 University Street
Montreal, Quebec
H3A 2A7

phone: 514-398-1749
fax: 514-398-4470


ASR Researchers Take New Positions

The STC Newsletter would like to announce professors, researchers, and developers in the speech area who are taking new positions.  If you have moved recently or are in the process of moving to a new position in the near future, send your new contact information to the STC Newsletter so it can be posted in the next edition.


Links to Upcoming Conferences and Workshops

(Organized by Date)

Philadelphia, Pennsylvania, May 2005

Auditory-Visual Speech Processing (AVSP 2005)
Vancouver Island, British Columbia, Canada, July 24-27, 2005

SIGdial Workshop on Discourse and Dialog
Lisbon, Portugal , September 2-3, 2005

Sesimbra, Portugal, September 3, 2005

EUROSPEECH 2005 9th European Conference on Speech Communication and Technology
Lisbon, Portugal, September 4-8, 2005

Disfluency in Spontaneous Speech
Aix-en-Provence, September 10-12, 2005

IEEE WASPAA 2005 Workshop on Applications of Signal Processing to Audio and Acoustics
New Paltz, New York, October 16-19, 2005

SPECOM 2005 - 10th International Conf. on Speech and Computers
Patras, Greece, October 17-19, 2005

IEEE ASRU 2005 Automatic Speech Recognition and Understanding Workshop
Cancun, Mexico, November 27 - December 1, 2005

Toulouse, France May 15-19, 2006

Pittsburgh, PA, USA September 17-21, 2006

Honolulu, Hawaii, USA, 2007, April 17-20

Antwerp, Belgium, August 27-31, 2007
