December 12, 2004
INTRODUCTION:
Welcome to the IEEE Signal Processing Society Speech Technical Committee (STC) newsletter. As always, contributions on events, publications, workshops, and career information are welcome (rose@ece.mcgill.ca).
LINKS TO WORKSHOPS AND CONFERENCES:
Links to conferences and workshops, organized by date (Rick Rose)
Tomoko Matsui
Professor Matsui has been at the forefront of research in speaker recognition since she joined NTT Speech Research in 1988. Working with Dr. Sadaoki Furui, she has developed key technologies in this area, including posterior-probability-based score normalization, text-prompted speaker recognition, and noise-robust HMM composition. She has also worked at ATR.
Jean-Claude Junqua
Dr. Junqua is currently with the AV Core Technology Development Center (ACC) at Matsushita Electric Industrial Co.
Shri Narayanan
Roger Moore
Dr. Moore is currently with the Speech and Hearing Research Group (SPandH) in the Department of Computer Science at the University of Sheffield. He is the immediate past President of both the International Speech Communication Association (ISCA) and the Permanent Council of the International Conferences on Spoken Language Processing (ICSLP), and a member of the editorial/advisory boards of the journals Computer Speech & Language and Speech Communication. He was awarded the UK Institute of Acoustics Tyndall Medal for "distinguished work in the field of speech research and technology" and received the NATO RTO Scientific Achievement Award for "repeated contribution in scientific and technological cooperation".
Peter Kabal
The Speech Technical Committee (STC) received 572 papers for ICASSP 2005, nearly a 9% increase over ICASSP 2004 and ICASSP 2003. About 51% of the papers were rejected. A rigorous review process was implemented in which each paper was reviewed by at least three people: one STC member and two external reviewers. The review process involved 150 external reviewers and 46 STC members; nine of the STC members acted as Area Chairs, working closely with the STC team members to integrate the final scores, make accept/reject decisions, form technical sessions, and identify session chairs.
For ICASSP 2005, the STC adopted a new speech EDICS classification of topics in order to align ICASSP with recent advances in speech and language processing research. As a result of this change, we saw nearly a 50% increase in speaker recognition papers, nearly a 50% increase in spoken language technology papers, and a modest number of new submissions (about 5%) in the areas of speech data mining and multimodal/multimedia human/computer interfaces. Detailed paper submission statistics are shown in the table below.
The STC has been successful in attracting two very strong special sessions on Human Language Technology, to be organized by M. Ostendorf, E. Shriberg, and A. Stolcke. We have also been successful in hosting a unique tutorial on "Machine Learning for Speech and Language Processing," to be given by J. Bilmes and P. Haffner.
Area                          Number of Papers
Speech Production                            6
Speech Perception                            8
Speech Analysis                             58
Speech Synthesis                            46
Speech Coding                               32
Speech Enhancement                          48
Acoustic Modeling                           64
Robustness                                  74
Adaptation/Normalization                    26
General Topics in ASR                       35
Language Modeling                           16
Large-Vocabulary ASR                        24
Speaker Recognition                         64
Spoken Language Processing                  45
Human/Computer Interaction                   5
Multimedia/Multimodal                        8
Data Mining and Search                      13
Total                                      572
ISCA, together with the Interspeech 2005 - Eurospeech organizing committee, encourages the submission of papers for the upcoming conference.
TOPICS OF INTEREST:
The deadline for full paper submission (4 pages) is April 8, 2005. Papers must be submitted via the conference website, following the submission guidelines. Previously published papers should not be submitted. Each corresponding author will be notified of acceptance by e-mail by June 10, 2005. Minor updates will be allowed between June 10 and June 17.
PROPOSALS FOR TUTORIALS AND SPECIAL SESSIONS
We encourage proposals for half-day pre-conference tutorials to be held on September 4, 2005. Those interested in organizing a tutorial should send a one-page description to tutorials@interspeech2005.org by January 14, 2005.
INTERSPEECH'2005 also welcomes proposals for special sessions. Each special session normally consists of 6 invited papers, although other formats may be considered. The topics of the special sessions should be important, new, emerging areas of interest to the speech processing community, yet have little overlap with the regular sessions. They could also be interdisciplinary topics that encourage cross-fertilization of fields, or topics investigated by members of other societies that are becoming of keen interest to the speech and language community. Special session papers follow the same submission format as regular papers. Proposals for special sessions should be sent to special_sessions@interspeech2005.org by January 14, 2005.
IMPORTANT DATES:
Proposals for tutorials and special sessions due: January 14, 2005
Full paper submission deadline: April 8, 2005
Notification of paper acceptance/rejection: June 10, 2005
Early registration deadline: June 25, 2005
For further information, visit www.interspeech2005.org or send e-mail to info@interspeech2005.org.
ORGANIZER:
L2F - Spoken Language Systems Laboratory, INESC-ID Lisboa
Rua Alves Redol, 9 - 1000-029 Lisbon - Portugal
Phone: +351 213100268
Fax: +351 213145843
www.l2f.inesc-id.pt
The ninth biennial IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) will be held November 27 - December 1, 2005. The ASRU workshops have a tradition of bringing together researchers from academia and industry in an intimate and collegial setting to discuss problems of common interest in automatic speech recognition and understanding. Papers in all areas of human language technology are encouraged, with emphasis placed on automatic speech recognition and understanding technology, speech-to-text systems, spoken dialog systems, multilingual language processing, robustness in ASR, spoken document retrieval, and speech-to-speech translation.
The workshop program will consist of invited lectures, oral and poster presentations, and panel discussions. Ample time will be allowed for informal discussions and for enjoying the impressive tropical setting. The workshop website, http://www.asru2005.org, will be accessible by January 2005.
PAPER SUBMISSION:
Prospective authors are invited to submit full-length, 4-6 page papers, including figures and references, at www.asru2005.org. All papers will be handled and reviewed electronically. The ASRU 2005 website will provide further details. Please note that the paper submission dates are strict deadlines.
SPECIAL SESSIONS:
Special session proposals should be submitted by June 15, 2005, to asru05-tc@lists.csail.mit.edu and must include a topical title, rationale, session outline, contact information, and a description of how the session will be organized.
TENTATIVE DATES:
May 1, 2005: Workshop registration opens
July 1, 2005: Camera-ready paper submission deadline
August 15, 2005: Paper acceptance/rejection notices mailed
September 15, 2005: Revised papers due; author registration deadline
October 1, 2005: Hotel reservation and workshop registration
November 27 - December 1, 2005: Workshop
ORGANIZING COMMITTEE:
General Chairs
Jim Glass, MIT, USA
Richard Rose, McGill University, Canada
Technical Chairs
Michael Picheny, IBM, USA
Renato de Mori, Avignon, France
Richard Stern, CMU, USA
Publicity Chair
Ruhi Sarikaya, IBM, USA
Publications Chair
Dilek Hakkani-Tur, AT&T, USA
Local Arrangements Chair
Juan Nolazco, Monterrey, Mexico
Demonstrations Chair
Anand Venkataraman, SRI, USA
The 2005 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA'05) will be held at the Mohonk Mountain House in New Paltz, New York, and is sponsored by the Audio and Electroacoustics Committee of the IEEE Signal Processing Society. The objective of this workshop is to provide an informal environment for the discussion of problems in audio and acoustics and the signal processing techniques leading to novel solutions. Technical sessions will be scheduled in the morning and before and after dinner. Afternoons will be left free for informal meetings among workshop participants. Papers describing original research and new concepts are solicited for technical sessions on, but not limited to, the following topics:
Acoustic Scenes
- Scene Analysis: Source Localization, Source Separation, Room Acoustics
- Signal Enhancement: Echo Cancellation, Dereverberation, Noise Reduction, Restoration
- Multichannel Signal Processing for Audio Acquisition and Reproduction
- Virtual Acoustics via Loudspeakers or Headphones
Audio Coding
- Waveform Coding and Parameter Coding
- Spatial Audio Coding
- Internet Audio
- Musical Signal Analysis: Segmentation, Classification, Transcription
- Digital Rights
- Mobile Devices
Hearing and Perception
- Auditory Perception, Spatial Hearing, Quality Assessment
- Hearing Aids
Music
- Signal Analysis and Synthesis Tools
- Creation of Musical Sounds: Waveforms, Instrument Models, Singing
Submission of four-page paper: April 15, 2005
Notification of acceptance: June 27, 2005
Early registration until: September 1, 2005
Workshop Committee
General Chair:
Walter Kellermann
Multimedia Communications and Signal Processing
University of Erlangen-Nuremberg, Germany
wk@LNT.de
Technical Program Chair:
Rudolf Rabenstein
rabe@LNT.de
Finance Chair:
Michael Brandstein
msb@ll.mit.edu
Publications:
Heinz Teutsch
teutsch@LNT.de
Publicity:
Schuyler Quackenbush
Audio Research Labs, USA
srq@audioresearchlabs.com
Local Arrangements:
Jingdong Chen
jingdong@research.bell-labs.com
Registration:
Ursula Arnold
arnold@LNT.de
Far East Liaison:
Shoji Makino
NTT Communication Science Laboratories, Japan
maki@cslab.kecl.ntt.co.jp
The success of a conversational system depends on a synergistic integration of technologies such as speech recognition, spoken language understanding (SLU), dialog modeling, natural language generation, speech synthesis, and user interface design. In this special issue, we will address the SLU component of a conversational system and its relation to the speech recognizer and the dialog model. In particular, we aim to bring together techniques that address the robustness of SLU to speech recognition errors, language variability, and disfluencies in speech, together with issues of output representations from SLU that provide greater flexibility to the dialog model.
The topic of robust SLU received much attention during the DARPA-funded ATIS program of the 1990s and, more recently, the DARPA Communicator program. In parallel with that research, a number of real-world conversational systems have been deployed. However, the techniques for robust SLU have branched out in many different directions, influenced by areas such as information extraction, question answering, and machine learning.
The objective of this issue is to provide the speech and language processing community with a forum for presenting recent advances, perspectives, and research directions in SLU for conversational systems. The special issue follows on the related HLT/NAACL 2004 workshop and will address related topics.
GUEST EDITORS:
Dr. Srinivas Bangalore, AT&T Labs - Research, srini@research.att.com
Dr. Dilek Hakkani-Tür, AT&T Labs - Research, dtur@research.att.com
Dr. Gokhan Tur, AT&T Labs - Research, gtur@research.att.com
IMPORTANT DATES:
Extended submission deadline: January 1, 2005 (early submission is encouraged)
Notification of acceptance: April 1, 2005
Final manuscript due: June 1, 2005
Tentative publication date: September 1, 2005
Speech quality may significantly deteriorate in the presence of interference, especially when the speech signal is also subject to reverberation. Consequently, modern communications systems, such as cellular phones, employ some speech enhancement procedure at the preprocessing stage, prior to further processing (e.g., speech coding).
Generally, the performance of single-microphone techniques is limited, since these techniques can utilize only spectral information. For the dereverberation problem in particular, no adequate single-microphone enhancement techniques are presently available. Hence, in many applications, such as hands-free mobile telephony, voice-controlled systems, teleconferencing, and hearing instruments, there is a growing tendency to move from single-microphone systems to multimicrophone systems. Although multimicrophone systems come at an increased cost, they have the advantage of incorporating both spatial and spectral information.
The use of multimicrophone systems raises many practical considerations, such as tracking the desired speech source and robustness to unknown microphone positions. Furthermore, due to the increased computational load, real-time algorithms are more difficult to obtain, and hence the efficiency of the algorithms becomes a major issue.
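To make the spatial-plus-spectral idea concrete, here is a minimal sketch of a classical delay-and-sum beamformer, the simplest multimicrophone enhancement technique. It is offered only as an illustration, not as a method from this special issue; the function name, the far-field source with known direction of arrival, and the linear array geometry are all illustrative assumptions.

    # Minimal delay-and-sum beamformer sketch (illustrative assumptions:
    # far-field source, known direction of arrival, linear microphone array).
    import numpy as np

    def delay_and_sum(signals, mic_positions, doa_deg, fs, c=343.0):
        """Steer the array toward doa_deg and average the aligned channels.

        signals:       (num_mics, num_samples) array of microphone recordings
        mic_positions: (num_mics,) mic coordinates along the array axis, in meters
        doa_deg:       direction of arrival relative to broadside, in degrees
        fs:            sampling rate in Hz
        c:             speed of sound in m/s
        """
        num_mics, num_samples = signals.shape
        # Far-field propagation delay of each microphone relative to the origin.
        delays = np.asarray(mic_positions) * np.sin(np.deg2rad(doa_deg)) / c
        freqs = np.fft.rfftfreq(num_samples, d=1.0 / fs)
        out = np.zeros(num_samples)
        for m in range(num_mics):
            # Compensate each channel's delay as a linear phase shift in the
            # frequency domain (a circular shift; edge effects are ignored here).
            spectrum = np.fft.rfft(signals[m]) * np.exp(2j * np.pi * freqs * delays[m])
            out += np.fft.irfft(spectrum, n=num_samples)
        return out / num_mics

Because the channels are time-aligned before averaging, sound arriving from the look direction adds coherently while interference and reverberation from other directions are attenuated; this spatial selectivity is exactly what single-microphone, purely spectral methods cannot provide.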
The main focus of this special issue is on emerging methods for speech processing using multimicrophone arrays.
Topics of interest include (but are not limited to):
Authors should follow the EURASIP JASP manuscript format described at the journal site http://asp.hindawi.com/. Prospective authors should submit an electronic copy of their complete manuscript through the EURASIP JASP manuscript tracking system at http://www.mstracking.com/asp/, according to the following timetable:
Manuscript due: February 1, 2005
Acceptance notification: June 1, 2005
Final manuscript due: September 1, 2005
Publication date: 1st quarter, 2006
Jacob Benesty, Université du Québec, Canada
Joerg Bitzer, University of Applied Sciences Oldenburg, Germany
Israel Cohen, Technion – IIT, Israel
Simon Doclo, Katholieke Universiteit Leuven, Belgium
Sharon Gannot, Bar-Ilan University, Israel
Rainer Martin, Ruhr-Universität Bochum, Germany
Sven Nordholm, Curtin University of Technology, Australia
[Photo: Jim Flanagan and Larry Rabiner sharing a story at Jim's retirement dinner]
Software Development Engineer, Speech Recognition
Speech recognition (SR) technology is being used more and more as part of natural user interfaces, and Microsoft has shipped multiple products with SR technology inside. The speech component group is looking for a developer who is passionate about the opportunities that speech technology provides and who will work on our acoustic and language modeling technologies. The position offers opportunities for innovation on the next wave of Microsoft platforms in the server, embedded, and desktop markets, for various languages.
Job responsibilities include the design, implementation, and analysis of advanced acoustic and language modeling technologies, and the design, development, and optimization of speech technology processes used to advance the state of the art in automatic speech recognition. You will be required to design efficient algorithms for improving speech recognition performance, run experiments, and create tools and scripts to process large datasets based on internal and external research. Rigorous attention to detail is required, and the ability to analyze results and dynamically choose an optimal course of development is a must.
Qualifications include a passion for speech technology; an MS or PhD degree in Computer Science or a related discipline; 3 or more years of experience programming in C/C++ and strong computer science skills; the ability to multitask and handle ambiguity; and the ability to identify and solve problems in complicated SR systems. A strong background in speech recognition technology, statistical modeling, pattern recognition, or signal processing is highly preferred.
Yifan Gong
Room 2387, Building 17
Microsoft Corporation
1 Microsoft Way
Redmond, WA 98052-6399, USA
425-705-9555
ygong@microsoft.com
ISCSLP 2004 - 4th International Symposium on Chinese Spoken Language Processing
Hong Kong, China, December 15-18, 2004
ICASSP 2005
Philadelphia, Pennsylvania, March 18-23, 2005
http://www.icassp2005.org/
SIGdial Workshop on Discourse and Dialog
Lisbon, Portugal , September 2-3, 2005
http://www.sigdial.org/workshops/workshop6
INTERSPEECH 2005 - EUROSPEECH, 9th European Conference on Speech Communication and Technology
Lisbon, Portugal, September 4-8, 2005
http://www.interspeech2005.org/
Disfluency in Spontaneous Speech
Aix-en-Provence, September 10-12, 2005
http://www.up.univ-mrs.fr/delic/Diss05
SPECOM 2005 - 10th International Conf. on Speech and Computers
Patras, Greece, October 17-19, 2005
http://www.wcl.ee.upatras.gr/specom2005.htm
IEEE ASRU2005 Automatic Speech Recognition and Understanding
Workshop
Cancun, Mexico, November 27 - December 1, 2005
http://www.asru2005.org
ICASSP 2006
Toulouse, France, May 15-19, 2006
INTERSPEECH 2006 - ICSLP
Pittsburgh, PA, USA September 17-21, 2006
http://www.interspeech2006.org/
ICASSP 2007
Hawaii, USA, 2007
INTERSPEECH 2007
Antwerp, Belgium, August 27-31, 2007
http://www.interspeech2007.org/