Welcome to the Winter 2011 edition of the IEEE Speech and Language Processing Technical Committee's Newsletter.
In this issue we are pleased to provide another installment of brief articles representing a diversity of views and backgrounds. This issue includes articles from 13 guest contributors, and our own 8 staff reporters and editors.
We believe the newsletter is an ideal forum for updates, reports, announcements and editorials which don't fit well with traditional journals. We welcome your contributions, as well as calls for papers, job announcements, comments and suggestions. You can submit job postings here, and reach us at speechnewseds [at] listserv (dot) ieee [dot] org.
Finally, to subscribe to the Newsletter, send an email with the command "subscribe speechnewsdist" in the message body to listserv [at] listserv (dot) ieee [dot] org.
Jason Williams, Editor-in-chief
Pino Di Fabbrizio, Editor
Martin Russell, Editor
Chuck Wooters, Editor
An update on ICASSP 2011, what to do when authors can't present papers, and a look ahead.
The IEEE Signal Processing Society, our parent organization, also produces a monthly newsletter, "Inside Signal Processing".
The ISCA Student Advisory Committee (ISCA-SAC) was established in 2005 by the International Speech Communication Association (ISCA) to organize and coordinate student-driven projects. After our work in 2008 and 2009 we are looking forward to a busy 2011.
We are delighted to host the first "Show & Tell" event at Interspeech 2011. Show & Tell will provide researchers, technologists and practitioners from academia, industry and government the opportunity to demonstrate their latest research systems and interact with the attendees in an informal setting.
This article describes some of the main achievements to date in the EC FP7 project "CLASSiC", which ends in early 2011. The project focuses on statistical methods for dialogue processing, and is one of the largest current European research projects in speech and language technology. It is coordinated by Heriot-Watt University, and is a collaboration between Cambridge University, the University of Geneva, University of Edinburgh, Supelec, and France Telecom / Orange Labs.
In November, an AAAI symposium sought to address a growing problem in robotics: how should we build dialog systems for robots? The meeting brought together researchers from communities that traditionally haven't had a clear line of communication: human-robot interaction and spoken dialog. We had a chance to interview one of the organizers about the workshop and the challenges currently facing human-robot dialog researchers.
This article covers recent work in distant speech processing, including the EU-funded DICIT project.
This short article presents some brief thoughts on regional accent and dialect in speech and language technology research, and their relationship with language recognition.
Over the past few years, IBM Research has been actively involved in a project to build a computer system, known as Watson, to compete at the human championship level on the quiz show Jeopardy!. After four years of intense research, Watson can perform on the Jeopardy! show at the level of human expertise in terms of precision, confidence and speed. The official first-ever man vs. machine Jeopardy! competition will air on television February 14, 15, and 16.
Structure makes text coherent and meaningful. Discourse relations (also known as rhetorical or coherence relations) link clauses in text and together compose its overall structure. Discourse relations are used in natural language processing applications, including text summarization and natural language generation. In this article we discuss human agreement on the discourse annotation task and review approaches to automatic identification of discourse relations.
Superlectures.com is an innovative lecture video portal that enables users to search for spoken content, which significantly speeds up access to lecture video recordings. The aim of the portal is to make video content as easily searchable as any textual document. The speech processing system automatically recognizes and indexes Czech and English spoken words.
SLTC Newsletter, February 2011
This is the first SLTC Newsletter of 2011, and it also represents my first contribution as SLTC Chair. I want to extend sincere thanks and appreciation to Steve Young, the outgoing SLTC Chair, for his outstanding leadership, dedication, and commitment to the Speech and Language Processing TC. His hard work has resulted in great progress for speech and language within the IEEE Signal Processing Society. I also want to extend a warm welcome to our new vice-Chair, Doug O'Shaughnessy. I look forward to working with Doug and the rest of the SLTC Members over the next two years.
Well, it seems like IEEE ICASSP-2010 was just here, and we're now well into the ICASSP-2011 process. One of the issues the IEEE Signal Processing Society has moved forward on this round is to better articulate the process for presenting papers at ICASSP and ICIP. The task of determining which papers were not presented has been left to the local organizers. With so many travel restrictions these days, including visas, and with questions about what constitutes a personal reason, it is often hard to resolve "no-show" papers. We all expect accepted papers to be presented, and the high quality of papers accepted to ICASSP (and ICIP) should result in all of them being presented. I am happy to say that IEEE SPS, based on feedback from the ICASSP-2010 Technical Committee, has drafted formal guidelines which make it easier to resolve these issues: when a paper is accepted, at least one author must register at the full rate and present the paper; if he or she cannot present the paper for any reason, it is the author's responsibility to find a suitable person, technically knowledgeable on the subject matter, to present the oral or poster. Posting these guidelines when authors register, and making them clear, will help reduce conflicts and ensure that all accepted papers are presented.
Now, with respect to ICASSP 2011, we are only a few months away, and all acceptance/rejection notices were sent out last week. This year, we had 691 papers submitted to the Speech-Language areas: 582 in speech processing (Area 13) and 109 in language processing (Area 14). In total, 2,616 reviews were completed, and 96.8% of the papers had 4 or more reviews (plus a Meta-Review from an SLTC Member). Many thanks to all of the TC members for their work in the reviewing/meta-review process; this has been a major accomplishment. In particular, the enormous efforts of our Area Chairs, Pascale Fung, TJ Hazen, Brian Kingsbury and David Suendermann, cannot be overstated. Over the past three months, they have been hard at work resolving countless issues. What is left to do now is to determine the best student paper nominees and the session chairs. Many thanks to all who participated in this process!
Running the SLTC depends on the work of many volunteers, and I ask for your continued time so that we can build on the progress of the last three years. The number of TC members has been expanded, which should help in addressing the range of duties and tasks we have to accomplish. Also, the vice-Chair will help ensure continuity as we move forward. I will be sending out the Sub-Committee list in the next few weeks. We need to continue to work towards having speech and language research recognized in the SPS, including paper awards, fellow nominations, and service/technical awards. The 2010 SLT Workshop in Berkeley, CA (http://www.slt2010.org/) was very successful. Finally, while speech and language continue to expand within the IEEE Signal Processing Society, one of our interests is to reach out to other parts of the world and include new members. I encourage you to renew existing collaborations, or establish new ones, with places that have not seen much representation at ICASSP. With more than 6,000 languages spoken in the world, we should be able to reduce communication barriers by advancing speech processing and language technology in all languages.
I look forward to seeing all the SLTC members in Prague at this ICASSP!
John H.L. Hansen
January 2011
John H.L. Hansen is Chair, Speech and Language Technical Committee.
SLTC Newsletter, February 2011
The ISCA Student Advisory Committee (ISCA-SAC) was established in 2005 by the International Speech Communication Association (ISCA) to organize and coordinate student-driven projects. After our work in 2008 and 2009 we are looking forward to a busy 2011.
ISCA-SAC was proud to host student-oriented events during Interspeech 2010, held in Makuhari, Japan. The jewel in the crown was perhaps the student panel session, building on the success of the corresponding event at the previous year's conference. The theme of this year's panel session was "2010-2020 - Speech Technology in the Next Decade". The event gave research students an invaluable opportunity to discuss the likely challenges and themes for speech research over the next ten years with prominent figures in the field. The contributing speakers and panelists were Alan Black (Carnegie Mellon University), Nick Campbell (Trinity College Dublin), Ciprian Chelba (Google) and Bowen Zhou (IBM Watson Research Center).
A very successful social reception was also held for students during the conference, allowing members to network in a more relaxed environment.
After an overhaul of the main ISCA website, the new ISCA students website has gone live after many months of development. The website features an interactive blog, a new forum, links to various ISCA resources and the ISCA grant system. Take a look now at http://www.isca-students.org/.
The 2010 Young Researchers' Roundtable on Spoken Dialog Systems (YRRSDS) was held at Waseda University, Tokyo in September and met with great success. The event is designed for students, post-docs, and junior researchers working on research related to spoken dialogue systems. ISCA-SAC has a history of contributing to the event, with past and present members of ISCA-SAC on the organising committee. ISCA-SAC looks forward to the 2011 event in Portland, Oregon.
In 2011, Interspeech will be held in Florence, Italy. ISCA-SAC is very excited to be planning a number of student-oriented events for the conference, and looks forward to seeing you there.
The ultimate goal of ISCA-SAC is to drive student-oriented events and support structures in speech and language research (read our mission at http://www.isca-students.org/?q=mission). To fulfil this we rely on our student volunteers.
Volunteering is a fantastic opportunity to really get involved and noticed in this research community. There are many different ways students can help out, and the nature of the tasks involved makes the work very flexible.
If you're interested in getting involved, contact the committee at volunteer [at] isca-students [dot] org
SLTC Newsletter, February 2011
We are delighted to host the first Show & Tell event at Interspeech 2011. Show & Tell will provide researchers, technologists and practitioners from academia, industry and government the opportunity to demonstrate their latest research systems and interact with the attendees in an informal setting. Demonstrations need to be based on innovations and fundamental research in areas of human speech production, perception, communication, and speech and language technology and systems.
Demonstrations will be peer-reviewed by members of the Interspeech Program Committee, who will judge the originality, significance, quality, and clarity of each submission. At least one author of each accepted submission must register for and attend the conference, and demonstrate the system during the Show & Tell sessions. Each accepted demonstration paper will be allocated 2 pages in the conference proceedings.
At the conference, all accepted demonstrations will be evaluated and considered for the Best Show & Tell Award.
Chair: Mazin Gilbert (AT&T Labs Research, USA)
Important dates:
More information:
SLTC Newsletter, February 2011
This article describes some of the main achievements to date in the EC FP7 project "CLASSiC", which ends in early 2011. The project focuses on statistical methods for dialogue processing, and is one of the largest current European research projects in speech and language technology. It is coordinated by Heriot-Watt University, and is a collaboration between Cambridge University, the University of Geneva, the University of Edinburgh, L'Ecole Supérieure d'électricité (SUPELEC), and France Telecom / Orange Labs.
The overall goal of the CLASSiC project has been to develop statistical machine learning methods for the deployment of accurate and robust spoken dialogue systems (SDS). These systems can learn from experience, either from dialogue data that has already been collected or online through interactions with users. We have deployed systems (for data collection and evaluation) for tourist information, customer support, and appointment scheduling. One system, for appointment scheduling, has been available for public use in France since March 2010.
CLASSiC has proposed and developed a unified treatment of uncertainty across the entire SDS architecture (speech recognition, spoken language understanding, dialogue management, natural language generation, and speech synthesis). This architecture allows multiple possible analyses (e.g. n-best lists of ASR hypotheses, distributions over user goals) to be represented, maintained, and reasoned with robustly and efficiently. It supports a layered hierarchy of supervised learning and reinforcement learning methods, in order to facilitate mathematically principled optimisation and adaptation techniques. However, the CLASSiC architecture still maintains the modularity of a traditional SDS, allowing the separate development of statistical models for speech recognition, spoken language understanding, dialogue management, natural language generation and speech synthesis. For more details, see the citations below, or Deliverable 5.1.2.
A system "belief state" showing a probability distribution
over possible user goals (size of the bar on the left indicates
relative probability of the corresponding meanings on the right).
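To give a flavour of the kind of uncertainty handling described above, here is a minimal Python sketch (not the CLASSiC code itself) of a Bayesian update of a distribution over user goals from an n-best list of noisy hypotheses. The goal names, confidence scores, and the simple confusion model are all assumptions made purely for illustration.

# Minimal sketch of a belief-state update over user goals (illustrative only).
def update_belief(belief, nbest, p_correct=0.8):
    """Re-weight P(goal) given an n-best list of (hypothesis, confidence) pairs.

    belief : dict mapping goal -> prior probability
    nbest  : list of (goal_hypothesis, confidence) pairs from ASR/SLU
    """
    updated = {}
    for goal, prior in belief.items():
        # Crude confusion model: a hypothesis matching the true goal is observed
        # with probability ~ p_correct; other hypotheses share the remaining mass.
        likelihood = 0.0
        for hyp, conf in nbest:
            if hyp == goal:
                likelihood += conf * p_correct
            else:
                likelihood += conf * (1.0 - p_correct) / max(len(belief) - 1, 1)
        updated[goal] = prior * likelihood
    total = sum(updated.values()) or 1.0
    return {g: p / total for g, p in updated.items()}

# Example: a flat prior over three hypothetical goals, updated with a noisy
# n-best list that favours "cheap italian".
belief = {"cheap italian": 1/3, "cheap indian": 1/3, "expensive french": 1/3}
nbest = [("cheap italian", 0.6), ("cheap indian", 0.3), ("expensive french", 0.1)]
print(update_belief(belief, nbest))

A real system maintains such distributions over much richer goal representations and feeds them to a learned dialogue policy; the point here is simply that n-best hypotheses shift, rather than replace, the system's beliefs.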
Progress is being made in several areas:
The CLASSiC systems and components have been evaluated both in simulation and in trials with real users, the latter both in laboratory conditions and "in the wild" (i.e. outside the lab). The final evaluation of the systems is ongoing at the time of writing. (Please see our publications page for the referenced papers.)
We have obtained the following evaluation results to date:
CLASSiC partners have deployed the technology in the Spoken Dialogue Challenge and the CoNLL shared tasks on syntactic-semantic dependency parsing. The CLASSiC technologies were amongst the top performers on these tasks.
The CLASSiC project will release freely available dialogue data to the research community at the end of the project (project Deliverable D6.5). This can be expected towards the middle of 2011. This data will consist of anonymised system audio, logs, transcriptions, and some annotated data from several of the CLASSiC dialogue systems. The released data will amount to several thousand dialogues.
The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 216594 (CLASSiC project).
Thanks to the CLASSiC partners for providing input to this article.
For more information, see:
Oliver Lemon is a Professor in the School of Mathematics and Computer Science at Heriot-Watt University, Edinburgh, where he leads the Interaction Lab. He is the Coordinator of the CLASSiC Project.