End-to-end (E2E) systems have achieved results competitive with conventional hybrid hidden Markov model-deep neural network (HMM-DNN) automatic speech recognition (ASR) systems. Such E2E systems are attractive because they do not require initial alignments between input acoustic features and output graphemes or words. Very deep convolutional networks and recurrent neural networks have also been very successful in ASR systems due to their added expressive power and better generalization. ASR, however, is often not the end goal of real-world speech information processing systems. Instead, an important end goal is information retrieval, in particular keyword search (KWS), which involves retrieving speech documents containing a user-specified query from a large database. Conventional keyword search uses an ASR system as a front end that converts the speech database into a finite-state transducer (FST) index containing a large number of likely word or sub-word sequences for each speech segment, along with associated confidence scores and time stamps. A user-specified text query is then composed with this FST index to find the putative locations of the keyword along with confidence scores. More recently, inspired by E2E approaches, ASR-free keyword search systems have been proposed, with limited success. Machine learning methods have also been very successful in question answering, parsing, language translation, analytics, and deriving representations of morphological units, words, or sentences. Challenges such as the Zero Resource Speech Challenge aim to construct systems that learn an end-to-end spoken dialog (SD) system, in an unknown language, from scratch, using only information available to a language-learning infant (zero linguistic resources).
The principal objective of the recently concluded IARPA Babel program was to develop a keyword search system that delivers high accuracy for any new language given very limited transcribed speech, noisy acoustic and channel conditions, and limited system build time of one to four weeks. This special issue will showcase the power of novel machine learning methods not only for ASR, but for keyword search and for the general processing of speech and language.
Topics of interest in the special issue include (but are not limited to):
- Manuscript submission:
- First review completed:
- Revised manuscript due:
- Second review completed:
- Final manuscript due:
- Publication: December 2017
- Nancy F. Chen, Institute for Infocomm Research (I2R), A*STAR, Singapore
- Mary Harper, Army Research Laboratory, USA
- Brian Kingsbury, IBM Watson, IBM T.J. Watson Research Center, USA
- Kate Knill, Cambridge University, U.K.
- Bhuvana Ramabhadran, IBM Watson, IBM T.J. Watson Research Center, USA