
Semi-Supervised Seq2seq Joint-Stochastic-Approximation Autoencoders With Applications to Semantic Parsing

Developing Semi-Supervised Seq2Seq (S4) learning for sequence transduction tasks in natural language processing (NLP), e.g., semantic parsing, is challenging, since both the input and the output sequences are discrete. This discrete nature poses difficulties for methods that require gradients from either the input space or the output space.
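As a toy illustration of the discreteness problem described above (not the method proposed in the paper, which instead builds on joint stochastic approximation), the snippet below shows that once a hard token decision is made, the gradient path ends.

```python
import torch

# Toy illustration: discrete token choices block back-propagated gradients.
logits = torch.randn(1, 5, requires_grad=True)  # scores over a 5-token toy vocabulary
probs = torch.softmax(logits, dim=-1)           # soft distribution: still differentiable
token = probs.argmax(dim=-1)                    # hard, discrete token choice

print(probs.requires_grad)  # True  -- gradients can flow through the soft distribution
print(token.requires_grad)  # False -- no gradient flows through the discrete decision
```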

Learning With Learned Loss Function: Speech Enhancement With Quality-Net to Improve Perceptual Evaluation of Speech Quality

Utilizing a human-perception-related objective function to train a speech enhancement model has recently become a popular topic. The main reason is that the conventional mean squared error (MSE) loss cannot represent auditory perception well. One typical human-perception-related metric, the perceptual evaluation of speech quality (PESQ), has been shown to correlate highly with quality scores rated by human listeners.
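The sketch below illustrates the general "learned loss" idea under stated assumptions: the module names (QualityPredictor, Enhancer) and their architectures are hypothetical stand-ins, not the authors' Quality-Net or enhancement model. A frozen network that approximates a perceptual score serves as a differentiable training objective in place of MSE.

```python
import torch
import torch.nn as nn

class QualityPredictor(nn.Module):          # stand-in for a pre-trained quality model
    def __init__(self, dim=257):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))
    def forward(self, spec):                # spec: (batch, frames, dim)
        return self.net(spec).mean(dim=1)   # one quality score per utterance

class Enhancer(nn.Module):                  # toy mask-based enhancement model
    def __init__(self, dim=257):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(),
                                 nn.Linear(128, dim), nn.Sigmoid())
    def forward(self, noisy):
        return noisy * self.net(noisy)      # predicted mask applied to the noisy input

quality_net, enhancer = QualityPredictor(), Enhancer()
for p in quality_net.parameters():          # the quality model is frozen
    p.requires_grad_(False)

opt = torch.optim.Adam(enhancer.parameters(), lr=1e-3)
noisy = torch.rand(8, 100, 257)             # fake batch of noisy magnitude spectrograms
enhanced = enhancer(noisy)
loss = -quality_net(enhanced).mean()        # maximize predicted quality instead of minimizing MSE
opt.zero_grad(); loss.backward(); opt.step()
```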

An Assessment of Paralinguistic Acoustic Features for Detection of Alzheimer's Dementia in Spontaneous Speech

Speech analysis could provide an indicator of Alzheimer's disease and help develop clinical tools for automatically detecting and monitoring disease progression. While previous studies have employed acoustic (speech) features for characterisation of Alzheimer's dementia, they have focused on a few common prosodic features, often in combination with lexical and syntactic features, which require transcription.

Pragmatic Aspects of Discourse Production for the Automatic Identification of Alzheimer's Disease

Clinical literature provides convincing evidence that language deficits in Alzheimer's disease (AD) allow for distinguishing patients with dementia from healthy subjects. To date, computational approaches have widely investigated lexicosemantic aspects of discourse production, while pragmatic aspects, such as cohesion and coherence, remain mostly unexplored.

Diagnosis of Obstructive Sleep Apnea Using Speech Signals From Awake Subjects

Obstructive sleep apnea (OSA) is a sleep disorder in which pharyngeal collapse during sleep causes complete (apnea) or partial (hypopnea) airway obstruction. OSA is common and can have severe implications, but often remains undiagnosed. The most widely used objective measure of OSA severity is the number of obstructive events per hour of sleep, known as the apnea-hypopnea index (AHI).
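As a small worked example of the AHI definition just quoted, the function below divides the number of obstructive events by hours of sleep; the severity bands in the second function are commonly cited clinical cut-offs assumed for illustration, not taken from the abstract.

```python
def apnea_hypopnea_index(n_apneas: int, n_hypopneas: int, sleep_hours: float) -> float:
    """Obstructive events (apneas + hypopneas) per hour of sleep."""
    return (n_apneas + n_hypopneas) / sleep_hours

def ahi_severity(ahi: float) -> str:
    # Commonly used clinical bands (assumed here for illustration):
    # < 5 normal, 5-15 mild, 15-30 moderate, >= 30 severe.
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"

ahi = apnea_hypopnea_index(n_apneas=42, n_hypopneas=30, sleep_hours=6.0)
print(ahi, ahi_severity(ahi))  # 12.0 mild
```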

Modeling Obstructive Sleep Apnea Voices Using Deep Neural Network Embeddings and Domain-Adversarial Training

Obstructive Sleep Apnea (OSA) is a sleep breathing disorder affecting at least 3–7% of adult men and 2–5% of adult women between 30 and 70 years of age. It causes recurrent episodes of partial or total obstruction at the level of the pharynx, leading to cessation of breathing during sleep.

Introduction to the Issue on Automatic Assessment of Health Disorders Based on Voice, Speech, and Language Processing

Approximately one-fifth of the world's population suffers or has suffered from voice and speech production disorders due to disease or other dysfunctions. Thus, there is a clear need for objective ways to evaluate the quality of voice and speech as well as its link to vocal fold activity, to evaluate the complex interaction between the larynx and voluntary movements of the articulators (i.e., lips, teeth, tongue, velum, jaw, etc.), or to evaluate disfluencies at the language level.

Gustavo Rohde

University of Virginia, Charlottesville, VA, USA

Timothy Davidson

McMaster University, Ontario, Canada