JSTSP Articles

Towards Better Domain Adaptation for Self-Supervised Models: A Case Study of Child ASR

Recently, self-supervised learning (SSL) from unlabelled speech data has gained increased attention in the automatic speech recognition (ASR) community. Typical SSL methods include autoregressive predictive coding (APC), wav2vec 2.0, and Hidden-Unit BERT (HuBERT). However, SSL models are biased toward their pretraining data. When an SSL model is finetuned with data from another domain, a domain shift occurs that can limit knowledge transfer to downstream tasks.

Improving Automatic Speech Recognition Performance for Low-Resource Languages With Self-Supervised Models

Speech self-supervised learning has attracted much attention due to its promising performance in multiple downstream tasks, and has become a new growth engine for speech recognition in low-resource languages. In this paper, we exploit and analyze a series of wav2vec pre-trained models for speech recognition in 15 low-resource languages in the OpenASR21 Challenge.

Self-Supervised Language Learning From Raw Audio: Lessons From the Zero Resource Speech Challenge

Self-Supervised Speech Representation Learning: A Review

Although supervised deep learning has revolutionized speech and audio processing, it has necessitated the building of specialist models for individual tasks and application scenarios. It is likewise difficult to apply this to dialects and languages for which only limited labeled data is available. Self-supervised representation learning methods promise a single universal model that would benefit a wide variety of tasks and domains. 

Editorial of Special Issue on Self-Supervised Learning for Speech and Audio Processing

The papers in this special section focus on self-supervised learning for speech and audio processing. A current trend in the machine learning community is the adoption of self-supervised approaches to pretrain deep networks. Self-supervised learning utilizes proxy supervised learning tasks (or pretext tasks), for example distinguishing parts of the input signal from distractors or reconstructing masked input segments conditioned on unmasked segments, to obtain training data from unlabeled corpora.
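
The masked-reconstruction pretext task described in the editorial can be sketched in a few lines. This is an illustrative toy, not code from any of the listed papers: the function name, the 1-D frame representation, and the span length are all hypothetical. The point it shows is that a contiguous span of an unlabeled sequence is hidden and the hidden span itself becomes the prediction target, so no human labels are needed.

```python
import random

def make_masked_prediction_example(frames, span_len=3, mask_value=0.0, seed=0):
    """Build one self-supervised training pair from an unlabeled sequence.

    A contiguous span of frames is replaced by `mask_value`; the pretext
    task is to reconstruct the original span conditioned on the unmasked
    context. The target comes from the signal itself, not from labels.
    """
    rng = random.Random(seed)
    start = rng.randrange(len(frames) - span_len + 1)
    masked = list(frames)
    target = frames[start:start + span_len]   # what the model must predict
    for i in range(start, start + span_len):
        masked[i] = mask_value                # hide the span from the model
    return masked, target, start

# Toy "utterance": 10 frames of a 1-D feature.
frames = [0.1 * i for i in range(10)]
masked, target, start = make_masked_prediction_example(frames)
```

A real system (e.g. HuBERT-style training) would mask learned feature vectors and predict discrete targets, but the label-free structure of the task is the same.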

Rate-Splitting Multiple Access for Multi-Antenna Joint Radar and Communications

Dual-Functional Radar-Communication (DFRC) is a promising paradigm for achieving Integrated Sensing and Communication (ISAC) in beyond-5G systems. In parallel, Rate-Splitting Multiple Access (RSMA), relying on multi-antenna Rate-Splitting (RS), which splits messages into common and private streams at the transmitter, and on Successive Interference Cancellation (SIC) at the receivers, has emerged as a new strategy for multi-user multi-antenna communication systems.
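
The common/private split and the SIC decoding order can be illustrated with a toy two-user rate calculation. This is a sketch under simplifying assumptions (scalar single-antenna channels and Shannon rates, not the paper's multi-antenna DFRC precoder design); the function name and all power values are hypothetical:

```python
import math

def rsma_user1_rates(h1_gain, p_common, p_priv1, p_priv2, noise):
    """Toy achievable rates at user 1 under two-user rate splitting.

    User 1 first decodes the common stream, treating both private streams
    as interference, then removes it via successive interference
    cancellation (SIC) and decodes its own private stream, with user 2's
    private stream left as residual interference.
    """
    # Common stream decoded first: both private streams act as interference.
    sinr_common = (p_common * h1_gain) / ((p_priv1 + p_priv2) * h1_gain + noise)
    # After SIC removes the common stream, only user 2's stream interferes.
    sinr_private = (p_priv1 * h1_gain) / (p_priv2 * h1_gain + noise)
    return math.log2(1 + sinr_common), math.log2(1 + sinr_private)

r_c, r_p = rsma_user1_rates(h1_gain=1.0, p_common=0.5,
                            p_priv1=0.25, p_priv2=0.25, noise=0.1)
```

Splitting the power between the common and private streams is what lets RSMA interpolate between fully decoding interference (as in NOMA) and fully treating it as noise (as in SDMA).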

Cognitive Opportunistic Navigation in Private Networks With 5G Signals and Beyond

A receiver architecture is proposed to cognitively extract navigation observables from fifth generation (5G) new radio (NR) signals of opportunity. Unlike conventional opportunistic receivers which require knowledge of the signal structure, particularly the reference signals (RSs), the proposed cognitive opportunistic navigation (CON) receiver requires knowledge of only the frame duration and carrier frequency of the signal. In 5G NR, some of these RSs are only transmitted on demand, which limits the existing opportunistic...

Secret Key Generation Using Short Blocklength Polar Coding Over Wireless Channels

This paper investigates the problem of secret key generation from correlated Gaussian random variables in the short blocklength regime. Short blocklengths are commonly employed in massively connected IoT sensor networks in 5G and beyond wireless systems. Polar codes have previously been shown to be applicable to the secret key generation problem, and are known to perform well for short blocklengths in the channel coding context. Inspired by these findings, we propose an explicit protocol based on polar codes for generating secret keys in the short blocklength regime.
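
The correlated-Gaussian setup behind such protocols can be sketched with a one-bit sign quantizer. This is purely illustrative: it shows only the raw quantization step and the key disagreements that a reconciliation code (such as the polar codes the paper proposes) would then correct; the function names, noise level, and blocklength are hypothetical choices, not the paper's parameters:

```python
import random

def correlated_observations(n, noise_std, seed=0):
    """Alice and Bob observe a common Gaussian source through independent noise."""
    rng = random.Random(seed)
    alice, bob = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                 # shared randomness (e.g., channel gain)
        alice.append(x)
        bob.append(x + rng.gauss(0.0, noise_std))  # Bob's noisy view of the same source
    return alice, bob

def quantize_to_bits(samples):
    """One-bit sign quantizer: positive sample -> 1, else 0."""
    return [1 if s > 0 else 0 for s in samples]

n = 128                                          # short blocklength regime
alice_obs, bob_obs = correlated_observations(n, noise_std=0.3)
key_a = quantize_to_bits(alice_obs)
key_b = quantize_to_bits(bob_obs)
mismatches = sum(a != b for a, b in zip(key_a, key_b))
```

Because the observations are correlated but not identical, the raw bit strings disagree in a small fraction of positions; the reconciliation step exchanges public helper data to resolve those disagreements without leaking the key.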
