Rethinking Bayesian Learning for Data Analysis: The art of prior and inference in sparsity-aware modeling

You are here

Top Reasons to Join SPS Today!

1. IEEE Signal Processing Magazine
2. Signal Processing Digital Library*
3. Inside Signal Processing Newsletter
4. SPS Resource Center
5. Career advancement & recognition
6. Discounts on conferences and publications
7. Professional networking
8. Communities for students, young professionals, and women
9. Volunteer opportunities
10. Coming soon! PDH/CEU credits
Click here to learn more.

Rethinking Bayesian Learning for Data Analysis: The art of prior and inference in sparsity-aware modeling

By: 
Lei Cheng; Feng Yin; Sergios Theodoridis; Sotirios Chatzis; Tsung-Hui Chang

Sparse modeling for signal processing and machine learning, in general, has been at the focus of scientific research for over two decades. Among others, supervised sparsity-aware learning (SAL) consists of two major paths paved by 1) discriminative methods that establish direct input–output mapping based on a regularized cost function optimization and 2) generative methods that learn the underlying distributions. The latter, more widely known as Bayesian methods , enable uncertainty evaluation with respect to the performed predictions. Furthermore, they can better exploit related prior information and also, in principle, can naturally introduce robustness into the model, due to their unique capacity to marginalize out uncertainties related to the parameter estimates. Moreover, hyperparameters (tuning parameters) associated with the adopted priors, which correspond to cost function regularizers, can be learned via the training data and not via costly cross-validation techniques, which is, in general, the case with the discriminative methods.

Sparse modeling for signal processing and machine learning, in general, has been at the focus of scientific research for over two decades. Among others, supervised sparsity-aware learning (SAL) consists of two major paths paved by 1) discriminative methods that establish direct input–output mapping based on a regularized cost function optimization and 2) generative methods that learn the underlying distributions. The latter, more widely known as Bayesian methods, enable uncertainty evaluation with respect to the performed predictions. Furthermore, they can better exploit related prior information and also, in principle, can naturally introduce robustness into the model, due to their unique capacity to marginalize out uncertainties related to the parameter estimates. Moreover, hyperparameters (tuning parameters) associated with the adopted priors, which correspond to cost function regularizers, can be learned via the training data and not via costly cross-validation techniques, which is, in general, the case with the discriminative methods.

To implement SAL, the crucial point lies in the choice of the function regularizer for discriminative methods and the choice of the prior distribution for Bayesian learning. Over the past decade or so, due to the intense research on deep learning, emphasis has been put on discriminative techniques. However, a comeback of Bayesian methods is taking place that sheds new light on the design of deep neural networks (DNNs), which also establish firm links with Bayesian models, such as Gaussian processes (GPs), and also inspire new paths for unsupervised learning, such as Bayesian tensor decomposition. The goal of this article is two-fold. First, it aims to review, in a unified way, some recent advances in incorporating sparsity-promoting priors into three highly popular data modeling/analysis tools, namely, DNNs, GPs, and tensor decomposition. Second, it reviews their associated inference techniques from different aspects, including evidence maximization via optimization and variational inference (VI) methods. Challenges, such as the small data dilemma, automatic model structure search, and natural prediction uncertainty evaluation, are also discussed. Typical signal processing and machine learning tasks are considered, such as time series prediction, adversarial learning, social group clustering, and image completion. Simulation results corroborate the effectiveness of the Bayesian path in addressing the aforementioned challenges and its outstanding capability of matching data patterns automatically.

SPS Social Media

IEEE SPS Educational Resources

IEEE SPS Resource Center

IEEE SPS YouTube Channel