United Human Pose: Integrating Domain Knowledge and Machine Learning

PhD Thesis: United Human Pose: Integrating Domain Knowledge and Machine Learning
By: Shuangjun Liu, Augmented Cognition Lab (ACLab), ECE Department, Northeastern University
Advisor: Sarah Ostadabbas
Thanks to sustained effort from the computer vision community, 2D human pose estimation has achieved considerable success in recent years, from single-person pose estimation models such as the convolutional pose machine and the stacked hourglass model to multi-person pose estimation networks such as OpenPose. These successes owe much to the availability of large-scale, well-annotated 2D human pose datasets, whose creation was possible because (1) 2D human poses during common daily activities are fairly easy to capture and (2) annotating 2D poses via crowd-sourcing is a straightforward and affordable task. However, this convenience no longer holds for poses during specific activities such as sleeping, or for more advanced pose regression tasks such as 3D human pose estimation. From the standpoint of general-purpose human pose estimation, the existing large-scale human pose datasets are all collected in an "easy-to-access" fashion and do not contain data from applications where data collection or labeling is expensive (i.e., "Small Data" domains). This situation is very common in medical applications, where data collection and/or labeling is expensive, individualized, and protected by strict privacy or classification laws.
To address the "Small Data" problem, especially when inference models with deep structures are used, there are two basic approaches to reducing data needs during model training: (1) incorporating domain knowledge into the learning pipeline through data-driven or simulation-based generative models, and (2) decreasing the learning complexity of the inference model via data-efficient machine learning. This thesis centers on addressing small data problems in the context of human pose estimation by leveraging existing research and filling key gaps with original work.
A specific human pose estimation problem in this domain, in-bed human pose estimation, is studied extensively, with solutions in increasing order of feasibility that make use of (1) a conventional non-deep inference model, (2) fine-tuning an already trained deep model, and (3) building and training a pose model from scratch. To address the small data challenge more generally, data augmentation approaches in both 2D and 3D have also been studied, including (1) a pose-guided approach that augments existing 2D human figure images while preserving their background, and (2) a semi-supervised data augmentation approach based on a 3D graphics engine, whose effectiveness is tested against real human data.