Global 3D Non-Rigid Registration of Deformable Objects Using a Single RGB-D Camera

We present a novel global non-rigid registration method for dynamic 3D objects. Our method allows objects to undergo large non-rigid deformations and achieves high-quality results even with substantial pose change or camera motion between views. In addition, our method does not require a template prior and uses less raw data than tracking-based methods since only a sparse set of scans is needed.

Very Low Bitrate Semantic Compression of Airplane Cockpit Screen Content

This paper addresses the problem of encoding the video generated by the screen of an airplane cockpit. Like other computer screens, cockpit screens consist of computer-generated graphics, often atop a natural background. Existing screen content coding schemes notably fail to preserve the readability of textual information at the low bitrates required in avionic applications.

Robust Alignment for Panoramic Stitching Via an Exact Rank Constraint

We study the problem of image alignment for panoramic stitching. Unlike most existing approaches, which are feature-based, our algorithm works on pixels directly and accounts for errors globally across the entire images. Technically, we formulate the alignment problem as a rank-1 and sparse matrix decomposition over transformed images, and develop an efficient algorithm for solving this challenging non-convex optimization problem.
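One common way to write a decomposition of this kind is sketched below. The symbols are assumptions introduced for illustration and do not come from the abstract: $D \circ \tau$ denotes a matrix whose columns are the vectorized images after warping by the transformations $\tau$, $L$ is the rank-1 component shared by the aligned images, and $S$ collects sparse errors. The $\ell_1$ penalty is a typical surrogate for sparsity; the paper's actual objective may differ.

```latex
\min_{\tau,\,L,\,S}\ \|S\|_{1}
\quad\text{subject to}\quad
D \circ \tau = L + S,
\qquad \operatorname{rank}(L) = 1
```

The problem is non-convex both because of the exact rank constraint and because the warped images depend nonlinearly on $\tau$.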

Foreground Fisher Vector: Encoding Class-Relevant Foreground to Improve Image Classification

Image classification is an essential and challenging task in computer vision. Despite its prevalence, the combination of the deep convolutional neural network (DCNN) and the Fisher vector (FV) encoding method has limited performance since the class-irrelevant background used in the traditional FV encoding may result in less discriminative image features.

A Hardware Realization of Superresolution Combining Random Coding and Blurring

Resolution enhancements are often desired in imaging applications where high-resolution sensor arrays are difficult to obtain. Many computational imaging methods have been proposed to encode high-resolution scene information on low-resolution sensors by cleverly modulating light from the scene before it hits the sensor. 

Mixed Integer Programming For Sparse Coding: Application to Image Denoising

Dictionary learning for sparse representations is generally conducted in two alternating steps: sparse coding and dictionary updating. In this paper, a new approach to solving the sparse coding step is proposed. Because this step involves an ℓ0-norm, most, if not all, existing solutions provide only a local or approximate solution. Instead, a true ℓ0 optimization is considered for the sparse coding problem, providing a global solution.
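To make the idea of a globally optimal sparse code concrete, here is a minimal sketch that solves a tiny instance exactly by enumerating every support of at most K atoms. This is plain exhaustive search, not the mixed integer programming approach of the paper, and it is only practical for very small dictionaries. The function names `solve_least_squares` and `l0_sparse_code`, and the representation of the dictionary as a list of columns, are assumptions made for this illustration.

```python
from itertools import combinations


def solve_least_squares(columns, y):
    """Least-squares coefficients for y on the given columns (normal equations)."""
    k = len(columns)
    # Build the augmented system [A^T A | A^T y].
    M = [
        [sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(k)]
        + [sum(a * t for a, t in zip(columns[i], y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            return None  # linearly dependent atoms; skip this support
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]


def l0_sparse_code(dict_cols, y, K):
    """Exact minimizer of ||y - D x||^2 subject to ||x||_0 <= K, by enumeration."""
    n = len(dict_cols)
    best_x = [0.0] * n
    best_err = sum(t * t for t in y)  # error of the all-zero code
    for size in range(1, K + 1):
        for support in combinations(range(n), size):
            cols = [dict_cols[j] for j in support]
            coef = solve_least_squares(cols, y)
            if coef is None:
                continue
            residual = [
                t - sum(c * col[i] for c, col in zip(coef, cols))
                for i, t in enumerate(y)
            ]
            err = sum(r * r for r in residual)
            if err < best_err - 1e-12:
                best_err = err
                best_x = [0.0] * n
                for j, c in zip(support, coef):
                    best_x[j] = c
    return best_x, best_err


# Hypothetical 3-dimensional dictionary with four atoms, stored column by column.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
code, err = l0_sparse_code(atoms, [2.0, 2.0, 0.0], K=1)
```

The number of candidate supports grows combinatorially with the dictionary size, which is why exact ℓ0 solutions are usually avoided in favor of greedy or relaxed methods.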

Physics-Based Learned Design: Optimized Coded-Illumination for Quantitative Phase Imaging

Coded illumination can enable quantitative phase microscopy of transparent samples with minimal hardware requirements. Intensity images are captured with different source patterns, then a nonlinear phase retrieval optimization reconstructs the image. The nonlinear nature of the processing makes optimizing the illumination pattern designs complicated. 

Discriminative Neural Embedding Learning for Short-Duration Text-Independent Speaker Verification

Short-duration text-independent speaker verification has remained an active research topic in recent years, and deep neural network based embeddings have shown impressive results in such conditions. Good speaker embeddings must have both small intra-class variation and large inter-class difference, which is critical for discrimination and generalization.

Effective Subword Segmentation for Text Comprehension

Representation learning is the foundation of machine reading comprehension and inference. In state-of-the-art models, character-level representations have been broadly adopted to alleviate the problem of effectively representing rare or complex words. However, a character by itself is not a natural minimal linguistic unit for representation or for composing word embeddings, because it ignores the linguistic coherence of consecutive characters inside a word.