DCT-Based Air Interface Design for Function Computation

With the integration of communication and computing, part of the computation is expected to move to the transmitter side. In this paper, we address the general problem of Frequency Modulation (FM) for function approximation through a communication channel. We exploit the benefits of the Discrete Cosine Transform (DCT) to approximate the function and design the waveform. Compared with other approximation schemes, the DCT uses basis functions with a controlled dynamic range, which is a desirable property for practical implementation.
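As a rough illustration of the approximation idea only (not the paper's waveform design), the sketch below samples a function, keeps the first few DCT coefficients, and reconstructs it. The target function `x * x`, the grid size, the number of retained coefficients, and all function names are assumptions chosen for the example. Each basis function is a cosine bounded in [-1, 1], which is one way to read the "controlled dynamic range" property.

```python
import math

def dct2(samples):
    """Orthonormal DCT-II of a list of real samples."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(samples))
        coeffs.append(scale * total)
    return coeffs

def idct2(coeffs):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out

def approximate(samples, keep):
    """Reconstruct samples from only the first `keep` DCT coefficients."""
    coeffs = dct2(samples)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return idct2(truncated)

# Hypothetical target: f(x) = x^2 sampled at 64 midpoints of [0, 1].
n = 64
grid = [(i + 0.5) / n for i in range(n)]
target = [x * x for x in grid]

approx = approximate(target, keep=8)
max_error = max(abs(a - b) for a, b in zip(approx, target))
```

Keeping all `n` coefficients reproduces the samples up to rounding error; keeping fewer trades accuracy for a shorter description of the function.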

Robust Gridless Estimation of Angles and Delays for Full-Dimensional Wideband mmWave Channels

In this paper, we consider robust channel estimation for a millimeter wave (mmWave) massive MIMO system with uniform planar arrays (UPA). For many gridless angle estimation methods of mmWave channels, the channel gains need to be time-invariant during training. We propose a gridless method that is applicable to both time-invariant and time-varying channels and is robust to channel variations.

Diversity Order Analysis for Quantized Constant Envelope Transmission

Quantized constant envelope (QCE) transmission is a popular and effective technique to reduce the hardware cost and improve the power efficiency of 5G and beyond systems equipped with large antenna arrays. It has been widely observed that the number of quantization levels has a substantial impact on the system performance.

Recallable Question Answering-Based Re-Ranking Considering Semantic Region for Cross-Modal Retrieval

Question answering (QA)-based re-ranking methods for cross-modal retrieval have recently been proposed to further narrow down similar candidate images. Conventional QA-based re-ranking methods provide questions to users by analyzing candidate images, and the initial retrieval results are re-ranked based on the user's feedback. However, focusing only on performance improvement makes it difficult to efficiently elicit the user's retrieval intention.

Unified Adaptive Relevance Distinguishable Attention Network for Image-Text Matching

Image-text matching, as a fundamental cross-modal task, bridges the gap between vision and language. The core is to accurately learn semantic alignment to find relevant shared semantics in image and text. Existing methods typically treat all fragments whose word-region similarity exceeds an empirical threshold of zero as relevant shared semantics, e.g., via a ReLU operation that forces negative values to zero and keeps positive values.
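The zero-threshold behaviour described for existing methods can be sketched as follows. This is a minimal illustration under assumptions: the toy two-dimensional word and region vectors, the use of cosine similarity, and the function names are all hypothetical and not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def relu_similarity(words, regions):
    """Word-region similarity matrix with negatives clamped to zero (ReLU)."""
    return [[max(0.0, cosine(w, r)) for r in regions] for w in words]

# Hypothetical toy embeddings: 2 words and 3 image regions.
words = [[1.0, 0.0], [0.0, 1.0]]
regions = [[1.0, 0.0], [-1.0, 0.0], [1.0, 1.0]]

sims = relu_similarity(words, regions)
# Any region with similarity > 0 is treated as relevant to the word,
# however small that similarity is.
relevant = [[s > 0.0 for s in row] for row in sims]
```

In this toy case the region pointing opposite to the first word is clamped to zero and dropped, while every region with any positive similarity is kept.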

Block Division Convolutional Network With Implicit Deep Features Augmentation for Micro-Expression Recognition

Despite the development of computer vision techniques, micro-expression (ME) recognition remains a great challenge because MEs have very low intensity and short duration. However, ME recognition is of great significance since it provides important clues for detecting real affective states. This paper proposes a novel Block Division Convolutional Network (BDCNN) with implicit deep feature augmentation.

Deep Margin-Sensitive Representation Learning for Cross-Domain Facial Expression Recognition

Cross-domain Facial Expression Recognition (FER) aims to safely transfer the learned knowledge from labeled source data to unlabeled target data, which is challenging due to the subtle difference between various expressions and the large discrepancy between domains. Existing methods mainly focus on reducing the domain shift for transferable features but fail to learn discriminative representations for recognizing facial expression, which may result in negative transfer under cross-domain settings.

3D Holoscopic Image Compression Based on Gaussian Mixture Model

We introduce a Gaussian Mixture Model (GMM) framework for 3D holoscopic image compression in this paper. The elemental images of the 3D holoscopic image are predicted using the GMM, and the GMM parameters are estimated using the common Expectation-Maximization (EM) algorithm. GMM Model Optimization (GMO) is used in this framework to select the optimal number of distributions and, at the same time, avoid local optima of EM.

Fast Human Pose Estimation in Compressed Videos

Current approaches for human pose estimation in videos can be categorized into per-frame and warping-based methods. Both approaches have their pros and cons. For example, per-frame methods are generally more accurate, but they are often slow. Warping-based approaches are more efficient, but their accuracy is usually lower. To bridge the gap, in this paper, we propose a novel fast framework for human pose estimation that achieves real-time inference with controllable accuracy degradation in the compressed video domain.