Robustness and stability of image-reconstruction algorithms have recently come under scrutiny. Their importance to medical imaging cannot be overstated. We review the known results for the topical variational regularization strategies (
This letter proposes a generalised extended nested array with multiple subarrays (GENAMS) via the maximum inter-element spacing (IES) constraint principle. Based on the IES set patterns of the two-sides extended nested array and the flexible extended nested array with multiple subarrays type-2, a generalised IES set pattern is derived.
3D face reconstruction from a single image still suffers from low accuracy and inability to recover textures in invisible regions. In this paper, we propose a method for generating a 3D portrait with complete texture. The coarse face-and-head model and texture parameters are obtained using 3D Morphable Model fitting. We design an image-geometric inverse renderer that acquires normal, albedo, and light to jointly reconstruct the facial details.
Iterative hard thresholding (IHT) and hard thresholding pursuit (HTP) are two classical hard thresholding-based algorithms widely used in compressed sensing. The restricted isometry constant (RIC) of the sensing matrix, which ensures the convergence of iterative algorithms, plays a key role in guaranteeing successful recovery. In the analysis of sufficient conditions to ensure recovery performance, the RIC
The correlation filter (CF)-based tracker is a classic and effective model in the field of visual tracking. For a long time, most CF-based trackers solved filters using only ridge regression equations with
Image registration is a basic task in computer vision, with wide potential applications in image stitching, stereo vision, motion estimation, etc. Most current methods achieve image registration by estimating a global homography matrix between candidate images with point-feature-based matching or direct prediction. However, as real-world 3D scenes have point-variant photographic distances (depth), a unified homography matrix is not sufficient to depict the specific pixel-wise relations between two images.
To our knowledge, adversarial attack approaches to speaker identification either incur high computational cost or are not very effective. To address this issue, in this letter, we propose a novel generation-network-based approach, called symmetric saliency-based encoder-decoder (SSED), to generate adversarial voice examples for speaker identification.
The prominent success of neural networks, mainly in computer vision tasks, is increasingly shadowed by their sensitivity to small, barely perceivable adversarial perturbations in image input. In this article, we aim at explaining this vulnerability through the framework of sparsity. We show the connection between adversarial attacks and sparse representations, with a focus on explaining the universality and transferability of adversarial examples in neural networks.
An online topology estimation algorithm for nonlinear structural equation models (SEM) is proposed in this paper, addressing the nonlinearity and the non-stationarity of real-world systems. The nonlinearity is modeled using kernel formulations, and the curse of dimensionality associated with the kernels is mitigated using random feature approximation.
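As a rough illustration of the random feature approximation mentioned above, the sketch below approximates a Gaussian kernel with random Fourier features in plain Python. It is not the paper's algorithm: the Gaussian kernel choice, the helper names `make_rff` and `gaussian_kernel`, and all parameter values are assumptions made for illustration only.

```python
import math
import random


def make_rff(dim, num_features, sigma, seed=0):
    """Return a feature map phi such that phi(x) . phi(y) approximates
    the Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)).

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rng = random.Random(seed)
    # Frequencies drawn from N(0, 1/sigma^2), phases from U[0, 2*pi).
    freqs = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def phi(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return phi


def gaussian_kernel(x, y, sigma):
    """Exact Gaussian kernel, used here only as a reference value."""
    sq_dist = sum((a - c) ** 2 for a, c in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


phi = make_rff(dim=2, num_features=5000, sigma=1.0)
x, y = [0.3, -0.2], [0.1, 0.4]
approx = sum(a * c for a, c in zip(phi(x), phi(y)))
exact = gaussian_kernel(x, y, sigma=1.0)
```

The point of the design is that the feature dimension is fixed in advance, so a model built on `phi` does not grow with the number of samples, which is what makes such approximations attractive for online settings.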
The Discrete Wavelet Transform (DWT) has gained attention in the area of Multi-Carrier Modulation (MCM) because it can overcome some well-known limitations of Discrete Fourier Transform (DFT)-based MCM systems. Its improved spectral containment removes the need for a cyclic prefix, although appropriate equalization then has to be added because the cyclic convolution property no longer holds. Most DWT-based MCM systems in the literature use Time-domain EQualizers (TEQs) to mitigate the channel distortion.
With the integration of communication and computing, it is expected that part of the computing is transferred to the transmitter side. In this paper, we address the general problem of Frequency Modulation (FM) for function approximation through a communication channel. We exploit the benefits of the Discrete Cosine Transform (DCT) to approximate the function and design the waveform. Compared with other approximation schemes, the DCT uses a basis of controlled dynamics, which is a desirable property for practical implementation.
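To make the idea of approximating a function with the DCT concrete, the following minimal sketch samples a function, keeps only its first few orthonormal DCT-II coefficients, and reconstructs it. It does not cover the FM waveform design described in the abstract; the test function, the sample count, and the helper names `dct2`, `idct2`, and `approximate` are assumptions chosen for illustration.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out


def idct2(coeffs, n):
    """Inverse of dct2 for n samples; coefficients beyond len(coeffs) count as zero."""
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def approximate(samples, keep):
    """Reconstruct the samples from only the first `keep` DCT coefficients."""
    return idct2(dct2(samples)[:keep], len(samples))


# Assumed example: f(t) = t^2 sampled at 64 midpoints of [0, 1].
n = 64
samples = [((i + 0.5) / n) ** 2 for i in range(n)]
approx = approximate(samples, keep=16)
max_err = max(abs(a - s) for a, s in zip(approx, samples))
```

Keeping all 64 coefficients reproduces the samples up to rounding, while a smaller `keep` trades accuracy for a more compact description of the function.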
In this paper, we consider robust channel estimation for a millimeter wave (mmWave) massive MIMO system with uniform planar arrays (UPA). For many gridless angle estimation methods of mmWave channels, the channel gains need to be time-invariant during training. We propose a gridless method that is applicable to time-invariant and time-varying channels, and the proposed method is robust to channel variations.
Quantized constant envelope (QCE) transmission is a popular and effective technique to reduce the hardware cost and improve the power efficiency of 5G and beyond systems equipped with large antenna arrays. It has been widely observed that the number of quantization levels has a substantial impact on the system performance.
Question answering (QA)-based re-ranking methods for cross-modal retrieval have recently been proposed to further narrow down similar candidate images. The conventional QA-based re-ranking methods provide questions to users by analyzing candidate images, and the initial retrieval results are re-ranked based on the user's feedback. However, focusing only on performance improvement makes it difficult to efficiently elicit the user's retrieval intention.
Image-text matching, as a fundamental cross-modal task, bridges the gap between vision and language. The core is to accurately learn semantic alignment to find relevant shared semantics in image and text. Existing methods typically attend to all fragments with word-region similarity greater than an empirical threshold of zero as relevant shared semantics, e.g., via a ReLU operation that forces negative values to zero and keeps positive ones.
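The zero-threshold step described above can be sketched as follows: cosine similarities between word and region vectors are passed through a ReLU, so negative similarities become zero and positive ones are kept. This is only an illustration of that conventional step, not any specific published method; the helper names `cosine` and `relu_similarity` and the toy vectors are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relu_similarity(words, regions):
    """Word-region similarity matrix with a ReLU applied:
    entries below the zero threshold are forced to zero."""
    return [[max(0.0, cosine(w, r)) for r in regions] for w in words]


# Toy embeddings (assumed values, two words and two image regions).
words = [[1.0, 0.0], [0.0, 1.0]]
regions = [[1.0, 0.0], [-1.0, 0.0]]
sim = relu_similarity(words, regions)
```

Every entry that survives the ReLU is treated as relevant under this scheme, regardless of how weak the positive similarity is.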
Recent advances in unsupervised domain adaptation (UDA) techniques have witnessed great success in cross-domain computer vision tasks, enhancing the generalization ability of data-driven deep learning architectures by bridging the domain distribution gaps.
Despite the development of computer vision techniques, the micro-expression (ME) recognition task remains a great challenge because MEs have very low intensity and short duration. However, ME recognition is of great significance since it provides important clues for detecting real affective states. This paper proposes a novel Block Division Convolutional Network (BDCNN) with implicit deep feature augmentation.
Cross-domain Facial Expression Recognition (FER) aims to safely transfer the learned knowledge from labeled source data to unlabeled target data, which is challenging due to the subtle difference between various expressions and the large discrepancy between domains. Existing methods mainly focus on reducing the domain shift for transferable features but fail to learn discriminative representations for recognizing facial expression, which may result in negative transfer under cross-domain settings.
We introduce a Gaussian Mixture Model (GMM) framework for 3D holoscopic image compression in this paper. The elemental images of the 3D holoscopic image are predicted using a GMM, and the parameters of the GMM are estimated using the common Expectation-Maximization (EM) algorithm. GMM Model Optimization (GMO) is used in this framework to select the optimal number of distributions and, at the same time, avoid local optima of EM.
Current approaches for human pose estimation in videos can be categorized into per-frame and warping-based methods. Both approaches have their pros and cons. For example, per-frame methods are generally more accurate, but they are often slow. Warping-based approaches are more efficient, but their accuracy is usually lower. To bridge the gap, in this paper, we propose a novel fast framework for human pose estimation that achieves real-time inference with controllable accuracy degradation in the compressed video domain.