TIP Featured Articles

You are here

Top Reasons to Join SPS Today!

1. IEEE Signal Processing Magazine
2. Signal Processing Digital Library*
3. Inside Signal Processing Newsletter
4. SPS Resource Center
5. Career advancement & recognition
6. Discounts on conferences and publications
7. Professional networking
8. Communities for students, young professionals, and women
9. Volunteer opportunities
10. Coming soon! PDH/CEU credits
Click here to learn more.

TIP Featured Articles

Collective activity recognition, which tells what activity a group of people is performing, is a cutting-edge research topic in computer vision. Different from action performed by individuals, collective activity needs to consider the complex interactions among different people.

Video-based human action recognition is currently one of the most active research areas in computer vision. Various research studies indicate that the performance of action recognition is highly dependent on the type of features being extracted and how the actions are represented. Since the release of the Kinect camera, a large number of Kinect-based human action recognition techniques have been proposed in the literature.

The prevailing characteristics of micro-videos result in the less descriptive power of each modality. The micro-video representations, several pioneer efforts proposed, are limited in implicitly exploring the consistency between different modality information but ignore the complementarity. In this paper, we focus on how to explicitly separate the consistent features and the complementary features from the mixed information and harness their combination to improve the expressiveness of each modality.

To promote the applications of semantic segmentation, quality evaluation is important to assess different algorithms and guide their development and optimization. In this paper, we establish a subjective semantic segmentation quality assessment database based on the stimulus-comparison method. Given that the database reflects the relative quality of semantic segmentation result pairs...

In this paper, we present a novel Bayesian classification framework of the matrix variate Bingham distributions with the inclusion of its normalizing constant and develop a consistent general parametric modeling framework based on the Grassmann manifolds. To calculate the normalizing constants of the Bingham model, this paper extends the method of saddle-point approximation (SPA) to a new setting.

In a typical communication pipeline, images undergo a series of processing steps that can cause visual distortions before being viewed. Given a high quality reference image, a reference (R) image quality assessment (IQA) algorithm can be applied after compression or transmission. However, the assumption of a high quality reference image is often not fulfilled in practice, thus contributing to less accurate quality predictions when using stand-alone R IQA models.

In depth map coding, rate-distortion optimization for those pixels that will cause occlusion in view synthesis is a rather challenging task, since the synthesis distortion estimation is complicated by the warping competition and the occlusion order can be easily changed by the adopted optimization strategy. 

Visual attention is an important mechanism in the human visual system (HVS) and there have been numerous saliency detection algorithms designed for 2D images/video recently. However, the research for fixation detection of stereoscopic video is still limited and challenging due to the complicated depth and motion information. 

In this paper, a self-guiding multimodal LSTM (sgLSTM) image captioning model is proposed to handle an uncontrolled imbalanced real-world image-sentence dataset. We collect a FlickrNYC dataset from Flickr as our testbed with 306,165 images and the original text descriptions uploaded by the users are utilized as the ground truth for training.

We present a novel global non-rigid registration method for dynamic 3D objects. Our method allows objects to undergo large non-rigid deformations and achieves high-quality results even with substantial pose change or camera motion between views. In addition, our method does not require a template prior and uses less raw data than tracking-based methods since only a sparse set of scans is needed.

Pages

SPS on Twitter

SPS Videos


Signal Processing in Home Assistants

 


Multimedia Forensics


Careers in Signal Processing             

 


Under the Radar