IEEE Transactions on Image Processing


Geometric partitioning has attracted increasing attention for its remarkable ability to describe motion fields in the hybrid video coding framework. However, the existing geometric partitioning (GEO) scheme in Versatile Video Coding (VVC) imposes a non-negligible overhead for signaling side information, which limits coding efficiency. In view of this, we propose a spatio-temporal correlation guided geometric partitioning (STGEO) scheme to efficiently describe the object information in the motion field of video coding.

Most existing trackers use bounding boxes for object tracking. However, the background contained in the bounding box inevitably decreases the accuracy of the target model, which degrades tracker performance, particularly for non-rigid objects. To address this issue, this paper proposes a novel hybrid level set model that robustly handles topology changes, occlusions, and abrupt motion in non-rigid object tracking by accurately tracking the object contour.

Multi-view clustering aims to simultaneously obtain a consensus underlying subspace across multiple views and conduct clustering on the learned consensus subspace, and it has attracted considerable interest in image processing. In this paper, we propose the Semi-supervised Structured Subspace Learning algorithm for clustering data points from Multiple sources (SSSL-M). We explicitly extend traditional multi-view clustering in a semi-supervised manner and then build an anti-block-diagonal indicator matrix with a small amount of supervisory information to pursue the block-diagonal structure of the shared affinity matrix.

Diversity “multiple description” (MD) source coding promises graceful degradation in the presence of an a priori unknown number of erased packets in the channel. A simple coding scheme for the case of two packets consists of oversampling the source by a factor of two, followed by delta-sigma quantization. This approach was applied successfully to JPEG-based image coding over a lossy packet network, where the interpolation and splitting into two descriptions are done in the discrete cosine transform (DCT) domain.
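
The two-packet scheme described above can be illustrated with a toy one-dimensional sketch. This is not the DCT-domain method of the paper: it assumes sample repetition as the oversampling step, a first-order error-feedback (delta-sigma) quantizer, and an even/odd split into two descriptions. The names `encode`, `decode`, and `step` are hypothetical and chosen only for illustration.

```python
import math


def encode(source, step):
    """Oversample by two, delta-sigma quantize, and split into two descriptions."""
    # Assumed interpolator: repeat every sample twice (zero-order hold).
    oversampled = [s for s in source for _ in range(2)]
    quantized = []
    err = 0.0  # quantization error of the previous sample
    for u in oversampled:
        v = u - err                              # feed back the previous error
        q = step * math.floor(v / step + 0.5)    # uniform quantizer, |q - v| <= step / 2
        err = q - v
        quantized.append(q)
    # Description 0 carries the even samples, description 1 the odd ones.
    return quantized[0::2], quantized[1::2]


def decode(desc0, desc1):
    """Reconstruct from both descriptions, or from whichever one arrived."""
    if desc0 is None:
        return list(desc1)
    if desc1 is None:
        return list(desc0)
    # With both packets, averaging each pair cancels part of the shaped noise.
    return [(a + b) / 2 for a, b in zip(desc0, desc1)]


x = [0.13, 0.7, -1.2, 2.05, 0.0, 3.3]
d0, d1 = encode(x, step=0.5)
both = decode(d0, d1)       # both packets received
only0 = decode(d0, None)    # second packet erased
```

In this toy setup the reconstruction error is at most `step / 2` when both descriptions arrive and at most `step` when only one does, which is the graceful degradation the abstract refers to.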

Video surveillance and its applications have become increasingly ubiquitous in modern daily life. In a video surveillance system, video coding is a critical enabling technology that determines how effectively surveillance videos are transmitted and stored. To meet the real-time or time-critical transmission requirements of video surveillance systems, the low-delay (LD) configuration of the advanced high efficiency video coding (HEVC) standard is usually used to encode surveillance videos.

RGB-thermal salient object detection (RGBT SOD) aims to segment the common prominent regions of a visible image and the corresponding thermal infrared image. Existing methods do not fully explore and exploit the complementarity of the different modalities and the multi-type cues of image contents, which play a vital role in achieving accurate results.

We propose a neural network model to estimate the current frame from two reference frames, using affine transformation and adaptive spatially-varying filters. The estimated affine transformation allows for using shorter filters compared to existing approaches for deep frame prediction. The predicted frame is used as a reference for coding the current frame.

Radial distortion is widespread in images captured by popular wide-angle and fisheye cameras. Despite the long history of distortion rectification, accurately estimating the distortion parameters from a single distorted image is still challenging. The main reason is that these parameters are implicit in image features, which makes it difficult for networks to fully learn the distortion information.

The performance of ellipse fitting may degrade significantly in the presence of outliers, which can be caused by occlusion of the object, mirror reflection, or other objects in the process of edge detection. In this paper, we propose an ellipse fitting method that is robust against outliers and thus maintains stable performance when outliers are present.

Gait recognition aims to identify people by their walking styles. Compared with face and fingerprint recognition, it has the unique advantages of being non-contact and working at long distances. Cross-view gait recognition is a challenging task because view variance can have a large impact on gait silhouettes.
