IEEE Transactions on Image Processing

Hashing is a promising approach for compact storage and efficient retrieval of big data. Compared with conventional hashing methods that use handcrafted features, emerging deep hashing approaches employ deep neural networks to learn both feature representations and hash functions, and have proven more powerful and robust in real-world applications.

Fractional interpolation provides sub-pixel level references for motion compensation in the inter prediction of video coding, which attempts to remove temporal redundancy in video sequences. Traditional handcrafted fractional interpolation filters struggle to model discontinuous regions in videos, while existing deep learning-based methods are designed for a single quantization parameter (QP), generate only half-pixel samples, or require training a separate model for each sub-pixel position.

Recent studies have shown the effectiveness of using depth information in salient object detection. However, the most commonly seen images are still RGB images that contain no depth data.

Deep convolutional neural networks (CNNs) have revolutionized computer vision research and have seen unprecedented adoption for multiple tasks, such as classification, detection, and caption generation. However, they offer little transparency into their inner workings and are often treated as black boxes that deliver excellent performance.

Defocus blur detection is an important and challenging task in the fields of computer vision and digital imaging. Previous work on defocus blur detection has put a lot of effort into designing local sharpness metric maps.

A novel edge detection scheme based on the physical law of diffusion is presented in this paper. Although most current studies use data-based methods such as deep neural networks, these machine learning methods need large amounts of labeled ground-truth data as well as substantial resources for training. The widely used traditional methods, on the other hand, are based on the gradient of the grayscale or color of images and rely on various mathematical tools to detect edges.

Retrieving specific persons with various types of queries, e.g., a set of attributes or a portrait photo, has great application potential in large-scale intelligent surveillance systems. In this paper, we propose a richly annotated pedestrian (RAP) dataset, which serves as a unified benchmark for both attribute-based and image-based person retrieval in real surveillance scenarios. Previous datasets typically leave room for improvement in three respects: limited data scale and annotation types, heterogeneous data sources, and controlled scenarios.

Image annotation aims to annotate a given image with a variable number of class labels corresponding to diverse visual concepts. In this paper, we address two main issues in large-scale image annotation: 1) how to learn a rich feature representation suitable for predicting a diverse set of visual concepts, ranging from objects and scenes to abstract concepts, and 2) how to annotate an image with the optimal number of class labels.

Zero-shot learning (ZSL) for visual recognition aims to accurately recognize objects of unseen classes by mapping visual features to an embedding space spanned by class semantic information. However, the semantic gap between visual features and their underlying semantics is still a big obstacle in ZSL. Conventional ZSL methods typically construct the mapping from the original visual features, which are independent of the ZSL task, thus degrading the prediction performance.

The IEEE Transactions on Image Processing covers novel theory, algorithms, and architectures for the formation, capture, processing, communication, analysis, and display of images, video, and multidimensional signals in a wide variety of applications.
