IEEE TMM Article

Deep learning-based blind image deblurring plays an essential role in removing image blur, since all existing blur kernels are limited in modeling real-world blur. Thus far, researchers have focused on building powerful models to handle the deblurring problem and have achieved decent results. In this work, from a new perspective, we identify a great opportunity for image enhancement (e.g., deblurring) directly from RAW images and investigate novel neural network structures that benefit RAW-based learning.
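As a minimal sketch of what working "directly from RAW images" involves in practice (using the third-party rawpy library; the file name and processing steps are placeholders, not the pipeline proposed in the paper):

import rawpy
import numpy as np

# Open a RAW file (placeholder path) and access the unprocessed sensor measurements.
with rawpy.imread("example.dng") as raw:
    bayer = raw.raw_image_visible.astype(np.float32)  # mosaiced Bayer data, before the camera ISP
    # A RAW-based enhancement network would consume `bayer` here,
    # rather than the camera-processed sRGB image.
    rgb = raw.postprocess()  # default demosaic / white balance / gamma, for comparison

print(bayer.shape, rgb.shape)

The point of the sketch is that RAW data preserves linear, higher-bit-depth sensor information that the in-camera processing pipeline discards, which is what RAW-based learning seeks to exploit.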

Pedestrian attribute recognition aims at generating a structured description of a pedestrian, which plays an important role in surveillance. However, it is difficult to achieve accurate recognition results due to diverse illumination, partial body occlusion, and limited resolution. Therefore, this paper proposes a comprehensive relationship framework that describes and utilizes relations among attributes, represents different types of relations in the same dimension, and implements complex transfers of relations in a GCN manner.
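For reference, the "GCN manner" of propagating information over a relation graph typically follows the standard graph convolution rule of Kipf and Welling (a generic formulation, not necessarily the exact layer used in this paper):

H^{(l+1)} = \sigma\left( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)} \right), \qquad \tilde{A} = A + I,

where A is the adjacency matrix encoding the attribute relations, \tilde{D} is the degree matrix of \tilde{A}, H^{(l)} holds the node (attribute) features at layer l, and W^{(l)} is a learnable weight matrix.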

In this paper, we present LensCast, a novel cross-layer video transmission framework for wireless networks, which seamlessly integrates millimeter wave (mmWave) lens multiple-input multiple-output (MIMO) with robust video transmission. LensCast is designed to exploit the video content diversity at the application layer, together with the spatial path diversity of the lens antenna array at the physical layer, to achieve gracefully degrading video quality under varying channel conditions.

Low-light images suffer from a low dynamic range and severe noise due to a low signal-to-noise ratio (SNR). In this paper, we propose joint contrast enhancement and noise reduction of low-light images via a just-noticeable-difference (JND) transform. We adopt the JND transform to achieve both contrast enhancement and noise reduction based on human visual perception.
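For context, a classical luminance-adaptation JND model (the Chou-Li formulation, stated here as background, not necessarily the exact transform proposed in the paper) gives the visibility threshold as a function of background luminance b:

T(b) = 17\left(1 - \sqrt{b/127}\right) + 3 \quad \text{for } b \le 127, \qquad T(b) = \frac{3}{128}(b - 127) + 3 \quad \text{for } b > 127.

Intensity changes below T(b) are imperceptible and can therefore be suppressed as noise, while changes above the threshold can be amplified for contrast enhancement.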

Omnidirectional video, also known as 360-degree video, has become increasingly popular due to its ability to provide immersive and interactive visual experiences. However, the ultra-high resolution and the spherical observation space brought by the wide viewing range make omnidirectional video distinctly different from traditional 2D video. To date, video quality assessment (VQA) for omnidirectional video remains an open issue.

Multi-modal hashing focuses on fusing different modalities and exploring the complementarity of heterogeneous multi-modal data for compact hash learning. However, existing multi-modal hashing methods still suffer from several problems, including: 1) almost all existing methods generate unexplainable hash codes; they roughly assume that each hash bit contributes equally to the retrieval results, ignoring the discriminative information embedded in hash learning and the semantic similarity in hash retrieval.
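To make the "equal bit contribution" assumption concrete, here is a minimal sketch contrasting the standard unweighted Hamming distance used in hash retrieval with a weighted variant in which each bit carries a distinct importance; the code and weights are illustrative only, not the method criticized or proposed in the paper:

import numpy as np

def hamming(a, b, w=None):
    # Count differing bits; if per-bit weights are given, each mismatch
    # contributes its weight instead of 1.
    diff = (a != b).astype(np.float32)
    return float(diff.sum() if w is None else (diff * w).sum())

a = np.array([1, 0, 1, 1])
b = np.array([1, 1, 0, 1])
print(hamming(a, b))                                    # 2.0: every bit counts equally
print(hamming(a, b, w=np.array([0.1, 0.5, 1.0, 0.2])))  # 1.5: bits weighted by importance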

In multi-view subspace clustering, the low-rankness of the stacked self-representation tensor is widely accepted as a way to capture high-order cross-view correlation. However, when the nuclear norm is used as a convex surrogate of the rank function, the self-representation tensor exhibits strong connectivity with dense coefficients. When noise exists in the data, the generated affinity matrix may be unreliable for subspace clustering, as it retains connections between inter-cluster samples due to the lack of sparsity.
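As background, a common low-rank tensor formulation of multi-view subspace clustering (a generic objective, not necessarily the one used in this paper) is

\min_{\{Z^{(v)}\}, \{E^{(v)}\}} \; \|\mathcal{Z}\|_{*} + \lambda \sum_{v=1}^{V} \|E^{(v)}\|_{2,1} \quad \text{s.t.} \quad X^{(v)} = X^{(v)} Z^{(v)} + E^{(v)}, \quad \mathcal{Z} = \Phi\big(Z^{(1)}, \ldots, Z^{(V)}\big),

where X^{(v)} is the data matrix of view v, Z^{(v)} is its self-representation matrix, \Phi stacks the Z^{(v)} into the tensor \mathcal{Z}, and \|\cdot\|_{*} denotes a tensor nuclear norm. The affinity matrix for spectral clustering is then typically built as A = \frac{1}{V}\sum_{v}\frac{|Z^{(v)}| + |Z^{(v)}|^{\top}}{2}; the abstract's point is that the nuclear-norm term alone yields dense Z^{(v)}, so A keeps inter-cluster connections unless sparsity is also enforced.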

The video captioning task aims to describe video content using several natural-language sentences. Although one-step encoder-decoder models have achieved promising progress, the generated captions often contain errors, which are mainly caused by the large semantic gap between the visual domain and the language domain and by the difficulty of long-sequence generation.

The prevailing use of both images and text to express opinions on the web leads to the need for multimodal sentiment recognition. Commonly used social media data containing short text and few images, such as tweets and product reviews, have been well studied. However, it remains challenging to predict readers' sentiment after reading online news articles, since news articles often have more complicated structures, e.g., longer text and more images.

Recently, dense video captioning has made attractive progress in detecting and captioning all events in a long untrimmed video. Although promising results have been achieved, most existing methods do not sufficiently explore the scene evolution within an event temporal proposal for captioning, and therefore perform less satisfactorily when the scenes and objects change over a relatively long proposal. To address this problem, we propose a graph-based partition-and-summarization (GPaS) framework for dense video captioning in two stages.
