TMM Volume 23 | 2021 | IEEE Signal Processing Society

2022

TMM Volume 23 | 2021

Semantic-Driven Interpretable Deep Multi-Modal Hashing for Large-Scale Multimedia Retrieval

Multi-modal hashing focuses on fusing different modalities and exploring the complementarity of heterogeneous multi-modal data for compact hash learning. However, existing multi-modal hashing methods still suffer from several problems, including: 1) Almost all existing methods generate unexplainable hash codes. They roughly assume that the contribution of each hash code bit to the retrieval results is the same, ignoring the discriminative information embedded in hash learning and semantic similarity in hash retrieval.
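To illustrate what it means for every hash bit to contribute equally, the following minimal sketch compares plain Hamming distance with a per-bit weighted variant. The codes, the weights, and the function names are hypothetical illustrations; this is not the method proposed in the paper.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hash codes (stored as ints) differ."""
    return bin(a ^ b).count("1")


def weighted_hamming_distance(a: int, b: int, weights: list[float]) -> float:
    """Sum the weights of the bit positions where the two codes differ.

    weights[i] is the (assumed) importance of bit i, least significant first.
    """
    diff = a ^ b
    return sum(w for i, w in enumerate(weights) if (diff >> i) & 1)


# Hypothetical 4-bit codes: one query and two database items.
query = 0b1011
database = {"x": 0b1010, "y": 0b0011}

# Hypothetical per-bit weights, bit 0 first.
weights = [0.1, 0.5, 0.5, 2.0]

# Plain Hamming distance treats all bits alike, so both items tie.
plain = {k: hamming_distance(query, v) for k, v in database.items()}

# With weights, a mismatch on an important bit costs more.
weighted = {k: weighted_hamming_distance(query, v, weights)
            for k, v in database.items()}
```

Under these assumed weights, the two items are equally close to the query by plain Hamming distance, while the weighted distance ranks `x` ahead of `y`.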

On Reliable Multi-View Affinity Learning for Subspace Clustering

TMM Featured Articles

In multi-view subspace clustering, the low-rankness of the stacked self-representation tensor is widely accepted to capture the high-order cross-view correlation. However, when the nuclear norm is used as a convex surrogate of the rank function, the self-representation tensor exhibits strong connectivity with dense coefficients. When noise exists in the data, the generated affinity matrix may be unreliable for subspace clustering, because the lack of sparsity lets it retain connections between samples from different clusters.
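For reference, the surrogate relationship can be written out in the simpler matrix case. This is the standard matrix definition, given here as an illustration only; the tensor nuclear norm used in the paper is not specified in this excerpt. The symbols $Z$, $m$, $n$, and $\sigma_i$ are introduced here for illustration.

```latex
% Z is an m x n matrix with singular values \sigma_1(Z) \ge \sigma_2(Z) \ge \dots \ge 0.
\operatorname{rank}(Z) = \#\{\, i : \sigma_i(Z) > 0 \,\},
\qquad
\|Z\|_{*} = \sum_{i=1}^{\min(m,n)} \sigma_i(Z)
```

The rank counts the nonzero singular values, while the nuclear norm sums them, so minimizing the nuclear norm encourages low rank but does not by itself force individual entries of $Z$ to zero.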

Deep Reinforcement Polishing Network for Video Captioning

TMM Featured Articles

The video captioning task aims to describe video content using several natural-language sentences. Although one-step encoder-decoder models have achieved promising progress, the generated captions always involve many errors, which are mainly caused by the large semantic gap between the visual domain and the language domain and by the difficulty of long-sequence generation.

LD-MAN: Layout-Driven Multimodal Attention Network for Online News Sentiment Recognition

TMM Featured Articles

The prevailing use of both images and text to express opinions on the web leads to the need for multimodal sentiment recognition. Some commonly used social media data containing short text and few images, such as tweets and product reviews, have been well studied. However, it is still challenging to predict the readers’ sentiment after reading online news articles, since news articles often have more complicated structures, e.g., longer text and more images.

Dense Video Captioning Using Graph-Based Sentence Summarization

TMM Featured Articles

Recently, dense video captioning has made attractive progress in detecting and captioning all events in a long untrimmed video. Although promising results have been achieved, most existing methods do not sufficiently explore the scene evolution within an event temporal proposal for captioning, and therefore perform less satisfactorily when the scenes and objects change over a relatively long proposal. To address this problem, we propose a graph-based partition-and-summarization (GPaS) framework for dense video captioning in two stages.

Salient Object Detection by Fusing Local and Global Contexts

TMM Featured Articles

Benefiting from the powerful discriminative feature learning capability of convolutional neural networks (CNNs), deep learning techniques have achieved remarkable performance improvements on the task of salient object detection (SOD) in recent years.

A Brain-Media Deep Framework Towards Seeing Imaginations Inside Brains

TMM Featured Articles

While current research on multimedia essentially deals with information derived from our observations of the world, internal activities inside human brains, such as imaginations and memories of past events, could become a brand new concept of multimedia, for which we coin the term “brain-media”.

A New Image Compression Algorithm Based on Non-Uniform Partition and U-System

TMM Featured Articles

JPEG lossy compression is a still-image compression model that is currently widely used in major network media. However, the quality of its compressed images is unsatisfactory at low bit rates. The objective of this paper is to improve the quality of compressed images and suppress blocking artifacts by improving the JPEG image compression model at low bit rates.

Adversarial Learning for Personalized Tag Recommendation

TMM Featured Articles

We have recently seen great progress in image classification due to the success of deep convolutional neural networks and the availability of large-scale datasets. Most of the existing work focuses on single-label image classification. However, there are usually multiple tags associated with an image. Existing work on multi-label classification is mainly based on lab-curated labels.

A 460 GOPS/W Improved Mnemonic Descent Method-Based Hardwired Accelerator for Face Alignment

The mnemonic descent method (MDM) algorithm is the first end-to-end recurrent convolutional system for high-accuracy face alignment. However, the heavy computational complexity and high memory access demands make it difficult to satisfy the requirements of real-time applications. To address this problem, an improved MDM (I-MDM) algorithm is proposed for efficient hardware implementation based on several hardware-oriented optimizations.
