TMM Featured Articles

Semantic-Driven Interpretable Deep Multi-Modal Hashing for Large-Scale Multimedia Retrieval

TMM Volume 23 | 2021

Multi-modal hashing focuses on fusing different modalities and exploring the complementarity of heterogeneous multi-modal data for compact hash learning. However, existing multi-modal hashing methods still suffer from several problems, including: 1) Almost all existing methods generate unexplainable hash codes. They roughly assume that the contribution of each hash code bit to the retrieval results is the same, ignoring the discriminative information embedded in hash learning and semantic similarity in hash retrieval.

On Reliable Multi-View Affinity Learning for Subspace Clustering

TMM Volume 23 | 2021

In multi-view subspace clustering, the low-rankness of the stacked self-representation tensor is widely accepted to capture the high-order cross-view correlation. However, using the nuclear norm as a convex surrogate of the rank function, the self-representation tensor exhibits strong connectivity with dense coefficients. When noise exists in the data, the generated affinity matrix may be unreliable for subspace clustering as it retains the connections across inter-cluster samples due to the lack of sparsity.

Deep Reinforcement Polishing Network for Video Captioning

TMM Volume 23 | 2021

The video captioning task aims to describe video content using several natural-language sentences. Although one-step encoder-decoder models have achieved promising progress, the generated captions always contain many errors, which are mainly caused by the large semantic gap between the visual and language domains and by the difficulty of long-sequence generation.

LD-MAN: Layout-Driven Multimodal Attention Network for Online News Sentiment Recognition

TMM Volume 23 | 2021

The prevailing use of both images and text to express opinions on the web leads to the need for multimodal sentiment recognition. Some commonly used social media data containing short text and few images, such as tweets and product reviews, have been well studied. However, it is still challenging to predict the readers’ sentiment after reading online news articles, since news articles often have more complicated structures, e.g., longer text and more images.

Dense Video Captioning Using Graph-Based Sentence Summarization

TMM Volume 23 | 2021

Recently, dense video captioning has made notable progress in detecting and captioning all events in a long untrimmed video. Although promising results have been achieved, most existing methods do not sufficiently explore the scene evolution within an event temporal proposal for captioning, and therefore perform less satisfactorily when the scenes and objects change over a relatively long proposal. To address this problem, we propose a two-stage graph-based partition-and-summarization (GPaS) framework for dense video captioning.

Salient Object Detection by Fusing Local and Global Contexts

TMM Volume 23 | 2021

Benefiting from the powerful discriminative feature learning capability of convolutional neural networks (CNNs), deep learning techniques have achieved remarkable performance improvement for the task of salient object detection (SOD) in recent years.

A Brain-Media Deep Framework Towards Seeing Imaginations Inside Brains

TMM Volume 23 | 2021

While current multimedia research essentially deals with information derived from our observations of the world, internal activities inside human brains, such as imaginations and memories of past events, could become a brand-new concept of multimedia, for which we coin the term “brain-media”.

A New Image Compression Algorithm Based on Non-Uniform Partition and U-System

TMM Volume 23 | 2021

JPEG lossy compression is a still-image compression model that is currently widely used in major network media. However, the quality of its compressed images is unsatisfactory at low bit rates. The objective of this paper is to improve the quality of compressed images and suppress blocking artifacts at low bit rates by improving the JPEG image compression model.

Adversarial Learning for Personalized Tag Recommendation

TMM Volume 23 | 2021

We have recently seen great progress in image classification due to the success of deep convolutional neural networks and the availability of large-scale datasets. Most existing work focuses on single-label image classification. However, an image usually has multiple tags associated with it, and existing work on multi-label classification is mainly based on lab-curated labels.

Auto-Embedding Generative Adversarial Networks For High Resolution Image Synthesis

TMM Volume 21 Issue 12

Generating images via a generative adversarial network (GAN) has attracted much attention recently. However, most of the existing GAN-based methods can only produce low-resolution images of limited quality. Directly generating high-resolution images using GANs is nontrivial, and often produces problematic images with incomplete objects.
