Dense Video Captioning Using Graph-Based Sentence Summarization

Recently, dense video captioning has made notable progress in detecting and captioning all events in a long untrimmed video. Although promising results have been achieved, most existing methods do not sufficiently explore the scene evolution within an event temporal proposal for captioning, and therefore perform less satisfactorily when the scenes and objects change over a relatively long proposal. To address this problem, we propose a two-stage graph-based partition-and-summarization (GPaS) framework for dense video captioning.

Diversity Image Coding Using Irregular Interpolation

Diversity “multiple description” (MD) source coding promises graceful degradation in the presence of an a priori unknown number of erased packets in the channel. A simple coding scheme for the case of two packets consists of oversampling the source by a factor of two, followed by delta-sigma quantization. This approach was applied successfully to JPEG-based image coding over a lossy packet network, where the interpolation and splitting into two descriptions are done in the discrete cosine transform (DCT) domain.
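The two-packet scheme mentioned above can be illustrated with a minimal one-dimensional sketch. It is not the DCT-domain image coder from the abstract: linear interpolation stands in for a proper interpolation filter, the quantizer is a plain first-order error-feedback loop, and all function names (`oversample_by_two`, `delta_sigma_quantize`, `encode`) are hypothetical.

```python
def oversample_by_two(source):
    """Upsample by a factor of two using linear interpolation (an assumed,
    simplified stand-in for a real interpolation filter)."""
    out = []
    for k, s in enumerate(source):
        nxt = source[k + 1] if k + 1 < len(source) else s  # hold the last sample
        out.extend([s, 0.5 * (s + nxt)])
    return out


def delta_sigma_quantize(samples, step):
    """First-order error-feedback quantizer.

    Each output satisfies y[n] = x[n] + e[n] - e[n-1], so the quantization
    noise is pushed toward high frequencies.
    """
    out, err = [], 0.0
    for x in samples:
        v = x - err                 # subtract the previous quantization error
        y = step * round(v / step)  # uniform scalar quantizer
        err = y - v                 # error carried into the next sample
        out.append(y)
    return out


def encode(source, step):
    """Return two descriptions: the even and odd quantized samples."""
    y = delta_sigma_quantize(oversample_by_two(source), step)
    return y[0::2], y[1::2]


if __name__ == "__main__":
    d0, d1 = encode([0.0, 1.3, 2.7, 3.1, 2.2, 0.4], step=0.5)
    print(d0)
    print(d1)
```

Each description has as many samples as the source, so either packet alone gives a coarse reconstruction, while a decoder that receives both can low-pass filter the interleaved sequence to suppress the shaped noise.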

Quantization Parameter Cascading for Surveillance Video Coding Considering All Inter Reference Frames

Video surveillance and its applications have become increasingly ubiquitous in modern daily life. In a video surveillance system, video coding is a critical enabling technology that determines how effectively surveillance videos are transmitted and stored. To meet the real-time or time-critical transmission requirements of video surveillance systems, the low-delay (LD) configuration of the high efficiency video coding (HEVC) standard is usually used to encode surveillance videos.

Multi-Interactive Dual-Decoder for RGB-Thermal Salient Object Detection

RGB-thermal salient object detection (RGBT SOD) aims to segment the common prominent regions of a visible image and its corresponding thermal infrared image. Existing methods do not fully explore and exploit the complementarity of the different modalities and the multi-type cues of image contents, which play a vital role in achieving accurate results.

De-Pois: An Attack-Agnostic Defense against Data Poisoning Attacks

Machine learning techniques have been widely applied to various applications. However, they are potentially vulnerable to data poisoning attacks, where sophisticated attackers can disrupt the learning procedure by injecting a fraction of malicious samples into the training dataset. Existing defense techniques against poisoning attacks are largely attack-specific: each is designed for one specific type of attack and does not work for other types, mainly due to the distinct principles they follow.

Machine Learning in Wavelet Domain for Electromagnetic Emission Based Malware Analysis

This paper presents a signal processing and machine learning (ML) based methodology that leverages electromagnetic (EM) emissions from an embedded device to remotely detect a malicious application running on the device and classify the application into a malware family. We develop Fast Fourier Transform (FFT) based feature extraction followed by Support Vector Machine (SVM) and Random Forest (RF) based ML models to detect malware.
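As a rough illustration of FFT-based feature extraction, the sketch below turns a sampled trace into a vector of spectral magnitudes. It is an assumption-laden example, not the paper's pipeline: the radix-2 FFT is a textbook recursive version, the function names (`fft`, `spectral_features`) are hypothetical, and the SVM and RF classifiers that would consume these features are not shown.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; length must be a power of two."""
    n = len(x)
    if n == 0 or (n & (n - 1)):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def spectral_features(trace):
    """Magnitudes of the non-negative-frequency bins below Nyquist."""
    return [abs(c) for c in fft(trace)[: len(trace) // 2]]


if __name__ == "__main__":
    # Synthetic stand-in for a measured trace: a tone at bin 3 of 16.
    trace = [math.cos(2 * math.pi * 3 * n / 16) for n in range(16)]
    features = spectral_features(trace)
    print(features.index(max(features)))
```

In practice one feature vector would be computed per captured trace and passed to a classifier; the choice of window length and which bins to keep is left open here.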

Beyond Universal Person Re-Identification Attack

Deep learning-based person re-identification (Re-ID) has made great progress and achieved high performance recently. In this paper, we make the first attempt to examine the vulnerability of current person Re-ID models against a dangerous attack method, i.e., the universal adversarial perturbation (UAP) attack, which has been shown to fool classification models with little overhead.

Total Utility Metric Based Dictionary Pruning for Sparse Hyperspectral Unmixing

Given a spectral library, sparse unmixing aims to estimate the fractional proportions in each pixel of a hyperspectral image scene. However, the ever-growing dimensionality of spectral dictionaries strongly limits the performance of sparse unmixing algorithms. In this study, we propose a novel dictionary pruning (DP) approach to improve the performance of sparse unmixing algorithms, making them more accurate and time-efficient.

Green Fluorescent Protein and Phase Contrast Image Fusion Via Detail Preserving Cross Network

In cell and molecular biology, the fusion of green fluorescent protein (GFP) and phase contrast (PC) images aims to generate a composite image, which can simultaneously display the functional information in the GFP image related to the molecular distribution of biological living cells and the structural information in the PC image such as nucleus and mitochondria. In this paper, we propose a detail preserving cross network (DPCN), which consists of a structural-guided functional feature extraction branch (SFFEB), a functional-guided structural feature extraction branch (FSFEB) and a detail preserving module (DPM), to address the GFP and PC image fusion issue.