RGB-T Salient Object Detection via Fusing Multi-Level CNN Features

Qiang Zhang; Nianchang Huang; Lin Yao; Dingwen Zhang; Caifeng Shan; Jungong Han

RGB-induced salient object detection has recently witnessed substantial progress, which is attributed to the superior feature learning capability of deep convolutional neural networks (CNNs). However, such detectors struggle in challenging scenarios characterized by cluttered backgrounds, low-light conditions and variations in illumination. Instead of improving RGB-based saliency detection, this paper takes advantage of the complementary benefits of RGB and thermal infrared images. Specifically, we propose a novel end-to-end network for multi-modal salient object detection, which turns the challenge of RGB-T saliency detection into a CNN feature fusion problem. To this end, a backbone network (e.g., VGG-16) is first adopted to extract coarse features from each RGB or thermal infrared image individually. Several adjacent-depth feature combination (ADFC) modules are then designed to extract multi-level refined features for each single-modal input image, considering that features captured at different depths differ in semantic information and visual details. Subsequently, a multi-branch group fusion (MGF) module is employed to capture cross-modal features by fusing the features from the ADFC modules for an RGB-T image pair at each level. Finally, a joint attention guided bi-directional message passing (JABMP) module performs saliency prediction by integrating the multi-level fused features from the MGF modules. Experimental results on several public RGB-T salient object detection datasets demonstrate the superiority of the proposed algorithm over state-of-the-art approaches, especially under challenging conditions such as poor illumination, complex backgrounds and low contrast.
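
The four stages described above (backbone, ADFC, MGF, JABMP) can be traced with a shape-only sketch. This is not the authors' implementation: the abstract does not specify the internals of any module, so the function bodies below are placeholders that only record which features feed which stage. The names `Feature`, `backbone`, `adfc`, `mgf`, `jabmp` and `predict_shape`, the use of five levels, the 64-channel refined width, and the full-resolution output are all assumptions made for illustration. The only borrowed facts are the VGG-16 conv-block channel widths and strides.

```python
from dataclasses import dataclass
from typing import List, Tuple

# VGG-16 conv-block output channels and cumulative strides (before each pool).
VGG16_CHANNELS = (64, 128, 256, 512, 512)
VGG16_STRIDES = (1, 2, 4, 8, 16)


@dataclass(frozen=True)
class Feature:
    """Shape-only stand-in for a feature map; carries no pixel data."""
    modality: str                  # "rgb", "thermal", "fused" or "saliency"
    level: int                     # 1 = shallowest backbone block
    channels: int
    height: int
    width: int
    sources: Tuple[str, ...] = ()  # which earlier features fed this one


def backbone(modality: str, height: int, width: int) -> List[Feature]:
    """Coarse per-modality features, one per VGG-16 conv block."""
    return [
        Feature(modality, lvl, ch, height // s, width // s)
        for lvl, (ch, s) in enumerate(zip(VGG16_CHANNELS, VGG16_STRIDES), start=1)
    ]


def adfc(coarse: List[Feature], out_channels: int = 64) -> List[Feature]:
    """Hypothetical ADFC: refine each level from itself and its adjacent depths."""
    refined = []
    for i, f in enumerate(coarse):
        neighbours = coarse[max(0, i - 1): i + 2]  # previous, current, next
        tags = tuple(f"{n.modality}:{n.level}" for n in neighbours)
        # Assumed: neighbours are resized to f's resolution and reduced in width.
        refined.append(
            Feature(f.modality, f.level, out_channels, f.height, f.width, tags)
        )
    return refined


def mgf(rgb: List[Feature], thermal: List[Feature]) -> List[Feature]:
    """Hypothetical MGF: fuse the two modalities level by level."""
    return [
        Feature("fused", r.level, r.channels, r.height, r.width,
                r.sources + t.sources)
        for r, t in zip(rgb, thermal)
    ]


def jabmp(fused: List[Feature]) -> Feature:
    """Hypothetical JABMP: integrate all fused levels into one saliency map."""
    finest = fused[0]  # assumed: prediction at the shallowest level's resolution
    tags = tuple(f"fused:{f.level}" for f in fused)
    return Feature("saliency", 0, 1, finest.height, finest.width, tags)


def predict_shape(height: int, width: int) -> Feature:
    """Run both single-modal streams, fuse them, and return the output shape."""
    rgb = adfc(backbone("rgb", height, width))
    thermal = adfc(backbone("thermal", height, width))
    return jabmp(mgf(rgb, thermal))
```

Calling `predict_shape(224, 224)` returns a single-channel `Feature` at the input resolution, and inspecting the `sources` field of any intermediate feature shows which single-modal levels contributed to it. In a real network each placeholder would be replaced by learned layers; the sketch only makes the order of operations and the per-level pairing of RGB and thermal features explicit.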
