Multi-Task Learning for Acoustic Event Detection Using Event and Frame Position Information

You are here

Top Reasons to Join SPS Today!

1. IEEE Signal Processing Magazine
2. Signal Processing Digital Library*
3. Inside Signal Processing Newsletter
4. SPS Resource Center
5. Career advancement & recognition
6. Discounts on conferences and publications
7. Professional networking
8. Communities for students, young professionals, and women
9. Volunteer opportunities
10. Coming soon! PDH/CEU credits
Click here to learn more.

Multi-Task Learning for Acoustic Event Detection Using Event and Frame Position Information

By: 
Xianjun Xia; Roberto Togneri; Ferdous Sohel; Yuanjun Zhao; Defeng Huang

Smoke detection plays an important role in industrial safety warning systems and fire prevention. Due to the complicated changes in the shape, texture, and color of smoke, identifying the smoke from a given image still remains a substantial challenge, and this has accordingly aroused a considerable amount of research attention recently. To address the problem, we devise a new deep dual-channel neural network (DCNN) for smoke detection. In contrast to popular deep convolutional networks (e.g., Alex-Net, VGG-Net, Res-Net, and Dense-Net and the DNCNN that is specifically devoted to detecting smoke), our proposed end-to-end network is mainly composed of dual channels of deep subnetworks. In the first subnetwork, we sequentially connect multiple convolutional layers and max-pooling layers. Then, we selectively append the batch normalization layer to each convolutional layer for overfitting reduction and training acceleration. The first subnetwork is shown to be good at extracting the detailed information of smoke, such as texture. In the second subnetwork, in addition to the convolutional, batch normalization, and max-pooling layers, we further introduce two important components. One is the skip connection for avoiding the vanishing gradient and improving the feature propagation. The other is the global average pooling for reducing the number of parameters and mitigating the overfitting issue. The second subnetwork can capture the base information of smoke, such as contours. We finally deploy a concatenation operation to combine the aforementioned two deep subnetworks to complement each other. Based on the augmented data obtained by rotating the training images, our proposed DCNN can promptly and stably converge to the perfect performance. Experimental results conducted on the publicly available smoke detection database verify that the proposed DCNN has attained a very high detection rate that exceeds 99.5% on average, superior to state-of-the-art relevant competitors. 

SPS on Twitter

SPS Videos


Signal Processing in Home Assistants

 


Multimedia Forensics


Careers in Signal Processing             

 


Under the Radar