JDNet: A Joint-Learning Distilled Network for Mobile Visual Food Recognition


By: Heng Zhao; Kim-Hui Yap; Alex Chichung Kot; Lingyu Duan

Visual food recognition on mobile devices has attracted increasing attention in recent years due to its role in individual diet monitoring and social health management and analysis. Existing visual food recognition approaches usually rely on large server-based networks to achieve high accuracy. However, these networks are not compact enough to be deployed on mobile devices. Although some compact architectures have been proposed, most of them cannot match the performance of full-size networks. In view of this, this paper proposes a Joint-learning Distilled Network (JDNet) that aims to achieve high food recognition accuracy with a compact student network by learning from a large teacher network, while retaining a compact network size. In contrast to conventional one-directional knowledge distillation methods, the proposed JDNet uses a novel joint-learning framework in which the large teacher network and the small student network are trained simultaneously, leveraging different intermediate-layer features in both networks. JDNet introduces a new Multi-Stage Knowledge Distillation (MSKD) for simultaneous student-teacher training at different levels of abstraction. A new Instance Activation Learning (IAL) is also proposed to jointly train the student and teacher on the instance activation map of each training sample. Experimental results show that the trained student model achieves state-of-the-art Top-1 recognition accuracies on the benchmark UECFood-256 and Food-101 datasets of 84.0% and 91.2%, respectively, while retaining a 4x smaller network size for mobile deployment.
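The abstract does not give the paper's loss formulation, but the general idea behind knowledge distillation with intermediate-feature matching can be illustrated with a minimal NumPy sketch. All function names, the temperature, and the weighting hyperparameters below are illustrative assumptions, not details taken from JDNet:

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled, numerically stable softmax.
    z = z / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Classic soft-target distillation loss (Hinton-style), shown as a
    stand-in for the logit-level part of student-teacher training."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    # KL divergence between softened teacher and student distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    kd = np.mean(np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)),
                        axis=-1)) * T * T
    # Standard cross-entropy against the hard ground-truth labels.
    p = softmax(student_logits)
    ce = -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))
    return alpha * kd + (1 - alpha) * ce

def feature_matching_loss(student_feats, teacher_feats):
    # Multi-stage term: MSE between student and teacher features at
    # several intermediate stages (assumes matching shapes per stage).
    return sum(np.mean((s - t) ** 2)
               for s, t in zip(student_feats, teacher_feats))
```

In a joint-learning setup such as the one the abstract describes, both networks would be updated each step, with the logit-level and intermediate-feature terms combined into a single training objective.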
