The Interpretable Fast Multi-Scale Deep Decoder for the Standard HEVC Bitstreams

IEEE Transactions on Multimedia

By: Wenhui Xiao; Huiguo He; Tingting Wang; Hongyang Chao

Restoring decoded videos from existing bitstreams with deep neural networks, so as to improve compression efficiency at the decoder end, is an active research topic. Existing research has verified that exploiting redundancy at the decoder end that the encoder leaves underused can increase compression efficiency. However, most of this research neglects the abundant multi-scale information among video frames, which is a typical form of such redundancy. How to build an effective, interpretable, and fast deep neural network that uses multi-scale similarity at the decoder end to further enhance compression efficiency remains an interesting yet challenging question. To this end, this paper exploits the underused inter-frame multi-scale information and proposes the Fast Multi-Scale Deep Decoder (Fast MSDD) for the state-of-the-art video coding standard HEVC. The advantages of Fast MSDD are threefold. First, it achieves higher coding efficiency without modifying any encoding algorithm. Second, it is interpretable, because it is built on the framework of exploiting underused redundancy. Third, it preserves the model's inference speed while fully using the multi-scale similarity among video frames. Extensive experimental results verify Fast MSDD's effectiveness, interpretability, and computational efficiency. Fast MSDD obtains average BD gains of 14.3%, 10.8%, 8.5%, and 7.6% for the AI, LP, LB, and RA configurations, respectively. Compared with our previous work, MSDD, these are increases of 59.3%, 49.1%, 61.0%, and 29.3%. Meanwhile, BD gains of 16.9%, 11.2%, 9.2%, and 8.3% are observed on videos with scale changes, which validates the interpretability of the proposed method. Furthermore, Fast MSDD saves up to 56.3% of the time compared to MSDD.
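
For readers unfamiliar with the "BD" figures quoted in the abstract, the following is a minimal sketch of how a Bjøntegaard delta rate (BD-rate) is commonly computed from two rate-distortion curves. It is not the authors' evaluation code. The abstract does not say which Bjøntegaard metric or sign convention it uses, so BD-rate is an assumption here, and the function names and sample numbers are hypothetical. The sketch assumes four rate/PSNR points per curve, interpolates log-bitrate as a cubic in PSNR, and averages the gap between the two curves over their shared PSNR range.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate at x the polynomial that interpolates the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def simpson(f, a, b):
    """Simpson's rule on [a, b]; exact for polynomials up to degree 3."""
    return (b - a) / 6.0 * (f(a) + 4.0 * f((a + b) / 2.0) + f(b))


def bd_rate(rates_anchor, psnr_anchor, rates_test, psnr_test):
    """Average bitrate change (%) of the test curve vs. the anchor at equal PSNR.

    Negative values mean the test method needs fewer bits for the same quality.
    With four points per curve the interpolant is a cubic, so Simpson's rule
    integrates it exactly.
    """
    log_anchor = [math.log(r) for r in rates_anchor]
    log_test = [math.log(r) for r in rates_test]

    # Integrate only over the PSNR interval covered by both curves.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    if hi <= lo:
        raise ValueError("PSNR ranges of the two curves do not overlap")

    int_anchor = simpson(lambda q: lagrange_eval(psnr_anchor, log_anchor, q), lo, hi)
    int_test = simpson(lambda q: lagrange_eval(psnr_test, log_test, q), lo, hi)

    avg_log_diff = (int_test - int_anchor) / (hi - lo)
    return (math.exp(avg_log_diff) - 1.0) * 100.0


# Hypothetical numbers for illustration only (kbps and dB); not from the paper.
anchor_rates = [1000.0, 2000.0, 4000.0, 8000.0]
anchor_psnr = [32.0, 35.0, 38.0, 41.0]
test_rates = [0.9 * r for r in anchor_rates]  # same quality at 10% fewer bits
print(round(bd_rate(anchor_rates, anchor_psnr, test_rates, anchor_psnr), 6))
```

In this made-up case the test curve reaches every PSNR value with 10% fewer bits, so the sketch reports a BD-rate of about -10%. A bitrate saving of this kind is often quoted as a positive gain, which may be the convention behind the percentages in the abstract.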
