The Interpretable Fast Multi-Scale Deep Decoder for the Standard HEVC Bitstreams

Wenhui Xiao; Huiguo He; Tingting Wang; Hongyang Chao

Restoring decoded videos from existing bitstreams with deep neural networks, in order to improve compression efficiency at the decoder end, is an active research topic. Existing research has verified that exploiting redundancy at the decoder end that the encoder underuses can increase compression efficiency. However, most existing work neglects the abundant multi-scale information among video frames, which is a typical form of such redundancy. How to build an effective, interpretable and fast deep neural network that uses this multi-scale similarity at the decoder end to further enhance compression efficiency remains an interesting yet challenging question. To this end, this paper exploits the underused inter-frame multi-scale information and proposes the Fast Multi-Scale Deep Decoder (Fast MSDD) for the state-of-the-art video coding standard HEVC. The advantages of Fast MSDD are three-fold. First, it achieves higher coding efficiency without modifying any encoding algorithm. Second, it is interpretable, because it is built on the framework of using underused redundancy. Third, it keeps inference fast while fully using the multi-scale similarity among video frames. Extensive experimental results verify Fast MSDD's effectiveness, interpretability, and computational efficiency. On average, Fast MSDD obtains BD gains of 14.3%, 10.8%, 8.5% and 7.6% for the All Intra (AI), Low-delay P (LP), Low-delay B (LB) and Random Access (RA) configurations, respectively. Compared with our previous work MSDD, Fast MSDD achieves increases of 59.3%, 49.1%, 61.0% and 29.3%, respectively. Meanwhile, BD gains of 16.9%, 11.2%, 9.2% and 8.3% are observed on videos with scale changes, which validates the interpretability of the proposed method. Furthermore, Fast MSDD reduces running time by up to 56.3% compared with MSDD.
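For readers unfamiliar with the metric, the "BD gains" above are presumably Bjøntegaard delta figures. The abstract does not say whether BD-rate or BD-PSNR is meant, so the following is only a minimal sketch of one common form, BD-rate, and not the authors' evaluation code. It assumes exactly four rate/PSNR points per curve, so that the usual cubic fit of log-bitrate against PSNR is an exact interpolation. The function names and the sample numbers are hypothetical, chosen purely for illustration.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate at x the polynomial that interpolates the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def simpson(f, a, b):
    """Simpson's rule on [a, b]; exact for polynomials up to degree 3."""
    return (b - a) / 6.0 * (f(a) + 4.0 * f((a + b) / 2.0) + f(b))


def bd_rate(rates_ref, psnr_ref, rates_test, psnr_test):
    """Average bitrate change (%) of the test curve relative to the reference
    at equal PSNR. A negative value means the test curve needs fewer bits."""
    if not (len(rates_ref) == len(psnr_ref) == len(rates_test) == len(psnr_test) == 4):
        raise ValueError("this sketch assumes four rate/PSNR points per curve")
    log_ref = [math.log(r) for r in rates_ref]
    log_test = [math.log(r) for r in rates_test]
    # Integrate only over the PSNR range covered by both curves.
    lo = max(min(psnr_ref), min(psnr_test))
    hi = min(max(psnr_ref), max(psnr_test))
    if hi <= lo:
        raise ValueError("the two curves have no overlapping PSNR range")
    area_ref = simpson(lambda q: lagrange_eval(psnr_ref, log_ref, q), lo, hi)
    area_test = simpson(lambda q: lagrange_eval(psnr_test, log_test, q), lo, hi)
    avg_log_diff = (area_test - area_ref) / (hi - lo)
    return (math.exp(avg_log_diff) - 1.0) * 100.0


# Made-up sample data (kbps, dB); not taken from the paper.
anchor_rates = [1000.0, 2000.0, 4000.0, 8000.0]
anchor_psnr = [32.0, 35.0, 38.0, 41.0]
# A hypothetical decoder that reaches the same PSNR with 10% fewer bits.
test_rates = [0.9 * r for r in anchor_rates]
saving = bd_rate(anchor_rates, anchor_psnr, test_rates, anchor_psnr)
print(f"BD-rate: {saving:.2f}%")
```

Under this sign convention a bitrate saving appears as a negative BD-rate, so a reported "gain" of a given percentage would correspond to a negative value of that magnitude if the figures are indeed BD-rate savings.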
