Fast Depth and Inter Mode Prediction for Quality Scalable High Efficiency Video Coding

You are here

IEEE Transactions on Multimedia

Top Reasons to Join SPS Today!

1. IEEE Signal Processing Magazine
2. Signal Processing Digital Library*
3. Inside Signal Processing Newsletter
4. SPS Resource Center
5. Career advancement & recognition
6. Discounts on conferences and publications
7. Professional networking
8. Communities for students, young professionals, and women
9. Volunteer opportunities
10. Coming soon! PDH/CEU credits
Click here to learn more.

Fast Depth and Inter Mode Prediction for Quality Scalable High Efficiency Video Coding

By: 
Dayong Wang; Yu Sun; Ce Zhu; Weisheng Li; Frederic Dufaux

The scalable high efficiency video coding (SHVC) is an extension of high efficiency video coding (HEVC). It introduces multiple layers and inter-layer prediction, thus significantly increases the coding complexity on top of the already complicated HEVC encoder. In inter prediction for quality SHVC, in order to determine the best possible mode at each depth level, a coding tree unit can be recursively split into four depth levels, including merge mode, inter2N×2N, inter2N×N, interN×2N, interN×N, inter2N×nU, inter2N×nD, internL×2N and internRx×2N, intra modes and inter-layer reference (ILR) mode. This can obtain the highest coding efficiency, but also result in very high coding complexity. Therefore, it is crucial to improve coding speed while maintaining coding efficiency. In this research, we have proposed a new depth level and inter mode prediction algorithm for quality SHVC. First, the depth level candidates are predicted based on inter-layer correlation, spatial correlation and its correlation degree. Second, for a given depth candidate, we divide mode prediction into square and non-square mode predictions respectively. Third, in the square mode prediction, ILR and merge modes are predicted according to depth correlation, and early terminated whether residual distribution follows a Gaussian distribution. Moreover, ILR mode, merge mode and inter2N×2N are early terminated based on significant differences in Rate Distortion (RD) costs. Fourth, if the early termination condition cannot be satisfied, non-square modes are further predicted based on significant differences in expected values of residual coefficients. Finally, inter-layer and spatial correlations are combined with residual distribution to examine whether to early terminate depth selection. Experimental results have demonstrated that, on average, the proposed algorithm can achieve a time saving of 71.14%, with a bit rate increase of 1.27%.

SPS on Twitter

  • DEADLINE EXTENDED: The 2023 IEEE International Workshop on Machine Learning for Signal Processing is now accepting… https://t.co/NLH2u19a3y
  • ONE MONTH OUT! We are celebrating the inaugural SPS Day on 2 June, honoring the date the Society was established in… https://t.co/V6Z3wKGK1O
  • The new SPS Scholarship Program welcomes applications from students interested in pursuing signal processing educat… https://t.co/0aYPMDSWDj
  • CALL FOR PAPERS: The IEEE Journal of Selected Topics in Signal Processing is now seeking submissions for a Special… https://t.co/NPCGrSjQbh
  • Test your knowledge of signal processing history with our April trivia! Our 75th anniversary celebration continues:… https://t.co/4xal7voFER

IEEE SPS Educational Resources

IEEE SPS Resource Center

IEEE SPS YouTube Channel