SPS Webinar: Rapid, Accurate, and Explainable Video Quality Assessment for User-Generated Content
Date: 6 September 2024
Time: 10:00 AM ET (New York Time)
Presenter(s): Dr. Zhengzhong Tu
Based on the IEEE Xplore® article:
"RAPIQUE: Rapid and Accurate Video Quality Prediction of User Generated Content", published in the IEEE Open Journal of Signal Processing, June 2021.
Download article: The original article is open access and publicly available for download.
Abstract
In the era of social media, the explosion of user-generated content has posed significant challenges for video quality assessment. Traditional models struggle with the diverse and complex nature of this content, which often contains authentic, sometimes commingled, spatial and temporal distortions. In this presentation, our presenter will introduce his Rapid and Accurate Video Quality Evaluator, a novel model designed to efficiently predict the quality of user-generated videos. The model leverages the strengths of both natural scene statistics and deep learning features to capture spatial and temporal distortions effectively. It offers a significant speed advantage, running up to 20 times faster than state-of-the-art methods while maintaining high accuracy. Experimental results on large-scale user-generated video databases demonstrate the model's superior performance and computational efficiency, making it highly promising for real-time applications. Additionally, our presenter will briefly discuss his most recent effort, the Comprehensive Video Quality Evaluator, which offers a multi-faceted approach to video quality assessment from technical, aesthetic, and semantic perspectives, enhancing the explainability and robustness of quality predictions. Finally, he will conclude with key takeaways and envision future opportunities in this field.
Biography
Zhengzhong Tu (M’18) received the B.S. and M.S. degrees in electrical engineering from Fudan University, Shanghai, China, in 2016 and 2018, respectively. He received the Ph.D. degree in electrical and computer engineering from The University of Texas at Austin, Austin, TX, USA, in 2022. His research interests include generative AI, multimodal AI, and their applications in autonomous driving, robotics, and healthcare.
He is currently an Assistant Professor of computer science and engineering at Texas A&M University, College Station, Texas, USA. He worked as a Research Engineer at Google Research from 2022 to 2024, focusing on image enhancement and computational photography. Prior to this, he interned across various product areas within Alphabet, such as Google Research, Pixel Camera, and the YouTube Team.
Dr. Tu has authored over twenty papers in top-tier computer vision venues, including CVPR, ECCV, ICCV, WACV, CoRL, IEEE Transactions on Image Processing, IEEE Open Journal of Signal Processing, and IEEE Signal Processing Letters. His awards and honors include a CVPR 2022 Best Paper nomination, the First Place Award at the AI4Streaming Workshop at CVPR 2024, and features in Google Research's annual blog and Google I/O media coverage.