SPS Webinar: Quality Assessment for Omnidirectional Video/Images: Spatio-Temporal Distortion Modeling and a Deep Learning Solution

Date: 12 April 2023
Time: 9:30 AM ET (New York Time)
Presenter(s): Dr. Pan Gao, Dr. Aljosa Smolic
Download article: The article is freely available for 48 hours, starting on the day of the webinar.


Abstract

Omnidirectional video, also known as 360-degree video, has become increasingly popular due to its ability to provide immersive and interactive visual experiences. However, its ultra-high resolution and spherical observation space, which arise from the large spherical viewing range, make omnidirectional video distinctly different from traditional 2D video. To date, video quality assessment (VQA) for omnidirectional video remains an open problem.

This talk contains two parts. The first part introduces a spatio-temporal modeling approach. In this approach, we first construct a spatio-temporal quality assessment unit to evaluate the average distortion in the temporal dimension at the eye-fixation level. We then give a detailed solution for integrating three existing spatial VQA metrics into this approach. In addition, cross-format omnidirectional video distortion measurement is also discussed. Based on this modeling approach, a full-reference objective quality assessment metric for omnidirectional video, named OV-PSNR, is derived. Experimental results show that OV-PSNR greatly improves the prediction performance of existing VQA metrics for omnidirectional video.
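The full OV-PSNR definition is given in the article; as a rough, illustrative sketch of the general idea of a spherically weighted PSNR averaged over the temporal dimension (the cosine-latitude weighting and all function names below are assumptions for illustration, not the authors' implementation):

```python
import numpy as np

def sphere_weights(height, width):
    # Latitude-dependent weights for an equirectangular frame:
    # pixels near the poles cover less solid angle on the sphere,
    # so they are down-weighted by cos(latitude). (Illustrative choice.)
    rows = (np.arange(height) + 0.5) / height  # 0..1, top to bottom
    lat = (rows - 0.5) * np.pi                 # -pi/2 .. pi/2
    return np.tile(np.cos(lat)[:, None], (1, width))

def weighted_psnr(ref, dist, max_val=255.0):
    # Spherically weighted MSE over one frame, then PSNR in dB.
    w = sphere_weights(*ref.shape)
    diff = ref.astype(float) - dist.astype(float)
    mse = np.sum(w * diff ** 2) / np.sum(w)
    return 10.0 * np.log10(max_val ** 2 / mse)

def temporal_psnr(ref_frames, dist_frames):
    # Average per-frame quality over time -- a crude stand-in for the
    # paper's spatio-temporal quality assessment unit.
    return float(np.mean([weighted_psnr(r, d)
                          for r, d in zip(ref_frames, dist_frames)]))
```

A simple average over frames, as above, ignores eye-fixation behavior; the approach described in the talk evaluates temporal distortion at the eye-fixation level rather than uniformly.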

The second part of this talk introduces our attempt at using deep learning for blind omnidirectional image quality assessment. We will first discuss the challenges currently faced in this field, and then provide the details of our proposed model, BOIQA. The model is trained in two stages: the first pre-trains the model to produce an objective error map using a reference image, and the second trains the model to predict a quality score from the inferred objective error map, where a spatial weight map serves as a prior for predicting human sensitivity. Finally, we show the performance of BOIQA on the CVIQ and OIQA datasets.
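As a minimal sketch of the two-stage idea (not the trained BOIQA network -- the error definition, pooling, and score mapping below are hypothetical placeholders):

```python
import numpy as np

def objective_error_map(ref, dist):
    # Stage 1 (illustrative): with a reference available at training
    # time, the target error map can be as simple as a normalized
    # absolute difference; BOIQA learns to infer such a map blindly,
    # without the reference, at test time.
    return np.abs(ref.astype(float) - dist.astype(float)) / 255.0

def pooled_score(error_map, weight_map):
    # Stage 2 (illustrative): weight the (inferred) error map by a
    # spatial prior standing in for human sensitivity, pool it, and
    # map the pooled error to a quality score in (0, 1].
    pooled = np.sum(weight_map * error_map) / np.sum(weight_map)
    return 1.0 / (1.0 + pooled)  # higher error -> lower score
```

In the actual model both the error-map inference and the score regression are learned; the sketch only shows how a spatial weight map can act as a prior when pooling an error map into a single score.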

Presenter's Biography

Pan Gao received the Ph.D. degree in electronic engineering from the University of Southern Queensland (USQ), Toowoomba, Australia, in 2017.

Since 2016, he has been with the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China, where he is currently an Associate Professor. From 2018 to 2019, he was a Postdoctoral Research Fellow at the School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland, working on the V-SENSE project.

Dr. Gao has published more than 40 papers in prestigious journals, such as IEEE T-MM, T-IP, and T-CSVT, and in top-ranked conferences, such as IJCAI, ICCV, ICASSP, and ICIP. He is a co-organizer of a workshop on Immersive Media Compression at ICME 2023. His research interests include video compression, video/image quality assessment, computer vision, and deep learning.


Aljosa Smolic received his Ph.D. from Aachen University of Technology (RWTH) in 2001 and was a lecturer at the Technical University of Berlin (2003-2009) and ETH Zurich (2009-2016).

He has been a lecturer in AR/VR at Lucerne University of Applied Sciences and Arts since 2022. Before that, he was SFI Research Professor of Creative Technologies at Trinity College Dublin (TCD, 2016-2021), where he headed the V-SENSE research group on visual computing, combining computer vision, computer graphics, and media technology to extend the dimensions of visual sensation. His research interests include immersive technologies such as AR, VR, volumetric video, 360/omnidirectional video, light fields, and VFX/animation, with a special focus on deep learning in visual computing. Before joining TCD, Dr. Smolic was a Senior Research Scientist and Head of the Advanced Video Technology group at Disney Research Zurich (2009-2016), and before that was with the Fraunhofer Heinrich-Hertz-Institut (HHI), Berlin, where he also headed a research group as Scientific Project Manager (2001-2009). At Disney Research he led over 50 R&D projects in the area of visual computing that resulted in numerous publications and patents, as well as technology transfers to a range of Disney business units.

Dr. Smolic served as Associate Editor of the IEEE Transactions on Image Processing and the Signal Processing: Image Communication journal. He was Guest Editor for the Proceedings of the IEEE, IEEE Transactions on CSVT, IEEE Signal Processing Magazine, and other scientific journals. Dr. Smolic is also co-founder of the start-up company Volograms, which commercializes volumetric video content creation. He received the IEEE ICME Star Innovator Award 2020 for his contributions to volumetric video content creation, TCD's Campus Company Founders Award 2020, as well as several best paper awards.