In recent years the ubiquity of mobile computing platforms such as smartphones and tablets has rapidly increased. These devices support a range of untethered interactions that would have been unimaginable a decade earlier. With this ability to interact with services and individuals comes the need to accurately authenticate the identity of the person requesting a transaction, many of which carry financial or legally binding instructions.
Zero-shot learning (ZSL) for visual recognition aims to accurately recognize objects of unseen classes by mapping visual features to an embedding space spanned by class semantic information. However, the semantic gap between visual features and their underlying semantics remains a major obstacle in ZSL. Conventional ZSL methods that construct this mapping typically rely on the original visual features, which are independent of the ZSL task and thus degrade prediction performance.
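To make the visual-to-semantic mapping concrete, the sketch below shows one common (and deliberately simple) instantiation, not the method described above: a ridge-regression projection from visual features to class attribute vectors, with unseen classes predicted by nearest class embedding. All dimensions, data, and names are illustrative assumptions.

import numpy as np

# Minimal ZSL sketch (illustrative only): learn a linear map W from visual
# features X to class attribute vectors S via ridge regression, then classify
# an unseen-class image by the nearest unseen-class attribute vector.

rng = np.random.default_rng(0)

d_vis, d_sem = 512, 85                               # assumed dims: visual features, attributes
n_train, n_seen = 500, 10

X_train = rng.normal(size=(n_train, d_vis))          # stand-in seen-class features
S_seen = rng.normal(size=(n_seen, d_sem))            # stand-in seen-class attributes
y_train = rng.integers(0, n_seen, size=n_train)
T_train = S_seen[y_train]                            # per-sample semantic targets

lam = 1.0                                            # ridge regularizer
W = np.linalg.solve(X_train.T @ X_train + lam * np.eye(d_vis),
                    X_train.T @ T_train)             # (d_vis, d_sem) projection

def predict_unseen(x, S_unseen):
    """Project a visual feature into the semantic space and pick the closest unseen class."""
    s_hat = x @ W
    dists = np.linalg.norm(S_unseen - s_hat, axis=1)
    return int(np.argmin(dists))

S_unseen = rng.normal(size=(5, d_sem))               # stand-in unseen-class attributes
print(predict_unseen(rng.normal(size=d_vis), S_unseen))

Because the projection here is fit only to the fixed, task-independent visual features, it illustrates exactly the limitation the abstract points to: nothing in this pipeline adapts the features to the ZSL task itself.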
Image annotation aims to annotate a given image with a variable number of class labels corresponding to diverse visual concepts. In this paper, we address two main issues in large-scale image annotation: 1) how to learn a rich feature representation suitable for predicting a diverse set of visual concepts, ranging from objects and scenes to abstract concepts, and 2) how to annotate an image with the optimal number of class labels.
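The second issue, choosing how many labels to assign per image, can be illustrated with a simple baseline that is not the paper's approach: score every class independently and keep those above a threshold, so the number of returned labels varies per image. All names, sizes, and weights below are assumed placeholders.

import numpy as np

# Illustrative sketch of variable-length image annotation (not the paper's
# method): per-class sigmoid scores with a fixed threshold, so different
# images naturally receive different numbers of labels.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def annotate(feature, class_weights, class_names, threshold=0.5):
    """Return the class names whose predicted probability exceeds the threshold."""
    probs = sigmoid(class_weights @ feature)         # one independent score per class
    return [name for name, p in zip(class_names, probs) if p > threshold]

rng = np.random.default_rng(0)
d_feat, n_classes = 256, 6                           # assumed toy sizes
class_names = ["person", "dog", "beach", "sunset", "indoor", "sports"]

W = rng.normal(size=(n_classes, d_feat))             # stand-in classifier weights
x = rng.normal(size=d_feat)                          # stand-in image feature
print(annotate(x, W, class_names))

A fixed global threshold is the crudest way to pick the label count; the point of the abstract is that the optimal number of labels should instead be determined per image.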