Automatic caption generation is the task of producing a natural-language utterance (usually a sentence) that describes the visual content of an image. Practical applications include using the generated descriptions for image indexing or retrieval, and helping people with visual impairments by transforming visual signals into information that can be communicated via text-to-speech technology. The CVPR 2019 Conceptual Captions Challenge is based on two separate test sets:
T1) a blind test set that participants cannot access directly.
T2) an open test set that participants can access and download.