Image, Video, and Multidimensional Signal Processing


Drones, or UAVs more generally, equipped with cameras have been rapidly deployed in a wide range of applications, including agriculture, aerial photography, fast delivery, and surveillance. Consequently, automatic understanding of the visual data collected from these platforms is in high demand, which brings computer vision and drones ever closer together. We are excited to present VisDrone, a large-scale benchmark with carefully annotated ground truth for various important computer vision tasks, to make vision meet drones.

Automatic caption generation is the task of producing a natural-language utterance (usually a sentence) that describes the visual content of an image. Practical applications of automatic caption generation include leveraging descriptions for image indexing or retrieval, and helping those with visual impairments by transforming visual signals into information that can be communicated via text-to-speech technology. The CVPR 2019 Conceptual Captions Challenge is based on two separate test sets:

T1) a blind test set that participants do not have direct access to. 

We will organize the first Learning from Imperfect Data (LID) challenge on object semantic segmentation and scene parsing, which includes two competition tracks:

Track 1: Object semantic segmentation with image-level supervision

Track 2: Scene parsing with point-based supervision

Habitat Challenge is an autonomous navigation challenge that aims to benchmark and accelerate progress in embodied AI. In its first iteration, Habitat Challenge 2019 is based on the PointGoal task defined in Anderson et al. We will have two tracks for the PointGoal task:

  1. RGB track: the agent's input modality is an RGB image.
  2. RGBD track: the agent's input modalities are an RGB image and depth.

The main goal of the CARLA Autonomous Driving Challenge is to achieve driving proficiency in realistic traffic situations.

Immense opportunity exists to make transportation systems smarter, based on sensor data from traffic, signaling systems, infrastructure, and transit. Unfortunately, progress has been limited for several reasons, among them poor data quality, missing data labels, and the lack of high-quality models that can convert the data into actionable insights. There is also a need for platforms that can handle analysis from the edge to the cloud, which will accelerate the development and deployment of these models.

We present a new large-scale dataset focusing on semantic understanding of people. The dataset is an order of magnitude larger and more challenging than similar previous attempts: it contains 50,000 images with elaborate pixel-wise annotations covering 19 semantic human part labels, along with 2D human poses with 16 key points. The images, collected from real-world scenarios, contain humans appearing in challenging poses and views, with heavy occlusion, varied appearances, and low resolution.

Object detection is of significant value to the Computer Vision and Pattern Recognition communities as it is one of the fundamental vision problems. In this workshop, we will introduce two new benchmarks for the object detection task: Objects365 and CrowdHuman, both of which are designed and collected in the wild. The Objects365 benchmark targets large-scale detection with 365 object categories.

The domain of image compression has traditionally used approaches discussed in forums such as ICASSP, ICIP and other very specialized venues like PCS, DCC, and ITU/MPEG expert groups. This workshop and challenge will be the first computer-vision event to explicitly focus on these fields. Many techniques discussed at computer-vision meetings have relevance for lossy compression.


In-depth analysis of the state-of-the-art in video object segmentation.


