IEEE Transactions on Multimedia

This paper addresses the design of a sensing matrix, together with a sparse recovery algorithm, that exploits probability-based prior information in compressed sensing systems. Given the probability with which each atom of the dictionary is used, a diagonal weighting matrix is obtained, and the sensing matrix is then designed by minimizing a weighted function so that the Gram matrix of the equivalent dictionary is as close as possible to the Gram matrix of the dictionary.
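The abstract does not give the exact cost function, so the following is only a minimal sketch under stated assumptions. It assumes the weighted function is the squared Frobenius norm of W(Ψᵀ ΦᵀΦ Ψ − Ψᵀ Ψ)W, with W a diagonal matrix whose entries are the atom probabilities, and it minimizes that function with plain gradient descent and a backtracking step size. The names `weighted_gram_error` and `design_sensing_matrix`, the choice of weights, and the optimizer are all hypothetical and are not taken from the paper.

```python
import numpy as np


def weighted_gram_error(Phi, Psi, w):
    """Assumed cost: || W (Psi^T Phi^T Phi Psi - Psi^T Psi) W ||_F^2, W = diag(w)."""
    A = Psi * w                  # scales column j of Psi by w[j], i.e. Psi @ diag(w)
    B = Phi @ A                  # weighted equivalent dictionary
    E = B.T @ B - A.T @ A        # weighted Gram mismatch
    return float(np.sum(E ** 2))


def design_sensing_matrix(Psi, p, M, iters=200, seed=0):
    """Gradient descent on the assumed cost, starting from a random Gaussian matrix."""
    rng = np.random.default_rng(seed)
    N = Psi.shape[0]
    w = np.asarray(p, dtype=float)   # assumption: weights equal the atom probabilities
    A = Psi * w
    T = A.T @ A                      # weighted Gram of the dictionary (the target)
    Phi = rng.standard_normal((M, N)) / np.sqrt(M)
    step = 1.0
    for _ in range(iters):
        B = Phi @ A
        E = B.T @ B - T
        cost = float(np.sum(E ** 2))
        grad = 4.0 * B @ E @ A.T     # gradient of ||B^T B - T||_F^2 with respect to Phi
        # Backtracking: halve the step until the cost strictly decreases.
        while step > 1e-12:
            candidate = Phi - step * grad
            if weighted_gram_error(candidate, Psi, w) < cost:
                Phi = candidate
                step *= 2.0          # try a larger step on the next iteration
                break
            step /= 2.0
    return Phi
```

Atoms with a higher probability receive a larger weight, so the corresponding entries of the Gram mismatch are penalized more heavily than those of rarely used atoms.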

Research in light field (LF) processing has increased substantially over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids.

Acoustic event detection processes acoustic signals to determine the sound type and to estimate the boundaries of audio events. Approaches based on multi-label classification are commonly used to detect frame-wise event types, with a median filter applied to determine the active acoustic events. However, the multi-label classifiers are trained only on the acoustic event types, ignoring the frame position within the audio events.
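The conventional post-processing described above can be sketched as follows. This is an illustration of the common baseline (thresholding frame-wise class probabilities, median filtering, then reading off event boundaries), not the method proposed in the paper. The function names, the 0.5 threshold, and the filter length are hypothetical choices.

```python
import numpy as np


def median_smooth(binary, k):
    """Median-filter a 0/1 frame sequence with an odd window length k."""
    pad = k // 2
    padded = np.pad(binary, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, k)
    return np.median(windows, axis=1).astype(int)


def detect_events(probs, threshold=0.5, k=5):
    """Return (class_index, onset_frame, offset_frame) tuples.

    probs has shape (frames, classes); offset_frame is exclusive.
    """
    events = []
    for c in range(probs.shape[1]):
        active = median_smooth((probs[:, c] >= threshold).astype(int), k)
        # A +1 step marks an onset, a -1 step marks the frame after the last active one.
        edges = np.diff(np.concatenate(([0], active, [0])))
        onsets = np.flatnonzero(edges == 1)
        offsets = np.flatnonzero(edges == -1)
        events += [(c, int(a), int(b)) for a, b in zip(onsets, offsets)]
    return events
```

The median filter fills short gaps inside an event and discards isolated active frames, which is why the boundaries it yields depend only on the per-frame labels and not on where a frame sits within an event.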

Outdoor images are subject to degradation in contrast and color because atmospheric particles scatter the light reaching the camera. Existing haze models that employ model-based dehazing methods cannot avoid dehazing artifacts. These artifacts include color distortion and overenhancement around object boundaries, caused by incorrect transmission estimation from a depth error in the skyline and by wrong haze information, especially in bright objects.

Vector graphics with gradient meshes are valued for their compactness and scalability; however, they tend to fall short when it comes to real-time editing due to a lack of real-time rasterization and an efficient editing tool for image details. In this paper, we encode global manipulation geometries and local image details within a hybrid vector structure, using parametric patches and detailed features for localized and parallelized thin-plate spline interpolation in order to achieve good compressibility, interactive expressibility, and editability.

The analysis of sound information is helpful for audio surveillance, multimedia information retrieval, audio tagging, and forensic applications. Environmental audio scene recognition (EASR) and sound event recognition (SER) for audio surveillance are challenging tasks due to the presence of multiple sound sources, background noises, and the existence of overlapping or polyphonic contexts.

Smoke detection plays an important role in industrial safety warning systems and fire prevention. Due to the complicated changes in the shape, texture, and color of smoke, identifying smoke in a given image remains a substantial challenge, and this has attracted considerable research attention recently.

We propose an approach for digitally altering people's outfits in images. Given images of a person and a desired clothing style, our method generates a new clothing item image. The new item displays the color and pattern of the desired style while geometrically mimicking the person's original item. Through superimposition, the altered image is made to look as if the person is wearing the new item.

In this paper, a Hessian-matrix-based multi-focus image fusion method is proposed. First, the integral map is introduced to quickly compute the Hessian matrix of the source images at different scales, and the multi-scale Hessian matrix of each source image is obtained. Second, the multi-scale Hessian matrix is used to decompose each source image into two kinds of regions: the feature and background regions.
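The abstract does not say how the integral map is used, so the sketch below assumes a SURF-style scheme in which second derivatives are approximated by box filters whose sums are read from an integral image in constant time. The helper names and the three-lobe layout for the horizontal second derivative are hypothetical, and changing the lobe width `w` is one assumed way of changing the scale.

```python
import numpy as np


def integral_image(img):
    """Zero-padded integral image: ii[r, c] is the sum of img[:r, :c]."""
    h, w = img.shape
    ii = np.zeros((h + 1, w + 1), dtype=np.float64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii


def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] using four lookups, independent of box size."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]


def box_dxx(ii, r, c, w, h):
    """Assumed box-filter estimate of the horizontal second derivative at (r, c).

    Three adjacent lobes of width w (odd) and height 2*h + 1 are combined as
    left - 2 * middle + right. The caller must keep all lobes inside the image.
    """
    r0, r1 = r - h, r + h + 1
    c0, c1 = c - w // 2, c + w // 2 + 1
    left = box_sum(ii, r0, c0 - w, r1, c1 - w)
    middle = box_sum(ii, r0, c0, r1, c1)
    right = box_sum(ii, r0, c0 + w, r1, c1 + w)
    return left - 2.0 * middle + right
```

Because each box sum costs four lookups regardless of its size, the response can be evaluated at many scales without resampling the image.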

To improve the parallel processing capability of video coding, the emerging high efficiency video coding (HEVC) standard introduces two parallel techniques, i.e., Wavefront Parallel Processing (WPP) and Tiles, to make it much more parallel-friendly than its predecessors. However, these two techniques are designed to explore coarse-grained parallelism in HEVC encoding on multicore Central Processing Unit (CPU) platforms.

