Signal Processing in a Multimedia World: How the Science Behind our Digital Life Powers our Apps and Gadgets

Monday, 29 August, 2016
By: Maximo Cobos

Maximo Cobos, professor, University of Valencia

Our modern life takes place in a world that is full of sounds, images and videos. You probably wake up to a relaxing tone on your phone, listen to music while you go running, and send a voice message to wish a close relative a happy birthday. Then you receive a notification to check the cool photos your friend has just sent you from yesterday's party and, while you’re checking the news, a trailer for a wonderful movie appears on your screen. It’s still early in the morning and you’ve already experienced the ubiquity of our multimedia world. The day has barely started, and you'll experience more music, more photos, more movies and more documents, all transferred through a data network to one or several of your electronic devices.

Just as we take for granted that there is a heart continuously beating inside us that enables us to live our lives as we do, signal processing is the science that lets us enjoy our multimedia world transparently and naturally. We could even say signal processing is the science that makes our senses digital: a science that lets us capture the information we hear and see so that it can later be stored, sent and reproduced as many times as we like through our laptops, smartphones or tablets. However, when it comes to sound, images and video, signal processing isn’t only about storing, sending or reproducing content. It’s also part of the content creation stage itself, helping musicians, photographers and videographers develop their creativity with tools that use signal processing to manipulate multimedia information in a meaningful way.

Data Compression

One of the challenges that signal processing faces when dealing with multimedia applications is that audiovisual signals carry a lot of information, especially when they are digitally acquired at very high quality. Analog-to-digital conversion is the process that produces a digital representation of the analog acoustic and light signals we perceive through our senses, translating sounds and images into numbers by means of an appropriate sensor and converter. The more numbers we take (more samples per second, more pixels per image, more bits per value), the higher the quality of the digitized signal, but that quality comes at the expense of producing a huge amount of data. Storing, manipulating and sending so much data is rarely a good idea, since the memory, processing power and communication bandwidth of electronic devices are always limited. Signal processing for data compression aims to reduce the size of the data so that applications can manipulate information without wasting resources and users can experience multimedia content without annoying delays. Data compression allows us to store hundreds of songs, photos and videos on our devices. It also speeds up the loading of the media content present in most webpages and lets us stream music and videos from content providers in real time. MP3 audio players, digital television and video streaming on YouTube exist thanks to the advances of signal processing in the data compression field.
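To get a feeling for how quickly the numbers pile up, here is a small back-of-the-envelope sketch. The figures are illustrative assumptions rather than values from this article: CD-quality stereo audio (44,100 samples per second, 16 bits per sample) and uncompressed 1080p video at 24 bits per pixel and 30 frames per second. The function names are made up for the example.

```python
def raw_audio_bits_per_second(sample_rate_hz, bits_per_sample, channels):
    """Data rate of uncompressed audio: one number per sample, per channel."""
    return sample_rate_hz * bits_per_sample * channels


def raw_video_bits_per_second(width, height, bits_per_pixel, frames_per_second):
    """Data rate of uncompressed video: one full image per frame."""
    return width * height * bits_per_pixel * frames_per_second


# Assumed example formats (not taken from the article).
cd_audio_bps = raw_audio_bits_per_second(44_100, 16, 2)
hd_video_bps = raw_video_bits_per_second(1920, 1080, 24, 30)

# Bytes needed to store one minute of the uncompressed audio.
cd_audio_bytes_per_minute = cd_audio_bps * 60 // 8

print(f"CD-quality audio: {cd_audio_bps:,} bits per second")
print(f"One minute of it: {cd_audio_bytes_per_minute:,} bytes")
print(f"Raw 1080p video:  {hd_video_bps:,} bits per second")
```

Under these assumptions, a single minute of uncompressed stereo audio already takes more than 10 million bytes, and raw high-definition video needs well over a billion bits every second, which is why compression is not optional.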

Data compression can be achieved either without any loss of information (lossless compression) or by permitting some information loss (lossy compression). In lossless formats, the information recovered from the compressed data is exactly the same as the raw original data. However, the size reduction that can be achieved with lossless formats is usually quite limited. In contrast, lossy compression techniques reduce the size of audio, photos and videos by discarding information in such a way that humans cannot easily perceive the difference between the original and the compressed data. The good thing is that lossy formats achieve a much higher reduction and the data can be easily adapted to a range of applications and devices. But how does signal processing distinguish between perceptually relevant and irrelevant data?
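The difference between the two families can be sketched in a few lines. This is only a toy illustration under stated assumptions: the "signal" is a made-up slow ramp with small random fluctuations, the lossless step is the general-purpose zlib compressor from Python's standard library, and the lossy step simply throws away the three least significant bits of every sample. Real audio and image codecs decide what to discard in far more refined ways.

```python
import random
import zlib

# Hypothetical 8-bit signal: a slowly rising ramp plus small random fluctuations.
rng = random.Random(0)
samples = bytes(min(255, i // 16 + rng.randrange(8)) for i in range(4000))

# Lossless: the decoder recovers exactly the original bytes.
packed = zlib.compress(samples, 9)
restored = zlib.decompress(packed)
is_exact = restored == samples


def quantize(data, dropped_bits):
    """Lossy step: clear the lowest bits of each sample (information is lost)."""
    mask = 0xFF & ~((1 << dropped_bits) - 1)
    return bytes(value & mask for value in data)


# Lossy: discard fine detail first, then pack what is left.
coarse = quantize(samples, 3)
packed_lossy = zlib.compress(coarse, 9)
max_error = max(abs(a - b) for a, b in zip(samples, coarse))

print("original bytes:      ", len(samples))
print("lossless packed size:", len(packed), "exact copy:", is_exact)
print("lossy packed size:   ", len(packed_lossy), "largest sample error:", max_error)
```

The lossless version comes back bit for bit, but the fluctuations limit how small it can get. The lossy version is smaller because the discarded bits no longer need to be stored, at the price of a small error in every sample (at most 7 out of 255 here).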

Audio Compression

When the data comes from a sound signal such as music, signal processing achieves data compression by following a sophisticated process in which the sound data is analyzed in terms of the frequencies that make up the sound. The human hearing system is not equally sensitive to all audible frequencies: that’s why two sounds with the same power are not perceived as equally loud, especially if one of them has a very low or a very high pitch. Moreover, the interaction within our hearing system among the frequencies making up a given sound usually leads to masking effects, which make some of these frequencies inaudible in the presence of others. Perceptual audio coding takes advantage of this fact to reduce the size of audio data by discarding information corresponding to irrelevant frequencies that are being masked. Therefore, even though we’re losing information from our raw sound data, this process is performed in a smart way, making the compressed sound almost indistinguishable from the original. Signal processing provides the mathematical framework necessary to perform this process as efficiently as possible.
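The idea of analyzing a sound by its frequencies and dropping the ones we would not hear can be sketched as follows. This is a deliberately crude stand-in, not a real perceptual model: the test signal, the plain discrete Fourier transform and the single 30 dB threshold below the strongest component are all assumptions made for the example. Actual audio coders use psychoacoustic models that compute masking thresholds separately for different frequency regions.

```python
import cmath
import math


def dft(x):
    """Discrete Fourier transform: describes a signal by its frequency components."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform: rebuilds the (real) signal from its frequency components."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


N = 64
# Hypothetical sound: a loud tone plus a neighboring tone 100 times weaker.
signal = [math.sin(2 * math.pi * 4 * t / N) + 0.01 * math.sin(2 * math.pi * 5 * t / N)
          for t in range(N)]

spectrum = dft(signal)
peak = max(abs(c) for c in spectrum)

# Crude "audibility" rule (an assumption): drop anything 30 dB below the peak.
threshold = peak * 10 ** (-30 / 20)
kept = [c if abs(c) >= threshold else 0 for c in spectrum]
kept_bins = [k for k, c in enumerate(kept) if c != 0]

decoded = idft(kept)
max_error = max(abs(a - b) for a, b in zip(signal, decoded))

print("frequency components kept:", len(kept_bins), "of", N)
print("largest difference from the original:", round(max_error, 4))
```

Only the components belonging to the loud tone survive, so there is far less to store, and the decoded signal differs from the original only by the faint tone that was dropped.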

Digital audio streams in most current multimedia technologies and broadcasting formats make use of audio coding standards that are based on the above principles, such as the well-known Advanced Audio Coding (AAC), which is the default audio standard for YouTube, iPhone, or PlayStation.

Image and Video Compression

Image compression techniques also make use of the peculiarities of human perception to select the information that must be retained so that images can be compressed without losing too much quality. In this case, signal processing provides a framework for analyzing images in terms of the spatial frequencies that make up small areas of an image. Since the pixels within a small area tend to be quite similar, it’s possible to eliminate the fine spatial variations that the human visual system would hardly perceive. As a result, information corresponding to very fine spatial detail can be selectively discarded without noticeably affecting the quality of the compressed version of the image. Moreover, the human eye detects detail in the brightness component of an image more accurately than in the color information, a fact that is also used to reduce the amount of data in color images. Signal processing, again, provides the mathematical tools for carrying out this analysis, allowing our everyday applications to manage digital images comfortably and efficiently.
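A minimal sketch of this idea, assuming the approach used in JPEG-style coders: split the image into 8x8 blocks and describe each block with a discrete cosine transform (DCT), whose coefficients correspond to spatial frequencies. The block contents and the rule for which coefficients to keep are assumptions chosen for illustration. A real encoder quantizes the coefficients with a table instead of simply zeroing them by position.

```python
import math

N = 8  # block size used in this sketch


def basis(k, n):
    """Orthonormal DCT-II basis function of spatial frequency k at position n."""
    scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    return scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))


def dct2(block):
    """2-D DCT: coeffs[u][v] measures vertical frequency u and horizontal frequency v."""
    return [[sum(block[y][x] * basis(u, y) * basis(v, x)
                 for y in range(N) for x in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse 2-D DCT: rebuilds the pixel values from the coefficients."""
    return [[sum(coeffs[u][v] * basis(u, y) * basis(v, x)
                 for u in range(N) for v in range(N))
             for x in range(N)] for y in range(N)]


# Hypothetical smooth 8x8 block of brightness values (a gentle gradient).
block = [[100 + 4 * x + 2 * y for x in range(N)] for y in range(N)]
coeffs = dct2(block)

# Keep only the lowest spatial frequencies (assumed rule: u + v < 3).
kept = [[c if u + v < 3 else 0.0 for v, c in enumerate(row)]
        for u, row in enumerate(coeffs)]
kept_count = sum(1 for u in range(N) for v in range(N) if u + v < 3)

approx = idct2(kept)
max_error = max(abs(block[y][x] - approx[y][x])
                for y in range(N) for x in range(N))

print("coefficients kept:", kept_count, "of", N * N)
print("largest pixel error:", round(max_error, 2))
```

For a smooth block like this one, 6 of the 64 coefficients are enough to rebuild every pixel to within a few brightness levels out of 255. Blocks with sharp edges or fine texture need more coefficients, which is why heavily compressed photos show artifacts around detailed areas first.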

Lossy image compression is used in popular image formats such as JPEG, which is widely used to store photos in our digital cameras and to send images over the Internet.

You are probably now wondering how signal processing reduces the data rate of videos. Of course, signal processing also provides methods that allow applications to manipulate video information. A video is basically a set of images (or frames) shown rapidly one after another to create the impression of motion. The same principles discussed for image compression can be used to compress video information, but in this case signal processing also exploits the fact that the small areas making up one frame are very similar to those of its neighboring frames. Fortunately, the changes from one frame to the next can be efficiently predicted, so only the differences need to be stored. Combining this prediction with the static image compression techniques above, video coding standards are used all the time to store, send and receive video information in our electronic devices.
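The gain from predicting one frame from the previous one can be sketched with the simplest possible predictor: assume nothing moved and store only the difference. The frame size, the random texture and the small changed region are invented for the example, and zlib stands in for the entropy coding stage. Real video coders go further and search for where each block has moved (motion compensation) before coding the remaining difference.

```python
import random
import zlib

WIDTH, HEIGHT = 64, 48
rng = random.Random(0)

# Hypothetical first frame: a detailed texture that is hard to compress on its own.
frame1 = [[rng.randrange(256) for _ in range(WIDTH)] for _ in range(HEIGHT)]

# Next frame: identical except for a small 8x8 region that changed.
frame2 = [row[:] for row in frame1]
for y in range(10, 18):
    for x in range(20, 28):
        frame2[y][x] = (frame2[y][x] + 50) % 256


def flatten(frame):
    """Turn a 2-D frame into a flat sequence of bytes."""
    return bytes(value for row in frame for value in row)


previous, current = flatten(frame1), flatten(frame2)

# Encoder: predict the new frame as "same as the previous one", keep the difference.
residual = bytes((b - a) % 256 for a, b in zip(previous, current))

frame_size = len(zlib.compress(current, 9))
residual_size = len(zlib.compress(residual, 9))

# Decoder: previous frame plus the stored difference gives back the new frame.
rebuilt = bytes((a + d) % 256 for a, d in zip(previous, residual))

print("second frame coded on its own:      ", frame_size, "bytes")
print("second frame coded as a difference: ", residual_size, "bytes")
print("decoder rebuilds the frame exactly: ", rebuilt == current)
```

Because almost every value in the difference is zero, it compresses to a tiny fraction of the size of the frame itself, while the decoder can still rebuild the second frame from the first.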

Digital video coding standards such as MPEG-2 Part 2 or H.264 are used in high-definition digital broadcast television, Blu-ray discs, satellite TV and Internet streaming applications such as YouTube or Vimeo.

Conclusion

We are immersed in a multimedia world, surrounded by a combination of sounds, images and videos that are an important part of our modern digital life. Signal processing has been a fundamental enabling technology for making all of our favorite gadgets a reality, providing methods for creating, storing, transforming, sending and receiving multimedia information. Yet despite the advances in this field, our immersive multimedia world is only just beginning. Signal processing scientists are currently working on the future of multimedia, developing frameworks for new and better immersive environments involving virtual and augmented reality, first-person point-of-view multimedia systems and free-viewpoint television.
