Industry Trends: Machine Learning for Commercial Audio Production

News and Resources for Members of the IEEE Signal Processing Society

By: Alessio Medda

In recent years, the commercial audio world has seen a growing number of applications of machine learning in its products. This should not be surprising, as the audio business was an early adopter of data-driven tools that perform standard operations on audio tracks normally carried out by audio engineers. Today, companies like CEDAR Audio, iZotope, and Accusonus lead the way in applying modern machine learning (ML) and artificial intelligence (AI) to audio products.

Tasks traditionally performed by audio engineers are slowly being taken over by ML/AI tools whose algorithms combine statistical models and neural networks. In 2014, the DNS One from CEDAR Audio, a multichannel dialog noise suppressor, became the first product in the company’s lineup to explicitly use machine learning, by employing the LEARN algorithm. LEARN is designed to compute estimates of the background noise level and determine suitable noise attenuations at each frequency for optimum suppression. In 2016, CEDAR introduced an evolution of the LEARN algorithm, the new FNR algorithm, in its CEDAR Cambridge 10 product. According to the company, “[FNR] is an automated noise reduction system for speech recordings suffering from poor signal to noise ratios, and is capable of performance that would have seemed impossible just a few years ago.”
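
To make the idea of estimating the background noise and choosing an attenuation at each frequency more concrete, here is a minimal Python sketch. It is not CEDAR's LEARN or FNR algorithm, whose details are not described here; it swaps in a generic, textbook-style spectral subtraction gain. The function names, the use of a per-frequency median as the noise estimate, and the `over_subtraction` and `floor` parameters are all assumptions chosen for illustration.

```python
import numpy as np


def stft_magnitude(x, frame=512, hop=256):
    """Return (complex spectra, magnitudes) of x, one row per analysis frame."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i * hop:i * hop + frame] * window for i in range(n_frames)])
    spectra = np.fft.rfft(frames, axis=1)
    return spectra, np.abs(spectra)


def estimate_noise_floor(mag):
    """Per-frequency noise estimate: the median magnitude over time.

    Assumption (illustrative only): the wanted signal occupies any given
    frequency bin in fewer than half of the frames, so the median mostly
    reflects the background noise.
    """
    return np.median(mag, axis=0)


def attenuation_gains(mag, noise, over_subtraction=2.0, floor=0.1):
    """Spectral-subtraction style gain in [floor, 1] for each time-frequency cell."""
    gain = 1.0 - over_subtraction * noise[np.newaxis, :] / np.maximum(mag, 1e-12)
    return np.clip(gain, floor, 1.0)


def suppress(x, frame=512, hop=256):
    """Return (gains, attenuated spectra) for a mono signal x."""
    spectra, mag = stft_magnitude(x, frame, hop)
    gains = attenuation_gains(mag, estimate_noise_floor(mag))
    return gains, spectra * gains
```

Calling `suppress(noisy)` on a mono NumPy array returns one gain per frame and frequency bin: cells that sit near the estimated noise floor are pulled down toward `floor`, while cells that rise well above it are left almost untouched. A commercial tool would presumably track the noise adaptively over time and resynthesize the audio, both of which this sketch leaves out.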

Another company that uses ML/AI in many of its products is iZotope, whose products are aimed at musicians, producers, and audio engineers. At iZotope, ML is used to automatically identify instruments, to detect song structures, and to improve waveform navigation. The company’s latest RX7 audio repair toolkit includes the De-rustle module, which uses a trained deep neural network to remove all varieties of rustle from recordings, and the Spectral DeNoiser, which leverages ML/AI to minimize disturbances in audio recorded against highly variable background noise, such as in stadiums, thunderstorms, or public places. In addition, the company’s Neutron 3 plugin features a Mixing Assistant that uses ML to create a balanced starting point for an initial-level mix, saving time and energy for creative mix decisions.

A further example is Accusonus, a company whose patented ML/AI technology is applied to the ERA range of audio clean-up tools. These include algorithms for denoising and de-reverberation, de-essing and audio repair, voice leveling, and de-clipping.

Hopefully, future improvements will lead more and more companies to integrate ML/AI into their commercial products. This will also likely result in a shift in the music and audio industry, where bad audio does not have to be re-recorded but can be processed to become usable. Identifying precisely where the next innovation will come from is difficult in a field advancing at such a rapid pace, but the next big step could be the use of generative audio models like WaveNet to replace audio that is missing or too corrupt to keep. Perhaps in 20 years people will look back and see this as the beginning of a new way to think about audio.
